Academic Year 97 (2008–09), Spring Semester — Neural Networks Research Proposal

Electrical Engineering, Master's Year 1    M9707503    張家豪

1. Abstract (Chinese and English): Please give an overview of the main points of this proposal and define keywords appropriate to its nature. (500 words or fewer)

In the neural network course, the instructor asks students to carry out a data mining task from KDD Cup 2009. The annual ACM SIGKDD conference is the premier international forum for data mining researchers and practitioners from academia, industry, and government to share their ideas, research results and experiences. KDD-09 will feature keynote presentations, oral paper presentations, poster sessions, workshops, tutorials, panels, exhibits, demonstrations, and the KDD Cup competition.

    In this KDD Cup competition, we will analyze CRM data. Customer Relationship Management (CRM) is a key element of modern marketing strategies. The KDD Cup 2009 offers the opportunity to work on large marketing databases from the French Telecom company Orange to predict the propensity of customers to switch provider (churn), buy new products or services (appetency), or buy upgrades or add-ons proposed to them to make the sale more profitable (up-selling).

    In this plan, we will use a neural network tool (e.g., WEKA or MATLAB) to carry out the work, and present research results in May.

Keywords: KDD Cup 2009, CRM, neural network, data mining



2. Research Proposal Content:

(1) Background and objectives of the research. Please describe in detail the background, objectives, and importance of this research, the state of related research domestically and abroad, and a review of the key references.

KDD CUP 2009 Dataset Introduction:
Customer Relationship Management (CRM) is a key element of modern marketing strategies. The KDD Cup 2009 offers the opportunity to work on large marketing databases from the French Telecom company Orange to predict the propensity of customers to switch provider (churn), buy new products or services (appetency), or buy upgrades or add-ons proposed to them to make the sale more profitable (up-selling).

    In a CRM system, the most practical way to build knowledge about customers is to produce scores. A score (the output of a model) is an evaluation, over all instances, of a target variable to explain (i.e. churn, appetency or up-selling). Tools which produce scores make it possible to project quantifiable information onto a given population. The score is computed using input variables which describe the instances. Scores are then used by the information system (IS), for example, to personalize the customer relationship. An industrial customer analysis platform able to build prediction models with a very large number of input variables has been developed by Orange Labs. This platform implements several processing methods for instance and variable selection, prediction and indexation, based on an efficient model combined with variable-selection regularization and model averaging. The main characteristic of this platform is its ability to scale to very large datasets with hundreds of thousands of instances and thousands of variables. The rapid and robust detection of the variables that have contributed most to the output prediction can be a key factor in a marketing application.
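For illustration only (this is not Orange's actual platform; the customer ids, variables, and weights below are invented), a score in this sense can be as simple as a weighted sum of each customer's input variables, after which customers are ranked by score:

```python
def score_customers(customers, weights, bias=0.0):
    """Compute a churn-style score for each customer as a weighted sum
    of its input variables (a toy linear model, not Orange's system)."""
    return {cid: bias + sum(w * x for w, x in zip(weights, feats))
            for cid, feats in customers.items()}

def rank_by_score(scores):
    """Rank customer ids from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)
```

The ranking is what the information system would consume, e.g., to target the top-scored customers with a retention campaign.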

    The challenge is to beat the in-house system developed by Orange Labs. It is an opportunity to prove that you can deal with a very large database, including heterogeneous noisy data (numerical and categorical variables), and unbalanced class distributions. Time efficiency is often a crucial point. Therefore part of the competition will be time-constrained to test the ability of the participants to deliver solutions quickly.

Artificial Intelligence (AI) Introduction:
Artificial intelligence (AI) is the intelligence of machines and the branch of computer science which aims to create it. Major AI textbooks define the field as "the study and design of intelligent agents," where an intelligent agent is a system that perceives its environment and takes actions which maximize its chances of success. John McCarthy, who coined the term in 1956, defines it as "the science and engineering of making intelligent machines."

    The field was founded on the claim that a central property of human beings, intelligence—the sapience of Homo sapiens—can be so precisely described that it can be simulated by a machine. This raises philosophical issues about the nature of the mind and limits of scientific hubris, issues which have been addressed by myth, fiction and philosophy since antiquity. Artificial intelligence has been the subject of breathtaking optimism, has suffered stunning setbacks and, today, has become an essential part of the technology industry, providing the heavy lifting for many of the most difficult problems in computer science.

    AI research is highly technical and specialized, so much so that some critics decry the "fragmentation" of the field. Subfields of AI are organized around particular problems, the application of particular tools, and long-standing theoretical differences of opinion. The central problems of AI include such traits as reasoning, knowledge, planning, learning, communication, perception and the ability to move and manipulate objects. General intelligence (or "strong AI") is still a long-term goal of (some) research.

    In the middle of the 20th century, a handful of scientists began a new approach to building intelligent machines, based on recent discoveries in neurology, a new mathematical theory of information, an understanding of control and stability called cybernetics, and above all, by the invention of the digital computer, a machine based on the abstract essence of mathematical reasoning.

    The field of modern AI research was founded at a conference on the campus of Dartmouth College in the summer of 1956. Those who attended would become the leaders of AI research for many decades, especially John McCarthy, Marvin Minsky, Allen Newell and Herbert Simon, who founded AI laboratories at MIT, CMU and Stanford. They and their students wrote programs that were, to most people, simply astonishing: computers were solving word problems in algebra, proving logical theorems and speaking English. By the middle 60s their research was heavily funded by the U.S. Department of Defense and they were optimistic about the future of the new field:

● 1965, H. A. Simon: "Machines will be capable, within twenty years, of doing any work a man can do."
● 1967, Marvin Minsky: "Within a generation ... the problem of creating 'artificial intelligence' will substantially be solved."

    These predictions, and many like them, would not come true. Researchers had failed to recognize the difficulty of some of the problems they faced. In 1974, in response to the criticism of England's Sir James Lighthill and ongoing pressure from Congress to fund more productive projects, the U.S. and British governments cut off funding for undirected, exploratory research in AI. This was the first AI winter.

    In the early 80s, AI research was revived by the commercial success of expert systems, a form of AI program that simulated the knowledge and analytical skills of one or more human experts. By 1985 the market for AI had reached more than a billion dollars and governments around the world poured money back into the field. However, just a few years later, beginning with the collapse of the Lisp Machine market in 1987, AI once again fell into disrepute, and a second, more lasting AI winter began.

    In the 90s and early 21st century AI achieved its greatest successes, albeit somewhat behind the scenes. Artificial intelligence is used for logistics, data mining, medical diagnosis and many other areas throughout the technology industry. The success was due to several factors: the incredible power of computers today (see Moore's law), a greater emphasis on solving specific sub problems, the creation of new ties between AI and other fields working on similar problems, and above all a new commitment by researchers to solid mathematical methods and rigorous scientific standards.

Neural Network Introduction:
An artificial neural network (ANN), often just called a "neural network" (NN), is a mathematical model or computational model based on biological neural networks. It consists of an interconnected group of artificial neurons and processes information using a connectionist approach to computation. In most cases an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network during the learning phase.

    There are three major learning paradigms, each corresponding to a particular abstract learning task. These are supervised learning, unsupervised learning and reinforcement learning. Usually any given type of network architecture can be employed in any of those tasks.

● Supervised learning: the network is trained on known samples together with their expected answers, e.g., Perceptron, BPN, PNN, LVQ, CPN
● Unsupervised learning: the network adapts and organizes itself from many input samples without expected answers, e.g., SOM, ART
● Associative memory learning: the system is trained to store all samples and recall the expected answer from partial or noisy input, e.g., Hopfield, Bidirectional Associative Memory (BAM), Hopfield-Tank
● Optimization applications: finding optimal solutions, e.g., ANN, HTN
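As a concrete illustration of the supervised-learning paradigm, the sketch below trains a single perceptron on known samples of the logical AND function; the learning rate, epoch count, and AND task are arbitrary choices for this toy example:

```python
def train_perceptron(samples, lr=0.1, epochs=20):
    """Perceptron learning rule: nudge each weight by the prediction error
    times the input, using known (input, expected answer) samples."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for x, target in samples:
            y = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
            err = target - y
            w[0] += lr * err * x[0]
            w[1] += lr * err * x[1]
            b += lr * err
    return w, b

def predict(w, b, x):
    """Threshold unit: fire (1) if the weighted sum exceeds zero."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
```

Because AND is linearly separable, the perceptron converges; XOR is not, which motivates the multi-layer BPN used later in this plan.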

(2) Research methods, procedures, and schedule. Please state: 1. the research methods adopted in this plan and the reasons for them; 2. the difficulties expected and how they will be resolved.

    Before carrying out the work, I will study NN material, and I will learn the KDD Cup data format before using the data. However, the KDD Cup dataset is very large: it contains 50,000 samples, and every sample has 15,000 variables. We must reduce the KDD Cup data, because the full dataset cannot be processed on a single computer. How can the KDD Cup data be reduced? I will analyze the data and remove the variables I judge to be unimportant, which reduces the number of training variables. Then I will randomly select about 1,000 samples to train the NN system; because the dataset is very large, a random sample of this size should still be representative. I must also learn the NN software (e.g., MATLAB or WEKA) and how to use it, and I must understand the meaning of the KDD Cup dataset. These problems remain to be solved. Next I will introduce my NN framework: I use the Back-Propagation Network (BPN) to solve the KDD Cup problem. The BPN is introduced below:
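A minimal sketch of the reduction step described above (the column indices and counts are placeholders; the real rows would come from the KDD Cup files):

```python
import random

def reduce_dataset(rows, keep_cols, n_samples, seed=0):
    """Drop unimportant columns, then randomly draw n_samples rows
    to obtain a training set small enough for one computer."""
    reduced = [[row[i] for i in keep_cols] for row in rows]
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    return rng.sample(reduced, min(n_samples, len(reduced)))
```

For the actual task, `keep_cols` would hold the indices of the variables judged important, and `n_samples` would be about 1,000.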

BPN Introduction:
The Back-Propagation Network (BPN) is the most representative neural network model, and it belongs to supervised learning. The BPN improves on earlier neural networks: it overcomes the perceptron's inability to solve the XOR problem. Its training rule is based on the gradient (steepest) descent method to minimize the output error.

BPN Framework:
A BPN consists of several layers, and every layer contains many processing units.

● Input layer: represents the network's input variables; the number of processing units is determined by the number of input variables. It uses a linear transfer function.
● Hidden layer: captures the interactions among the input processing units. There is no standard rule for choosing the number of units; it must be determined by repeated experiments. It uses a non-linear transfer function; a network may have more than one hidden layer, or none at all.
● Output layer: represents the network's output variables; the number of processing units is determined by the number of output variables. It uses a non-linear transfer function.
● Weights: the processing units of adjacent layers are connected, and every connection has a weight that stores the connection strength.




Fig. 1. BPN Framework.
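A minimal sketch of this framework, assuming one hidden layer, sigmoid transfer functions, and plain gradient-descent weight updates (the layer sizes, learning rate, and epoch count are arbitrary illustrative choices, not tuned for the KDD Cup data):

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class BPN:
    """One-hidden-layer back-propagation network with bias units."""

    def __init__(self, n_in, n_hidden, n_out, seed=1):
        rng = random.Random(seed)
        # each row holds a unit's incoming weights plus a trailing bias weight
        self.w1 = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-1, 1) for _ in range(n_hidden + 1)] for _ in range(n_out)]

    def forward(self, x):
        h = [sigmoid(sum(w * v for w, v in zip(ws, x + [1.0]))) for ws in self.w1]
        o = [sigmoid(sum(w * v for w, v in zip(ws, h + [1.0]))) for ws in self.w2]
        return h, o

    def train(self, samples, lr=0.5, epochs=3000):
        for _ in range(epochs):
            for x, t in samples:
                h, o = self.forward(x)
                # output-layer delta: error gradient through the sigmoid
                d_o = [(tk - ok) * ok * (1 - ok) for tk, ok in zip(t, o)]
                # hidden-layer delta: error back-propagated through w2
                d_h = [hj * (1 - hj) * sum(d_o[k] * self.w2[k][j] for k in range(len(o)))
                       for j, hj in enumerate(h)]
                for k, ws in enumerate(self.w2):
                    for j, v in enumerate(h + [1.0]):
                        ws[j] += lr * d_o[k] * v
                for j, ws in enumerate(self.w1):
                    for i, v in enumerate(x + [1.0]):
                        ws[i] += lr * d_h[j] * v
```

Trained on the four XOR patterns, the squared error shrinks steadily, which is exactly the case the single-layer perceptron cannot handle.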

    Finally, I will introduce the NN software. In general, we often use MATLAB or WEKA to implement a NN; the two NN software packages are introduced below:

Matlab:
MATLAB (meaning "matrix laboratory") was invented in the late 1970s by Cleve Moler, then chairman of the computer science department at the University of New Mexico. He designed it to give his students access to LINPACK and EISPACK without having to learn Fortran. It soon spread to other universities and found a strong audience within the applied mathematics community. Jack Little, an engineer, was exposed to it during a visit Moler made to Stanford University in 1983. Recognizing its commercial potential, he joined with Moler and Steve Bangert. They rewrote MATLAB in C and founded The MathWorks in 1984 to continue its development. These rewritten libraries were known as JACKPAC. In 2000, MATLAB was rewritten to use a newer set of libraries for matrix manipulation, LAPACK.

MATLAB was first adopted by control design engineers, Little's specialty, but quickly spread to many other domains. It is now also used in education, in particular the teaching of linear algebra and numerical analysis, and is popular amongst scientists involved with image processing.

WEKA:
WEKA is a comprehensive workbench for machine learning and data mining. Its main strengths lie in the classification area, where all current ML approaches -- and quite a few older ones -- have been implemented within a clean, object-oriented Java class hierarchy. Regression, association rules and clustering algorithms have also been implemented.

(3) Expected work items and results. Please state: 1. the work items expected to be completed.

We will use NN software to train a NN system, use the trained system to predict the final results, and finally compare our results with those of other participants to estimate the NN system's score.
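The KDD Cup 2009 tasks are scored by the area under the ROC curve (AUC), so a sketch of how such a score could be computed from the system's predictions is shown below (a minimal rank-based implementation, not the official scoring code):

```python
def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney formula:
    the fraction of (positive, negative) pairs the scores rank correctly,
    counting ties as half-correct."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to random guessing and 1.0 to a perfect ranking, so the trained NN system can be judged by how far above 0.5 it lands.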

Reference

[1] KDD Cup 2009, http://www.kddcup-orange.com/

[2] Wikipedia - Neural network, http://en.wikipedia.org/wiki/Neural_Network

[3] Wikipedia - Artificial intelligence, http://en.wikipedia.org/wiki/Artificial_intelligence

[4] Prof. 李麗華, Department of Information Management, 朝陽科技大學 (Chaoyang University of Technology), Artificial Neural Network

[5] 羅華強, 類神經網路 – MATLAB的應用 (Neural Networks: Applications in MATLAB), 高立圖書

[6] 張斐章, 張麗秋, 類神經網路 (Neural Networks), 東華書局