Academic Year 97, Second Semester: Neural Networks Research Proposal
This proposal presents an intelligent inference system for predicting customer propensity. Artificial intelligence (AI) techniques such as the back propagation neural network (BPN) will be combined to predict the target variables accurately.
CRM (customer relationship management) is an information industry term for methodologies, software, and usually Internet capabilities that help an enterprise manage customer relationships in an organized way. For example, an enterprise might build a database about its customers that describes relationships in sufficient detail so that management, salespeople, service staff, and perhaps the customers themselves can access information, match customer needs with product plans and offerings, remind customers of service requirements, know what other products a customer has purchased, and so forth. We focus on three properties: churn, appetency, and up-selling.
In the first phase, we use several sampling methods to analyze the value of the input nodes and choose better features. In the second phase, we try to find other potential features. Finally, we adopt all suitable features to implement our BPNs. After training, system performance will be tested on the test data sets provided by the French Telecom company Orange.
Keywords: Customer Relationship Management (CRM), churn, appetency, up-selling, Back Propagation Network (BPN)
Customer Relationship Management (CRM) is a key element of modern marketing strategies. CRM consists of the processes a company uses to track and organize its contacts with its current and prospective customers. CRM software is used to support these processes; information about customers and customer interactions can be entered, stored and accessed by employees in different company departments. The KDD Cup 2009 offers the opportunity to work on large marketing databases from the French Telecom company Orange to predict the propensity of customers to switch provider (churn), buy new products or services (appetency), or buy upgrades or add-ons proposed to them to make the sale more profitable (up-selling).
In a CRM system, building knowledge about customers means producing scores. A score (the output of a model) is an evaluation, for every instance, of a target variable to be explained (i.e. churn, appetency or up-selling). Tools that produce scores make it possible to project quantifiable information onto a given population. The score is computed using input variables that describe the instances. Scores are then used by the information system (IS), for example, to personalize the customer relationship. Rapid and robust detection of the variables that contribute most to the output prediction can be a key factor in a marketing application. The task is to estimate the churn, appetency and up-selling probabilities of customers; hence there are three target values to predict. A large number of variables (15,000) are made available for prediction.
The method used in this proposal is the Back Propagation Network (BPN), one of the most representative artificial neural networks. It is a supervised learning network and is very powerful for assessment and prediction. Many practical applications have demonstrated the detection potential of the BPN. It consists of a learning procedure and a recalling procedure.
The BPN algorithm applies the basic principle of the steepest (gradient) descent method to minimize the error function. It compares the outputs of the processing units in the output layer with the desired outputs in order to adjust the connection weights. The weights between neurons in adjacent layers are adjusted through an iterative training process as training samples are presented to the network. In this way, a close approximation of the transformation function that maps the inputs to the desired outputs can be acquired.
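As a hedged illustration of this principle, the usual textbook form of the error function and of the steepest-descent weight correction can be written as follows. The symbols E, T, Y and η match those used in the algorithm steps of this proposal, and W stands for any weight matrix; the squared-error form is an assumption, since the proposal does not state the error function explicitly.

```latex
E = \frac{1}{2} \sum_{j} \left( T_j - Y_j \right)^2 ,
\qquad
\Delta W = -\eta \, \frac{\partial E}{\partial W}
```

Here j indexes the output units, so each weight is moved against the gradient of the error by an amount proportional to the learning rate η.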
There are eight steps in the BPN algorithm:
Step 1: Set network parameters.
Step 2: Randomly generate the initial weight matrix and bias matrix between the input and hidden layers, and the weight matrix and bias matrix between the hidden and output layers.
Step 3: Input the training patterns X and desired output T.
Step 4: Compute the inferred output Y.
(1) Compute the outputs of hidden layer H.
(2) Compute the inferred output Y.
Step 5: Compute δ.
(1) Compute δ of output layer.
(2) Compute δ of hidden layer.
Step 6: Compute the weight corrections ∆W and the bias corrections ∆θ.
(1) Compute the weight correction and bias correction of the output layer, where η is the learning rate.
(2) Compute the weight correction and bias correction of the hidden layer, where η is the learning rate.
Step 7: Update the weight matrices and the bias matrices.
(1) Update the weight matrix and bias matrix of the output layer.
(2) Update the weight matrix and bias matrix of the hidden layer.
Step 8: Repeat steps 3 to 7 until the error E converges or the number of training iterations exceeds the predefined threshold.
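The steps above can be sketched in code. The following is a minimal illustration, not the implementation planned for this project: it is written in Python with the standard library only, and it assumes one hidden layer, sigmoid activations, a squared-error function, biases that are added to the weighted sum, and updates after every training pattern. The names `train_bpn`, `forward`, `n_hidden`, `eta`, `max_epochs` and `tol` are hypothetical and chosen for this example.

```python
import math
import random


def sigmoid(z):
    """Logistic activation (an assumed choice of transfer function)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W_xh, b_h, W_hy, b_y):
    """Step 4: compute the hidden outputs H, then the inferred outputs Y."""
    H = [sigmoid(sum(x[i] * W_xh[i][k] for i in range(len(x))) + b_h[k])
         for k in range(len(b_h))]
    Y = [sigmoid(sum(H[k] * W_hy[k][j] for k in range(len(H))) + b_y[j])
         for j in range(len(b_y))]
    return H, Y


def train_bpn(X, T, n_hidden=3, eta=0.5, max_epochs=2000, tol=0.01, seed=0):
    """Train a one-hidden-layer network; return its parameters and error history."""
    # Step 1: network parameters are the keyword arguments above.
    rng = random.Random(seed)
    n_in, n_out = len(X[0]), len(T[0])
    # Step 2: random initial weights and biases for both layers.
    W_xh = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    b_h = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    W_hy = [[rng.uniform(-0.5, 0.5) for _ in range(n_out)] for _ in range(n_hidden)]
    b_y = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    errors = []
    for _ in range(max_epochs):
        E = 0.0
        for x, t in zip(X, T):                       # Step 3: present a pattern
            H, Y = forward(x, W_xh, b_h, W_hy, b_y)  # Step 4
            # Step 5: delta of the output layer, then of the hidden layer
            # (computed before any weight is changed).
            d_y = [(t[j] - Y[j]) * Y[j] * (1 - Y[j]) for j in range(n_out)]
            d_h = [H[k] * (1 - H[k]) * sum(d_y[j] * W_hy[k][j] for j in range(n_out))
                   for k in range(n_hidden)]
            # Steps 6 and 7: compute each correction (eta * delta * input)
            # and apply it, output layer first, then hidden layer.
            for k in range(n_hidden):
                for j in range(n_out):
                    W_hy[k][j] += eta * d_y[j] * H[k]
            for j in range(n_out):
                b_y[j] += eta * d_y[j]
            for i in range(n_in):
                for k in range(n_hidden):
                    W_xh[i][k] += eta * d_h[k] * x[i]
            for k in range(n_hidden):
                b_h[k] += eta * d_h[k]
            # Accumulate the squared error measured before this update.
            E += 0.5 * sum((t[j] - Y[j]) ** 2 for j in range(n_out))
        errors.append(E)
        # Step 8: stop when the epoch error is small enough.
        if E < tol:
            break
    return (W_xh, b_h, W_hy, b_y), errors
```

A call such as `params, errors = train_bpn(X, T)` with `X` and `T` given as lists of input and target vectors returns the trained weights and the per-epoch error, and `forward(x, *params)` then gives the hidden and output activations for a new pattern `x`.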
After training, the network can be used to classify target data: the target data are fed into the network, and the output with the highest value is taken as the prediction.
The network architecture is as follows:
A large number of variables (15,000) are made available for prediction. The first 14,740 variables are numerical and the last 260 are categorical, so data selection plays an important role in prediction. Data selection is defined as the process of determining the appropriate data type and source, as well as suitable instruments for collecting data. The process of selecting suitable data for a research project can affect data integrity. A variety of sampling procedures are available to reduce the likelihood of drawing a biased sample; some of them are listed below:
1. Simple random sampling
2. Stratified sampling
3. Cluster sampling
4. Systematic sampling
These sampling methods try to ensure that the sample is representative of the entire population by incorporating an element of randomness into the selection procedure, which gives a greater ability to generalize findings to the target population.
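The four sampling procedures listed above can be sketched as follows. This is a minimal Python illustration using only the standard `random` module; the function names, the `stratum_of` callback and the dictionary layout of `clusters` are assumptions made for the example and are not part of the proposal.

```python
import random


def simple_random_sample(items, k, rng):
    """Draw k distinct items, each subset of size k being equally likely."""
    return rng.sample(items, k)


def stratified_sample(items, stratum_of, frac, rng):
    """Draw the same fraction from every stratum (at least one item each)."""
    strata = {}
    for item in items:
        strata.setdefault(stratum_of(item), []).append(item)
    picked = []
    for members in strata.values():
        k = max(1, round(frac * len(members)))
        picked.extend(rng.sample(members, k))
    return picked


def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep every member of each one.

    `clusters` maps a cluster id to the list of items in that cluster.
    """
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [item for cid in chosen for item in clusters[cid]]


def systematic_sample(items, step, rng):
    """Take every step-th item after a random start in [0, step)."""
    start = rng.randrange(step)
    return items[start::step]
```

For example, with `rng = random.Random(0)` and `items = list(range(100))`, `systematic_sample(items, 10, rng)` returns ten items spaced ten apart, while `stratified_sample` keeps the proportions of the strata defined by `stratum_of`.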
The input data must be normalized. Normalization encodes the data so that they can be fed into the input and output layers of the BPN model. We have designed the input and output nodes so that the architecture of the NN fits the characteristics of the data.
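One possible way to encode the two kinds of variables is sketched below in Python. It assumes min-max scaling to [0, 1] for numerical variables and one-hot encoding for categorical variables; the proposal does not commit to a specific normalization scheme, so both choices and the function names are illustrative only.

```python
def min_max_scale(values):
    """Rescale numerical values to [0, 1]; a constant column maps to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Encode categorical values as 0/1 indicator vectors, one slot per category."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    encoded = [[1.0 if index[v] == i else 0.0 for i in range(len(categories))]
               for v in values]
    return categories, encoded
```

With this encoding each numerical variable occupies one input node, and each categorical variable occupies as many input nodes as it has distinct categories.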
A large number of variables (15,000) are made available for prediction. The main problem is deciding which of them should be extracted: how can we find all the useful variables and discard the noise variables? We will probably spend a lot of time on data selection, for which we choose cluster sampling as the main method to obtain a small set of variables that is sufficient to represent customer properties. If this approach does not lead to a good result, our second choice is simple random sampling.
1. Data selection
2. Data transformation
3. The results of the Back Propagation Network in MATLAB