**Neural Network Homework 2: KDD Cup 2009 CRM Prediction**

Instructor: Dr. Hahn-Ming

Wei, Sin-Jong D9704008

**Abstract**

KDD Cup 2009 focuses on customer relationship management (CRM) prediction for three targets: churn, appetency, and up-selling. The very large database comes from the French telecom company Orange and has 15,000 variables and 50,000 instances. The datasets include both numerical and categorical variables and have unbalanced class distributions. The challenge consists of three prediction tasks in which both accuracy and time efficiency matter.

**Method**

My prediction method for the tasks is listed below.

A. System

1. Implemented in MATLAB.

2. Network: Back-propagation

3. Layer: 1

4. Epochs: 10000

5. Learning rate: 0.8

6. Momentum: 0.5
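To show how the learning rate and momentum settings interact, here is a minimal Python sketch of gradient descent with a classical momentum term. It is not the MATLAB toolbox implementation, whose exact update rule may differ. The function name `minimize_with_momentum` and the toy objective are hypothetical, chosen only for illustration; the values `lr = 0.8` and `mc = 0.5` are the ones listed above.

```python
def minimize_with_momentum(grad, w0, lr, mc, steps):
    """Gradient descent with classical momentum on a single scalar weight.

    grad  : function returning the gradient at w
    lr    : learning rate
    mc    : momentum coefficient (fraction of the previous step that is reused)
    """
    w, velocity = w0, 0.0
    for _ in range(steps):
        # New step = remembered part of the previous step minus the scaled gradient.
        velocity = mc * velocity - lr * grad(w)
        w = w + velocity
    return w


# Toy objective (assumption): f(w) = 0.5 * (w - 3)^2, so the gradient is w - 3.
w_final = minimize_with_momentum(lambda w: w - 3.0, w0=0.0, lr=0.8, mc=0.5, steps=200)
print(w_final)
```

On this toy problem the weight settles near the minimum at 3. On a real network the same settings can behave very differently, so they would normally be tuned.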

B. Processing

1. Feature selection: remove missing values, outliers, and noise so that fewer than 5000 features remain.

2. Select datasets at random; this was done for the latest two runs.

3. Normalize the data to reduce repeated data.

4. Run the back-propagation neural network on the data until an optimal set of parameters is returned to the system.

5. The predicted ratings for each task are based on time series analysis; the test datasets are also predicted with the models.

6. Use 10-fold cross-validation to find the best model by comparing the accuracy rates of the candidates.
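The 10-fold cross-validation in step 6 can be sketched as follows. This is an illustrative Python sketch, not the code used in the homework. The helper `k_fold_indices` and the sample count of 50 are hypothetical. It only produces the index splits, and the model training on each split is left out.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Split range(n_samples) into k (train_indices, test_indices) pairs."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at position i,
    # so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test_idx in enumerate(folds):
        # Training set = all folds except the held-out one.
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train_idx, test_idx))
    return splits


splits = k_fold_indices(50, k=10)
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx))
```

Each sample is held out exactly once. The model with the best average accuracy over the ten held-out folds would then be selected.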

[Figure: System framework (flowchart with yes/no decision branches)]

**Analyzing the Prediction Result**

The datasets were split into 10 files. The first job was feature selection: I needed to know the type of each variable and the frequencies of its values. If the data for a feature were mostly not present, I discarded it, as with missing values, outliers, and noise. I also discarded any variable with zero values in more than 40000 instances. Even so, about 5000 variables remained, which is still a big dataset, so I randomly selected some of them and normalized them by mapping the row minimum and maximum values to [-1 1]. Pre-processing took too much time, and unfortunately the result was a failure. My processing is laid out below.
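The two pre-processing steps described above, dropping variables that are mostly zero and mapping values to [-1, 1], can be sketched in Python as follows. This is an illustration under assumptions, not the MATLAB code used in the homework. The helper names, the tiny example table, and the choice to map a constant row to the midpoint of the range are all hypothetical.

```python
def drop_mostly_zero(columns, max_zeros):
    """Keep only the columns whose number of zero entries is at most max_zeros."""
    return {
        name: values
        for name, values in columns.items()
        if sum(1 for v in values if v == 0) <= max_zeros
    }


def map_to_range(values, lo=-1.0, hi=1.0):
    """Linearly map values so that the minimum becomes lo and the maximum becomes hi."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # Constant input: no spread to rescale, so use the midpoint (assumption).
        return [(lo + hi) / 2.0 for _ in values]
    scale = (hi - lo) / (vmax - vmin)
    return [lo + (v - vmin) * scale for v in values]


# Toy table with 4 instances; the report's threshold was 40000 zeros out of 50000.
columns = {"a": [0, 0, 0, 4], "b": [1, 2, 0, 3]}
kept = drop_mostly_zero(columns, max_zeros=2)
normalized = {name: map_to_range(values) for name, values in kept.items()}
print(sorted(kept))
print(map_to_range([0, 5, 10]))
```

The same minimum and maximum found on the training data should be reused when normalizing the test data, so that both sets share one scale.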

Prediction uses the similarity matrix; the neural network parameters are designed as follows:

`NodeNum = 100; % number of hidden-layer nodes`

`TypeNum = 3; % output dimension`

P1 = load('C:\Program Files\MATLAB\R

T1 = load('C:\Program Files\MATLAB\R

P2 = load('C:\Program Files\MATLAB\R

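`[PN1, ps] = mapminmax(P1); % normalize training inputs to [-1 1] (assumed step: PN1 is used below but never defined in this listing)`

`PN2 = mapminmax('apply', P2, ps); % apply the same mapping to the test inputs (assumed step)`

`TF1 = 'tansig'; TF2 = 'purelin'; % transfer functions of the hidden and output layers`

`net = newff(minmax(PN1),[NodeNum TypeNum],{TF1 TF2});`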

`net.trainFcn = 'traingd'; % gradient descent`

`net.trainFcn = 'traingdm'; % gradient descent with momentum (this assignment overrides the previous line)`

`net.trainParam.show = 1; % training display interval`

`net.trainParam.lr = 0.8; % learning rate - traingd, traingdm`

`net.trainParam.mc = 0.5; % momentum coefficient - traingdm, traingdx`

`net.trainParam.mem_reduc = 10; % compute the Hessian matrix in blocks (only effective for the Levenberg-Marquardt algorithm)`

`net.trainParam.epochs = 10000; % maximum number of training epochs`

`net.trainParam.goal = 1e-8; % minimum mean squared error`

`net.trainParam.min_grad = 1e-20; % minimum gradient`

`net.trainParam.time = inf; % maximum training time`

`net = train(net,PN1,T1); % train the network`

`%---------------------------------------------------`

`% Testing`

`Y1 = sim(net,PN1); % actual output for the training samples`

`Y2 = sim(net,PN2); % actual output for the test samples`

`Y1 = full(compet(Y1)); % competitive (winner-take-all) output`

`Y2 = full(compet(Y2));`

`%---------------------------------------------------`

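`% Analyzing results`

`Result = ~sum(abs(T1-Y1)) % 1 marks a correctly classified sample`

`Percent1 = sum(Result)/length(Result) % correct classification rate on the training samples`

`Result = ~sum(abs(T2-Y2)) % 1 marks a correctly classified sample (T2, the test targets, is not loaded in this listing)`

`Percent2 = sum(Result)/length(Result) % correct classification rate on the test samples (assumed by analogy with Percent1)`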
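The scoring step in the listing (winner-take-all output, then a comparison with the one-hot targets) can be sketched in Python as follows. This is an illustrative equivalent, not the MATLAB code itself. The function names and the three example samples are hypothetical. Each sample is a list here, whereas the MATLAB listing stores samples as matrix columns.

```python
def compete(output):
    """Return a one-hot vector with a 1 at the position of the largest output."""
    winner = max(range(len(output)), key=lambda i: output[i])
    return [1 if i == winner else 0 for i in range(len(output))]


def accuracy(targets, outputs):
    """Fraction of samples whose winner-take-all output equals the one-hot target."""
    correct = sum(1 for t, y in zip(targets, outputs) if compete(y) == list(t))
    return correct / len(targets)


# Three made-up samples with three classes; the second one is misclassified.
targets = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
outputs = [[0.9, 0.1, 0.2], [0.3, 0.2, 0.1], [0.1, 0.2, 0.8]]
print(accuracy(targets, outputs))
```

With unbalanced classes, as in this dataset, plain accuracy can look high even when the rare class is never predicted, so it is worth checking per-class results as well.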

**Conclusion**

The final run failed and did not produce a result. The problem may be that I did not select good parameters. Alternatively, the data may be too large for accurate prediction, and I may have removed important vectors during feature selection. I think the main problem is that I spent too much time on pre-processing and trained on a conveniently small set of training data in order to lower the use of system resources. In addition, the learning time should be shorter. Finally, I found this to be hard work, and I was not lucky.
