THE TASK
The assignment is to build a predictive model for data supplied by an insurance company that wants to identify customers likely to purchase an insurance policy for a mobile home. The prediction task is motivated by a direct-mail decision: a mailing will be sent only to customers with a high probability of becoming mobile home insurance policy holders.
You have been given a database of existing customers, which you will use to build the predictive model. The client wants you to predict, from the other data held about a customer, whether that customer will have a caravan (mobile home) insurance policy. The customer data consist of 51 variables and include product usage data and socio-demographic data derived from postcodes. The training set contains 5300 customer descriptions, each including whether or not the customer has a mobile home insurance policy. The test set contains 4000 customers for whom it is not known whether they have a mobile home insurance policy.
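As a rough illustration of the mailing decision described above, the sketch below selects customers whose predicted probability of holding a policy reaches a cut-off and lists them from most to least likely. It is not part of the required WEKA workflow. The function name `select_for_mailing`, the customer IDs, the probabilities and the 0.10 threshold are all hypothetical values chosen for illustration; in practice the probabilities would come from your own model and the cut-off from your business analysis.

```python
from typing import Dict, List


def select_for_mailing(probabilities: Dict[str, float], threshold: float) -> List[str]:
    """Return IDs of customers at or above the threshold, most likely first."""
    # Keep only customers whose predicted probability meets the cut-off.
    chosen = [(cid, p) for cid, p in probabilities.items() if p >= threshold]
    # Rank the remaining customers by predicted probability, highest first.
    chosen.sort(key=lambda pair: pair[1], reverse=True)
    return [cid for cid, _ in chosen]


# Hypothetical predicted probabilities (assumed values, not real results).
scores = {"c1": 0.02, "c2": 0.31, "c3": 0.12, "c4": 0.45}

# Hypothetical cut-off: mail only customers scoring 0.10 or higher.
mailing_list = select_for_mailing(scores, threshold=0.10)
print(mailing_list)
```

Raising the threshold shortens the mailing list and lowers mailing cost, at the risk of missing some likely buyers; lowering it does the opposite. Justifying where to set it is part of the business reasoning expected in the report.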
The deliverable for the assignment is a written report describing and justifying the steps you have taken and their results, including charts and numerical results where appropriate.
SOFTWARE TO USE: WEKA
ASSESSMENT CRITERIA/MARKING SCHEME
a) 20% for the response to comments
b) 50% for the first report, including:
    a. 5% for a clear and substantial executive summary,
    b. 10% for the style of the report (structure, non-technical language, charts),
    c. 10% for demonstrating good understanding of the business problem and its scale,
    d. 10% for identifying the potential processes/areas of improvement,
    e. 10% for proposing, describing and justifying the solutions to the identified problems,
    f. 5% for a clear conclusion to your report, consistent with the preceding discussion and summarising your main arguments.
c) 30% for the second report and the developed model, including:
    a. 5% for the style of the report,
    b. 10% for following a logical model development process,
    c. 10% for the description of the process, where it should be made clear how your findings at a preceding step inform the next steps you take.
REFERENCES
1. Witten, I. and Frank, E. and Hall, M. (2011) “Data Mining: Practical Machine Learning Tools
REFERENCING STYLE
Harvard Referencing
SUBMISSION DETAILS
Part I. Report (indicative maximum 1,500 words)