It is important to understand the data mining process. This hands-ons consists of two steps.
In the Customer Relationship Management system, you will find a data set about our customers, the customer orders and the data about the ZIP codes.
Please note, that there is only information about the customer. Right click on Explore data to see some sample data.
We want to identify the Bike Buyer. How do we do that?
We create a flag BikeBuyerFlag as a new column.
Now we are ready.
Let’s go with Data Mining!!!
You will now train, test, and explore a data mining models. Then, you are ready to use the models to identify the people most likely to respond to the new targeted mailing campaign.
Let us create a new mining structure.
The Data Mining Wizard starts.
The method we want to use to define the mining structure is from existing relational database.
There is only one data source view available. Select it.
The Specify the Training Data – window is very important. Select
The next question is, which attributes determines a sales of a bike?
Let us discuss this together…
If you don’t want to wait, select
In the Specify Columns‘ Content and Data Type Window, it is important to set the correct Content Type.
It is important, that the BikeBuyerFlag is a discrete data type.
Before you complete the wizard, name the Mining structure and Mining model.
Before you deploy your mining model on the server, check under Project → Properties if the deployment server is set to NO MDB
Let’s go! Run Build → Deploy.
You should get
To analyse your data in the Mining Model Viewer, set the background to yes. Study the decision tree. Click on a dark box and read the legend. Do you understand it?
Use the clustering algorithm to have a second algorithm.
Then, in the Input Selection Window, set the Prediction Value to yes.
In the Mining Accuracy Chart, look at the Lift Chart. Do you understand it?
Close Visual Studio.
Idea: get the data from the CRM system, apply the profiling on it, calculate the probability and write the data back to CRM.
We use Integration Services to define a data flow.
In Connection Manager, create an OLEDB Connection to the CRM system. We get the prospective customers from there.
Create a new SSIS Package, or rename the default package to Lead_Generation.dtsx
Create a Data Flow Task and name it Lead Generation.
Tipp: if the SSIS Toolbox does not show up, you can display it via Menu → SSIS → SSIS Toolbox.
Add an OLE DB Source to get the prospects from the CRM system.
Next, we add a new column BikeBuyerFlag to our prospective data. The BikeBuyerFlag is used to set the prediction and your username will help us later to identify your data.
Next, we will add the attributes from the ZIP data to the data set.
Create a connection to the ODS and select the object that contains the ZIP codes. In the columns window, create a relationship between ZIP and City and select all attributes starting with ZIP .
Apply Data Mining to your Data Set
Next, we have to apply the data mining transformation to our data.
Add a Data Mining Query to your data flow.
In the input selection window, select lookup match output.
Open the Data Mining Query Transformation.
The connection should point to your aCRM_Lead_Generation project.
Create a new connection.
Select your mining structure and the mining model.
Next, we have to build a mining query.
Next, we want to calculate the probability that indicates the percent value that a prospect is likely to buy a bike.
Next, we have to check the mapping between the mining model and the input model.
Click on any line between the models and select Modify Mapping . The mapping should connect
How to write data back to CRM? A good CRM or ERP system contains retractors (versus extractors). Retractor programs insert or update data in a system. The hide the complexity of a data struction.
Our retractor is called crm_retractor_leads.
It required to following parameter:
Add an OLEDB command transformation and name it CRM retractor Leads.
In the connection manager window, select your CRM system under “connection manager”
In the Component Properties, set the SQLCommand to
The ? is used as a parameter place holder.
In the column mapping, map
Run the package.
Login to the CRM system and check your leads.
Open Customer Relationship Management → SD - Your Leads.
Select ZIP-Codes that start with a 7…
Click on Predict BikeBuyer.
The prediction may run a few seconds. Have a look at the results. The probability column shows the probability that a prospective customer is likely to buy a bike.