User Tools

Site Tools


bicn01:game_datamining

Analytical CRM

It is important to understand the data mining process. This hands-ons consists of two steps.

  • Step 1: train and test a data mining algorithmn.
  • Step 2: deploy the trained algorithmn.

Step 1: Train and Test a Mining Algorithmn

  • Open Visual Studio
  • Create a new Analysis Services and Data Mining Project
  • Name it NO SID Customer Profiling and change the project location directory to c:\temp.

Create a Data Source Connection

In the Customer Relationship Management system, you will find a data set about our customers, the customer orders and the data about the ZIP codes.

Create a Connection to the CRM System

  • Database Server: crm.hdm-server.eu
  • User: CRM_ETL_USER
  • Passwort: password123
  • Don't forget to check Save Password :!:
  • Select the database BIGAME from the drop-down menu
  • Next
  • Important :!: Set the impersonalisation information to inherit
  • Rename the connection to CRM.ds

034

Create a Data Source View

Get Actual Customer Data

  • Get ETL_CRM_ACTUAL_CUSTOMER from the CRM system.

Click to Enhance

Please note, that there is only information about the customer. Right click on Explore data to see some sample data.

035

Add a Bike Buyer Flag

We want to identify the Bike Buyer. How do we do that?

We create a flag BikeBuyerFlag as a new column.

  • Create a New Named Calculation.
  • Column name: BikeBuyerFlag
  • Description: Indicates, if a customer bought a bike in the past
  • Expression:
CASE WHEN [Product Category] like '%Bikes' THEN 'yes' ELSE 'no' END

FIXME

Now we are ready.

Let’s go with Data Mining!!!

037

Data Mining

You will now train, test, and explore a data mining models. Then, you are ready to use the models to identify the people most likely to respond to the new targeted mailing campaign.

Let us create a new mining structure.

The Data Mining Wizard starts.

The method we want to use to define the mining structure is from existing relational database.

  • Create mining structure with Microsoft Decision Tree.

There is only one data source view available. Select it.

  • There is also only one table - the actual customer - for our analysis available. Select it.

The Specify the Training Data – window is very important. Select

  • the BikeBuyerFlag as the predictable field.
  • all attributes to be part of the mining model structure (checkbox upper left). This allows us to look at the data fields after the mining process.
  • the attribute CUSTOMER_ID as a key column.

The next question is, which attributes determines a sales of a bike?

Let us discuss this together…

If you don’t want to wait, select

  • CustomerAgeGroup
  • CustomerGender
  • MonthOfSales
  • ZIP_BikeFriendlyArea
  • ZIP_Landscape
  • ZIP_PublicTransportQuality
  • ZIP_SocialClass
  • ZIP_DistanceToSalesOffice

In the Specify Columns‘ Content and Data Type Window, it is important to set the correct Content Type.

Note: Discrete data that can only take certain values, such as “red”, “yellow” or “green”. If a variable can take on any value between two specified values, it is called a continuous variable.

It is important, that the BikeBuyerFlag is a discrete data type.

Before you complete the wizard, name the Mining structure and Mining model.

  • Mining Struction: NO SID CustomerProfiling
  • Mining Model: NO SID CustomerProfiling DS
  • Set Allow Drill-Through

Before you deploy your mining model on the server, check under ProjectProperties if the deployment server is set to NO MDB

Let’s go! Run Build → Deploy.

You should get

To analyse your data in the Mining Model Viewer, set the background to yes. Study the decision tree. Click on a dark box and read the legend. Do you understand it?

Use the clustering algorithm to have a second algorithm.

Analyse it.

Then, in the Input Selection Window, set the Prediction Value to yes.

In the Mining Accuracy Chart, look at the Lift Chart. Do you understand it?

Close Visual Studio.

038

Step 2: Lead Generation with Prospect Customers

Idea: get the data from the CRM system, apply the profiling on it, calculate the probability and write the data back to CRM.

We use Integration Services to define a data flow.

Steps:

  • Get prospects from CRM
  • apply data on the mining structure
  • write data into CRM

Create an ETL project

  • Open Visual Studio
  • Create a new Integration Services Project
  • Name it NO SID Lead_Generation and change the project location directory to c:\temp.

Create Connections

Create a Connection to the CRM-System

In Connection Manager, create an OLEDB Connection to the CRM system. We get the prospective customers from there.

  • CRM Server: crm.hdm-server.eu
  • User: CRM_ETL_USER
  • Password: password123
  • Don't forget to check Save Password
  • Select the database BIAGAME from the drop-down menu
  • Save your connection.
  • Rename the connection to CRM.conmgr

Create a Data Flow

Create a new SSIS Package, or rename the default package to Lead_Generation.dtsx

Create a Data Flow Task and name it Lead Generation.

Tipp: if the SSIS Toolbox does not show up, you can display it via Menu → SSIS → SSIS Toolbox.

Get Prospects from CRM

Add an OLE DB Source to get the prospects from the CRM system.

  • Add a OLE DB Source from the SSIS Toolbox.
  • Rename the OLE DB Source to Get Prospects from CRM.
  • Edit the OLE DB Source
  • Add your CRM connection manager to the OLE DB Source Editor.
  • Add the ETL_CRM_PROSPECTIVE_CUSTOMERS extractor as well.
  • Preview some data.
  • Save your configuration.

Add BikeBuyerFlag and Username

Next, we add a new column BikeBuyerFlag to our prospective data. The BikeBuyerFlag is used to set the prediction and your username will help us later to identify your data.

  • Add a derived column transformation and name it Add BikeBuyerFlag.
  • In the transformation editor, add the derived columns name BikeBuyerFlag and set the expression to “yes”.
  • Add a new derived column, name it UserName and set the expression to your CRM username, here: NO U01.
  • Add a new derived column, name it StudentID and set the expression to your BI Academy username, here: NO SID.

Add ZIP data

Next, we will add the attributes from the ZIP data to the data set.

  • Add a Lookup-Transformation.
  • Name this Object Lookup ZIP Attributes.

Create a connection to the ODS and select the object that contains the ZIP codes. In the columns window, create a relationship between ZIP and City and select all attributes starting with ZIP .

Add Data Mining Profiling

Apply Data Mining to your Data Set

Next, we have to apply the data mining transformation to our data.

Add a Data Mining Query to your data flow.

In the input selection window, select lookup match output.

  • The Input should be AnalysisServiceServerInput.

Open the Data Mining Query Transformation.

The connection should point to your aCRM_Lead_Generation project.

Create a new connection.

  • Server: NO MDB
  • Authentification: Use Windows NT Integrated Security
  • Select your mining model from the initial Catalog: #SID# Customer Profiling

Select your mining structure and the mining model.

Next, we have to build a mining query.

  • Open the build query window.

Next, we want to calculate the probability that indicates the percent value that a prospect is likely to buy a bike.

  • Click to an empty source field and select Prediction Function.
  • In the field field, select PredictProbability.
  • For the criteria, drag and drop the Bike Buyer Flag from the mining model to the criteria filed.
  • Set an alias to Probability.
  • Add the Prospective ID to the query.
  • Add the Username to the query.
  • Add the Campagin to the query.
  • Add the StudentID to the query.

Next, we have to check the mapping between the mining model and the input model.

Click on any line between the models and select Modify Mapping . The mapping should connect

  • Bike Buyer Flag → BikeBuyerFlag
  • Custmer Age Group → Age Group
  • Gender → Gender
  • Customer ID → Prospect ID

and

  • All ZIP Fields → ZIP fields

Write the Data back into CRM

How to write data back to CRM? A good CRM or ERP system contains retractors (versus extractors). Retractor programs insert or update data in a system. The hide the complexity of a data struction.

Our retractor is called crm_retractor_leads.

It required to following parameter:

  • Your CRM username
  • The prospect ID
  • The BikeBuyerFlag
  • The Probabilty
  • The campaign name

Add an OLEDB command transformation and name it CRM retractor Leads.

In the connection manager window, select your CRM system under “connection manager”

In the Component Properties, set the SQLCommand to

execute crm_retractor_leads ?, ?, ?

The ? is used as a parameter place holder.

In the column mapping, map

  • USERNAME → @USER
  • PROSPECT_ID → @PID
  • Probabiltiy → @PROB

Run the package.

039

Check your data

Login to the CRM system and check your leads.

Open Customer Relationship ManagementSD - Your Leads.

Select ZIP-Codes that start with a 7…

Click on Predict BikeBuyer.

The prediction may run a few seconds. Have a look at the results. The probability column shows the probability that a prospective customer is likely to buy a bike.

040

bicn01/game_datamining.txt · Last modified: 2020/06/28 22:01 by peter