User Tools

Site Tools


Analytical CRM

It is important to understand the data mining process. This hands-ons consists of two steps.

  • Step 1: train and test a data mining algorithmn.
  • Step 2: deploy the trained algorithmn.

Step 1: Train and Test a Mining Algorithmn

  • Open Visual Studio
  • Create a new Analysis Services and Data Mining Project
  • Name it NO SID Customer Profiling and change the project location directory to c:\temp.

Create a Data Source Connection

In the Customer Relationship Management system, you will find a data set about our customers, the customer orders and the data about the ZIP codes.

Create a Connection to the CRM System

  • Database Server:
  • User: susan.brown
  • Passwort: password123
  • Don't forget to check Save Password :!:
  • Select the database AdventureBikes CRM from the drop-down menu
  • Next
  • Don't forget to set the impersonalisation information to inherit
  • Rename the connection to CRM.ds


Create a Data Source View

Get Customer Data

  • Get V_CRM_ACTUAL_CUSTOMERS from the data store.

Click to Enhance

Please note, that there is only information about the customer. Right click on Explore data to see some sample data.


Enhance the Data Set

We need to add additional fields about social democatics about the zip codes to the data.

  • Click on customer, right-mouse → Replace Table → With new Named Query

This is a great thing. You can use the entire language of SQL and make really good things with it.

In a named query, you can specify an SQL expression to select rows and columns returned from one or more tables in one or more data sources. A named query is like any other table in a data source view with rows and relationships, except that the named query is based on an expression.

A named query lets you extend the relational schema of existing tables without modifying the underlying data source. For example, a named query can be used to join multiple database tables from a data source into a single data source view table.

In the free space of the query definition editor, select Add Tables.

:-A Named Query - Login failed for user



Upps! You cannot mark any column. Bug or feature? That’s how it’s done (Workaround):

  • Click OK, close window.
  • Right Mouse –> Edit Named Query.

Voila! Now the column names are there.

Build two joins. Just drag-and-drop the column names for the actual customer table to the orders table.



2. Join:
Please note, in Germany a city may have multiple ZIP codes. So, a combound join (ZIP and City) is neccessary ;-)
  • Then, select all columns from the related tables, but exclude the key columns.

Click to Resize

Now we have nearly created the data basis. What’s missing?


Add a Bike Buyer Flag

We want to identify the Bike Buyer. How do we do that?

We create a flag BikeBuyerFlag as a new field in the Named Query.

  • Edit the name query. Best, make the query window full screen!
  • Enter a new (calculated) column in the window pane with the columns.
  • Use the following SQL statement which is equal with an if statement.

:!: Do not just copy-paste the html text into the query editor. Use notepad to convert html to plain text.

CASE WHEN ProductCategory like '%Bike%' THEN 'yes' ELSE 'no' END

Use BikeBuyerFlag as the alias name for this new column.

:-? How to add the BikeBuyerFlag to the Data Source

Now we are ready.

Let’s go with Data Mining!!!


Data Mining

You will now train, test, and explore a data mining models. Then, you are ready to use the models to identify the people most likely to respond to the new targeted mailing campaign.

Let us create a new mining structure.

The Data Mining Wizard starts.

The method we want to use to define the mining structure is from existing relational database.

  • Create mining structure with Microsoft Decision Tree.

There is only one data source view available. Select it.

  • There is also only one table type for our analysis available. Select it.

The Specify the Training Data – window is very important. Select

  • the BikeBuyerFlag as the predictable field.
  • all attributes to be part of tht mining model structure (checkbox upper left). This allows us to look at the data fields after the mining process.
  • the attribute CUSTOMER_ID as a key column.

The next question is, which attributes determines a sales of a bike?

Let us discuss this together…

If you don’t want to wait, select

  • CustomerAgeGroup
  • CustomerGender
  • MonthOfSales
  • ZIP_BikeFriendlyArea
  • ZIP_Landscape
  • ZIP_PublicTransportQuality
  • ZIP_SocialClass
  • ZIP_DistanceToSalesOffice

In the Specify Columns‘ Content and Data Type Window, it is important to set the correct Content Type.

Note: Discrete data that can only take certain values, such as “red”, “yellow” or “green”. If a variable can take on any value between two specified values, it is called a continuous variable.

It is important, that the BikeBuyerFlag is a discrete data type.

Before you complete the wizard, name the Mining structure and Mining model.

  • Mining Struction: NO SID CustomerProfiling
  • Mining Model: NO SID CustomerProfiling DS
  • Set Allow Drill-Through

Before you deploy your mining model on the server, check under ProjectProperties if the deployment server is set to NO MDB

Let’s go! Run Build → Deploy.

You should get

To analyse your data in the Mining Model Viewer, set the background to yes. Study the decision tree. Click on a dark box and read the legend. Do you understand it?

Use the clustering algorithm to have a second algorithm.

Analyse it.

Then, in the Input Selection Window, set the Prediction Value to yes.

In the Mining Accuracy Chart, look at the Lift Chart. Do you understand it?

Close Visual Studio.


Step 2: Lead Generation with Prospect Customers

Idea: get the data from the CRM system, apply the profiling on it, calculate the probability and write the data back to CRM.

We use Integration Services to define a data flow.


  • Get prospects from CRM
  • apply data on the mining structure
  • write data into CRM

Create an ETL project

  • Open Visual Studio
  • Create a new Integration Services Project
  • Name it NO SID Lead_Generation and change the project location directory to c:\temp.

Create Connections

Create a Connection to the CRM-System

In Connection Manager, create an OLEDB Connection to the CRM system. We get the prospective customers from there.

  • CRM Server:
  • User: CRM_ETL_USER
  • Password: password123
  • Don't forget to check Save Password
  • Select the database BIAGAME from the drop-down menu
  • Save your connection.
  • Rename the connection to CRM.conmgr

Create a Data Flow

Create a new SSIS Package, or rename the default package to Lead_Generation.dtsx

Create a Data Flow Task and name it Lead Generation.

Tipp: if the SSIS Toolbox does not show up, you can display it via Menu → SSIS → SSIS Toolbox.

Get Prospects from CRM

Add an OLE DB Source to get the prospects from the CRM system.

  • Add a OLE DB Source from the SSIS Toolbox.
  • Rename the OLE DB Source to Get Prospects from CRM.
  • Edit the OLE DB Source
  • Add your CRM connection manager to the OLE DB Source Editor.
  • Add the ETL_CRM_PROSPECTIVE_CUSTOMERS extractor as well.
  • Preview some data.
  • Save your configuration.

Add BikeBuyerFlag and Username

Next, we add a new column BikeBuyerFlag to our prospective data. The BikeBuyerFlag is used to set the prediction and your username will help us later to identify your data.

  • Add a derived column transformation and name it Add BikeBuyerFlag.
  • In the transformation editor, add the derived columns name BikeBuyerFlag and set the expression to “yes”.
  • Add a new derived column, name it UserName and set the expression to your CRM username, here: NO U01.

Add ZIP data

Next, we will add the attributes from the ZIP data to the data set.

  • Add a Lookup-Transformation.
  • Name this Object Lookup ZIP Attributes.

Create a connection to the ODS and select the object that contains the ZIP codes. In the columns window, create a relationship between ZIP and City and select all attributes starting with ZIP .

Add Data Mining Profiling

Apply Data Mining to your Data Set

Next, we have to apply the data mining transformation to our data.

Add a Data Mining Query to your data flow.

In the input selection window, select lookup match output.

  • The Input should be AnalysisServiceServerInput.

Open the Data Mining Query Transformation.

The connection should point to your aCRM_Lead_Generation project.

Create a new connection.

  • Server: NO MDB
  • Authentification: Use Windows NT Integrated Security
  • Select your mining model from the initial Catalog: #SID# Customer Profiling

Select your mining structure and the mining model.

Next, we have to build a mining query.

  • Open the build query window.

Next, we want to calculate the probability that indicates the percent value that a prospect is likely to buy a bike.

  • Click to an empty source field and select Prediction Function.
  • In the field field, select PredictProbability.
  • For the criteria, drag and drop the Bike Buyer Flag from the mining model to the criteria filed.
  • Set an alias to Probability.
  • Add the Prospective ID to the query.
  • Add the Username to the query.
  • Add the Campagin to the query.

Next, we have to check the mapping between the mining model and the input model.

Click on any line between the models and select Modify Mapping . The mapping should connect

  • Bike Buyer Flag → BikeBuyerFlag
  • Custmer Age Group → Age Group
  • Gender → Gender
  • Customer ID → Prospect ID


  • All ZIP Fields → ZIP fields

Write the Data back into CRM

How to write data back to CRM? A good CRM or ERP system contains retractors (versus extractors). Retractor programs insert or update data in a system. The hide the complexity of a data struction.

Our retractor is called crm_retractor_leads.

It required to following parameter:

  • Your CRM username
  • The prospect ID
  • The BikeBuyerFlag
  • The Probabilty
  • The campaign name

Add an OLEDB command transformation and name it CRM retractor Leads.

In the connection manager window, select your CRM system under “connection manager”

In the Component Properties, set the SQLCommand to

execute crm_retractor_leads ?, ?, ?

The ? is used as a parameter place holder.

In the column mapping, map

  • Probabiltiy → @PROB

Run the package.


Check your data

Login to the CRM system and check your leads.

Open Customer Relationship ManagementSD - Your Leads.

Select ZIP-Codes that start with a 7…

Click on Predict BikeBuyer.

The prediction may run a few seconds. Have a look at the results. The probability column shows the probability that a prospective customer is likely to buy a bike.


bicn01/game_datamining2.txt · Last modified: 2019/08/28 16:21 by peter