User Tools

Site Tools


bia_dm:dm09

SQL Server Data Mining Add-Ins for Office

The Data Mining Client for Excel brings the data mining applications of SQL Server Analysis Services to the desktop. You can work with the same data mining algorithms, structures, and viewers that are available in an instance of SQL Server Analysis Services, but you can store source or test data in Excel tables. By using the Data Mining Client, you can analyze complex data sets and make predictions as easily as opening a table in Excel. Moreover, because your data and results can be stored in Excel, you can quickly leverage and present patterns discovered by the model.

Overview

The easiest way to install the add-ins is to use the Getting Started wizard, which walks you through the process of setting up a database and configuring it to enable data mining.

The Data Mining Client for Excel maintains an active connection to the server, so you can identify patterns in data stored in Excel data tables, and then save your data mining model to the server if you wish, to use in further testing or prediction. You can also apply the data in Excel to existing data mining models, and then reprocess the model to improve its accuracy, or try different mining models on the same data to develop more in-depth analyses.

After you have created a mining model, you can assess its effectiveness by using standard methods such as a lift chart or a classification matrix, or you can measure the return on profit for a data mining model that makes predictions.

The Data Mining Client for Excel also provides management tools that let you create, rename, delete or reprocess data mining models and structures that are stored either on the server or in temporary session files.

If the Data Mining Add-In does not appear in the ribbon, go to the menu, click on File | Options | Add-Ins. At the bottom ofthe View and manage Microsoft Office Add-ins window, select Disabled Items and click on Go. Mark the SQLServer.DMXLAddIn and click on Enable. You have to restart Excel 2010.

Create a Connection

Click the Connection button to select an existing connection, or to create a new connection to an instance of SQL Server 2008 Analysis on the Analyze or Data Mining ribbon.

To create a new connection to Analysis Services

  • Click the Connection button.

  • In the Analysis Services Connections dialog box, click New.
  • In the Connect to Analysis Services dialog box, type the analysis server name.
  • Specify the authentication method: Use Windows Authorisation.
  • Specify the catalog, or database, where you have stored your data mining models: Sales Mailing Campaign.
  • Type a friendly name for the new connection, for example Sales Mailing Campaign.
  • Click Test Connection to verify that the server is available.

Model Usage

The Model Usage and Management groups provide wizards to help you browse, modify, and manage existing mining models that are stored on an instance of SQL Server 2008 Analysis Services. If you have the necessary permissions, you can delete, modify, rename, or process existing mining models and structures without leaving Excel.

As part of their advertising campaign, the marketing department has decided that they would also like to send a mailer with a special offering for new and cool bike accessories to some high potential customers who purchased bikes from Adventure-Bikes in the past.

To use the Browse Model wizard

  • Click the Data Mining tab.
  • In the Model Usage group, click Browse.
  • In the Select Model dialog box, choose a mining model from the list. We use STM-Decision-Tree -above-45.
  • Click Next.

The wizard opens a Browse window that is appropriate for the type of model that you selected.

Depending on the data mining algorithm that you used when you created the model, the Browse window may include graphs to help interpret the results, legends that contain additional detail, and controls for interacting with the data.

  • Change the Background value to yes in order to get only those customers who have bought a bike (1).
  • Set the Show Level to 2 in order to see the high level potential users (2).
  • Right click the node with the highest probability to buy a bike and Select Drill Through Structure Column to open the Drill Through window with the all customer data in a new Excel spreadsheet.

You may use the data for the mailing processing.

Query a Model

The Query wizard helps you interact with existing mining models to make predictions based on data in an Excel table, an Excel range, or another data source. You can apply the patterns contained in the existing mining model to input data. First, you have to choose the model from the list of models available on the server, then choose the data to be evaluated and finally define the mappings between the input data columns and the model columns.

  • Click on Query in the Model Usage tab of the Data Mining Ribbon.
  • The query wizard opens. Click on Next.

The wizard presents a list of data mining models on the server to which you are connected.

  • Select In the Select Model dialog box, choose the STM-Decision-Tree data mining model from the list.

You have now selected an algorithm that will be applied to a dataset.

  • Click Next.

Then, you must specify a source for the dataset to use as input. You can use data in an existing Excel table or range, or you can specify a SQL statement to retrieve the data.

Susan wants to use the actual customer dataset.
  • Click on External Data Source.

Next, you have to create a query to select a table available in the current data source and select the columns to include in your dataset.

  • Click the Open the Data Source Query Editor button (1).

  • In the Data Source Query Editor, select AdventureBikes Sales Data Mart as the server data source.
  • Select DataSet Actual Customers from the available table and columns and add all columns to the query.
  • Click OK.

An SQL-query will be generated. You can see the query in the lower pane of the query wizard (1).

Depending on the type of model that you choose, you must now specify the column that contains the data to be evaluated, and define mappings between the input data columns and the model columns. Depending on the algorithm you choose, you can set other properties on the model.

  • Click Next.

Note, that the table column Age Group, which is a column of the Mining Model has no corresponding Table Column because the Age Group was introduced as a calculated field in the data source.

We have two choices:

  1. Ignore this column. Then, the attribute Age Group will not be used in the data mining process.
  2. Add a Age Group column. To do so, you have to change the SQL query statement in the query panel.

In order to add a new column,

  • click on back to return to the Select Source Data pane of the Select Source Data window and add the following code before the FROM statement to your SQL query:
, CASE
WHEN [Customer Age] <= 25 THEN 'Age under 25'
WHEN [Customer Age] BETWEEN 26 and 35 THEN 'Age between 26 and 35'
WHEN [Customer Age] BETWEEN 36 and 45 THEN 'Age between 36 and 45'
WHEN [Customer Age] BETWEEN 46 and 55 THEN 'Age between 46 and 55'
WHEN [Customer Age] BETWEEN 56 and 65 THEN 'Age between 56 and 65'
WHEN [Customer Age] > 65 THEN 'Age above 65'
END as [Age Group]

After we have added the new column Age Group to our data source, the relationships between data model columns and the input table columns should look like this:

  • Click Next.

Finally, the wizard also gives you the ability to choose one or more predictions, and specify an output column in which to store the results.

  • In the Choose Output window, open the Add Output window.
  • Rename Output 1 to Bike Buyer Probability and apply the PredictProbability function with the Function Parameter “yes” to the input column Bike Buyer Flag.

  • Click OK.
  • Open the Add Output window again, and click on the input column Customer First Name in order to get the customer name to the output data set.
  • Click OK.

Repeat the above step to choose some more output columns:

  • Customer City ZIP
  • Customer City
  • Customer Address
  • Click on Next and get the data-set to a new spreadsheet.
  • Click on Finish.

You can now use the data-set for further processing.

bia_dm/dm09.txt · Last modified: 2018/12/04 08:38 (external edit)