Hands-On Lab Building Your First Data Mining Model with SQL Server 2008 R2 Analysis Services Lab version: 1.0.0 Last updated: 3/8/2022

Hands On Lab: Building Your First Data Mining …az12722.vo.msecnd.net/sql10r2byfbi-trainingcourse1-0/... · Web viewBasic Development Settings, and then click Start Visual Studio


Hands-On Lab: Building Your First Data Mining Model with SQL Server 2008 R2 Analysis Services

Lab version: 1.0.0

Last updated: 5/12/2023


CONTENTS

OVERVIEW

EXERCISE 1: EMBEDDING DATA MINING RESULTS INTO A CUSTOM APPLICATION

Task 1 – Browsing the Adventure Works Online Shopping Application

Task 2 – Opening the AdventureWorksBI Solution

Task 3 – Creating the Basket Analysis Data Source View

Task 4 – Configuring the Basket Analysis Data Source View

Task 5 – Creating the Basket Analysis Mining Model

Task 6 – Configuring the Basket Analysis Mining Model Algorithm Parameters

Task 7 – Processing the Basket Analysis Mining Model

Task 8 – Viewing the Basket Analysis Mining Model Content

Task 9 – Querying the Basket Analysis Mining Model

Task 10 – Enhancing the Adventure Works Online Shopping Application

Task 11 – Browsing the Enhanced Adventure Works Online Shopping Application

Task 12 – Finishing Up

SUMMARY


Overview

In this lab, you will create a data mining model that uses the Microsoft Association Rules algorithm to identify patterns in the product models that are commonly purchased together. The data mining model will be used to provide relevant purchasing suggestions to online customers.

Note: Before you start this lab, ensure that your machine meets the system requirements listed below, and complete the steps described in the Setup section.

Objectives

The objectives of this exercise are to:

• Create a data source view

• Create a Microsoft Association Rules data mining model

• View the mining model content

• Query the mining model

• Embed the mining model query results into a Web application

System Requirements

You must have installed the following items to complete this lab:

• Microsoft SQL Server 2008 R2:

◦ Database Engine

◦ Analysis Services

◦ SQL Server Business Intelligence Development Studio

• SQL Server AdventureWorks2008 R2 sample databases

◦ AdventureWorks2008R2

◦ AdventureWorksDW2008R2

• Microsoft Visual Studio 2010 SP1

◦ Visual Basic

◦ Visual Web Developer


Setup

All the requisites for this lab are verified using the Configuration Wizard. To make sure that everything is correctly configured, follow these steps.

Note: To perform the setup steps you need to run the scripts in a command window with administrator privileges.

1. Launch the Configuration Wizard for this lab by double-clicking the Dependencies.dep file located under the Source\Setup folder of this lab. Install any prerequisites that are missing (rescanning if necessary) and complete the wizard.

Cleanup

There is no need to clean up if you intend to continue the sequence of labs in this training kit.

1. To restore the original state of the AdventureWorks2008R2 and AdventureWorksDW2008R2 SQL Server databases and remove the Sales Analysis Analysis Services database, execute the Cleanup.cmd script located under the Setup folder in the Source folder of this lab.

Exercises

This Hands-On Lab comprises the following exercise:

1. Embedding Data Mining Results Into a Custom Application

Estimated time to complete this lab: 30 minutes.

Exercise 1: Embedding Data Mining Results Into a Custom Application

In this exercise, you will develop a data mining model that uses the Microsoft Association Rules algorithm to identify rules about models commonly purchased together. This type of data mining is called market basket analysis. The patterns discovered by the data mining model will be used by the Adventure Works Online Shopping Web application to cross-promote models by suggesting relevant models during the shopping cart checkout.
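To make the idea of market basket analysis concrete, the following minimal Python sketch counts how often pairs of models appear in the same order. It is an illustration only: the orders are hypothetical toy data, the function name pair_counts is invented for this example, and the Microsoft Association Rules algorithm does considerably more than count pairs.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy orders; each set holds the models bought in one order.
orders = [
    {"Mountain-200", "HL Mountain Tire", "Mountain Tire Tube"},
    {"Mountain-200", "HL Mountain Tire"},
    {"Touring Tire", "Touring Tire Tube"},
    {"Mountain-200", "Mountain Bottle Cage"},
]


def pair_counts(orders):
    """Count the orders in which each unordered pair of models appears together."""
    counts = Counter()
    for basket in orders:
        # Sorting gives every pair a single canonical ordering.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


counts = pair_counts(orders)
top_pair, top_count = counts.most_common(1)[0]
print(top_pair, top_count)
```

In this toy data, the Mountain-200 and the HL Mountain Tire are bought together in two of the four orders, which makes them the most frequent pair.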

Task 1 – Browsing the Adventure Works Online Shopping Application

In this task, you will explore the Adventure Works Online Shopping Web application to understand how it presently delivers suggestions during check out.

1. Open Visual Studio 2010 from Start | All Programs | Microsoft Visual Studio 2010 | Microsoft Visual Studio 2010.

2. If prompted to choose default environment settings (required the first time Visual Studio is launched), select Visual Basic Development Settings, and then click Start Visual Studio.

3. To open the AWOnlineShopping solution, on the File menu, select Open | Project/Solution.

4. In the Open Project window, navigate to the Ex1-EmbeddingDataMining\Begin\AWOnlineShopping folder located in the Source folder for this lab, select the AWOnlineShopping.sln file, and then click Open.

5. On the Debug menu, select Start Without Debugging.

6. When the Internet Explorer window opens, if required, maximize the window.

7. On the menu (located on the left), select Catalog by Category.

Figure 1: Selecting the menu item

8. On the Catalog by Category page, in the Product list, click the Mountain-200 Black, 38 link.

9. On the Product Details page, click Add to Shopping Cart.

10. On the Shopping Cart page, notice the three suggestions at the bottom of the page.


11. Click the Display Database Command label, and then review the database command.

Figure 2: Reviewing the database command

Note: These suggestions were retrieved by a relational database stored procedure. They represent a static collection of suggestions, and as such they do not take into consideration items already added to the shopping cart. Clearly, the suggestion to purchase a Mountain-200 is no longer relevant.

12. Close the Internet Explorer window.

13. Leave Visual Studio open.

Task 2 – Opening the AdventureWorksBI Solution

In this task, you will open an existing solution that consists of the completed labs in this training course. You will then configure the deployment properties for the Sales Analysis Analysis Services project. In this exercise, you will be extending this project to include a new data source view and data mining structure.

1. Open SQL Server Business Intelligence Development Studio from Start | All Programs | Microsoft SQL Server 2008 R2 | SQL Server Business Intelligence Development Studio.

2. To open the AdventureWorksBI solution, on the File menu, select Open | Project/Solution.

3. In the Open Project window, navigate to the Ex1-EmbeddingDataMining\Begin\AdventureWorksBI folder located in the Source folder for this lab, select the AdventureWorksBI.sln file, and then click Open.

Note: This solution consists of all completed labs that precede this lab in the training course.

4. In Solution Explorer, if necessary, collapse the Populate DW and Sales Reports projects.

5. In Solution Explorer, right-click the Sales Analysis project, and then select Properties.


6. In the Sales Analysis Property Pages window, select the Deployment page, set the Server property to <servername>, and then click OK.

Note: You will need to replace <servername> with the name of the machine that hosts Analysis Services.

7. To save the solution, on the File menu, select Save All.

Task 3 – Creating the Basket Analysis Data Source View

In this task, you will create the Basket Analysis data source view. The data source view will be the foundation upon which the data mining model in this exercise will be developed.

1. In Solution Explorer, expand the Sales Analysis project, right-click the Data Source Views folder, and then select New Data Source View.

2. In the Data Source View Wizard, read the welcome message, and then click Next.

3. In the Select a Data Source step, notice that the Adventure Works DW2008R2 data source is selected, and then click Next.

4. In the Select Tables and Views step, in the Available Objects list, scroll to the bottom of the list.

5. While pressing the Control key, select the v2008Order and v2008OrderLine views.

6. Click the arrow to add the selected views to the Included Objects list.

Figure 3: Adding the views to the Included Objects list

7. Click Next.

8. In the Completing the Wizard step, in the Name box, replace the text with Basket Analysis, and then click Finish.

9. When the wizard completes, in Solution Explorer, notice the addition of the Basket Analysis data source view, and that the data source view designer opens automatically.

10. To save the solution, on the File menu, select Save All.

Task 4 – Configuring the Basket Analysis Data Source View

In this task, you will refine the design of the data source view. This will involve providing friendly names for each of the data source view tables, defining a logical primary key and establishing a relationship between the tables.

1. To rename the tables, in the data source view designer, in the Tables pane (located in the bottom left corner), select the v2008Order table, and then in the Properties window, modify the FriendlyName property to Order.

Note: If the Properties window is not visible, on the View menu, select Properties Window.

2. Repeat the last step for the v2008OrderLine table, and modify the FriendlyName property to Basket.

Note: The purpose of this step is to create a user-friendly data model. It is important to configure friendly names at the data source view level so that they are consistently inherited by the objects (cubes, dimensions, and mining models) created upon this view.

3. To define the primary key in the Order table, in the Order table, right-click the OrderNumber column, and then select Set Logical Primary Key.

4. To establish a relationship between the Basket table and the Order table, in the Basket table, drag the OrderNumber column on top of the OrderNumber column in the Order table.

Figure 4: Establishing the relationship between the tables

5. To arrange the tables, right-click in a blank area of the diagram, and then select Arrange Tables.

6. To explore the data in the Basket table, in the Tables pane (or the diagram), right-click the Basket table, and then select Explore Data.

7. In the explorer window, notice that many orders include multiple models.

Note: The data mining model that you will develop in this exercise will describe the relationships between models purchased together (in the same order).

8. To close the explorer window, on the File menu, select Close.

9. To close the data source view designer, on the File menu, select Close.

10. On the File menu, click Save All.
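The relationship defined in this task links each Basket row to its Order through OrderNumber, so that all the models in one order can be treated as a single basket. The following Python sketch illustrates that grouping under stated assumptions: the rows are hypothetical, and the Model column name is assumed for illustration (the lab shows the actual columns only in its figures).

```python
from collections import defaultdict

# Hypothetical rows shaped like the Basket table: (OrderNumber, Model).
# The Model column name and all values are assumptions for illustration.
order_lines = [
    ("SO51176", "Mountain-200"),
    ("SO51176", "HL Mountain Tire"),
    ("SO51177", "Touring Tire"),
    ("SO51177", "Touring Tire Tube"),
    ("SO51178", "Mountain-200"),
]


def group_into_baskets(lines):
    """Group order lines by order number, giving one set of models per order."""
    baskets = defaultdict(set)
    for order_number, model in lines:
        baskets[order_number].add(model)
    return dict(baskets)


baskets = group_into_baskets(order_lines)
for order_number, models in sorted(baskets.items()):
    print(order_number, sorted(models))
```

Each key plays the role of a row in the Order table, and each set plays the role of the related Basket rows.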

Task 5 – Creating the Basket Analysis Mining Model

In this task, you will use the Data Mining Wizard to create the BasketAnalysis_AR mining model.

1. In Solution Explorer, in the Sales Analysis project, right-click the Mining Structures folder, and then select New Mining Structure.

2. In the Data Mining Wizard, read the welcome message, and then click Next.

3. In the Select the Definition Method step, notice the default selection, and then click Next.

4. In the Create the Data Mining Structure step, in the dropdown list, select the Microsoft Association Rules data mining algorithm, and then click Next.

5. In the Select Data Source View step, in the Available Data Source Views list, select the Basket Analysis data source view, and then click Next.

6. In the Specify Table Types step, specify the table types as shown, and then click Next.

Figure 5: Specifying the table types

7. In the Specify the Training Data step, specify the columns to use in the mining model as shown, and then click Next.

Figure 6: Specifying the training data


8. In the Specify Columns' Content and Data Type step, click Next.

9. In the Create Testing Set step, reduce the Percentage of Data for Testing value to 0, and then click Next.

Note: It is very important that you follow the lab instructions precisely, particularly when naming objects. This lab includes code that expects the objects to have been named correctly.

10. In the Completing the Wizard step, in the Mining Structure Name box, replace the text with BasketAnalysis, and in the Mining Model Name box, replace the text with BasketAnalysis_AR.

11. Click Finish.

12. When the wizard completes, in Solution Explorer, notice the addition of the BasketAnalysis mining structure, and that the mining structure designer opens automatically.

13. On the File menu, click Save All.

Task 6 – Configuring the Basket Analysis Mining Model Algorithm Parameters

In this task, you will configure the Basket Analysis mining model algorithm parameters.

1. In the mining structure designer, select the Mining Models tab.

2. Right-click the BasketAnalysis_AR model, and then select Set Algorithm Parameters.

Figure 7: Opening the Algorithm Parameters window

3. In the Algorithm Parameters window, configure the Value property for the MINIMUM_PROBABILITY and MINIMUM_SUPPORT parameters as shown.

Figure 8: Configuring the algorithm parameters

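Note: The two parameters configured here set the thresholds that the algorithm applies when the mining model is processed. MINIMUM_SUPPORT sets the minimum number of cases that must contain an itemset before the itemset is considered, and MINIMUM_PROBABILITY sets the minimum probability that a rule must have to be included in the model.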

4. Click OK.
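The effect of the two thresholds can be pictured as a filter over candidate rules. The following Python sketch is a simplification under stated assumptions: the candidate rules and the threshold values are hypothetical (the values used in the lab appear only in the figure), and the function name filter_rules is invented for this example.

```python
# Hypothetical candidate rules:
# (left-hand side, right-hand side, support as an order count, probability).
candidate_rules = [
    ("Touring Tire Tube", "Touring Tire", 120, 0.54),
    ("Mountain-200", "HL Mountain Tire", 45, 0.12),
    ("Road-150", "Water Bottle", 3, 0.60),
]


def filter_rules(rules, minimum_support, minimum_probability):
    """Keep only the rules that meet both thresholds."""
    return [
        rule
        for rule in rules
        if rule[2] >= minimum_support and rule[3] >= minimum_probability
    ]


# Threshold values chosen for illustration only.
kept = filter_rules(candidate_rules, minimum_support=10, minimum_probability=0.1)
for lhs, rhs, support, probability in kept:
    print(f"{lhs} -> {rhs} (support={support}, probability={probability})")
```

Lowering either threshold admits more rules into the model, and raising it admits fewer. In this sketch the third rule is dropped despite its high probability because too few orders support it.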

Task 7 – Processing the Basket Analysis Mining Model

In this task, you will process the Basket Analysis mining model. Once processed, the mining model will contain the patterns and statistics that describe the relationships between frequently purchased models.

1. In Solution Explorer, inside the Sales Analysis project, right-click the BasketAnalysis mining structure, and then select Process.

2. If prompted to build and deploy the project, click Yes.

3. If prompted to overwrite the database, click Yes.

4. In the Process Mining Structure window, click Run.

Note: The deployment process creates and processes the mining structure. At this time, the data is retrieved from the data source, and the Microsoft Association Rules algorithm correlates and identifies frequent relationships across attribute values, which in this case are product models.


5. When processing completes, in the Process Progress window, click Close.

6. In the Process Mining Structure window, click Close.

Task 8 – Viewing the Basket Analysis Mining Model Content

In this task, you will use three mining model viewers to explore and understand the model content.

1. In the mining structure designer, select the Mining Model Viewer tab.

2. If prompted to build and deploy the project, click No.

3. In the Show dropdown list, select Show Attribute Name Only.

Figure 9: Configuring the attribute properties to show

4. To sort the rules in descending order of importance, click the Importance header twice.

Figure 10: Sorting the Importance in descending order

5. Review the most important rules (located at the top of the list).

Note: The first rule, Touring Tire Tube -> Touring Tire, reads “there is a 54.3% probability that the purchase of a Touring Tire Tube will result in the purchase of a Touring Tire.”

6. Scroll to the bottom of the list to find rules with negative importance. The purchase of these combinations is highly unlikely; in fact, the purchase of one discourages the purchase of the other.

7. In the Mining Model Viewer tab, select the Itemsets tab.

Figure 11: Selecting the Itemsets tab

8. In the Show dropdown list, select Show Attribute Name Only.

9. Increase the Minimum Itemset Size value to 3.

10. Review the frequent itemsets that contain three models.

Note: The Support column represents the number of orders that included these three models.

11. Notice that the most frequent itemset that contains three models includes the Mountain-200 model. You will explore this model visually in the remaining steps of this task.

12. To view the item dependencies, select the Dependency Network tab.

13. In the viewer, on its toolbar, in the Show dropdown list, select Show Attribute Name Only.

14. To locate the Mountain-200 model, in the viewer, click the Find Node toolbar button.

Figure 12: Locating the Find Node toolbar button

15. In the Find Node window, select the Basket(Mountain-200) = Existing node, and then click OK.

16. To zoom in, in the viewer, click the Zoom In toolbar button until you can adequately see the selected node and its related nodes.

Figure 13: Locating the Zoom In toolbar button

Figure 14: Exploring the Mountain-200 and related nodes

Note: Each line (technically, an edge) represents a pairwise association. The slider (located on the left) is associated with the importance score.

17. Gradually drag the slider down to highlight the stronger edges by filtering out the weaker edges. Stop when you see only the selected node, Mountain-200, and the one node that the selected node predicts.

Figure 15: Exploring the Mountain-200 and strongest related node

Note: The legend at the bottom of the viewer describes the node colors. Initially, in this scenario, all nodes predict both ways (for example, the sale of a Mountain-200 often results in the sale of a Mountain Bottle Cage, and vice versa). As you filter out the weaker edges, notice that ultimately it is the Mountain-200 that is more likely to result in a purchase of the HL Mountain Tire.
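The probability and importance scores seen in this task can be approximated by hand on a small data set. The following Python sketch assumes that rule importance is the base-10 logarithm of the ratio between the probability of the right-hand side given the left-hand side and its probability when the left-hand side is absent. That definition is an assumption based on commonly published descriptions of the algorithm, not something this lab states. The orders are hypothetical, and the function name rule_stats is invented for this example.

```python
import math

# Hypothetical toy orders; each set holds the models bought in one order.
orders = [
    {"Touring Tire Tube", "Touring Tire"},
    {"Touring Tire Tube", "Touring Tire"},
    {"Touring Tire Tube"},
    {"Touring Tire", "Water Bottle"},
    {"Water Bottle"},
    {"Water Bottle"},
    {"Water Bottle", "Touring Tire Tube"},
    {"Mountain-200"},
]


def rule_stats(orders, lhs, rhs):
    """Return (probability, importance) for the rule lhs -> rhs.

    Assumes importance = log10(P(rhs | lhs) / P(rhs | not lhs)) and that
    both probabilities are nonzero for the chosen pair.
    """
    with_lhs = [order for order in orders if lhs in order]
    without_lhs = [order for order in orders if lhs not in order]
    p_given_lhs = sum(rhs in order for order in with_lhs) / len(with_lhs)
    p_given_not_lhs = sum(rhs in order for order in without_lhs) / len(without_lhs)
    return p_given_lhs, math.log10(p_given_lhs / p_given_not_lhs)


probability, importance = rule_stats(orders, "Touring Tire Tube", "Touring Tire")
neg_probability, neg_importance = rule_stats(orders, "Water Bottle", "Touring Tire Tube")
print(round(probability, 2), round(importance, 3))
print(round(neg_probability, 2), round(neg_importance, 3))
```

Under this assumed definition, a positive score means the left-hand side makes the right-hand side more likely, and a negative score means it makes it less likely, which matches the reading of negative importance given in this task.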

Task 9 – Querying the Basket Analysis Mining Model

In this task, you will create two singleton queries to test the model predictions.

1. In the mining structure designer, select the Mining Model Prediction tab.

2. Right-click inside the Select Input Table(s) window, and then select Singleton Query.

Figure 16: Configuring a singleton query

Note: A singleton query enables the input of data expressed in the query rather than sourced from an external dataset.

3. Click inside the Value box to reveal an ellipsis, and then click the ellipsis.

Figure 17: Locating and clicking the ellipsis

4. In the Nested Table Input window, in the Key Column list, select the Mountain-200 model, and then click Add.

5. Click OK.

6. In the query designer, in the query grid, in the Source column dropdown list, select Prediction Function.

Figure 18: Selecting the Prediction Function source

7. In the corresponding Field column dropdown list, select PredictAssociation.

8. From the Mining Model window, drag Basket into the corresponding Criteria/Argument column.

Figure 19: Dragging Basket to the Criteria/Argument column

9. To query the three most likely models associated with the Mountain-200 model, in the Criteria/Argument column, append a comma and the number 3 at the end to create the following argument.

DMX

[BasketAnalysis_AR].[Basket], 3

10. The query should look like the following.

Figure 20: Reviewing the query

11. On the mining model prediction toolbar, toggle to Query.

Figure 21: Toggling to the Query view

Note: The query designer displays the DMX statement. This statement is requesting the three most likely models based on a basket consisting of only Mountain-200.

12. To execute the query, on the mining model prediction toolbar, toggle to Result.

13. Expand Expression to reveal the three predicted models.

Figure 22: Reviewing the query result

14. On the mining model prediction toolbar, toggle to Design.


15. To add another model to the basket, on the Singleton Query Input window, click inside the Value box, and then click the ellipsis.

16. In the Nested Table Input window, in the Key Column list, select the HL Mountain Tire model, and then click Add.

17. Click OK.

18. On the mining model prediction toolbar, toggle to Query.

19. Notice the addition of the HL Mountain Tire.

20. Modify the SELECT line to read SELECT FLATTENED.

Note: The FLATTENED keyword produces a flattened result that an application can consume easily. Note, however, that once you modify the query created by the graphical designer, you lose the designer's graphical support.

21. On the mining model prediction toolbar, toggle to Result.

22. Review the query result that now requests the three most likely models based on a basket consisting of the Mountain-200 and the HL Mountain Tire models.
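The singleton queries built in this task follow a regular pattern, so the statement text can be generated for any basket. The following Python sketch assembles such a DMX statement as a string. It is a sketch under stated assumptions: the nested key column name Model is assumed (the lab shows the generated statement only in a figure), the function name build_prediction_query is invented for this example, and the code only builds text without connecting to Analysis Services.

```python
def build_prediction_query(models, count=3, mining_model="BasketAnalysis_AR",
                           nested_table="Basket", key_column="Model"):
    """Build a flattened DMX singleton query for the given basket of models.

    key_column is an assumed name for the nested table's key column.
    """
    if not models:
        raise ValueError("the basket must contain at least one model")
    # One SELECT per basket item; single quotes are doubled to escape them.
    rows = [
        "SELECT '{}' AS [{}]".format(model.replace("'", "''"), key_column)
        for model in models
    ]
    nested = " UNION ".join(rows)
    return (
        "SELECT FLATTENED\n"
        f"  PredictAssociation([{mining_model}].[{nested_table}], {count})\n"
        f"FROM [{mining_model}]\n"
        "NATURAL PREDICTION JOIN\n"
        f"(SELECT ({nested}) AS [{nested_table}]) AS t"
    )


query = build_prediction_query(["Mountain-200", "HL Mountain Tire"])
print(query)
```

Compare the printed text with the statement shown in the Query view. If the designer uses a different name for the nested key column, pass that name as key_column.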

Task 10 – Enhancing the Adventure Works Online Shopping Application

In this task, you will modify the AWOnlineShopping Web application to deliver relevant model suggestions using the BasketAnalysis_AR data mining model.

1. Switch to Visual Studio 2010.

2. In Solution Explorer, right-click the Web.config file, and then select Open.

3. On the Edit menu, select Find and Replace | Quick Replace.

4. In the Find and Replace window, in the Find What dropdown list, enter SERVERNAME.

5. In the Replace With dropdown list, enter the name of the machine that hosts Analysis Services.

6. Click Replace All.

7. Verify that one replacement occurred, and then click OK.

8. Close the Find and Replace window.

9. On the File menu, select Save All.

10. To close the Web.config file, on the File menu, select Close.

11. In Solution Explorer, right-click the ShoppingCart.aspx item, and then select View Code.

12. Press Control+G, then in the Go to Line window, enter 90, and then click OK.


13. In the code window, review the GetDataMiningSuggestions function and the associated comments.

Note: It is not necessary to understand the details of this code. This code is responsible for dynamically building a DMX statement similar to the one you created in the previous task.

14. Press Control+G, then in the Go to Line window, enter 67, and then click OK.

15. Replace this line with the following line.

Visual Basic

dr = GetDataMiningSuggestions()

Note: This modification will retrieve suggested models predicted by the BasketAnalysis_AR data mining model.

16. On the File menu, click Save All.

Task 11 – Browsing the Enhanced Adventure Works Online Shopping Application

In this task, you will browse the enhanced AWOnlineShopping Web application.

1. On the Debug menu, select Start Without Debugging.

2. When the Internet Explorer window opens, if necessary, maximize the window.

3. On the menu, select Shopping Cart.

4. On the Shopping Cart page, notice that the three suggestions have changed according to associations with the items added to the shopping cart.

5. Click the Display Database Command label, and then review the database command.

6. From the suggestions list, click the HL Mountain Tire link.

7. On the Catalog by Model page, click the HL Mountain Tire link.

8. On the Product Details page, click Add to Shopping Cart.

9. On the Shopping Cart page, notice that the three suggestions have been revised.

10. Click the Display Database Command label, and then review the database command.

Task 12 – Finishing Up

In this task, you will finish up by closing all applications.

1. Close the Internet Explorer window.


2. In Visual Studio, on the File menu, select Exit.

3. In SQL Server Business Intelligence Development Studio, on the File menu, select Exit.

Summary

In this lab, you have created a data mining model that uses the Microsoft Association Rules algorithm to identify patterns about models commonly purchased together. The patterns discovered by the data mining model were used to enhance the customer experience while shopping online.