Association rules viewers
Reposted from Thomas Ivarsson's blog with the author's permission.
Microsoft Association Rules is the data mining algorithm in Analysis Services that is recommended for market basket analysis, that is, for finding out which products customers buy together so that you can make recommendations to other customers. A lot of material exists on the web about how to set up this algorithm in BIDS, and there is also a data mining tutorial for Association Rules in Books Online. The Adventure Works cube project has this algorithm ready to look at. My examples are from SSAS 2008 R2 but also apply to SSAS 2005 and SSAS 2008.
My post is about how to use the visualization tools that are part of BIDS 2005 and later to analyze the results.
I would recommend that beginners start with the dependency network tool, which is the last tab of the mining model viewer in BIDS. To see it, click on the Market Basket mining structure in the Solution Explorer window in BIDS. What you will see are the most important rules for products that were bought together by customers. The dependency network graph below is related to the Rules tab in the mining model viewer. The Itemsets tab has no such connection with the dependency network tool.
Let us use the link strength slider on the left side of the dependency network tool to show only the three strongest links. A link and a rule are the same thing here, and each rule or link shows products that are bought together.
The strongest rule is between the Touring Tire Tube and the Touring Tire products above.
The second most important rule above is between the water bottle and the road bottle cage products.
The third most important rule below is between the water bottle and the mountain bottle cage products.
If you move from the dependency network tab to the Rules tab, you will see these three rules at the top when you sort by the importance indicator (the bar) on that tab.
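The higher the importance of a rule, the more likely it is that the products will be bought together. The probability value to the left is less useful when you want to detect these relationships, because a rule can have a high probability simply because the product on its right-hand side is popular with all customers.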
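The difference between probability and importance can be sketched in a few lines of Python. This is an illustration only, not the Analysis Services implementation: the baskets are made-up toy data, the function names are hypothetical, and the importance formula used here, log10 of P(B | A) divided by P(B | not A), is an assumption based on how the importance score of a rule is commonly described for this algorithm.

```python
import math

# Made-up toy baskets for illustration (not the Adventure Works data).
BASKETS = [
    {"Water Bottle", "Helmet"},
    {"Water Bottle", "Helmet"},
    {"Water Bottle", "Helmet"},
    {"Water Bottle"},
    {"Water Bottle"},
    {"Water Bottle"},
    {"Water Bottle", "Touring Tire Tube"},
    {"Water Bottle", "Touring Tire", "Touring Tire Tube"},
    {"Touring Tire", "Touring Tire Tube"},
    {"Touring Tire"},
]


def conditional_share(rhs, baskets):
    """Share of the given baskets that contain the product rhs."""
    if not baskets:
        raise ValueError("no baskets to evaluate")
    return sum(rhs in basket for basket in baskets) / len(baskets)


def probability(lhs, rhs, baskets=BASKETS):
    """P(rhs | lhs): how often rhs appears in baskets that contain lhs."""
    return conditional_share(rhs, [b for b in baskets if lhs in b])


def importance(lhs, rhs, baskets=BASKETS):
    """Assumed importance score: log10(P(rhs | lhs) / P(rhs | not lhs))."""
    p_with = conditional_share(rhs, [b for b in baskets if lhs in b])
    p_without = conditional_share(rhs, [b for b in baskets if lhs not in b])
    if p_without == 0:
        # rhs never appears without lhs: treat the rule as unboundedly important.
        return math.inf
    return math.log10(p_with / p_without)


for lhs, rhs in [("Helmet", "Water Bottle"), ("Touring Tire", "Touring Tire Tube")]:
    print(f"{lhs} -> {rhs}: probability={probability(lhs, rhs):.2f}, "
          f"importance={importance(lhs, rhs):.2f}")
```

In this toy data the rule Helmet -> Water Bottle has the higher probability, but only because almost everybody buys a water bottle. The rule Touring Tire -> Touring Tire Tube has a lower probability and a clearly higher importance, because tubes are rarely bought without a tire.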
On the Itemsets tab you will see how frequently products occur together, but there is no rule stating that if you buy product A you will also buy product B. You will only see that products A and B occur together. It is more or less a tab for checking that your minimum support requirement is working. That requirement is the same as stating that products A and B must occur together with a minimum frequency, such as 10 times or more in the data set. You can actually see below that the itemsets with the largest support are not the most important rules.
The statistical theory behind this would take many blog posts to explain, so I refer you to Books Online for more information.
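As a rough sketch of what a minimum support requirement means, the Python below counts how often each pair of products occurs in the same basket and keeps only the pairs that reach a threshold. The baskets are made-up toy data and `frequent_pairs` is a hypothetical helper written for this illustration; it is not how Analysis Services computes itemsets internally.

```python
from collections import Counter
from itertools import combinations

# Made-up toy baskets for illustration (not the Adventure Works data).
BASKETS = [
    {"Water Bottle", "Helmet"},
    {"Water Bottle", "Helmet"},
    {"Water Bottle", "Helmet"},
    {"Water Bottle"},
    {"Water Bottle"},
    {"Water Bottle"},
    {"Water Bottle", "Touring Tire Tube"},
    {"Water Bottle", "Touring Tire", "Touring Tire Tube"},
    {"Touring Tire", "Touring Tire Tube"},
    {"Touring Tire"},
]


def frequent_pairs(baskets, min_support):
    """Return {pair: support} for product pairs seen in at least min_support baskets."""
    counts = Counter()
    for basket in baskets:
        # Sort the names so each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Print the surviving itemsets, largest support first.
for pair, support in sorted(frequent_pairs(BASKETS, 2).items(), key=lambda kv: -kv[1]):
    print(support, pair)
```

An itemset here is just an unordered pair with a count. It says nothing about direction (A leads to B) or about whether the pair occurs more often than the popularity of each product alone would explain, which is why a large support does not by itself make an important rule.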
Thomas Ivarsson has been working with the MS BI platform since SQL Server 7 in 1999. Presently he is working in the telecom industry in Sweden, with a data warehouse based on SQL Server 2005. From 1999 to 2007 he worked as a consultant, also on the three SQL Server BI platforms. In recent years he has spent most of his time on SSAS, Reporting Services, ProClarity and Performance Point. He also has several years' experience of the ETL process with DTS and SSIS. During 2008 and 2009 he has been working on introducing data mining in his daily business to see patterns in the behaviour of a service network. His blog can be found here: http://thomasianalytics.spaces.live.com
Tags: data mining