Many-to-Many Relationships and Partition Slices
User Rating: / 0
PoorBest 
Written by Chris Webb   
Tuesday, 24 January 2012 23:09

Reposted from Chris Webb's blog with the author's permission.

As you’re probably aware one of the benefits of partitioning a measure group is that it can improve query performance: for example, if you have partitioned your measure group by year and you run a query that only requests data for a particular year, then SSAS should only query the partition that holds the data for the year you’ve requested. There are however several scenarios where SSAS will scan partitions you would not expect it to scan – the ones I knew about up to now are all listed in the “Unexpected Partition Scans” section in this excerpt from “Expert Cube Development” – and I’ve just come across another scenario where this happens.

I was working with a customer the other week that had implemented Dave Fackler’s approach to handling multiple calendars using many-to-many relationships, as detailed in these two blog posts:
http://davefackler.blogspot.com/2008/05/handling-multiple-calendars-with-m2m.html
http://davefackler.blogspot.com/2008/06/handling-multiple-calendars-with-m2m.html

Now there’s absolutely nothing wrong with the design that Dave describes here – it’s a classic application for many-to-many relationships – and there aren’t any better ways of meeting this requirement. However, what I noticed when looking in Profiler at the customer’s cube was that all of the queries I was writing were resulting in reads on all of the partitions in the main measure group, even when I expected only one partition should be hit, and this was slowing the queries down a lot.

You can recreate the same problem in the version of the Adventure Works that Dave posts on his blog. Take the following query that uses the regular Adventure Works Date dimension and does not use the m2m relationship:

select {[Measures].[Reseller Sales Amount]} on 0,
{[Date].[Calendar].[Month].&[2004]&[3]}
on 1
from [Adventure Works]

When run on a cold cache, in Profiler you can see that it only results in the Reseller Sales 2004 partition being hit in the Reseller Sales measure group:

image

This is exactly as you’d expect. However if you request the same month (March 2004) via the Calendar dimension, which is connected to the Reseller Sales measure group via a m2m relationship and the Date dimension, you see all the partitions on Reseller Sales are hit:

select {[Measures].[Reseller Sales Amount]} on 0,
{[Calendar].[Year-Quarter-Month].[Month].&[2000200409]}
on 1
from [Adventure Works]

image

Not so good. But it turns out that this is an unavoidable side-effect of using many-to-many relationships: a filter on a many-to-many dimension (in this case the Calendar dimension) does not result in a filter being applied on the intermediate dimension (the Date dimension) to the measure group (Reseller Sales). Greg Galloway came across this some time ago and filed an issue on Connect which confirms that this is by design.

Given that I doubt this behaviour will be changed any time soon, the takeaway is that when you’re planning your partitioning strategy you should think twice about partitioning using a dimension that is used as an intermediate dimension in a many-to-many relationship. Otherwise you will lose all of the performance benefits of partitioning when your queries use that many-to-many relationship… 


chris-webb

Chris has been working with Microsoft BI tools since he started using beta 3 of OLAP Services back in the late 90s. Since then he has worked with Analysis Services in a number of roles (including three years spent with Microsoft Consulting Services) and he is now an independent consultant specialising in complex MDX, Analysis Services cube design and Analysis Services query performance problems. His company website can be found at http://www.crossjoin.co.uk and his blog can be found at http://cwebbbi.wordpress.com .


 

 

Pyramid Analytics