TopCounts With Ties In MDX
User Rating: / 0
PoorBest 
Written by Chris Webb   
Monday, 11 February 2013 20:43

Reposted from Chris Webb's blog with the author's permission.

Pretty much everyone that knows MDX knows the TopCount() function, which is used to find the top n items in a set - it can be used for finding your top 10 products, your top 20 salespeople, and so on. However most people don't consider it's biggest drawback: it always returns n items, and doesn't take ties into account.

Consider the following query on Adventure Works, which returns the top 9 dates by Internet Order Count:

SELECT {[Measures].[Internet Order Count]} ON 0,
TOPCOUNT(
[Date].[Date].[Date].MEMBERS
, 9
,[Measures].[Internet Order Count])
ON 1
FROM [Adventure Works]

image

Now look at this query, which does the same thing but returns the top 10 dates:

SELECT {[Measures].[Internet Order Count]} ON 0,
TOPCOUNT(
[Date].[Date].[Date].MEMBERS
, 10
,[Measures].[Internet Order Count])
ON 1
FROM [Adventure Works]

image

Notice how June 11 2004 and June 17 2004 both have the same Internet Order Count of 86, but the latter date isn't included in the first query. From this you can see that it's important, when you're doing a top n query, not to always return n items but to take tied values into account.

Luckily there's a way of doing this. If you can download and install the dll from the Analysis Services Stored Procedure Project, you can use the TopCountWithTies() function that it provides, like so:

SELECT {[Measures].[Internet Order Count]} ON 0,
ASSP.TOPCOUNTWITHTIES(
[Date].[Date].[Date].MEMBERS
, 9
,[Measures].[Internet Order Count])
ON 1
FROM [Adventure Works]

image

Unfortunately, installing third-party dlls is not always allowed in a production environment, and SSAS 2012 Tabular doesn't support dlls at all. However there is a way of getting the same result in pure MDX. Here's an example:

WITH
SET TOP9 AS
TOPCOUNT(
[Date].[Date].[Date].MEMBERS
, 9
,[Measures].[Internet Order Count])
MEMBER MEASURES.TIEDRANK AS
RANK(
[Date].[Date].CURRENTMEMBER
, TOP9
, [Measures].[Internet Order Count])
SET TOP9WITHTIES AS
ORDER(
FILTER(
[Date].[Date].[Date].MEMBERS
, MEASURES.TIEDRANK>0 AND MEASURES.TIEDRANK<=9)
, MEASURES.TIEDRANK
, BASC)

SELECT {[Measures].[Internet Order Count], MEASURES.TIEDRANK} ON 0,
TOP9WITHTIES
ON 1
FROM [Adventure Works]

image

What I'm doing here is first using the TopCount() function to find the set of the top 9 dates (this step isn't strictly necessary, but it has a significant positive impact on performance), and then using the Rank() function with the third parameter to find the tied rank. The Rank() function should find the position of a member in a set, but when the third parameter for it is specified it doesn't match members based on the members themselves but on the values of the tuple specified in the third parameter. Hence, even when I have  a date that doesn't appear in the set TOP9, the Rank() function can still return a value for it if that date has a value for Internet Order Count that does appear in that set.


chris-webb

Chris has been working with Microsoft BI tools since he started using beta 3 of OLAP Services back in the late 90s. Since then he has worked with Analysis Services in a number of roles (including three years spent with Microsoft Consulting Services) and he is now an independent consultant specialising in complex MDX, Analysis Services cube design and Analysis Services query performance problems. His company website can be found at http://www.crossjoin.co.uk and his blog can be found at http://cwebbbi.wordpress.com/ .


 

Tags: mdx
 

Pyramid Analytics