Report Portal

Last Ever Non Empty - a new, fast MDX approach

Reposted from Chris Webb's blog with the author's permission.

The last non empty semi-additive measure aggregation functionality in SSAS enterprise edition is very useful, but it doesn’t support one common business requirement: while it will give you the last non empty value within any given time period, it doesn’t handle the variation where you want to get the last non empty value of a measure from all preceding time periods (this is what I’m calling the ‘last ever non empty’ value). There are a number of business scenarios where you’d want to do this, for example finding the value of the last purchase a customer made, the last price you sold a product at, and the stock level of a product in a shop the last time a sales rep visited. Traditional MDX solutions to this problem have suffered from poor performance but in this blog post I’ll describe a new approach that performs much better; I think it will be very useful to a lot of people, and I’m quite proud of it!

Let’s take the following MDX query on Adventure Works as an example of the problem:

SELECT
HEAD([Customer].[Customer].[Customer].MEMBERS, 10)
*
{[Measures].[Internet Sales Amount]}
ON 0,
NON EMPTY
[Date].[Date].[Date].MEMBERS
ON 1
FROM [Adventure Works]

Here’s part of the results:

image

From this we can see that individual customers only bought from us once or twice. Now, for any date, let’s create a calculation that will find what the value of the last purchase by any given customer was, regardless of however long ago it was. Up until last week I’d have tackled this problem using a combination of the NonEmpty and Tail functions – for each customer and date, get the set of all preceding dates, find the dates which had values and find the value of the last date. Here’s the code:

WITH
MEMBER MEASURES.[Last Sale Original] AS
TAIL(
NONEMPTY({NULL:[Date].[Date].CURRENTMEMBER} * [Measures].[Internet Sales Amount])
).ITEM(0)

SELECT
HEAD([Customer].[Customer].[Customer].MEMBERS, 10)
*
{[Measures].[Internet Sales Amount],MEASURES.[Last Sale Original]}
ON 0,
[Date].[Date].[Date].MEMBERS
ON 1
FROM [Adventure Works]

And here’s the part of the results dealing with the first customer, Aaron A. Allen:

image

On my laptop the query takes 14 seconds to run, and that’s with only 10 customers on columns (it executes in cell-by-cell mode, I think); in many real world scenarios this kind of performance isn’t acceptable and that was certainly the case with the customer I was working with last week. So I came up with the following new MDX that does the same thing much faster:

WITH 

MEMBER MEASURES.DAYSTODATE AS
COUNT(NULL:[Date].[Date].CURRENTMEMBER)-1

MEMBER MEASURES.HADSALE AS
IIF([Measures].[Internet Sales Amount]=0, NULL, MEASURES.DAYSTODATE)

MEMBER MEASURES.MAXDATE AS
MAX(NULL:[Date].[Date].CURRENTMEMBER, MEASURES.HADSALE)

MEMBER MEASURES.LASTSALE AS
IIF(ISEMPTY(MEASURES.MAXDATE), NULL,
([Measures].[Internet Sales Amount],
[Date].[Date].[Date].MEMBERS.ITEM(MEASURES.MAXDATE)))

SELECT
HEAD([Customer].[Customer].[Customer].MEMBERS, 10)
*
{[Measures].[Internet Sales Amount]
,MEASURES.[LASTSALE]}
ON 0,
[Date].[Date].[Date].MEMBERS
ON 1
FROM [Adventure Works]

On my laptop this query now executes in 3 seconds. Here’s what it’s doing:

  • First of all the DaysToDate measure returns the zero-based index of the current date within the set of all dates, so the first date in the time dimension would have index 0, the second 1 and so on. This could be replaced by a real measure to get slightly better performance but I left it as a calculated measure for the sake of clarity.
  • Next, the measure HadSale returns the index of the current date if it has a value and null otherwise.
  • Next, the measure MaxDate returns the maximum value of HadSale for the set of all dates from the beginning of time up to the current date. This will give us the index of the last date which had a value.
  • Finally we can take this index and, using the Item function, get the value of Internet Sales Amount for the last date that had a value.

If we want to take this approach and apply it to a server-based calculation, and make it work at all levels on the Date dimension, we need a slight variation. Again using the Adventure Works cube to illustrate, here’s what you need to do…

First of all, you need to create a new column in your fact table that contains only null values and use this as the basis of a new real (ie not calculated) measure, which should be called MaxDate. This should have the aggregation function Max.

image

You then need to add the following code to the MDX Script of the cube:

CREATE MEMBER CURRENTCUBE.MEASURES.DAYSTODATE AS
COUNT(NULL:[Date].[Date].CURRENTMEMBER)-1
, VISIBLE=FALSE;

CREATE MEMBER CURRENTCUBE.MEASURES.HADSALE AS
IIF([Measures].[Internet Sales Amount]=0, NULL, MEASURES.DAYSTODATE)
, VISIBLE=FALSE;

SCOPE(MEASURES.MAXDATE, [Date].[Date].[Date].MEMBERS);
    THIS = MAX(NULL:[Date].[Date].CURRENTMEMBER, MEASURES.HADSALE);
END SCOPE;

CREATE MEMBER CURRENTCUBE.MEASURES.LASTSALE AS
IIF(ISEMPTY(MEASURES.MAXDATE), NULL,
([Measures].[Internet Sales Amount],
[Date].[Date].[Date].MEMBERS.ITEM(MEASURES.MAXDATE)));

This does basically the same as the previous example only now MaxDate is a real measure instead of a calculated measure, and we’re using a scoped assignment to overwrite its value at the Date level. Above the Date level the default aggregation method of the MaxDate measure kicks in and we see the Max value of MaxDate for all dates in the current time period – which means at the month, quarter and year level we once again get the index of the last non empty date. Here’s what the result looks like in the cube browser:

image


chris-webb

Chris has been working with Microsoft BI tools since he started using beta 3 of OLAP Services back in the late 90s. Since then he has worked with Analysis Services in a number of roles (including three years spent with Microsoft Consulting Services) and he is now an independent consultant specialising in complex MDX, Analysis Services cube design and Analysis Services query performance problems. His company website can be found at http://www.crossjoin.co.uk and his blog can be found at http://cwebbbi.wordpress.com .


Tags: mdx

 

2007-2015 VidasSoft Systems Inc.