Calculating Ages In MDX
Reposted from Chris Webb's blog with the author's permission.
One area where I see people have a lot of problems when they're learning MDX is when they are working with dates. SQL has a lot of great functionality for working with dates while MDX, to be honest, does not - therefore people tie themselves in knots writing complex MDX expressions (which often don't perform all that well) to recreate what they would do in SQL. However, as is often the case with MDX, approaching the questions from a completely different angle yields much better results.
For example, let's imagine that you wanted to calculate the age in years of any customer on any given date. Using the Adventure Works cube you have a Date dimension, which will be how you want to select the date that the age is to be calculated at; there is also, on the Customer dimension, an attribute called Birth Date that gives you each customer's date of birth. You need a calculated measure that calculates the age of each customer for any date selected on the Date dimension.
If you think about this as a SQL date problem, you'll probably want to use functions like DateDiff() - one of the VBA functions that is available in MDX (but is not native MDX). The following article shows how DateDiff() and other VBA functions can be used to calculate ages in VBA code:
Here's a query showing how this calculation can be implemented in MDX:
The output of the query is this:
On my laptop this query, which returns 365 columns and 106 rows, returns in around 3.5 seconds. A few things to point out:
- I'm using the .Member_Value function instead of the .MemberValue function to get the typed date value associated with each date, as Bill Anton's recent blog post points out that it is marginally faster.
- I'm using .Properties("Birth Date", typed) to get the typed date value for each customer's birth date. If I set the ValueColumn property on the Customer attribute of the Customer dimension, or enabled the hierarchy on the Birth Date attribute, I would be able to use .Member_Value here too and get a very minor performance improvement.
- Most of time spent by the query is taken by the yeardiff calculated member, which uses the DateDiff() function. This isn't surprising because calling any VBA function in MDX is expensive and something to be avoided.
Performance here is ok, but we can do better. The most important thing to do when tuning any MDX calculation is to try to pre-calculate as much as possible. In this case there are two calculations we can move out of MDX into the structure of the cube:
- The DateDiff() function here just calculates the difference between the years of the two dates. Therefore if you create a new attribute on the Customer dimension to hold the integer representing the year of the customer's birth date, and use the existing Calendar Year attribute on the Date dimension, you can avoid using DateDiff() completely and do a simple subtraction.
- Likewise, rather than calculating the combination of month and date in the MDX, these values can be pre-calculated in the dimension and then accessed as properties.
Here's the rewritten version of the calculation, assuming that the following new attributes have been added to the Customer and Date dimensions:
- [Customer].[Birth Year] holds the year of the customer's birth as an integer
- [Customer].[Birth Month Day] is an integer containing the month number multiplied by 100 plus the day number of the month for the customer's birth date. For example the date 5th June would have the value 605.
- [Date].[Month Day] is the same as [Customer].[Birth Month Day] but for the Date dimension.
This query runs in around 1.2 seconds, a significant improvement. Of course if you were to create a completely new fact table that contained the age of every customer on every date in the Date dimension then you wouldn't need to use any calculated measures and performance would be even better, but that may not be practical if you have a lot of customers and a lot of dates.
Chris has been working with Microsoft BI tools since he started using beta 3 of OLAP Services back in the late 90s. Since then he has worked with Analysis Services in a number of roles (including three years spent with Microsoft Consulting Services) and he is now an independent consultant specialising in complex MDX, Analysis Services cube design and Analysis Services query performance problems. His company website can be found at http://www.crossjoin.co.uk and his blog can be found at http://cwebbbi.wordpress.com/ .