Storage and the NameColumn and KeyColumns Properties
Reposted from Chris Webb's blog with the authors permission.
Those of you who have worked with SSAS Multidimensional for a reasonable amount time will, no doubt, be very familiar with the NameColumn and KeyColumns properties of an attribute (if you’re not, see here and here) and how they should be used. You will probably also know that when the KeyColumns property has been set to only one column, then the NameColumn property can be left unset and the key will be used as the name of the attribute’s members.
However, while onsite with a customer recently I noticed something strange. Here’s an illustration: if you create a simple dimension based on the DimCustomer table in Adventure Works and create a single attribute based on the CustomerKey column, this is what you see in the Properties for that attribute:
However, if you deploy the database then import it into a new project in SSDT/BIDS, then you see that the NameColumn property has been set:
My first thought was that this was a bit dangerous, because it might mean that the imported version of the database would start storing extra strings for the names. But this was incorrect because a look at the data directories for the two versions of the dimensions showed they contained the same files and were using the same amount of storage:
I’m very grateful to Akshai Mirchandani of the dev team for confirming that in this situation, it is irrelevant whether you set the NameColumn or not – data duplication will always happen, and the key values will be stored again as strings. The only time it doesn’t happen is when the key and the name of the attribute are both the same column and that column is a string.
This means that if you have a very large attribute that is in danger of exceeding the infamous 4GB limit (although this is of course fixed in SSAS 2012) and which never needs to be visible, you can use the trick that Greg Galloway describes here to reduce the size of the string store. This involves creating a dummy column in your DSV (or underlying view or table) that contains only an empty string and then setting this as the NameColumn of your attribute. For the example above, this is the result:
For this example, the overall amount of storage used for the dimension has gone down from 1.24MB to 1.04MB, and although you can see the .ahstore file (the hash store) for the Customer Key attribute have grown, the size of the string store, Customer Key.asstore (note: don’t get confused between .asstore and .astore files), has reduced from 362KB to 1KB.
Chris has been working with Microsoft BI tools since he started using beta 3 of OLAP Services back in the late 90s. Since then he has worked with Analysis Services in a number of roles (including three years spent with Microsoft Consulting Services) and he is now an independent consultant specialising in complex MDX, Analysis Services cube design and Analysis Services query performance problems. His company website can be found at http://www.crossjoin.co.uk and his blog can be found at http://cwebbbi.wordpress.com .