Report Portal

How to manage changing parent child hierarchys in Analysis Services 2005

Changing Dimensions in Parent Child Hierarchies

A changing dimension is a common issue in many OLAP applications. The problem is pretty broad:

- Members moving around. For example, after a company reorganization a sales region can be moved from one area to another.

- Characteristics of a dimension member change, For example, the address of a customer changes.

In this note, I’m not going to attempt the definitive guide to managing changing dimensions. Instead, I want to address a request that came from a customer on how best to manage members moving around in a parent child dimension with the simple requirement that users can see how data rolls up the dimension as the dimension was structured at the time of the transaction and that the total value for each dimension member is available over a period of time that it moved around two or more times.

This document assumes some knowledge of Analysis Services.

To manage this changing dimension, I’m going to use surrogate keys to identify each of my sales regions. A surrogate key is a unique identifier solely created to distinguish each dimension member from another. It does not depend on any data in the dimension table and has no business value.

Initially, our parent child sales organization table could look like this:

In the Organization dimension, we’ll create 3 hierarchies: Organization Key, Organization Structure and Organization name.

The Organization Key hierarchy is the key attribute hierarchy – each dimension requires a key attribute hierarchy. The properties of this hierarchy are listed below:

This is a flat hierarchy of all organizations. Although required, it may not be that useful to browse and can be hidden. This hierarchy will contain a member for each organization and for each change made to an organization.

The Organization Structure hierarchy is a parent child hierarchy. It’s relevant properties are displayed below:

Some important things to note about this hierarchy. First, the Usage property is Parent and the KeyColumn it key is PID (this is what tells the server that the hierarchy is parent child). Second, the NameColumn is set to the Name column from the Organization table so users will see the name of the organization instead of a meaningless (to them) identifier when browsing. Last, the MembersWithData property is set to NonLeafDataHidden. This hides the members created for each non-leaf level (to learn more about these and other properties, refer to the SQL Server reference material).

When users browse this parent child hierarchy, they will see all versions of each organization. So if I move an organization from one place to another, this hierarchy will show it in both places. This will satisfy the first requirement that data for each organization rolls up to its parent at the time the transaction took place.

I want one more hierarchy to satisfy  the second requirement to be able to see all data for each organization regardless of it position in the hierarchy. That hierarchy is defined below:

The most important thing here is that the key column of this hierarchy is not the organization’s surrogate key (ID), but its business key (its name).  

At this point the, Organization Structure hierarchy looks like this:

Transactions in the fact table are associated with each organization using its surrogate key (most importantly, not the name). So if the fact table looked like this:

Then the data from the cube would be as follows:

Now, consider moving Australia out of the Western region into the Asian region. To accomplish this, a new dimension record is added:

This adds a new record to the parent child table. Any new transactions in the fact table that Australia participates in would be identified by this new ID of 14 and not the previous value of 7. Lets add a new transaction record against Australia using its new ID with an amount of $40.00.

After reprocessing, the cube data looks like this:

Notice that Australia is in two places – the original in West with its transactions and the new one in Asia with transactions after the move. However, if the Organization Name hierarchy is queried, we see all transactions against Australia and the other organizations:

This allows us to rollup the data according to where the organization was when the transaction occurred while not losing the ability to see the total data for each organization.

Other Comments

This is a trivial example for the purposes of explaining a concept. Additional actions can be taken in real world scenarios.

It may be useful to indicate which records are historical and which are active. This can be accomplished by adding a column to the dimension table and exposing a member property or other hierarchy.

The example here shows how a “leaf” member (one at the bottom of a parent child hierarchy) can be moved. If a non leaf member such as EMEA was moved to a different parent, it must get a new ID as well as all its descendants.

The Organization Name attribute hierarchy is, like most attribute hierarchies, a flat hierarchy with just t2o levels. The top level contains the all member and the bottom level contains each hierarchy. If there are many organizations, this may not be a reasonable hierarchy to browse. To solve this, add other attributes and create a user defined hierarchy. 

Tags: scd, parent-child

 

2007-2015 VidasSoft Systems Inc.