Report Portal

PDW and SSAS

Reposted from Chris Webb's blog with the author's permission.

One new feature of SQL Server PDW 2012 that hasn't had the attention it deserves is the fact that it is now officially supported as a data source for Analysis Services, both Multidimensional (in ROLAP and MOLAP modes) and Tabular (in In-Memory and DirectQuery modes). If you are working with extremely large data volumes in SSAS then PDW might be something you want to do some research on. For SSAS Multidimensional in MOLAP mode or Tabular models in In-Memory mode, using PDW as a data source should make processing run extremely quickly. For SSAS Multidimensional in ROLAP mode or Tabular models in DirectQuery mode, it can give you interactive query access to data volumes that MOLAP/In-Memory simply couldn't handle (remember though that DirectQuery only works with DAX queries, so Excel PivotTables don't work with it, only Power View).

There are a few public sources of information on PDW/SSAS integration. One is the white paper on PDW that you can download from Henk van der Valk's blog here:
http://henkvandervalk.com/introduction-to-sql-server-2012-parallel-data-warehouse

Here's the relevant section:

New in SQL Server 2012 PDW, you can use PDW as a high performance relational data source for building multidimensional OR tabular models with SQL Server Analysis Services (SSAS). For example, you can:

· Use DirectQuery to perform real-time queries against a SQL Server PDW data source from a tabular model.

· Reduce ROLAP query times by using clustered columnstore indexes on SQL Server PDW tables.

· Use new features of SSAS. For example, use EnableRolapDistinctCountOnDataSource to run distinct count operations on PDW tables.

· Use PDW as a data warehouse for aggregating Hadoop data for use in multidimensional or tabular models.

The DirectQuery page in Books Online says similar things:
http://msdn.microsoft.com/en-us/library/hh230898.aspx

Some quotes:

In contrast, a tabular model in DirectQuery mode uses data that is stored in a SQL Server database, or in a SQL Server PDW data warehouse. At design time, you import all or a small sample of the data into the cache and build your model as usual. When you are ready to deploy the model, you change the operating mode to DirectQuery. After you change the operating mode, any queries against the model will use the specified relational data source (either SQL Server or SQL Server PDW), not the cached data.

.

DirectQuery can take advantage of provider-side query acceleration, such as that provided by xVelocity memory optimized column indexes. xVelocity columnstore indexes are provided in both SQL Server 2012 and SQL Server PDW, to support improved DirectQuery performance.

It's a shame there isn't more information out there though. This post from Michael Mukovskiy has some interesting findings on using PDW v1 as a data source for SSAS:
http://blog.oraylis.de/2013/07/pdw-cube-processing-experience/
.presumably PDW 2012 would give even better results.


chris-webb

Chris has been working with Microsoft BI tools since he started using beta 3 of OLAP Services back in the late 90s. Since then he has worked with Analysis Services in a number of roles (including three years spent with Microsoft Consulting Services) and he is now an independent consultant specialising in complex MDX, Analysis Services cube design and Analysis Services query performance problems. His company website can be found at http://www.crossjoin.co.uk and his blog can be found at http://cwebbbi.wordpress.com/ .


Tags: management

 

2007-2015 VidasSoft Systems Inc.