| Analysis Services Distinct Count Optimization |
| Written by Denny Lee |
| Saturday, 19 April 2008 01:03 |
SummaryDistinct count (such as unique visitor counts on a Web site) calculations provide valuable information but come with a number of performance challenges. This white paper describes tests that were performed to determine how best to optimize these calculations and includes best practices based on the test results. IntroductionWhile distinct count (such as unique user counts or visitor counts) calculations provide valuable information, these calculations are not easy to solve. A simple example is a grocery "market basket" scenario where a single visitor can purchase various food items. While many items may have been purchased from your store, knowing whether one or many customers made these purchases has an impact on your marketing campaigns. In a more tech savvy Web analytics scenario, if each visitor has a high number of page views for each visitor on your site, you have a very “sticky” Web site with relatively loyal customers. Financial Web sites are an example of sites with this type of visitor profile where many transactions (research, purchases, sells, options, and so on) are performed by a set number of users. On the other hand, an automotive information Web site may have a relatively low page view/visitor ratio with a high number of visitors. These latter sites may have a lot of “non-sticky” visitors because users are visiting different sites looking for the latest information on their vehicle of choice. Business intelligence in the form of loyalty can be derived from these distinct count calculations where the former site has more loyal customers while the latter has fewer loyal customers. Ultimately, you want to adjust the targeting of your Web advertising campaigns to suit your visitor population. For more information, see Amir Netz’s excellent article Analysis Services: DISTINCT COUNT, Basket Analysis, and Solving the Multiple Selection of Members Problem; while it is specific to Microsoft® SQL Server™ 2000 Analysis Services, it provides useful information on the distinct count and grocery basket analysis problem. Often distinct counts are calculated by custom applications and pipelines using a variety of parsing techniques to determine the distinct number of items. The distinct count measure was introduced in SQL Server 2000 Analysis Services, an enhancement that was extremely useful for business intelligence analysts. It was originally based on the dimension structure in Analysis Services and provided fast access to distinct count calculations for small-to-medium sized volumes of data. In SQL Server 2005 Analysis Services, there is more flexibility in how you structure data so that you now get even better performance. While distinct count calculations are much faster in SQL Server 2005 Analysis Services, the purpose of this white paper is to provide a methodology for improving the performance of enterprise-size distinct counts. If you work with a relatively small set of data, you will see performance advantages when using these techniques, but you will not experience the same gain compared to those seen with enterprise-size data. In our case, enterprise size is loosely defined as greater than or equal to 20 million rows a day. Table of ContentsIntroduction |
Latest Author Articles
- Reintroducing Usage-Based Optimization in SQL Server 2008 Analysis Services
- Analysis Services Distinct Count Optimization
- Analysis Services: Should you use many-to-many dimensions?
- Using ByAttribute or ByTable Processing Group Property with Analysis Services 2005
- Renaming an Analysis Services 2005 cube
Top Rated
- MS Document: Designing SQL Server 2005 Analysis Services Cubes for Excel 2007 PivotTables
- SQL Server 2008 White Paper: Analysis Services Performance Guide
- Functionality & Performance Testing Analysis Services 2005 with Teradata v12
- Identifying and Resolving MDX Query Performance Bottlenecks in SQL Server 2005 Analysis Services
- Building Ad-hoc Reporting Solutions with SQL Server 2005 Report Builder and Analysis Services OLAP
- MS Paper: 2007 Microsoft Office System Business Intelligence Integration with SQL Server 2005
- Configuring HTTP Access to SQL Server 2005 Analysis Services on Microsoft Windows XP
- Why Local Cube Creation with ASSL is Superior to Local Cube Creation with the Create Global Cube Com
Analysis Services Distinct Count Optimization
