Ella Maschiach's BI Blog

BI, Olap Technologies, Tools, System Analysis and Design

News

About Ella Maschiach

Business Inteligence

Intelligencia – for your SSRS reports on an SSAS cube

I wrote a post a while back about the pros and cons of developing a Report Model based on a cube. I got two comments on it telling me to check out Intelligencia. I decided to look into it, and what can I say? I'm impressed...

Intelligencia is an add - in to Visual Studio which helps you build Reporting Services Reports on top of a cube. Those of you who have tried doing that in the past with what there is, probably remember that it wasn't terribly comfortable all of the time... First example that goes to mind? Creating an SSRS reporting against a cube with a Parent Child Dimension. Intelligencia lets you use that dimension very easily, also helping you pick the members you want from it (or from any dimension) with a few simple clicks.

It's pretty simple to use with its own sort of data source:

And from there on out, it's the use of their wizard.

Intelligencia looks and feels like something out of Office 2007 with the added perks of having selection windows in the formula bar for your MDX writing:

An MDX function library with intellisense:

And a VB function library as well.

It also a small warning just in case you made a mistake with the function you defined:

And you can use it just as you would use Excel, defining calculations on the formula line:

After you've finished designing your report, you can get out of their investigator and see the MDX that was created for you:

I would also recommend using their option for "Include hierarchy Name in Field Names" so you don't get lost in the with all the attribute hierarchies you included. Also, marking "Include IsTotal flags" should help you with further enhancing the design of your report.

So yeah, it's pretty cool and it has quite the friendly UI. I really think this is a very important tool for you if you're using Reporting Services on top of a cube, because the interface just turns the work to that much simple - both in the sense that work against the cube has become simple, and in the sense that working with Intelligencia itself is simple.

I would also like to mention that I have contacted IT Workplace support quite a few times during the evaluation period of the product and always got a very quick and helpful response (which is also VERY important).

So for the very least, I would really recommend you give Intelligencia an in depth look.

Microsoft Europe is looking for you

Are you interested in working for Microsoft? Do you live in Europe (or have a permit to work in Europe)? Well Microsoft Europe just made a site just for you:

http://www.joinmicrosofteurope.com/

Now I've check the "work environment" and "meet the team" and it all sounds very good. I say - go for it!

SQL Server SP3 Beta and more news

Well apparently, you can miss out on a lot of things during a two weeks vacation. Luckily I came back in time to see SP3 Beta for SQL Server 2005 get released! But I'll talk about that in a bit...

The announcement of Project Gemini in SQL Server 2010 sparked a very heated debate in the BI Community, over how much it might be abused by users as an alternative to an organized DW \ OLAP solution. The discussion did not go unnoticed by Microsoft. I guess I can only sum up by saying that we'll see what happens in 2 years time...

The end of the summit where Project Gemini was announced, there was yet another display for how Microsoft sees the future of BI - this time with a demo done on Surface.

Meanwhile, for SQL server 2008 things continue as usual. Cumulative Update 1 was released in the beginning of the month, enabling us, amongst other things, to render reports to PDF in specific fonts. I guess releasing CU 1 for SQL Server 2008 also took enough weight off the shoulders for Microsoft, because they have committed to releasing SP3 for SQL Server 2005 by the end of the year.  In the meanwhile - there is a beta release for SP3. It is expected to include all of the Cumulative Updates up until and including CU 9. So this means that you should also check out the release of Cumulative Update 10 for SQL Server 2005, (making the grounds also for CU 11, so I guess all as normal).

The really BIG news for SQL Server 2008? The RTM of the Feature Pack! This includes, amongst other things, the Data Mining Add-ins for Microsoft Office 2007 and the beloved Report Builder 2.0. Report Builder 2.0 still isn't a click once application so I guess there is more we can expect to come.

Does all of this mean I should be happy to take just a two week vacation and not something longer?...

The future of SQL Server: Project Kilimanjaro, Madison and Gemini

The Microsoft BI Conference 2008 has been the source for quite a few announcements. First of all, now that DATAllegro was purchased by Microsoft, the talk is about SQL Server handling bigger amounts of data. There have been posts not just about terabyte projects, but also petabyte projects. The full integration of DATAllegro's technology into SQL Server is planned for the first half of calendar year 2010 and is code named Project Madison. SQL Server 2010 itself is code named Project Kilimanjaro.

The real surprise? Project Gemini which sounds like Microsoft giving QlickView a run for its money... but I'm ahead of myself. Project Gemini is the code name for Analysis Services using column-oriented and in-memory technologies running inside Excel. This means faster access for all applications working against cubes stored in this brand new way. Microsoft is hoping to empower the end - users even more, as now they will be able to build models by themselves.

I think this is marvelous news and a brilliant move on the part of Microsoft (not that you would expect anything less...). First of all, I saw a presentation of QlickView a while ago. QlickView offers instant, in memory, manipulation of massive datasets. The demo I saw was very impressive. Thing is, for the good and the bad of it, QlickView works against tables rather than cubes and I don't know how much that may limit your capabilities at defining calculations (I would have to stress I haven't tried and can only assume). But in any case, QlickView looks very impressive visually and the response time we saw was excellent, which is also the reason why its share in the BI market is growing rapidly. The QlickView representative who came to give us the presentation said he believes in memory storage is the future and I guess now I see just how right he was... Microsoft has taken QlickView's idea of in memory storage one step further, in the fact that for them it's going to be for the cube itself. This means that you should still be able to use MDX calculations against your cube.

Secondly, this also puts an emphasis on the capabilities of Excel and SharePoint and the way they work with BI. Microsoft is pushing the capabilities of Excel even further, after making Excel 2007 a better cube viewer and enhancing its Data Mining capabilities. This is on top of empowering Excel itself and turning it into Excel Server. Now, Excel should become the staging area for power users to create their own cube models and share them, after publishing them to SharePoint. So in that sense I feel like Microsoft is also leveraging its capabilities in work portals and data collaboration around the organization with Excel and SharePoint. All in all, very exciting news.

For more details about these projects, please refer to the recent posts from Chris Webb, Mosha Pasumansky and Marco Russo.

Finding distribution in a list of values

I had a friend of mine ask me a question not long ago. He had developed a site for posting ads on selling houses for a certain company. The managers of the company were now trying to analyze the data on people posting ads. One of the things they were asking him to do, was give them a graph on how many people were posting ads on a certain span of prices. They weren't going to define him the span of money, but rather wanted him to give them the span, according to the amount of people that existed in it. And that's were he got stuck... because there was really (almost) no end to how much money can be put on a house, and there seemed to be endless values to choose from in between.

Now I remembered from the Analysis Services Tutorial I did, (which is a great place to start learning about SSAS), that you can group attribute members with the DiscretizationBucketCount property along with the DiscretizationMethod property. The DiscretizationMethod helps you determine the method by which SSAS will group your values (for instance, by Equal Areas) and the DiscretizationBucketCount determines the amount of groups that will be set.

So what was my suggestion? Turn the Sales table into a Fact - Dimension sort of table, where the price of the house is an attribute, and the amount of ads was the measure. We can then use DiscretizationMethod and DiscretizationBucketCount on the attribute of price to help us look at it more clearly.

BI & Cloud Computing

Well I guess the future is really up in the cloud...

The last strategic move for Panorama is joining Google and offering a BI Analytics tool to work with Google Docs - PowerApps. But that's old news by now. GoodData has recently gotten funding for their BI in the cloud, giving you a complete platform for storing and analyzing your data. Business Objects offers "OnDemand" another platform that enables you to share Crystal Reports over the web. The list goes on, with other companies trying to get a piece of the pie.

So maybe that's what pushed Microsoft from just enabling you to store data on its SSDS (SQL Server Data Services) and add also Data Mining to the cloud... The SQL Server Data Mining Team has just put up a site to preview the abilities you could have for data mining over the web. The site currently has Adventure Works DW sample data and also enables you to load a CSV file of your own, for you to work on and experiment:

 

The visual effects are wonderful and should really help you better understand what potential lies in your data.

You could see the Key Influencers for cars (according to the AW Sample Data):

 

Or predict the sales for the following period:

Data Mining just got a whole lot more interesting for me - check it out for yourself!

BI and SQL Server Videos

I came across two very interesting web sites lately, both of which are a great source for good webcasts. The first site is Learn Microsoft BI which has all sorts of video on BI projects and how to develop them using SSAS. There has actually just been a first video on PerformancePoint Server as well. The site requires registration, but is well worth it.

The other site is SQL Server Videos which targets more DBAs and developers.

Having done just a few webcasts and knowing how much work goes into it, I am in complete awe of what's given in both sites. I strongly recommend you check both of them out.  

Update 28/10/08: Microsoft has added videos to its MSDN catalog of Books Online for SQL Server 2008.

Slowly Changing Dimension Type 2 Tips & Tricks

Slowly Changing Dimension Type 2 Tips & Tricks

I've blogged in the past about Slowly Changing Dimension Type 2, and I see that it's a subject that really interests you. I remember quite distinctly how hard and demanding it was to create it for my project (trust me I do!) and even to do it for two business entities. So, hoping that I may assist you even slightly, I thought I may share with you a few thoughts I have on the matter:

Do you really need it? Now I know this is going to sound quite obvious, but before you decide to take on such a demanding task, you should really ask yourself what you can take away from it. Does your user have a proven ROI from using the development over time in those areas for that entity? Does he have meaningful insights he can take away from comprising how the entity evolved over time? Only if the answer is "Yes" to both questions then you should go into the development of SCD type 2.

What are you going to track? Sometimes, your users can get carried away with themselves... it happens to the best, no doubt, whether it's because they're enthusiastic, or because they thinks bigger is better. No matter what the reason, sometimes your work as a System Analyst is to contain them. Now I'm not saying we're not here to serve our users, but there are those time when they request things simply because they can, and not because they need them. Because development of Slowly Changing Dimension type 2 can be quite complicated, you should try to contain the amount of attributes you follow, as the more attributes you try to follow, the more problems may arise in development. I think it'd be fair to say that for the entities in my project only 4 are meaningful, though I found myself required to track more... Don't forget, you can always snow - flake your entity, distinguishing between the attributes that will be tracked over time, and those that will stay constant or be overwritten.

SCD in SSIS - (this part is written with the contribution of Hamada Kais and Boris Kogan - thank you for your insight!)

Yes, you do have in ETL a component meant to track the way your entity has evolved over time, but...

  1. You have to ask yourself - what sort of data are you about to receive? Will you only get the new rows concerning your data or are you always going to get the entire history from the operational DB? The Slowly Changing Dimension Transformation component in SSIS is expecting to receive only new rows concerning the attributes of your business entity.
  2. The Slowly Changing Dimension Transformation component in SSIS only tracks changes from here onwards. If you need to build the SCD type 2 dimension for your business entity from its past, then you may need to use a stored procedure in T-SQL.
  3. SSIS only works with comparisons based on "equal" (=). If you're looking to make a comparison that uses a "between" or even only "less than and equal to" (<=) or "more than and equal to" (=>), you may need to customize your SSIS solution.
  4. For very large SCD dimensions, the Slowly Changing Dimension component in SSIS may not perform as well as using conditional transforms and lookups on filtered data. You should consider looking into the Table Difference component developed by Alberto Ferrari to help you there.

In any case, be prepared to do a lot of "cleansing" work. There's nothing like the QA that goes into checking the Slowly Changing Dimension to help you find all sorts of mistakes that may occur in the OLTP DB.

Still I have to admit, after you go through it all and emerge on the other side, there is nothing like the development of a Slowly Changing Dimension type 2 to make you feel like you've come a long way in BI development.

Report Builder 2.0 RC1
My favorite "toy" in SQL Server 2008 just got a brand new spanking release! Report Builder 2.0 just went RC1 after it didn't appear in the feature pack of SQL Server 2008 RTM. When I tried it out first, as a Report Designer Preview in CTP6, I loved it so much that I even decided to make a webcast out of it (both in Hebrew and in English).  Later, I found out this would become the "next generation" of Report Builder. When I downloaded the feature pack that came with SQL Server 2008 RC0 I admit I was a bit disappointed, as they took out the link directing me to the Data Source definition, and I for one really liked that functionality. But it all seems to have been worth the wait. First of all, they have changed the first screen again and what you see will start you on your way to either a Tablix or a chart (though obviously you can add later one of the other): 

 

And then you start your way on a wizard, going through from creating the Data Source: 

 

Creating a query with joins and a start of a parameter with just a few clicks: 

 

Creating groups with a simple drag and drop: 

 

And formatting those groups is just as easy: 

 

After a bit of styling, all that's left is to run the report: 

 

If you remember, in Report Builder 2.0 CTP6 you didn't have a wizard to help you build a query and a parameter so easily. I think I may soon find myself becoming obsolete for quite a few of my users... And that's not all because you can also open and edit Data Sources and Reports saved on the Report Server.

So, if you're excited just as much as I am, I strongly recommend you download Report Builder 2.0 in its RC1 version. The final release of Report Builder 2.0 is scheduled for fall of this year, along with the updated feature pack for SQL Server 2008.

Cumulative Update Package 9 for SQL Server 2005 SP2

CU9 for SQL Server 2005 Service Pack 2 has just been released, which I should have guessed as the grounds have also been made for Cumulative Update Package 10 as well.

It seems that you may need CU9 to correct a bug caused from installing Cumulative Update Package 7. What's the problem? Apparently you may get an error message after installing CU7, when you explicitly define a slice at a partition and then process the partition by using a query that returns no records. To read more about the problem that may arise from CU7, that was fixed in CU9, please check the Knowledge Base Article detailing the error message that can come when you process a partition. Another problem that's fixed is for an MDX query that's run against a Parent - Child dimension that has the HideMemberIf property set to ParentIsBlankSelfOrMissing.

You can read further about Cumulative Update Package 9 for SQL Server 2005 SP 2 and then also request a download of CU9.

Still, I have to ask, now that SQL Server 2008 has RTM-ed, when is SP3 going to be released?...

Please drive safely

Today, I understood the difference between reading about a car accident, and witnessing one.

I was on my way to work this morning and I had crossed the street to the other side. I even managed to take a few steps towards the office. Suddenly, I heard a screech of breaks from behind me. I turned around to see on the other side of the street, a man flying off in the air and land on the asphalt. I screamed.

After a second, I noticed the accident involved two people, the driver of the mini - bike (which I saw getting thrown off his mini - bike) and a pedestrian who had started crossing the street. People all around stopped and rushed to their aid. The driver of the mini - bike now had his helmet off. He was sitting on the asphalt shaking his head in disbelief, as a man who came to his aid was putting his hand on his shoulder. The pedestrian wasn't as lucky. He couldn't get up. A man was holding his hand and rubbing it while another supported his back, as he was lying on the side. Someone else was calling an ambulance, and all I could think was how lucky we're so close to the hospital.

An ambulance came within a few minutes. The driver was walking around anxiously. The pedestrian stayed lying on the asphalt. The ambulance took the pedestrian to the hospital, as police started questioning the driver.

There are so many accidents happening nowadays. Please think of your loved ones at home. Don't become part of the statistics, drive safely.

BIDS Helper for SQL Server 2008

I know I had recommended the BIDS Helper for you before, but as a new version of this Visual Studio add – in just got out this Saturday with support for SQL Server 2008, I thought it's a great time to remind everybody they should check it out.

Most of the features in this release are available only through their source code, which is there for you to compile and deploy yourself. My personal favorite feature would be the Non-Default Properties Report, which helps you track the settings in your SSIS Package or your SSAS cube and dimensions for varying properties. This should really help you monitor all the special definitions you made in your cube, without having to search all those property dialog boxes (which would be somewhat tiring, to say the least).

If you've been following the new features for Analysis Services 2008, then you probably recognize quite a few of them from the BIDS Helper which was available for you already from Analysis Services 2005. So now that there's a release for BIDS Helper for SQL Server 2008, don't you want to see what will be incorporated in the next version of SQL Server as well?...

SQL Server 2008 Express – Download it!

Till now Microsoft enabled you to download SQL Server 2008 for a trial version of 180 days. You could get it from MSDN for the Developers and Architects of you and from TechNet for IT Professionals and IT Managers (you get the same download on both sites, only with differently focused follow – up emails, according to your role).

Yesterday however, Microsoft also released SQL Server 2008 Express. Yes, I grant you, it does not have full functionality in it like SQL Server 2008, but I personally would be very happy to play around with what’s already inside and not think about 180 days till the expiration of the trial software. It has PowerShell Integration and supports new Date & Time data types, but at the same time, it still doesn’t yet include, amongst other things, the enhanced Gauges & Charting which were added to Reporting Services 2008 or the ability to export your reports to Word. For those, we’ll have to wait a bit longer till SQL Server 2008 Express with Advanced Services comes out, around the end of August.

So if you’re ready to start playing – go to download SQL Server 2008 Express.

SQL Server 2008 RTM for MSDN Subscribers

logo-header-sql08-dg

Yes, it’s finally out – THE NEW VERSION of SqlServer , 2008 ;)

A Parent Child Dimension for an Unbalanced Hierarchy

I found myself revisiting the Parent Child dimension quite unexpectedly.

I'm working on a project which involves the division of the city to different statistical areas. The division of the city into these different statistical areas happens once every 10 years or so, according to the changes in population tracked by the municipality itself and the government in the Census of Population and Housing.

Up till now, my cube had in it just the statistical areas distribution from one Census onwards and so had just one sort of distribution I needed to connect to in my cube. However, now I was asked to add historical data which included in it the distribution that existed before the current Census. The current distribution of the city into areas is either similar to the previous version, or at times a split of a certain areas into two new areas (for instance). If it stopped there, we could say that the lowest granularity level I need to connect to in the fact tables in my SSAS project are always those of the last Census. Unfortunately enough, that isn't the case. I was also handed a draft of the statistical areas forecasted for the Census in one year's time. This time around, there were areas which were some times identical, sometimes a merger of other areas and at other times a division of other areas. So, in effect, I didn't have anymore a certain Census for which the areas in it consisted for the lowest granularity. I needed to find a new way to define a granularity level for this dimension in my cube. I guess from the title you can infer the solution…

My statistical area is now no longer the simple hierarchy it used to be. Instead, I found it will be a different hierarchy for each Census conducted. Also, no hierarchy is necessarily of a lower granularity than another. In the new dimension – which is a Parent Child dimension - the areas would now be defined not just by their ID number, but also by the Census year to which they belong. The Parent Child dimension will now also have a fictitious level for each division or unification that happened. Considering that each area consists of 3 digits, and that I have areas before the Census of 1995, after it and after that of 2009, my Parent Child dimension could look somewhat like this:

 

While looking at the above, we have to remember that a new member can be created at each level of the Parent Child Dimension. I no longer have a balanced hierarchy, as each Census may lead to a new member created based on, or stemming from, one of the existing children. This is due to the fact that each area may become a parent to two or more areas (the lower branch of the tree) and a new area may become the parent of two or more merged areas (the upper branch of the tree).

Before, my Parent Child Dimension came from the fact I needed to connect to data at non – leaf levels, but I knew I had 4 levels in my organizational hierarchy, now I'm not sure how many levels I have in my Parent Child Dimension. This may mean that I may not be able to manipulate my Dimension to show in Report Builder.

Now all that’s left is to redefine the keys in the fact tables in my cube, do checks to see it was done properly, redefine the Panorama reports…

More Posts Next page »