{"id":177,"date":"2009-08-18T20:53:24","date_gmt":"2009-08-19T01:53:24","guid":{"rendered":"http:\/\/www.ssas-info.com\/VidasMatelisBlog\/?p=177"},"modified":"2009-08-18T20:53:24","modified_gmt":"2009-08-19T01:53:24","slug":"sql-server-2008r2-self-service-bi-gemini-how-it-performs-with-bigger-tables","status":"publish","type":"post","link":"http:\/\/www.ssas-info.com\/VidasMatelisBlog\/177_sql-server-2008r2-self-service-bi-gemini-how-it-performs-with-bigger-tables","title":{"rendered":"SQL Server 2008R2 Self Service BI &#8211; Gemini &#8211; how it performs with bigger tables"},"content":{"rendered":"<p>Gemini was released about a week ago and I was playing with it almost every evening. I am still trying to understand what it is, how it\u00a0works, what its limitations are, how to use DAX, etc. And I can say that I like what I see so far. <a href=\"http:\/\/business-intelligence.kdejonge.net\/working-with-gemini-and-excel-2010-to-make-a-pivot-table\/\" target=\"_blank\">Kasper de Jonge<\/a> and <a href=\"http:\/\/cwebbbi.spaces.live.com\/Blog\/cns!7B84B0F2C239489A!4690.entry\" target=\"_blank\">Chris Webb<\/a>\u00a0already posted their initial Gemini reviews. In this post I will share my experience with Gemini so far.<\/p>\n<p>First of all I want to point that for Gemini tests\u00a0all you need is Excel 2010 and the Gemini Add-in file (it is just about 30MB). You do not need SQL Server 2008R2.\u00a0I am pointing this out because\u00a0as I was\u00a0setting up my testing environment I installed SQL Server 2008R2 and\u00a0later realized that it is not required. It is quite amazing that this 30MB add in file contains such powerfull software &#8211; the Gemini front-end, and\u00a0the in memory\u00a0Analysis Services server. Of course for Enterprise level installation you will need SQL Server 2008R2 and Sharepoint 2010, but such setup is not part of my current tests.<!--more--><\/p>\n<p>\u00a0After my initial tests with a few thousand records I finally decided to do tests on bigger tables.\u00a0I took the\u00a0AdventureWorksDW database and expanded the FactInternetSales table\u00a0to 22mln records by simply duplicating the included records. This table size grew to 3.2GB. My goal was to see how well Gemini can handle millions of rows. I ran my tests on my Laptop with 3GB of RAM and an Intel Duo CPU 2.5GHz processor. On the same laptop I have my SQL Server 2008 database with my expanded data, Excel 2010 and the Gemini add-in. I limited SQL Server memory to 500MB.<\/p>\n<p>My\u00a0first tests to load 22mln records into the Gemini add-in failed. After loading over 21mln records I got an\u00a0error message:<\/p>\n<blockquote><p>Memory error: Allocation failure : Not enough storage is available to process this command. .<br \/>\nServer: The operation has been cancelled.<br \/>\nServer: The operation has been cancelled.<\/p><\/blockquote>\n<p>I tried to stop some services, close\u00a0other applications and then repeated the\u00a0same table load into Gemini, but always at the very end I was getting the same error message. Finally I started to reduce number of records in my FactInternetSales table and that helped. I found that on my laptop I was able to load just about 20mln rows. Considering that this laptop also runs SQL Server, I can live with this limitation.<\/p>\n<p>It took me about 5min to load into Gemini 20mln rows and as I repeated my FactInternetSales table load multiple times, I saw consistent Gemini\u00a0data load speed &#8211; about 4mln records per minute. Of course, load type depends on how many fields there are in the table, machine power, etc. If\u00a0 your SQL Server will be on the network, your load speed will depend on the network&#8217;s speed too. But this gives you an idea &#8211; when working with Gemini and bigger data sets it will take time to get data into Gemini. This actually comes as no surprise, just I am impressed with this speed. During the load I was checking the task manager and noticed that Excel was using 80-90% CPU time and SQL Server was using 10-20% CPU time. Total CPU usage during the data load into Gemini was almost always 100%.<\/p>\n<p>Data loading was most time consuming operation. As soon as I had 20mln rows in Gemini I wanted to see if sorting and filtering operations really works as fast as I saw in demos before. I always doubted that Gemini can work with millions of rows that fast. But I was not disappointed &#8211; filtering on any field or sorting worked in sub-seconds. I tried to do multiple filters, sorting, filters again but Gemini responded so fast that I always had to track the displayed record count in the left bottom corner just to confirm that the operation was executed. I was very impressed.\u00a0This will please many Gemini users and I expect to hear many\u00a0wows \u00a0during demos.<\/p>\n<p>Then I loaded a\u00a0few dimensions and manually added relationships to the fact table. After that I created a\u00a0pivot table and started to browse data. Again, I was blown away by Gemini&#8217;s speed &#8211; millions of records on my laptop and a response to any query comes back in milliseconds. How do they do that? I still cannot believe I am getting this power from the 30MB add-in.<\/p>\n<p>After that I hit &#8220;Save&#8221; button. It took about 40 seconds for Excel to save this\u00a0new excel file. This file contains all of the\u00a0data (!) and metadata from all tables and pivot definitions. The\u00a0saved Excel file size was 72MB. Just to remind you\u00a0&#8211; I loaded over\u00a03.2GB SQL Server table into Gemini that compressed and saved that\u00a0data in the 72MB Excel file. I probably got such a great compression because my table contained a lot of duplicate data. I noticed that saving Excel file after any changes with such amount of data always was on a slow side &#8211; it took about 20-40 seconds.<\/p>\n<p>Opening the Excel file took just a few seconds and pivot table took few more seconds to return results from a changed query. Opening the Gemini add-in with 20mln rows saved\u00a0took just a few seconds too.<\/p>\n<p>Overall I am very impressed with Gemini and the way\u00a0it handled my fact table with 20mln rows very well on my laptop.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Gemini was released about a week ago and I was playing with it almost every evening. I am still trying to understand what it is, how it works, what are limitations, how to use DAX, etc. And I can say that I like what I see so far. Kasper de Jonge and Chris Webb already posted their initial Gemini reviews. In this post I will share my experience with Gemini so far.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[17],"tags":[],"aioseo_notices":[],"_links":{"self":[{"href":"http:\/\/www.ssas-info.com\/VidasMatelisBlog\/wp-json\/wp\/v2\/posts\/177"}],"collection":[{"href":"http:\/\/www.ssas-info.com\/VidasMatelisBlog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.ssas-info.com\/VidasMatelisBlog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.ssas-info.com\/VidasMatelisBlog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.ssas-info.com\/VidasMatelisBlog\/wp-json\/wp\/v2\/comments?post=177"}],"version-history":[{"count":6,"href":"http:\/\/www.ssas-info.com\/VidasMatelisBlog\/wp-json\/wp\/v2\/posts\/177\/revisions"}],"predecessor-version":[{"id":183,"href":"http:\/\/www.ssas-info.com\/VidasMatelisBlog\/wp-json\/wp\/v2\/posts\/177\/revisions\/183"}],"wp:attachment":[{"href":"http:\/\/www.ssas-info.com\/VidasMatelisBlog\/wp-json\/wp\/v2\/media?parent=177"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.ssas-info.com\/VidasMatelisBlog\/wp-json\/wp\/v2\/categories?post=177"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.ssas-info.com\/VidasMatelisBlog\/wp-json\/wp\/v2\/tags?post=177"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}