About me

I work as a SQL Server DBA / Developer and BI Consultant in Toronto, Canada.

SSIS package that processes all partitions/measure groups/cubes in one database

April 10th, 2007 by Vidas Matelis

Recently I posted a blog entry on how to process all dimensions in a single Analysis Services 2005 database using a SQL Server Integration Services (SSIS) package. Here I'll add code to that package that will allow you to process the whole database, cubes, measure groups, or partitions.

Please follow the steps from the previous post to create a new package and add the script component. Then, for the script component, use the code below:

Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime

Public Class ScriptMain
    Public Sub Main()
        Dim ExecutionSuccessfull As Boolean = True ' If True, the package executed without errors
        Dim sProcessType As String = "ProcessFull"

        Dim oConnection As ConnectionManager
        oConnection = Dts.Connections("DM")
        Dim sServer As String = CStr(oConnection.Properties("ServerName").GetValue(oConnection))
        Dim sDatabaseID As String = CStr(oConnection.Properties("InitialCatalog").GetValue(oConnection))
        Dim oServer As New Microsoft.AnalysisServices.Server
        oServer.Connect(sServer) ' Connect to the server and start scanning down the object hierarchy
        Dim oDB As Microsoft.AnalysisServices.Database = oServer.Databases.FindByName(sDatabaseID)

        If oDB Is Nothing Then
            ExecutionSuccessfull = False
            GoTo Done
        Else

            Dim oDim As Microsoft.AnalysisServices.Dimension
            Dim oCube As Microsoft.AnalysisServices.Cube
            Dim oMeasureGroup As Microsoft.AnalysisServices.MeasureGroup
            Dim oPartition As Microsoft.AnalysisServices.Partition

            oServer.CaptureXml = True ' Start capturing XML
            ' oDB.Process(Microsoft.AnalysisServices.ProcessType.ProcessFull) ' Use this to process the whole DB at once
            ' Process all dimensions:
            'For Each oDim In oDB.Dimensions
            '    ' This generates XMLA but, because CaptureXml is True, does not execute it!
            '    oDim.Process(Microsoft.AnalysisServices.ProcessType.ProcessFull) ' Process all dimensions
            'Next

            For Each oCube In oDB.Cubes
                ' oCube.Process(Microsoft.AnalysisServices.ProcessType.ProcessFull) ' Process all cubes

                For Each oMeasureGroup In oCube.MeasureGroups
                    ' oMeasureGroup.Process(Microsoft.AnalysisServices.ProcessType.ProcessFull) ' Process all measure groups

                    For Each oPartition In oMeasureGroup.Partitions
                        oPartition.Process(Microsoft.AnalysisServices.ProcessType.ProcessFull) ' Process all partitions
                    Next
                Next
            Next
            oServer.CaptureXml = False ' Stop capturing XML
            ' Execute the captured XML. First parameter: Transactional; second parameter: Parallel; third optional parameter: processAffected
            ' These are very important parameters!

            Dim oResults As Microsoft.AnalysisServices.XmlaResultCollection
            Dim oResult As Microsoft.AnalysisServices.XmlaResult
            Dim dataBytes(0) As Byte

            oResults = oServer.ExecuteCaptureLog(True, True)

            Dim oMessage As Microsoft.AnalysisServices.XmlaMessage

            ' Log the errors and warnings
            For Each oResult In oResults
                For Each oMessage In oResult.Messages
                    If oMessage.GetType.Name = "XmlaError" Then
                        ' The processing failed
                        Dts.Log(oMessage.Description, 0, dataBytes) ' You need to set up proper package logging to see this!
                        ExecutionSuccessfull = False
                    Else
                        ' It's just a warning. Do not reset ExecutionSuccessfull to True here,
                        ' or a warning could hide an error reported earlier in the collection.
                    End If
                Next oMessage
            Next oResult

        End If
Done:
        oServer.Disconnect() ' Disconnect from the server - we are done

        If ExecutionSuccessfull Then
            Dts.TaskResult = Dts.Results.Success
        Else
            Dts.TaskResult = Dts.Results.Failure
        End If
    End Sub

End Class

By default this package processes each partition. This is just an example: there is commented-out code that can instead process all measure groups, all cubes, or the whole database at once. Use this code as guidance to build your own SSIS package that does what you want.

In my data warehouse design I decided to maintain the list of partitions in a SQL Server table. I have a "partition base" table that contains base information about each measure group and its partitioning field (I have just one, usually a date field). Then I have a stored procedure that, during each DW load, calculates whether new partitions have to be created or old ones deleted, and loads this information into an extended partition information table. It also marks which partitions are affected by the latest load and have to be reprocessed. Although this might seem like a complicated task, it is not, and it is well worth the effort. My SSIS package then compares this extended partition table with the actual SSAS database, drops/creates partitions, and processes the partitions that have to be processed.
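As a rough illustration of that design (the table and column names below are hypothetical, not my actual schema), the metadata tables could look something like this:

```sql
-- Hypothetical sketch of the two metadata tables described above.
-- One row per partitioned measure group: the "partition base" table.
CREATE TABLE dbo.PartitionBase (
    MeasureGroupName   nvarchar(100) NOT NULL PRIMARY KEY,
    PartitionField     nvarchar(100) NOT NULL  -- usually a date field
);

-- One row per concrete partition, recalculated by a stored procedure on each DW load.
CREATE TABLE dbo.PartitionInfo (
    MeasureGroupName   nvarchar(100) NOT NULL REFERENCES dbo.PartitionBase,
    PartitionName      nvarchar(100) NOT NULL,
    RangeStart         datetime      NOT NULL, -- partition boundaries on the partitioning field
    RangeEnd           datetime      NOT NULL,
    NeedsProcessing    bit           NOT NULL DEFAULT 0, -- set when the latest load affected this partition
    PRIMARY KEY (MeasureGroupName, PartitionName)
);
```

The SSIS package can then compare PartitionInfo with the Partitions collections it finds through AMO, drop/create partitions accordingly, and process only the rows flagged with NeedsProcessing = 1.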

I would be interested to hear how other people manage the processing of partitions.

Posted in SSAS, SSIS | 33 Comments »

33 Responses

  1. Dan Meyers Says:

    Just what I was looking for… I recently posted the URLs for the other two posts that have to do with processing dims and designing aggregates on my blog. I will add this URL too. Thanks for the code!

  2. Jesse Orosz Says:

    We handle processing of partitions through AMO code which is compiled as a DLL and called from an SSIS package. All SSAS processing actions (updates, new partitions, deletes) are logged into a SQL table (we call it CubeProcessingLog). This table is then used together with the ETL logging table to see what needs to be processed every hour or day. By doing it this way we only process the new incoming data, never older data. It's fairly complex, but I could let it run for years and not have to touch it.

  3. Vidas Matelis Says:

    Jesse,

    Why do you compile the AMO code as a DLL? Are there any benefits to that?

  4. sergey Shenblum Says:

    I tried to use only AMO to create a scheduler for ETL and dimension/partition data processing, but… hit some bugs. Some code for partition/dimension automation is on my blog. Basically, the scheduler controls and records the status and steps for the SSIS ETL of data into the database, and then the parallel processing of cubes, separating dimension processing from partition processing, etc. All configuration is saved in an .xml file (web config) as metadata to drive the ETL and any cube schema changes…

  5. Vidas Matelis Says:

    Sergey,

    It looks like in your blog you are talking about the special case where, during processing, you are also changing the source table of the measure group. Is that right?
    Have you considered having a SQL view that is used as the source for the measure group? Then you would change just the view definition (e.g. select from a different table) instead of trying to solve this in AMO.

    Vidas Matelis

  6. Sergey Says:

    Yes, I have developed a controller that automatically updates data (ETL with SSIS packages) in the Data Warehouse tables and processes the dimensions and then the partitions, based on the successful or unsuccessful run of the SSIS packages. The logic of the controller is based on the statuses of 'steps': ETL, process dimensions, process partitions, final cleansing of staging data. It also has a locking mechanism (based on logical grouping of certain cubes to be ETLed and processed) to prevent any duplication of processing for dimensions shared by many cubes. I am thinking now about PARALLEL processing; so far it is sequential - based on success, the next block of steps is executed. The controller supports all types of MSAS processing. It can also process a certain date, month, or year, or any FROM-TO interval, with granularity of hour, day, week, month, or year.
    And yes, you are correct, Vidas: one of the tags in the XML file that stores the metadata to control/execute the controller specifies the query binding and staging table to process a partition based on time granularity for a specified date or FROM-TO interval.
    The controller also records log info into SQL tables about what is happening during execution, to be able to troubleshoot where and why. On my blog I have posted just the one class that gave me a headache… with AMO… only XMLA batches will be able to do the job, and the XMLA file needs to be 'parametrized', like ###database###, to be able to replace certain names from the controller.

  7. Sergey Says:

    Thank you Vidas for sharing the source code – would use it for Process Full.

  8. Sergey Says:

    ProcessAdd for dimensions from a query binding is a 'feature' that does not work as it is supposed to.

    see http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=1117591&SiteID=1

  9. Bhavin Says:


    Else
    'It's just a warning.
    ExecutionSuccessfull = True ' if you want to fail on warning, change this to False
    End If

    Small possible issue: I think the ExecutionSuccessfull assignment should be commented out, unless failures are always the last item in the results collection. Otherwise a warning may set the flag back to True even if an error did occur but was processed earlier in the collection.

    Regardless thanks for the code.

  10. John Hardin Says:

    Vidas:

    Do you know whether there’s a way to control the parallelism of the partition reprocessing? I’m trying to write a package that will reprocess multiple partitions in one measure group, and I want to regulate the load on the server.

    Thanks!

  11. Vidas Matelis Says:

    John,

    I could not find a way to control parallelism during processing using the ExecuteCaptureLog method. I'll try to research a bit more, and if I find an answer, I'll post it here.

    You probably already know that you can control parallelism with XMLA. There is a Parallel element for that, as in this example:

    <Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
      <Parallel MaxParallel="4">
        <Process xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2">
          <Object>
            <DatabaseID>Adventure Works DW</DatabaseID>
          </Object>
          <Type>ProcessFull</Type>
          <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
        </Process>
      </Parallel>
    </Batch>

    Vidas

  12. Emily Says:

    Hello Vidas,

    The code above helps a lot. I built similar code in an SSIS package for my project and it runs successfully, but I cannot see the detailed processing log information, i.e. which dimensions and partitions were processed. Is there any log file generated by Analysis Services for this processing?

    Thanks,
    Emily

  13. Vidas Matelis Says:

    Emily,

    I have not implemented proper logging in this package. My goal was just to share code to process objects.

  14. Ron Says:

    I built a package that processes the measure groups in parallel by launching the same package four times in a sequence container. Prior to calling the sequence container, I inserted a record for each measure group I wanted to process into a transient table. Each measure group process registers itself via a SP and a registered process table and then uses the returned registered process id as a unique value to request the next available measure group to process. When I run it on my machine locally, it works like a champ. When I deploy it to a server with four processors, I randomly get one or two deadlocks which cause those instances to be shut down. The remaining processes still successfully process the rest of the measure groups, but I cannot figure out why I am encountering deadlocks. Any ideas?

  15. Vidas Matelis Says:

    Ron,

    By default this package will already process everything in parallel. There is no way to limit how many parallel tasks will run, so it will use all available processors.
    If you need to control parallelism, a better way would be to build a process XMLA script like I did in my post here: http://www.ssas-info.com/VidasMatelisBlog/7_ssis-package-to-process-all-dimensions-in-ssas-2005-db

    Can you please give a bit more details why you are running 4 packages in parallel?

  16. Steve Says:

    Vidas, this is just what I was looking for. Your explanations are great and easy to understand. Thanks a million!

  17. Naveen Says:

    Hi Vidas,

    I am thinking about processing the indexes after processing the dimensions, measure groups and partitions in the cube. Is it required to do that?

    If yes, does the following modified code work ?


    For Each oCube In oDB.Cubes
        oCube.Process(Microsoft.AnalysisServices.ProcessType.ProcessFull) ' Process all cubes

        For Each oMeasureGroup In oCube.MeasureGroups
            oMeasureGroup.Process(Microsoft.AnalysisServices.ProcessType.ProcessFull) ' Process all measure groups

            For Each oPartition In oMeasureGroup.Partitions
                oPartition.Process(Microsoft.AnalysisServices.ProcessType.ProcessFull) ' Process all partitions
            Next
        Next
        oCube.Process(Microsoft.AnalysisServices.ProcessType.ProcessIndexes)
    Next

  18. Vidas Matelis Says:

    Naveen,

    The ProcessFull option already processes all data and indexes. If you want to split the processing of data and indexes, then instead of doing ProcessFull you should do ProcessData and then ProcessIndexes.

    Please note that in my script the oCube.Process and oMeasureGroup.Process lines are commented out and left there just to show that you have the option to do processing at a higher level.
    So your script should be:

    For Each oCube In oDB.Cubes
        For Each oMeasureGroup In oCube.MeasureGroups
            For Each oPartition In oMeasureGroup.Partitions
                oPartition.Process(Microsoft.AnalysisServices.ProcessType.ProcessData) ' Process all partitions
            Next
        Next
        oCube.Process(Microsoft.AnalysisServices.ProcessType.ProcessIndexes)
    Next

    By the way, it is recommended to do ProcessData and ProcessIndexes in separate statements when you want to monitor whether you have too many aggregations and, because of that, processing takes too long.

  19. Naveen Says:

    Hi Vidas,

    Thanks for the swift reply!
    I changed the script accordingly. Now, could you point me to the right resources where I can learn to set up proper error handling - saving errors/warnings to the DTS log?

  20. wookies Says:

    Hi Vidas,

    How can I process a specific measure group using your code? Any ideas?

  21. C Sekhar Says:

    Hi Vidas,

    In my implementation, I am following exactly the same steps as you did above, and it worked well. I am handling the KeyNotFound issue by setting it to IgnoreError in Error Configuration.

    However, as per my requirement, I am required to capture those records that are causing the KeyNotFound error and write them to a log. Is there a way to achieve it?

    I’ve tried to achieve this but didn’t find a way.

    Could you please let me know how to do the same? Your early reply will be greatly appreciated.

    Thanks
    Sekhar

  22. Ram Says:

    Hi Vidas,

    Thanks a lot, this is very useful.

    Is there something similar for partitioning the cube?

    Thanks & Regards,
    Ram

  23. eshwar Says:

    Hi Vidas,

    I have a question on partitioning.
    I have partitions based on year, but data can be added for a market (country) for any year, so we end up processing all the partitions.
    What should our strategy be in this case?
    Would partitioning on market help?

    waiting for your reply…..
    Thanks,
    Eshwar

  24. Vidas Matelis Says:

    Hi Sekhar,

    Currently I process a partition using the statement:
    oPartition.Process(Microsoft.AnalysisServices.ProcessType.ProcessFull)
    The .Process method has another optional parameter of type 'Microsoft.AnalysisServices.ErrorConfiguration'.
    This parameter "Specifies settings for handling errors that can occur when the parent element is processed." (from BOL)
    I have not used it myself, as I do that type of cleanup during ETL, but you should be able to capture key errors using it.
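    An untested sketch of what that could look like (the property and enum names below are taken from BOL, not from code I have run - verify them against your AMO version):

    ```vbnet
    ' Untested sketch: pass an ErrorConfiguration when processing a partition.
    Dim oErrorConfig As New Microsoft.AnalysisServices.ErrorConfiguration
    oErrorConfig.KeyNotFound = Microsoft.AnalysisServices.ErrorOption.ReportAndContinue
    oErrorConfig.KeyErrorAction = Microsoft.AnalysisServices.ErrorAction.ConvertToUnknown
    oErrorConfig.KeyErrorLogFile = "C:\Temp\KeyErrors.log" ' hypothetical path - key errors would be written here
    oPartition.Process(Microsoft.AnalysisServices.ProcessType.ProcessFull, oErrorConfig)
    ```

    The KeyErrorLogFile is what should let you see which records caused the KeyNotFound errors.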

    Regards,

    Vidas Matelis

  25. Vidas Matelis Says:

    Ram,
    For partitioning cube please see my blog post:
    http://www.ssas-info.com/VidasMatelisBlog/13_ssis-package-to-dropcreate-partitions-based-on-partition-list-in-the-sql-server-table

    Regards,

    Vidas Matelis

  26. Vidas Matelis Says:

    Eshwar,

    At the end of this post I have a note where I explain that during the ETL load I actually track in SQL Server which partitions have changed, and then I process just the affected partitions. You should do the same. There is a bit of info about that in this post:
    http://www.ssas-info.com/VidasMatelisBlog/13_ssis-package-to-dropcreate-partitions-based-on-partition-list-in-the-sql-server-table
    But you will have to fill in the missing code yourself. For different reasons I cannot share my full code.

    Regards,

    Vidas Matelis

  27. Sue Says:

    I wrote a VB.NET assembly and call it from SQL Agent, as we are not able to use SSIS for processing. I've looked, but I can't find any sign that anyone else has done it this way, and I am looking for help with error configuration. Any suggestions?

  28. Vidas Matelis Says:

    Sue,

    Post your questions in the SSAS MSDN forum here:
    http://social.msdn.microsoft.com/Forums/en-US/sqlanalysisservices/threads/

    I am sure somebody from the forum will help you.

  29. Rui Says:

    Hi Vidas,

    I’ve already implemented incremental processing based on your posts and everything works as expected!! Thank you so much for sharing it ;)

    Now I’m trying to implement incremental processing using the partition ProcessAdd type, but I’m facing some problems… In my script, everything is very similar to your post, except that instead of the partition ProcessFull type I use ProcessAdd with a query binding. The thing is, when I try to process only one partition using ProcessAdd it works fine… but when I try to process two partitions in parallel it returns a script error…

    After some “digging”, I’ve concluded that the XML generated in the script is not correct when compared to the XML generated by BIDS… Have you faced this problem too? Do you know a workaround?

    Thanks in advance!!

    Best regards

  30. Vidas Matelis Says:

    Hi Rui,

    Sorry, I am not using the ProcessAdd type, so I have not seen this problem, and I don’t know what a good workaround would be. I am assuming that doing ProcessAdd one partition at a time is not acceptable, as this would take much more time than running the processing in parallel.

  31. Analysis Services (SSAS) Processing and Aggregations « Dan English's BI Blog Says:

    […] have to give credit to Vidas Matelis here for providing some of the AMO coding for the processing, SSIS package that process all partitions/measure groups/cubes in one database .  I modified it a bit to be able to limit the scope for a particular cube along with a Try Catch […]

  32. Vaibhavi Sawant Says:

    Hello,
    Your blog is very informative for people learning SSAS.
    I have a query regarding ProClarity.
    I want to make a report that gives information about the visitors to my dashboard. I got some data from the ProClarity event viewer, but it does not give information about which dashboard or view a user is accessing.
    Can you help with this?
    Thanks in advance, waiting for a reply :)

  33. Prashant Mhaske Says:

    I have an issue regarding dynamic partition processing using an SSIS Script Task. While processing a selected partition, other partitions are also getting processed. Why?