Day 3 of my virtual advent calendar, about stuff I like in SQL Server 2008..
Backing up of analysis services in SQL Server 2005 ran out for steam for databases of about 20Gb, and so you have to follow a different route to back them up (see this technet article). The time taken to back up a database does not grows faster than the growth in size of that database. e.g. doubling the size of the data might mean the backup takes 3 times longer.
This has been fixed in SQL Server 2008 as the backup process has been completely rewritten and now closely follows the time it would take to copy the physical files. This means that the simple backup in analysis services can now cope with data sizes of 100Gb plus which is pretty good when you remember that the analysis services database is typically a quarter or a sixth of the size of the relational database the data came from.
Finally,it’s a trivial exercise to migrate an analysis services databases from SQL Server 2005 to 2008 either through a backup restore or by opening and deploying the source project to analysis services 2008.
Day 2 of my virtual advent calendar, about stuff I like in SQL Server 2008..
Following on from my previous post, in some data warehouses there is a separate dimension for time of day, so that demand through a day can be modelled. Storing time in SQL server 2005 was a bit of a cludge typically involving picking an arbitrary date (like 1/1/1900) and then tacking the time on to the end of that. Now there’s a separate time data type so it’s easy to store the right data and create the time dimension using a script like this:
declare @time time = '00:00'
declare @timekey int = 0
declare @timegrain int =15
if not exists
(select * from sys.tables where name = 'dimTimeofday')
create table dimTimeofday( timekey int, TimeofDay time)
while @timekey < 1440 begin
insert into dimTimeofday(timekey,Timeofday) values (@timekey, @time)
set @time = dateadd(minute,@timegrain,@time)
set @timekey += @timegrain
end
For more on the new time data type check books on line here.
There is no possible connection between databases and Christmas, so I simply wanted to share 24 my favourite bits and bobs that are in SQL Server 2008….
The new date data types in SQL server crack a problem I have had since I started BI as there is (nearly) always a time dimension in every data warehouse. In many retail data warehouses the grain (level of detail) was reduced by grouping all transactions in a day. The trouble is in SQL Server 2005, you had to create a date like 25/12/2008 00:00 as per dimTime in AdventureWorks
Now we a date datatype which also allows ius to store just the date and has implicit conversion form datetime to rip out the time.
For more on the new date data types check books on line here.
I see a lot of people asking about how to make integration services highly available and there isn’t an easy answer, so let me explain why.
Integration services (SSIS) is designed to bulk move a set of data from a source to a target while performing some sort of translation on the way. The most often causes of failure for SSIS packages I have seen are, in order:
1. The source is not available. This can because a process upstream of the SSIS package has not completed to create the target, a change in credentials or an actual issue with the source
2. The target is not available. Credentials or networking issues are the main culprits
3. There has been some change to the structure of the source or the data in it , not anticipated in the design of the package.
No SSIS high availability solution is going to be able to mitigate these risks, so you end up protecting what you can by making the source and targets highly available and then try and figure out some way of doing the same for SSIS which doesn’t have any inherent high availability built in e.g. it is not cluster aware.
If this is something you do want to do then I would run SSIS on two separate servers and put tests and logging in the packages such that if a package runs on server B it checks to see it has not already been run on server A. In this scenario it’s important to ensure each package can be rerun from the beginning without corrupting any data (something which is good practice anyway). It is also possible to set up load balancing for SSIS(http://technet.microsoft.com/en-us/library/ms345184.aspx), and store logging information centrally (which could be on a cluster).
Another thing to consider if this flow of data is mission critical and not too large is to use a completely different approach like BizTalk.
Activity Monitor is a simple and quick way to see what’s happening right now in SQL Server,
Not too much happening on my demo rig but you get the idea.
It took me a few minutes to open this because my local books on line has an error in it on how to open it (it suggests expanding form the management tree) so my other tip is to use the books on line that is actually on-line.
Another good thing that’s done at the SQL Server community evenings is a 5 minute slot where anyone can have a go at showing something interesting.
Tony Rogerson got me to open the batting to encourage others to have a go. I am not sure who it was who came up next but it was a simple thing on renaming the infamous sa account…
ALTER LOGIN sa WITH NAME = SECRETSA.
Tony was too keen on leaving his laptop like that in case anything broke so just run the same command again..
ALTER LOGIN SECRETSA WITH NAME = SA.
It’s a useful security tip, especially if you are seeing lots of failed sa logins.
Among the interesting stuff discussed at last nights SQL community meeting at the Microsoft Campus,was a lively debate on whether to wait for SQL Server 2008 sp1. The traditional view has been to wait until sp1 because that is the stable release. One wit in the audience suggested that SQL Server 2008 was really SQL Server 2005 sp4 and therefore everyone should go right ahead. While SQL Server 2008 is actually a lot more than a service pack, it is not the major rework that SQL Server 2005 was.
David Portas an MVP working for Conchango pointed out how stable it was and he should know he’s been using it for year. He went on to pint out that maybe an sp1 would not be that well tested and would you really deploy that version the day it came out?
My own take on this is that there probably won’t be a service pack for a long time if at all. SQL Server 2008 is very stable and don’t take my word for it because if it wasn’t the stories would be all over the web (the Register, ZDNet, slashdot etc). So your decision on when to upgrade should not be influenced by that and reasons to upgrade should instead depend on what you are doing with SQL Server:
- You want to move off of SQL Server 2000 perhaps because are concerned about support
- You have SQL Server 2000 and want to go 64 bit so you can address more memory
- You want to use one of the new data types in your application, like spatial or filestream.
- You want to better manage a large estate of SQL Servers and consolidate them.
If your interested in SQL Server 2008, you might want to download the product from your TechNet subscription and at least install the client on your local machine so you can see how it looks.
It’s a simple matter to restore SQL Server backups to later versions, in fact you don’t need to do anything different but what about the other way around?
I first got caught out with this back in the days of SQL Server 7 and 2000, and the basic answer is the same today for SQL Server 2008 as it was then – you can’t restore a backup from a newer version of SQL Server to an older one. Attach and detach will also fail.
The simplest approach I can think of is to transfer database objects in Integration Services. And don’t forget to bring over any logins as well to avoid orphaned users for example:
- to get them out of the current system use SSIS or this BCP command
Bcp master..syslogins out \\sqlserverdemo\data\syslogons.dat -N -S . -T
EXEC sp_resolve_logins @dest_db= ‘personnel’
@dest_path= ‘\\sqlserverdemo\data\’
@filename= ‘syslogins.dat’
The only reason I can think of that you would need to do this would be to back out of a migration to a later version of SQL Server, and my advice is to do everything possible to avoid the need for this to happen by carefully planning the migration.

I am still getting loads of questions on virtualising SQL Server so rather than blogging away here, I am getting involved with Windows ITPro magazine along with a number of cross industry virtualisation experts to run a virtual (what else would it be!) event “Virtualization: Get The Facts” from 11:00am (GMT) tomorrow.
I am on the SQL server slot at 15:00 so if I haven’t answered your specific questions you can virtually chat to me from the comfort of your desk, and we can both save the cost and time of travelling.
If you need any other incentive you can win an ipod nano (??) for joining in!
The other hot topic at last Friday’s ICAEW round table on Business Intelligence was KPI’s.
Toby Wilson (Finance Director of Microsoft UK) explained that Microsoft uses 30 KPIs despite the diversity of its diverse business (XBox, Mobile, Live services, Software, Consulting etc.). These are set top down and although they are a constraint they are fair because everyone knows the rules of the game. Once you get green on so many of these (including revenue of course) then an FD has flexibility to innovate to be more successful.
One of the other finance directors on the panel countered this by stating that there was only one KPI needed by a business – cash. His assertion was that many a profitable business has gone to the wall because it actually had no money, and in the current credit crisis this is even more likely.
However the sorts of KPI’s used at Microsoft and many other companies I have worked for are lead indicators and monitoring these will lead to sustainable long term growth. This is where the balance in balanced scorecard comes in. In this case balance between success today and success tomorrow.
A typical balanced scorecard might look like this with KPI’s under each heading
- Financial health including cash, contribution margin, cost base are showing you how you are doing today.
- Growing the business. Improving market share, by retaining existing customers and attracting new ones.
- Innovation. Developing and delivering new ideas to maintain your edge of the competition.
- People. Retaining and attracting the right talent to your organisation.
The other key facet of the balanced scorecard and its associated KPI’s is to derive departmental, team and individual scorecards from the top level scorecard for the company and to link individual bonuses and performance to these.
This approach was worked in numerous companies from the global Microsoft sized organisation down to small businesses with under 50 employees, since the theory was first published by Robert S. Kaplan and David P. Norton in 1992. To conclude I am sure that the focus it provides to all staff in an organisation is even more important today than it was then, but it is also important to never forget about financila health whether your own or the company you work for.
As I have remarked before Business isn’t run by the IT Department it is run by the people who hire and fire the IT department i.e. the Finance Director (FD). Given the current economic climate now would be a perfect time to see what you can do to help, so I have spent the morning listening to my FD (.. of Microsoft UK), Toby Wilson at the Institute of Chartered Accountants of England & Wales (ICAEW). He joined a round table on Business Intelligence hosted by Andrew Sawers the Editor of Financial Director. I thought some of the discussions although often repeated would be worth repeating because these are straight from the men with the cheque books, and their number one priority is:
One Version of the Truth. Often repeated mantra perhaps because it is so rare in many organisations. Toby deals with this in a two key ways:
- Challenge any other report that cannot be reconciled with his own data. The main UK scorecard and financial data at his disposal has been rigorously tested and so is the accepted truth.
- Data provided by third parties e.g. PC sales from the Dixons Group needs to be accurate and timely and this can be relied because that reporting mechanism is how third parties get their incentives and discounts.
One really obvious point is the need to snapshot your data at point in time so all reports from it are based on the same set of data. Several technical approaches can help to achieve this
- database snapshots, although this is quick it is fragile as snapshot are dependent on a database and if the version of this database is lost the snapshot are useless and cannot be easily recreated.
- Type 2 slowly dimensions (see this post on type 2 dimensions for more info), can have extra attributes to marked when a row is in force or not, as can the fact table. This can be important to rule in or rule out financial adjustments reversals etc. applicable to a particular scenario.
- A special scenario dimension which allow fact to be included in certain scenarios and not in others. This allows for multiple budgets and forecasts to be sotred alongside the actuals.
My other top tip is to ruthlessly ensure that every reports contains a header with the all the details of how filters are set the date it was published and by whom etc.
Data Visualisation is one of my many interests and Marc Holmes showed us a bunch of wordles like the one above in yesterdays team meeting, and we had to guess the blogger. Mike Taulty won despite not recognising his own blog!. You can see why Marc didn’t use my wordle, SQL Server sort of stands out from the rest.
So what you see is what you get!
Technorati Tags:
wordle,
blogging
So here’s how it’s done..
So here’s how it’s done..I rarely know more about SQL Server than my friend Beatrice so I was surprised when I mentioned plan guides and she hadn’t heard of this new feature. I have to say this is not a big mainstream thing and it takes me back query hints in SQL Server 2000.
Essentially you get the optimiser to generate a plan for you and then apply this to the query so that it always gets used every time that query is subsequently run. It might be a good way of getting the best performance out of a third party application you can’t change.
So here’s how it’s done..
SET STATISTICS XML ON
GO
SELECT WorkOrderID, p.Name, OrderQty, DueDate
FROM
Production.WorkOrder AS w
INNER JOIN Production.Product AS p ON w.ProductID = p.ProductID
WHERE
p.ProductSubcategoryID > 4
ORDER BY
p.Name, DueDate;
GO
SET STATISTICS XML OFF
GO
The important bit is the query plan (double click on the xml to see this)..
You can get the SQL handle for the plan like this …
SELECT
*
FROM
sys.dm_exec_query_stats AS qs
CROSS APPLY
sys.dm_exec_sql_text(qs.sql_handle) AS qt
CROSS APPLY
sys.dm_exec_text_query_plan(qs.plan_handle,
qs.statement_start_offset,
qs.statement_end_offset) AS qp
WHERE
qt.text LIKE N'SELECT WorkOrderID, p.Name, OrderQty, DueDate%';
GO
and create a plan from it with this…
DECLARE @plan_handle varbinary(64);
DECLARE @offset int;
SELECT
@plan_handle = qs.plan_handle,
@offset = qs.statement_start_offset
FROM
sys.dm_exec_query_stats AS qs
CROSS APPLY
sys.dm_exec_sql_text(qs.sql_handle) AS qt
CROSS APPLY
sys.dm_exec_text_query_plan(qs.plan_handle,
qs.statement_start_offset,
qs.statement_end_offset) AS qp
WHERE
qt.text LIKE N'SELECT WorkOrderID, p.Name, OrderQty, DueDate%';
EXECUTE sp_create_plan_guide_from_handle
@name = N'MyPlanGuide',
@plan_handle = @plan_handle,
@statement_start_offset = @offset;
GO
There are event classes to see if its being used or missed e.g. you might have changed the schema for example and these are cunningly named as
- plan guide successful
- plan guide unsuccessful
You can check your plan guides using fn_validate_plan_guide(plan_guide_id), for example
USE AdventureWorks;
GO
SELECT plan_guide_id, msgnum, severity, state, message
FROM sys.plan_guides
CROSS APPLY fn_validate_plan_guide(plan_guide_id);
GO
You can see if a plan guide is in use if you run the query again and look at the properties of the query plan (press F4):
The plan guide will also show up under programmability in Management Studio:
and to get rid of it you’ll need to run
EXEC sp_control_plan_guide @operation = N'DROP', @name = N'[MyPlanGuide]'
Finally the full detail is here in the SQL Server TechCenter
I rarely go off topic on my blog, but I was wondering whether I am wasting my time here. Eileen rhas some research predicts that blogging is becoming redundant and the new thing to do is microblog on Twitter (I am DeepFat on Twitter) , Yammer et al, as this is the best way to share things instantly. My research indicates that my stuff is being read more and more and I get a lot of positive feedback when I get to meet some of you. I do use microblogging to keep in touch with what I am doing, but not how to do stuff or why.
Just to be clear this blog is not about level 400 deep dives in to the dark corners of SQL Server. It’s about:
- (hopefully) useful stuff related to data management and BI,
- making sense of how Microsoft’s solutions fit together,
- when to use what.
I don’t see the point in competing with the excellent technical output from MVPs and the wider UK SQL community, or with all of the essential stuff in TechNet. When I call out important KB’s, Hand on Labs, WebCasts and events, it’s because these are in my opinion the good ones.
Another point is that I find blogs generally easy to search and I use mine as my personal books on line. Some of the stuff on here requires considerable research, as I haven’t always got the answer, so if I am wasting my time here I can free up about 20% of my day for other things.
My final point is that blogs are quite permanent which is both good and bad. Good because when SQL Server 2008 is in extended support some of this will be useful to those businesses that haven’t upgraded yet. Bad because if I make a mistake it’s there until it’s discovered (so far only the odd typo).
OK Rant over, let me know what you think while I get back to SQL Server.
Imagine you have a windows 2003/SQL Server 2005 cluster (active-active) and your mission is to both virtualise and maintain high availability using Hyper-V in Windows Server 2008/SQL Server 2008. This may sound odd but I was asked the question at TechEd and I wanted to do some research on it. Fortunately I on Windows 7 training this weekend and have access to 2 of the key players Jeff Woolsey who ‘owns’ Hyper-V development and clustering product manager Dave Dion.
So are your options? and more importantly how do they help?
I would start this by addressing what are you most worried about…
- Server outage
- Site Outage
- Network issues
- SAN issues and media loss
Also what are your goals e.g.
- Database availability
- User response time
- Supported by your solution providers (e.g. hardware vendors, Microsoft etc.)
- Ease of Maintenance, automation diagnostics
- Predictability that when something goes wrong you know what will happen to mitigate it.
Finally other workloads are also on this setup which will also be virtualised.
The customer who asked me this wanted to cluster the cluster by..
- Creating a hyper-V cluster ClusterH on Node1 and Node2
- Creating virtual machines (VM) vmA and vmB to run on that Cluster with a resource group for each VM such that vmA runs on Node1 and vmB runs on Node2 (i.e. active active). Both of these are using pass though disks i.e. they are talking directly to a LUN on a SAN and so will perform very well.
- Clustering vmA and vmB to form ClusterV with SQL Server 2008 installed.
So how does this stack up against our criteria above?
Basically not very good..
Initially this solution looks quite good - if node 1 or node 2 dies then the surviving vm on the surviving node will take ownership of the storage. But this will take time. Currently hyper-V VM’s only support iSCSI (fibre channel is on the way). The two VMs will have different routes to the same shared storage so if that is not accessible then the system will fail. However the killer is that Microsoft do not support SQL Server failover clustering of VMs as per this KB. BTW this limitation applies to all virtualisation technologies, not just Hyper-V.
Is there a better way? For example if you mirrored the critical databases between vmA and vmB then you now have a copy of the database on either shared or separate storage. You could then script a manual failover to the mirror if the node running on the principal moved over to the other node as its storage would now be remote to the VM. This is supported, and mitigates a number of failures including:
- Hardware failure of a node.
- Media failure on one of the nodes or part of the SAN.
- Site failure if the nodes are remote to each other and have shared SAN
- An instance of SQL Server crashing
Of course this is at the database level, but the principal could be extended for more than one database and other workloads as well.
However in my opinion the best way to consolidate here is to create a physical cluster and consolidate SQL Server workloads on into it. Clustering is a lot easier in Windows Server 2008 and you n o longer need carefully matched hardware. though the surviving nodes in a cluster need to be powerful enough to handle all of the work of the cluster.
Let me know what you think and what your plans are in this space.