Virtualization with SQL Server

This is something that both Paul and I want to research more but, there have been a lot of questions about this recently. So, I thought I'd pull together a quick post of some important resources if you're interested in virtualization.

1) You should know what the support implications are - Bob Ward (Microsoft PSS Principal Escalation Engineer extraordinaire) - has blogged about the support implications here: SQL Server Support in a Hardware Virtualization Environment

2) You should know some of the best practices in configuration and setup - and the SQL team (specifically the SQLCAT team) has just recently released a whitepaper: Running SQL Server 2008 in a Hyper-V Environment - Best Practices and Performance Recommendations.

3) Finally, here are a few additional - and helpful resources: 

  • SQL Server's virtualization section on microsoft.com: http://www.microsoft.com/sqlserver/2008/en/us/virtualization.aspx which has a lot of links/resources/references.
  • If you're a SQL Server Magazine subscriber, check out Demystifying SQL Server Virtualization by Mike Otey in the the September 2008 edition of SQL Server Magazine

Have fun and post back with any issues on virtualization that you've run into!

kt

Transaction Log VLFs - too many or too few?

Way back in June 2005, I blogged about '8 Steps to better transaction log throughput'. I did this blog post after seeing (again and again) overly fragmented transaction logs... Transaction logs can become *VERY* fragmented when they are not preallocated and instead they grow excessively through unmanaged (and probably the default settings for) auto-growth.

While having WAY too many VLFs because of auto-growth is still the most common form of problem within transaction logs, another problem has been creeping up more and more... too few VLFs. If you preallocate a very large transaction log (10s to 100s of GB), SQL Server may only allocate a few VLFs - as a result, log backups will be allowed to run normally but, SQL Server only clears the inactive VLFs when you've moved into a different VLF. If your VLFs are 8GB in size, then you need to accumulate 8GB of log information before the log can be cleared...so, many of your log backups will occur normally but then one (the one that finally hits > 8GB in used size) will take quite a bit more time AND possibly cause you performance problems because it's now clearing 8GB of log information.

First, here's how the log is divided into VLFs. Each "chunk" that is added, is divided into VLFs at the time the log growth (regardless of whether this is a manual or auto-grow addition) and it's all dependant on the size that is ADDED not the size of the log itself. So, take a 10MB log that is extended to 50MB, here a 40MB chunk is being added. This 40MB chunk will be divided into 4 VLFs. Here's the breakdown for chunksize:

chunks less than 64MB = 4 VLFs

chunks of 64MB and less than 1GB = 8 VLFs

chunks of 1GB and larger = 16 VLFs

And, what this translates into is that a transaction log of 64GB would have 16 VLFs of 4GB each. As a result, the transaction log could only clear at more than 4GB of log information AND that only when it's completely inactive.

To have a more ideally sized VLF, consider creating the transaction log in 8GB chunks (8GB, then extend it to 16GB, then extend it to 24GB and so forth) so that the number (and size) of your VLFs is more reasonable (in this case 512MB).

Have fun and thanks for reading!!
kt

PS - I've been made aware of a bug when you use an exact size of 4096MB. I'll get more details and post them here but the long story short is to avoid 4096MB as an exact value. I've been told (and I haven't played with this one yet), that 4095 doesn't have the problem. Oh, and the problem is that the 4GB does NOT get divided into equally sized VLFs.

SQL Server 2008 Central Management Servers - have you seen these?

There's a new feature of SQL Server 2008 that during beta was called a Configuration Server... In RTM it was renamed to a Central Management Server. It's not overly obvious and without knowing how cool it can be - you might skip right by it. And, if you're using SQL Server 2008 to manage 2000, 2005 and 2008 servers - this actually works for all of those (and can be quite cool).

In the "Registered Servers" pane, expand the "Database Engine" option to see this feature. It's basically a separate option in addition to Local Server Groups. Right-click to register a Central Management Server.

A Central Management Servers (at first glance) seems as though it's solely a way to store server groups and registered servers - centrally - so that you (and your team) have easier access to the registration properties of a group of servers. And, in setting it up, adding groups, and registering servers - well, it doesn't seem like it is any different. In fact, that's all it is - in terms of creation. However, there are new options in terms of how to use it. It definitely IS different! In fact, it's in this simplicity (and the options it exposes) that lies its coolness. Let me show you :)

On a VPC (that we typically use for demos/labs, etc. for SQL Server 2008), we have 5 instances installed (SQLDev01, SQLDev02, SQLDev03, SQLExpress and ConfigServer). All instances (except Express) are SQL Server 2008 RTM Developer Edition and they're just simply named instances (there is no special designation (other than its name) for "ConfigServer"). I created the "ConfigServer" instance as a "server to hold configuration and general centralized management features - like Policy-Based Management, the management data warehouse for Performance Data Collection, a Master Server (for Master Server/Target Server Administration), and even a centralized management server. For many of these things you'll want to use a non-Express Edition of SQL Server as some features won't work on SQL Express (I'll explain more on this later). So, for this example, I'm going to use the ConfigServer instance (note: Centralized Management Servers *DO* work with SQL Express - something I previously thought was not possible... and, this is WAY cool). Regardless, the overall features that I plan to use with this server (like Policy-Based Management) require a version other than SQL Express so I'm going to use a Developer Edition instance for this centralized management server.

OK, so in this case, I created two groups (Development and Production) and each group has servers...

If I right-click on ANY level (a specific server, a group, or the Centralized Management Server itself) then you get multiple options as seen above:

  • New Query
  • Object Explorer
  • Evaluate Policies
  • Import Policies

In this case, I'm going to right-click on the Production Group and choose New Query – this opens a query window with one slight difference. The status bar at the bottom has a different color. OK, I bet you won’t even really notice this but a regular query window has a pale yellow status bar. A centralized management server query window has a pale pink status bar. I find that this isn’t overly noticeable – so I change it to a much more bold color (fuschia!). To change this, use Tools, Options, Text Editor, Editor Tab and Status Bar and then change the setting for the Group Connections option under the Status Bar Layout and Colors section. Anything that stands out is preferred. The reason why will be apparent soon…Within this query window, anything I execute will be executed against each of the servers in this group. I can even choose to execute a query against the centralized management server itself and this will execute against every server in every group. Powerful, but potentially-too-easily dangerous.

Here, I’ll execute SELECT @@version against the Production group:

The default behavior is that this is executed against all servers and the results are unioned (or merged). If you want to change this behavior you can also set this in Tools, Options under Query Results, SQL Server, Multiserver Results – “Merge Results” equals True. You can also set whether or not the login name is appended to the results (this defaults to false) in addition to whether or not the server name is appended (this defaults to true). You can change all three of these settings.

So, why did I make the status bar fuschia… what if I execute DROP DATABASE dbname… yes, it will try and drop this database from all servers in the group. While very powerful, this is something of which you should be careful.

OK, so here are the most interesting things about centralized management servers:

1) To designate a server as a CMS - that server must have an msdb. Originally, I thought that you couldn't use SQLExpress but because SQLExpress *does* have an msdb, this does work. However, SQLExpress lacks many other features/capabilities so outside of simplifying connectivity and query access, I'd probably target a higher edition of SQL Server. Even more specifically, if you're interested in where all of the server group information is stored:

Local Server Groups stored in: C:\Documents and Settings\<user>\Application Data\Microsoft\Microsoft SQL Server\100\Tools\Shell\RegSvr.xml

Centralized Management Server details are stored within the msdb in:

dbo.sysmanagement_shared_registered_servers_internal

dbo.sysmanagement_shared_server_groups_internal

2) Not only can you execute queries but you can right-click and choose Object Explorer and each server will be connected to and entered into your Object Explorer window. This will minimize your manually connecting/opening each server one by one.

3) AND, the servers can be servers other than SQL Server 2008... I've connected to 2005 and 2008 and you shouldn't have a problem adding any other SQL Servers - even 7.0...but, I haven't tried that (let me know if it works for 6.5 :).

3) Finally, in addition to executing queries, you can also Evaluate or Import Policies. This is something that can be really powerful. In addition to executing something directly, you can evaluate a policy against a number of servers in one step. Policy-Based Management is something that you'll start (if you haven't already) hearing a lot more about. And, since I'm always behind at blogging - check out these other links/blogs for more info on PBM:

Enjoy and thanks for reading,

kt

INNODB Shared Locks, Exclusive Locks and INSERT INTO SELECT FROM

I wrote an application that is able to send out 3-8 million messages an hour with only 10 CPU's. This application is a part of an Offline Task system that scales linearly.

How is this done, I'll go into detail hopefully at the mySQL conference if they accept my proposal.

The scope of this blog post is to go over building the "Task Queues". Currently I have 13 Queues, one queue for each Shard that I run. The data is federated by user or randomly federated with a GUID that lives as long as the job. A request came in to add 20 million jobs to the queue all at once. The problem is with this list, will I cause deadlocks in innodb as I add the jobs to the queue as one transaction? Can live traffic still write to this queue?

To verify that Deadlocks will not occur - having an understanding about how locks work in INNODB is key. I suggest reading this page.

To build the queue I dumped the data source by

SELECT identifier, 16 INTO OUTFILE "/data/mysql/BuildQueue.log" FROM SOURCE_TABLE WHERE CONDITION.


The isolation level is REPEATABLE-READ; I'm setting a shared lock to get the most current version of the data. Writes are not blocked.

Next:

I create a table on each shard where the queue is located.

CREATE TABLE IF NOT EXISTS OfflineTasksHold (
object_id bigint(20) NOT NULL DEFAULT 0,
object_type smallint unsigned NOT NULL DEFAULT 0,
PRIMARY KEY(object_id,object_type)
) ENGINE=INNODB;



Then I issue a command on each Shard


LOAD DATA INFILE "/data/mysql/BuildQueue.log" INSERT INTO TABLE OfflineTasksHold;
START TRANSACTION;
INSERT INTO OfflineTasks (object_id, object_type) SELECT * FROM OfflineTasksHold;
COMMIT;


Each shard is getting around 4-5 million rows, while accepting real-time traffic of 20-60 tasks a second to the OfflineTasks table. The OfflineTasksHold table does not have any real-time requirements and is solely used to keep the queue in Primary Key order, plus there is a CHANCE that LOAD DATA could set an exclusive lock on the OfflineTasks table-shutting down adding data to the table by the live site. The INSERT sets an exclusive lock on the rows that are being added, so the Offline Task Sheppard - the process that pops tasks off the queue is blocked for a small period - which is acceptable. Why are they blocked? Well, the massive insert sets an Exclusive Lock. The Sheppard is trying to grab the rows that are locked waiting up to 50 seconds, until innodb_wait_timeout is reached. This condition is acceptable. All other inserts are able to go into the queue without a slow down.

In summary, the job queue is built and can be automated with confidence knowing that death to the various app will not occur. Processing slows down for a bit but speed right back up.

Processing an OLAP cube with a T-SQL Stored Procedure

Here's a simple SQL Server stored procedure that you can call to process an OLAP cube using T-SQL.  The parameters should be self-explanatory.  To me, this is a little easier and more flexible than processing using DTS packages.

create procedure ProcessCube

    @Database varchar(100),

    @Cube varchar(100),

    @Partition varchar(100)  = null, -- If NULL, process the entire Cube

    @Server varchar(100) = 'localhost'

as

 

/*

    Author:        Jeff Smith

    Version:    10/27/2008

*/

 

/* variables used to store object handles */

declare @o_svr int, @o_db int, @o_cube int, @o_part int, @o_mds int

declare @hr int

 

/* different cube processing options. This SP uses "default" */

declare @PROCESS_DEFAULT int

declare @PROCESS_FULL int

declare @PROCESS_REFRESH_DATA int

 

set @PROCESS_DEFAULT = 0

set @PROCESS_FULL = 1

set @PROCESS_REFRESH_DATA = 2

 

-- create a DSO.Server object:

exec @hr = sp_OACreate 'DSO.Server', @o_svr out

if @hr <> 0

    begin

        print 'Error at create server:'

        exec sp_OAGetErrorInfo @o_svr

        goto cleanup

    end

 

-- Connect to the server:

exec @hr = sp_OAMethod @o_svr, 'Connect', null, @Server

if @hr <> 0

    begin

        print 'Error at connect to server:'

        exec sp_OAGetErrorInfo @o_svr

        goto cleanup

    end

 

-- Get the MDStores property from the Server:

exec @hr = sp_OAGetProperty @o_svr,'MDStores', @o_mds OUT

if @hr <> 0

    begin

        print 'Error at get getting Server MDStores:'

        exec sp_OAGetErrorInfo @o_svr

        goto cleanup

    end

 

-- Get the database from the MDStores:

exec @hr = sp_OAGetProperty @o_mds,'Item',@o_db OUT,@Database

if @hr <> 0

    begin

        print 'Error at get database:'

        exec sp_OAGetErrorInfo @o_mds

        goto cleanup

    end

 

-- get the MDStores property from the database:

exec sp_OADestroy @o_mds

exec @hr = sp_OAGetProperty @o_db,'MDStores', @o_mds OUT

if @hr <> 0

    begin

        print 'Error at get database MDStores:'

        exec sp_OAGetErrorInfo @o_db

        goto cleanup

    end

 

-- get the Cube from the MDStores

exec @hr = sp_OAGetProperty @o_mds,'Item',@o_cube OUT, @Cube

if @hr <> 0

    begin

        print 'Error at get Cube:'

        exec sp_OAGetErrorInfo @o_mds

        goto cleanup

    end

 

if @Partition is null -- Process the entire Cube, not just a single partition

    begin

        exec @hr = sp_OAMethod @o_cube, 'Process', null, @PROCESS_DEFAULT

        if @hr <> 0

            begin

                print 'Error at process Cube:'

                exec sp_OAGetErrorInfo @o_cube

                goto cleanup

            end

    end

 

else    -- just process the specified Partition

 

    begin

        -- Get the MDStores property of the Cube:

        exec sp_OADestroy @o_mds

        exec @hr = sp_OAGetProperty @o_cube,'MDStores', @o_mds OUT

        if @hr <> 0

            begin

                print 'Error at get Cube MDStores:'

                exec sp_OAGetErrorInfo @o_cube

                goto cleanup

            end

 

        -- Get the partition to process:

        exec @hr = sp_OAGetProperty @o_mds,'Item',@o_part OUT, @Partition

        if @hr <> 0

            begin

                print 'Error at get Parition:'

                exec sp_OAGetErrorInfo @o_mds

                goto cleanup

            end

 

        -- Process the partition:

        exec @hr = sp_OAMethod @o_part, 'Process', null, @PROCESS_DEFAULT

        if @hr <> 0

            begin

                print 'Error at process Partition:'

                exec sp_OAGetErrorInfo @o_part

                goto cleanup

            end

    end

 

-- And unlock all objects on the server:

exec @hr = sp_OAMethod @o_svr, 'UnlockAllObjects'

if @hr <> 0

    begin

        print 'Error at unlock all server objects:'

        exec sp_OAGetErrorInfo @o_svr

        goto cleanup

    end

 

cleanup:

 

if @o_mds is not null exec sp_OADestroy @o_mds

if @o_Part is not null exec sp_OADestroy @o_Part

if @o_cube is not null exec sp_OADestroy @o_cube

if @o_db is not null exec sp_OADestroy @o_db

if @o_svr is not null exec sp_OADestroy @o_svr

mySQL Disk Benchmarks with Sysbench. When will mySQL / INNODB stop scaling?

Today's "commercial quality disks" are amazing but they follow the same limitations as yesterday's disks. mySQL scales very well, but disks do not. So if you're IO bound when will your expectation of speed fail?

Test setup:

DELL 2950 PERC-6 HWRaid BBC 6 DISK 15K RPM 3.5" RAID-10 256K stripe across two channels-using WRITE THROUGH CACHE on mkfs.ext3 -T largefile4 Linux Filesystem.


The theory is that the outer part of the spindles is the fastest, and the inner portion is slower - since the outer is where the data starts (thanks for the info Benjamin Schweizer). Thus one can conclude that the more disk space your application(s) use the slower the throughput, since the heads have to move more. Brad F. my co-worker did a benchmark to prove this. Our goal is to find out at what is the saturation point if our expectation is to have 22 MB / sec of random access.

Why do we want 22MB / sec of random access throughput? We want to guarantee a certain level of performance when adding new apps to a common backend-which is I/O bound: we need to know when things will break.

Here is what Brad found: Total disk size for our RAID-10 setup =~ 800G. What point does it FAIL to achieve our expectations of sustained 22MB/s?

rndrw test across 100G test / 750G LV =~ 35 MB/s # outer part of the spindles
rndrw test across 100G test / 300G LV =~ 32 MB/s # outer part of the spindles
rndrw test across 250G test / 300G LV =~ 24 MB/s # sweet spot
rndrw test across 350G test / 384G LV =~ 21 MB/s # saturation point
rndrw test across 750G test / 800G LV =~ 14 MB/s # waste of space


In conclusion these test show that even though a RAID-10 setup with 800G of space is available, the expected performance drops when data exceeds the sweet spot of 250G-300G of 800G usable-data array.

Disclaimers: There are many factors that can raise or lower the bar, like different file systems, different I/O schedulers, flushing. For my setups I like

Deadline I/O scheduler
256K Stripe
few inodes (don't need them)
ext3 since that’s what stable and available.

I’m back … with some news!

I apologize for not posting any new content in quite some time, but now I am back and will soon start posting on a (hopefully!) regular basis once again.

The reason for my hiatus was a pretty good one, though: I recently became a father with the birth of my son Benjamin on October 2, 2008!   He is doing great and already knows that he should always do his formatting at the client and never within the database.  He's a natural!

As if that wasn't enough, I also found out recently I have been named as a 2009 SQL Server MVP!  Now I can finally get a decent seat at a restaurant by pulling the "do you know who I am?" routine!  In all seriousness, though, I am very honored and excited by the award and I intend to live up to it in 2009 by continuing to update this blog and help users out in the SQLTeam forums.

Thank you to everyone who reads this blog and participates in the discussions, and I promise more updates are coming soon.  If you have any specific topics or ideas for a blog post that you think I may be able to cover effectively, just let me know.

Best Regards,

Jeff Smith
SQL Server MVP
 
(Hey -- I like the way that looks!)

Scotland is amazingly beautiful but, ah, the weather *IS* worse than Seattle!

OK, I never really understood it when Paul said "Seattle doesn't even compare to Scotland" (in terms of weather) until our trip in September. And, well, that's also why I was so silent in September. We were teaching in the UK (London and Edinburgh) and Ireland (Dublin) for the first couple of weeks and then we were off to Paul's sister's wedding (in Helensburgh) and some sight-seeing in Scotland (Glencoe/Kyle of Lochalsh/Portree and all around Skye). We put over 1000 miles on our rental car... And, I could go on for hours about how many castles, sheep and highland cows we saw but I'd probably spell everything wrong. So, for this post, I thought I'd just pass on a few of my favorite shots (including Paul in a kilt!).

Enjoy!

(click-through if you want to see a 1024x768 image)

This is Paul and his Father Garth - just before the wedding festivities started... no, they don't look alike at all??! WOW!

And this is Stirling Castle - which we saw on our way from Edinburgh to Helensburgh...

On a free day between arriving in Helensburgh and the wedding, we took a long day drive around Inveraray and Kilmartin and ended up driving through Oban and then back to Helensburgh. We saw a couple of cool things - lots and lots and lots of sheep:

and the shores of Loch Fyne around Inveraray were stunning:

and, Castle Sween (which is now apparently owned by [not all that friendly] people that run a trailer park... and they won't let anyone park near the castle (even though there seemed to be plenty of spots...). Instead they want you to park about a mile away and walk to the ruin. But, the good news is that there are great viewing spots up the hill as well:

Then, after the wedding, we took off for Kyle of Lochalsh and Skye. We drove past Lochcarron and this shot is across the loch from the church and [the new and old] burial grounds at Lochcarron.

Then we went up to Strome castle:

And, over to Applecross (which is one of Paul's favorite spots - mostly because of how hard it is to get there AND the excellent driving on the way (up the Bealach na Ba)):

And then heading up the coast from Applecross:

And, Eilean Donan Castle which really hit more as a drive-by/stop-quickly-photograph because umpteen tour buses has just arrived and the one thing neither of us likes is sight-seeing crowds. We did go into the Dunvegan Castle and we were there with only 20 or so people (in the entire castle) so that was a bit more our speed:

And one of the hightlights of our trip was taking the Glenelg Ferry to Skye. The turntable ferry driver (pictured there straightening/turning for the second row of cars - yes, it can only take 6 cars total!) was excellent. Once we got to the other side (and there were no cars waiting on either side), we all chatted for a few minutes... great guy - and even though the Glenelg Ferry isn't the main transport to Skye (there's a relatively new bridge that's just by Kyle of Lochalsh), it's really worth the drive. The drive is BEAUTIFUL and there are some excellent places to stop along the way. And, once across, there's a "hide" on the other side (about a 1/4 mile walk) that allows you to view the small lighthouse, seals, sea otters, and all sorts of sea birds, etc. And, the bridge is right where we were staying but we still went out of our way to take the Ferry over (yes, we took the bridge back).

And, I have a soft spot for highland cows.......... (ok, Paul wanted to kill me after our 50th cow. "Can we please stop so I can take a picture" probably still haunts him.)

And, Dunvegan Castle on a notably miserable day that this picture doesn't even begin to show:

And, there are just tons and tons and tons more... but, that's a fairly good sampling. Oh, and a final place to go - Neist Point:

In fact, if (no WHEN!) we go back for a few days, we're planning to stay/dine at the Three Chimneys (we heard amazing things about the restaurant but just didn't get a chance to go because our hotel was 30+ miles away...and a nice dinner needs a nice bottle of wine (and not 30 miles of single track roads to drive)). So, a night (or two) at the Three Chimneys and at least another road trip to the lighthouse at Neist Point. Apparently, you can even stay there. The lighthouse is on the other side (45 minutes by foot - but worth it):

Enjoy!
kt

RunAs Radio Interview Posted - “Kim Tripp Indexes Everything”

Well... I think I had had too much tea that morning ;-). But, as always, chatting with Richard and Greg was great. Here's the specific show link: http://www.runasradio.com/default.aspx?showNum=76.

Oh, and just for the record, I didn't come up with that title. But, I do hope that all your [high-priority and important] queries are indexed!

Enjoy!
kt

SQL2008HOLsVPC setup instructions and a note for the August 2008 release

OK, I know this post doesn't really apply to most folks but I did want to have the setup instructions on my blog so that I could refer people to it. For the August release we updated all 17 labs and all of the demos for SQL Server 2008 RTM; however, we did use a "security enabled" VPC. And, well, security is bugging people right about now because the password policy in this VPC was set for 42 days. I know all of you run into this on production servers but on a test VPC, it's pretty much nothing more than an annoyance. So... I thought I'd mention how to disable it for the VPC.

(1)    Remove the policy for 42 password expiration

                Local Security Policy Editor (Admin Tools, Local Security Policy)

                Security Settings

                                Account Policies

                                                Password Policy

                                                                Set the Maximum password age to 0 days (which is no password expiration)

 

(2)    Change the SQLService account to not require a password on next login

Computer Management

                                Users and Groups

                                                SQLService

                                                                Deselect the option "User must change password at next login"

         you might not actually need to set this but it's good to check!

 

(3)    Windows Update

While you're at it then, you should also consider running Windows Update...

 

(4)    Shutdown and save changes...

And, here's the full instructions that we usually give out at conferences/workshops: SQL2008Aug HOLs DVD SETUP Instructions.pdf (25.93 KB)

As for accessing this content on the web, it's part of Jumpstart and it will be part of the SQL Server 2008 Resource Kit. Also, I just refreshed the image for the resource kit so that it does NOT have the password policy enabled so you won't need to do that step. Also, the Jumpstart folks will probably refresh the image there so that it also uses this version. Regardless, you might still want to run Windows Update. And, you'll also find a lot of these resources at upcoming events and conferences as well as through these resources.

Enjoy!
kt