Archive for the 'Miscellaneous' Category

How to calculate Median in SQL Server 2005 or SQL Server 2008

Nothing earth-shattering here, I was just helping out a colleague with this so I thought I’d post up the example I gave him.

– sample table:
create table People
(
    Person varchar(1) primary key,
    City varchar(10),
    Age int
)

go

– with some sample data:

insert into People
select ‘A’,'Boston’,23 union all  — odd #
select ‘B’,'Boston’,43 union all
select ‘C’,'Boston’,29 union all

select ‘D’,'Chicago’,15 union all — single #

select ‘E’,'NY’,12 union all  — even #
select ‘F’,'NY’,55 union all
select ‘G’,'NY’,57 union all
select ‘H’,'NY’,61

go

– here’s our query, showing median age per city:

select city,
    AVG(age) as MedianAge
from
(
    select City, Person, Age,
        ROW_NUMBER() over (partition by City order by Age ASC) as AgeRank,
        COUNT(*) over (partition by City) as CityCount
    from
        People
) x
where
    x.AgeRank in (x.CityCount/2+1, (x.CityCount+1)/2)   
group by
    x.City    
   

go

– clean it all up
drop table People

And here’s the result:


city       MedianAge
———- ———–
Boston     29
Chicago    15
NY         56

(3 row(s) affected)

Simple remove “City” from the SELECT clause and the GROUP BY clause to get the median age for all. 

There may be more efficient tricks out there, but this is certainly the shortest and simplest technique I am aware of.

 


An interesting take on “Stored Procedures” (link)

Another great post SQL-related post over at TheDailyWTF regarding a, umm, “creative” use of “Stored Procedures”:

http://thedailywtf.com/Articles/For-the-Ease-of-Maintenance.aspx


Programming is like dreaming?

This is from March, so it’s a little old, but I just stumbled upon it and found it a bit interesting. I never thought of programming this way before, but it does make sense:

Programming is like a dream


Is it just me …

… or is about time I got back to some blogging?


Processing an OLAP cube with a T-SQL Stored Procedure

Here’s a simple SQL Server stored procedure that you can call to process an OLAP cube using T-SQL.  The parameters should be self-explanatory.  To me, this is a little easier and more flexible than processing using DTS packages.

create procedure ProcessCube

    @Database varchar(100),

    @Cube varchar(100),

    @Partition varchar(100)  = null, – If NULL, process the entire Cube

    @Server varchar(100) = ‘localhost’

as

 

/*

    Author:        Jeff Smith

    Version:    10/27/2008

*/

 

/* variables used to store object handles */

declare @o_svr int, @o_db int, @o_cube int, @o_part int, @o_mds int

declare @hr int

 

/* different cube processing options. This SP uses “default” */

declare @PROCESS_DEFAULT int

declare @PROCESS_FULL int

declare @PROCESS_REFRESH_DATA int

 

set @PROCESS_DEFAULT = 0

set @PROCESS_FULL = 1

set @PROCESS_REFRESH_DATA = 2

 

– create a DSO.Server object:

exec @hr = sp_OACreate ‘DSO.Server’, @o_svr out

if @hr <> 0

    begin

        print ‘Error at create server:’

        exec sp_OAGetErrorInfo @o_svr

        goto cleanup

    end

 

– Connect to the server:

exec @hr = sp_OAMethod @o_svr, ‘Connect’, null, @Server

if @hr <> 0

    begin

        print ‘Error at connect to server:’

        exec sp_OAGetErrorInfo @o_svr

        goto cleanup

    end

 

– Get the MDStores property from the Server:

exec @hr = sp_OAGetProperty @o_svr,‘MDStores’, @o_mds OUT

if @hr <> 0

    begin

        print ‘Error at get getting Server MDStores:’

        exec sp_OAGetErrorInfo @o_svr

        goto cleanup

    end

 

– Get the database from the MDStores:

exec @hr = sp_OAGetProperty @o_mds,‘Item’,@o_db OUT,@Database

if @hr <> 0

    begin

        print ‘Error at get database:’

        exec sp_OAGetErrorInfo @o_mds

        goto cleanup

    end

 

– get the MDStores property from the database:

exec sp_OADestroy @o_mds

exec @hr = sp_OAGetProperty @o_db,‘MDStores’, @o_mds OUT

if @hr <> 0

    begin

        print ‘Error at get database MDStores:’

        exec sp_OAGetErrorInfo @o_db

        goto cleanup

    end

 

– get the Cube from the MDStores

exec @hr = sp_OAGetProperty @o_mds,‘Item’,@o_cube OUT, @Cube

if @hr <> 0

    begin

        print ‘Error at get Cube:’

        exec sp_OAGetErrorInfo @o_mds

        goto cleanup

    end

 

if @Partition is null – Process the entire Cube, not just a single partition

    begin

        exec @hr = sp_OAMethod @o_cube, ‘Process’, null, @PROCESS_DEFAULT

        if @hr <> 0

            begin

                print ‘Error at process Cube:’

                exec sp_OAGetErrorInfo @o_cube

                goto cleanup

            end

    end

 

else    – just process the specified Partition

 

    begin

        – Get the MDStores property of the Cube:

        exec sp_OADestroy @o_mds

        exec @hr = sp_OAGetProperty @o_cube,‘MDStores’, @o_mds OUT

        if @hr <> 0

            begin

                print ‘Error at get Cube MDStores:’

                exec sp_OAGetErrorInfo @o_cube

                goto cleanup

            end

 

        Get the partition to process:

        exec @hr = sp_OAGetProperty @o_mds,‘Item’,@o_part OUT, @Partition

        if @hr <> 0

            begin

                print ‘Error at get Parition:’

                exec sp_OAGetErrorInfo @o_mds

                goto cleanup

            end

 

        – Process the partition:

        exec @hr = sp_OAMethod @o_part, ‘Process’, null, @PROCESS_DEFAULT

        if @hr <> 0

            begin

                print ‘Error at process Partition:’

                exec sp_OAGetErrorInfo @o_part

                goto cleanup

            end

    end

 

– And unlock all objects on the server:

exec @hr = sp_OAMethod @o_svr, ‘UnlockAllObjects’

if @hr <> 0

    begin

        print ‘Error at unlock all server objects:’

        exec sp_OAGetErrorInfo @o_svr

        goto cleanup

    end

 

cleanup:

 

if @o_mds is not null exec sp_OADestroy @o_mds

if @o_Part is not null exec sp_OADestroy @o_Part

if @o_cube is not null exec sp_OADestroy @o_cube

if @o_db is not null exec sp_OADestroy @o_db

if @o_svr is not null exec sp_OADestroy @o_svr

I’m back … with some news!

I apologize for not posting any new content in quite some time, but now I am back and will soon start posting on a (hopefully!) regular basis once again.

The reason for my hiatus was a pretty good one, though: I recently became a father with the birth of my son Benjamin on October 2, 2008!   He is doing great and already knows that he should always do his formatting at the client and never within the database.  He’s a natural!

As if that wasn’t enough, I also found out recently I have been named as a 2009 SQL Server MVP!  Now I can finally get a decent seat at a restaurant by pulling the “do you know who I am?” routine!  In all seriousness, though, I am very honored and excited by the award and I intend to live up to it in 2009 by continuing to update this blog and help users out in the SQLTeam forums.

Thank you to everyone who reads this blog and participates in the discussions, and I promise more updates are coming soon.  If you have any specific topics or ideas for a blog post that you think I may be able to cover effectively, just let me know.

Best Regards,

Jeff Smith
SQL Server MVP
 
(Hey — I like the way that looks!)

Foreign Key Constraints: SET NULL and SET DEFAULT

Most people know about cascading updates and deletes, but did you know there are two other foreign key constraint options you can use to maintain referential integrity?

Read all about them in my latest article over at SQLTeam.com.

These features, introduced with SQL Server 2005, haven’t got a lot of publicity, but they can be very useful.  I just used the SET NULL option recently for the first time (inspiring me to put together an article on it) and it works great.

Database Column Names != Report Headings

Always remember that the column names returned in a result set do not have to be the same as what you eventually output at your presentation layer.  

For example, suppose you have stored procedure that accepts a @CurrentYear parameter and returns a sales variance between the current year and the previous year for each customer.  I often see programmers struggling with writing dynamic SQL to produce output like this:

CustomerID   2008 Total    2007 Total   Variance

———-   ———-    ———-   ——–

ABC          $100          $50          $50

DEF          $200          $250         -$50

That is, the names of the columns vary based on the data; that is not a good way to return data from your database!  A much better result set to return is simply this:

CustomerID   CurrentYear   PrevYear   Variance

———-   ———-    ———  ——–

ABC          $100          $50        $50

DEF          $200          $250       -$50

Notice that with that set of columns, no dynamic SQL is needed, and the column names returned are always constant regardless of the value of the @CurrYear parameter.  

As mentioned, the fact that your data set has columns labelled “CurrentYear” and “PrevYear” does not mean that you cannot re-label them any way that you like on your report or web page. 

If your client code called the stored procedure and provided a @CurrentYear parameter, then it knows exactly what “CurrentYear” and “PrevYear” represent, and you can easily label the columns in the final result exactly as needed with simple formulas or a few lines of code. 

Remember that in the world of relational database programming, table names and column names should be constant — only the data itself should change.   Focus on returning consistently structured data from your database, and let your client applications handle the labeling of columns to make them look nice. 

How To Calculate the Number of Week Days Between two Dates

If the start date and end date are both week days, then the total number of week days in between is simply:

(total difference in days) - (total difference in weeks) * 2

or

 DateDiff(dd, @start, @end) - DateDiff(ww, @start, @end)*2

… since the DateDiff() function with weeks returns the number of week “boundaries” that are crossed; i.e., the number of weekends.

If you have a table of holidays, then you can simply subtract them out as well:

DateDiff(dd, @start, @end) -
DateDiff(ww, @start, @end)*2 - 
(select count(*) from holidays where holiday_date between @start and @end)

Now, what if the start day or the end day is on a weekend?  In that case, you need to define what to do in those situations in your requirements.

For example, if the start date is Sunday, Nov 20th, and the end day is Monday, Nov 21st — how many week days are between those dates? There’s no universal correct answer; it could be 0, or 1, or perhaps even “undefined” (null) depending on your needs.

Convert input explicitly at your client; don’t rely on the database to “figure it out”

A common mistake beginners make when working with SQL is trying to format their output at the database layer, as opposed to simply doing this at the presentation layer (i.e., client application, reporting tool, web page, etc).  I’ve covered that quite a bit in various blog posts, but I’ve only touched upon another similar issue which I feel is equally as important and also commonly mishandled.

In the SqlTeam forums, I often see code that accepts DateTime input in the form of a string value (say, from a TextBox on a web form) and uploads that value to the database written like this:

SqlCommand c = new SqlCommand();

c.CommandText = “insert into SomeTable (DateCol) values (’” + txtDate.Text + “‘)”;

c.ExecuteNonQuery();

Now, I think that hopefully even most beginners will agree that this is bad code.   The primary issue, of course, is SQL Injection.  Avoiding SQL Injection is very easy to do using Parameters.  So, let’s say that you rewrite this code using parameters like this:

SqlCommand c = new SqlCommand();
c.CommandText = “insert into SomeTable (DateCol) values (@DateVal)”;
c.Parameters.AddWithValue(”@DateVal”,txtDate.Text);
c.ExecuteNonQuery();

Looking at that, it seems we have done quite a bit better and should be happy with the code.  It works well, and no injection is possible.  But there is still an issue!  Why?  The txtDate.Text property returns a string, not a DateTime!   And, since we are not setting the data type of the parameter explicitly, the parameter being passed is a string (i.e., VARCHAR or NVARCHAR) value, not a true DateTime value.  This means that SQL Server must implicitly cast your string to a DateTime to store it in your table, and this may or may not work successfully, or as expected, depending on how the string is formatted. 

I’ve said it over and over and I’ll say it again:  The concept of formatting dates should never be something that your database code should ever worry about.  The database layer should be accepting DateTime data from clients, and returning DateTime data to your clients. Where and how the client got the data before passing it to the database, or what the client does with the data in terms of formatting after receiving it from the database is of no concern to the database itself. 

So, we might decide that to fix this, we can simply declare the data type of the parameter explicitly:

SqlCommand c = new SqlCommand();
c.CommandText = “insert into SomeTable (DateCol) values (@DateVal)”;
c.Parameters.Add(”@DateVal”, SqlDbType.DateTime).Value = txtDate.Text;
c.ExecuteNonQuery();

It appears that now we are in good shape, right?  Actually — no! There is still an implicit conversion happening, because we are still passing a string value — the txtDate.Text property — to the parameter, not a true DateTime!

Let’s try one more time.  How can we avoid these implicit conversions?  The answer that question is always the same: Convert explicitly!  Your client application is fully capable of handling the parsing, validation, and conversion of that string to a true DateTime value, so go ahead and do it:

DateTime dateval = DateTime.Parse(txtDate.Text); // plus more code to validate, of course

SqlCommand c = new SqlCommand();
c.CommandText = “insert into SomeTable (DateCol) values (@DateVal)”;
c.Parameters.Add(”@DateVal”, SqlDbType.DateTime).Value = dateval;
c.ExecuteNonQuery();

Now we are in business!  Before we even create the SqlCommand object, we have a true DateTime value that we are ready to pass along to SQL Server.  Our SQL code doesn’t need to worry about formatting, parsing, converting, or anything — it is being passed a completely valid piece of data with the correct type.   In short, we can now be sure that whatever value we came up with for the date in our client code is exactly the value that will be stored in our database.  That’s the idea, right?

So, please, don’t rely on your database code to validate your input.    Don’t just pass along generic string data and “hope” that at the end of the day the database can “handle it”.  Eliminate the chance of anything going wrong and write your code to explicitly cast and convert and validate any and all input before the database even comes into the picture.