Looking for security vulnerabilities in database code

I've always been concerned with security and I've always stressed the importance of auditing the REAL user context not just the current user (see this post on EXECUTE AS and auditing). So, I generally try to avoid using dynamic string execution and if necessary create well tested/protected parameters (fyi - using QUOTENAME can be a fantasic solution to protectng identifiers as input parameters but it can't protect more complex strings).

Having said that, what if I'm looking at a database for the first time... just poking around trying to see if there's anything that needs further attention? I've come up with a quick query... And, while it's not going to "solve" your problem (as that's going to take some re-writing of code) or even truly verify if you're vulnerable, it gives you a "quick list" of where you should look first! If your code uses dynamic strings AND it's elevated - then start there! 

SELECT OBJECT_NAME(object_id) AS [Procedure Name],
  CASE
      WHEN sm.definition LIKE '%EXEC (%' OR sm.definition LIKE '%EXEC(%' THEN
'WARNING: code contains EXEC'
      WHEN sm.definition LIKE '%EXECUTE (%' OR sm.definition LIKE '%EXECUTE(%' THEN
'WARNING: code contains EXECUTE'
  END AS [Dynamic Strings]
,
  CASE
     
WHEN execute_as_principal_id IS NOT NULL THEN N'WARNING: EXECUTE AS ' + user_name(execute_as_principal_id
)
      ELSE
'Code to run as caller - check connection context'
  END AS [Execution Context Status]
FROM sys.sql_modules AS sm
ORDER BY [Procedure Name]

Is this enough? Anything else you'd check? What do you think?

THANKS!
kt

mysql UC 2009 Talk

Scribe is a bit buggy with displaying this presentation:

Scaling a Widget Company

PHP mysql connect and using APC to recover

So, when you connect on the fly to a database your subject to a variety of issues, like when the db is not available and when the db does not have a route.

One of the main reasons why a dev may want to connect on the fly is because they have too many front ends to hold a persistent connection on the backends. Since mySQL does not use libevent, holding open threads to mySQL is much more costly. Threads == Memory.

But, that's here nor there. The main purpose of this post is to talk about how to recover from failed connections that block apache threads.

Common Failures:
No route to Host
Flapping NIC
Locked Tables
Recovering from a Crash
more of the same.


My Environment:
I have a bunch of webservers (200+) that all have 300 possible threads (60000 possible connections to a single DB) behind a load balancer that uses the LB least connections protocol to distribute load across all 200+ webservers.

Since PHP is used in a stateless mode (no guarentee that the same user will hit the same server), I have to have some way of telling all other apache procs for said box that a server is down. I use stateless on the fly connections, so each apache proc will test the connection.

Here in lies my most common problem. If a box in the backend dies, all apache threads will block for a predefined time-out.

In /etc/php.ini (it could be in /etc/php.d/mysql.ini) I set

mysql.connect_timeout = 5 // the default is 60


In my common_db class when connecting to a database, the connect routine returns a database handle object

if (PEAR::isError($dbh) && $delta_to_connect >= 5){

#
# mark ip as dead for 15 min
#

apc_store($ip, array('DEAD'), 900);
return false;
}

return $dbh;




Now PRIOR to calling the database connect code I check to see if the IP is up.

$status = apc_fetch($ip);
if ($status[0] === 'DEAD'){
return false;
}

... do connect ...


But, there could be a variety of issues that can call false positives, like network flaps, someone blocking the db for some time etc. So, I'll allow one request .1% of the time bypass the status check to try again.


if ($status[0] === 'DEAD' && (mt_rand(0, 1000) != 1)){
return false;
}



But, if your app can't connect to the DB aren't you down anyway?

Yes, although I don't have to restart all the httpds.
Also each DB has a redundant pair to when returning false, the app code will try the other set of servers.

I have a variety of methods to deal with these scenarios but this is the quickest to implement.

The Tipping Point Query Answers

OK, I'll definitely take a beating from all of you for having gone so long between my survey posts and now. I won't even go into the details but between some crazy work schedules, multiple sinus problems and even migraines... well, I've been a bit behind. Let's just say that April/May were rough at best. I'm feeling better and well, now I'm trying to catch up. I had really gotten the blogging bug in March but I completely lost it in April. But, this tipping point series is in dire need of lots of explaining so I'm really hoping to get a few posts done in this area for sure!

First, I started the discussion around this in a few surveys:

Survey/Question 1

Q1 was described as this: if a table has 1 million rows at 20 rows per page (50,000 pages), at what percentage (roughly) of the data would a nonclustered index no longer be used. Blogged here. Here's what the survey said as of today:

And, for Q1 the correct result (Between 0-2% of the rows) is actually the best result (but, by no means the overwhelming majority at only 28%). However, often people just "think" the answer is very small. So... I did a few more questions/surveys. 

Survey/Question 2

Q2 was described as this: if a table has 1 million rows at 100 rows per page (10,000 pages), at what percentage (roughly) of the data would a nonclustered index no longer be used. Blogged here. Here's what the survey said as of today:

And, for Q2 the correct result (Less than .5% of the rows) is actually at a tie for the best (but, again, even a small percentage at only 22%). Again, often people just "think" the answer is very small. So... I did one more question/survey. 

Survey/Question 3

Q3 was described as this: if a table has 1 million rows at 2 rows per page (500,000 pages), at what percentage (roughly) of the data would a nonclustered index no longer be used. Blogged here. Here's what the survey said as of today:

And, for Q3 the correct result (Between 10-20% of the rows) is actually NOT the highest answer. And, this is even more convincing that there's confusion around what's going on and why.

The Tipping Point

What is the tipping point?

It's the point where the number of rows returned is "no longer selective enough". SQL Server chooses NOT to use the nonclustered index to look up the corresponding data rows and instead performs a table scan.

When does the tipping point occur?

It depends... it's MOSTLY tied to the number of pages in the table. Generally, around 30% of the number of PAGES in the table is about where the tipping point occurs. However, parallelism, some server settings (processor affinity and I/O affinity), memory and table size - all can have an impact. And, since it can vary - I typically estimate somewhere between 25% and 33% as a rough tipping point (and, you'll see from a bunch of my examples, that number is not EXACT). Then, I translate that into rows.

Math for Tipping Point Query 3: If a table has 500,000 pages then 25% = 125,000 and 33% = 166,000. So, somewhere between 125,000 and 166,000 ROWS the query will tip. Turning that into a percentage 125,000/1million = 12.5% and 166,000/1million = 16.6%. So, if a table has 500,000 pages (and 1 million rows) then queries that return less than 12.5% of the data are likely to USE the nonclustered index to lookup the data and queries over 16.6% of the data are LIKELY to use a table scan. For this table, that percentage seems "reasonable". But, most of us say that the tipping point happens at a much lower percentage... why? Because row size - which determines table size (and therefore pages) is really what has the greatest impact. So, let's look at Tipping Point Query 2... 

Math for Tipping Point Query 2: If a table has 10,000 pages then 25% = 2,500 and 33% = 3,333. So, somewhere between 2,500 and 3,333 ROWS the query will tip. Turning that into a percentage 2,500/1million = .25% and 3,333/1million = .33% (not even 1%). So, if a table has only 10,000 pages (and 1 million rows) then queries that return less than a quarter of 1% of the data are likely to USE the nonclustered index to lookup the data and queries over one third of one percent are LIKELY to use a table scan. For this table, that percentage seems really low BUT, at the same time it makes sense (to a point) that a small table would be scanned... but, for less than 1%. 1% is NOT selective enough. For small tables, it might not matter all that much (they're small, they fit in cache, etc.) but for bigger tables - it might be a big performance problem. 

Math for Tipping Point Query 1: If a table has 50,000 pages then 25% = 12,500 and 33% = 16,666. So, somewhere between 12,500 and 16,666 ROWS the query will tip. Turning that into a percentage 12,500/1million = 1.25% and 16,666/1million = 1.66% (under 2%). So, if a table has 50,000 pages (and 1 million rows) then queries that return less than 1.25% of the data are likely to USE the nonclustered index to lookup the data and queries over 1.66% are LIKELY to use a table scan. Again, this seems like a low number. Again, for small tables, it might not matter all that much (they're small, they fit in cache, etc.) but as tables get larger and larger - it CAN be a big performance problem. 

Why is the tipping point interesting?

  • It shows that narrow (non-covering) nonclustered indexes have fewer uses than often expected (just because a query has a column in the WHERE clause doesn't mean that SQL Server's going to use that index)
  • It happens at a point that's typically MUCH earlier than expected... and, in fact, sometimes this is a VERY bad thing!
  • Only nonclustered indexes that do not cover a query have a tipping point. Covering indexes don't have this same issue (which further proves why they're so important for performance tuning)
  • You might find larger tables/queries performing table scans when in fact, it might be better to use a nonclustered index. How do you know, how do you test, how do you hint and/or force... and, is that a good thing?

Real example of an interesting tipping point

Earlier today, I went on facebook and twitter and gave the following information - very vaguely - and I asked "why" is Q2 so much slower than Q1 if Q2 returns only 10 more rows. Same table and no hints (other than MAXDOP)...

Q1: SELECT * FROM table WHERE colx < 597420 OPTION (MAXDOP 1)

  • returns 197,419 rows
  • takes 116,031 ms (1 minute, 52 seconds)
  • 1,197,700 logical reads, 5 physical reads, 137,861 read-ahead reads
  • 7,562 ms CPU time

    Q2: SELECT * FROM table WHERE colx < 597430 OPTION (MAXDOP 1)

  • returns 197,429 rows
  • takes 244,094 ms (4 minutes, 4 seconds)
  • 801,685 logical reads, 1410 physical reads, 801,678 read-ahead reads
  • 9,188 ms CPU time

There were lots of great guesses... but, it's the tipping point. SQL Server chose to "tip" the second query because it was "over the line". But, it's important to realize that there are cases when that's NOT a good idea. And, what are your options?

In SQL Server 2005 - the only option is to force the nonclustered index to be used:

Q2: SELECT * FROM table WITH (INDEX (NCInd)) WHERE colx < 597430 OPTION (MAXDOP 1)

But, this can be TERRIBLY bad on some machines where the IOs could be a lot faster (and where data might already be in cache). These specific numbers are exactly that - specific to this HARDWARE (and, I chose not-so-optimal HW in this case to highlight this problem). And, depending on what number you use (what if this is a parameter in sps?) you might force SQL Server to do WAY more IOs by forcing the index than allowing the tipping point to do its job. But, depending on your hardware (and/or what you know to be in cache at the time of execution), it might be better to force an index instead of letting SQL Server choose. So, should I force the index? Be careful, if you're wrong - it could take more time and actually be slower.

In SQL Server 2008 - there's a new hint - FORCESEEK:

Q2: SELECT * FROM table WITH (INDEX (FORCESEEK)) WHERE colx < 597430 OPTION (MAXDOP 1)

FORCESEEK is better because it doesn't tie you to a particular index directly but it also doesn't let SQL Server tip to a table scan. However, just like forcing an index - you can be wrong!

So, what should you do? It depends. If you know your data well and you do some extensive testing you might consider using a hint (there are some clever things you can do programmatically in sps, I'll try and dedicate a post to this soon). However, a much better choice (if at all possible) is to consider covering (that's really my main point :). In my queries, covering is unrealistic because my queries want all columns (the evil SELECT *) but, if your queries are narrower AND they are high-priority, you are better off with a covering index (in many cases) over a hint because an index which covers a query, never tips.

That's the answer to the puzzle for now but there's definitely a lot more to dive into. The Tipping Point can be a very good thing - and it usually works well. But, if you're finding that you can force an index and get better performance you might want to do some investigating and see if it's this. Then consider how likely a hint is to help and now you know where you can focus.

Thanks for reading,
kt

Favorite SSMS Options - and some gotchas

Whenever I setup a new machine, I like to set a few default options in SSMS. Here are my favorites:

Tools, Options

  • Environment
    • Fonts and Colors
      • Text Editor font: Lucida Console (a bit thicker and it's a fixed-width font)
      • Text Editor: Selected Text (under display items)
        • Item foreground: Black
        • Item background: Yellow (looks like a highlighter)
      • Sometimes I'll also make the results windows have larger fonts - especially if it's a presentation machine
    • Keyboard
      • Keyboard scheme: SQL Server 2000 (ok, maybe I'm old-school but the QA keyboard shortcuts still seem a lot more natural than the VS keyboard shortcuts... but, if you're more of a VS person, then stick with Standard)
      • Query shortcuts:
        • Ctrl+F1: sp_helpindex2 (if you highlight an object and then hit Ctrl+F1, then it passes the highlighted object in as a parameter. The only negative is that it doesn't delimit it so you can't highlight schema.object unless it's already quoted for the sp 'schema.object')
        • Ctrl+3: SELECT object_name(object_id) AS ObjName, * FROM sys.dm_db_index_physical_stats(db_id(), NULL, NULL, NULL, NULL) ORDER BY 1, 4
        • Ctrl+4: other common queries/procedures...
  • Text Editor
    •  All Languages
      • Line numbers
    • Editor Tab and Status Bar - I usually change the colors of the "Group connections" setting to something very noticeable like Fuschia. I've already blogged why here.
  • Query Execution
    • By default, open new queries in SQLCMD mode (however, if you set this - you will disable Intellisense and Transact-SQL Debugging... and, there's no "hint" or warning that you're doing it.)
  • Query Results
    • SQL Server, Results to Grid
      • Include the query in the result set (this will show you what you executed in the results window while you wait for completion... also confirms what you think you executed. This gets annoying for large batches but is sometimes useful to see what you last executed)
      • Include column headers when copying or saving the results (VERY useful if you're pushing data over to Excel, etc.)
      • Display results in a separate tab (this gives you Tabbed mode instead of split-window mode meaning that results will go to a results Tab instead of the lower half of your window. I prefer this when I have a lot of results to review AND when I'm presenting and typically run with a lower screen res.)
        • Switch to results tab after the query executes (I prefer this so that I'm waiting for the results in the results window)
    • SQL Server, Results to Text
      • I set most of the same settings as Results to Grid (Include column headers, Include the query, Display results in separate tab and Switch to the results tab) BUT, I usually turn off Scroll results as received. If I'm looking at rows at the top of the set (while it's still processing) it often takes my cursor down to the end. I also turn this off in Profiler.

The primary reason for this post though - is to make sure that you realize that one specific option - Tools, Options, Query Execution, "By default, open new queries in SQLCMD mode" - can really surprise you. I used to think that this option was really helpful (and benign) because SQLCMD is a superset of Transact-SQL. So, I typically recommended that you turn it on. However, turning on this option disables Intellisense and the Transact-SQL Debugger. It's documented here: http://msdn.microsoft.com/en-us/library/ms174187.aspx (thanks Paul Mestemaker! :) and I thought this was a bit unintuitive at first but it really does make sense. Both the Transact-SQL Debugger and Intellisense are debugging and/or prompting you with Transact-SQL (not SQLCMD commands). So, disabling it seems right (well, I guess I could argue that it could work with Intellisense but debugging would be much harder). However, I just wish there were some better information in the Tools/Options dialog. I'd like my 2 hours back (trying to figure out which option disabled Intellisense, chatting with my friends on fb to see if they knew, filing a connect bug when I figured out that it was SQLCMD, and then learning from Paul Mest that it's actually expected behavior (duh!), changing the Connect bug to say that the dialog might be better with a warning... it was definitely one of those mornings!  :) :) :)

So, my main point for this post - save you the time I lost in trying to figure out where Intellisense went. It's expected behavior (and it does make sense........now :).

Cheers,
kt

Multi Direction Sorts and avoiding a file sort

There are two PRIMARY directions to sort data in SQL: Ascending (ASC) and Descending DESC.
When these two sort definitions are put together in a single statement a filesort is produced.

Why do we want to avoid filesorts?

Filesorts are bad. 1st they tickle a thread based buffer called sort_buffer_size. Additionally filesorts reads the data twice, unless max_length_for_sort_data limit is reached and as a result the Filesort runs slower to reduce disk I/O. If you want filesorts to run faster at the expense of the disk increase the default max_length_for_sort_data. You can read the filesort algorithm here.

So, here is an example

CREATE TABLE `ABCD` (
`A` int(10) unsigned NOT NULL default '0',
`B` int(10) unsigned NOT NULL default '0',
`C` int(10) unsigned NOT NULL default '0',
`D` int(10) unsigned NOT NULL default '0',
PRIMARY KEY (`a`,`b`,`c`,`d`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1



mysql> explain SELECT * FROM ABCD WHERE a=1 AND b=1 ORDER BY c DESC, d ASC\G

*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: ABCD
type: ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 8
ref: const,const
rows: 2
Extra: Using where; Using index; Using filesort
1 row in set (0.00 sec)



Notice the filesort? So how does one get around this filesort?

Well

Let's define some roles for columns C and D. C is the parent while D is the child.

  • We want all the latest parents (C)
  • We want all the oldest children (D)

    We require pagination of all the PARENTS (show 10 parents per page) so Queries like this is PRODUCED

    SELECT * FROM ABCD WHERE A=? AND B=? ORDER BY C DESC
    explain SELECT * FROM ABCD WHERE a=1 AND b=1 ORDER BY c DESC LIMIT 10\G
    *************************** 1. row ***************************
    id: 1
    select_type: SIMPLE
    table: ABCD
    type: ref
    possible_keys: PRIMARY
    key: PRIMARY
    key_len: 8
    ref: const,const
    rows: 2
    Extra: Using where; Using index
    1 row in set (0.00 sec)




    Now

    FOREACH($C_parent as $i => $c_id) {

    $C_parent[$i] = SELECT SQL_CALC_FOUND_ROWS * FROM ABCD WHERE A=? AND B=? AND C=$c_id ORDER BY D ASC LIMIT 1;


    }

    So, we changed 1 query into 11 queries (10 parents per page) to make the page load happen faster, by getting rid of the filesort.

    What 11 queries is faster then 1? Yes, for this case it is. The reason is because filesorts are SLOOOOW, they chew up a lot of limited resources and they should be avoided. I've see filesorts take close to 50-60% of the query time.
  • SSMS Error possibly related to database compatibility mode

    I've seen (and heard of a few other cases where people received) the following error:

    Index (zero based) must be greater than or equal to zero and less than the size of the argument list.

    And, the problem (when I got this error) was the compatibility mode of the database I was in didn't support what was being attempted. In SQL Server 2005, the reports used to generate an "Incorrect syntax near '('." error. For what I was trying to do (execute/access some of the standard reports from within SQL Server 2008 Management Studio), I received this error and immediately thought it might be the compat mode. Sure enough, it was.

    So, I don't know how often this will help you but just in case - I thought I'd post this!

    Cheers,
    kt

    Social networking, keeping up with friends/family and getting more tech info!

    OK, I know I don't blog all that often but when I do, I do try and post as much useful information as I can :). I've got a few posts in the queue and a few more tests to do and code to write before I can wrap them up. In the interim, Paul and I have both decided to throw a bit of our "spare time" to keeping up with friends and family more and just staying "more connected." In that effort, we've both joined facebook (KimberlyLTripp) and twitter (KimberlyLTripp). Our end goal is:

    • Blogging: large/complex posts with detailed info/code, etc.
    • Twitter: short, quick posts on things to check out, review, etc. Maybe I'll do a weekly summary of tweets that I do and/or followed? Would you be interested in that? And, as for my interface, I'm currently using TweetDeck and Twitter.com and I've also joined via WeFollow.com: #sqlserver, #mvp and #womenintech.
    • Facebook: more fun stuff and keeping up with friends, etc. I have to admit that I really love the interface and I'm constantly impressed at how easy it is to upload video, photos, etc... It's just NOT what I expected before I tried it out. I'd truly recommend this to anyone that wants to asynchronously keep up with a large number of people and wants to share photos, comments, video, etc. It's really well done. However, beware of many of the facebook apps. They tend to spam your friends list - sometimes even when they don't ask. Outside of better requirements on fb app developers, I don't have a lot of complaints there.

    Having said that, neither of us is doing anything else (no Plaxo, no LinkedIn, etc.) so if you want to find us - we're definitely around but we're going to stay somewhat focused. ;-)

    And, now that I'm back home again, I hope to have a few more of my longer posts done.

    Cheers everyone!
    kt

    Restoring a backup of a Circular Replication config

    So say you have two mySQL servers called A and B. A and B are in a circular replication ring meaning

    A replicates from B
    B replicates from A

    In addition, log-slave-updates is on, so if one server of either A or B goes down you can recover all the data from either side.

    Therefore, here are some steps.
    Take a snapshot of a server in the ring, with your favorite backup tool (mine is ibbackup)

    When either server A or B dies, restore from said backup. Here are the steps. Let us assume server A goes down, and backups are done from server B.

    restore the backup of server B to server A

    On server-A turn off log-slave-updates (you will see why on the next line)
    Add replicate-same-server-id on server A to my.cnf

    CHANGE MASTER TO to the position and file of the binary log that is reported from your tool on server-A.

    IF the binary log did not get corrupted on server-B your cool, else flip the binary log to the next binary log from server-A on server-B.

    Let server-A catch up
    Stop server-A
    Add log-slave-updates on server-A
    remove replicate-same-server-id on server A
    Start server-A

    done.

    Companion content for Chapter 6 (Index Internals) of SQL Server 2008 Internals

    Recently, the SQL Server 2008 Internals title was released (and only in the past few days have people actually received their copies! In fact, I still haven't seen the book in person... soon though!)

    Anyway, Kalen, Paul, Conor, Adam and I worked to create a comprehensive resource on SQL Server 2008 internals and to supplement the written content, many of us created demo scripts and examples. I've now gone back and created a sample script based on ALL of the code in the entire chapter (and in many cases I extended the code in the samples). And, while this updated content is going to be located on the companion content site, I thought I’d also release it here with some notes.

    The zip contains a solution with 3 projects, each with a few scripts:

    I know the names seem a bit strange but everything is ordered EXACTLY as it is shown in the book. And, in the book, I referenced "a" script called EmployeeCaseStudy-AnalyzeStructures.sql but that script was so big that I broke it down into 7 parts (hence the naming convention of 06...01, 06...02, etc.). Regardless, each script contains a brief header and a few details about the script. To get the most from the script, do not just open the script and execute it. If you really work your way through the script, you should see all of the comments and they will help you to make instance specific changes so that everything runs without error. Just take your time and really try to step back and think about each command (and what you're expecting the output to be) to test yourself while your working through the results. Taking your time and really grapsing all of these internals is what makes it fun!

    Quite a bit of this content can stand alone but it's really best as companion content to the title as there's a lot more "text" and detailed information in the book. But, the scripts are really a great way to dive deeper, learn documented/undocumented commands and really get to know what the heck is really going on internally.

    Finally, I only worked on Chapter 6 so here it is: IndexInternals-Chapter6-Resources.zip (12.23 mb). As for the other companion content, you'll need to get the links from the book. Ah, or from Kalen :). 

    And, certainly, if you find a typo or anything that you think needs more clarification, let me know! I'm more than happy to post updates to this content. 

    Have fun,
    kt