How pure is your data?

It was a normal Monday. I was going over some HA designs and planning the joys of setting up log shipping on a SQL 2000 instance when I got one of “those” emails.

“Hey Database Person, I’m trying to run this query but I keep getting this message:

Msg 9100, Level 23, State 2, Line 1
Possible index corruption detected. Run DBCC CHECKDB.

Can you fix it? I need to run these reports for our production line ASAP!”

Crap. I was just reading an article about all different kinds of corruption last week but didn’t run into this.

My first thought was to do what SQL told me to do, which was Run DBCC CheckDB. Luckily, it wasn’t that large of a database.  After it finished, it returned fine and dandy with no errors. Cool, let’s try that query again and hope everything is fine.

Msg 9100, Level 23, State 2, Line 1
Possible index corruption detected. Run DBCC CHECKDB.

Crap part 2. Possible index corruption? Maybe; SQL says it’s possible. I needed to rebuild the indexes on this table. Since the table contained over 11 million records, I double-checked with the person in charge of the database to make sure nothing would be damaged. I also checked out what was happening within the server. The server was relatively calm and had very low CPU usage. Nice, let’s go.

Couple minutes later, indexes were rebuilt. Awesome, let’s run that query again and see our results.

Msg 9100, Level 23, State 2, Line 1
Possible index corruption detected. Run DBCC CHECKDB.

You son of a…. Well, there weren’t any errors and the indexes were fine. What else could it be? Good ‘ol Google led me to a couple different pages where I finally found CHECKDB with DATA_PURITY.

I ran DBCC CHECKTABLE with DATA_PURITY because I knew exactly which table had the problems. Oh, did the error messages fly then. So many in fact that SQL said there were too many to list (sounds like someone is being lazy if you ask me). I tried to manually find a record or two that had contained bad data.  The column was a float so I figured it might have had some crazy number shoved in there somewhere.

Then I spoke with the person who sent me the error and she said that the DB was upgraded to 2008 from 2000 in the last year. Before, they used “NAN” – not a number, in that column. Bingo. That was it. Now I just had to find the bad records.

I took a backup of the DB and restored it to a dev playground. Next I did a nice SELECT INTO a new table but instead of keeping the column as a float, I converted it a varchar. I checked out a couple of the bad records I found before and lo and behold, there were about 3,200 records that had “#”.

Finally, I contacted the DB owner and told them about the bad data. Luckily, since the records were old they told me to just NULL them out to fix the data. A quick UPDATE statement later and it was back to check the original query.

Booyah! Query complete.

In today’s lesson, I learned about checking the purity of your data.  Also, it’s something to check for when upgrading databases. As BOL says: “For databases that were created in earlier versions of SQL Server, such as SQL Server 2000, SQL Server 7.0, and versions upgraded to SQL Server 2005, these checks are not enabled by default.”