first_page

Migrating out of DTS Technology

Buy this book at Amazon.com!DTS has been replaced by SQL Server Integration Services. It’s like replacing a respectable automotive body shop with a 60,000 foot vehicle assembly facility controlled by several hundred shiny robots. To me, the reasoning behind this hugeness is that less is more with SQL Server Express. For the people on the book cover shown at right, it’s like, “Dude, read our book!”

I’ve been rewriting (again) all of my ‘data activity’ code for my little enterprise and this time all of the ‘activity’ will take place inside .NET assemblies. No more DTS. Not using DTS reduces my dependency on a sizeable chunk of Microsoft, platform-specific technology and increases my chances of porting my code to Mono “in case of emergency.” This new-found freedom has boiled down to one problem: loading big-ass text files—namely, loading Web site logs into SQL Server with .NET.

Any SQL Server 7 guru from 1999 finds this problem laughable; however, I had to take a look at the following to build up my confidence:

  • I peeked at the source code from “A C# Grep Application” by George Anescu for ideas about opening big-ass text files directly and looping through them. I quickly found out that this method is way, way too slow. Hey I can’t know everything, folks! At least not all at once…
  • So, for a while, I thought this was a problem just waiting for ODBC as the solution. The plan was to find an excuse to use the new SQLBulkCopy type in ADO.NET 2.0. Ha! So I took a look at “ADO Connection String Samples” and “ADO Connection Strings”—which eventually led to “Text File Driver Programming Considerations,” “Schema.ini File (Text File Driver)”—and “PRB: The ODBC Text File Driver Only Supports INSERT Statement” for kicks. Then I suddenly realized that this ODBC stuff drags me back into a Microsoft, platform-specific technology. Being moved by this realization was easy because this solution was not really working correctly: the major problem was that the “Text File Driver” does not support Unix line breaks (as far as I know) so determining, say, the ‘row terminator character’ for Unix (CHAR(10)) would be difficult.
  • Then I remembered hanging out with SQL Server gurus back in 1994 and soon after I read this: “How To Use SQL Server to Analyze Web Logs.” The rest is big-ass-file-loading history…

rasx()