Alexander The Great

March 6, 2008

An Idiot Misunderstands SQL

Filed under: Modern Life,Programming,Science,Software — alexanderthegreatest @ 3:44 pm
Tags: , , , , ,

Since my Digital Point – Misunderstood post has proven such a hit, it occurs to me to share a hysterical post I found on SQL.NET. This idiot is writing “Death to TSQL” because of horrible SQL coding skills and because the man doesn’t seem to understand that SQL is not ASP. (Yes, you read that correctly – SQL and ASP Ancient.)

The article concludes that TSQL should never be used for stored procedures, that procs should never (“.EVER.”) use temp tables, and that the CLR (Common Language Runtime – aka .NET) should be used for 1980s style programming. Quoth the raven “In this case, it’s so overwhelmingly better than TSQL that I cannot recommend to anyone coding for SQL 2005 to use TSQL for stored procedures, even CRUD procs.

CRUD = Create Read Update Delete. This went out of style by 1990. A client application should not be allowed to perform primitive operations like these for any number of reasons. Performance – we want chunky, not chatty interprocess protocols. Simplicity and maintainability – the client application should not have to understand the full implementation in the database. This should be hidden from and for them. (This is half the reason we use sprocs in the first place!!) Data quality and integrity – if the network connection is severed midway through a run, while half the creates have run and some of the updates, the data is left in an indeterminate state.

To stack the deck, Payton Byrd makes some shocking and unforgivable errors

  1. T-SQL code designed to be as resource intensive and slow as possible.  Not the cursor use.  The test code comes with this warning “This may not be the most efficient way to do this, but I don’t specialize in TSQL stored procs” which should disqualify the results for any thinking person.
  2. One reason the CLR code is assumed to perform faster than the TSQL code is because the .NET procedure uses a StringBuilder.  In a related article, our anti hero tells us if the test data set were larger, the difference would be more pronounced, because of this.  A StringBuilder is a hack, a work around for .NET because the CLR handles strings in a funny way.  When your code creates and then alters a string, both copies live on in memory, because (due to threading) strings are immutable.  Always creating and repointing references is so tiresome for the host computer that a class was written to give programmers an entirely different way of concatenating strings.  SQL does not suffer this limitation.  One of the key assumptions holding up the conclusion (aka guess) giving this article reason to be is fundamentally flawed.
  3. Wild conclusions “The performance results are absolutely stunning, the CLR stored procedure is 14 times faster than the TSQL stored procedure
  4. More profound confusion “I’m willing to do more testing, but I think this pretty well proves the point that you can get much better performance from the CLR than from TSQL when you step beyond the simplest of stored procedures.”   Apparently the meaning of complex was lost on this writer.  Anybody who’s ever done any database work knows that complex != writing loop code for a set based language to put commas between values.

I’m not writing this to make fun of an incompetent developer.  I’m writing to point out flaws of logic that seem all too common these days.  The internet seems to have created a world where everyone is expected (even if only by themselves) to be an expert in some area.  When a supposed expert doesn’t understand something, the rule this day in age seems to be that one should make assumptions and run with them.  As Tripy would say, do you programming, then, when there’s no time left, do your analysis.

An even better way to cause yourself grief is to test an edge case and then assume the general world is described by some crazy scheme.  If I wrote that there’s no reason ever to drive anywhere because I took a train yesterday, arrived at the station a moment before the vessel departed, then found I saved 5 minutes by not parking, people would rightly assume I had lost my wits.

The Texas Sharp Shooter Fallacy describes exactly this.  Payton’s test case addresses a tremendously rare (and poorly coded) situation, one that might never be encountered by a senior database programmer.  From this highly specific test he makes sweeping general conclusions.  A test was engineered to provide support for a conclusion that had already been drawn.  The test tells us nothing, except that we should avoid Payton’s blog if we aim to learn.

Advertisements

9 Comments »

  1. Wow…
    I mean… Just… Wow!

    I have my own resevations against StoredProcs, but that was incredible.
    There’s a time and a place to use every tool. That kind of gibbrish doesn’t belong anywhere.

    Here’s the real kicker :
    “Nice work setting all this up! You really took care to make the tests equal. Admirable!”
    – An excerpt from the first comment.

    Goes to show what kind of readers visit that blog.

    Comment by eksith — March 6, 2008 @ 4:54 pm | Reply

  2. So, have you done your own benchmarks on the performance difference between T-SQL and CLR Stored Procedures? Not only have I done the testing, but I’ve implemented the solutions.

    Also, you apparently know nothing about programming in general to claim that a StringBuilder is a hack. That’s just plain uninformed or simply ignoring the facts. Strings are meant to be immutable and string concatenation is one of the single most expensive operations that can be performed…. period.

    Also, please explain to your readers why the analogy of ASP isn’t valid? Do you actually understand how non-set-wise code in a stored procedure works?

    Comment by Payton Byrd — April 8, 2008 @ 1:29 am | Reply

  3. Strings are immutable in .NET! The SQL engine isn’t managed code. Things you learn from VB don’t apply across the board to all programming endeavours. I’m sorry, Payton, but to assume otherwise is the height of irresponsibility.

    You ran performance testing against (intentionally?) flawed scenarios. What you’ve done is akin to racing a horse against a paper air plane, and then assuming a letter will arrive from New York to LA faster if it’s sent by Pony Express than by air mail. Fascinating, but irrelevant, and that’s being polite.

    But thanks for the comment. Funny how you’ve disabled them on your own site once you began to receive feedback from the real world.

    Comment by alexanderthegreatest — April 8, 2008 @ 10:50 am | Reply

  4. 3) Strings ARE meant to be immutable. There are very, very, very few languages that don’t treat them as immutable objects, and all of the modern language, especially functional languages and languages that are designed from the ground up for multi-threaded programming use immutable strings. It’s you who is showing absolutely no knowledge of modern programming.

    I disabled the comments on the post because they discussion had devolved into personal attacks and other such nonsense because there are a lot of DBA’s who are very threatened by the existence of CLR stored procs (the same argument happens with Java in Oracle) and decide that instead of dealing with the facts that it’s easier to try to destroy the person who’s proving their antiquated scripting language sucks.

    Comment by Payton Byrd — May 8, 2008 @ 11:02 am | Reply

  5. PS – Your title for this article just reinforces my argument about the conversation devolving into personal attacks because you don’t like the facts which include you must use a cursor to concatenate strings from multiple rows in T-SQL and thus the performance of the proc plain sucks. Even without immutable strings you are still dealing with a SCRIPTED logic language.

    Comment by Payton Byrd — May 8, 2008 @ 11:04 am | Reply

  6. This post (and Payton’s claims) really interested me since I write many a complicated stored proc. So I decided to try it out.

    It’s true that using a cursor there isn’t optimal (or even done well for a cursor), but after implementing the query in the most efficient way I could figure, here’s what I got…

    Here’s the TSQL for the proc

    SET NOCOUNT ON

    DECLARE @RunningSales MONEY;

    DECLARE @Total MONEY;
    DECLARE @OrderID INT;
    DECLARE @ProductID INT;

    SELECT @RunningSales = SUM(ExtendedPrice+Freight) FROM Invoices
    WHERE CustomerID = @CustomerID AND OrderDate <= @BeginDate

    SET @RunningSales = COALESCE(@RunningSales,0.0)

    DECLARE @orders TABLE (OrderId INT, ProductId INT, Total decimal(13,2),RunningTotal decimal(13,2))

    INSERT INTO @orders(OrderId, ProductId, Total, RunningTotal)
    SELECT OrderId, ProductId , ExtendedPrice + Freight, 0.0
    FROM Invoices
    WHERE CustomerID = @CustomerID AND OrderDate BETWEEN @BeginDate AND @EndDate

    UPDATE @orders
    SET
    @runningSales = @runningSales + Total,
    RunningTotal = @runningSales

    SELECT * FROM @orders

    I made similar enhancements to the CLR version…I pre-calculate the runningSales up to the beginDate and it now also selects only the columns that the proc above does..other than that, it’s the same.

    The results were rather surprising…..the CLR version performs identically to the TSQL above. With more data though, the TSQL version starts to edge out the CLR version (but not by a whole lot).

    I think this points to a really efficient CLR, and will certainly make me less hesitant to put, say, a Regex parser in TSQL using a CLR function. But only write procs in CLR from now on? No way.

    Comment by James — May 8, 2008 @ 1:38 pm | Reply

  7. PS, I should note that it isn’t surprising that the CLR version gets slower with more rows since it is looping.

    Comment by James — May 8, 2008 @ 1:41 pm | Reply

  8. @James

    Would you mind posting your C# code? I don’t think a loop is necessary.

    Comment by Payton Byrd — May 8, 2008 @ 4:45 pm | Reply

  9. Here’s a link to my modified RunningSalesCS.cs I renamed the proc to RunningSalesCS1. No other files were changed, and this was taken from the link in the last post in the comments on your blog.

    http://code.brechtel.us/stuff/RunningSalesCS.cs

    Comment by James — May 9, 2008 @ 3:32 am | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.

%d bloggers like this: