Monthly Archives: April 2014

Improving async performance for SqlDataReader

In addition to the Idle Connection Resiliency feature that was added in .NET 4.5.1, the other major “feature” was a massive boost in async performance for SqlDataReader. From our own internal testing we have seen from a 50% performance improvement (for default command behavior) to 180% (for sequential access). This brings async performance to within 20-30% for the speed of sync methods, except that our performance tests are using the async methods wrong – and using it the right way can make async faster than sync.

Making the right calls

In our internal performance testing when we test our async methods, we use async for everything – opening the connection, executing the command, reading rows and get the value for each column. In the official post I wrote for .NET 4.5 I had said that this was not recommended since calling ReadAsync in default command behavior buffers the entire row into memory, so subsequent async calls to read column values is a waste of CPU. Just by switching the calls from GetFieldValueAsync to GetFieldValue I was able to get our performance test running faster with async calls than sync.

This greatly simplifies the rules that I had previously written – always call NextResultAsync, ReadAsync and either call GetFieldValue for default command behavior or GetFieldValueAsync for sequential access

Checking ahead

If you have a look at the code for ReadAsync you can see that there is a lot of “infrastructure” that needs to be set up in order to do an async call, and yet it will never be used if all of the data to read that row is currently available. So, in .NET 4.5.1, we introduced an internal method called WillHaveEnoughData that checks if we are guaranteed to have enough data in the current buffer to satisfy the next request (be it reading a header, column or entire row). In order to do this, we have to make a few assumptions:

Full speed ahead

If you read through the code of WillHaveEnoughData you can see the full set of assumptions, checks and optimizations that have been made. To summarize the code, the way to get the best performance out of the improvements made to SqlDataReader is to ensure the following:

  • Use SQL Server 2008 R2 or later – this introduced a feature called Null Bitmap Compression (NBC) which allows the server to specify which columns are null for a row in the row header instead of setting the column to null in the column’s header.
  • Avoid using NTEXT, TEXT, IMAGE, TVP, UDT, XML, [N]VARCHAR(MAX) and VARBINARY(MAX) – the maximum data size for these types is so large that it is very unusual (or even impossible) that they would happen to be able to fit within a single packet.
  • Keep the maximum size of variable column as small as possible – as I mention above, we assume that variable sized columns will be the maximum size permitted by that column (so a VARCHAR(50) is assumed to always be 50 characters). This can make a huge difference in your applications performance – reducing a column from a VARCHAR(8000) to a VARCHAR(250) or VARCHAR(50) can be the difference between always creating and disposing the async infrastructure and rarely creating it.
  • If you are moving large amounts of data, consider increasing the Packet Size specified in the connection string (NOTE: This will increase the amount of memory required for each connection).
  • For sequential access, make sure to read each column in its entirety before moving to the next column (or next row), especially when reading large columns (otherwise the check for the amount of data needed will include the amount of data left for the current column and, for [N]VARCHAR\VARBINARY(MAX) types, if there is any leftover data at all then we will assume that it will not fit in a single packet)

Hopefully this gives you a good idea of the performance improvements made in .NET 4.5.1 and how to get the most out of them.

Tagged , , , , , ,