Thursday, July 22, 2010

Understanding VARCHAR(MAX) in SQL Server 2005

In SQL Server 2000 and SQL Server 7, a row cannot exceed 8000 bytes in size. This means that a VARBINARY column can only store 8000 bytes (assuming it is the only column in a table), a VARCHAR column can store up to 8000 characters and an NVARCHAR column can store up to 4000 characters (2 bytes per unicode character). This limitation stems from the 8 KB internal page size SQL Server uses to save data to disk.

To store more data in a single column, you needed to use the TEXT, NTEXT, or IMAGE data types (BLOBs) which are stored in a collection of 8 KB data pages that are separate from the data pages that store the other data in the same table. These data pages are arranged in a B-tree structure. BLOBs are hard to work with and manipulate. They cannot be used as variables in a procedure or a function and they cannot be used inside string functions such as REPLACE, CHARINDEX or SUBSTRING. In most cases, you have to use READTEXT, WRITETEXT, and UPDATETEXT commands to manipulate BLOBs.
To solve this problem, Microsoft introduced the VARCHAR(MAX),  NVARCHAR(MAX), and VARBINARY(MAX) data types in SQL Server 2005. These data types can hold the same amount of data BLOBs can hold (2 GB) and they are stored in the same type of data pages used for other data types. When data in a MAX data type exceeds 8 KB, an over-flow page is used. SQL Server 2005 automatically assigns an over-flow indicator to the page and knows how to manipulate data rows the same way it manipulates other data types. You can declare variables of MAX data types inside a stored procedure or function and even pass them as variables. You can also use them inside string functions.

Microsoft recommend using MAX data types instead of BLOBs in SQL Server 2005. In fact, BLOBs are being deprecated in future releases of SQL Server.

No comments:

Post a Comment