Encrypting Large Values

Federal identification numbers and credit card numbers are regulars when it comes to encrypting data. The US plain text version of the prior is eleven characters. The latter is typically sixteen characters. When these are encrypted using a symmetric key the size of the cipher text will range from thirty-nine to sixty eight bytes depending upon the selected algorithm. These lengths hardly raise the question of whether the cipher text will result in a breach of the size limit for a row.

In a nutshell, the row size limit in SQL Server can be explained as follows:

  • The limitation of the row size is a page.
  • A page comprises of 8 kilobytes (8,060 bytes).
  • 8 kilobytes is approximately 8,000 characters.

There may be times when the data that is requiring encryption is very large. Book manuscripts, official documents, schematics, blue prints, images, and archived e-mail messages are just a few examples of data that are stored in a database and can easily be larger than the maximum row size.

Varbinary The Maximus
Encrypted data is stored in the database as a varbinary data type. When defining a column of this data type you have the option to define the size explicitly from 1 to 8,000. You can also define the size as “max” which is 2,147,483,647 bytes (2^31-1) which is approximately 266,437 pages.

The behavior of data that exceeds the row size limit when the “max” size option is utilized depends upon how the sp_tableoption system stored procedure is set for the table in question. If the “large value types out of row” option is set to the value of “1”, the varbinary value will automatically be stored on a separate page whether the initial row has reached the row limit or not. If the option is set to “0” the data will be stored on a separate page only if it exceeds the row size limit. The subsequent pages that are used are linked to the initial page with a 16 byte pointer.

Using War and Peace
Leo Tolstoy’s famed book “War and Peace” often is used as an example of a very long book. The English translation of this book contains approximately 3.1 million characters. This characteristic made it’s text an excellent candidate for illustration of encrypting large text.

Consider the first chapter of this book which, in plain text, is 6,566 bytes in length. Using a symmetric key with the AES_256 algorithm, the results is a cipher text that is 6,612 bytes in length. This certainly fits within the row size limitations; but does not allow for much additional data to be stored within that same row.

Based upon the definition and behavior of the varbinary(max) data type, the entire text of “War and Peace” should fit within approximately 388 pages. This is certainly within the boundaries of the “max” size limitations. Although, when attempting to encrypt the entire manuscript of this legendary book, using the very same symmetric key, we receive the following error message:

Msg 8152, Level 16, State 10, Line 13
String or binary data would be truncated.

The devil in the details is that the output of the ENCRYPTBYKEY method is a maximum size of 8,000 bytes. Therefore despite our column being set to accept cipher text larger than the row limits the encryption process in SQL Server does not allow us to create cipher text of that size. This is true for the other encryption methods including the HashBytes method.

What, Me Worry?
One possible solution to this challenge would be to break up the large text into segments of less than 8,000 bytes prior to encryption. Book manuscripts have logical break points. Each chapter can be encrypted and stored separately. Implementing a normalized table structure will provide an flexible and organized method of storage.

Another approach might be to seek encryption options externally from the database. Encrypting the large text before it is introduced to the database could yield a broader list of options.

A final consideration is to challenge why the large text requires encryption. The intent to obfuscate data does not necessarily call for encryption. As an alternative, consider converting the plain text directly into a varbinary data type by utilizing the CONVERT method. For example:

INSERT INTO MyTable ([MyVarbinaryColumn]) VALUES

This statement will return a varbinary version of the plain text that is equal to the full length of the string. It’s resulting value is not discernable to the naked eye. To return this value to plain text requires only the conversion of the varbinary(max) data to a varchar(max) data type using the same CONVERT method. This approach does not offer the security level that encryption offers; but it may suffice for the intended obfuscation while overcoming the encryption method limitations.