A SQL-Based Universal Currency Formatter

Comments 0

Share to social media

When an application displays results for its users, the results should be easy on the eyes to make sure that they are interpreted properly. Programming languages have therefore developed extensive libraries that are designed specifically for the job of formatting results to make them less ambiguous for the users and clearer to read. Until SQL Server 2012 introduced the FORMAT() function that makes use of .NET’s full language and cultural formatting library, TSQL’s facilities to do this existed but were incomplete because conventional wisdom has it that the task of formatting values as strings should be handled in the application’s presentation tier. Usually, yes, but not inevitably. Occasionally the SQL developer has the task of rendering a value, currency for example, as a string according to the language and culture of the user of the system.

For anyone who has to write for previous versions of SQL Server, the task of formatting currencies for display can be nightmarishly complex. This is because of the wide variety of formats around the world. So, as is the tradition here at Simple Talk, we’ll take the complex and make it simple.

Currency Formatting at its Simplest

Let’s take a look at a simple example of a first attempt at formatting a numeric value for display as a currency.

We have inserted the 3 commas needed to format our big money number in a currency display format. If this were US dollars, all we need to do is concatenate the currency symbol ($) in front of the string.

What can we learn from this? First, it takes three nested STUFFs to insert all the commas we need. Each offset amount (8, 5 and 2) is applied from right to left (inner STUFF to outer STUFF) to ensure changing string lengths don’t affect the offset. Also, each offset is different by exactly 3, which also happens to be the number of digits grouped by the commas.

Unfortunately for us, our simple example doesn’t work too well when we change our number to 10 billion. In that case, the last group will contain four zeroes. But this can be simply fixed as follows:

Now we’re using the length of our amount, less an offset, to calculate exact placement of the comma. Once again, the offsets themselves are different by exactly 3.

Yet again though, we are stymied because the moment we get to one trillion (or worse yet to a number less than one billion) our simple example won’t work. The solution, of course, is to apply as many STUFFs as there must be commas in our formatted number.

At this point in time, most people would drop back and punt, thinking that the only way to repeatedly apply STUFF to a string is in a loop. But maybe there’s another way. Let’s try this:

Notice how we have forced the ordering from our VALUES clause to control processing STUFF from right to left.

Since the largest number of digits SQL can handle in a number is DECIMAL(38,2), we know this means that exactly 36 digits can appear to the left of the decimal and hence up to 12 offsets are required. We just need to make sure we don’t calculate a negative offset, and handle negative numbers, so the above code becomes:

Now we have a pretty good way to insert our commas, the next thing we need to consider is formatting variances across countries.

A Countries Table Containing Currency Formats

Actually we’ll be creating two tables to store our formatting information and countries with their currency. The DDL for these tables is:

You can run this DDL from the setup script in the resources section of this article. The data has been pulled together from many sources and, while not comprehensive is a pretty good representation of most major countries and currencies. The author takes no responsibility for the accuracy of this data so you should verify the data that is there for any currencies of interest to you. Some of the resources utilized to put this data together include:

Now we’ll present a modification of our prior formatting script that takes advantage of the new Countries and CurrencyFormats tables we just created.

The important modifications that we have made are described below:

  • The type of our @BigMoney has been changed to have 3 decimal digits. If you check the data put into the Countries table, you will find that no country uses more than 3 decimal digits.
  • We created the additional local variables: @LMoney and @Count. @LMoney is the initial length of our currency string and is used to calculate a record count (used in the CROSS APPLY). @Count (which we’ll discard later) is useful in testing this script to see the number of STUFFs applied to the currency string.
  • The outer SELECT is from our Countries tables and we limit that to the country of interest in the WHERE clause (e.g., US).
  • The STUFF now inserts the SeparatorCharacter column of the Countries table, which in at least one case is more than one character in length, instead of a comma.
  • We perform two cascading CROSS APPLYs to calculate the number of leading digits from our CurrencyFormat, and use that to calculate the record count we’ll need to retrieve from our CurrencyFormats table to process the full length of the string. The challenge was to get this just right and must be precise otherwise you’ll find separator characters inserted in awkward places in some currency amounts.
  • It was necessary to discard the table row constructor (VALUES) syntax in favor of retrieving the offsets from an existing table to ensure row ordering, otherwise inconsistent results appeared.
  • The final CROSS APPLY actually selects the offsets we’ll apply in the STUFF to get to our final, formatted currency string. It is important to note the CLUSTERED INDEX on this table, which is designed to return our offsets from smallest to largest, ensuring that our STUFFs are applied right to left in our currency string.

We recommend you familiarize yourself with this script, by testing various values for @BigMoney for these countries: US, JP, SZ and IN, all of which have unique aspects to their currency formats. This way you can assure yourself that the algorithm is sound.

IN (India) is particularly intriguing because no other country formats their currency in quite the same manner:

##,##,##,###.##

Most countries have a format similar to the US: ###,###,###.## with some variation in the decorator characters (the symbol grouping thousands and the decimal point). SZ (Swaziland) is similar to the US except that it inserts a space after the comma. JP (Japan) does not use any grouping symbol (that we know of) and the standard decimal point, so you’ll find that its CurrencyFormat specification in the Countries table is NULL.

The final result for each case will retain the 3 decimal digits, but not to worry as we’ll be taking care of that later.

A SQL FUNCTION for Universal Currency Formatting

Now that we’ve identified a basic algorithm for our formatting, we shall construct a Table Valued Function (TVF) that utilizes this along with some final formatting of the result to include things like the currency symbol and the appropriate decimal point symbol.

Note that we have changed the type of our OutputResult to NVARCHAR because our currency symbols are mostly UNICODE characters (up to 4). We also included the OPTION(MAXDOP 1) to avoid SQL parallelizing the query, which although unlikely could have a negative impact on the results, as the rows must be processed in a serial fashion.

Finally, here’s a small test harness that you can use to proof the results.

Selected results returned by the test harness with a description of what you’ll see:

A timing test of this TVF found that about 100,000 numbers can be formatted in about 12.5 seconds on a Core i5 (2.5 GHz) laptop.

We explored several other options for this approach and found this one to be a reasonable compromise between speed and flexibility.

We sincerely hope that this example and test data is a useful addition to your SQL tool chest.

Load comments

About the author

Dwain Camps

See Profile

Dwain Camps has been a project manager for many years. Because performance of applications can be a critical success factor for projects, he has been evangelizing on the need to develop highly performing SQL. By mentoring and authoring articles on SQL, he hopes to train a future generation of software engineers on the right and wrong ways to deliver SQL code. He also has a special interest in developing solutions to complex, data intensive problems using high performance SQL because the declarative nature of SQL allows development of algorithmically unique solutions that procedural languages may not be capable of.

Dwain Camps's contributions