The Helper Table Workbench

Sometimes, when writing TSQL code in functions or procedures, it is tempting to do iterations, or even worse, a cursor, when it isn’t really necessary. Cursors and iterations are both renowned for slowing down Transact SQL code. SQL Server just isn’t designed for it.

However, there is usually a way to do such operations in a set-based way. If you do so, then your routines will run a lot faster, with speed at least doubling. There are a lot of tricks to turning a problem that seems to require an iterative approach into a set-based operation, and we wish we could claim we’d invented one of them. Probably the most useful technique involves that apparently useless entity, the ‘helper’ table. This workshop will concentrate on this, because it is probably the most widely used.

The most common Helper table you’ll see is a table with nothing but the numbers in a sequence from 1 upwards. These tables have a surprising number of uses. Once you’ve understood the principles behind helper tables, then you’ll think of many more. We’ll be providing several examples where a helper table suddenly makes life easier. The objective is to show the principles so that you’ll try out something similar the next time you have to tackle a tricky operation in TSQL. As always, you’re encouraged to load the example script into Query Analyser or Management Studio, and experiment!

Our examples include:

Splitting Strings into table-rows, based on a specified delimiter
Encoding and decoding a string
Substituting values into a string
Extracting individual words from a string into a table
Extracting all the numbers in a string into a table
Removing all text between delimiters
Scrabble score
Moving averages
Getting the ‘Week beginning’ date in a table
Calculating the number of working days between dates.

Note. These examples use VARCHAR(8000) just so they compile on both SQL Server 2000 and 2005. If you are using SQL Server 2005, you’ll probably want to change them to VARCHAR(MAX).

Before we start, we’ll need a helper table of numbers. Our examples aren’t going to require high numbers, but we’ve parameterised the size of the table that the routine creates

Creating the helper table

Here is a routine that checks to see if such a ‘helper’ table called numbers exists, and, if not, creates it.

CREATE PROCEDURE spMaybeBuildNumberTable

@size INT=10000

BEGIN

SET NOCOUNT ON

IF NOT EXISTS (SELECT * FROM dbo.sysobjects

WHERE id = OBJECT_ID(N'[dbo].[Numbers]')

AND OBJECTPROPERTY(id, N'IsUserTable') = 1)

BEGIN

CREATE TABLE [dbo].[Numbers](

[number] [int],

CONSTRAINT [Index_Numbers] PRIMARY KEY CLUSTERED

(

[number] ASC

) ON [PRIMARY]

DECLARE @ii INT

SELECT @ii=1

WHILE (@ii<=@size)

BEGIN

INSERT INTO NUMBERS(NUMBER) SELECT @II

SELECT @II=@II+1

END

Once you have one of these tables, which we’ve seen described as the Transact SQL developer’s ‘Swiss Army Knife’, you will not want to be without it.

Splitting Strings into table-rows, based on a specified delimiter

Imagine you have a string which you want to break into words to make into a table.

DECLARE @ordinal VARCHAR(255)

SELECT @Ordinal=

'first second third fourth fifth sixth seventh eighth ninth tenth'

This can be done very simply and quickly through the following single SQL Select statement:

(Make sure you have executed the spMaybeBuildNumberTable procedure first!)

SELECT SUBSTRING(@Ordinal+' ', number,

CHARINDEX(' ', @Ordinal+' ', number) - number)

FROM Numbers

WHERE number <= LEN(@Ordinal)

AND SUBSTRING(' ' + @Ordinal,

number, 1) = ' '

ORDER BY number RETURN

/*----with the result

first

second

third

fourth

fifth

sixth

seventh

eighth

nineth

tenth

(10 row(s) affected)

You can then enshrine this principle into a table function that will take any delimiter to split a string. (We believe that the credit for this clever routine should go to Anith Sen).

CREATE FUNCTION [dbo].[uftSplitString]

(

@String VARCHAR(8000),

@Delimiter VARCHAR(255)

)

RETURNS

@Results TABLE

(

SeqNo INT IDENTITY(1, 1),

Item VARCHAR(8000)

)

BEGIN

INSERT INTO @Results (Item)

SELECT SUBSTRING(@String+@Delimiter, number,

CHARINDEX(@Delimiter, @String+@Delimiter, number) - number)

FROM Numbers

WHERE number <= LEN(REPLACE(@String,' ','|'))

AND SUBSTRING(@Delimiter + @String,

number,

LEN(REPLACE(@delimiter,' ','|'))) = @Delimiter

ORDER BY number RETURN

END

This is the fastest means I have come across in TSQL to split a string into its components as rows. Try this, which give you a table with the integer and the nominal name.

SELECT * FROM dbo.uftSplitString(

'one,two,three,four,five,six,seven,eight,nine,ten',',')

--or this, which

SELECT * FROM dbo.uftSplitString(

'Monday--Tuesday--Wednesday--thursday--friday--saturday--sunday','--')

Encoding and decoding a string

You can use the same principle for a lot of string operations.

Here is one that will URLencode a string so it can be used in a POST or GET HTTP operation. The string is converted into a table with one character per row and, after being operated on, it is re-assembled.

CREATE FUNCTION ufsURLEncoded

(

@String VARCHAR(MAX)

)

RETURNS VARCHAR(MAX)

BEGIN

DECLARE @URLEncodedString VARCHAR(MAX)

SELECT @URLEncodedString=''

SELECT @URLEncodedString=@URLEncodedString+

CASE WHEN theChar LIKE '[A-Za-z0-9()''*-._!]'

THEN theChar

ELSE '%'

+ SUBSTRING ('0123456789ABCDEF',

(ASCII(theChar) / 16)+1,1)

+ SUBSTRING ('0123456789ABCDEF',

(ASCII(theChar) % 16)+1,1)

END

FROM

(

SELECT [theChar]=SUBSTRING(@string,number,1)

FROM numbers

WHERE number <= LEN(@String) ) Characterarray

-- Return the result of the function

RETURN @URLEncodedString

END

This sort of routine is a lot less complex than the iterative methods and is, as one would expect, a lot faster. Just to show that this is no fluke, here is the reverse function to decode a URL Query string.

CREATE FUNCTION ufsURLDecoded

(

@String VARCHAR(MAX)

)

RETURNS VARCHAR(MAX)

BEGIN

SELECT @string=

REPLACE(@string,escapeString, TheCharacter)

FROM

(SELECT

[escapeString]=SUBSTRING(@string,number,3),

[theCharacter]=CHAR(

(CHARINDEX(

SUBSTRING(@string,number+1,

1),

'0123456789ABCDEF')-1)*16

+CHARINDEX(

SUBSTRING(@string,number+2,

1),

'0123456789ABCDEF')-1)

FROM numbers

WHERE number <= LEN(@String)

AND SUBSTRING(@string,number,1) = '%'

WHERE CHARINDEX(escapeString,@string)>0

RETURN @String

END

Here is a simple function that acts as an int to hex converter.

CREATE FUNCTION ufsIntToHex

(

@integer INT

)

RETURNS VARCHAR(20)

BEGIN

DECLARE @result VARCHAR(80)

SELECT @result=''

IF @integer>0

SELECT @result=CASE WHEN @integer>0 THEN

SUBSTRING('0123456789ABCDEF',(@integer%16)+1,1)

ELSE '' END+@result,

@integer=@integer/16

FROM numbers WHERE number<20

ELSE SELECT @result=0

RETURN '0x'+@result

END

It works this way:

1	SELECT dbo.ufsIntToHex(1024)

And what about TSQL that converts hex to int?

CREATE FUNCTION ufiHexToInt

(

@HexString VARCHAR(50)

)

RETURNS INT

BEGIN

DECLARE @result INT

SELECT @result=0

SELECT @Result=(@result*16)+

(CHARINDEX(SUBSTRING(@HexString,number,1),'0123456789ABCDEF')-1)

FROM numbers WHERE number<=LEN(@HexString)

AND SUBSTRING(@HexString,number,1) LIKE '[0-9ABCDEF]'

RETURN @Result

END

And now we can test it out:

SELECT COUNT(*)

FROM numbers

WHERE dbo.ufiHexToInt(dbo.ufsIntToHex(number))<>number

Substituting values into a string

Next, we have a function that uses these principles to do macro substitution of values into a string. It will work for replacing XHTML placeholders with a value, or producing error messages in a variety of languages. In fact, uses keep popping up for this sort of function. In this version, one can specify what strings are used for substitutions (the default is %1, %2, %3 etc,) and what you use as the delimiter of your list of macros and values.

CREATE FUNCTION [dbo].[ufsSubstitute]

(

@Template VARCHAR(8000),

@macroList VARCHAR(8000),

@valueList VARCHAR(8000),

@Delimiter VARCHAR(255)

)

RETURNS

VARCHAR(8000)

BEGIN

DECLARE @macros TABLE (MyID INT IDENTITY(1,1),variable VARCHAR(80))

DECLARE @values TABLE (MyID INT IDENTITY(1,1),[value] VARCHAR(8000))

--get all the variables

INSERT INTO @macros (variable)

SELECT SUBSTRING(@MacroList+@Delimiter, number,

CHARINDEX(@Delimiter, @MacroList+@Delimiter, number) - number)

FROM Numbers

WHERE number <= LEN(@MacroList)

AND SUBSTRING(@Delimiter + @MacroList, number, LEN(@delimiter))

= @Delimiter

ORDER BY number

INSERT INTO @values ([value])

SELECT SUBSTRING(@ValueList+@Delimiter, number,

CHARINDEX(@Delimiter, @ValueList+@Delimiter, number) - number)

FROM Numbers

WHERE number <= LEN(@ValueList)

AND SUBSTRING(@Delimiter + @ValueList, number,

LEN(@delimiter)) = @Delimiter

ORDER BY number

SELECT @Template=

REPLACE(@Template,COALESCE(variable,

'%'+CAST(v.MyID AS VARCHAR)),

[value])

FROM @values v

LEFT OUTER JOIN @macros m ON v.MyID=m.MyID

WHERE CHARINDEX(COALESCE(variable,'%'+CAST(v.MyID AS VARCHAR))

,@Template)>0

RETURN (@Template)

END

There are several ways that we can use this routine in practical applications. Try out these and see what happens!

SELECT dbo.ufsSubstitute (NULL,NULL,'',',')

SELECT dbo.ufsSubstitute ('','','',',')

SELECT dbo.ufsSubstitute (

' views','','6',',')

SELECT dbo.ufsSubstitute ('

Dear $1 $2,

It has come to our attention that your $3 account is $4

to the extent of Â£$5.

Please phone our adviser, $6 $7 on $8 who will inform you of

the various actions that need to be taken',

'$1,$2,$3,$4,$5,$6,$7,$8',

'Mrs,Prism,current,overdrawn,5678,Mr,Grabbitas,04585 725938',

',')

SELECT dbo.ufsSubstitute ('To @Destination;

Report dated @Date

The @table table is now @rows long. please @action'

,'@Destination|@Date|@Table|@rows|@action',

'Phil Factor|12 Apr 2007|Log|1273980|truncate it at once','|')

SELECT dbo.ufsSubstitute (

'I thought that your present of a %1 was %2. Thank you very much.

The %1 will come in handy for %3'

,''

,'trowel|absolutely wonderful|gardening','|')

Extracting individual words from a string into a table

One can do rather cleverer things than this. For example, one can extract all the words from a string into a table, a row for each
word.

CREATE FUNCTION [dbo].[uftWords]

(

@String VARCHAR(8000)

)

RETURNS

@Results TABLE

(

SeqNo INT IDENTITY(1, 1),

Word VARCHAR(8000)

)

BEGIN

INSERT INTO @Results(word)

SELECT [word]=LEFT(RIGHT(@string,number),

PATINDEX('%[^a-z]%',RIGHT(@string,number)+' ')-1)

FROM Numbers

WHERE number <= LEN(@String)

AND PATINDEX('%[a-z]%',RIGHT(@string,number))=1

AND PATINDEX('%[^a-z]%',RIGHT(' '+@string,number+1))=1

ORDER BY number DESC

RETURN

END

--and you can get the words (we use it for inversion indexes)

SELECT * FROM dbo.uftWords ('One can do rather cleverer

things than this. <>! For example, one can extract all the

words from a string into a table, a row for each word.')

--or a word count

SELECT COUNT(*) FROM dbo.uftWords ('It is extraordinary

how easy it is to get a wordcount using this ')

Extracting all the numbers in a string into a table

Even more useful than this is a function that picks out all the numbers from a string into a table. You can therefore easily pick out the third or fourth string simply, because the table has the order as well as the number itself. Were it not for the unary minus operator, this would have been a delightfully simple function.

If you are using this routine, you’ll want to cast these numbers into the number type of your choice. We supply them as strings.

CREATE FUNCTION [dbo].[uftNumbers]

(

@String VARCHAR(8000)

)

RETURNS

@Results TABLE

(

SeqNo INT IDENTITY(1, 1),

number VARCHAR(100)

)

BEGIN

INSERT INTO @Results(number)

SELECT

CASE LEFT(RIGHT(' '+@String,number),1)

WHEN '-' THEN '-' ELSE '' END+

SUBSTRING( RIGHT(@String,number-1),1,

PATINDEX('%[^0-9.]%',

RIGHT(' '+@String,number-1)+' ')-1)

FROM Numbers

WHERE number <= LEN(REPLACE(@String,' ','!'))+1

AND PATINDEX('%[^0-9.][0-9]%',RIGHT(' '

+@String,number))=1

ORDER BY number DESC

RETURN

END

--So we try out a few examples just to see. It removes anything

--that doesn't look kile a number

SELECT * FROM dbo.uftNumbers('there are numbers like 34.56,

-56, 67.878, maybe34; possibly56, and a few others like <789023>')

SELECT * FROM dbo.uftNumbers('23,87986,56.78,67,09,23,30')

SELECT * FROM dbo.uftNumbers('')

SELECT * FROM dbo.uftNumbers('

<div>')

SELECT * FROM dbo.uftNumbers('there are numbers like 34.56,

-56, 67.878, maybe34; possibly56, and a few others like <789023>')

SeqNo number

----------- ------------

1 34.56

2 -56

3 67.878

4 34

5 56

6 789023

(6 row(s) affected)

Removing all text between delimiters

This is a handy little routine for looking at the strings in HTML code, but seems to earn its keep in a lot of other ways. You specify the opening and closing delimiter. At the moment, only single-character delimiters are allowed. Can anyone re-write it to allow multi-character delimiters?.

CREATE FUNCTION [dbo].[ufsRemoveDelimited]

(

@String VARCHAR(8000),

@OpeningDelimiter CHAR(1),

@ClosingDelimiter CHAR(1)

)

RETURNS

VARCHAR(8000)

BEGIN

DECLARE @newString VARCHAR(8000)

IF @OpeningDelimiter = @ClosingDelimiter

BEGIN

RETURN NULL

END

IF @OpeningDelimiter+@ClosingDelimiter+@String IS NULL

BEGIN

RETURN @String

END

SELECT @NewString=''

SELECT @newString =@newString +SUBSTRING(@String,number,1)

FROM numbers

WHERE number<=LEN (REPLACE(@string,' ','|'))

AND

CHARINDEX (@OpeningDelimiter,@string+@OpeningDelimiter,number)

CHARINDEX (@ClosingDelimiter,@string+' '+@closingDelimiter,number)

AND number <> CHARINDEX (@OpeningDelimiter,@string,number)

RETURN @NewString

END

So we can try it out with brackets:

1 2	SELECT dbo.ufsRemoveDelimited( 'this will appear(but not this),)will this?(','(',')')

Or if you want to take out tags:

SELECT dbo.ufsRemoveDelimited(

'this will appear

<div>and this </div>and this','<','>')

--or this

SELECT dbo.ufsRemoveDelimited(

'<?xml version="1.0" encoding="us-ascii"?>

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"

"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<head>

</head>

<body>

<h2>

Weather forecast

</h2>

<p>

The rain it raineth every day<br />

upon the just and unjust fellah<br />

but mainly on the just, because,<br />

the unjust pinched the just''s umbrella<br />

</p>

</div>

</body>

</html>

','<','>')

Scrabble Score

And as a slightly silly example of the sort of chore that crops up occasionally when analysing strings, character by character, here is a way of scoring a string for Scrabble, assuming it is all on ordinary squares!

CREATE FUNCTION ufiScrabbleScore

(

@String VARCHAR(100)

)

RETURNS INT

BEGIN

DECLARE @Total INT

SELECT @total=0

SELECT @Total=@total+

CASE WHEN SUBSTRING(@String,number,1) NOT LIKE '[a-z]' THEN NULL

ELSE

CAST(SUBSTRING('00000000001122223333347799',

CHARINDEX(SUBSTRING(@String,number,1),

'EAIONRTLSUDGBCMPFHVWYKJXQZ')

,1) AS INT)+1 END

FROM numbers WHERE number <=LEN(@String)

RETURN @Total

END

And now we try it out!

1 2	SELECT dbo.ufiScrabbleScore('Quiz') --22 SELECT dbo.ufiScrabbleScore('Robyn') --10

Now we find out which are the highest scorers, assuming an illegal quantity of tiles. We use the WordList from Phil’s Blog on the Fireside Fun of Decapitations.

Don’t try running this without the WordList!

SELECT dbo.ufiScrabbleScore(word),word

FROM wordlist ORDER BY dbo.ufiScrabbleScore(word) DESC /*

49 razzamatazz

48 razzmatazz

45 pizzazz

43 quizzically

39 squeezeboxes

38 quizzical

38 psychoanalyzing

37 psychoanalyzed

37 squeezebox

..etc....

As well as slicing and dicing strings, once one has one’s helper table,
suddenly time-interval based reporting becomes much easier.

Moving averages

How about moving averages? Here is a simple select statement that gives you the moving average (over a week) of the number of log entries for every day for a year. This can be adapted to give weighted and exponential moving averages over arbitrary time periods. You use this technique for ironing out ‘noise’ from a graph in order to accentuate the underlying trend.

To execute this, you will need a table to try it on.

DROP TABLE #cb

CREATE TABLE #cb (insertionDate datetime)--quantity

--for once, we need to iterate to shuffle the pack

DECLARE @ii INT

SELECT @ii=0

WHILE @ii<20000

BEGIN

INSERT INTO #cb(insertionDate)

SELECT DATEADD(Hour,RAND()*8760,'1 jan 2006')

SELECT @ii=@ii+1

END

--and put an index on it

CREATE CLUSTERED INDEX idxInsertionDate ON #cb(insertionDate)

Now we can try out a moving average!

SELECT start,[running average]=COUNT(*)/7

FROM

(

SELECT [order]=number,

[start]=DATEADD(DAY,number,'1 Jan 2006'),

[end]=DATEADD(DAY,number+7,'1 Jan 2006')

FROM numbers

WHERE DATEADD(DAY,number,'1 Jan 2006')

BETWEEN '1 Jan 2006' AND '1 Jan 2007')f

LEFT OUTER JOIN [#cb]

ON [#cb].insertionDate BETWEEN f.start AND f.[end]

GROUP BY start

ORDER BY start

Getting the ‘Week beginning’ date in a table

Here is a UDF that lists all the Mondays (or whatever you want) between two dates.

CREATE FUNCTION uftDatesOfWeekday

(

@Weekday VARCHAR(10),

@StartDate datetime,

@EndDate DateTime

)

RETURNS TABLE

RETURN

(

SELECT

[start]=DATEADD(DAY,number-1,@StartDate)

FROM numbers

WHERE DATEADD(DAY,number-1,@StartDate)< @EndDate

AND DATENAME(dw,DATEADD(DAY,number-1,@StartDate))=@Weekday

)

And you can try it out by finding how many mondays there are between

the first of January and 1st june this year. */

SELECT * FROM dbo.uftDatesOfWeekday('monday','1 Jan 2007','1 Jun 2007')

Number of Working Days between two dates

Or, how about a UDF that tells you the number of working days between two dates? (you can alter it if Saturday and Sunday are not your days off!)

CREATE FUNCTION ufiWorkingDays

(

@StartDate datetime,

@EndDate DateTime

)

RETURNS INT

BEGIN

RETURN

(SELECT COUNT(*)

FROM numbers

WHERE DATEADD(DAY,number-1,@StartDate)< @EndDate

AND DATENAME(dw,DATEADD(DAY,number-1,@StartDate))

NOT IN ('saturday','sunday'))

END

So how many working days until Christmas?

1 2	SELECT dbo.ufiWorkingDays( GETDATE(),'25 Dec '+DATENAME(YEAR,GETDATE()))

We can go on for ever with example of using a numeric Helper table but we won’t because most of you will have wandered off even before you reach this point. We hope that you’ll now take over and create some more examples, try them out against iterative solutions that do the same thing. We guarantee you’ll be pleased with the result!

Register for Simple Talk

The Helper Table Workbench

Our examples include:

Creating the helper table

Splitting Strings into table-rows, based on a specified delimiter

Encoding and decoding a string

Substituting values into a string

Extracting individual words from a string into a table

Extracting all the numbers in a string into a table

Removing all text between delimiters

Scrabble Score

Getting the ‘Week beginning’ date in a table

Number of Working Days between two dates

See also other Workbenches at Simple-Talk

Article tags

About the author

Phil Factor

Phil Factor's contributions

Articles

Books

Top topics

Phil Factor's latest contributions:

Accessibility, and the Need for Ingenious Simplicity in Computer Scripts, Languages and Programs

Data Documents and Common Sense

AI and Databases

About the author

Robyn Page

Robyn Page's contributions

Articles

Books

Top topics

Robyn Page's latest contributions:

SQL Server Matrix Workbench

Database Design Workbench – Keys

NET Performance Cribsheet

Our examples include:

Creating the helper table

Splitting Strings into table-rows, based on a specified delimiter

Encoding and decoding a string

Substituting values into a string

Extracting individual words from a string into a table

Extracting all the numbers in a string into a table

Removing all text between delimiters

Scrabble Score

Getting the ‘Week beginning’ date in a table

Number of Working Days between two dates

See also other Workbenches at Simple-Talk

Article tags

Recommended

About the author

Phil Factor

Phil Factor's contributions

Articles

Books

Top topics

Phil Factor's latest contributions:

Accessibility, and the Need for Ingenious Simplicity in Computer Scripts, Languages and Programs

Data Documents and Common Sense

AI and Databases

About the author

Robyn Page

Robyn Page's contributions

Articles

Books

Top topics

Robyn Page's latest contributions:

SQL Server Matrix Workbench

Database Design Workbench – Keys

NET Performance Cribsheet