Using XML to pass lists as parameters in SQL Server

(Updated 14th Jan 2012, 26th Jan 2012, and 3rd Aug 2017)

Every so often, before SQL Server 2016, the question came up on forums of how to pass a list as a parameter to a SQL procedure or function.

Nowadays, of course, one uses the STRING_SPLIT() function ….

1 2	DECLARE @tags VARCHAR(200) = 'Yan,Tyan,Tethera,Methera,Pimp,Sethera,Lethera,Hovera,Dovera,Dik,Yanadik,Tyanadik,Tethera dik,Methera dik,Bumfitt,Yanabumfit,Tyanabumfitt,Tetherabumfitt,Metherabumfitt,Giggot' SELECT value FROM STRING_SPLIT(@tags, ',')

There was a time that I used to love this question because one could spread so much happiness and gratitude by showing how to parse a comma-delimited list into a table. Jeff Moden has probably won the laurels for the ultimate list-parsing TSQL function, the amazing ‘DelimitedSplit8K’. Erland Sommarskog has permanently nailed the topic with a very complete coverage of the various methods for using lists in SQL Server on his website.. With Table-Valued parameters, of course, the necessity for having any lists in SQL Server is enormously reduced, though it still crops up.

Element-based XML seems, on the surface, to provide a built-in way of handling lists as parameters. No need for all those ancillary functions for splitting lists into tables, one might think. Yes, indeed, but be careful of the XQuery syntax that you use, as we’ll see.

Let’s just take the very simplest example of taking a list of integers and turning it into a table of integers. One can use a simple element-based XML list based on a fragment like this…

1 2	DECLARE @XMLlist XML SELECT @XMLList = '<list><i>2</i><i>4</i><i>6</i><i>8</i><i>10</i><i>15</i><i>17</i><i>21</i></list>'

…to give a table like this ..

IDs

-----------

(8 row(s) affected)

..by using this TSQL expression

1 2	SELECT x.y.value('.','int') AS IDs FROM @XMLList.nodes('/list/i') AS x ( y )

It isn’t as intuitive as a simple comma-delimited list to be sure, but it is bearable. In small doses, this shredding of element-based XML lists works fine, but,<added 26th January> using this particular syntax for getting the value of the element </added 26th January> one soon notices the sigh from the server as the length of the list starts to increases into four figures. Try it on a list of a hundred thousand integers and you’ll have time to eat a sandwich.

OK. Let’s do a few measurements and do a graph.

TheFirstGraph

Jeff Moden’s Split function, scales beautifully as shown by the red line, and it is difficult to measure it, it is so quick. The XML eShred technique by contrast, using the element-based list, and this XQuery syntax, exhibits horrible scaling. This sort of curve is enough to strike terror into the developer. Just as a comparison, I did the process of taking a list and turning it into a table by creating a VALUES expression from the list. Even this took longer than Jeff’s function, presumably because of the overhead of compiling the expression. Of course, the comparison is rather unfair because this third approach , the ‘Using Values Script’ approach, does no validation, but still one wonders what on earth the XML expression is DOING all that time. It must be gazing out the window, reading the paper and scratching itself, one assumes. No. Actually, the CPU is warming up on the task, so it involves frantic activity.

TheSecondGraph

Operations involving XML can be startlingly fast in SQL Server, but this particular critter seems startlingly slow once one gets to a three-figure number in the list that you’re turning into a relation that can then be used in SQL Expressions. Imagine this getting loose in a production system when someone starts passing more values in the list and the database suddenly slows down. How would you track the problem down?

<added 26th January>The big problem here is one of the expression. As pointed out by a reader of the blog, if you modify the expression slightly to

1 2	SELECT x.y.value('.','int') AS IDs FROM @XMLList.nodes('/list/i/text()') AS x ( y )

The shredding suddenly goes like a rocket</added 26th January>.

There is a second, and more wordy version of the simple XML list, The attribute-based list. Here

1 2	DECLARE @XMLlist XML SELECT @XMLList = '<list><y i="2" /><y i="4" /><y i="6" /><y i="8" /><y i="10" /><y i="15" /><y i="17" /><y i="21" /></list>'

…to give exactly the same table ..

IDs

-----------

(8 row(s) affected)

by using a very similar syntax like this…

1 2	SELECT x.y.value('@i','int') FROM @XMLList.nodes('list/y') AS x( y )

‘Eh? What difference could this make?’, you’re wondering. Well, rather a lot, it turns out. Thanks to a reader comment on the first version of this blog by wBob, we can reveal that this version scales as well as Jeff’s amazing ‘split’ function. I must admit to being rather surprised at the difference in performance of the two. By using the Attribute-based list, or the modified XQuery expression, you can cheerfully pass lists of substantial length to procedures without worrying, it seems. It reminded me of an argument I had some years ago with a distinguished Database Developer who was advising us all not to use XML-based lists because of their performance problems. I’d been using attribute-based lists simply because I copied Bob Beauchamin’s example code in ‘The Book’, and hadn’t hit any problem. We couldn’t convince each other. Perhaps we’d hit this performance difference.

<added 26th January>We can even improve on this. as Sviridov Konstantin has pointed out, you can tweak it to the point that it goes around twice the speed of Jeff’s TSQL split function by using syntax like this…

1 2	SELECT x.y.value('.','int') FROM @XMLList.nodes('list/y/@i') AS x( y )

Which will give a performance comparison with the the TSQL split function. These modified XML shreds must be the fastest non-CLR method of transferring list-based data as a variable in SQL.

FinalShootout

</added 26th January>XML in SQL Server seems to remind me of the old poem by Henry Wadsworth Longfellow. (1807-1882)

There was a little girl, who had a little curl
Right in the middle of her forehead,
And when she was good, she was very, very good,
But when she was bad she was horrid.’.

Just as a comparison, here is the horrid XML performing against the first attribute-based XML Shred, and the TSQL split function

TheThirdGraph

As always, one should measure everything you can, and take nothing for granted if you have to deliver a scalable application.

Here is the simple code I used to do the metrics. The results were simply pasted into excel and graphed.

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

127

128

129

130

131

132

133

134

135

136

137

138

139

140

141

142

143

144

--in order to record the timings, we prepare a log.

DECLARE @log TABLE (

Log_Id INT IDENTITY(1, 1),TheeVent VARCHAR( 2000 ),

[Values] INT, CPU FLOAT DEFAULT @@CPU_BUSY,

DateAndTime DATETIME DEFAULT GETDATE()) ;

--define working variables

DECLARE @List VARCHAR(MAX),@XMLList XML,@AttributeBasedXMLList XML,

@buildScript VARCHAR(MAX), @XMLbuildScript NVARCHAR(MAX),

@ii INT, @Values INT;

--and table variables to receive the results/relations

DECLARE @ByteBucket TABLE (TheNumeral INT)

DECLARE @ByteBucket2 TABLE (TheNumeral INT)

DECLARE @ByteBucket3 TABLE (TheNumeral INT)

DECLARE @ByteBucket4 TABLE (TheNumeral INT)

DECLARE @ByteBucket5 TABLE (TheNumeral INT)

DECLARE @ByteBucket6 TABLE (TheNumeral INT)

SET NOCOUNT ON

SELECT @Values=1 --start with one item in the list

WHILE @Values<=3000

BEGIN

--build up the list with random integers

SELECT @List='1',@ii=@Values

WHILE @ii>1

BEGIN

SELECT @List=@List+','+CONVERT(VARCHAR(3),CONVERT(INT,RAND()*100))

SELECT @ii=@ii-1 -- .. to the required length

END

--and pre-prepare the XML List and VALUES script

SELECT @XMLList='<list><i>'+REPLACE(@List,',','</i><i>')+'</i></list>'

SELECT @BuildScript='SELECT x FROM (values ('+REPLACE(@List,',','),(')+'))d(x)'

--and pre-prepare the XML List and VALUES script

SELECT @XMLBuildScript='SELECT @AttributeBasedXMLList = ( SELECT x AS "@x" FROM (values ('

+REPLACE(@List,',','),(')+'))d(x) FOR XML PATH(''y''), ROOT(''root''), TYPE )'

EXEC sp_executesql @XMLBuildScript, N'@AttributeBasedXMLList XML OUT',

@AttributeBasedXMLList OUT

--try doing the delimited list function

INSERT INTO @log (TheEvent, [Values]) SELECT 'Using Split function',@Values

INSERT INTO @ByteBucket (TheNumeral)

SELECT item FROM dbo.DelimitedSplit8K (@list,',')

--use the XML Shred trick

INSERT INTO @log (TheEvent, [Values]) SELECT 'Element-based XML',@Values

INSERT INTO @ByteBucket2 (TheNumeral)

SELECT x.y.value('.','int') AS IDs

FROM @XMLList.nodes('/list/i') AS x ( y )

--use the XML Shred trick

INSERT INTO @log (TheEvent, [Values]) SELECT 'Element-based XML (text())',@Values

INSERT INTO @ByteBucket6 (TheNumeral)

SELECT x.y.value('.','int') AS IDs

FROM @XMLList.nodes('/list/i/text()') AS x ( y )

--try the VALUES method

INSERT INTO @log (TheEvent, [Values]) SELECT 'Using Values script',@Values

INSERT INTO @ByteBucket3 (TheNumeral)

EXECUTE (@BuildScript)

--use the XML Shred trick

INSERT INTO @log (TheEvent, [Values]) SELECT 'Attribute-based XML',@Values

INSERT INTO @ByteBucket4 (TheNumeral)

SELECT x.y.value('@x','int')

FROM @AttributeBasedXMLList.nodes('root/y') AS x(y)

INSERT INTO @log (TheEvent, [Values]) SELECT 'finished',@Values

--use the XML Shred trick

INSERT INTO @log (TheEvent, [Values]) SELECT 'Attribute-based XML (@)',@Values

INSERT INTO @ByteBucket5 (TheNumeral)

SELECT x.y.value('.','int')

FROM @AttributeBasedXMLList.nodes('root/y/@x') AS x(y)

INSERT INTO @log (TheEvent, [Values]) SELECT 'finished',@Values

SELECT @Values=@Values+100

END

--Yes, we need to check that they all agree!

SELECT COUNT(*), SUM(TheNumeral) FROM @ByteBucket

UNION ALL

SELECT COUNT(*), SUM(TheNumeral) FROM @ByteBucket2

UNION ALL

SELECT COUNT(*), SUM(TheNumeral) FROM @ByteBucket3

UNION ALL

SELECT COUNT(*), SUM(TheNumeral) FROM @ByteBucket4

UNION ALL

SELECT COUNT(*), SUM(TheNumeral) FROM @ByteBucket5

UNION ALL

SELECT COUNT(*), SUM(TheNumeral) FROM @ByteBucket6

SELECT

TheStart.[Values],

MAX(CASE WHEN TheStart.TheeVent ='Using Split function'

THEN DATEDIFF( ms, TheStart.DateAndTime, Theend.DateAndTime )

ELSE 0

END) AS [Using Split function],

MAX(CASE WHEN TheStart.TheeVent ='Element-based XML'

THEN DATEDIFF( ms, TheStart.DateAnddTime, Theend.DateAndTime )

ELSE 0

END) AS [Element-based XML],

MAX(CASE WHEN TheStart.TheeVent ='Using Values script'

THEN DATEDIFF( ms, TheStart.DateAndTime, Theend.DateAndTime )

ELSE 0

END) AS [Using Values script],

MAX(CASE WHEN TheStart.TheeVent ='Attribute-based XML'

THEN DATEDIFF( ms, TheStart.DateAndTime, Theend.DateAndTime )

ELSE 0

END) AS [XML Attributes],*/

MAX(CASE WHEN TheStart.TheeVent ='Element-based XML (text())'

THEN DATEDIFF( ms, TheStart.DateAndTime, Theend.DateAndTime )

ELSE 0

END) AS [Element-based XML (text())],

MAX(CASE WHEN TheStart.TheeVent ='Attribute-based XML (@)'

THEN DATEDIFF( ms, TheStart.DateAndTime, Theend.DateAndTime )

ELSE 0

END) AS [Attribute-based XML (@)]

FROM @log TheStart

INNER JOIN @log Theend

ON Theend.Log_Id = TheStart.Log_Id + 1

WHERE TheStart.TheEvent<>'finished'

GROUP BY TheStart.[Values];

SELECT

TheStart.[Values],

MAX(CASE WHEN TheStart.TheeVent ='Using Split function'

THEN theEnd.cpu-TheStart.cpu

ELSE 0

END) AS [Using Split function],

MAX(CASE WHEN TheStart.TheeVent ='Element-based XML'

THEN theEnd.cpu-TheStart.cpu

ELSE 0

END) AS [Element-based XML],

MAX(CASE WHEN TheStart.TheeVent ='Using Values script'

THEN theEnd.cpu-TheStart.cpu

ELSE 0

END) AS [Using Values script],

MAX(CASE WHEN TheStart.TheeVent ='Attribute-based XML'

THEN theEnd.cpu-TheStart.cpu

ELSE 0

END) AS [XML Attributes],

MAX(CASE WHEN TheStart.TheeVent ='Element-based XML (text())'

THEN theEnd.cpu-TheStart.cpu

ELSE 0

END) AS [Element-based XML (text())],

MAX(CASE WHEN TheStart.TheeVent ='Attribute-based XML (@)'

THEN theEnd.cpu-TheStart.cpu

ELSE 0

END) AS [Attribute-based XML (@)]

FROM @log TheStart

INNER JOIN @log Theend

ON Theend.Log_Id = TheStart.Log_Id + 1

WHERE TheStart.TheEvent<> 'finished'

GROUP BY TheStart.[Values];

Register for Simple Talk

Using XML to pass lists as parameters in SQL Server

About the author

Phil Factor

Phil Factor's contributions

Articles

Books

Top topics

Phil Factor's latest contributions:

Accessibility, and the Need for Ingenious Simplicity in Computer Scripts, Languages and Programs

Data Documents and Common Sense

AI and Databases

Recommended

About the author

Phil Factor

Phil Factor's contributions

Articles

Books

Top topics

Phil Factor's latest contributions:

Accessibility, and the Need for Ingenious Simplicity in Computer Scripts, Languages and Programs

Data Documents and Common Sense

AI and Databases