{"id":256,"date":"2007-04-26T00:00:00","date_gmt":"2007-04-25T00:00:00","guid":{"rendered":"https:\/\/test.simple-talk.com\/uncategorized\/sql-server-grouping-workbench\/"},"modified":"2021-09-29T16:22:19","modified_gmt":"2021-09-29T16:22:19","slug":"sql-server-grouping-workbench","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/t-sql-programming-sql-server\/sql-server-grouping-workbench\/","title":{"rendered":"SQL Server Grouping Workbench"},"content":{"rendered":"<p>This Workbench is about using the <code>GROUP BY<\/code> clause in a SQL Statement. It is designed so it can be pasted straight into SSMS or the Query Analyser, though to run the examples, you&#8217;ll need the data file too. Just click the &#8220;CODE DOWNLOAD&#8221; link at the bottom of this article. Also included in the download bundle is the original SQL file, if you need it.<\/p>\n<h2>Contents<\/h2>\n<ul>\n<li><a href=\"#first\">Introduction<\/a><\/li>\n<li><a href=\"#second\">Removing Duplicates<\/a><\/li>\n<li><a href=\"#third\">Simple Reporting<\/a><\/li>\n<li><a href=\"#fourth\">Grouping and Pivot Tables<\/a><\/li>\n<li><a href=\"#fifth\">Enumerating the Group members<\/a><\/li>\n<li><a href=\"#sixth\">Ranging<\/a><\/li>\n<li><a href=\"#seventh\">Grouping as a utility<\/a><\/li>\n<li><a href=\"#eighth\">Questions<\/a><\/li>\n<\/ul>\n<h2>Introduction<\/h2>\n<p>Like all these workbench articles, the whole point is to try different things out, improve on the code, and learn in a practical way. If you&#8217;re like us, and find you can take in information only after real practice doing things for real then this article is for you.<\/p>\n<p>If you think that it is too elementary, move to the end of the article and test out your knowledge by answering all the questions (Yes, the editor is giving away prizes for the best answers he receives. Insist on the glorious USB Red-Gate pendant).<\/p>\n<p>Aggregations in SQL are very simple to do, but can cause a lot of confusion due to the word &#8216;Group&#8217;. The result you get from a grouping does not consist of the individual records but a collection or &#8216;aggregation&#8217; of records.<\/p>\n<p>A pack of playing cards is a group of 52 playing cards. If you were to group a pack of cards by colour, you&#8217;d have two &#8216;rows&#8217; of 26 cards; if you grouped by suit, there would be four &#8216;rows&#8217; with 13 in each. You could group them by their rank\/court attribute, or in other ways. The point is that the result is always an aggregate, even if there is only one member in the collection.<\/p>\n<p>Each row in the result of a <code>GROUP BY<\/code> represents a collection, or grouping. You can get the values for the number of records in each collection, or grouping, or the sum, average, and so on of numeric values, or the max or min of strings, but to ask for individual values makes no sense if you haven&#8217;t grouped by that value.<\/p>\n<p>Sometimes, you will use an aggregate function in a <code>SELECT<\/code> Statement without using a <code>GROUP BY<\/code> clause. There is an implicit grouping of the entire result into one row, which is why you can only use aggregate functions in such a case, and will get one row returned.<\/p>\n<p><code>GROUP BY<\/code> statements are the bedrock of reporting in SQL.<\/p>\n<p>The built-in aggregate functions are&#8230;<\/p>\n<p><b>Common ones <\/b><\/p>\n<pre>sum([all|distinct] expr.)   Total of the (distinct) values in the expression \r\navg([all|distinct] expr.)   Average of the (distinct) values in the expression \r\ncount([all|distinct] expr.) Number of (distinct) non-null values in the expression \r\ncount(*)                    Number of selected rows \r\nmax(expr.)                  Highest value in the expression \r\nmin(expr.)                  Lowest value in the expression \r\n<\/pre>\n<p><b>Statistical functions <\/b><\/p>\n<pre>stdev(expr.)               The standard deviation of the values in the expression \r\nStdevp(expr.)              Population standard deviation of the expression values \r\nVar(expr.)                 The Variance of the expression values \r\nVarp(expr.)                    The population Variance of the expression values \r\n<\/pre>\n<p><b>Checksum functions <\/b><\/p>\n<pre>Binary_checksum(expr.)      Returns the binary checksum value \r\nChecksum(expr.)                Returns the checksum value \r\nChecksum(*)                    checksum is over all the columns of the table \r\nChecksum_agg \r\n([all|distinct] expr.)     Returns the checksum of the NON-NULL values in a group.  \r\n<\/pre>\n<p>(You can use sum and avg with numeric columns only \u2013 <code>int, smallint, TinyIntBigint   decimal, numeric, float, SmallMoney<\/code> and <code>money<\/code>. You cannot use <code>min<\/code> and <code>max<\/code> with bit datatypes. You cannot use aggregate functions other than <code>count(*)<\/code> with text and image datatypes.)<\/p>\n<p>We&#8217;ll start with some simple manoevres and then finish off with a few wheelies and handbrake-turns.<\/p>\n<p>As sample data, we&#8217;ll take a list of all oil-producing countries in the world and their oil-production.<\/p>\n<p>Create a new database or use your &#8216;practice&#8217; development Database and then create this.<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">CREATE TABLE [dbo].[OilProducers]( \r\n   [[OilProducer_ID] [int] IDENTITY(1,1) NOT NULL, \r\n   [country] [varchar](30) NOT NULL, \r\n   [BarrelsPerDay] [int] NOT NULL, \r\n   [Continent] [varchar](80) NOT NULL, \r\n   [Region] [varchar](80) NOT NULL \r\n) ON [PRIMARY] \r\n\r\nGO \r\n<\/pre>\n<p>And we&#8217;ll get our data in. Download the <a href=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/384-GroupingWorkbench_Code.zip\">data file<\/a>, and load the data as follows:<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">-- substitute the full path to your file for the name 'OilProducers.txt' \r\nBULK INSERT dbo.OilProducers \r\n   FROM ' OilProducers.txt' \r\n\r\n\/* or \r\nDeclare @Command varchar(8000) \r\nselect @Command='bcp ' \r\n   +db_name() \r\n   +'.dbo.OilProducers in OilProducers.txt -c -S '+@@Servername+' -E -T' \r\nexecute xp_cmdshell @command--assumes Windows Security \r\n<\/pre>\n<h2 id=\"second\">Removing Duplicates<\/h2>\n<p>Before we get stuck into doing reporting, I&#8217;ll illustrate how simple it is to remove duplicates from tables using grouping.<\/p>\n<p>Of course, if you are going to remove duplicates, you will really want to move them to another table for forensic analysis to find out how it happened and whether they are valid entries. Here is our Duplicates table:<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">CREATE TABLE [dbo].[Duplicates]( \r\n   [OilProducer_ID] [int] IDENTITY(1,1) NOT NULL, \r\n   [Duplicate_ID] [int], \r\n   [country] [varchar](30) NOT NULL, \r\n   [BarrelsPerDay ] [int] NOT NULL, \r\n   [Continent] [varchar](80) NOT NULL, \r\n   [Region] [varchar](80) NOT NULL \r\n) ON [PRIMARY] \r\n<\/pre>\n<p>Let&#8217;s maliciously create some duplicates!<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">INSERT INTO OilProducers(Country,BarrelsPerDay,continent, region) \r\n   SELECT Country,BarrelsPerDay,continent, region  \r\n         FROM OilProducers WHERE country LIKE 'A%' \r\nINSERT INTO OilProducers(Country,BarrelsPerDay ,continent, region) \r\n   SELECT Country,BarrelsPerDay,continent, region  \r\n        FROM OilProducers WHERE country BETWEEN 'A' AND 'E' \r\n        <\/pre>\n<p>You can then easily see these duplicates by:<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">SELECT 'First_ID'=MAX(OilProducer_ID), 'No.InTable'=COUNT(*) \r\n   FROM OilProducers \r\n   GROUP BY country  \r\n   HAVING COUNT(*)&gt;1\r\n   <\/pre>\n<p>Notice that we use a <code>HAVING<\/code> clause. This is because we are selecting only those rows that contain more than one row in the original table (we could have checked only part of the original table by using the <code>WHERE<\/code> clause. A <code>WHERE<\/code> clause cannot contain an aggregate function of course, but the <code>HAVING<\/code> clause can.<\/p>\n<p>So we move them to another table to check them to make sure. We wrap this up in a transaction just in case anything goes wrong, as we don&#8217;t want to lose a record that might turn out to be legit.<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">-- start of routine\r\nDECLARE @found INT \r\nDECLARE @Errorhappened INT \r\nDECLARE @OtherErrorhappened INT \r\nBEGIN TRANSACTION \r\nWHILE COALESCE(@found,1)&gt;0  \r\n       AND COALESCE(@ErrorHappened,0) =0 \r\n       AND COALESCE(@OtherErrorHappened,0)=0 \r\n   BEGIN \r\n   INSERT INTO Duplicates (Duplicate_ID,country,BarrelsPerDay,continent,region) \r\n       SELECT OilProducer_ID,country,BarrelsPerDay,continent,region \r\n       FROM OilProducers WHERE oilProducer_ID IN ( \r\n           SELECT MAX(OilProducer_ID) \r\n                FROM OilProducers \r\n               GROUP BY country  \r\n               HAVING COUNT(*)&gt;1) \r\n   SELECT @Errorhappened=@@Error, @found=@@Rowcount \r\n   DELETE FROM OilProducers WHERE oilProducer_ID IN ( \r\n           SELECT MAX(OilProducer_ID) \r\n               FROM OilProducers \r\n               GROUP BY country  \r\n               HAVING COUNT( *)&gt;1)--the HAVING clause has to be used as \r\n                                  --we have to select rows after the \r\n                                  --aggregation \r\n   SELECT @OtherErrorHappened=@@Error \r\n   END \r\nIF @errorHappened&lt;&gt;0 or @OtherErrorHappened &lt;&gt;0 ROLLBACK TRANSACTION \r\nELSE COMMIT TRANSACTION    \r\n--end of routine\r\n<\/pre>\n<p>Now look in the duplicate table:<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">SELECT * FROM Duplicates \r\n<\/pre>\n<p>And check the main table for duplicates with the SQL I&#8217;ve already given.<\/p>\n<h2 id=\"third\">Simple Reporting<\/h2>\n<p>What about getting a summary of oil production per continent?<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">SELECT [Barrels per day ]=SUM(BarrelsPerDay), continent \r\n   FROM OilProducers  \r\n   GROUP BY continent --each row should be a continent \r\n   ORDER BY SUM(BarrelsPerDay) DESC --in descending order \r\n   <\/pre>\n<p>We are grouping by continent and get back seven records, each of which represents a collection of oil producers in the same continent.<\/p>\n<p>&#8230;or by region ordered by production?<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">SELECT  [production]=SUM(BarrelsPerDay),  \r\n       [average]=AVG(BarrelsPerDay),  \r\n       Region,  \r\n       [ Producers in region]=COUNT(*) \r\n   FROM OilProducers  \r\n   GROUP BY region --each row is a collection of all the records for a region \r\n   ORDER BY SUM(BarrelsPerDay) DESC --order by production \r\n   <\/pre>\n<p>Grand totals are easy:<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">SELECT   \r\n   [continent]=COALESCE(continent,'Total (world production)'),  \r\n   [Barrels per day]=SUM(BarrelsPerDay) \r\n   FROM OilProducers  \r\n   GROUP BY continent WITH rollup--each row should be a continent \r\n    ORDER BY grouping(continent) ASC,SUM(BarrelsPerDay) DESC \r\n    <\/pre>\n<p>We used the <code>grouping()<\/code> function to order the total at the end we used <code>WITH ROLLUP <\/code>to compute the total. It can be very useful for running sums and running averages.<\/p>\n<p>&#8230;but someone is bound to ask for just the top five:<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">SELECT  TOP 5 [production]=SUM(BarrelsPerDay),  \r\n       [average]=AVG(BarrelsPerDay), Region, [Producers in region]=COUNT(*) \r\n   FROM OilProducers  \r\n   GROUP BY region --each row is a collection of all the records for a region \r\n   ORDER BY SUM(BarrelsPerDay) DESC --order by production \r\n   <\/pre>\n<p>&#8230;or maybe the bottom five!<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">SELECT * FROM \r\n(SELECT  TOP 5 [production]=SUM(BarrelsPerDay), [average]=AVG(BarrelsPerDay),  \r\n                                     Region, [Producers in region]=COUNT(*) \r\n   FROM OilProducers  \r\n   GROUP BY region--each row is a collection of all the records for a region \r\n   ORDER BY SUM(BarrelsPerDay) ASC)f --order by production \r\nORDER BY production DESC \r\n<\/pre>\n<p>Or, more tiresomely, for the top five, the others as an &#8216;Others&#8217; row, <b>and<\/b> the sum total!<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">SELECT  \r\n[region]=CASE WHEN g.n=1 THEN t ELSE Region END, \r\n[production]=CASE WHEN t='other'  \r\n   THEN production - \r\n       (SELECT SUM(PRODUCTION) \r\n           FROM     \r\n               (SELECT  TOP 5 [production]=SUM(BarrelsPerDay) \r\n                   FROM OilProducers  \r\n               GROUP BY region  \r\n               ORDER BY SUM(BarrelsPerDay) DESC)F) \r\n   ELSE production END \r\nFROM (SELECT [n]=0,[t]='',[TheOrder]=1  --trick to duplicate the null row  the \r\n       UNION ALL          --'to get 'others' row \r\n       SELECT 1,'Other',2  \r\n       UNION ALL SELECT 1,'Total',3)g \r\nINNER JOIN \r\n(SELECT  TOP 6 [production]=SUM(BarrelsPerDay),--the basic data \r\n       [n]=grouping(region),  \r\n       Region \r\n   FROM OilProducers  \r\n   GROUP  BY region WITH rollup-- \r\n   ORDER BY SUM(BarrelsPerDay) DESC \r\n   )f  \r\nON g.n=f.n \r\nORDER BY g.TheOrder ASC, production DESC  \r\n<\/pre>\n<p>And then there will be the guy who wants a list of just those regions who produce more than the average.<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">SELECT   Region, [production]=SUM(BarrelsPerDay),  [Producers in region]=COUNT(*) \r\n   FROM OilProducers  \r\n   GROUP BY region-- each row is a collection of all the records for a region \r\n   HAVING SUM(BarrelsPerDay)&gt; \r\n       (SELECT AVG(average) FROM \r\n(SELECT [average]=AVG(BarrelsPerDay) FROM OilProducers GROUP BY region)f) \r\n   ORDER BY SUM(BarrelsPerDay) DESC --order by production \r\n   <\/pre>\n<p>Or possibly the five countries closest to the average:<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">SELECT TOP 5 country, BarrelsPerDay, theAverage \r\nFROM OilProducers \r\nCROSS JOIN \r\n(SELECT [theAverage]=AVG(BarrelsPerDay) FROM OilProducers)f \r\nORDER BY  ABS(barrelsPerDay-theaverage)   ASC \r\n<\/pre>\n<p>Or conceivably, region closest to the average<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">SELECT TOP 5 * FROM  \r\n   (SELECT region, [sumBPD]=SUM(BarrelsPerDay)--, theAverage \r\n       FROM OilProducers GROUP BY region \r\n   )h \r\nCROSS  JOIN \r\n   (SELECT [theAverage]=AVG(production) FROM  \r\n       (SELECT [production]=SUM(BarrelsPerDay)  \r\n           FROM OilProducers GROUP BY region \r\n        )g \r\n   )f \r\nORDER BY  ABS(sumBPD-theaverage) ASC \r\n<\/pre>\n<h2 id=\"fourth\">Grouping and Pivot Tables<\/h2>\n<p>So let&#8217;s get a bit harder and do a pivot table relating regions and continents with column and row totals.<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">SELECT [region\/continent ]=CONVERT(CHAR(25),COALESCE(region,'Total')), \r\n   'Asia'=SUM(CASE WHEN continent = 'Asia' THEN BarrelsPerDay ELSE 0 END), \r\n   'North A.'=SUM(CASE WHEN continent ='North America'  \r\n                                            THEN BarrelsPerDay ELSE 0 END) , \r\n   'Latin A.'=SUM(CASE WHEN continent ='Latin America and the Caribbean'  \r\n                                            THEN BarrelsPerDay ELSE 0 END), \r\n   'Africa'=SUM(CASE WHEN continent ='Africa' THEN BarrelsPerDay ELSE 0 END), \r\n   'Russia'=SUM(CASE WHEN continent ='Russia' THEN BarrelsPerDay  ELSE 0 END), \r\n   'Europe'=SUM(CASE WHEN continent ='Europe' THEN BarrelsPerDay ELSE 0 END), \r\n   'Oceania'=SUM(CASE WHEN continent ='Oceania' THEN BarrelsPerDay ELSE 0 END), \r\n   'sum'=SUM(BarrelsPerDay) \r\nFROM OilProducers \r\nGROUP BY region \r\n   WITH rollup \r\n   ORDER  BY grouping(region),SUM(BarrelsPerDay) DESC \r\n   <\/pre>\n<p>The &#8216;order by grouping&#8217; trick ensures that the total comes in the right place on the last row!<\/p>\n<h2 id=\"fifth\">Enumerating the Group members<\/h2>\n<p>This is all OK but a lot of people want, and expect, to have a list of all the constituents of their grouping in the result. They don&#8217;t like DBAs laughing and saying it isn&#8217;t possible. There are now CLR Aggregation routines that do it, but here is a method that works on SQL 2000. Not a cursor in sight! (it works a lot better in 2005 with <code>VaRCHAR(MAX))<\/code>.<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">DECLARE @list VARCHAR(8000) \r\nSELECT @List=COALESCE(@list+',','') +'|'+region +'|'+ country+'|'+region +'|' \r\nFROM OilProducers ORDER BY region \r\n\r\nSELECT [region]=CONVERT(CHAR(26),region),  \r\n   [Oil-producing countries]=LEFT(members,CHARINDEX(',|',members+',|')-1) FROM \r\n(SELECT region, 'members'= \r\n        REPLACE( \r\n           SUBSTRING(list,CHARINDEX('|'+regions.region+'|',list),8000), \r\n           '|'+region+'|', \r\n           '') \r\nFROM (SELECT 'list'=@List)f \r\nCROSS JOIN ( \r\n   SELECT region FROM oilproducers GROUP BY region \r\n   )regions \r\n)g \r\n\r\n\/* \r\nResults (Truncated to the right) \r\nregion                     Oil-producing countries \r\n-------------------------- ----------------------------------------------------... \r\nAustralia and New Zealand  Australia,New Zealand \r\nCaribbean                  Virgin Islands,Cuba,Trinidad and Tobago,Aruba,Barbad... \r\nCentral America            Belize,Nicaragua,Guatemala,Mexico \r\nEastern Africa             Zambia,Madagascar \r\nEastern Asia               Korea, North,Mongolia,Taiwan,Korea, South,China,Japan \r\nEastern Europe             Czech Republic,Poland,Belarus,Ukraine,Hungary,Romani... \r\nMelanesia                  Papua New Guinea \r\nMiddle Africa              Congo,Chad,Equatorial Guinea,Gabon,Angola,Cameroon \r\nNorth America              Canada, United States \r\nNorthern Africa            Algeria,Egypt,Libya,Tunisia,Sudan,Morocco \r\nNorthern Europe            Finland,Sweden,Lithuania,Estonia,United Kingdom,Norw... \r\nRussia                     Russia \r\nSouth America              Ecuador,Argentina,Brazil,Colombia,Venezuela,Peru,Sur... \r\nSouth-central Asia         Kyrgyzstan,Bangladesh,Pakistan,Tajikistan,Turkmenist... \r\nSouth-eastern Asia         Malaysia,Indonesia,Brunei,Vietnam,Thailand,Singapore... \r\nSouthern Africa            South Africa \r\nSouthern Europe            Italy,Croatia,Spain,Greece,Albania,Slovenia \r\nWestern Africa             Sierra Leone,Ghana,Nigeria,Mauritania,Cote d'Ivoire \r\nWestern Asia               Turkey,Bahrain,Yemen,Syria,United Arab Emirates,Saud... \r\nWestern Europe             Switzerland,Belgium,Germany,Netherlands,France,Austr... \r\n*\/ \r\n<\/pre>\n<p>Now this technique could get quite interesting if combined with &#8216;ranging&#8217; where you can impose categories onto the data of your choice (Date ranges are very common).<\/p>\n<h2 id=\"sixth\">Ranging<\/h2>\n<p>By ranging, I mean imposinbg aribrary value ranges, and grouping by them you can, of course, use a helper table to do this much more elegantly.<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">-- Ranging can be by column \r\nSELECT \r\n[range]='No. of Countries',  \r\n[&lt; 10,000 bpd]=SUM(CASE WHEN Barrelsperday&lt;10000 THEN 1 ELSE 0 END), \r\n[10,000 to 100,000]=SUM(CASE WHEN Barrelsperday BETWEEN 10000 AND 100000   \r\n                        THEN 1 ELSE 0  \r\n                    END), \r\n[100,000 to 1m bpd]=SUM(CASE WHEN Barrelsperday BETWEEN 100001 AND 1000000  \r\n                        THEN 1 ELSE 0  \r\n                    END), \r\n[1m to 10m bpd]=SUM(CASE WHEN Barrelsperday BETWEEN 1000001 AND 10000000  \r\n                         THEN 1 ELSE 0  \r\n                    END), \r\n[more than 10m bpd]=SUM(CASE WHEN Barrelsperday &gt; 10000000  THEN 1 ELSE 0 END) \r\nFROM OilProducers \r\n\r\n--or by row... \r\nSELECT [category]=CASE WHEN Barrelsperday&lt;10000 THEN '1\/ &gt; 10,000 bpd'  \r\n   WHEN Barrelsperday BETWEEN 10000   AND 100000 THEN '2\/ 10,000 to 100,000 ' \r\n    WHEN Barrelsperday BETWEEN 100001  AND 1000000 THEN '3\/ 100,000 to 1m bpd' \r\n    WHEN Barrelsperday BETWEEN 1000001 AND 10000000 THEN '4\/ 1m to 10m bpd' \r\n    WHEN Barrelsperday &gt; 10000000  THEN '5\/ more than 10m bpd' END, \r\n   [total]=SUM(BarrelsperDay), \r\n   [No. of Countries]=COUNT(*) \r\nFROM OilProducers \r\nGROUP BY CASE WHEN Barrelsperday&lt;10000  THEN '1\/ &gt; 10,000 bpd'  \r\n   WHEN Barrelsperday BETWEEN 10000   AND 100000 THEN '2\/ 10,000 to 100,000' \r\n    WHEN Barrelsperday BETWEEN 100001  AND 1000000 THEN '3\/ 100,000 to 1m bpd' \r\n    WHEN Barrelsperday BETWEEN 1000001 AND 10000000 THEN '4\/ 1m to 10m bpd' \r\n    WHEN Barrelsperday &gt; 10000000  THEN '5\/ more than 10m bpd' END \r\n    <\/pre>\n<p>The code above is much more efficient than it looks, but why not make a User-defined function to do it?<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">CREATE FUNCTION OilOutputCategory \r\n( \r\n   @OilOutput INT \r\n) \r\nRETURNS VARCHAR(30) \r\nAS \r\nBEGIN \r\nRETURN CASE WHEN @OilOutput&lt;10000 THEN '1\/ &gt; 10,000 bpd'  \r\n   WHEN @OilOutput BETWEEN 10000   AND 100000 THEN '2\/ 10,000 to 100,000' \r\n    WHEN @OilOutput BETWEEN 100001  AND 1000000 THEN '3\/ 100,000 to 1m bpd' \r\n    WHEN @OilOutput BETWEEN 1000001 AND 10000000 THEN  '4\/ 1m to 10m bpd' \r\n    WHEN @OilOutput &gt; 10000000  THEN '5\/ more than 10m bpd' END \r\n\r\nEND \r\nGO \r\n--so the query becomes... \r\nSELECT [category]=dbo.OilOutputCategory(BarrelsPerDay), \r\n   [total]=SUM(BarrelsperDay), \r\n   [producers]=COUNT(*) \r\nFROM OilProducers \r\nGROUP BY dbo.OilOutputCategory(BarrelsPerDay) \r\n<\/pre>\n<p>So let&#8217;s combine ranging and enumeration!<\/p>\n<p>We can list all the countries that fall in each range category&#8230;.<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">DECLARE @list VARCHAR( 8000) \r\nSELECT @List=COALESCE(@list+',','')+catfudge \r\nFROM (--the concatenation of the string didn't work until we did this! \r\n   SELECT TOP 1000 [catfudge]= \r\n   ('|'+dbo.OilOutputCategory(BarrelsPerDay) +'|'+ country+'|' \r\n         +dbo.OilOutputCategory(BarrelsPerDay) +'|'), \r\n   [cat]=dbo.OilOutputCategory(BarrelsPerDay) \r\nFROM  OilProducers ORDER BY cat)f \r\n\r\nSELECT [category]=CONVERT(CHAR(26),category),  \r\n   [Oil-producing countries]=LEFT(members,CHARINDEX(',|',members+',|')-1) FROM \r\n(SELECT category, 'members'= \r\n       REPLACE( \r\n           SUBSTRING(list,CHARINDEX('|'+ categories.category+'|',list),8000), \r\n           '|'+categories.category+'|', \r\n           '') \r\nFROM (SELECT 'list'=@List)f \r\nCROSS JOIN ( \r\n   SELECT [category]=dbo.OilOutputCategory(BarrelsPerDay) FROM oilproducers  \r\n       GROUP BY dbo.OilOutputCategory(BarrelsPerDay) \r\n   )categories \r\n)g \r\n\r\n\/* \r\nResults (Truncated to the right) \r\ncategory                   Oil-producing countries \r\n-------------------------- -------------------------------------------------... \r\n1\/ &gt; 10,000 bpd            Singapore,Burma,Finland,Taiwan,Ghana,Korea, South,... \r\n2\/ 10,000 to 100,000       Tunisia,Netherlands,Ukraine,Mauritania,France,Cuba... \r\n3\/ 100,000 to 1m bpd       Syria,Sudan,Oman,Qatar,Romania,Italy,Malaysia,Chad... \r\n4\/ 1m to 10m bpd           Venezuela,United Arab Emirates,United Kingdom,Indo... \r\n5\/ more than 10m bpd       Saudi Arabia \r\n*\/ \r\n<\/pre>\n<h2 id=\"seventh\">Grouping as a utility<\/h2>\n<p>One can use grouping in a variety of ways that have nothing to do with reporting . Here is an example of using <code>GROUP BY<\/code> to help produce a table by splitting a delimited list. It needs the Number Helper Table (You&#8217;ll have to refer to the &#8216;Helper Table Workbench to find out about that).<\/p>\n<pre class=\"lang:tsql theme:ssms2012\">CREATE FUNCTION [dbo].[uftSplitStringGroup] \r\n( \r\n @DelimitedList VARCHAR (8000), \r\n @Delimiter VARCHAR(20)=',' \r\n) \r\nRETURNS \r\n@Results TABLE \r\n( \r\n SeqNo INT IDENTITY(1, 1), \r\n Item VARCHAR(8000) \r\n) \r\nAS \r\nBEGIN \r\nDECLARE @String VARCHAR(8000) \r\nDECLARE @LenDelimiter INT \r\n\r\nSELECT @String=@Delimiter+@DelimitedList+@Delimiter, \r\n   @LenDelimiter=LEN( @Delimiter) \r\n\r\nINSERT INTO @results (item) \r\n   SELECT SUBSTRING(@string, \r\n                     s1.number+@LenDelimiter, \r\n                     MIN(s2.number)-s1.number-@LenDelimiter) \r\n     FROM    (SELECT [string]= @String )f \r\n   CROSS JOIN numbers s1  \r\n   INNER JOIN numbers s2 ON s1.number&lt; s2.number \r\n   WHERE s1.number&lt;=LEN (@string) \r\n       AND  s2.number&lt;=LEN (@string) \r\n       AND SUBSTRING(@string,s1.number,@LenDelimiter)=@Delimiter \r\n       AND SUBSTRING(@string,s2.number,@LenDelimiter)=@Delimiter \r\n   GROUP BY s1.number \r\n RETURN \r\nEND \r\n\/* \r\nselect * from uftSplitStringGroup('first,second,third',default) \r\nselect * from uftSplitStringGroup('&lt;&gt;this is something&lt;&gt;something else&lt;&gt;','&lt;&gt;') \r\nselect * from uftSplitStringGroup( \r\n 'Turkey,Bahrain,Yemen,Syria,United Arab Emirates,Saudi Arabia,Qatar,Kuwait',',') \r\n <\/pre>\n<h2 id=\"eighth\">Questions<\/h2>\n<p>Send your answers to <a href=\"mailto:editor@simple-talk.com\">editor@simple-talk.com<\/a>. The first three correct entries will receive a much-coveted Simple-Talk goodie bag (polo shirt, USB key, bottle opener, beermat, pen).<\/p>\n<ol>\n<li>How would you get the count of the number of rows in a table with NULLS in a particular column, using GROUP BY, but without using a <code>COALESCE<\/code> or<code> CASE<\/code> statement?<\/li>\n<li>What is the <code>GROUPING()<\/code> function useful for?<\/li>\n<li>Can a <code>WHERE<\/code> clause contain an aggregate function?<\/li>\n<li>When would you need to use a <code>HAVING<\/code> clause?<\/li>\n<li>What does the <code>ROLLUP<\/code> do? How would you use it?<\/li>\n<li>Can you use UDFs (user-defined scalar-valued functions) in <code>GROUP BY<\/code> clauses?<\/li>\n<\/ol>\n<div class=\"note\">\n<h3>See also other Workbenches at Simple-Talk<\/h3>\n<p><a href=\"http:\/\/www.simple-talk.com\/sql\/learn-sql-server\/robyn-pages-sql-server-datetime-workbench\/\">Robyn Page&#8217;s SQL Server DATE\/TIME Workbench<\/a>, Robyn Page Date calculation and formatting in SQL Server can be surprisingly tricky. Robyn Page&#8217;s &#8220;hands-on&#8221; workbench will lead you through the minefield.<\/p>\n<p><a href=\"http:\/\/www.simple-talk.com\/sql\/learn-sql-server\/robyn-pages-sql-server-string-manipulation-workbench\/\">Robyn Page&#8217;s SQL Server String Manipulation Workbench<\/a>, Robyn Page String searching and manipulation in SQL Server can be error-prone and tedious&#8230;unless you&#8217;re armed with the techniques described in Robyn&#8217;s string manipulation workbench&#8230;<\/p>\n<p><a href=\"http:\/\/www.simple-talk.com\/sql\/t-sql-programming\/sql-server-error-handling-workbench\/\">SQL Server Error Handling Workbench<\/a>, Grant Fritchey Grant Fritchey steps into the workbench arena, with an example-fuelled examination of catching and gracefully handling errors in SQL 2000 and 2005, including worked examples of the new TRY..CATCH capabilities.<\/p>\n<p><a href=\"http:\/\/www.simple-talk.com\/sql\/learn-sql-server\/robyn-pages-sql-server-cursor-workbench\/\">Robyn Page&#8217;s SQL Server Cursor Workbench<\/a>, Robyn Page The topic of cursors is the ultimate &#8220;hot potato&#8221; in the world of SQL Server. Everyone has a view on when they should and mainly should not be used. By example and testing Robyn Page proves that, when handled with care, cursors are not necessarily a &#8220;bad thing&#8221;.<\/p>\n<p><a href=\"http:\/\/www.simple-talk.com\/sql\/learn-sql-server\/robyn-pages-sql-server-data-validation-workbench\/\">Robyn Page&#8217;s SQL Server Data Validation Workbench<\/a>, Robyn Page Robyn Page provides essential techniques for ensuring the validity of the data being entered into your SQL Server tables.<\/p>\n<p><a href=\"http:\/\/www.simple-talk.com\/sql\/t-sql-programming\/sql-server-excel-workbench\/\">Robyn Page&#8217;s Excel Workbench<\/a>, Robyn Page and Phil Factor The need to produce Excel reports from SQL Server is very common. Here, Robyn Page and Phil Factor present practical techniques for creating and manipulating Excel spreadsheets from SQL Server, using linked servers and T-SQL. The pi\u00e8ce de r\u00e9sistance is a stored procedure that uses OLE Automation&#8230;<\/p>\n<p><a href=\"http:\/\/www.simple-talk.com\/sql\/t-sql-programming\/sql-server-security-workbench-part-1\/\">Robyn Page&#8217;s SQL Server Security Workbench<\/a>, Robyn Page and Phil Factor Robyn Page and Phil Factor present practical T-SQL techniques for controlling access to sensitive information within the database, and preventing malicious SQL injection attacks.<\/p>\n<p><a href=\"http:\/\/www.simple-talk.com\/sql\/t-sql-programming\/the-helper-table-workbench\/\">SQL Server Helper Table Workbench<\/a>, Robyn Page and Phil Factor Cursors and iterations are both renowned for slowing down Transact SQL code, but sometimes seem unavoidable. In this workbench, Robyn Page and Phil Factor demonstrate some set-based techniques for string manipulation and time interval-based reporting, which use helper tables instead.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>A gentle lesson about <code>GROUP BY<\/code> on the Nursery Slopes develops gradually into a wild ride off-piste amongst the pine-trees.&hellip;<\/p>\n","protected":false},"author":154613,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[143531],"tags":[4761,4760,4731,4763,4762,4150,4151,4183,4252,4460],"coauthors":[6813,6814],"class_list":["post-256","post","type-post","status-publish","format-standard","hentry","category-t-sql-programming-sql-server","tag-aggregate-functions","tag-group-by","tag-pivot-tables","tag-ranging","tag-reporting-queries","tag-sql","tag-sql-server","tag-t-sql","tag-t-sql-programming","tag-workbench"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/256","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/154613"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=256"}],"version-history":[{"count":11,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/256\/revisions"}],"predecessor-version":[{"id":74377,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/256\/revisions\/74377"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=256"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=256"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=256"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=256"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}