{"id":73146,"date":"2015-09-30T15:43:19","date_gmt":"2015-09-30T15:43:19","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/uncategorized\/introduction-to-user-defined-aggregate-functions\/"},"modified":"2021-07-14T13:07:20","modified_gmt":"2021-07-14T13:07:20","slug":"introduction-to-user-defined-aggregate-functions","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/oracle-databases\/introduction-to-user-defined-aggregate-functions\/","title":{"rendered":"Introduction to User-Defined Aggregate Functions"},"content":{"rendered":"<p>OK, let&#8217;s speed past the easy bits, the parts we all already know: standard aggregate functions. Aggregate functions, unlike regular functions, take values from multiple rows as their input. The category includes those aggregate functions that are so ordinary they&#8217;re almost invisible \u2013 SUM, COUNT, MAX \u2013 and a couple that most of us never use \u2013 such as APPROX_COUNT_DISTINCT.<\/p>\n<p>In spite of such arcane functions there are always gaps that need plugging (indeed functions as un-extraordinary as median and mode were only added at 10g). I&#8217;m sure the guys at Oracle are all coffee-addled geniuses who work 24 and a half hours every day, but they&#8217;ll never be able to add every aggregate function to meet every single need.<\/p>\n<h5>User-Defined Aggregate Functions<\/h5>\n<p>Which is why, with 9i, the ability to create our own aggregate functions was added to Oracle. Before this, you&#8217;d have needed to write PL\/SQL to replicate the capabilities of an aggregate function. Now you can create your own aggregate function, and call it directly from your SQL.<\/p>\n<p>However, before we go crazy and dive right in, let&#8217;s agree on a problem to solve with our user-defined aggregate function. Let&#8217;s imagine that the standard EMP and DEPT tables belong to a small company that we run together, Scott Tiger Enterprises. Let&#8217;s also imagine that the government has just introduced a new law stating that the tax our company will pay is equal to the sum of the three largest salaries we pay our employees. An aggregate function will be perfect for determining the figure.<\/p>\n<h5>The ODCI Aggregate Routines<\/h5>\n<p>To write a user-defined aggregate function, we must have a passing acquaintance with Oracle Data Cartridges. Data cartridges, using the Oracle Extensibility Architecture framework, extend the abilities of the Oracle server.\u00a0 Don&#8217;t worry too much about it at this point; just know that we&#8217;ll be using an API from the data cartridge (ODCI) arsenal to build our aggregate function.<\/p>\n<p>The 4 ODCI functions we will use are as follows:<\/p>\n<ul>\n<li><em>ODCIAggregate<strong>Initialize<\/strong>(ctx IN OUT our_aggregate_object_type) RETURN NUMBER<\/em><br \/>\nThis is the function within which we initialize our persistent variables.\u00a0 It executes once for each distinct member of the group by field (or just once, if your eventual query does not contain a group by).<\/li>\n<li><em>ODCIAggregate<strong>Iterate<\/strong>(self IN OUT our_aggregate_object_type, our_parameter_to_aggregate IN some_data_type) RETURN NUMBER<\/em><br \/>\nThis function executes once for <em>each row<\/em> returned by the query. This is where the meat of our aggregate function will live.<\/li>\n<li><em>ODCIAggregate<strong>Merge<\/strong>(self IN OUT our_aggregate_object_type, ctx2 IN our_aggregate_object_type) RETURN NUMBER<\/em><br \/>\nFor our purposes here, let&#8217;s just say that the merge function simply returns<em> ODCIconst.success<\/em> (a numeric constant). There&#8217;s more to it, but it&#8217;s outside the scope of what we&#8217;re talking about today. You can build your aggregate function without knowing it.<\/li>\n<li><em>ODCIAggregate<strong>Terminate<\/strong>(self IN our_aggregate_object_type, returnvalue OUT some_data_type, flags IN NUMBER) RETURN NUMBER<\/em><br \/>\nThis function executes once for each distinct member of the group by field.<\/li>\n<\/ul>\n<p>To use these functions, we will need to create an object type to implement them.\u00a0 And then, finally, we will create our user-defined aggregate function using our object.<\/p>\n<p>It might all seem a little overwhelming at first, but, in truth, the steps are simple:<\/p>\n<ol>\n<li>Create an object spec containing any persistent variables we might need, and declaring ODCIAggregate<strong>Initialize<\/strong>, ODCIAggregate<strong>Iterate<\/strong>, ODCIAggregate<strong>Merge<\/strong> and ODCIAggregate<strong>Terminate<\/strong>.<\/li>\n<li>Create an object body containing the bodies of your ODCI functions. Bear in mind that:\n<ul>\n<li>ODCIAggregate<strong>Initialize<\/strong> will run once at the start of the process;<\/li>\n<li>ODCIAggregate<strong>Iterate<\/strong> runs for each row, and is therefore where you will do all of your work;<\/li>\n<li>and ODCIAggregate<strong>Terminate<\/strong> runs once at the end of the whole process.<\/li>\n<\/ul>\n<\/li>\n<li>Create your aggregate function based on your object.<\/li>\n<\/ol>\n<h5>Example User-Defined Aggregate Function<\/h5>\n<p>An example will probably help. Let&#8217;s create an aggregate function, Taxbill, to find out how much\u00a0 Scott Tiger Enterprises will have to pay. Remember it&#8217;s the sum of the three highest salaries on our books.<\/p>\n<p>First we need to create our object spec. Let&#8217;s call it newTaxFncs<\/p>\n<pre>CREATE OR REPLACE TYPE newTaxFncs AS OBJECT (\r\n\tvTaxBill        NUMBER,\r\n\tvHighest\tNUMBER,\r\n\tv2ndHigh\tNUMBER,\r\n\tv3rdHigh\tNUMBER,\r\n\t\r\n\tSTATIC FUNCTION ODCIAggregateInitialize (ctx IN OUT newTaxFncs) RETURN NUMBER,\r\n\tMEMBER FUNCTION ODCIAggregateIterate (self IN OUT newTaxFncs, pSal IN NUMBER) RETURN NUMBER, \r\n\tMEMBER FUNCTION ODCIAggregateMerge(self IN OUT newTaxFncs, ctx2 newTaxFncs) RETURN NUMBER,\r\n\tMEMBER FUNCTION ODCIAggregateTerminate (self IN newTaxFncs, taxVal OUT NUMBER, flags IN NUMBER) RETURN NUMBER\r\n\t);\r\n<\/pre>\n<p>Now we need to write our object type body. Remember that ODCIAggregateInitialize initializes our variables, vTaxBill, vHighest, v2ndHigh and v3rdHigh; ODCIAggregateIterate does all the legwork; and ODCIAggregateTerminate returns the final figure.<\/p>\n<pre>CREATE OR REPLACE TYPE BODY newTaxFncs IS \r\n\r\n\tSTATIC FUNCTION ODCIAggregateInitialize (ctx IN OUT newTaxFncs) RETURN NUMBER IS\r\n\tBEGIN \r\n                -- Initialize our variables: vTaxBill, vHighest, v2ndHigh, v3rdHigh\r\n\t\tctx\t:= newTaxFncs(0,0,0,0);\r\n\t\tRETURN ODCIConst.success;\r\n\tEND ODCIAggregateInitialize;\r\n\t\r\n\tMEMBER FUNCTION ODCIAggregateIterate (self IN OUT newTaxFncs, \r\n                                              pSal IN NUMBER) \r\n        RETURN NUMBER IS\r\n\t\r\n\tBEGIN \r\n                -- If this is one of the top 3 salaries, save it in one of the variables.\r\n\t\tIF pSal &gt;= self.vHighest THEN \r\n\t\t\tself.v3rdHigh\t:= self.v2ndHigh;\r\n\t\t\tself.v2ndHigh\t:= self.vHighest;\r\n\t\t\tself.vHighest\t:= pSal;\r\n\t\tELSIF pSal &gt;= self.v2ndHigh THEN \r\n\t\t\tself.v3rdHigh\t:= self.v2ndHigh;\r\n\t\t\tself.v2ndHigh\t:= pSal;\r\n\t\tELSIF pSal &gt;= self.v3rdHigh THEN\r\n\t\t\tself.v3rdHigh\t:= pSal;\r\n\t\tEND IF;\r\n\t\t\r\n\t\tRETURN ODCIconst.success;\r\n\tEND ODCIAggregateIterate;\r\n\t\r\n\tMEMBER FUNCTION ODCIAggregateMerge (self IN OUT newTaxFncs, \r\n                                            ctx2 IN newTaxFncs) \r\n        RETURN NUMBER IS\r\n\t\r\n\tBEGIN \r\n\t\tRETURN ODCIconst.success;\r\n\tEND ODCIAggregateMerge;\r\n\t\r\n\tMEMBER FUNCTION ODCIAggregateTerminate (self   IN newTaxFncs, \r\n                                                taxVal OUT NUMBER,\r\n\t                                        flags  IN NUMBER) \r\n        RETURN NUMBER IS \r\n\t\r\n\tBEGIN \r\n                -- The variable returned OUT by Terminate is what the aggregate function will use. \r\n\t\ttaxVal\t:= self.vHighest + self.v2ndHigh + self.v3rdHigh;\r\n\t\tRETURN ODCIconst.success;\r\n\tEND ODCIAggregateTerminate;\r\nEND;\r\n\/\r\n<\/pre>\n<p>Finally, we need to create our aggregate function using the object that we&#8217;ve just created. Pay attention to the syntax &#8216;cos it&#8217;s slightly different from that of a standard function.<\/p>\n<pre>CREATE OR REPLACE FUNCTION taxbill (pSal NUMBER) AS \"Taxbill\"\r\nRETURN NUMBER\r\nAGGREGATE USING newTaxFncs;\r\n<\/pre>\n<p>Now we can use it. Just like any other aggregate function. How cool is that?<\/p>\n<pre>SELECT taxbill(sal)\r\n  FROM emp;\r\n<\/pre>\n<table>\n<tbody>\n<tr>\n<td width=\"616\"><strong>Taxbill<\/strong><\/td>\n<\/tr>\n<tr>\n<td width=\"616\">11000<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<pre>SELECT deptno, taxbill(sal) AS \"Taxbill\"\r\n  FROM emp\r\nGROUP BY deptno;\r\n<\/pre>\n<table>\n<tbody>\n<tr>\n<td><strong>DEPTNO<\/strong><\/td>\n<td><strong>Taxbill<\/strong><\/td>\n<\/tr>\n<tr>\n<td>10<\/td>\n<td>8750<\/td>\n<\/tr>\n<tr>\n<td>20<\/td>\n<td>8975<\/td>\n<\/tr>\n<tr>\n<td>30<\/td>\n<td>5950<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h5>Finally<\/h5>\n<p>I feel there are a couple of things I need to mention before we move on.<\/p>\n<ol>\n<li>Our example user-defined aggregate function was number-based, but Oracle allows them accept VARCHAR2, dates, and other scalar data types, and complex ones too.<\/li>\n<li>Our object type is built on PL\/SQL; however, we could have used any supported language, such as C, C++ or Java. I&#8217;m illiterate in those languages, so it was an easy decision for me.<\/li>\n<\/ol>\n<h5><a href=\"https:\/\/allthingsoracle.com\/introduction-to-analytic-functions-part-1-2\/\">Analytic Functions<\/a><\/h5>\n<p>Remember back in the day when all new televisions had stickers that read <em>HD-ready<\/em> plastered all over them? Well, I feel that user-defined aggregate functions should come with an &#8220;Analytic Function-ready&#8221; sticker on them. That is because you can use any aggregate functions you build as analytic functions.<\/p>\n<p>Imagine we wanted to see how our tax bill has changed over the years with each new member of staff.<\/p>\n<pre>SELECT ename, job, hiredate, sal, taxbill(sal) OVER (order by hiredate) AS \"Taxbill\"\r\n  FROM emp\r\nORDER BY hiredate;\r\n<\/pre>\n<table>\n<tbody>\n<tr>\n<td><strong>ENAME<\/strong><\/td>\n<td><strong>JOB<\/strong><\/td>\n<td><strong>HIREDATE<\/strong><\/td>\n<td><strong>SAL<\/strong><\/td>\n<td><strong>Taxbill<\/strong><\/td>\n<\/tr>\n<tr>\n<td>SMITH<\/td>\n<td>CLERK<\/td>\n<td>12\/17\/1980<\/td>\n<td>800<\/td>\n<td>800<\/td>\n<\/tr>\n<tr>\n<td>ALLEN<\/td>\n<td>SALESMAN<\/td>\n<td>02\/20\/1981<\/td>\n<td>1600<\/td>\n<td>2400<\/td>\n<\/tr>\n<tr>\n<td>WARD<\/td>\n<td>SALESMAN<\/td>\n<td>02\/22\/1981<\/td>\n<td>1250<\/td>\n<td>3650<\/td>\n<\/tr>\n<tr>\n<td>JONES<\/td>\n<td>MANAGER<\/td>\n<td>04\/02\/1981<\/td>\n<td>2975<\/td>\n<td>5825<\/td>\n<\/tr>\n<tr>\n<td>BLAKE<\/td>\n<td>MANAGER<\/td>\n<td>05\/01\/1981<\/td>\n<td>2850<\/td>\n<td>7425<\/td>\n<\/tr>\n<tr>\n<td>CLARK<\/td>\n<td>MANAGER<\/td>\n<td>06\/09\/1981<\/td>\n<td>2450<\/td>\n<td>8275<\/td>\n<\/tr>\n<tr>\n<td>TURNER<\/td>\n<td>SALESMAN<\/td>\n<td>09\/08\/1981<\/td>\n<td>1500<\/td>\n<td>8275<\/td>\n<\/tr>\n<tr>\n<td>MARTIN<\/td>\n<td>SALESMAN<\/td>\n<td>09\/28\/1981<\/td>\n<td>1250<\/td>\n<td>8275<\/td>\n<\/tr>\n<tr>\n<td>KING<\/td>\n<td>PRESIDENT<\/td>\n<td>11\/17\/1981<\/td>\n<td>5000<\/td>\n<td>10825<\/td>\n<\/tr>\n<tr>\n<td>FORD<\/td>\n<td>ANALYST<\/td>\n<td>12\/03\/1981<\/td>\n<td>3000<\/td>\n<td>10975<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Or imagine we wanted to pay our tax department by department.<\/p>\n<pre>SELECT ename, job, sal, deptno, taxbill(sal) OVER (PARTITION BY deptno) AS \"Taxbill\"\r\n  FROM emp\r\nORDER BY deptno;\r\n<\/pre>\n<table>\n<tbody>\n<tr>\n<td><strong>ENAME<\/strong><\/td>\n<td><strong>JOB<\/strong><\/td>\n<td><strong>SAL<\/strong><\/td>\n<td><strong>DEPTNO<\/strong><\/td>\n<td><strong>Taxbill<\/strong><\/td>\n<\/tr>\n<tr>\n<td>CLARK<\/td>\n<td>MANAGER<\/td>\n<td>2450<\/td>\n<td>10<\/td>\n<td>8750<\/td>\n<\/tr>\n<tr>\n<td>MILLER<\/td>\n<td>CLERK<\/td>\n<td>1300<\/td>\n<td>10<\/td>\n<td>8750<\/td>\n<\/tr>\n<tr>\n<td>KING<\/td>\n<td>PRESIDENT<\/td>\n<td>5000<\/td>\n<td>10<\/td>\n<td>8750<\/td>\n<\/tr>\n<tr>\n<td>FORD<\/td>\n<td>ANALYST<\/td>\n<td>3000<\/td>\n<td>20<\/td>\n<td>8975<\/td>\n<\/tr>\n<tr>\n<td>SCOTT<\/td>\n<td>ANALYST<\/td>\n<td>3000<\/td>\n<td>20<\/td>\n<td>8975<\/td>\n<\/tr>\n<tr>\n<td>JONES<\/td>\n<td>MANAGER<\/td>\n<td>2975<\/td>\n<td>20<\/td>\n<td>8975<\/td>\n<\/tr>\n<tr>\n<td>SMITH<\/td>\n<td>CLERK<\/td>\n<td>800<\/td>\n<td>20<\/td>\n<td>8975<\/td>\n<\/tr>\n<tr>\n<td>ADAMS<\/td>\n<td>CLERK<\/td>\n<td>1100<\/td>\n<td>20<\/td>\n<td>8975<\/td>\n<\/tr>\n<tr>\n<td>WARD<\/td>\n<td>SALESMAN<\/td>\n<td>1250<\/td>\n<td>30<\/td>\n<td>5950<\/td>\n<\/tr>\n<tr>\n<td>MARTIN<\/td>\n<td>SALESMAN<\/td>\n<td>1250<\/td>\n<td>30<\/td>\n<td>5950<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h5><strong>Conclusion<\/strong><\/h5>\n<p>So that&#8217;s it.<\/p>\n<p>Once created, your aggregate functions will slide in right beside Oracle&#8217;s standard ones, and enjoy all the advantages &#8211; and disadvantages &#8211; that they have. \u00a0Try running one against an unindexed column in a very large table, and you&#8217;ll suffer a performance hit. Unsurprisingly. \u00a0Use one to condense a very complicated process into a single line of SQL, and you&#8217;ll enjoy a readability bonus. Again, unsurprisingly.<\/p>\n<p>So, go forth, and use your new knowledge wisely.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>OK, let&#8217;s speed past the easy bits, the parts we all already know: standard aggregate functions. Aggregate functions, unlike regular functions, take values from multiple rows as their input. The category includes those aggregate functions that are so ordinary they&#8217;re almost invisible \u2013 SUM, COUNT, MAX \u2013 and a couple that most of us never use \u2013 such as APPROX_COUNT_DISTINCT&hellip;<\/p>\n","protected":false},"author":221907,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[143533],"tags":[48386,48536],"coauthors":[],"class_list":["post-73146","post","type-post","status-publish","format-standard","hentry","category-oracle-databases","tag-data-cartridges","tag-user-defined-aggregate-functions"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/73146","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/221907"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=73146"}],"version-history":[{"count":1,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/73146\/revisions"}],"predecessor-version":[{"id":91642,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/73146\/revisions\/91642"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=73146"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=73146"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=73146"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=73146"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}