{"id":102750,"date":"2024-07-16T00:14:13","date_gmt":"2024-07-16T00:14:13","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=102750"},"modified":"2026-03-18T01:08:58","modified_gmt":"2026-03-18T01:08:58","slug":"making-temporal-databases-work-part-2-computing-aggregates-across-temporal-versions","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/postgresql\/making-temporal-databases-work-part-2-computing-aggregates-across-temporal-versions\/","title":{"rendered":"PostgreSQL Temporal Aggregates: SUM, AVG &#038; COUNT Across Time"},"content":{"rendered":"<p>Computing aggregates (SUM, AVG, COUNT, MIN, MAX) in a PostgreSQL temporal database requires special handling because column values can change within the query period. A standard SQL AVG on salary would produce misleading results if an employee\u2019s salary changed mid-quarter. Temporal aggregates solve this by weighting each value by the fraction of the query period during which it was valid &#8211; for example, if a salary of $80K was valid for two months and $90K for one month of a quarter, the temporal average is ($80K \u00d7 2\/3) + ($90K \u00d7 1\/3). This article defines custom PostgreSQL aggregate functions that handle these fractional period calculations and demonstrates them with working SQL examples.<\/p>\n<h2>Introduction<\/h2>\n<p>The temporal database is a database that can keep information on time when the facts represented in the database were, are, or will be valid. We briefly described major concepts of temporal databases and discussed types of queries that such databases can support in <a href=\"https:\/\/www.red-gate.com\/simple-talk\/databases\/postgresql\/saving-data-historically-with-temporal-tables-part-1-queries\/\">part 1 of this article series<\/a>. 
The content of part 1 is essential for understanding part 2.<\/p>\n<p>In part 2 we discuss what kinds of aggregates can be obtained from a temporal database and how to express these aggregations in the SQL language. The examples in part 2 are based on the table definitions and data from part 1 of this article.<\/p>\n<p>Before we start discussing aggregates, we recall that a row of a temporal table has a period of validity represented as a pair of timestamps. The lower boundary (the start of the period) is included in the period and the upper boundary (the end of the period) is not. A validity period is associated with every value, but we assume that the periods of all attributes of a row are the same. We also assume that the periods represent <em>valid<\/em> time (rather than system time), that is, the time when the information was or is correct in reality but not necessarily stored in the database.<\/p>\n<p><em>Note: If you want to restart with a clean database, the code that creates the structure and data from Chapter 1 is located in <a href=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2024\/06\/Making-Temporal-Databases-Work.-Part-2-Computing-Aggregates-CreatePart1ObjectsAndData.txt\">this file<\/a> on the Simple Talk site.<\/em><\/p>\n<h2>Why Are Aggregates Special?<\/h2>\n<p>An aggregate is a value calculated using column values from several rows. Usually aggregates represent information about a large amount of data in a compact form, typically as a single value that is easy for humans to interpret.<\/p>\n<p>A single numeric value of an aggregate may represent thousands or even millions of values in OLAP queries. 
The most common aggregates defined in the SQL standard are <code>count<\/code>, <code>sum<\/code>, <code>avg<\/code> (average), <code>max<\/code>, and <code>min<\/code>.<\/p>\n<p>As discussed in part 1, any non-temporal query, including queries containing aggregates, can easily be converted into a temporal point-in-time query. So, what is the problem with aggregates?<\/p>\n<p>As we mentioned in part 1, a point-in-time query produces non-temporal output because the rows of the result have no associated periods. Therefore, the output of such a query is not a temporal table. To obtain the full power of SQL, we need to combine query output with other temporal tables in more complex queries. Also, sometimes aggregated values are naturally associated with certain periods in the real world.<\/p>\n<p>However, an attempt to calculate aggregates for a period may produce unexpected and\/or misleading results. For example, the following query sums the salary of each person as many times as there are rows for that person that overlap with the desired target period:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">select sum(salary),'2023-01-01', '2024-01-01'  \nfrom emp_temporal;<\/pre>\n<p>The output of this is silly: first, because it inflates the salaries of people whose salary amounts changed over time; second, because there is little value in adding such data together in the first place.<\/p>\n<p>A more meaningful result can be obtained if we split the target period (the whole year) into smaller periods such that the value of salary does not change during any small period. The result will contain a separate row for each small period.<\/p>\n<p>In general, the period associated with a value is calculated as the intersection of all validity periods used to calculate the output value. This does not produce any difficulties for joins, unions and other relational operations that combine a small number of arguments (usually two). 
However, for aggregates combining thousands or millions of rows the calculation of periods becomes challenging:<\/p>\n<ul>\n<li>It is computationally complex.<\/li>\n<li>The result will most likely contain a huge number of tiny periods, so it will no longer be compact and hence not easy for humans to interpret.<\/li>\n<\/ul>\n<p>We need to define temporal aggregates that produce a single value for a given period even if the values being aggregated have different validity periods that do not necessarily contain the target period. The target period may be specified as part of a query or calculated from other rows (for example, joined from another table). In both cases the period for the aggregate is not based on the periods of the aggregated values.<\/p>\n<p>For example, the company management may be interested in the average salary of employees during the second quarter of 2023. However, the salary of Eja changed on April 20, so the function avg(salary) will produce a wrong value that depends on when the salary was changed.<\/p>\n<p>Management does not want two values (one before April 20 and another after); just a single value is needed. Instead of averaging the values stored in the table, we compute a single value characterizing the salary during the period requested in the query (that is, the second quarter) for each employee and then compute the average of these values.<\/p>\n<h2>Aggregating Temporal Data<\/h2>\n<p>To make aggregation more useful we need to define more carefully how to calculate the aggregate values.<\/p>\n<p>Although it is possible to express these calculations directly in SQL, we would like to define new aggregate functions that produce a single number for any target period <code>[query_ts, query_te]<\/code> specified in a query and passed to the aggregate function as an argument.<\/p>\n<p>Our aggregate functions should return the same result as the corresponding SQL function if the values of the column do not change during the query period. 
In particular, if the query period is a point, then our functions will return the same output as the standard SQL functions.<\/p>\n<p>Further, if the intersection of a row period with the query period is smaller than the query period, we use only a fraction of the aggregated value. For example, in some scenarios, if the duration of the query period is 3 months, but a person left the project after one month, we would like to count this person as 1\/3, rather than 1, against the cost of the project. If this person worked on another project for the remaining two months, we\u2019ll count 2\/3 to the second project.<\/p>\n<p>Similarly, if a person has salary <em>S<\/em><sub>1<\/sub> for two months and <em>S<\/em><sub>2<\/sub> for the remaining month of the target period, we use the value <em>S<\/em><sub>1<\/sub>*2\/3+<em>S<\/em><sub>2<\/sub>*1\/3 as a single number representing the salary of the person during the whole period. Of course, this value is not precise, but it gives a reasonable estimation.<\/p>\n<p>So, we need to calculate fractions. Unfortunately, some widely used calendar periods have varying durations: neither months nor quarters are equal, and even leap years give us different lengths of the year. This makes our functions approximate, but we must measure the duration of periods in the same units no matter how long or short the periods are. Fortunately, it is possible to express the duration of any period in the same unit, down to the second.<\/p>\n<p>To do this, we will use the SQL function <code>extract<\/code> with <code>epoch<\/code> as its first parameter (which returns the duration of an interval in seconds) to calculate the durations of periods. For example, the duration of the query period in seconds is calculated by the following expression:<\/p>\n<pre class=\"lang:none theme:none\">extract(EPOCH FROM query_te - query_ts)<\/pre>\n<p>Note that the difference of timestamps is an interval in the PostgreSQL type system. 
The duration in seconds may be very large for long periods, but we need only the ratio of durations, which is never greater than 1.<\/p>\n<p>Similarly, we can calculate the duration of the intersection using the greatest, least, and <code>EXTRACT<\/code> functions, take some precautions to avoid a division-by-zero exception, multiply by the value of an aggregated column, and use all that as an argument for the SQL sum function.<\/p>\n<p>To avoid long constant values of timestamps and to make queries more readable, we define psql variables that contain the start and the end of the target query period:<\/p>\n<pre class=\"lang:none theme:none\">\\set query_ts  $$2023-04-01$$::timestamptz\n\\set query_te  $$2023-08-01$$::timestamptz<\/pre>\n<p>These variables are preceded with \u2018:\u2019 when used in queries. Please note that the values are substituted before the query is sent to the server (so, they are NOT bind variables). For example, the string<\/p>\n<pre class=\"lang:none theme:none\">select :query_ts;<\/pre>\n<p>is sent to the DBMS server as<\/p>\n<pre class=\"lang:none theme:none\">select $$2023-04-01$$::timestamptz;<\/pre>\n<hr \/>\n<p><em>Editor note: in DBeaver, I was able to use the following:<\/em><\/p>\n<pre class=\"lang:none theme:none\">@set query_ts = $$2023-04-01$$::timestamptz\n@set query_te = $$2023-08-01$$::timestamptz<\/pre>\n<p><em>For more information on the settings required, <a href=\"https:\/\/stackoverflow.com\/questions\/56566119\/dbeaver-how-to-declare-variables-and-use-them\">this stackoverflow post<\/a> covers it well.<\/em><\/p>\n<hr \/>\n<p>If your client does not support any similar kind of variables, you can simply replace them with constants. 
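<\/p>\n<p>For instance, the fraction of the query period covered by a single row\u2019s validity period (the row\u2019s weight) can be sketched with the following expression; the <code>nullif<\/code> here is our own precaution (not used in the full queries below) that avoids division by zero when the query period is a single point:<\/p>\n<pre class=\"lang:none theme:none\">extract(EPOCH FROM least(emp_te, :query_te) -\n        greatest(emp_ts, :query_ts)) \/\nnullif(extract(EPOCH FROM :query_te - :query_ts), 0)<\/pre>\n<p>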
We are now ready to write an SQL query that computes an aggregated value for the target period:<\/p>\n<pre class=\"lang:none theme:none\">set search_path to temp_agg;\n\nselect project, sum(\n  case\n  when salary is not null \n  then salary * \n     extract(EPOCH FROM least(emp_te, :query_te) -\n       greatest(emp_ts,:query_ts)) \/\n       extract(EPOCH FROM :query_te - :query_ts)\n  else null\n  end) as temporal_sum\nfrom emp_temporal\nwhere (emp_ts, emp_te) overlaps (:query_ts, :query_te)\ngroup by project;<\/pre>\n<p>The output of this query is:<\/p>\n<pre class=\"lang:none theme:none\">project |      temporal_sum      \n--------+------------------------\np11     |  5940.9836065573770492\np20     |  3360.6557377049180328\np15     | 13853.2786885245901640\n(3 rows)<\/pre>\n<p>The query above computes only one aggregate, but we need many more in our examples. To make subsequent queries shorter, we wrap the code into PostgreSQL functions and define PostgreSQL user-defined aggregates <code>temporal_sum<\/code>, <code>temporal_count<\/code>, and <code>temporal_avg<\/code>. These functions have four additional arguments specifying the boundaries of the validity period of the aggregated value and the query period.<\/p>\n<p>The complete code of these aggregates can be found in the <a href=\"#Appendix\">appendix<\/a>.<\/p>\n<p>All queries below produce temporal results, that is, each row of the result has an associated pair of timestamps. However, to avoid overly long rows, we do not include these timestamps in the list of output columns when the periods for all output rows are the same.<\/p>\n<p>Our first query uses temporal aggregates with a <code>GROUP BY name<\/code> clause. 
Names are unique in our table, so such a query includes exactly one row in each group, and the value of <code>count<\/code> would be 1.<\/p>\n<pre class=\"lang:none theme:none\">-- Note, you must create the objects in the Appendix or\n-- the function code will give you errors.\nselect \n    name, \n    temporal_count(salary, emp_ts, emp_te, \n                  :query_ts, :query_te) AS temporal_cnt,\n    temporal_avg(salary, emp_ts, emp_te, \n                  :query_ts, :query_te) AS temporal_avg,\n    temporal_sum(salary, emp_ts, emp_te, \n                  :query_ts, :query_te) AS temporal_sum\nfrom emp_temporal\nwhere (emp_ts,emp_te) overlaps (:query_ts, :query_te) \ngroup by name;<\/pre>\n<p>This returns:<\/p>\n<pre class=\"lang:none theme:none\"> name  | temporal_cnt |    temporal_avg    |    temporal_sum    \n-------+--------------+--------------------+--------------------\n Antti |          0.5 |               3600 |               1800\n Merja |            1 |               4200 |               4200\n Anne  |            1 |               4000 |               4000\n Timo  |            1 |  4850.819672131147 |  4850.819672131147\n Esa   |            1 |  3249.180327868852 |  3249.180327868852\n Eja   |            1 | 4753.2786885245905 | 4753.2786885245905\n(6 rows)<\/pre>\n<p>The number of returned rows (6) is correct, but one of the values for count is 0.5. This is because Antti joined the company right in the middle of the period of the query. 
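<\/p>\n<p>Assuming Antti\u2019s row became valid on 2023-06-01 (the exact date comes from the part 1 data; it is the midpoint of our four-month query period), this fraction can be checked directly:<\/p>\n<pre class=\"lang:none theme:none\">select extract(EPOCH FROM timestamptz '2023-08-01'\n                        - timestamptz '2023-06-01') \/\n       extract(EPOCH FROM timestamptz '2023-08-01'\n                        - timestamptz '2023-04-01');\n-- 61 days \/ 122 days = 0.5<\/pre>\n<p>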
So, our aggregates weight each row\u2019s contribution by the fraction of the query period it covers before computing the sum, average, or count.<\/p>\n<p>A more interesting (from a business perspective) query calculates the aggregates with grouping by project:<\/p>\n<pre class=\"lang:none theme:none\">select \n    project, \n    temporal_count(salary, emp_ts, emp_te,\n                  :query_ts, :query_te) AS temporal_cnt,\n    temporal_avg(salary, emp_ts, emp_te,\n                  :query_ts, :query_te) AS temporal_avg,\n    temporal_sum(salary, emp_ts, emp_te,\n                  :query_ts, :query_te) AS temporal_sum\nfrom emp_temporal\nwhere (emp_ts,emp_te) overlaps (:query_ts, :query_te) \ngroup by project;<\/pre>\n<p>This returns:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"527\" height=\"96\" class=\"wp-image-102751\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2024\/06\/a-number-and-text-on-a-white-background-descripti.png\" alt=\"Query output showing temporal count, average, and sum for projects p11, p20, and p15\" \/><\/p>\n<p>This output shows how intensive (and how expensive) the projects are. The non-integer values for count may look counterintuitive. However, if a person worked for 3 different projects during 3 months, our calculation will show 1\/3 for each project and a total of 1 for all 3 projects. Counting this person as 1 for each project would be misleading. Our calculation does not cut a person into fractions; it only shows that the person\u2019s effort was not fully dedicated to a single project.<\/p>\n<p>The reader is encouraged to run this query for different query periods. The results will be only slightly different. If you expect that the salary for a period of 4 months will be approximately 2 times larger than for 2 months, this will not happen. The reason is that the salary in the table is per month, no matter how long the query period is.<\/p>\n<p>How do we calculate the total to be paid during the period? 
We have yet another aggregate, <code>temporal_accum<\/code>, for that. This aggregate has one more argument: the interval of time for which the aggregated value is specified. In our case this interval is equal to one month because the table stores salary per month. If you changed that to an annual or weekly salary, the interval would be one year or one week.<\/p>\n<p>Consider a completely different application domain. If the table stores values of speed (for example, GPS readings) expressed in kilometers (or miles) per hour, then the interval will be one hour and the aggregate <code>temporal_accum<\/code> will calculate the distance covered during the query period.<\/p>\n<p>Let\u2019s now return to our table <code>emp_temporal<\/code>. The following query calculates the totals for the query period:<\/p>\n<pre class=\"lang:none theme:none\">select  project,\n        temporal_accum(1, make_interval(months =&gt; 1),\n              emp_ts, emp_te, :query_ts, \n                  :query_te) AS person_months,\n        temporal_accum(salary, make_interval(months =&gt; 1),\n              emp_ts, emp_te, :query_ts, \n                  :query_te) AS total_cost\nfrom emp_temporal\nwhere (emp_ts,emp_te) overlaps (:query_ts, :query_te) \ngroup by project;<\/pre>\n<p>This returns the following:<\/p>\n<pre class=\"lang:none theme:none\">project|person_months     |total_cost        \n-------+------------------+------------------\np11    |               6.6|22733.333333333336\np20    |3.0999999999999996|13866.666666666668\np15    |12.666666666666666|56336.666666666664\n(3 rows)<\/pre>\n<p>The first column accumulates the value 1 with the interval of one month, so the values in this column are the mythical person-months spent on the project during the query period.<\/p>\n<p>According to the title of a famous book \u2018<a href=\"https:\/\/www.amazon.com\/Mythical-Man-Month-Software-Engineering-Anniversary\/dp\/0201835959\">The Mythical Man-Month<\/a>\u2019 by F. 
Brooks, man-months were already mythical in the 1970s. They have since been renamed to the more appropriate person-months, but we think they are still mythical. Note that the unit of measure is also important here: if we replace the interval parameter with one year instead of one month, we\u2019ll get mythical person-years.<\/p>\n<p>Our last query calculates aggregates for each quarter of 2023. There is no query period for this query because different rows must have different periods (quarters). Therefore, the periods for which the aggregates are calculated come from the database (table <code>qu<\/code>), rather than from query parameters (as in previous queries).<\/p>\n<p>Our example tables do not contain any data for periods outside of 2023. In more realistic scenarios we should also explicitly specify the query period so that only quarters of the required year would be selected.<\/p>\n<p>We rounded all values except count using the <code>ceil<\/code> function, which rounds up the specified number where needed, to make the output more readable. 
The value of <code>temporal_count<\/code> is rounded using the function <code>round<\/code>.<\/p>\n<pre class=\"lang:none theme:none\">select q.code,\n     ceil(temporal_avg(e.salary, e.emp_ts, e.emp_te,      \n                          q.qu_ts,q.qu_te)) per_mon_avg,\n     ceil(temporal_sum(e.salary, e.emp_ts, e.emp_te,    \n                          q.qu_ts,q.qu_te)) per_mon,\n     ceil(temporal_accum(e.salary, make_interval(months=&gt;1),\n           e.emp_ts, e.emp_te, q.qu_ts,q.qu_te)) per_qu,\n     round(temporal_count(e.salary, e.emp_ts, e.emp_te, \n                       q.qu_ts,q.qu_te)::decimal,3) cnt\nfrom emp_temporal e\n   join qu q \n      on (e.emp_ts, e.emp_te) overlaps (q.qu_ts,q.qu_te)\ngroup by q.code;<\/pre>\n<p>This should return something like this:<\/p>\n<pre class=\"lang:none theme:none\"> code | per_mon_avg | per_mon | per_qu |        cnt        \n------+-------------+---------+--------+-------------------\n q1   |        4140 |   20700 |  62072 | 5.000\n q2   |        4214 |   22459 |  68124 | 5.330\n q4   |        4500 |   22500 |  69032 | 5.000\n q3   |        4330 |   23107 |  70860 | 5.337\n(4 rows)<\/pre>\n<p>The above query produces an overall summary for all projects. Of course, a <code>GROUP BY q.code, e.project<\/code> clause will produce more detailed data on each project per quarter. An <code>ORDER BY<\/code> clause is probably also welcome to make the management completely happy.<\/p>\n<p>All values produced by our aggregates are approximate. The reason is that different months differ in duration. Of course, the problem is in the calendar rather than in the implementation of aggregates. So, these aggregates are probably good for planning, reporting, and analytics, but not for accounting.<\/p>\n<p>We have discussed the <code>count<\/code>, <code>sum<\/code>, and <code>avg<\/code> functions so far. What about <code>min<\/code> and <code>max<\/code>? 
These two functions are much easier: taking the minimum or maximum of the values from all rows that overlap the target period produces a meaningful result.<br \/><strong>Read also:\u00a0<\/strong><a href=\"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/t-sql-programming-sql-server\/sql-server-temporal-tables\/\">SQL Server temporal tables<\/a><\/p>\n<h2>Conclusions<\/h2>\n<p>Aggregation in temporal databases over time periods requires special attention. Straightforward use of standard SQL aggregates may produce unexpected or incorrect output.<\/p>\n<p>The aggregates described in this article produce compact output that has meaningful associated periods of validity. Consequently, these aggregates can be used in queries producing temporal tables.<\/p>\n<p>Finally, although our implementation uses PostgreSQL-specific features, we tried to reduce dependency on the specific DBMS.<\/p>\n<h2>References<\/h2>\n<p>This section is identical to the corresponding section in part 1 of this article.<\/p>\n<p>During the first decades of research on temporal databases, a complete bibliography was maintained. More information on this bibliography is available in [1]. The book [2] highlights the major outcome of that research. A systematic presentation of the theoretical viewpoint on temporal databases can be found in [3].<\/p>\n<p>An article [4] provides an overview of the temporal features in SQL Standard 2011 (which were not significantly changed in subsequent editions of the Standard). It also contains the rationale for decisions made in the Standard.<\/p>\n<p>One of several practical approaches to the implementation of temporal features is described in [5]. 
The authors introduce an asserted time dimension and describe the advantages of a bi-temporal data model based on effective and asserted time dimensions.<\/p>\n<p>An article [6] introduces an alignment operation that provides an extension of relational algebra supporting temporal operations for one-dimensional time.<\/p>\n<p>Finally, an emotionally rich annotated bibliography is available at [7].<\/p>\n<ol>\n<li>Michael D. Soo. 1991. Bibliography on temporal databases. SIGMOD Rec. 20, 1 (March 1991), 14\u201323. <a href=\"https:\/\/doi.org\/10.1145\/122050.122054\">https:\/\/doi.org\/10.1145\/122050.122054<\/a><\/li>\n<li>Abdullah Tansel, James Clifford, Shashi Gadia, Sushil Jajodia, Arie Segev, and Richard T. Snodgrass (editors).\u00a0Temporal Databases: Theory, Design, and Implementation.\u00a01993.<\/li>\n<li>C. J. Date, Hugh Darwen, Nikos Lorentzos.\u00a0Time and Relational Theory, Second Edition: Temporal Databases in the Relational Model and SQL. 2nd edition, 2014.<\/li>\n<li>Krishna Kulkarni and Jan-Eike Michels. \u201cTemporal Features in SQL:2011\u201d. SIGMOD Record, September 2012.<\/li>\n<li>Tom Johnston and Randall Weis. Managing Time in Relational Databases: How to Design, Update and Query Temporal Data. 2010.<\/li>\n<li>Anton Dign\u00f6s, Michael H. B\u00f6hlen, and Johann Gamper. 2012. Temporal alignment. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data (SIGMOD &#8217;12). Association for Computing Machinery, New York, NY, USA, 433\u2013444. <a href=\"https:\/\/doi.org\/10.1145\/2213836.2213886\">https:\/\/doi.org\/10.1145\/2213836.2213886<\/a><\/li>\n<li>Temporal Databases Annotated Bibliography. <a href=\"https:\/\/illuminatedcomputing.com\/posts\/2017\/12\/temporal-databases-bibliography\/\">https:\/\/illuminatedcomputing.com\/posts\/2017\/12\/temporal-databases-bibliography\/<\/a><\/li>\n<\/ol>\n<h2><a id=\"Appendix\"><\/a>Appendix<\/h2>\n<p>This appendix contains the code of the temporal aggregates for PostgreSQL. A PostgreSQL aggregate consists of a state variable and a few functions. 
We use a state-transition function (<code>sfunc<\/code>) that is invoked for every aggregated row and calculates the new value of the state. For all functions except <code>temporal_avg<\/code> the final value of the state is just the value to be returned by the aggregate function. The state of <code>temporal_avg<\/code> consists of values for <code>sum<\/code> and <code>count<\/code>, so one more function is needed to return the final value of the aggregate function.<\/p>\n<p>The code is also available <a href=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2024\/06\/Making-Temporal-Databases-Work.-Part-2-Computing-Aggregates-FunctionCode.txt\">here on the Simple Talk<\/a> site.<\/p>\n<pre class=\"lang:none theme:none\">set search_path to temp_agg;\n--- sum  \ncreate or replace function temporal_sum_next (\n    accum double precision, \n    val double precision, \n    val_s timestamptz,\n    val_e timestamptz,\n    q_s  timestamptz,\n    q_e timestamptz)\n    returns double precision\nIMMUTABLE language SQL \nreturn case \n when q_s = q_e and val_s &lt;= q_s and q_s &lt; val_e \n    and val is not null\n then coalesce(accum,0) + val\n when (val_s, val_e) overlaps (q_s,q_e) and q_s &lt; q_e and val is not null \n then\n      coalesce(accum,0) + val *\n       EXTRACT(EPOCH FROM least(val_e, q_e) - greatest(val_s,q_s)) \/\n       EXTRACT(EPOCH FROM q_e-q_s)\n else accum\n end;\ndrop AGGREGATE if exists temporal_sum (\n    double precision, timestamptz, timestamptz, timestamptz, timestamptz);\nCREATE AGGREGATE temporal_sum (\n    double precision, timestamptz, timestamptz, timestamptz, timestamptz\n    ) (\n    sfunc= temporal_sum_next,\n    STYPE = double precision\n    );\n---- COUNT\ncreate or replace function temporal_count_next (\n    accum double precision, \n    val double precision, \n    val_s timestamptz,\n    val_e timestamptz,\n    q_s  timestamptz,\n    q_e timestamptz)\n    returns double precision\nIMMUTABLE language SQL \nreturn case \n when q_s = q_e and val_s 
&lt;= q_s and q_s &lt; val_e and val is not null\n then coalesce(accum,0::double precision) + 1::double precision\n when (val_s, val_e) overlaps (q_s,q_e) and q_s &lt; q_e and val is not null \n then\n    coalesce(accum,0::double precision) + \n       EXTRACT(EPOCH FROM least(val_e, q_e) - greatest(val_s,q_s)) \/\n       EXTRACT(EPOCH FROM q_e-q_s)\n else accum\n end;\ndrop AGGREGATE if exists temporal_count (\n    double precision, timestamptz, timestamptz, timestamptz, timestamptz);\n   \nCREATE AGGREGATE temporal_count (\n    double precision, timestamptz, timestamptz, timestamptz, timestamptz\n    ) (\n    sfunc= temporal_count_next,\n    STYPE = double precision\n    );\n---  Accumulate\ncreate or replace function temporal_accum_next (\n    accum double precision, \n    val double precision, \n    per interval,\n    val_s timestamptz,\n    val_e timestamptz,\n    q_s  timestamptz,\n    q_e timestamptz)\n    returns double precision\nIMMUTABLE language SQL \nreturn case \n when q_s = q_e and val_s &lt;= q_s and q_s &lt; val_e and val is not null\n then  0::double precision\nwhen (val_s, val_e) overlaps (q_s,q_e) and q_s &lt; q_e and val is not null \n then\n      coalesce(accum,0::double precision) + val *\n       EXTRACT(EPOCH FROM least(val_e, q_e) - greatest(val_s,q_s)) \/\n       EXTRACT(EPOCH FROM per) \n else accum\n end;\nDO $$ BEGIN\n   \nCREATE AGGREGATE temporal_accum (\n    double precision, interval, timestamptz, timestamptz, timestamptz, timestamptz\n    ) (\n    sfunc= temporal_accum_next,\n    STYPE = double precision\n    );\nEXCEPTION\n    WHEN duplicate_function THEN NULL;\nEND $$;\n   \n---- average\nDO $$\nBEGIN\n    IF NOT EXISTS (SELECT 1 FROM pg_type WHERE typname = 'temporal_avg_state') THEN\n         create type temporal_avg_state as (\n            state_sum double precision,\n            state_cnt double precision);\n    END IF;\n    --more types here...\nEND$$;\ncreate or replace function temporal_avg_next (\n    accum temporal_avg_state, 
\n    val double precision, \n    val_s timestamptz,\n    val_e timestamptz,\n    q_s  timestamptz,\n    q_e timestamptz)\n    returns temporal_avg_state\nIMMUTABLE language SQL \n return case \n when q_s = q_e and val_s &lt;= q_s and q_s &lt; val_e and val is not null\n then (\n  coalesce(accum.state_sum,0) + val,\n  coalesce(accum.state_cnt,0) + 1\n  )::temporal_avg_state \n when (val_s, val_e) overlaps (q_s,q_e) and q_s &lt; q_e and val is not null \n then (\n    coalesce(accum.state_sum,0) + val *\n       EXTRACT(EPOCH FROM least(val_e, q_e) - greatest(val_s,q_s)) \/\n       EXTRACT(EPOCH FROM q_e-q_s),\n    coalesce(accum.state_cnt,0) +\n       EXTRACT(EPOCH FROM least(val_e, q_e) - greatest(val_s,q_s)) \/\n       EXTRACT(EPOCH FROM q_e-q_s)\n       )::temporal_avg_state \n else accum\n end;\ncreate or replace function temporal_avg_final(\n     s temporal_avg_state) \nreturns double precision\nIMMUTABLE language SQL \nreturn \n   case when s.state_sum is not null and s.state_cnt &lt;&gt; 0 \n   then s.state_sum \/ s.state_cnt \n   else null end; \ndrop AGGREGATE if exists temporal_avg (\n    double precision, timestamptz, timestamptz, timestamptz, timestamptz);\nCREATE AGGREGATE temporal_avg (\n    double precision, timestamptz, timestamptz, timestamptz, timestamptz\n    ) (\n    sfunc= temporal_avg_next,\n    STYPE = temporal_avg_state,\n    FINALFUNC = temporal_avg_final\n    );<\/pre>\n\n\n<section id=\"faq\" class=\"faq-block my-5xl\">\n    <h2>Frequently Asked Questions (FAQs)<\/h2>\n\n                        <h3 class=\"mt-4xl\">1. How do you calculate aggregates in a temporal database?<\/h3>\n            <div class=\"faq-answer\">\n                <p>Standard SQL aggregates don\u2019t work correctly on temporal data because they don\u2019t account for values that change within the query period. 
Temporal aggregates weight each value by the fraction of the query period during which it was valid, using the ratio of the intersection of the row\u2019s validity period with the query period divided by the total query period duration. In PostgreSQL, you calculate these fractions using EXTRACT(EPOCH FROM &#8230;) to convert timestamp intervals to seconds for ratio computation.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">2. What is the difference between temporal and non-temporal aggregates?<\/h3>\n            <div class=\"faq-answer\">\n                <p>Non-temporal aggregates treat each row equally regardless of how long the value was valid. Temporal aggregates weight each row\u2019s contribution by its duration within the query period. This means COUNT can return fractional values (e.g., 0.5 if a person was only employed for half the query period), and AVG reflects the time-weighted average rather than a simple arithmetic mean of stored values.<\/p>\n            <\/div>\n            <\/section>\n","protected":false},"excerpt":{"rendered":"<p>How to compute temporal aggregates in PostgreSQL when values change over time. 
Covers custom SUM, AVG, COUNT, MIN, and MAX functions for temporal tables that handle fractional validity periods and weighted calculations across time ranges.&hellip;<\/p>\n","protected":false},"author":343250,"featured_media":103123,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[53,143534],"tags":[158978],"coauthors":[159054],"class_list":["post-102750","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-featured","category-postgresql","tag-postgresql"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/102750","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/343250"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=102750"}],"version-history":[{"count":9,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/102750\/revisions"}],"predecessor-version":[{"id":109291,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/102750\/revisions\/109291"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media\/103123"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=102750"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=102750"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=102750"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\
/simple-talk\/wp-json\/wp\/v2\/coauthors?post=102750"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}