
A Note on Running Sums and Products in SQL
[There is a recording on this article here: Tweet, and also in my GitHub project, Small SQL projects.]
It’s a common requirement to compute running sums over time in SQL; for example, to find sales volumes to date. This is easy to do using analytic functions, like this, schematically:
SELECT SUM(sales) OVER (PARTITION By partition_key ORDER BY date)
FROM ...
The ORDER BY clause implicitly adds a window range of UNBOUNDED PRECEDING, and if you omit it you get the overall totals by partition_key.
Recently I needed to compute the products of some numeric factors across records over time. This kind of requirement often arises in financial calculations, and is slightly more tricky since Oracle doesn’t have an analytic function for products as it does for sums. However, we can achieve the same functionality using the well known mathematical equivalence:
log(xy) = log(x) + log(y)
and therefore:
xy = exp(log(x) + log(y))
More generally, if we have a set of N records , and a function, f, defined on a record, we can write the running sum, for function f, and n in (1..N) as:
and the running product as:
Then from the above equivalence, we can write:
This means we can get the running products in SQL using the analytic SUM combined with the (natural) log (LN in Oracle SQL) and exp functions. Let’s see how it works using Oracle’s HR demo schema. We’ll take the employees table and use:
- department_id as the partition key
- employee_id as the dimension to order by
- salary as the measure
To start with, let’s get the running and total sums, showing results for department_id = 60:
SELECT department_id, employee_id, salary,
SUM(salary) OVER (PARTITION BY department_id ORDER BY employee_id) running_sum,
SUM(salary) OVER (PARTITION BY department_id) total_sum
FROM employees
ORDER BY department_id, employee_id
/
Sums
DEPARTMENT_ID EMPLOYEE_ID SALARY RUNNING_SUM TOTAL_SUM
------------- ----------- ---------- ----------- ----------
60 103 9000 9000 28800
60 104 6000 15000 28800
60 105 4800 19800 28800
60 106 4800 24600 28800
60 107 4200 28800 28800
Next, let’s use the above equivalence to get the running and total products of the expression (1 + salary/10000), which we’ll call mult:
SELECT department_id, employee_id, salary, (1 + salary/10000) mult,
EXP(SUM(LN((1 + salary/10000))) OVER (PARTITION BY department_id ORDER BY employee_id)) running_prod,
EXP(SUM(LN((1 + salary/10000))) OVER (PARTITION BY department_id)) total_prod
FROM employees
ORDER BY department_id, employee_id
/
Products
DEPARTMENT_ID EMPLOYEE_ID SALARY MULT RUNNING_PROD TOTAL_PROD
------------- ----------- ---------- ---------- ------------ ----------
60 103 9000 1.9 1.9 9.45551872
60 104 6000 1.6 3.04 9.45551872
60 105 4800 1.48 4.4992 9.45551872
60 106 4800 1.48 6.658816 9.45551872
60 107 4200 1.42 9.45551872 9.45551872
If we didn’t have this technique we could compute the results using explicit recursion, either by MODEL clause, or by recursive subquery factors. Let’s do it those ways out of interest. First here’s a MODEL clause solution:
WITH multipliers AS (
SELECT department_id, employee_id, salary, (1 + salary/10000) mult,
COUNT(*) OVER (PARTITION BY department_id) n_emps
FROM employees
)
SELECT department_id, employee_id, salary, mult, running_prod, total_prod
FROM multipliers
MODEL
PARTITION BY (department_id)
DIMENSION BY (Row_Number() OVER (PARTITION BY department_id ORDER BY employee_id) rn)
MEASURES (employee_id, salary, mult, mult running_prod, mult total_prod, n_emps)
RULES (
running_prod[rn > 1] = mult[CV()] * running_prod[CV() - 1],
total_prod[any] = running_prod[n_emps[CV()]]
)
ORDER BY department_id, employee_id
/
Finally, here’s a solution using recursive subquery factors:
WITH multipliers AS (
SELECT department_id, employee_id, salary, (1 + salary/10000) mult,
Row_Number() OVER (PARTITION BY department_id ORDER BY employee_id) rn,
COUNT(*) OVER (PARTITION BY department_id) n_emps
FROM employees
WHERE department_id = 60
), rsf (department_id, employee_id, rn, salary, mult, running_prod) AS (
SELECT department_id, employee_id, rn, salary, mult, mult running_prod
FROM multipliers
WHERE rn = 1
UNION ALL
SELECT m.department_id, m.employee_id, m.rn, m.salary, m.mult, r.running_prod * m.mult
FROM rsf r
JOIN multipliers m
ON m.rn = r.rn + 1
AND m.department_id = r.department_id
)
SELECT department_id, employee_id, salary, mult, running_prod,
Last_Value(running_prod) OVER (PARTITION BY department_id) total_prod
FROM rsf
ORDER BY department_id, employee_id
/
You can see the scripts and full output on my new GitHub project, Small SQL projects, in the sums_products folder.
You can get the full detail on using analytic functions from the Oracle doc: SQL for Analysis and Reporting
See also: Analytic and Recursive SQL by Example