Articles

SQL Aggregate calculations

Statistical functions

The statistical functions are standard to SQL and allow the calculation of statistical results on the rows of a table.

  • AVG (column) Average values
  • SUM (column) Sum of values
  • MIN (column) Minimum value
  • MAX (column) Maximum value
  • COUNT (column) Number of values in the column
  • COUNT (DISTINCT column) Number of distinct values
  • COUNT (*) Number of rows (without regard to NULLs)

SUM and AVG must process numeric values, while MIN, MAX, and COUNT can handle any type of value. Adding the word DISTINCT determines that a particular value is computed only once, regardless of the number of times it appears in the table.

Note:
NULLs are never considered, i.e. they are not part of the calculated result.

COUNT (*) counts lines, not values.

How many inventory releases are there?

SELECT COUNT (*) FROM output;

Groupings

Used alone, statistical functions express a result calculated from all the rows of the table. Used with the GROUP BY clause, they express a result by group.

SELECT list of columns, list of statistical functions FROM table GROUP BY the same list of columns as in the SELECT clause;

Thus, the selected statistical functions present a result for each of the group values existing in the table.

Note:
The (only) columns that appear in the GROUP BY clause must appear in the SELECT
clause, and vice versa.

How many inventory outputs are there for each project?

SELECT project_code, COUNT (*) FROM output GROUP BY project_code;

Put a condition on groupings

In the same way that the WHERE clause makes it possible to set conditions on the rows of a table, the HAVING clause makes it possible to set conditions on groupings. A SELECT statement can then look like this:

SELECT list of columns, function (s) statistic (s)
FROM table WHERE row condition
GROUP BY list of columns HAVING group condition;

The HAVING clause is used to check conditions applicable to a group and not to a row. For this purpose, a statistical function can be used in the HAVING clause whereas it can not be used in a WHERE clause. An important point is that you can use statistical functions in the HAVING clause, which is not possible in the WHERE clause.

Which projects led to more than 10 inventory outflows?

SELECT project_code FROM output GROUP BY project_code HAVING COUNT (*)> 10;

Move to the next article: SQL Subqueries