Joe Celko's SQL for Smarties: Advanced SQL Programming, Third Edition

One of the major purposes of a database system is to turn data into information. This usually means producing some statistical summary of that data. Descriptive statistics measure some property of an existing data set and express it as a single number. Though there are very sophisticated measures, most applications require only basic, well-understood statistics. The most common summary functions are the count (or tally), the average (or arithmetic mean), and the sum (or total). This minimal set of descriptive statistical operators is built into the SQL language, and vendors often extend it with others. These functions are called set functions in the ANSI/ISO SQL standard, but vendors, textbook writers, and everyone else usually call them aggregate functions, so I will use that term.
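As a minimal sketch of the three basic summaries, the following uses Python's standard sqlite3 module with an in-memory database. The Sales table, its columns, and its rows are hypothetical and exist only for illustration; the result types shown are those of SQLite and may differ in other products.

```python
import sqlite3

# In-memory database; the Sales table and its data are made up for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales (region TEXT NOT NULL, amount INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?)",
    [("East", 10), ("East", 20), ("West", 30), ("West", 40)],
)

# Each aggregate collapses the whole table into a single number.
row_count, total, mean = conn.execute(
    "SELECT COUNT(*), SUM(amount), AVG(amount) FROM Sales"
).fetchone()

print(row_count, total, mean)
```

Each of the three functions reduces the same four rows to one value, which is the sense in which a descriptive statistic expresses a property of the data set as a single number.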
Aggregate functions first construct a working column of values as defined by the parameter. The parameter is usually a single column name, but it can be an arithmetic expression with scalar functions and calculations. Pretty much the only things that cannot be parameters are other aggregate functions (e.g., SUM(AVG(x)) is illegal) and subqueries (e.g., AVG(SELECT col1 FROM SomeTable WHERE ...) is illegal). A subquery could return more than one value, so it would not fit into a column, and an aggregate function used as a parameter would have to try to build a column within a column.
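The parameter rules can be tried out with a small sketch, again assuming Python's standard sqlite3 module and a hypothetical OrderItems table. It shows an arithmetic expression and a scalar function used as parameters, and then a nested aggregate being rejected. The subquery case is left out on purpose, because products differ in how strictly they enforce that restriction.

```python
import sqlite3

# In-memory database; OrderItems and its data are made up for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE OrderItems (qty INTEGER NOT NULL, unit_price INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO OrderItems VALUES (?, ?)",
    [(2, 5), (3, 10), (1, 20)],
)

# An arithmetic expression as the parameter: the working column is
# qty * unit_price for each row (10, 30, 20), which is then summed.
(total_value,) = conn.execute(
    "SELECT SUM(qty * unit_price) FROM OrderItems"
).fetchone()

# A scalar function inside the parameter: the working column is
# ABS(qty - 3) for each row (1, 0, 2), which is then averaged.
(avg_abs,) = conn.execute(
    "SELECT AVG(ABS(qty - 3)) FROM OrderItems"
).fetchone()

# An aggregate nested directly inside another aggregate is rejected.
try:
    conn.execute("SELECT SUM(AVG(qty)) FROM OrderItems").fetchone()
    nested_rejected = False
except sqlite3.Error as exc:
    nested_rejected = True
    print("rejected:", exc)

print(total_value, avg_abs, nested_rejected)
```

In both legal queries the expression is evaluated once per row to build the working column, and only then does the aggregate reduce that column to a single value.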
Once the working column is constructed, all the NULLs are removed ...