Joe Celko's SQL for Smarties: Advanced SQL Programming, Third Edition

Chapter 23: Statistics in SQL

Overview

SQL is not a statistical programming language. However, there are some tricks that will let you compute simple descriptive statistics. Many vendors also include other descriptive statistics in addition to those the SQL standard requires. Other sections of this book give portable queries for computing some of the more common statistics. Before using any of these queries, you should check whether your SQL product already provides the statistic as a built-in function. Built-in functions will run far faster than these queries, so you should use them if portability is not vital. The most common extensions are the median, the mode, the standard deviation, and the variance.
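As a rough illustration of what a portable formulation can look like when no built-in function is available, the sketch below computes the mode and the population variance using only GROUP BY, COUNT, SUM, and MAX. It is not taken from the book: the table Foobar(x) matches the table used later in this chapter, but the sample values, the function name mode_and_variance, and the use of Python's sqlite3 module as a test harness are assumptions made for the example.

```python
import sqlite3

# Mode: every value whose frequency equals the largest frequency.
MODE_SQL = """
SELECT x
  FROM Foobar
 GROUP BY x
HAVING COUNT(*) = (SELECT MAX(cnt)
                     FROM (SELECT COUNT(*) AS cnt
                             FROM Foobar
                            GROUP BY x) AS C)
 ORDER BY x
"""

# Population variance from sums: (SUM(x^2) - SUM(x)^2 / n) / n.
# Multiplying by 1.0 forces floating-point arithmetic.
VAR_SQL = """
SELECT (SUM(1.0 * x * x) - SUM(1.0 * x) * SUM(1.0 * x) / COUNT(x)) / COUNT(x)
  FROM Foobar
"""


def mode_and_variance(values):
    """Load values into a scratch Foobar table; return (modes, var_pop)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE Foobar (x INTEGER NOT NULL)")
        conn.executemany("INSERT INTO Foobar (x) VALUES (?)",
                         [(v,) for v in values])
        modes = [row[0] for row in conn.execute(MODE_SQL)]
        var_pop = conn.execute(VAR_SQL).fetchone()[0]
        return modes, var_pop
    finally:
        conn.close()


# Hypothetical sample data, chosen only for illustration.
modes, var_pop = mode_and_variance([1, 1, 2, 2, 2, 3, 4, 4, 4, 4])
```

The mode query returns more than one row when several values tie for the highest frequency, which is why the sketch collects a list rather than a single value.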

If you need to do a detailed statistical analysis, then you can extract data with SQL and pass it along to a statistical programming language, such as SAS or SPSS.

However, you can build a lot of standard descriptive statistics using what you do have. First, the basic analysis of a single column can start with this VIEW or query to get cumulative and absolute frequencies and percentages:

WITH
(SELECT F1.x, COUNT(F1.x),
        CAST(ROUND(100. * (COUNT(F1.x)) / (SELECT COUNT(*) FROM Foobar), 0) AS INTEGER)
   FROM Foobar AS F1
  GROUP BY F1.x) AS F2(x, abs_freq,...
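The query above is cut off in this excerpt, so the following is only a sketch of one way to get the same four quantities, not a reconstruction of the book's text. It uses correlated subqueries for the cumulative counts and runs against an in-memory SQLite database through Python's sqlite3 module. The column names abs_perc, cum_freq, and cum_perc, the function name frequency_table, and the sample data are assumptions introduced for the example.

```python
import sqlite3

FREQ_SQL = """
SELECT F2.x,
       F2.abs_freq,
       CAST(ROUND(100.0 * F2.abs_freq / T.n, 0) AS INTEGER) AS abs_perc,
       (SELECT COUNT(*)
          FROM Foobar AS F3
         WHERE F3.x <= F2.x) AS cum_freq,
       CAST(ROUND(100.0 * (SELECT COUNT(*)
                             FROM Foobar AS F3
                            WHERE F3.x <= F2.x) / T.n, 0) AS INTEGER) AS cum_perc
  FROM (SELECT x, COUNT(x) AS abs_freq
          FROM Foobar
         GROUP BY x) AS F2
       CROSS JOIN
       (SELECT COUNT(*) AS n FROM Foobar) AS T
 ORDER BY F2.x
"""


def frequency_table(values):
    """Return rows of (x, abs_freq, abs_perc, cum_freq, cum_perc)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE Foobar (x INTEGER NOT NULL)")
        conn.executemany("INSERT INTO Foobar (x) VALUES (?)",
                         [(v,) for v in values])
        return conn.execute(FREQ_SQL).fetchall()
    finally:
        conn.close()


# Hypothetical sample data, chosen only for illustration.
for row in frequency_table([1, 1, 2, 2, 2, 3, 4, 4, 4, 4]):
    print(row)
```

Products that support window functions can usually replace the correlated subqueries with a running SUM() OVER (ORDER BY x), which tends to be faster on large tables.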
