All models are wrong (or: lies, damned lies and statistics)*

November 10, 2010

The statistician is regarded with a certain amount of disdain (or possibly sympathy) by their pure mathematical brethren. And it is with that firmly in mind that I (as a fledgling statistician) take the reins of this worthy blog.

We have some idea of what mathematics is from Adam’s posts, but what is statistics? Statistics is applied maths with uncertainty. In statistics, mathematical techniques are used to model and quantify our uncertainty about reality. Modelling climate change, predicting the outcome of elections, wrecking the financial system and ensuring the casino always wins: statistics is everywhere. And uncertainty is the key to statistics.

To convey what uncertainty is, I will describe some of the different kinds we face and how statistics deals with them. The five levels in the following taxonomy lie on a continuum running from complete certainty to complete uncertainty, and provide a means of measuring the range and limitations of statistics in different situations.** The further we go along this continuum, the less effective statistics is at prediction and inference, and many problems in statistics and in quantitative social sciences like economics come from not recognising just how far along the continuum we are.
