A very interesting piece by Howard Wainer in the latest American Scientist (May-June 2007) concerns dangerous equations, which he describes as falling into two classes:
- equations that are dangerous because we know them - they "may pose danger because the secrets within its bounds open doors behind which lies terrible peril," with E = mc² the most obvious candidate
- equations that are dangerous because we don’t know them - not because no theory has yet yielded these equations, but because they are not known by those who need to know them. This is especially true for policy makers who base their decisions on mathematical models, and specifically statistical models.
Wainer’s top choice for most dangerous statistical equation is due to Abraham de Moivre, who showed in 1730 that the standard error of the mean of a sample is the standard deviation of the population divided by the square root of the sample size. A key consequence of this equation is that small sample sizes lead to large fluctuations in sample means. It is this simple statement:
small samples → large fluctuations in sample means,
that poses the biggest danger when it is ignored or misunderstood, whether by policy makers or by the average citizen.
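To make the statement concrete, here is a minimal simulation sketch. It assumes a hypothetical normally distributed population with mean 100 and standard deviation 15 (numbers chosen for illustration, not taken from Wainer's article), draws many samples of size n, and compares how much the sample means actually fluctuate with the value de Moivre's equation predicts, σ/√n. The function names are made up for this example.

```python
import math
import random
import statistics


def sd_of_sample_means(n, sigma=15.0, mu=100.0, trials=1000, seed=0):
    """Empirical standard deviation of `trials` sample means, each from n draws.

    The population (normal, mean mu, std dev sigma) is an illustrative assumption.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return statistics.pstdev(means)


def predicted_standard_error(n, sigma=15.0):
    """De Moivre's equation: population standard deviation over sqrt(n)."""
    return sigma / math.sqrt(n)


if __name__ == "__main__":
    for n in (4, 25, 100):
        observed = sd_of_sample_means(n)
        predicted = predicted_standard_error(n)
        print(f"n={n:4d}  observed={observed:.2f}  predicted={predicted:.2f}")
```

Under these assumptions the observed spread should track the predicted one closely: quadrupling the sample size roughly halves the fluctuation in the means.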
Most people with even a rudimentary knowledge of statistics know that reported poll figures come with some degree of uncertainty because of sample size (e.g. a typical political poll will yield a statement such as "Candidate A leads the race with 48%, plus or minus 3 percentage points"). The point that Wainer raises is that, when faced with a large number of sample means, we all tend to forget the main point: we should expect to see quite a lot of variation among these means if the sample sizes are small.
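The "plus or minus 3 percentage points" in such a poll is the same square-root law at work. As a rough sketch, assuming a simple random sample, the usual normal approximation, and a 95% confidence level (real polls use weighting and other adjustments not modeled here), the margin of error for a proportion can be computed as follows. The function names are illustrative.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a sample proportion p from n respondents.

    Assumes simple random sampling; z=1.96 corresponds to roughly 95% confidence.
    """
    return z * math.sqrt(p * (1 - p) / n)


def sample_size_for_margin(margin, p=0.5, z=1.96):
    """Smallest n whose margin of error is at most `margin` (worst case p=0.5)."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


if __name__ == "__main__":
    print(f"{margin_of_error(0.48, 1067):.3f}")  # about 0.030, i.e. 3 points
    print(f"{margin_of_error(0.48, 100):.3f}")   # about 0.098, i.e. nearly 10 points
    print(sample_size_for_margin(0.03))
```

A poll of about a thousand people gives the familiar 3-point margin, while a poll of a hundred gives a margin of nearly 10 points, which is why results from small samples swing so widely.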
Wainer goes on to list five examples in which comparing a large number of sample means, many of them computed from small samples, leads to unwarranted conclusions:
- maps of disease rates by county
- the relation of student performance to class size
- the relation between safe cities and city size
- gender difference in academic performance
In each of these cases Wainer convincingly illustrates how de Moivre’s equation is ignored by policy makers, with disastrous results - often the misallocation of scarce resources (e.g. are small class sizes the best way of using school tax funds to increase student learning?).
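The pattern behind these examples can be sketched with a small simulation. The setup below is entirely hypothetical (invented unit sizes and an invented rate, not Wainer's data): every unit, whether you think of it as a county or a school, has exactly the same underlying rate of 1%, and the only difference is how many individuals it contains.

```python
import random
import statistics


def observed_rates(population, n_units, true_rate, rng):
    """Observed rate in each of n_units units that share the same true rate."""
    rates = []
    for _ in range(n_units):
        cases = sum(rng.random() < true_rate for _ in range(population))
        rates.append(cases / population)
    return rates


def simulate(seed=1, true_rate=0.01):
    """Return observed rates for 100 small units and 100 large units."""
    rng = random.Random(seed)
    small = observed_rates(100, 100, true_rate, rng)   # 100 individuals each
    large = observed_rates(5000, 100, true_rate, rng)  # 5000 individuals each
    return small, large


if __name__ == "__main__":
    small, large = simulate()
    print(f"small units: min={min(small):.4f} max={max(small):.4f} "
          f"sd={statistics.pstdev(small):.4f}")
    print(f"large units: min={min(large):.4f} max={max(large):.4f} "
          f"sd={statistics.pstdev(large):.4f}")
```

In a run like this, the small units should occupy both the top and the bottom of the ranking even though nothing about them differs except size. Anyone who looks only at the best (or only at the worst) performers would wrongly conclude that being small is the cause.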
There’s even an illustrative story of determining standards for gold coins in 12th century England - a case in which quite a few minters suffered extreme punishment. Of course, de Moivre was 600 years in the future. It was only then that those punished could be considered "exonerated."
Wainer’s article is a thorough, fascinating, and extremely well-written look at the dangers of basing decisions on mathematical models without knowing what the models are really saying. Given its readability, it can easily be used in a Quantitative Literacy course as well as a traditional Stat class.