Sunday, July 27, 2008

Doggerel #159: "Statistics Can Say Anything!"

Welcome back to "Doggerel," where I ramble on about words and phrases that are misused, abused, or just plain meaningless.

It's often been said that there are three kinds of lies: Lies, damned lies, and statistics. There's some level of truth in that, but it's not because statistics are inherently dishonest. It's because there's a lot of room for lies of omission. How were the data counted? What definitions were used? An unscrupulous person can take advantage of that, since too many people don't get into the guts of the numbers.

The best defense against being lied to with statistics is curiosity: Ask for the source. And by source, I don't mean some guy on TV or in a lab coat who said so. How did they count? How did they measure? What did they really measure? You may need someone to give you a crash course in statistics to know the difference between a one-tailed and a two-tailed t-test, or why everyone favors Tukey comparison tests, but many people who misunderstand or lie with statistics make more fundamental errors. Sometimes people will broaden or narrow definitions to conveniently include the irrelevant or exclude the critical. Sometimes they'll collect samples in a biased manner: asking only in areas where people are sympathetic, recording large numbers of anecdotes from nonexperts instead of looking for objective numbers, and so on.
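To make the biased-sampling point concrete, here's a rough sketch in Python. Everything in it is made up for illustration: the two districts, the 80% and 30% support rates, the sample size of 400, and the seed are all my assumptions, not real data. It just compares asking a random sample of a whole town with asking only in the sympathetic part of it.

```python
import random

def build_population():
    # Hypothetical town (an assumption for illustration, not real data):
    # two districts of 1,000 people each. 80% of the "friendly" district
    # supports some measure; only 30% of the "other" district does.
    friendly = [("friendly", i < 800) for i in range(1000)]
    other = [("other", i < 300) for i in range(1000)]
    return friendly + other

def support_rate(people):
    # Fraction of the given people who support the measure.
    return sum(1 for _, supports in people if supports) / len(people)

population = build_population()
rng = random.Random(159)  # fixed seed so the run is repeatable

# What you'd get if you could ask everyone.
true_rate = support_rate(population)

# An honest poll: 400 people drawn at random from the whole town.
random_rate = support_rate(rng.sample(population, 400))

# A biased poll: 400 people, but only from the sympathetic district.
friendly_only = [p for p in population if p[0] == "friendly"]
biased_rate = support_rate(rng.sample(friendly_only, 400))

print(f"whole town:    {true_rate:.2f}")
print(f"random sample: {random_rate:.2f}")
print(f"biased sample: {biased_rate:.2f}")
```

Both polls talked to the same number of people, and both can truthfully say "we surveyed 400 residents." Only asking where those residents came from shows why the second number lands so far from the town as a whole.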

I'm sure we're all guilty of keeping the numbers in our heads without references, but we should all resolve to find sources before quoting willy-nilly.

1 comment:

Rhoadan said...

Here's an oldie but goodie about interpreting statistics for non-mathematicians. There are newer books on the subject, but some tricks and cheats haven't changed.