tomclegg.net
Statistics 1

Posted May 20, 2001

Today's topic is statistics. One of the many books I haven't read is called "Mathematics, Queen and Servant of Science." Sounds pretty exciting. Whether it's Queen or Servant or both, I don't know, but there is definitely some connection. So today I'm going to talk about everyone's favourite branch of math: statistics! But don't go away yet, because we're going to listen to some music first.

---

On the very first episode of Mostly Mozart, or at least the first one that contained a science segment, I talked about the so-called Mozart Effect. The idea that listening to Mozart makes you smart is a great example of how the outcome of science experiments is misrepresented by mass media. The outcome of the experiment was that the people who did their IQ test immediately after listening to Mozart did better than the people who had been listening to other music, or sitting in silence. The outcome of the CNN investigation was that listening to Mozart makes you smarter. This is not a very good interpretation of the experimental result.

For one thing, I've always doubted whether IQ tests mean anything at all. If they do, it's probably because they cover lots of different indicators of intelligence. After all, there are lots of different ways to be smart. But the IQ test used in the Mozart experiment did not cover lots of different indicators. It consisted of exactly one exercise: watching a piece of paper as it was folded and cut, and predicting what the paper would look like when unfolded. So if it shows anything, the experiment showed that listening to Mozart increases your ability to predict what cut-up pieces of paper will look like when you unfold them. Which is probably not what most people mean when they say "smart."

Another problem has to do with statistics. Like I said in November, if you tossed a coin 10 times, and came up with 6 heads and 4 tails, you would be wrong to conclude that heads is more likely than tails.
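For anyone following along at a keyboard, here is a minimal sketch of why 6 heads out of 10 proves so little. It assumes a perfectly fair coin and independent tosses; the function name `prob_heads` is made up for this illustration.

```python
from math import comb


def prob_heads(k: int, n: int = 10, p: float = 0.5) -> float:
    """Probability of exactly k heads in n independent tosses.

    Assumes each toss comes up heads with probability p (0.5 = fair coin).
    """
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Chance that a fair coin gives exactly 6 heads in 10 tosses.
exactly_six = prob_heads(6)

# Chance that a fair coin gives 6 or more heads in 10 tosses.
six_or_more = sum(prob_heads(k) for k in range(6, 11))

print(f"exactly 6 heads: {exactly_six:.3f}")   # prints 0.205
print(f"6 or more heads: {six_or_more:.3f}")   # prints 0.377
```

A perfectly fair coin gives at least 6 heads in more than a third of all 10-toss runs, so seeing that result is no evidence that the coin favours heads.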
For that matter, even if you came up with 10 heads -- well, you might check the coin to make sure it actually had two different sides -- but you would probably think twice before betting a lot of money on it coming up heads on the 11th toss. After all, if you've just tossed 10 heads, it's more likely than ever that you'll get tails next time, isn't it? Well, unfortunately for gamblers, no, you're just as likely to get heads on the 11th toss. The coin doesn't remember how many times it's come up heads -- and besides, if it did, it might also remember coming up tails four hundred times before, so it might still think it should give out more heads to make up for them.

Most people hate statistics. I don't, although I sure did when I was studying it in school. It's so boring, and nobody ever explains what it's for. They just want you to learn a bunch of formulas, so you can come up with some apparently useless numbers with silly names like "standard deviation." So after this customary music break, I'll try to tell you what standard deviation means and how statistical analysis is supposed to be useful.

---

Mostly Mozart is sponsored by Comfort and Joy, a unique children's store. My name is Tom Clegg, and I came here to tell you something about science. But I don't know anything about science today, so I'm talking about statistics instead. That also starts with S. In fact, according to my computer's dictionary, about 11% of all English words start with S. The computer might be wrong, but at least it gives me the same answer every time.

The average English word is 9 letters long. When I say average, of course, I really mean "mean". So I should say that the mean English word is 9 letters long. And that illustrates the first problem with statistics. The phrase "mean English word" seems to refer to one particular word. Just like the "average North American family" seems to refer to a family. It's probably better to say that the mean length of an English word is 9 letters.
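Both dictionary figures are the same kind of calculation, and a rough sketch makes that visible. The ten-word list below is a made-up stand-in, not the computer dictionary mentioned on the show, so its numbers will not match the 11% and 9 letters quoted above.

```python
from statistics import mean

# Hypothetical stand-in for a dictionary file (an assumption for illustration).
words = [
    "science", "statistics", "sample", "mean", "doctor",
    "coin", "music", "standard", "random", "string",
]

# Mean length: add up every word's length, divide by the number of words.
mean_length = mean(len(w) for w in words)

# Share of words that start with "s".
share_s = sum(w.startswith("s") for w in words) / len(words)

# The mean describes the group; no individual word has to match it.
any_word_is_mean = any(len(w) == mean_length for w in words)

print(f"mean length: {mean_length} letters")
print(f"share starting with s: {share_s:.0%}")
print(f"some word is exactly the mean length: {any_word_is_mean}")
```

In this toy list the mean length is not a whole number, so there is no "mean word" to point at, which is the same trap as the "average family."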
The mean number of children per North American family is 2.4. Which, of course, does not imply that there are any North American families that actually have 2.4 children -- or any North American families that are "average" in the mathematical sense. The average, or mean, is an indirect measure of a group, obtained by measuring the individuals.

A good portion of popular statistics are of the form, "9 out of 10 doctors agree." You can't be sure what that means. Does it mean they asked 10 doctors, and 9 agreed? Or did they ask all doctors and find that 90% agreed? Well, if this is from a TV ad, you can assume that they didn't ask any doctors at all. But let's assume they asked 10, and 9 said yes. What does this tell you? The only thing you know for sure is that somewhere there are 9 doctors who agree and 1 who doesn't. Obviously you are supposed to assume that those 10 doctors are representative of all doctors everywhere; in other words, you're supposed to think that if you did ask all doctors, 90% would agree. The 10 doctors who you actually did ask are called your sample. And jumping to the 90%-of-all-doctors conclusion is called extrapolating.

You have to be careful with statistics. Like they say, statistics don't lie, but people lie with statistics all the time. Usually they don't realize that they're lying, because they slept through their statistics classes just like you. In fact, things can get so bad that the word "extrapolation" is used as a euphemism for "wild guesses."

The fun part of statistics is knowing when to extrapolate. Is a sample size of 10 big enough to justify a general statement about doctors? Obviously, you'll get a different answer depending on which 10 doctors you ask, so how do you choose the right ones? Even if you choose the perfect set of 10 doctors, you won't get the right answer if the actual average over the whole population is 8.5 out of 10, because a group of 10 can only give you a whole number.
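To put numbers on that, here is a rough sketch of what asking 10 doctors can look like when 8.5 out of 10 of all doctors really agree. It assumes the 10 are picked at random from a very large population, so each one agrees independently with probability 0.85. The names `TRUE_RATE` and `sample_distribution` are invented for this example.

```python
from math import comb

TRUE_RATE = 0.85   # assumed: 8.5 out of 10 of ALL doctors agree
SAMPLE_SIZE = 10   # number of doctors actually asked


def sample_distribution(n: int, p: float) -> dict[int, float]:
    """Map each possible count of agreeing doctors to its probability.

    Assumes n doctors chosen at random, each agreeing with probability p.
    """
    return {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}


dist = sample_distribution(SAMPLE_SIZE, TRUE_RATE)

# Show only the outcomes that turn up in at least 1% of samples.
for agree, prob in dist.items():
    if prob >= 0.01:
        print(f"{agree} out of {SAMPLE_SIZE} agree: {prob:.1%} of samples")

most_likely = max(dist, key=dist.get)
print(f"most likely result: {most_likely} out of {SAMPLE_SIZE}")
```

Under those assumptions, "9 out of 10" is the single most likely result, yet it turns up in only about a third of samples, and "8.5 out of 10" can never turn up at all.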
So even with the perfect sampling technique, it's still good to calculate how close your statistic is to the "real" answer. And since it's impossible to know the real answer without asking all doctors, there is also a way to measure how certain you should be of your result.

The answer to half of those questions is randomness. I won't tell you any more than that because I haven't time, but I will say that randomness will be coming up in future shows about quantum physics. I need to talk about quantum physics, because Catherine wants me to explain string theory, and that's an attempt to reconcile quantum physics and gravity. So tune in next week and the week after that, and you're sure to hear something about that. Well, 90% sure.

Thanks for listening and I'll see you next week.