1. Amongst the many fascinating stories in Malcolm Gladwell's Outliers, chapter eight has one that brings back a strange memory for me. It should also remind us to always question whether the data we are shown really supports the conclusions we draw from it.
    Consider the following table of scores achieved by kids in Baltimore public schools across 1st-5th grade. (The test referred to is the California Achievement Test, but that's not important to the example.)
    What conclusions might we draw from this? We might reasonably start to think that Baltimore public schools were failing low-income pupils. They start off only slightly behind their wealthier peers (32 points), but by fifth grade they trail them significantly (73 points).
    You get a very different story if you look just at what happens during the time pupils were in school. Karl Alexander tested pupils at the start and at the end of every school year, enabling him to measure how many points they gained while actually in school. Here are the results:
    Now it seems that, if anything, schools are of more benefit to poorer kids. Across the five grades, during the school years, they gained 189 points, while the wealthy kids gained only 184 points. The difference between the first table and the second lies in how many points the pupils gained or lost during the long summer holidays:
    And if we know this, we know that the story isn't about education at all. It's about what happens in the school holidays. Poor kids on average neither gain nor lose points over the holidays, but richer kids consistently gain. So by the end of fifth grade, they outperform their peers.
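As a rough consistency check, the figures quoted above can be split into a school-year part and an outside-school part. This is only a back-of-the-envelope sketch using the four numbers mentioned in the text. It assumes the first-grade gap was measured before any school-year gains and that every change in score happens either during a school year or during a holiday. The residual it produces is an implied figure, not the study's own summer number, and the variable names are made up for illustration.

```python
# Figures quoted in the post (test points). Variable names are illustrative.
gap_first_grade = 32     # gap between wealthy and low-income pupils in 1st grade
gap_fifth_grade = 73     # the same gap by 5th grade
school_gain_poor = 189   # cumulative school-year gain, low-income pupils
school_gain_rich = 184   # cumulative school-year gain, wealthy pupils

# How much the gap widened over the five grades.
gap_growth = gap_fifth_grade - gap_first_grade

# How much of that widening happened while school was in session.
# A negative value means school time actually narrowed the gap.
school_effect = school_gain_rich - school_gain_poor

# Whatever is left must, under the assumptions above, come from time out of school.
outside_school_effect = gap_growth - school_effect

print(f"Gap growth over five grades: {gap_growth} points")
print(f"Change in gap during school years: {school_effect} points")
print(f"Implied change in gap outside school: {outside_school_effect} points")
```

Under these assumptions, school time narrows the gap slightly, so all of the widening has to come from somewhere else, which is the point of the story.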
    So the analytical lesson here is to always be careful that the data you are using really supports the conclusions you are drawing from it. Ask yourself: If you cut the data differently, might it tell a completely different story? If so, give it a shot.
    And what of the Beano, you might ask? I have an odd memory from primary school of my class gathering around the teacher after the long summer break one year. He asked each of us to pick one book we'd read over the summer and tell the class about it. One by one, my classmates described a book they had enjoyed. When the teacher pointed to me, I had to ask whether a comic counted, since that was all I'd read over the summer.

  2. ... or so it might seem if you take polling results seriously.
    In a recent poll, 9% of New Yorkers said they were planning to head to DC for the inauguration. There are about 16 million adults in the New York area, which would mean roughly 1.4 million people planned to make the trip. The lesson here is that there are some things you shouldn't use polling for!
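The arithmetic behind that headline number is easy to reproduce. A minimal sketch, using only the two figures quoted above (the function name is made up for illustration):

```python
def implied_people(population: int, share_percent: int) -> int:
    """Scale a poll percentage up to a head count, using integer arithmetic."""
    return population * share_percent // 100


adults_in_ny_area = 16_000_000   # "about 16 million adults", as quoted
planning_to_attend_pct = 9       # poll result, in percent

travellers = implied_people(adults_in_ny_area, planning_to_attend_pct)
print(f"Implied travellers: {travellers:,}")
```

The calculation is trivially correct. The problem is the input: a stated intention in a poll is being treated as if it were a confirmed travel plan.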
    From pollster.com: One problem with a question like this one may be that it lends itself to social desirability bias. As we know, citizens tend to over-report the extent to which they will (or did) vote in elections. In a similar way, some respondents may be proclaiming that they will attend the inauguration when they don't have any real intention of going. They may do so because they hear of so many others who are attending and they feel as though it is something they should be doing as well.
    This reminds me of another article pointing out something else you shouldn't use polls for: asking people whether they were at historic events.
    From Political Animal: I remember reading, years ago when I lived in Miami, that a significant percentage of the population of South Florida believes they were in attendance for the famous Dolphins-Chargers playoff game in 1982. That's impossible, of course, since the capacity of the Orange Bowl was only about 75,000, and the population of Miami-Dade is in the millions, but locals remembered the game so fondly, they'd fooled themselves into thinking they actually saw the game in person. It's similar to the phenomenon of the number of people claiming to have been on hand for Woodstock in 1969 -- more people believe it than could have possibly shown up.
    You also shouldn't use polls to ask people whether they did things that turned out to be a very, very bad idea:
    Again from Political Animal: [in a recent poll] only 33 percent of respondents admit to having voted for the guy twice, while 52 percent said they'd never voted for him at all. If that were actually true, of course, Bush would never have had the chance to run the country so firmly into the ground that people are now pretending they never liked him.
    ... so what does that leave us that polling IS good for?

  3. There are some occasions where you need to keep your data gathering quiet!
    From the excellent XKCD
    I also love this one. I wish I'd thought of it!
