The basic idea of polling is very simple. Say you have a coin that comes up heads 50% of the time. If you flip it a thousand times, it should come up heads about 500 times on average, with a standard deviation of just under 16. That means there’s roughly a two-thirds chance that the number of times it will come up heads is between 484 and 516, and it means there’s about a ten percent chance that it comes up heads 520 times or more.
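If you want to check those coin-flip numbers yourself, here is a minimal sketch using only Python's standard library. The function name `fair_coin_prob` is made up for illustration; the computation is just the exact binomial count for a fair coin.

```python
from math import comb, sqrt

def fair_coin_prob(lo, hi, flips):
    """Exact P(lo <= heads <= hi) for a fair coin flipped `flips` times."""
    # For a fair coin every sequence is equally likely, so count the
    # sequences with k heads and divide by the total number of sequences.
    favorable = sum(comb(flips, k) for k in range(lo, hi + 1))
    return favorable / 2**flips

flips = 1000
sd = sqrt(flips * 0.5 * 0.5)                    # standard deviation of the head count
within_one_sd = fair_coin_prob(484, 516, flips)  # chance of landing in 484..516
tail_520 = fair_coin_prob(520, flips, flips)     # chance of 520 heads or more

print(f"standard deviation: {sd:.2f}")
print(f"P(484 <= heads <= 516): {within_one_sd:.3f}")
print(f"P(heads >= 520): {tail_520:.3f}")
```

The three printed values should line up with the "just under 16", "two-thirds" and "about ten percent" figures above, give or take rounding.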
Now say you have a coin and you don’t know what percentage of the time it comes up heads. Say you flip it a thousand times and it comes up heads 520 or more times. Then it probably isn’t a coin that comes up heads 50% of the time in general.
If you view taking a random sample of 1000 voters as flipping a coin 1000 times, then it means that if you’re up 52-48 in a poll, you probably really are ahead. Now, being 52-48 is only barely outside the “margin of error” (which, taken here as one standard deviation, is 1.6% for each candidate, for a total of 3.2% on the difference), so some might scream STATISTICAL DEAD HEAT. But it’s not; it’s probably not a fair coin. In fact, there’s just as good a chance that it’s a 54-46 coin as a 50-50 coin.
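The "just as good a chance" claim can be checked by comparing how likely a 520-480 result is under different coins. This is a rough sketch, assuming you start out with no reason to favor one coin over another; the helper name `log_likelihood` is hypothetical.

```python
from math import exp, log, sqrt

def log_likelihood(heads, flips, p):
    """Log-probability of one particular sequence with `heads` heads.
    The binomial coefficient is the same for every p, so it cancels
    when two values of p are compared and is left out here."""
    return heads * log(p) + (flips - heads) * log(1 - p)

heads, flips = 520, 1000
fair = log_likelihood(heads, flips, 0.50)

# How much more (or less) likely is the poll result under each coin
# than under a fair coin?
ratio_54_vs_50 = exp(log_likelihood(heads, flips, 0.54) - fair)
ratio_52_vs_50 = exp(log_likelihood(heads, flips, 0.52) - fair)

# One standard deviation of a single candidate's share, near 50-50.
share_sd = sqrt(0.5 * 0.5 / flips)

print(f"54-46 coin vs 50-50 coin: {ratio_54_vs_50:.3f}")
print(f"52-48 coin vs 50-50 coin: {ratio_52_vs_50:.3f}")
print(f"one standard deviation of the share: {share_sd:.4f}")
```

A ratio near 1 for the 54-46 coin means the poll supports it about as well as it supports the fair coin, while the 52-48 coin comes out ahead of both.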
Actual polling is more imprecise because of issues with getting a good sample. The so-called “margin of error” assumes that your sample is a perfect sample of the people who end up voting. There may be a bias within your sample (maybe Democrats are more likely to take your call or something). Some pollsters adjust their sample to fit what they think the population of people who vote will look like (a certain percentage of olds, of youngs, of Democrats, of Republicans, etc.), and that’s hardly a perfect science.
But if you take a whole bunch of different pollsters with different methodologies and look at their numbers in aggregate, you can probably estimate the outcome pretty accurately if you use the right models or simulations. Maybe not in theory in the strictest sense but in practice.
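Here is a toy simulation of why aggregating helps. Everything in it is an assumption made for illustration: a true 52-48 race, ten pollsters who each have their own random "house" bias (standard deviation of 2 points), ten polls per pollster, and 1000 respondents per poll. It is not how Sam Wang or Nate Silver actually build their models, just a sketch of the averaging idea.

```python
import random

def run_poll(true_share, house_bias, sample_size, rng):
    """Observed share for candidate A in one simulated poll."""
    p = min(max(true_share + house_bias, 0.0), 1.0)  # the biased "coin"
    votes = sum(1 for _ in range(sample_size) if rng.random() < p)
    return votes / sample_size

def simulate_election(true_share, n_pollsters, polls_each, sample_size, bias_sd, rng):
    """Return (one single poll, the average of all polls) for one election."""
    biases = [rng.gauss(0.0, bias_sd) for _ in range(n_pollsters)]
    polls = [run_poll(true_share, b, sample_size, rng)
             for b in biases for _ in range(polls_each)]
    return polls[0], sum(polls) / len(polls)

def compare(trials=40, true_share=0.52, seed=0):
    """Average error and number of correct calls, single poll vs. aggregate."""
    rng = random.Random(seed)
    single_err = agg_err = 0.0
    single_hits = agg_hits = 0
    for _ in range(trials):
        single, avg = simulate_election(true_share, 10, 10, 1000, 0.02, rng)
        single_err += abs(single - true_share)
        agg_err += abs(avg - true_share)
        single_hits += single > 0.5   # did the poll pick the real winner?
        agg_hits += avg > 0.5
    return single_err / trials, agg_err / trials, single_hits, agg_hits

single_err, agg_err, single_hits, agg_hits = compare()
print(f"mean error, single poll: {single_err:.4f}")
print(f"mean error, aggregate:   {agg_err:.4f}")
print(f"correct calls out of 40: single={single_hits}, aggregate={agg_hits}")
```

Note that averaging only washes out biases that differ from pollster to pollster. A bias shared by every pollster would survive the average, which is one reason this works "in practice" rather than "in theory in the strictest sense".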
When people say THE POLLS WERE WRONG, DEWEY BEATS TRUMAN, BRADLEY EFFECT, they usually mean 2 or 3 polls, not a hundred. If a hundred polls by several different pollsters all point in one direction, it’s just not that likely the election goes in the other.
And it shouldn’t be surprising that with some knowledge of statistics and election history, a person like Sam Wang or Nate Silver can come up with a reasonably accurate model for forecasting elections.
No one at a bank says “is it worth $1000 or $1200, I don’t know, let’s give up”. They come up with models to price things. And the argument that their models caused the financial crisis, so we shouldn’t trust Nate Silver’s models, is nonsensical. Markets have bubbles because the only measure of price is what people are willing to pay. There’s no good analog of polling in that world.
Reader MK sends along an article about how old-time baseball people hated Nate Silver too. Does this remind you of anything?
PECOTA wasn’t any sort of dark magic. PECOTA was in essence a smart analyst who knows his history, draws defensible conclusions, chips away at uncertainty, and never claims to know more than he actually does. A good pundit, in other words. And that’s what made it and its ilk such a threat to the sports world’s pundit class. Statistics can’t replace good writing, but it can expose the bad, and sabermetrics represented a direct threat to the bad writers who had gotten away with being bad for far too long. These were the writers who used the same old false narratives to reach the same old misguided conclusions. (They used stats, too, incidentally, just the wrong stats—the noisy old metrics like RBIs and batting average.) Sportswriting isn’t a monolith, and many writers like Joe Posnanski combined their experience and access with the new methodology. But a lot of those pundits made their money on that margin of uncertainty in sports, yammering about heart and grit and all that ineffable crap that was never so ineffable that a hack couldn’t write 500 words about it for the early edition. And so they remained in the dark, stubbornly entrenched, missing out on a new way to analyze the game they were paid to follow.