Many people have had their eyes on election polls recently. These polls typically report percentages supporting each candidate, alongside a "margin of error." What does this number mean? If you read the fine print, you'll sometimes see that this margin of error is a "95% confidence interval." Although this makes it sound like the election result should fall between these bounds 95% of the time, this interpretation is wrong.[1]

Multiple Sources of Error

Polls may be off for several reasons: the sample may not be representative of the people who will actually vote, the pollster's assumptions about turnout may be wrong, and even an identical poll re-run on a fresh sample will differ by chance. The reported margin of error addresses only the third concern. It tells you how much the findings might change if you re-ran the poll using the same methodology. It does not tell you whether your poll is capturing a representative sample of voters.

In practice, survey respondents are never perfectly representative of the voting population: they tend to be older, whiter, and more educated. Pollsters tweak their survey methodology to try to minimize this bias, but will never eliminate it. For this reason, every reputable pollster reports a weighted average of responses. If you know that African Americans will make up 10% of the electorate but are only 6% of your poll respondents, then you can think of each African American respondent as speaking for 10/6 ≈ 1.67 voters, and overweight their answers accordingly.
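To make the arithmetic concrete, here is a minimal sketch of this kind of reweighting in Python. Only the 10%-vs-6% figure comes from the example above; the support numbers within each group are made up for illustration.

```python
# Minimal sketch of demographic reweighting with hypothetical numbers.
# Each respondent's weight = (assumed share of electorate) / (share of sample).

sample_share = {"black": 0.06, "other": 0.94}      # who answered the poll
electorate_share = {"black": 0.10, "other": 0.90}  # assumed turnout -- an assumption!
support = {"black": 0.85, "other": 0.45}           # hypothetical support for candidate A

# Unweighted estimate: a plain average over respondents.
unweighted = sum(sample_share[g] * support[g] for g in sample_share)

# Weighted estimate: each black respondent speaks for 0.10 / 0.06 ≈ 1.67 voters.
weights = {g: electorate_share[g] / sample_share[g] for g in sample_share}
weighted = sum(sample_share[g] * weights[g] * support[g] for g in sample_share)

print(f"unweighted support for A: {unweighted:.1%}")  # 47.4%
print(f"weighted support for A:   {weighted:.1%}")    # 49.0%
```

Everything here hinges on `electorate_share`, which the pollster must guess.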
Although the idea is simple, calculating appropriate weights is difficult in practice. One reason is that nobody knows what turnout will be among different demographics: pollsters are forced to make assumptions. A second, more subtle, point is that it's not even obvious which demographics to consider.[2] Finally, there are questions about how to deal with overlapping categories: how much weight should be given to a 23-year-old white male, versus a 64-year-old black female?

Getting into the details of different methodologies is beyond the scope of this post (much to my chagrin). The main point to remember is that even after raw polling data has been collected, pollsters use difficult-to-assess assumptions about turnout to convert this data into a final number. To highlight this point, in 2016 the New York Times gave different polling analysts access to the same raw data, and noted that the resulting predictions varied by 5%.[3] This variation is not captured by the reported margin of error, despite the fact that it (i) is often larger in magnitude and (ii) does not go away as more polls are conducted. It is also the reason why certain pollsters consistently produce results that are more favorable to one candidate or the other.

Take-Home Message

Polls differ not only in who they survey, but also in their assumptions about who will vote. As we get more polling data, we learn how each group will vote, but we don't get any better at predicting how many people from each group will vote. The reported margin of error keeps shrinking, but the error in our assumptions remains. This error is unreported, making the margin of error a misleading statistic.
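To see how these two kinds of error behave, here is a small simulation (a toy model, not any pollster's actual methodology; all numbers are hypothetical). It polls increasingly many voters, computes the standard ±1.96·√(p(1−p)/n) margin of error, and re-analyzes the same responses under two different turnout assumptions.

```python
# Toy simulation: sampling error shrinks with poll size, but the spread
# between turnout assumptions does not. All numbers are hypothetical.
import math
import random

random.seed(0)

TRUE_TURNOUT = {"black": 0.10, "other": 0.90}  # true electorate composition
TRUE_SUPPORT = {"black": 0.85, "other": 0.45}  # true support for candidate A

# Two turnout models a pollster might plausibly assume.
SCENARIOS = {
    "higher black turnout": {"black": 0.12, "other": 0.88},
    "lower black turnout": {"black": 0.07, "other": 0.93},
}

def run_poll(n):
    """Poll n random voters; return observed support for A within each group."""
    asked = {g: 0 for g in TRUE_SUPPORT}
    yes = {g: 0 for g in TRUE_SUPPORT}
    for _ in range(n):
        g = "black" if random.random() < TRUE_TURNOUT["black"] else "other"
        asked[g] += 1
        yes[g] += random.random() < TRUE_SUPPORT[g]
    return {g: yes[g] / max(asked[g], 1) for g in asked}

for n in [500, 5_000, 50_000]:
    by_group = run_poll(n)
    moe = 1.96 * math.sqrt(0.25 / n)  # worst-case margin of error at p = 0.5
    estimates = [
        sum(turnout[g] * by_group[g] for g in turnout)
        for turnout in SCENARIOS.values()
    ]
    spread = max(estimates) - min(estimates)
    print(f"n={n:6d}  margin of error: ±{moe:.1%}  scenario spread: {spread:.1%}")
```

The margin of error falls from about ±4% at n = 500 to under ±0.5% at n = 50,000, while the gap between the two turnout scenarios stays at roughly 2 points. Reporting the per-scenario estimates, as suggested below, would at least make that persistent gap visible.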
To better predict election outcomes, what we really need is not more polls, but better models of who is going to vote. This seems hard in normal years, and especially so given that (i) we are in the middle of a global pandemic in which record numbers of people are voting early or by mail, and (ii) one of our major political parties is openly working to make sure those votes don't count. In the short term, we may not be able to eliminate polling errors, but we should at least try to communicate them more accurately. One step would be for pollsters to be more transparent about the assumptions that they use. Another approach (which might be easier for people to understand) would be to conduct analyses under different assumptions about turnout, and report results for each scenario. If you have other ideas for how to reduce polling error or communicate it more effectively, please post below!

[1] There is a subtle statistical reason that this is wrong, which is not the focus of this post. Even under ideal circumstances, the claim that the confidence interval contains the true outcome 95% of the time holds ex ante, and not for each realization of the confidence interval. In practice the more important point (discussed in this post) is that we are far from ideal circumstances, and there is no reason to believe that confidence intervals will contain the true outcome 95% of the time.

[2] For example, most pollsters in 2016 corrected for race, age, and gender, but did not account for level of education. College-educated voters were more likely to vote for Clinton and more likely to respond to polls, causing some polls to overstate support for Clinton. Although most polls now take level of education into account, there are an endless number of other possible factors to consider.

[3] Sadly, the most accurate prediction in hindsight was one made by a team that included my classmate Sam Corbett-Davies, which showed Trump with a 1-point lead.