(Editor’s Note: If you want to skip right to our forecasts, they’re at the end. But we highly recommend that you read the preamble for background and context.)
We started CivicScience five years ago to develop new ways to measure public opinion at a time when traditional methods of polling were becoming more and more difficult to sustain. Landline phone ownership continues to decline and fewer people have the time or inclination to respond to lengthy surveys, which means there is little remaining randomness in who participates. Meanwhile, pervasive advertising, biased 24-hour news cycles, and group-think social media can cause public sentiment to shift sometimes daily. The longstanding model of calling people on the telephone is broken and not likely to get any better.
We were far from the first people to see this coming. The advent of Robo-calling aimed to reduce the cost of polling so that more people could be called more often. Larger firms began augmenting their data with cell-phone respondents to put expensive duct tape on a romantic but fleeting obsession with random probability. Pioneers like Doug Rivers at YouGov introduced new models derived from captive “panelists,” recruited to answer online surveys in return for rewards. But the costs associated with panelist incentives, combined with biases among the people who have the time and inclination to join panels (Do you belong to one?), put a potential ceiling on this creative method.
Now, we are seeing a new frontier in opinion research, spearheaded by rock stars like Nate Silver and websites like RealClearPolitics.com. These innovators surmise that inherent flaws in traditional polling can be normalized by combining all the results published by reputable firms and producing an average of some sort. This is the first time we have seen the “Law of Big Data,” which suggests that more data can outperform clever algorithms, applied to opinion research. The problem with poll-average models, however, is that they merely aggregate disparate, small samples of non-representative responses, combined with precarious reweighting techniques. Mashing these polls together does not mean that Silver and others are solving the underlying problem: as phone-based polling becomes less reliable, so too will the resulting averages and the forecasts that depend on them.
The CivicScience approach, while in some ways radically different, is the next stage in an evolution that moved from Gallup to PPP to YouGov to Nate Silver. Like Gallup, we believe in the fundamental premises of science and we work to achieve as much engagement, randomness, and representativeness as we can muster. Like PPP, arguably the leading Robo-calling firm in the country, we believe in speed, near-constant measurement, and reducing fixed costs for research. Like YouGov and Doug Rivers, we believe that the web represents a better way to engage more people than by calling them on the telephone. And, like Nate Silver, we believe that more data is better and that by applying advanced techniques in data mining, we can find signals and correlations that might otherwise be overlooked by the naked, politically-contaminated eye.
But we take everything a step further. By polling millions of people every week, meticulously organizing the data we collect, and automating the way those data are analyzed, we aspire to be the first true “Big Data” polling firm. Consider some of these numbers for context:
- In the past two years, we have collected over 191,000,000 poll responses from over 13,800,000 unique respondents, segmented by demographics, geography, and consumer behavior.
- Since January of 2012, we have collected over 16,300,000 responses to a total of 337 different poll questions related to campaigns, politics, policy, and ideology, including:
- 2,240,018 observations on voters’ reaction to specific negative campaign claims
- 619,539 observations on voters’ exposure to specific campaign ads
- 581,287 observations on President Obama’s approval rating
- 516,333 observations on who voters predicted would win the Presidential election in their state
- 311,765 observations on voters’ intended choice in the Presidential election
- 287,611 observations on intended choice in the Republican Presidential Primary between February 8th and April 30th.
- 202,362 observations on who won the three Presidential and one VP debate, all collected within 24 hours of each debate.
- Over 100,000 observations each on sentiment toward policy issues like energy, consumer privacy, Voter ID regulations, charter schools, health care, government spending, public education, illegal immigration, and dozens more.
- Over 58,000 observations each on media behaviors, including how much people like Jon Stewart, Glenn Beck, and Donald Trump, and which TV networks and newspapers they prefer
- Over 10,000 observations on key statewide races ranging from the US Senate elections in Missouri and Virginia to the Auditor General’s race in Pennsylvania.
- Yesterday, we asked 85,798 people how likely they were to vote today. As of 6pm today, we asked 49,24 people if they have voted yet.