Baseball Crank
Covering the Front and Back Pages of the Newspaper
November 15, 2012
POLITICS: Sometimes, It Really Is Different This Time - A Polling Post-Mortem (Part II of III)

The second part of my 3-part post-mortem on the polls and the 2012 election. See yesterday's Part I here.

IV. Likely vs. Registered Voters

A. Pay No Attention To The Man Behind The Screen

Near the heart of every major polling controversy this year was the issue of sampling and likely-voter screens. Polls traditionally report results among either "all adults" (whether or not registered to vote), "registered voters," or "likely voters." Historically, there's a well-recognized pattern: all-adults and registered-voter polls have tended to skew a couple of points in favor of the Democrats, and in the past that skew has usually come at the cost of accuracy. The pattern was especially pronounced this year; Bob Krumm noted before the election that Obama's strength in national polls was directly correlated with how lenient each poll's likely-voter screen was, and Nate Silver found the same effect in August, concluding:

It's all a bit of a mess, frankly. I suspect that part of the problem is that polling firms are applying likely voter methods that might have been designed 30 years ago to a modern polling universe of extremely low response rates (even the most thorough polling firms can only get about 10 percent of voters to return their calls), cellphone-only households, and an increasingly diverse and partisan electorate - and that is producing erratic and unpredictable results. There's always some uncertainty about just who will turn out to vote, but there is more of it than usual this year.

He also noted at the same time that Obama's support was strongest among those poll screens considered least likely to vote.

Those screens have worked in the past; if they didn't, the poll averages would have been useless a long time ago. This is why one of the regular rules of thumb in reading polls is that if a campaign is citing polls of registered rather than likely voters, especially late in the campaign, it's doomed. Yet that's exactly what Obama supporters were doing in the closing weeks, and more or less what Jim Messina was saying even after the election was over. For once, the registered-voter numbers were more accurate than the polls that put rigorous effort into likely-voter screening. The question is whether the pollsters actually had a good reason to do this, or whether they just got awfully lucky.

Part of what is supposed to make polling valuable is pollsters' ability to judge which voters are likely to show up to vote. They get to the likely-voter number by first, constructing a sample of registered voters, and second, applying a series of screening questions to determine which of those voters is likely to vote. That's what Messina was talking about in his argument that "traditional polling" was "broken" - the Obama campaign's theory throughout the election was that pollsters using the kinds of likely voter screens that have worked in the past (like Gallup and Rasmussen) would be wrong this time. The lesson we learned, at least this year, is that Messina was right and the traditional, professional pollsters were wrong - and that the nature of Obama's coalition made the application of likely voter screens particularly likely to affect the accuracy of the polls.

But determining which voters are likely to vote is the part of polling that is most inherently subjective and least scientific. Moreover, pollsters are often not very forthcoming about how they make these determinations, so sometimes when they release a poll, the best you can do is compare the number of registered and likely voters in the sample. As a result, a certain amount of deductive work is required to figure out why some polls give different results from others. There's only so much we can know from the outside, but it appears that pollsters like Gallup and Rasmussen were, for much of the cycle, using likely-voter screens that made traditional assumptions about who would make the effort to vote - and those assumptions just didn't hold this year, as illustrated by Gallup's final poll envisioning an electorate that was 78% white (the exits said 72%). By contrast, a number of the polls that were later vindicated were reporting results that defied all historical precedents, classifying as many as 99% of registered voters as likely voters. Their process seemed problematic precisely because it was so different from the things that made polls trustworthy in the past, but they got results.

One of the pollsters that projected a Democrat-friendly electorate and ended up getting high marks in the post-election rankings of final polls was PPP, a Democratic pollster employed by SEIU and Daily Kos, among other clients, not all of whose identities are known. PPP's overall accuracy throughout election cycles is a longer story, but they did end up having a good record at the very end. Here's Tom Jensen, the principal of PPP, discussing how his firm determines who is likely to vote:

Jensen conceded that the secret to PPP's success was what boiled down to a well informed but still not entirely empirical hunch. "We just projected that African-American, Hispanic, and young voter turnout would be as high in 2012 as it was in 2008, and we weighted our polls accordingly," he explained. "When you look at polls that succeeded and those that failed that was the difference." Given the methodological challenges currently confronting pollsters, those hunches are only going to prove more important. "The art part of polling, as opposed to the science part," Jensen said, "is becoming a bigger and bigger part of the equation in having accurate polls."

In other words, the successful pollsters this cycle were doing exactly the same thing the poll skeptics were doing: making a more-or-less informed guess as to what the electorate would look like and weighting their results to match that. As Neil Stevens notes:

I don't remember anyone willing to say PPP was actively rigging the polls to reach chosen results, but there it is in black and white. Jensen decided in advance what he wanted the electorate to look like, and so tweaked the numbers until he got what he wanted. This isn't a whole lot different from what Research 2000 admitted to doing, folks.

In science, it's not just that you got the answer you wanted. It's the process that matters.

Research 2000, as you may recall, was PPP's predecessor as DailyKos pollster, but had to be canned for more or less manipulating its data to get to results it wanted; Kos eventually sued them for fraud, which was settled out of court. Here's what Nate Silver had to say about R2k at the time:

[I]n practice, a pollster will usually have enough knobs to twist between likely voter screens, weighting and sampling assumptions, etc., that they could back into almost any result they wanted more often than not. But there would usually be some scientific pretense for it.

In fact, Jensen's hunches changed over the course of the race. Sean Davis calculated the demographic composition of PPP's Florida polls over the course of the race, yielding the following percentages of white voters:

4/17: 71% ... 6/5: 70% ... 7/3: 69% ... 9/12: 70% ... 9/23: 69% ... 10/14: 66% ... 10/28: 64% ... 11/5: 66%

From April until September 23, PPP assumed an average white vote of 69.8%. From October 14 through November 5, PPP assumed an average white vote of 65%. What changed? Who knows?
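Those two averages can be reproduced directly from Davis's percentages (the later stretch actually works out to about 65.3%, which the text rounds to 65):

```python
# Averages of PPP's white-voter share in its Florida polls, using the
# percentages from Sean Davis's figures quoted above.
early = [71, 70, 69, 70, 69]   # 4/17, 6/5, 7/3, 9/12, 9/23
late = [66, 64, 66]            # 10/14, 10/28, 11/5

avg_early = sum(early) / len(early)
avg_late = sum(late) / len(late)
print(f"April through September 23: {avg_early:.1f}% white")
print(f"October 14 through November 5: {avg_late:.1f}% white")
```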

It's entirely possible, of course, that Jensen has some other source of information he's not disclosing here. But taking him at his word, the "poll averages are science!" crowd should have just a little more humility about exactly what it is that they are placing their unquestioning faith in: Jensen believed that this year's electorate would favor the candidate he favored, and he was right - but right in roughly the same way pundits are right when they say their side will win. Nuclear physics, this is not.

Once Jensen set his targets, he abandoned the likely-voter screening that has worked in the past - while firms that clung to it got burned:

How PPP got it right while others, including polling titans Gallup and Rasmussen, got it so wrong goes back to a difference in method for how the firms identify likely voters and how long they conduct a poll.

Rasmussen, for example, conducts most of its polls in one night - a problem, Jensen said, because many of the voters who typically lean Democratic (including African-Americans, Latinos and young voters) are more difficult to reach in a single night. Meanwhile, Gallup uses a complicated screen with numerous questions to determine which voters are likely to turn up at the polls.

Like Rasmussen, PPP uses robocalling to conduct its polls. But its screen is much simpler than either of the other polling firms.

"We have a very simple likely voter screen," Jensen said, "'If you don't plan to vote in this fall's election, hang up now.'

"What we find is that if you're someone who's not willing to take the time to answer a telephone poll, you probably aren't going to vote. But if you are willing to take the time to answer a telephone poll, you probably are going to vote. So it's a much less-complicated voter screen than somebody like Gallup or Rasmussen has, but I think that it's a better barometer of the electorate."

It also means the likely voter screen - like Jensen's hunches - is completely opaque to the consumer of the poll. All you see is the opinions of the people Jensen decides should be polled, and who agree to talk to him.

Jensen's not the only one in his industry who describes a process that is less and less hard-science:

Even pollsters themselves conceded that the combination of demographic and technological changes had made their supposed science more inexact than ever. "We're in sort of what I would call polling's dark age," Jay Leve, who runs the polling firm Survey USA, told me earlier this fall. "We're coming out of a period of time where everyone agreed about the right way to conduct research, and we're entering into a time where no one can agree what the right way to conduct research is."

Jason Zengerle draws the obvious conclusion:

[Nate Silver's] appeal, of course, is that he's scientific. And last night, his science worked because the polls themselves worked. But as polls become more art than science, Silver's approach could become more tenuous. The good thing about pollsters - at least the good ones - is that they're constantly reassessing and tweaking their approaches. That's the bad thing, too, at least when it comes to having any certainty about how they'll perform in the future.

B. Why The Screen Mattered So Much This Time

Why did these differences in projecting the electorate matter? In an ordinary election, they would not: ordinary winning presidential candidates have a broad enough base of support that you can see it coming pretty clearly without needing the right "hunch." But Obama was not an ordinary winning presidential candidate. The racially polarized electorate of the Obama era means that every slight shift in demographics can have an outsized effect on outcomes.

Let's look at what the exit polls tell us. 81% of the electorate was voters age 30 and up; Romney won those by 2 points, 50-48. Drilling into the state-by-state exit polls, here's a map of what the election would have looked like just among voters age 30 and up - losses with young voters cost Romney six states worth 95 electoral votes, more than enough to flip the election:

[Map: the electoral map among voters age 30 and up]

Historically, that is game-set-match; the last candidate to win the national popular vote while losing voters age 30 and up was Jimmy Carter in 1976:

[Chart: candidates' margins among voters age 30 and up, by election year]

Another 11% of the electorate was white voters under 30; Romney won those too, by 7 points, 51-44. These were Paul Ryan's "faded Obama posters" voters - they swung 17 points from Obama winning them by 10 in 2008. Obama's pop culture cachet with young white voters had worn off by 2012 in the face of his record. That's 92% of the electorate accounted for, and Romney up 50-48 and with a decisive lead in electoral votes. In other words, the 8% of the electorate consisting of non-white voters too young to have voted in the Bush v. Gore race in 2000 accounted for the entirety of Obama's national margin of victory.
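The 50-48 figure for those combined groups can be checked with a quick weighted average, using only the shares and margins quoted above:

```python
# Weighted combination of the two exit-poll groups cited above: voters 30
# and up (81% of the electorate, Romney 50-48) plus white voters under 30
# (11%, Romney 51-44), covering 92% of the electorate between them.
groups = [
    # (share of electorate, Romney %, Obama %)
    (0.81, 50, 48),   # voters age 30 and up
    (0.11, 51, 44),   # white voters under 30
]
total_share = sum(share for share, _, _ in groups)
romney = sum(share * r for share, r, _ in groups) / total_share
obama = sum(share * o for share, _, o in groups) / total_share
print(f"Among {total_share:.0%} of the electorate: Romney {romney:.0f}, Obama {obama:.0f}")
```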

The "gender gap" was similarly a feature of race and racial turnout patterns. Romney won white women, who made up 38% of the electorate, by 14 points, 56-42; this was the biggest margin of victory among white women since Reagan in 1984. Obama in 2008 was the first winning candidate since Carter in 1976 to lose white women, but Carter lost them by 6, Obama last time by 7. Yet, Romney lost women overall by 11, 55-44. Why? He lost non-white women 85-15, including Hispanic women 76-23 and black women 96-3. Among non-white voters, Obama again maximized the group most favorable to him: black and Hispanic women were 14% of the electorate, compared to 10% black and Hispanic men, both of which Romney lost by less severely lopsided margins (Obama won black men 87-11 and Hispanic men 65-33). In other words, the black and Hispanic segment of the electorate was something on the order of 58% female; black voters were over 60% female. Romney had no similar redoubt of lockstep support - exit polls showed that even among Mormon voters, he didn't crack 80%. So polls measuring turnout had to match two highly asymmetric campaigns, one winning majority groups with support in the 50s and 60s, the other winning much smaller groups by enormous margins.
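The "something on the order of 58% female" figure is simple arithmetic on the exit-poll shares just quoted, worth making explicit:

```python
# Back-of-the-envelope check of the gender split above: black and Hispanic
# women were 14% of the electorate, black and Hispanic men 10%.
women_share = 14.0
men_share = 10.0
female_fraction = women_share / (women_share + men_share)  # ~0.583
print(f"The black and Hispanic electorate was roughly {female_fraction:.0%} female")
```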

You can slice the exit polls a few different ways and see similar results along racial lines (without reference to age or gender). Nate Cohn notes that black turnout in general was key to winning Ohio, as black voters were 15% of the electorate there, up from 11% in 2008. If you look solely at white and black voters and leave out Obama's margins with Hispanic, Asian, and Native American voters, Romney wins five states he lost - Florida, Colorado, Nevada, Pennsylvania (very narrowly) and New Mexico - enough to swing the race:

[Map: the electoral map counting only white and black voters]

Of course, white and black voters together are only 53% of the vote in New Mexico. Neither a campaign nor a poll analysis can safely ignore such segments of the electorate. But my point is that, because Obama's margin of victory (both nationally and in the critical swing states) was entirely the result of his outsized margins with very narrow but homogenous segments of the electorate, the accuracy of polls was highly sensitive to the relative size of this segment in turnout compared to other voters.

Yet voters under 30 in particular have rarely been a reliable source of turnout; for years, a campaign that was losing the rest of the electorate and staking its entire faith on high turnout from young voters was, almost invariably, a losing campaign. Even Obama in 2008 didn't do that: he won voters over 30, independents, and young white voters handily. His coalition was broader then, before he had a record.

It is true that Carter set a precedent in 1976 for appealing to the under-30 voters. But thanks to the Baby Boomers, the oldest of whom were just hitting 30 at the time, voters under 30 were 32% of the electorate in 1976; today, thanks to shrinking birthrates and a graying population, they are just 19% and demographically likely to decline even further:

[Chart: under-30 voters as a share of the electorate, by election year]

The last year in which under-30 voters were 20% of the electorate was 1992, not coincidentally the last election less than 20 years after Roe v. Wade. Take away 50 million abortions, and the demographics of the electorate look quite different in a race where the winning candidate will end up with around 62 million votes. But while young voters are less numerous and traditionally a below-average turnout group, Obama for the second straight election cycle managed to increase them as a share of the electorate, closing in on their share of the population (Census data from 2000, 2004, 2008 and 2010 show 18-29 year-olds as a steady 22% of the voting-age population; by contrast, with the Boomers graying, 30-44 year-olds dropped in that time from 31% to 26% of the population, while 45-64 year-olds rose from 30% to 35%). Here's the major age groups' turnout relative to their share of the general voting-age population:

[Chart: age groups' share of turnout relative to their share of the voting-age population]

And while the smallness of the age breakdowns among black and Hispanic voters creates rounding-error issues that make the math a little fuzzy, this chart illustrates rather vividly that the proportion of young voters among non-white voters as a whole was much, much larger than the proportion of young voters among white voters:

[Chart: the age distribution of voters within each racial group]

Voters under 30 made up somewhere north of a third of all Latino voters, compared to less than 15% of all white voters. Partly that, too, is demographics; the median age of Hispanics is 27 compared to 42 for white non-Hispanics. But it's also the case that OFA maximized the showing of the few loyal groups that provided its entire margin of victory. Rasmussen came to a similar conclusion in evaluating why his polls were off:

A preliminary review indicates that one reason for this is that we underestimated the minority share of the electorate. In 2008, 26% of voters were non-white. We expected that to remain relatively constant. However, in 2012, 28% of voters were non-white. That was exactly the share projected by the Obama campaign. It is not clear at the moment whether minority turnout increased nationally, white turnout decreased, or if it was a combination of both. The increase in minority turnout has a significant impact on the final projections since Romney won nearly 60% of white votes while Obama won an even larger share of the minority vote.

Another factor may be related to the generation gap. It is interesting to note that the share of seniors who showed up to vote was down slightly from 2008 while the number of young voters was up slightly. Pre-election data suggested that voters over 65 were more enthusiastic about voting than they had been four years earlier so the decline bears further examination.

As Rasmussen notes, the demographic shift from 2008 could be higher non-white turnout, or lower white turnout (or both). Sean Trende has estimated that white voter turnout was down in absolute terms and in particular in proportion to white Americans' share of the voting-age population:

Had the same number of white voters cast ballots in 2012 as did in 2008, the 2012 electorate would have been about 74 percent white, 12 percent black, and 9 percent Latino (the same result occurs if you build in expectations for population growth among all these groups). In other words, the reason this electorate looked so different from the 2008 electorate is almost entirely attributable to white voters staying home. The other groups increased their vote, but by less than we would have expected simply from population growth.

Put another way: The increased share of the minority vote as a percent of the total vote is not the result of a large increase in minorities in the numerator, it is a function of many fewer whites in the denominator.

The 74% would be in line with Rasmussen's assumptions, which were more reasonable than Gallup's projection of a 78% white electorate. Byron York has more on the collapse of white voter turnout in Ohio by about 200,000 voters, which led to Romney getting fewer total votes there than John McCain in 2008.
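Trende's counterfactual can be sketched with rough numbers. The ballot totals below are illustrative assumptions, not his figures (roughly 131 million votes cast in 2008, about 74% white per that year's exit polls, and roughly 125 million counted in 2012); with them, holding white turnout at its 2008 level puts the white share of the counterfactual electorate within a point of his 74 percent:

```python
# Rough reconstruction of Sean Trende's counterfactual: hold the number
# of white voters at its 2008 level, keep every other group's actual 2012
# vote, and see what the electorate's racial mix would have looked like.
# The ballot totals are illustrative assumptions, not Trende's figures.
VOTES_2008 = 131e6          # approx. total ballots cast in 2008 (assumption)
WHITE_SHARE_2008 = 0.74     # 2008 exit polls
VOTES_2012 = 125e6          # approx. ballots counted by mid-November 2012 (assumption)
SHARES_2012 = {"white": 0.72, "black": 0.13, "latino": 0.10, "other": 0.05}

white_2008 = VOTES_2008 * WHITE_SHARE_2008   # ~96.9M white voters
counterfactual = {g: VOTES_2012 * s for g, s in SHARES_2012.items()}
counterfactual["white"] = white_2008         # swap in 2008-level white turnout

total = sum(counterfactual.values())
for group, count in counterfactual.items():
    print(f"{group}: {count / total:.1%}")
```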

The actual proportions in the voting-age population depend on how you read the Census data and break it out to exclude those below voting age. The Census showed non-Hispanic whites as 63.7% of the overall population (of all age groups) in 2010, compared to 69.7% in 2000, dropping to 63.4% in the 2011 Census estimate. Looking at Pew Hispanic Center data, the nation's 215 million eligible voters are 72% white, 13% black, 11% Hispanic and 4% Asian; the electorate was 72% white, 13% black, 10% Hispanic and 3% Asian, which when you do the math means that 59% of eligible black voters voted, compared to 58% of eligible white voters, 53% of eligible Hispanic voters and 41% of eligible Asian voters. In other words: high black voter turnout, especially by historic standards; low Hispanic and Asian turnout, but rising by historic standards and in particular rising relative to the rest of the electorate:
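That turnout arithmetic can be sketched directly. The ~125 million ballot total is an assumption here (votes were still being counted in mid-November), and the published shares are rounded, so the results are only approximate - for instance, rounded shares can't reproduce the 59%-versus-58% split between black and white turnout:

```python
# Sketch of the turnout arithmetic above: a group's turnout rate is
# (its share of actual voters x total ballots) divided by
# (its share of eligible voters x eligible population). Eligible-voter
# shares come from the Pew figures quoted in the text; the ballot total
# is an assumption, and the rounded shares make results approximate.
ELIGIBLE = 215e6
TOTAL_VOTES = 125e6  # assumption

eligible_share = {"white": 0.72, "black": 0.13, "hispanic": 0.11, "asian": 0.04}
electorate_share = {"white": 0.72, "black": 0.13, "hispanic": 0.10, "asian": 0.03}

rates = {}
for group in eligible_share:
    rates[group] = (electorate_share[group] * TOTAL_VOTES) / (eligible_share[group] * ELIGIBLE)
    print(f"{group}: roughly {rates[group]:.0%} of eligible voters turned out")
```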

[Chart: Hispanic voter turnout by election year]

Geographically, this map shows the distribution of states, with the higher percentages of non-Hispanic whites in darker blue:

[Map: states shaded by percentage of non-Hispanic whites, higher percentages in darker blue]

Leaving race and age aside, other aspects of the exit polling, like the pre-election national polls and the internals of pre-election state polls, mostly present a picture of an incumbent president doomed to defeat in any ordinary political environment in recent memory. As noted, Romney won independents by 5 points; the last candidate to lose independents by more than 2 points and win the presidency or the popular vote was, again, Carter in 1976, who lost independents by 11 points but took advantage of a depressed, decimated and divided Republican base in the aftermath of Watergate and Reagan's primary challenge to Ford. With the economy the number one issue throughout the election, voters told exit pollsters they trusted Romney more than Obama, albeit narrowly, 49-48. 77% of exit poll respondents said the economy was in not so good or poor shape, and Romney won those voters by 22 points, 60-38. 59% of the voters cited the economy as the top issue; Romney won them 51-47, plus winning 66-32 among the 15% of voters who cited the budget deficit. And this exit poll question was perhaps the most dramatic of all:

[Chart: exit poll results on which candidate quality mattered most, including "cares about people like me"]

In basically any American election before 2012, I would tell you with great confidence that a candidate, much less an incumbent, is toast if he (1) loses independents; (2) loses voters age 30 and up; (3) loses white women by double digits; (4) loses white voters under 30; (5) is less trusted than his opponent on the economy when 59% of voters cited the economy as the dominant issue in the election; and (6) loses voters who prioritized leadership, strong values or a vision for the future. That has never before been the electoral profile of a winning candidate. As I said before the election, if that changed, we would need to rethink everything we know about elections.

We do - at least, for now, when Obama's on the ballot. The OFA theory of the electorate was that "really, it's different this time" - that neither the economic doldrums nor any other factor would dampen the historic levels of enthusiasm for Obama among non-white voters under 30. And as it turned out, OFA was right: they turned out in numbers totally out of step with their historic turnout patterns relative to their share of the voting-age population, and delivered the entirety of Obama's national margin of victory.

Viewing this, to some extent, I feel like a guy who shorted the NASDAQ in 1998, reasoning that the tech bubble couldn't last forever, and ended up getting mocked by the guys who watched all their "new economy" stocks rise without an end in sight. History teaches us that those guys, of course, eventually lost their shirts - but they were awfully proud of their "this time, it's different" reasoning for quite a while. Time will tell if the believers in the new Democratic turnout model go the same way.

If it does, how will we know? Will polls run by partisan Democrats like Tom Jensen readjust their hunches? Or will we have to look outside the polls? I will look at these questions tomorrow in Part III.

Posted by Baseball Crank at 10:40 AM | Politics 2012 | Poll Analysis | Comments (4) | TrackBack (0)
Comments

I think this is a good piece of analysis--but I want to note a few things.

1. While it's true that PPP was making assumptions about the electorate just like Rasmussen or Gallup were, the issue is not that PPP took a guess and Rasmussen took a guess and one happened to be right. Looking at just two polls makes it look like a coin-toss, but it clearly wasn't: YouGov, for example, did very well. Reuters too, and so on. Across the spectrum, as I'm sure you know from the Nate Silver analysis (as well as others), there were great, good, and just plain bad assumptions. Rasmussen and Gallup fell into the latter grouping.

The point of averaging is to mitigate extreme choices on the part of pollsters and, again, as you know, the averages were on target. So not *everything* has changed: Averaging polls has been a good way to look at them for a long time.

2. Your most powerful single paragraph is your 6 points. These are key and make a strong statement--but I think there's a reverse statement that needs to be examined too. If someone tells me that 2008 was a fluke and 2010 was the correction -- as though mid-term elections and presidential elections are the same thing -- I have to question that. I think the more likely story is that 2008 was evidence of a new electorate and that the trends there (such as Hispanics being wary of the GOP or a higher than historic black turn-out with a landslide in the direction of a black president) are going to continue and in some cases maybe even accelerate.

In that light what actually happened is not so absurd (and the evidence that it's not so absurd is that, again, a number of pollsters--to varying degrees--got it right). So, too, did betting markets which are certainly fed by the polls but should not be entirely lumped in with them.

Posted by: The Political Omnivore at November 15, 2012 12:25 PM

You are still holding on to this accusation that polls like PPP "were doing the same thing" as you were when you started your analysis from the Party ID data.

This is incorrect. Party ID is attitudinal data. It is a result, not a starting point.

Polls like PPP weight based on hard demographics like race, region, and age, not Party ID. Party ID is a result of their polls, not a cause of it. No reputable pollster would ever weight their polls based on attitudinals like that.

Posted by: everdiso at November 15, 2012 11:05 PM

Crank,

If the polls have sampling errors due to outdated methods, as your point A. contends, why did Nate Silver use traditional polls to come up with his 100% accurate prediction?

You still seem to be stumbling to find an answer as to why you blew it. It doesn't matter how you rationalize it after the fact -- you simply blew it.

Part of why you blew it cannot be captured in the data analysis. If you live in a conservative bubble, where all you read, watch, or listen to is GOP-centered, then that leads itself to a false reality.

In the end, who cares why or if you blew it? Move on to the Hot Stove League and/or downplay your political crap.

Posted by: Who Cares at November 16, 2012 8:20 AM

I fail to see how winning now does anything for Republicans. They had a decade of the majority and the presidency, and look what that got us. I guess Congress will keep changing hands every two years until? Until what?

Posted by: Lorenso at November 22, 2012 10:37 AM