Before you go any further, I commend to you the following URL:
http://slate.msn.com/id/83375/
The authors explain why the model you're developing is likely to be worthless.
Even more intriguing is the sidebar that was published with this article:
http://slate.msn.com/id/83375/sidebar/83381/
The authors develop a model which has predicted the two-party voting percentage for elections since 1968 more accurately than any of the academic models out there. For example, while the academic models predicted Al Gore would get between 53 and 60% of the two-party vote in the 2000 election, this model gave Gore a bare 50.1% majority - just 0.1% off the actual result. It's an excellent result, considering that the only variables in the model are the number of points scored by the losing team in the most recent Super Bowl and whether the most recent Summer Olympics suffered a boycott.
Posted by Jonathan at October 25, 2003 01:46 PM
Oh, I know there is a good chance that it will be off, I've read it.
And in Rosenstone it makes it clear that most everybody, regardless of whether they were using econometrics or political science or reading tea leaves -- almost everybody seriously screwed the pooch in 1976. And as I note implicitly, not even my model could correctly guess the results in 1976 or 1964 in the South (although as I do also note, I have a strong hypothesis as to why) without the addition of arbitrary dummy variables.
(Arbitrary dummy variables make baby Jesus cry).
BUT - while the record is not perfect, it is significantly better than random guessing. And as Zaller/Bartels note in their paper, it really comes down to variable selection and how models are weighted. As soon as I can develop an easier way to implement Bayesian Model Averaging (I tried this summer, and failed) I intend to do so with this.
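A minimal sketch of what the averaging step could look like, assuming the common BIC approximation to posterior model probabilities and equal prior weight on every candidate model. The function names, predictions, and BIC values are all hypothetical and are not taken from the model discussed here.

```python
import math

def bma_weights(bics):
    """Approximate posterior model probabilities from BIC values.

    Assumes equal prior probability for every model; weight_i is
    proportional to exp(-0.5 * (BIC_i - min BIC)).
    """
    best = min(bics)  # subtract the minimum for numerical stability
    raw = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(raw)
    return [r / total for r in raw]

def bma_predict(predictions, bics):
    """Weighted average of the per-model predictions."""
    weights = bma_weights(bics)
    return sum(w * p for w, p in zip(weights, predictions))

# Hypothetical inputs: three models' predicted two-party vote shares (%)
# and the BIC each model achieved on the historical data.
predictions = [51.2, 49.8, 50.5]
bics = [210.0, 212.0, 215.0]

weights = bma_weights(bics)
averaged = bma_predict(predictions, bics)
```

The model with the lowest BIC gets the largest weight, but the others still pull the averaged prediction toward their own, which is the point of averaging rather than picking a single "best" specification.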
Moreover, modeling has value because it helps to make generalizations about what factors influence voting behavior. You may notice in my post I say that a candidate's home state advantage is likely to be about four points. That comes from my model (or at least the two linear models; the GLMs are harder to interpret because they are based on log-odds and such), which estimates the value to be 4.1-4.4 percent. Interestingly, this is in line with previous estimates of this value (I remember reading the 4 percent figure in one of the papers referenced in Zaller & Bartels 2001).
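To illustrate why a log-odds coefficient is harder to read than a linear one, here is a small sketch assuming a GLM with a logit link. The coefficient value 0.17 is hypothetical, chosen only for illustration; it is not an estimate from the models above.

```python
import math

def logistic(x):
    """Inverse of the logit: maps log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    """Maps a probability to log-odds."""
    return math.log(p / (1.0 - p))

def effect_in_points(baseline_prob, coef):
    """Change in probability (in percentage points) from adding `coef`
    to the log-odds at a given baseline probability."""
    return 100.0 * (logistic(logit(baseline_prob) + coef) - baseline_prob)

HOME_STATE_COEF = 0.17  # hypothetical log-odds coefficient

effect_tossup = effect_in_points(0.50, HOME_STATE_COEF)  # close state
effect_safe = effect_in_points(0.90, HOME_STATE_COEF)    # safe state
```

In a linear model the coefficient is the same number of points everywhere. Under a logit link the same coefficient moves a toss-up state by roughly four points but a safe state by far less, so there is no single "home state advantage" figure to quote without picking a baseline.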
It's worthless to interpret this model as a statement of absolute fact. I try to qualify (and quantify) its predictive ability with the 1960 and 1956 postdictions (and as I noted, I wouldn't be surprised if I called 5 states, or so, wrong). I did this in part because in another recent post I rapped John Lott for not trying to qualify his model's predictive capability.
I have tried my hardest to implement good statistical practice, making sure that each variable is both (a) reasonably in line with established voting theory and (b) statistically significant (using t statistics).
For these reasons I feel what I have been working on for the last four months has educational value and, to a lesser extent, decision-making value.
And I hoped it would be interesting.
Although back to your note -- the Slate article does correctly take on some professors who made claims about a Gore victory in 2000. Again, as Zaller/Bartels note, there are good reasons why those claims didn't work - because they were basing the claims on the "wrong" econometric variables. More importantly, though, what was wrong about Abramson and others touting a certain Gore win was that, quite simply, they tried to make their prediction sound too authoritative.
I certainly am willing to admit - as I noted in the opening part of the entry - that this is just a "stab" in the dark. Hopefully, it will be a very good stab -- an educated guess (in the language of science) -- as to what will happen.
UPDATE: The Excel and HTML files are now online. Sorry about the delay.
Posted by Jim D at October 25, 2003 02:22 PM
Speaking of bookies, check out www.tradesports.com, where you can bet on who will win each state. In general the posted odds are more favorable to Bush. In particular, Bush is favored in all your "leaning slightly to Democrats" states and also in Michigan, Minnesota and Delaware. Bush is given a 50-55% chance in Maine (as opposed to your 14.9%) and 17-22% in Rhode Island (as opposed to your 0.1%). Overall Bush is given 61-63%.
Posted by James B. Shearer at October 25, 2003 05:18 PM
Demographics, voter turnout and population shift dramatically from year to year. This model does not strike me as particularly well grounded from an empirical standpoint.
Posted by dsquared at October 25, 2003 10:24 PM
Dick Gephardt has his own idea as to which states are in play here.
I guess he has written off Florida. Plus I would put Arizona in the "in play" category.
Posted by Karl-T at October 25, 2003 10:48 PM
Given the results of these models over the years, they're really not much better than guessing....
The problem with election modelling can be summed up by modifying the old military adage and saying modellers are always forecasting the last election. In the fall of 2000, some of the people quoted in the article I referenced were still insisting Al Gore would get 55+% of the vote even as polls were showing Gore and Bush separated by only a point or two. One of the modellers, after the election, insisted that his model was accurate (even though it missed the actual proportion by over 6%) because the election had been an "outlier".
Modelling assumes that the electorate values the same issues in the same way for each election and thus tries to shoehorn what are essentially discrete events into a whole. However, even over the course of four years, what the electorate values can change. In that sense, having data from many years ago may actually hurt one's model more than help it; can we really assume that the factors that made a presidential candidate desirable in 1964 are the same in 2004? As Roseanne Roseannadanna said, "If it's not one thing, then it's another".
It seems to me the best way to model a presidential election is to wait until a few days before the election, check the polls, and see who's ahead. I guess that would take all the fun out of it, though....
Posted by Jonathan at October 26, 2003 03:54 AM
I remember having a game for my Apple IIc that was just this kind of modeling. Interestingly, the cover of the box had the states correct in the election later that year.
I have been a strong supporter of Howard Dean, giving money when I am unemployed and working a July 4th event in Pasadena as well as other events.
Looking at the electoral map, which I was concerned about before, and wondering how the issues of anti-war and gay civil unions will play out in the battleground states, I have not been active recently.
No matter how much I want a straight-talking politician in the White House I want Bush out more.
Hmmm ... Gephardt seems to think Kentucky and S. Carolina are swing tossup states. Does that strike anyone else as absurd?
Sherk
Posted by Sherk at October 26, 2003 08:47 PM
Yeah, I'm with you Sherk.
Gephardt calls SC a swing state and NH GOP-leaning.
Absurd.
Last time I checked SC was solidly Republican at the presidential level and New Hampshire was a swing state. Or maybe it's just me.
Posted by ByronUT at October 27, 2003 12:59 AM