So I had intended to write this up sooner, but I spilled beer on my computer and lost the functionality of the "S" key. Turns out that's pretty important in the whole "typing things" activity. Fortunately, I have a backup laptop, otherwize I may have rezorted to typing like thiz, which would be pretty annoying. Enjoy and feel free to offer criticism!
METHODOLOGIES AND INFORMATION OF VARIOUS NON-IMPORTANCE
Unless otherwise noted, all data is from the 1996-2013 seasons. And this will probably be noted anyway. If you're wondering why this exploration is limited to that range, it is because:
1) The 1950's were fun and all, but probably aren't that relevant for what we enjoy today.
2) It's a lot easier to handle 17 years of data than 84 years of data.
3) Eff ties. They are a huge pain to deal with.
Unfortunately, this means I'm just outside the range of "college football started in 1995" humor, but we can all try to tough it out.
The primary tool for evaluation here is going to be bowl game results. Bowl games offer convenient data points between teams that are of (roughly) equal skill/talent, at a point where there's enough previous input to have reasonable ratings for comparison. Obviously, this cuts out a huge portion of the season that could have been used for evaluation, and that's sad. But with 517 bowl games since 1996, we've still got a pretty decent-sized pool in which to splash around.
IMPORTANT NOTE: The effect here (for example) is that I will compare Michigan State's performance for Wk01-B10CCG to Stanford's performance for Wk01-P12CCG. If Stanford's Moxie Rating (NOTE: not a real thing) for Wk01-CCG is 4.5 and Michigan State's Moxie Rating for Wk01-CCG is 4.0, I'd predict Stanford to beat Michigan State based on Moxie Rating. I'd tally my overall prediction percentage accordingly.
However, those Wk01-CCG performances are still calculated using the entire season of data, including the residual effects of the bowl games. I did this because I'm lazy and the difference is too minimal for me to care about. Just FYI.
This "analysis" is amateur, flawed, and obviously incomplete. I definitely welcome criticism and suggestions for improvement, and though I may protest some things, I do try to take other viewpoints into consideration. I hear that's good for things like "friendship" and "being likable," so it doesn't hurt to practice :)
Eff Michigan State. This is definitely one of the important pieces of information.
Oh yeah, a link to a primer if you're wondering what's behind a lot of this. All right, on to making a fool of myself!
THE RELATIVE IMPORTANCE OF OFFENSE AND DEFENSE
My ratings favor a strong offense over a strong defense. It's not a huge bias (notably, 2008 USC is my top rated team that year in part due to an excellent defense), but it does exist. Part of this is due to the realities of math and scoring (you can only hold a team to 0 pts, no less), but the other part of this is because I really believed offense was more important than defense, "defense wins championships" be damned. I know this is the case in the NFL, and previous tinkering of mine supported this theory, so viva la offensive, y'all.
Well, that hasn't stopped various Sparty mouthpieces from clap-trapping about, and since I've done the work now to have some more extensive "analysis," let the reckoning begin:
Bowl Prediction Percentage (1996-2013)
Well, crap. Put the crow on my bill, I suppose.
First thought: Hey, I've got a winning record! Awesome.
Second thought: If I had started these a decade earlier, I would've given up in abject frustration after that run from 1998-2000. I retroactively blame Tom Brady.
Third thought: The predictive power of offense is clearly more variable than that of defense. That could make sense, as offensive performances are more variable than defensive performances (though Nebraska's done its damnedest to try and fix that). I'll definitely be looking into capping outliers of offensive performances to try and address this.
Fourth thought: I'd enjoyed a pretty solid seven year run prior to this past season, so I feel slightly vindicated with my recent bowl performance. The SEC and I will hold our heads proud, thank you.
Fifth thought: I wonder how much of this is due to hidden variables. WARNING: WHAT FOLLOWS IS SPECULATION ON GENERAL TRENDS, NOTHING MORE. For example, defensive performance correlates more strongly with recruiting than offensive performance does, probably because it's easier to scheme an offense than a defense. What this means is that there are a lot of non-blue bloods and mid-majors with good offenses, but good defenses are concentrated more heavily in traditionally good schools (yes, Michigan State, I see you, be calm). So the effect may be less that good defenses make good teams, and more that good teams tend to have good defenses.
Sixth thought: I think there may be something to that line of thinking (I'll note that my overall rating still performs best as a predictor), but I probably still need to adjust my weighting for defense:
BCS Bowl Winners (1998-2013)
I maintain that 2012 Louisville had no business winning that Sugar Bowl.
Seventh thought: Are these plots easy enough to read and understand? I tried to format them to be fairly friendly, but let me know if anything needs explaining.
LUCK-NESS, CLUTCH-ITUDE, AND CLOSE GAME RECORDS
I've previously done a FanPost on this subject, but here's an updated version of the plot referenced within:
"Luck" and Previous Year Comparison (1991-2013)
If I did my math right, the correlation of that data corresponds to a p-value of 0.09, which fails to be statistically significant at a standard p=0.05 threshold. Not that statistical significance really matters considering what's plainly visible. "Luck" is not repeatable year-to-year, and thus I continue my quest to quell those hysterics about teams over-or-under-performing their talent level. We're often looking for patterns where none exist (Tom Osborne says "hey, y'all").
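For anyone who wants to run this kind of check on their own numbers, here's a minimal sketch of the year-to-year correlation test described above. The luck values below are hypothetical placeholders for illustration, NOT the real data behind the plot:

```python
# Sketch of the year-to-year "luck" repeatability check. The numbers are
# hypothetical per-team luck values (wins above expectation) for two
# consecutive seasons -- made up for illustration only.

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient, no libraries needed."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical luck values for the same teams in Year N and Year N+1
year_n  = [1.5, -0.5, 2.0, -1.0, 0.5, -2.0, 1.0, 0.0]
year_n1 = [-0.5, 1.0, -1.5, 0.5, 2.0, -1.0, 0.0, -0.5]

r = pearson_r(year_n, year_n1)
# Significance comes from t = r * sqrt((n-2) / (1 - r^2)), compared
# against a t-distribution with n-2 degrees of freedom.
t = r * ((len(year_n) - 2) / (1 - r ** 2)) ** 0.5
print(round(r, 3), round(t, 3))
```

If the correlation were real, the same teams would keep landing on the same side of zero year after year; a t-statistic near zero says they don't.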
But maybe that does fail to convince you. On to the second topic of the lesson: Is there skill in winning close games? On the surface, it seems obvious that the answer should be yes. After all, the players still play the game - this isn't Dungeons and Dragons (although they start overtime with a coin flip...). Whatever the outcome, it's ultimately decided by the play on the field.
And I'd agree with you: the play on the field decides the game. Consider that close games (almost by definition) usually come down to a single drive or a single play. Skilled or not, no QB completes 100% of his passes, no RB gets 5 yds on every play, no DB never blows an assignment (I think, I'm not a film guy). To say nothing of when shlt just happens and you give up a game-ending Hail Mary. Wait, hang on.
Okay, I'm back. The point is, most of the time you just hope your guess is better than their guess and that no one (tr)ucks up. There's plenty of chance involved.
But never mind all that: if there is actual skill in winning close games, if some teams bring out their inner Jordan while others piss all over themselves like my dog meeting someone new, we'd expect that to be reflected in the game results. And:
Bowl Prediction Percentage (1996-2013)
It's really not.
Okay, obviously some explanation is needed. Working our way from the top:
Cls Bwl - Cls vs Cls: I only considered teams that played in close games (7 pt margin or less) during the regular season. So 2013 Florida State is removed from much of this analysis, because they didn't play a close game until the national championship. This seriously limited my available games for "analysis," but I didn't want to assign an arbitrary "Close Game Win %" to teams that didn't actually play any close games.
"Cls vs Cls" means that both bowl participants had a regular season close game record for comparison (if Team A went 1-1 and Team B went 1-0, I'd pick Team B to beat Team A, based on close game record. If both teams had equal records, it was a push). Again, Auburn vs Florida State would be removed from this "analysis" because Florida State didn't have a close game record for comparison.
"Cls Bwl" means that the bowl game itself was also a close game. This makes the first category the most restrictive, but probably the most directly relevant to our inquiry. 178 bowl games fall into this category.
Any Bwl - Cls vs Cls: Both teams were again required to have a regular season close game record, but I opened the floor up to all bowl games where this was satisfied. Wisconsin vs South Carolina is in this category, even though Wisconsin lost by 10 pts, because both teams had close games in the regular season. Auburn vs Florida State again doesn't come into play. 444 bowl games fall into this category.
Cls Bwl - Tm Only: This category compares the (close game) bowl results against each participant individually, which roughly doubles the sample size. So the result of the 2013 BCS Championship is compared against Auburn's regular season close game record. However, it is not compared against Florida State's regular season close game record, because there isn't one. 385 bowl "games" fall into this category (again, several bowls are double-counted, once for each participant).
Any Bwl - Tm Only: All bowl results are compared against each participant's regular season close game record. 2013 Wisconsin's close game record (shltty) against their Capital One Bowl loss (also shltty) qualifies. 2013 Florida State again does not qualify, as they did not have any close games in the regular season. 954 bowl "games" fall into this category (again, several bowls are double-counted).
That was a whole lot of pedantic drivel, but I wanted to try and make it clear what that plot up there is portraying. In short: your record in close games means nothing. Actually, it seems to be a slight negative indicator, judging by those percentages. If I had to hazard a guess as to why those percentages sit just below 50%, I'd say that teams with good records in close games tend to punch a few bowls above their weight class. Most of Northwestern's recent history is good evidence of that. Teams with poor records in close games just don't qualify for bowls to begin with (and aren't part of this analysis), or they are Wisconsin and like to make me sad.
Now, I chose 7 pts as my close game threshold just out of personal taste, but because I am brilliant, I had the foresight to configure my Excel sheets so I can change that threshold with a single click. Setting that to 100 pts (the effect of which is just to consider the total W/L record) gives:
Cls v Cls - 55% (517 games)
Tm Only - 50% (1034 games)
So the team with the better record wins their bowl 55% of the time. A winning percentage (though not as good as my ratings up there, ya know)!
(If you're wondering why the "Tm Only" category remains at 50%, it's because teams with losing records don't qualify for bowls in the first place. So a 50% prediction is required by math.)
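That "single click" threshold trick translates naturally into code with the cutoff as a function parameter. Here's a rough sketch of the comparison logic described above; the function names, field conventions, and sample margins are all made up for illustration:

```python
# Sketch of the close-game record comparison with a configurable
# threshold. A margin is (team score - opponent score); positive means
# a win. All sample data here is hypothetical.

def close_game_record(margins, threshold=7):
    """W-L record in games decided by `threshold` points or fewer.
    Setting threshold=100 effectively returns the overall record."""
    wins = sum(1 for m in margins if 0 < m <= threshold)
    losses = sum(1 for m in margins if -threshold <= m < 0)
    return wins, losses

def pick_by_close_record(margins_a, margins_b, threshold=7):
    """Return 'A', 'B', or None (push, or a team with no close games
    on record -- the 2013 Florida State situation)."""
    pcts = []
    for margins in (margins_a, margins_b):
        wins, losses = close_game_record(margins, threshold)
        games = wins + losses
        if games == 0:
            return None  # no close games to compare: drop the matchup
        pcts.append(wins / games)
    if pcts[0] == pcts[1]:
        return None      # equal records: push
    return 'A' if pcts[0] > pcts[1] else 'B'

# Team A went 1-1 in close games, Team B went 1-0: pick B
print(pick_by_close_record([3, -7, 21], [6, 10]))  # -> B
```

Rerunning the whole tabulation at a different threshold is then just a matter of changing one argument, which is exactly what the Excel version accomplishes with a single cell.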
So what's the takeaway from this section? While I'd love to lay down a blanket statement like "CLUTCH IS A LIE Y'ALL," that lies far beyond whatever conclusions this "analysis" can bring. And most of you would derisively dismiss me after that anyway (if you haven't already). So a more nuanced take would be this:
While "clutch" ability may exist in some form, it is difficult to distinguish any real effect from that of random chance. Furthermore, any real effect would be fairly minimal compared to other factors (like actually being good/crappy). Because of this, we shouldn't attach much significance to good or bad performances "in the clutch."
Except for Bielema. Eff that guy.
WHO CARES ABOUT THE LITTLE SISTERS OF THE POOR?
One of the things I hear a lot - and that tends to annoy me - is the assertion that games against the Akrons and Purdues of the world don't really matter. What matters is how you play against the best. This seems like it should be true, as you don't win titles by failing to show up when it matters.
However, this is a bit of a self-fulfilling argument. The reason you need to play well against top competition to win the conference is because you need to win games to win the conference. The margin of error is considerably smaller when playing Ohio State than when you're playing Indiana (sorry, Hoosiers).
Also, I think that in a sport with as few games as college football, performances against crappy schools can absolutely still give you valuable information about a team. Focusing just on the best opponents seriously limits the available pieces of knowledge from an already seriously limited pool.
But I get how Ohio State beating up on Buffalo doesn't seem like it should really matter. So I decided to actually try and look at this, using essentially the same structure as I did in that "Offense & Defense" section up there. Only this time, I limited each team to their five best regular season opponents (by my ranking) when making calculations. Why five? Just felt like a good number. Also, you'd be surprised how quickly the SOS tails off for the Big East.
NOTE: In the few cases of rematches, each game was counted independently. So Nebraska occupies two of Wisconsin's five best opponents in 2012, for example.
This is intended to be compared against my prediction percentages from the "Offense & Defense" section, but I'll just re-post the plot so y'all don't have to scroll up and down:
Bowl Prediction Percentage, All Regular Season (1996-2013)
Bowl Prediction Percentage, Top Five Opponents (1996-2013)
Granted, it isn't a huge difference in the prediction percentage, but there's definitely a decline. More importantly, there isn't any improvement in discarding performances against the lesser teams on the schedule, which is the point I'm trying to make.
Obviously, we want our teams to play well against top competition. I remember Wisconsin beating Ohio State in 2010 much more fondly than annihilating Indiana that same year, and not just because I didn't actually attend the Indiana game (I was actually in Indiana during that game, interestingly). I just think we shouldn't dismiss good games if they came against bad teams, nor should we ignore poor games against the same. Those games still have some value, even though the margin of error is wider.
UPSETS HAPPEN, Y'ALL. THAT'S WHY THEY CALL THEM UPSETS.
I actually didn't do any real "analysis" on this topic, but I feel it's an important discussion to have, and it explains why head-to-head results sometimes lie. I've previously posted this plot:
Which shows the win likelihood for a favored team, based on the expected margin. The Exp Diff is based on my ratings and calculation, but that plot lines up pretty darn close with what you'd get using the Vegas spread as well:
image via Stassen
I'd like some credit for making my plot actually legible. The point is, the favorite does not always win. In fact, since that red line is my quadratic model, I have the formula:
W% = -0.0005*X^2 + 0.0320*X + 0.5000
Averaging that over X = 0 - 25 pts (integrating and dividing by the 25 pt width; this assumes that predicted margins are mostly evenly distributed over that range, which probably isn't completely accurate, but will serve our purpose for now):
W% = 79.6% (check my math!)
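Since I was asked to check my math, here's a quick sketch that averages the quadratic model over the 0-25 pt range, both in closed form and with a brute-force midpoint sum:

```python
# Check of the averaged win percentage: the quadratic model
# W%(X) = -0.0005*X^2 + 0.0320*X + 0.5000, averaged over X in [0, 25]
# (i.e., the integral divided by the interval width).

def win_pct(x):
    return -0.0005 * x**2 + 0.0320 * x + 0.5000

a, b = 0.0, 25.0

# Exact: integrate the polynomial term by term, then divide by (b - a)
exact = (-0.0005 * b**3 / 3 + 0.0320 * b**2 / 2 + 0.5000 * b) / (b - a)

# Sanity check with a simple midpoint Riemann sum
n = 10000
step = (b - a) / n
approx = sum(win_pct(a + (i + 0.5) * step) for i in range(n)) * step / (b - a)

print(round(exact, 4), round(approx, 4))  # both ~ 0.7958, i.e. ~79.6%
```

Both routes land on about 0.796, so the 79.6% figure holds up (given the even-distribution assumption).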
As it happens, this is almost exactly in line with my percentages over the last decade (79.5% average since 2000). So 1 in 5 college football games features an upset. That's a lot! If you think my system is in need of some salt, plenty of other systems out there track around 15-20% for "ranking violation percentage." Even those systems specifically designed to minimize these "violations" can't get much better than 10%.
So upsets in college football (and probably other sports as well, but I'm not really interested in those) happen all the time. They are not rare, they are not inexplicable, and they are not flukes. Upsets happen. And because of this: do not take head-to-head results as conclusive. A fair amount of the time, they will lead you astray.
Of course, this might depend on your definition of upset. Plenty of people might not consider Wisconsin's 2010 win over Ohio State (or Michigan State's win over Wisconsin) an upset because both teams went 11-1 in the regular season and won the Big Ten. The difference in quality between them is small enough that a win one way or the other shouldn't really be considered a "true upset." I get that viewpoint and agree with it to a certain extent! An upset is when a clearly inferior team prevails over the favorite.
But that also brings me back to the main point I want to make, in that if two teams are similar enough in quality that a win either way isn't a "true upset," then the head-to-head result should not be taken as conclusive evidence for superiority/inferiority of one over the other. Obviously, any game that Wisconsin loses is evidence of this phenomenon.
Also, not all upsets are close. Some of them are comfortable victories! Penn State was a 25 pt underdog against Wisconsin this past year and still won by 7 pts (I assume, as I left early to avoid the crowd and get a run in). That's not a blowout by any means, but you might reference that performance by saying Penn State "beat their projection" by 32 pts. A 7 pt underdog (winners 30% of the time!) performing similarly would win by 25 pts, something that is definitely a blowout. Or instead of hypotheticals, I can just reference actual instances, such as a 3 pt underdog cruising to a 37 pt margin (that memory's a bit more palatable).
Of course, all of the above only applies to other teams. Your team is clearly unique and should be considered separately, right?