A Primer on Hoegher's Rankings and Season Previews
[NOTE: I've decided to update this page for those that are more interested in the exact formulas and calculations I use. Here is a direct link to the Excel file for my 2011 Rankings. I won't pretend it's heavy on explanation (I obviously know what everything means), but the cell formulas are all there if someone wants to go through and look at it themselves.]
The subject of college football rankings has always fascinated me. Whether the AP, Coaches, BCS, and so forth, it amazed me that pollsters/robots were able to take 120+ college football teams with (relatively) little inter-connectivity and rank those teams in a coherent fashion.
I'm sorry, did I say amazed? I should clarify: it incensed me (as much as it could, being a teenager with relatively little football knowledge beyond "Randy Moss, Cris Carter, and Tarvaris Jackson* are awesome!"). It felt unfair to me that small schools such as Utah or Boise State weren't ranked as highly as other, more name-brand schools. They beat everyone put before them; shouldn't that be enough?
*Don't question me on this. I will fight you.
Eventually, I realized that factors other than winning percentage needed to be considered in any rational ranking of teams. Schedule strength and margin of victory would be examples of some of the more obvious ones. Which was fine, but I seized upon another gripe: it seemed to me that these criteria were applied selectively to teams (I think the colloquial name for this is the ESS-EEE-CEE Effect).
And this is where computer and numerical rankings of college football teams appealed to me. Everything that a human attempts to do in evaluating and ranking teams, a computer does in an unbiased and comprehensive manner (copyright Wikipedia). The only limitations are time and whatever criteria are chosen for review. A marvelous solution, and having time on my hands, I set out to create my own*.
*HOEGHER'S FIRST RANKIN'S (by Mattel)
The first question I had was: where do I start? I knew that I really only wanted to consider the games a team played and the scores within those games. Anything more would require much more time and effort than I was willing to give (I'm a college student, I have to make time for beer). Inspiration for criteria and calculations came from some intuition, some incorporation of ideas found on various other blogs and readings (Jerry Palm's RPI, Colley Matrix, Brian at MGoBlog, Year2 at TeamSpeedKills, and of course Bill Connelly at FootballStudyHall, of those I can recall), and some blind guesswork. I tried to minimize the last one.
I went through several iterations of ranking systems, some of which I found merit in, most of which sucked, and all of which went by the evaluation process "eh, looks good." I'll spare you the full list* and talk only about my current system(s) and their methodology:
*In semi-chronological order of creation: Game Points, Hoegh, Resume, Rel Eff, Prediction, E-O Ratings, Adj Eff, Adj Marg, Resume (revised). Almost all of these were stupid and ill-conceived, but still brought me endless joy when I would type in scores.
Rel Off, Rel Def
Consider two teams: Team A and Team B.
Team B (on average) scores 20 pts/game on offense and gives up 20 pts/game on defense. Let's call them the Fighting Opponents, if you want a nickname.
Team A has its own scoring offense and scoring defense data, but that's too many numbers. Let's also not give them a nickname, because I'm not very clever.
Team A plays Team B. The score ends in favor of Team A, 40-30. Huzzah Team A, you defeated the Opponents! (correction: I am very clever) That's nice, but what can we get from that besides the W?
I decided you can calculate the Relative Offense (Rel Off, for short and now and ever) of Team A for the game by:
Rel Off = (pts scored by Team A)/(avg scoring defense of Team B)
Similarly, the Relative Defense (Rel Def, for short and now and ever) of Team A is:
Rel Def = (pts scored by Team B)/(avg scoring offense of Team B)
So for this game, Team A had a Rel Off and Rel Def of:
Rel Off = 40/20 = 2.0
Rel Def = 30/20 = 1.5
Do this for every game that Team A plays, and the average Rel Off and Rel Def of Team A can be calculated. And - while I was lazy above and didn't post Team A's hypothetical scoring stats - this can be done for every team in college football. So long as you have the data. And time. Voilà, stats for comparison between teams!
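For the code-inclined, here is a minimal Python sketch of the Rel Off and Rel Def calculation described above. The function name, the data layout, and the variable names are all made up for illustration (the real thing lives in Excel cell formulas), so treat it as a sketch under those assumptions rather than the actual spreadsheet logic.

```python
from statistics import mean

def relative_ratings(games, season_avgs):
    """Return (average Rel Off, average Rel Def) for one team.

    games:       list of (opponent, points_for, points_against)
    season_avgs: {team: (avg points scored, avg points allowed)}
    Both layouts are hypothetical, chosen only for this example.
    """
    rel_off, rel_def = [], []
    for opponent, points_for, points_against in games:
        opp_avg_scored, opp_avg_allowed = season_avgs[opponent]
        # Rel Off = (pts scored by Team A) / (avg scoring defense of opponent)
        rel_off.append(points_for / opp_avg_allowed)
        # Rel Def = (pts scored by opponent) / (avg scoring offense of opponent)
        rel_def.append(points_against / opp_avg_scored)
    return mean(rel_off), mean(rel_def)

# The Fighting Opponents score 20 and allow 20 per game; Team A beats them 40-30.
season_avgs = {"Team B": (20.0, 20.0)}
team_a_games = [("Team B", 40, 30)]
print(relative_ratings(team_a_games, season_avgs))  # (2.0, 1.5)
```

Higher is better for Rel Off, lower is better for Rel Def. The sketch divides by the opponent's averages without any guard, so an opponent averaging zero points scored or allowed would blow up with a division error.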
NOTE: I apologize for the poor formatting of the equations. My computer runs on Windows XP, so it's a little behind the times.
Adj Off, Adj Def
Except not quite. Sorry for blue-balling you. I've been there, it's not fun. Like a... ahem. Anyway!
So Rel Off and Rel Def are nice, but they don't take into account disparities in opponent quality. At least, not enough for my tastes. So I came up with Adjusted Offense and Adjusted Defense (Adj Off, Adj Def for short). I won't walk you through the actual math here. It's not a complex calculation, but it's boring and I'm unsure how to succinctly explain it. (now excuse me while I prattle on for another several pages)
Suffice it to say, if Team A plays a schedule against relatively good defenses, its Adj Off will be higher than its Rel Off (that is to say: better). If Team A plays a schedule against relatively good offenses, its Adj Def will be lower than its Rel Def (that is to say: better). The reverse is obviously true as well. The actual adjustment is fairly small over the course of a season*, but I think it's a good adjustment that reflects the differences in opponent strength.
*In the first few weeks of the season, it is delightfully chaotic.
The intended effect of this adjustment is to say:
if Team A played a perfectly average opponent, how many points would they score? How many points would they give up? How does this compare to other teams?
Ideally. C'est la vie.
Adj Eff, Adj Marg
So those are nice (I think so, at least. Please limit your criticism to backhanded compliments, I'm fragile.). But how to combine those into one rating for comparison? (SO MANY POSSIBILITIES). I came up with two simple ratings: Adjusted Efficiency and Adjusted Margin (Adj Eff and Adj Marg).
Adj Eff = Adj Off/(Adj Off + Adj Def)
Adj Marg = Adj Off - Adj Def
Adj Eff was inspired by MGoBlog's GopherQuest [LINK]. Ideally, it represents the ratio of points Team A would score to the total points scored if it played a perfectly average opponent. It tends to benefit good defense over a good offense.
Adj Marg was inspired by seeing Auburn ranked 7th in 2010 Adj Eff and reasoning that something needed to be done. (they are ranked 1st in 2010 Adj Marg, which was better) Ideally it represents the (scaled*) margin of victory Team A would have over a perfectly average opponent. It tends to benefit good offense over good defense.
*The scaling factor is roughly 25**, depending on the season.
** Okay, actually the scaling factor is usually ~27. 25 is just an easier number to use for back of the napkin calculations and illustrations.
My "official" rankings are based on Adj Marg, because those (generally) tend to give slightly better predictions for game results. But I think both have some merit, so I mentioned them both here.
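A small Python sketch of the two formulas, mostly to show how they can disagree. The two teams and their Adj Off, Adj Def values are invented for illustration, and multiplying Adj Marg by the napkin factor of 25 to get points is one reading of the scaling footnote, not a confirmed part of the method.

```python
NAPKIN_SCALE = 25  # back-of-the-napkin factor from the footnote; ~27 is closer

def adj_eff(adj_off, adj_def):
    """Adj Eff = Adj Off / (Adj Off + Adj Def)."""
    return adj_off / (adj_off + adj_def)

def adj_marg(adj_off, adj_def):
    """Adj Marg = Adj Off - Adj Def (lower Adj Def is better)."""
    return adj_off - adj_def

# Hypothetical teams as (Adj Off, Adj Def); the numbers are invented.
teams = {
    "Offense U": (2.0, 1.0),
    "Defense Tech": (1.0, 0.4),
}

for name, (off, deff) in teams.items():
    eff = adj_eff(off, deff)
    marg = adj_marg(off, deff)
    # Assumption: scaled margin in points is roughly Adj Marg times the factor.
    print(f"{name}: Adj Eff {eff:.3f}, Adj Marg {marg:.2f}, "
          f"about {marg * NAPKIN_SCALE:.0f} pts vs an average team")
```

With these made-up numbers, Defense Tech comes out ahead on Adj Eff while Offense U comes out ahead on Adj Marg, which lines up with the defense-versus-offense lean of each rating noted above.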
For some demonstration and criticism, here's the Top Adj Eff and Adj Marg for each year dating back to 2005. I'm completely comfortable with Texas taking the top slot in 2009, Colt McCoy getting injured was rather unfortunate:
First, a note:
While I put a reasonable amount of thought into the rationale and methodology of my Adj Marg rankings, I did not do the same for my Resume rankings. This is mostly because I think that evaluating teams on wins and losses is uncertain at best (which I'll expand on later). I include these primarily as a point of comparison with the Adj Marg rankings, which don't take winning or losing into account.
Okay! I've actually recently revised my Resume ranking system. My previous system, I was honestly pretty proud of. I thought myself fairly original and clever in its design, and it matched up decently well with several of the BCS computers. Two problems:
1) The rankings were fairly unwieldy (and I'm lazy)
2) The ordering of teams wasn't unique
By that last point, I mean that depending on how I went about it, I could get a different ordering of teams for the same information. The difference was fairly small, but still: I didn't like it. (also the laziness thing)
So I scrapped those and came up with a new system, again only taking into account wins and losses. I won't explain the whole formula, but I will explain the core component.
First, assume that not all wins are created equal. (I think we can agree on that). Beating Alabama last year is worth quite a bit more than beating New Mexico State. Similarly, not all losses are created equal. Losing to Oregon is not quite the same as losing to Army (sorry, Northwestern). So rather than assigning each win or loss an equal weight, scale them by the opponent winning percentage.
Win Quality = Opp Win %
Loss "Quality" = (1 - Opp Win %)
To illustrate, beating a 6-6 team is worth only half a win, beating a 12-0 team is worth a full win (even though beating them would mean they are no longer perfect, and just shut up, I'm just using this for demonstration purposes), and beating a 0-12 team would be worth basically nothing.
Similarly, losing to a 6-6 team is worth only half a loss, losing to a 12-0 team is hardly impactful, and losing to a 0-12 team would be LAWLS-worthy of the highest order (and worth a full loss, though again they would no longer be winless, and JUST BE QUIET ALREADY). Again, the full calculation is more complex than that (as you might guess), but this is the basic idea that governs the system.* And it seems to work reasonably enough. The SEC cleans house with this, though. That's not terribly surprising, but still disappointing.
*You've probably already guessed this by my love for Tarvaris Jackson, but again: I will fight you.
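Here is a minimal Python sketch of that core component only. The full Resume formula isn't given here, so the sketch just tallies weighted wins and weighted losses separately and makes no claim about how they get combined; the function name and input layout are hypothetical.

```python
def resume_core(results):
    """Tally opponent-weighted wins and losses.

    results: list of (won, opp_wins, opp_losses), a made-up layout.
    Returns (weighted wins, weighted losses).
    """
    weighted_wins = 0.0
    weighted_losses = 0.0
    for won, opp_wins, opp_losses in results:
        opp_win_pct = opp_wins / (opp_wins + opp_losses)
        if won:
            weighted_wins += opp_win_pct        # Win Quality = Opp Win %
        else:
            weighted_losses += 1 - opp_win_pct  # Loss "Quality" = 1 - Opp Win %
    return weighted_wins, weighted_losses

schedule = [
    (True, 6, 6),    # beat a 6-6 team: half a win
    (True, 12, 0),   # beat a 12-0 team: a full win
    (True, 0, 12),   # beat a 0-12 team: basically nothing
    (False, 6, 6),   # lost to a 6-6 team: half a loss
]
print(resume_core(schedule))  # (1.5, 0.5)
```

So a 3-1 record against that slate is worth 1.5 wins and 0.5 losses under this weighting.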
For some demonstration and criticism, here's the Top Resume ranking for each year dating back to 2005:
Flaws in the Rankings (really just pertaining to Adj Off, Def, Eff, Marg)
Full disclosure: I'm an amateur (shocking!). I do these because it's fun for me, and I do put some thought into the rationale behind these, so I think they are somewhat sound. But I fully admit that I have limitations and should not be considered an expert by any means.* I welcome criticism, especially as I make mistakes and that can offer the insight for me to correct them, but again: these are for fun. If you want a fully qualified opinion, there are lots of other voices out there! (Bill Connelly seems to do an excellent job on SBNation and Football Outsiders, in my opinion).
Additionally, as noted before: I only track the games played and the scores of those games. No play-by-play data, no situational statistics (e.g. 1st quarter scoring), not even home/away effects. Sorry, I'm a busy student with precious little free time as it is (BEER!). In no particular order, an attempt to address the deficiencies in the numbers:
You Don't Take Wins or Losses Into Account. Wins and Losses Matter.
Correct, I don't. For a long time, I sat on the other side of the Isle of Man (is that like Whore Island for girls?)... excuse me, other side of the aisle. Damn homonyms. The whole point of a football game is to crown a winner and a loser. Good teams win, bad teams lose, that should be reflected somewhere, right? [just assume this is in Sterling Archer's voice LINK TO ARCHER] I gradually changed that train of thought, eventually discarding it completely last fall after watching the Oklahoma State - Iowa State game (I'm sure you all remember it). Let's take a trip down Memory Road (if you make a wrong turn, you go up Freud Pass, and that's no place any man wants to be)
End of the game, Iowa State has fought valiantly, but Oklahoma State is lining up for a game-winning field goal. Time expiring, the kick goes up... and misses by - quite literally - a single inch. I actually thought it was good when I first saw it go through. Iowa State goes on to win in overtime, as I'm sure you know.
Now, some may quibble that overtime had yet to be played and the game wasn't won or lost on that kick, but - the game was won or lost on that kick. A single inch was the difference. Now, suppose that kick was an inch more inside, Oklahoma State wins that game. Does that mean that Oklahoma State (THE DOS) is any better than Oklahoma State (THE UNO)? Because of a single inch on a field goal kick? I concluded: no, there's no real difference between THE DOS and THE UNO. And if you're trying to rank teams by their quality and who's better or worse, then using wins and losses is a poor metric to accomplish that.
I'm sure that there are things such as "clutch" or "choking" or "coaching"* (hello Texas A&M and Kansas State!) that are better reflected in win/loss records than in my numbers, but I also think those things are nebulous at best and impossible to identify at worst. So rather than try to incorporate that in a way that's poorly justified, I think it's better to just note the differences when they occur and move on. (KenPom made a blog post [LINK] like this with regards to Wisconsin basketball earlier this year) And besides, that's why I offer the Resume rankings as a point of comparison.
*Bret Bielema says coaching is for pussies.
You Don't Set a Limit on Margin of Victory
Yes, but this doesn't cause me too much worry. I recognize that the difference between a 30 pt win and a 40 pt win is pretty minor in terms of judging which team is better. They're both blow-outs, who cares? And the offensive and defensive game-plan changes when a team is up (or down) by that much. I know that Football Outsiders corrects for this by ignoring plays and drives when the game is out of range. Since I only track the final scores of each game, I don't really have that option. (It makes me feel impotent, but I still try to push on... well, we don't really need to talk about that. Ignore the previous sentence. Please?)
In that case, I would need to add a qualifier that ignores scores above a certain threshold. But that distinction and the effect it would have on Team A's Rel Off, Def (and subsequently Adj Off, Def) would seem very arbitrary. (which is something I'd like to avoid) I know a commonly used limit on blowouts is a 21 pt margin, but I don't see why that's necessarily better than 20 pts or 24 pts or 30 pts.
Finally, I think that not having a cap on the margin of victory (or loss for that matter) is better for identifying the differences between teams. Oklahoma 2008 won a lot of blowouts, and capping their scores to an arbitrary margin would reduce the ability to measure their offensive prowess (if that makes sense). Anyway, the point is I recognize that possible issue, but I think it's minor at worst. (SO SHUT UP... sorry. Please voice your criticism in an appropriate manner.)
Shut-Outs Have Too Much Effect on the Rankings
This is partly by design, but it is an annoyance I think I'd like to try and fix.
I decided to go by score ratio (compared to average) instead of score differential (compared to average) for two reasons:
1) I liked having my own system
2) Scoring 20 pts on Alabama (gives up 10 pts/game) vs scoring 40 pts on Syracuse (gives up 30 pts/game) is not equal.
I stand by that reasoning. Unfortunately, that means that all shut-outs are weighted equally. I think a shut-out is impressive, but I don't think a shut-out of Oregon should carry the same weight as a shut-out of Florida Atlantic (for example*). I do intend to correct this, and I'll update my rankings if it proves useful.
*In no way do I intend this as a troll on Michigan State fans. Nope, not at all.**
** In all seriousness, the Owls were horrible last year.
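A quick Python illustration of the ratio-versus-differential trade-off, including the shut-out problem. The Alabama and Syracuse averages come from the example above; the Oregon and Florida Atlantic scoring averages are invented purely to make the point.

```python
def ratio(points, opp_avg):
    """Score ratio: points divided by the opponent's season average."""
    return points / opp_avg

def differential(points, opp_avg):
    """Score differential: points minus the opponent's season average."""
    return points - opp_avg

# Offense: 20 pts on Alabama (allows 10/game) vs 40 pts on Syracuse (allows 30/game).
print(differential(20, 10), differential(40, 30))  # identical differentials
print(ratio(20, 10), ratio(40, 30))                # the ratio favors the Alabama game

# Defense: shut-outs. These scoring averages are hypothetical.
oregon_avg_scored = 45.0
fau_avg_scored = 13.0
print(ratio(0, oregon_avg_scored), ratio(0, fau_avg_scored))  # both 0.0
print(differential(0, oregon_avg_scored), differential(0, fau_avg_scored))
```

The differential treats the two offensive games as equal, which is the reason for going with the ratio. The flip side is that any shut-out gives a ratio of zero no matter who got shut out, whereas the differential would at least tell the two apart.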
You Don't Correct for Overtime Games
This is by far my biggest concern regarding my system. I rate teams based on full game totals. Overtime (by definition) is beyond the total of a game. For something like the Football Outsiders Rankings, no problem (they track plays and drives). However, I simply track game scores. So overtime can inflate/deflate a team's offensive/defensive prowess.
Some of the time, this isn't a huge issue. Games that end with 3-0 overtime scoring don't really matter. But there are other games that have 21+ points scored in overtime, and that can have a significant effect on the offensive and defensive ratings for each team involved. Overall, I don't think it's a huge difference (especially at the top), but it's definitely something that I think I need to correct.
Unfortunately, I don't know of a good, clean way to do this. Best way I can tell is a brute force method of going in and editing scores (if a game ended 20-20 going into overtime and ended 26-23, adjust it to 20.5-20, for example). ESPN (and a data sheet helpfully provided by Ed from The National Championship Issue) can help me there. The issue is that by the time I noticed this flaw in my methodology, I had 6 years of rankings on file. Going through 8000+ games didn't sound fun, so I put it off. Sorry.
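As a rough Python sketch of that brute-force edit: the only rule given is the single 20.5-20 example, so the logic below (both teams keep the tied regulation score, the winner gets a half-point bump) is an assumption generalized from that one case, and the function name and `bonus` parameter are hypothetical.

```python
def adjust_overtime_score(regulation_score, bonus=0.5):
    """Return (winner_score, loser_score) for a game decided in overtime.

    Assumption: overtime points are thrown out, both teams keep the tied
    regulation score, and the winner gets a small bonus so the result
    still registers as a win.
    """
    return regulation_score + bonus, float(regulation_score)

# Tied 20-20 after regulation, final score 26-23 in overtime.
print(adjust_overtime_score(20))  # (20.5, 20.0)
```

The tedious part is not the arithmetic but finding the regulation score for every overtime game, which is exactly the 8000+ game slog mentioned above.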
On a related note, the several called games (from rain or whatnot) this past year infuriated me. My ratings are based on full game totals, and I didn't know what to do with games that were called (almost exclusively blowouts) in the 3rd quarter. Eventually, I decided to just leave the games as is, but still: annoyed.
You Don't Take Into Account Home Field Advantage
I already noted this above, but I thought I'd expand on it again. I don't deny the effect of home field advantage. Heck, I cheer for Wisconsin and Camp Randall is definitely a friendlier atmosphere for the Badgers than any other place. At one point, I had a half-assed reasoning that good teams perform well at home and on the road, so home or away shouldn't matter. (I realize that's foolish thinking, don't worry). Now, it's just that it would be a huge pain to both:
1) go through my seven years of data to correct each and every game
2) adjust my calculations to reflect home/away performances
I kind of like the uniformity, so I have just left it as is right now. And honestly: I don't think it matters that much. Good teams perform no matter the environment, and I think that over the course of the season, that gets through the data. But maybe I'm just wrong. I am open to that being an occurrence (but just once or twice).
2012 Season Previews
Okay! Now that I've gotten through... [checking word count]... wow, 3000 words? Sorry for the length. Anyway! I have all these years of data, and I thought I might be able to use them to preview and predict the 2012 season for each Big Ten team. I also thought I'd have more time before the OTE staff began their 2012 primers, so this primer is a little rushed. Sorry if there are any typos (I ABHOR TYPOS).
Now, to give you an idea of what to expect, I'll show the template of what these previews will look like. This will just be a bare-bones slate; the Big Ten Previews will be much more in depth (unless I'm hungover, fair warning). I've got data on all the FBS teams, so we'll just pick one at random. How about... West Virginia (HOLGO!).
2012 Season Preview and Projections (West Virginia)
RichRod Never Should've Left (West Virginia Performance 2005-2011)
Figure 1: West Virginia Yearly FBS Rank (2005-2011)
Figure 2: West Virginia Yearly Adj Off, Def Rank (2005-2011)
Figure 3: West Virginia Yearly Win % and Luck (2005-2011)
I've decided to display the historical trends of a team by FBS Rank. I think that's an easier metric to gauge than the Adj Off, Def, Marg data. I know my metrics pretty well (more or less), but we all know that a team in the Top 10 is usually pretty good. Sure, that cut-off can change year to year, but I think it's a good enough snapshot.
HOLGO [insert Red Bull References here] (West Virginia 2011 Season)
Won Orange Bowl vs Clemson (a billion 70-33)
Just to clarify: Opp Rat is the average Adj Marg of opponents, with tougher schedules being ranked better. Luck is the difference between the actual winning percentage and the expected winning percentage (Based on Adj Marg. In short, a team should beat those teams ranked below it). Based on my numbers, West Virginia should've won more games than they did (especially because they played in the Big East).
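A minimal Python sketch of Opp Rat and Luck as described above. It takes the "in short" version literally, counting an expected win for every opponent with a lower Adj Marg; the real expected winning percentage may well be more nuanced than that. All the Adj Marg values and the record are invented for illustration.

```python
def opp_rat(opponent_margs):
    """Opp Rat: the average Adj Marg of a team's opponents."""
    return sum(opponent_margs) / len(opponent_margs)

def luck(team_marg, opponent_margs, actual_wins):
    """Luck: actual win % minus expected win %.

    Assumption: the team is expected to beat every opponent whose
    Adj Marg is lower than its own, and to lose to the rest.
    """
    games = len(opponent_margs)
    expected_wins = sum(1 for marg in opponent_margs if marg < team_marg)
    return actual_wins / games - expected_wins / games

# Hypothetical team and schedule.
team_marg = 0.5
opponent_margs = [0.75, 0.25, 0.0, -0.25]
print(opp_rat(opponent_margs))             # 0.1875
print(luck(team_marg, opponent_margs, 2))  # -0.25
```

A negative Luck means the team won fewer games than its Adj Marg says it should have, which is the situation described for West Virginia in 2011.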
[insert more Red Bull references] (West Virginia 2012 Preview and Projections)
Avg Opp Rank: 49.1 (Avg Non-Con: 100.0, Avg Conf: 32.1)
Final Record: 9-3 (6-3), Avg MOV: 9.2 pts, Bowl Prediction: [don't care, not Big Ten]
- toss-up games
- best case/worst case
- individual games to highlight
NOTE: I don't know why the 2012 Schedule table is coming in poorly. I'll work on fixing it for the Big Ten previews.
Clarifications: numbers in red are negative. So that Week 11 game against Oklahoma State is predicted to be a loss by 7.5 pts. Obviously, the Big 12 is a step up from the Big East.
Also: the projected Adj Off, Def are based on the past five years of data, with 0.33 weight for the past year, and 0.67 for the previous four years (EDIT: my bad. That should read 0.33 weight for the previous four years, and 0.67 weight for the last year. I know that seems high, but it's really the formula that gives the best correlations for the upcoming year). Based on some preliminary calculations, that seems to be pretty decent in predicting the next season's performances. Obviously, this doesn't take into account the effects of momentum through the years or coaching/player changes but: such is life. I might re-visit that later, but that's the formula that will be used for all the Big Ten previews.
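In Python, the corrected weighting (0.67 for the last year, 0.33 for the previous four) might look like the sketch below. How the previous four years are combined isn't spelled out, so a plain average is assumed here; the function name and the sample Adj Off values are hypothetical.

```python
def project(last_year, previous_four, w_last=0.67, w_prev=0.33):
    """Project next season's Adj Off (or Adj Def) from five years of data.

    Assumption: the previous four seasons are averaged equally before
    the 0.33 weight is applied.
    """
    prev_avg = sum(previous_four) / len(previous_four)
    return w_last * last_year + w_prev * prev_avg

# Hypothetical Adj Off history: a big 2011 after four average seasons.
print(round(project(1.5, [1.0, 1.0, 1.0, 1.0]), 3))  # 1.335
```

The same function would apply to Adj Def, and the projected Adj Marg then follows from the difference of the two projections.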
So... that's roughly what the season previews will look like (though obviously with some more discussion rather than just [discussion]). There may be some minor aesthetic tweaks. If any of y'all have any suggestions on how to better present these, let me know! I plan on putting each of these out when the corresponding Big Ten Preview on OTE occurs, which I think makes a lot of sense.