OT: Critical Problems Facing College Football Analysis

by suave_andrew, Sunday, February 06, 2011, 10:37 (5116 days ago)

I was recently browsing an old 2005 issue of the Journal of Quantitative Analysis in Sports (great bedtime material...) and came across the brief article linked below by Aaron Schatz of Football Outsiders. Schatz follows the approach of David Hilbert, who posed a series of critical problems facing mathematics at the turn of the 20th century. The article turns a critical eye to issues facing football analytics:

http://tinyurl.com/4gxvrbf

I thought it'd be interesting to go through a similar exercise for college football. I think that you first need to set aside the issues that are common to both the NFL and the college game, and keep the list college-specific. In my opinion, the NFL provides a far better setting for analyzing things that are general to the game of football, and Schatz already provides a good list of the main issues affecting the general analysis of football and data collection.

College Football-Specific Issues - Analytical areas that need to be addressed and/or better understood and that are, for the most part, specific to the game of college football. These are not necessarily in order of importance, but rather in the order in which they popped into my head:

1) Recruiting - This is probably the most visible input into college football programs and also probably the most poorly analyzed 'predictor' of a team/coach's future success. More analysis needs to be done on the proper utilization and value of recruiting service data. Relative recruiting - how talented a team is in comparison to its actual opponents - is not even looked at as far as I can tell, yet is probably the only thing that should really matter from a usefulness standpoint. There is basically no recruiting data for lower division teams, and service academy recruiting data is very poor. Finally, bias is an issue with recruiting services, be it regional, for marketing purposes, or for revenue generation (e.g. you must attend our camp to get rated, etc). It's also conceivable that a player's family or representative(s) can buy the player coverage from recruiting services.

2) Player Development - Development is mentioned a lot, but little has been done to try to quantify it. I personally think that comparing recruiting performance with the number of players a program puts into the NFL is a good starting place, but it's far from perfect. Because it appears that recruiting has diminishing marginal returns, it's arguable that player development becomes an increasingly critical factor in differentiating top-level programs.

3) Strength of Schedule - I don't think there's an adequate way of looking at this yet. This sort of goes hand-in-hand with identifying objective ways of ranking programs relative to one another. There are a million ways to try to compare programs and use their difficulty of schedule as a 'deflater' for this purpose, and while some methods may be more popular than others, I don't think there's truly one best way of doing it. Like with recruiting, I think that strength of schedule should be completely relative: the only thing that should matter is how much of an uphill battle a team faced against its opponents. How much of a talent difference was there? Was the team a visitor to an extremely tough environment? What was the weather like? What was the margin of victory relative to all of this? None of this stuff is really factored in. Sagarin, for example, uses a base home field advantage, but there is absolutely not some 'standard' home field advantage across every program in college football.

4) Objective/Fair Ranking - Linked closely to the SOS issues above. The Massey consensus rankings are probably the best and most objective approach at this point until SOS is addressed better. The AP and Coaches' polls are probably the worst and least objective approaches that we have.

5) Home Field Advantage - As mentioned in the SOS portion, I refuse to believe that a standard home field advantage can be applied to every home team across college football. The atmosphere, noise, and fan involvement at places like Eastern Michigan or Temple are simply different from what you will find at LSU or Penn State. Scheduling becomes involved because top-tier programs play more home games on average than bottom-tier programs. Top-tier teams have more bargaining power and thus have an advantage in scheduling that multiplies the home field advantage they're already getting from good venues. There is hardly any way to really quantify most of this at this point, however.

6) Offensive Style/Tempo - There is virtually no quantitative analysis done on the different types of offenses that you see in the college game. Part of the reason is that formations aren't listed in play-by-play data, so it's virtually impossible to analyze unless you intend to sit down and chart every game after it's played. Variations in offensive tempo look to be the next major trend coming down the pipeline, thanks to Oregon's run to the national championship game this year. Dividing the number of plays a team runs by that team's time of possession yields some insight into tempo, but doesn't tell the full story - special teams plays, running out the clock, etc. jumble up time of possession.

7) Program Growth/Regression/Cycles - What factors cause some programs to experience growth in yearly performance over time while others experience regression? Are there measurable performance cycles that just about all programs go through? At its core, this is sort of the macroeconomics of college football, and virtually nobody has looked at it. TCU would be a good case study of program growth; Miami of regression.

8) Effect of Player Experience (years in program) - Is there a predictable way to measure how a team will perform relative to the amount of experience of the players in its starting lineup? This is being looked at somewhat, but nobody really knows for sure. I personally think that experience is tied closely to player development and there are limits to how well an experienced but underdeveloped or not very talented player will perform. Further, is it better to evenly distribute the number of players you bring in each year, or overload (over sign?) some years in order to maximize the amount of experience of the starting lineup every few years for championship runs?

9) Effect of Program Revenue - There's been a bit of work on this in the field of sports economics, specifically in analyzing the BCS and whether or not it's affecting the competitive balance of college football. Many public schools disclose how much revenue their football programs bring in on a yearly basis, as well as their expenditures on their coaching staffs. But little is known about how much money schools devote to recruiting, to their training programs, to academic tutors, to year-to-year facility improvements, etc. Football programs at private schools don't need to disclose anything that they don't want to. Knowing this information would make it possible to develop a 'production function' for different programs, but either the data is not available or, if it is, nobody has taken the time to build a database of it.

10) Transfers/Player Movement - Every year hundreds of players transfer to different schools, leave for the NFL early, 'retire' from football due to injuries, or simply drop football (maybe even school) altogether. There is no analysis of player movement beyond the anecdotal news or blog writeups that may or may not occur when this happens. There is definitely no real macro analysis looking at trends in any of this stuff, other than maybe oversigning - but even that is not overly analytical. A good start would simply be some kind of database that tracks player transfers/movement for each team on a yearly basis.

11) How to compare/incorporate lower football divisions - Even hardcore college football fans probably have a hard time telling you which division of football schools like James Madison and Northern Colorado are in. Besides the Sagarin rating system, there really isn't any work on evaluating lower division football teams, let alone figuring out how to compare them to upper level teams. This throws a monkey wrench into any kind of analysis of strength of schedule, ranking, and weighting a team's performance when it plays a lower division team. Nobody really knows how to approach it, yet just about every upper division football team is playing at least one game against a lower division team now.
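
To make two of the ideas above a little more concrete, here is a rough Python sketch of the 'relative recruiting' comparison from item 1 and the plays-per-possession-time tempo number from item 6. Everything in it is an assumption for illustration only: the function names, the team names, the ratings, and the schedule are all made up, and a plain average of opponent ratings is just one possible way to define 'relative'. Opponents with no rating (e.g. a lower division team) are simply skipped, which is itself a debatable choice.

```python
from statistics import mean


def plays_per_minute(plays, possession_seconds):
    """Offensive plays run per minute of possession (crude tempo, item 6).

    Does not separate out special teams plays or clock-killing drives,
    so it inherits the distortions mentioned in the post.
    """
    if possession_seconds <= 0:
        raise ValueError("possession_seconds must be positive")
    return plays * 60.0 / possession_seconds


def relative_talent(team, schedule, ratings):
    """Team's recruiting rating minus the average rating of the opponents
    it actually plays (item 1). Unrated opponents are skipped.
    """
    opponents = [opp for opp in schedule[team] if opp in ratings]
    if not opponents:
        raise ValueError("no rated opponents for " + team)
    return ratings[team] - mean(ratings[opp] for opp in opponents)


# Hypothetical numbers, not real recruiting or play-by-play data.
ratings = {"Team A": 90.0, "Team B": 80.0, "Team C": 70.0}
schedule = {"Team A": ["Team B", "Team C", "Unrated FCS Team"]}

print(plays_per_minute(80, 1600))
print(relative_talent("Team A", schedule, ratings))
```

A positive relative_talent value means the team out-recruited its average rated opponent; whether the gap should be a difference, a ratio, or something weighted by position or class year is exactly the kind of open question the list is getting at.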


I'm interested to hear other people's thoughts on the subject.
