Reply

by suave_andrew, Tuesday, July 27, 2010, 18:46 (5797 days ago) @ LaFortune Teller

My projects tend to be relatively theoretical, so I realize that there are always going to be some issues with methods.

And this is a really cool project.

I guess I'll wait for the article to understand the methodology more completely. For instance, the recruiting and schedule ratings are easily grasped and generally well-accepted measures.

The only thing I've changed with recruiting is that I'm going by average Rivals stars rather than the Rivals ranking. The average stars tend to be more correlated with win % than the ranking is (likely attributable to some bias).

Home field advantage ought to be relatively easily understood as well -- depending on the methodology. Personally, I haven't been convinced that the quantifiable differences between best and worst home field advantages are all that dramatic.

Home field advantage is basically the difference between the actual home field win % and the team's expected total win %. The expected figure comes from the team's performance at home (home win %) plus the number of total wins that came from home games. This produces a pretty accurate picture in my opinion, because it takes into account the number of games played at home as well as how much a team relies on its home field relative to its overall performance.
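To make the arithmetic concrete, here is a rough sketch. The post doesn't spell out exactly how the expected win % is built, so this version simply uses overall win % as a stand-in for it. That substitution, the function names, and the sample record are all assumptions for illustration, not the actual calculation behind the profiles.

```python
def win_pct(wins, games):
    """Winning percentage as a fraction; 0.0 when no games were played."""
    return wins / games if games else 0.0


def home_field_edge(home_wins, home_games, total_wins, total_games):
    """Home win % minus overall win %.

    ASSUMPTION: overall win % stands in for the 'expected' win %.
    """
    return win_pct(home_wins, home_games) - win_pct(total_wins, total_games)


def home_win_share(home_wins, total_wins):
    """Fraction of all wins that came at home (reliance on the home field)."""
    return win_pct(home_wins, total_wins)


# Made-up five-year record: 24-8 at home, 36-28 overall.
edge = home_field_edge(24, 32, 36, 64)
share = home_win_share(24, 36)
print(f"home edge: {edge:.4f}, share of wins at home: {share:.3f}")
```

A team with a large edge and a large share of its wins at home would be one that leans heavily on its home field.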

The development, game day, and clutch/luck measures make sense, but are also going to be heavily scrutinized depending on the methodology.

The development one is pretty solid. I would argue that using the NFL draft as a barometer of a team's player development makes sense when you figure that a team which produces a higher than expected number of prospects is a program where you'd expect to find players that are more fundamentally sound all around.

Here are a few big questions I have:

1. Are you trying to make a "complete" picture of a team? Or a coach? It seems like the approach you are taking with at least a few of these items is drawing a connection between performance and expectations -- that there are teams/coaches that get more out of their raw materials in development and game day performance than they ought to.

My intent with the profiles is to take a five-year snapshot of a program and say, "from 2005-2009, the program's total winning percentage for the period (or simply performance) can be explained by these rankings." So if a team recruited well, developed its talent well, and had a relatively easy schedule, but only won something like half its games, then you would expect to see poor rankings in the other areas, especially gameday performance.

If this is where you're taking it, are you trying to actually identify areas teams/coaches could concentrate on for improvement, or are you trying to identify the status of a particular team/coach as a way to see through other basic data that might be clouding our judgment? And are there clear distinctions between some of these things, or are there overlaps (between, say, clutch/luck and game day)? And if it's supposed to be complete, is there anything missing?

The way the gameday one works is that it's basically a catchall statistic given all the other factors. So it's not actually derived from any on-field statistics, but rather says "given all of the other factors, the difference between actual and expected win % is best explained by what occurred on the field." That means clutch/luck is factored out, as are talent, schedule, and everything else. It's definitely the most theoretical part of this.

It helps explain why a team like Southern Miss (ranked 11th in game day performance), which had only average recruiting, terrible player development, no luck, and really no home field advantage, was able to post a 57% winning percentage for the period (better than Notre Dame's). That and a relatively easy schedule.
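The "catchall" idea can be sketched as a residual: fit expected win % from the other factors, then treat whatever is left over as game day. The sketch below assumes a single made-up composite score per team and a simple least-squares line. The real model presumably uses several separate variables, and every number here is invented for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def residuals(xs, ys):
    """Actual minus expected for each team; positive means over-performing."""
    slope, intercept = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# HYPOTHETICAL data: composite of the other factors vs. five-year win %.
composite = [0.2, 0.4, 0.5, 0.7, 0.9]
actual_win_pct = [0.35, 0.50, 0.45, 0.70, 0.75]

gameday = residuals(composite, actual_win_pct)
for score, leftover in zip(composite, gameday):
    print(f"composite {score:.1f}: game day residual {leftover:+.3f}")
```

A team like the Southern Miss example would show up here as a large positive residual: more wins than the other factors would predict.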

2. Are the scales important? Are you doing the analysis a disservice by assigning a 1-120 rank for both rather than using the ratings themselves? The difference between #1 and #15 in recruiting, for instance, might be equal to the difference between #1 and #120 in home field advantage. Or it might not. But without that nuance, it makes understanding the complete picture more challenging.

Using rankings rather than relative scales provides an easier-to-understand way to communicate the profiles to people who are not familiar with the process and don't want to take the time to figure it out. Also, the rankings did end up making the model more explanatory, and the ranking variables were more significant. I'm not sure what that's a result of; maybe just that the relative scale isn't all that important?
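For anyone wondering what gets lost in the conversion, here is a small sketch of turning ratings into ranks. The team names and ratings are made up, and the helper is only an illustration of the rank-versus-scale question, not the code used for the profiles.

```python
def to_ranks(ratings):
    """Map each team to its rank, 1 = highest rating.

    ASSUMPTION: ratings are distinct, so ties are not handled.
    """
    order = sorted(ratings, key=ratings.get, reverse=True)
    return {team: position + 1 for position, team in enumerate(order)}


# HYPOTHETICAL average-star ratings for three teams.
ratings = {"Team A": 4.1, "Team B": 3.2, "Team C": 3.1}
ranks = to_ranks(ratings)

for team, rating in ratings.items():
    print(f"{team}: rating {rating}, rank {ranks[team]}")
```

The rating gap between A and B is nine times the gap between B and C, but each is one rank apart once converted. That is the nuance the question is pointing at, and also part of why ranks are easier to read.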

3. I like the use of 5-year comprehensive data for a picture of a program. But I think it is a fairer picture of a program taken at the end of the 5-year period. In other words, when you are making comparisons and drawing conclusions, I think you should be careful about grouping teams that might have had success at opposite ends of that period. Alabama in 2009 was certainly a function of its 2005-09 recruiting, development, game day, etc. But in the same table, is it right to talk about USC in 2005 as a function of its 2005-09 recruiting, development, game day, etc? (Maybe this isn't what you're doing, and maybe that would make this far more challenging to do, but I think 2005 teams should be evaluated on 2001-05 data, 2006 teams on 2002-06 data, etc).

An eventual goal is to be able to turn my profile system into a yearly team profile and maybe even try to use it for predictive purposes. But right now I think the easiest and least time-consuming approach is to just use five years' worth of data, because it dampens a lot of the randomness you get from year-to-year data. It obviously does not give a full description of teams that have undergone coaching changes (e.g., Alabama), but the profile still does describe that five-year period in total.
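The trailing-window approach suggested in the question (2005 teams on 2001-05 data, 2006 teams on 2002-06 data) could look something like the sketch below. The function names and the season records are hypothetical, and this is not something the current profiles do.

```python
def trailing_window(end_year, span=5):
    """Seasons covered by a window ending at end_year, inclusive."""
    return list(range(end_year - span + 1, end_year + 1))


def trailing_win_pct(records, end_year, span=5):
    """Pooled win % over the window; None if no games fall inside it.

    records maps season -> (wins, games). Missing seasons are skipped.
    """
    seasons = [y for y in trailing_window(end_year, span) if y in records]
    wins = sum(records[y][0] for y in seasons)
    games = sum(records[y][1] for y in seasons)
    return wins / games if games else None


# HYPOTHETICAL season records for one team: season -> (wins, games).
records = {2001: (6, 12), 2002: (8, 12), 2003: (9, 12),
           2004: (10, 12), 2005: (11, 12), 2006: (7, 12)}

for year in (2005, 2006):
    window = trailing_window(year)
    pct = trailing_win_pct(records, year)
    print(f"{year} team, {window[0]}-{window[-1]} window: {pct:.3f}")
```

Pooling wins and games over the window keeps the smoothing effect of five years of data while letting each season's team be judged on the seasons leading up to it.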

