In a post above I showed how different statistics relate to winning percentage. The most obvious point illuminated in that post is that, of those 18 statistics, the ones most closely associated with winning are Scoring Defense and Scoring Offense, in that order.
But this made me wonder: How do the 16 other statistics relate to Scoring Offense and Scoring Defense respectively? After all, I now have all the data for every I-A team for the last 6 years; that's one hell of a sample size. So the data should be reliable. And the new calculations would only take a few minutes.
But what a gold mine. I am very interested in hearing the opinions of some of the football gurus on this board about what some of this data means. I'll offer some of my own interpretations; but with any luck, this will be far from the last word on the subject.
The Data
The table below shows:
Column 1: The 18 stats in descending order of their correlation to winning percentage
Column 2: Their correlation to Scoring Offense
Column 3: Their correlation to Scoring Defense
Category____________________corr. SO_______corr. SD
Scoring Defense_____________0.5781__________1.0000
Scoring Offense_____________1.0000__________0.5781
Pass Efficiency Defense_____0.5336__________0.8849
Rushing Defense_____________0.5871__________0.8650
Total Defense_______________0.4504__________0.9088
Pass Efficiency_____________0.8338__________0.4989
Turnover Margin_____________0.5832__________0.6639
Total Offense_______________0.8819__________0.3639
Net Punting_________________0.2323__________0.5330
Punt Return Defense_________0.2694__________0.5423
Rushing Offense_____________0.4969__________0.4782
Punt Returns________________0.3954__________0.3566
Kickoff Return Defense______0.3413__________0.3280
Pass Offense________________0.4914__________0.0550
Pass Defense________________0.0547__________0.5243
Kickoff Returns_____________0.2547__________0.1746
Penalties__________________-0.1297__________0.0089
Penalty Yards______________-0.1608__________0.0324
Observations
First off, one of the many checks to make sure the math was right was verifying that yes, Scoring Defense has a 1.0 correlation to itself, and so does Scoring Offense. Likewise, it is noteworthy, but just barely, that there is a 0.5781 correlation between the two. Very good teams tend to be good in both areas. Very bad teams tend to suck at both. The teams in the middle are mixed. Whatever.
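For anyone who wants to run the same kind of sanity check at home, here is a minimal Python sketch. It assumes the numbers are ordinary Pearson correlations computed on the national rankings (1 = best) for each team-season; that is my assumption about the method, and the five team-seasons below are made up purely for illustration, not the real data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical national rankings (1 = best) for five made-up team-seasons.
scoring_offense_rank = [3, 40, 75, 12, 110]
scoring_defense_rank = [10, 55, 60, 30, 95]

self_corr = pearson(scoring_offense_rank, scoring_offense_rank)
so_vs_sd = pearson(scoring_offense_rank, scoring_defense_rank)
sd_vs_so = pearson(scoring_defense_rank, scoring_offense_rank)

print(round(self_corr, 4))                      # a stat against itself
print(round(so_vs_sd, 4), round(sd_vs_so, 4))   # same value either way round
```

The two checks are the ones described above: a stat correlated with itself must come out at 1.0, and the Scoring Offense/Scoring Defense number must be the same whichever column you read it from.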
Further, it is not terribly surprising that Total Defense has the highest correlation to Scoring Defense and Total Offense has the highest correlation of any of the stats to Scoring Offense. If anything, this only serves to give me confidence in the data.
Rushing
Now this starts to get fun. Let's take another look at the rushing rows of the table by themselves.
Category____________________corr. SO_______corr. SD
Scoring Defense_____________0.5781__________1.0000
Scoring Offense_____________1.0000__________0.5781
Rushing Defense_____________0.5871__________0.8650
Rushing Offense_____________0.4969__________0.4782
First off, look at Rushing Defense's correlation to Scoring Offense. Is that cool or what? There are only two stats more closely correlated to scoring offense than is Rushing DEFENSE, but Rushing OFFENSE is NOT one of them!!
To me, the reason for this seems simple and I'll call it "The Blow-Out Effect". When you have an extremely high-powered offense, the other team typically ends up throwing the ball all over the joint trying to catch up. Not only does this drag down the number of carries per game against your defense, but it even drags down the yards per carry as the number of sacks goes up (due to the NCAA's perverse habit of subtracting sack yardage from rushing yards). The Rushing Defense ends up with better numbers, but may not necessarily BE better.
But this phenomenon seems to exacerbate the problem of the already unexpectedly low correlation between Rushing Offense and Scoring Offense. When you're getting killed, you tend to not rush as much; when you're blowing the other team out, you rush more so as to kill the clock. Shouldn't that INFLATE rushing (per game) numbers? Why on earth is Rushing Offense SIXTH on the list of things that are most correlated to Scoring Offense, and only barely above Passing Offense at that? Better football minds than mine will have to sort that out. All I can tell you is that I have checked and rechecked the data and the calculations.
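If you want to double-check the rankings I'm quoting, here is a small Python sketch that sorts the 16 other stats by their correlation to Scoring Offense. The numbers are copied straight from the table above; Scoring Defense is deliberately left out because it is one of the two scoring stats rather than one of the 16 (count it and Rushing Offense drops to seventh).

```python
# Correlation to Scoring Offense for the 16 other stats, copied from the table above.
corr_to_scoring_offense = {
    "Pass Efficiency Defense": 0.5336,
    "Rushing Defense": 0.5871,
    "Total Defense": 0.4504,
    "Pass Efficiency": 0.8338,
    "Turnover Margin": 0.5832,
    "Total Offense": 0.8819,
    "Net Punting": 0.2323,
    "Punt Return Defense": 0.2694,
    "Rushing Offense": 0.4969,
    "Punt Returns": 0.3954,
    "Kickoff Return Defense": 0.3413,
    "Pass Offense": 0.4914,
    "Pass Defense": 0.0547,
    "Kickoff Returns": 0.2547,
    "Penalties": -0.1297,
    "Penalty Yards": -0.1608,
}

# Sort stat names from highest to lowest correlation with Scoring Offense.
ranked = sorted(corr_to_scoring_offense, key=corr_to_scoring_offense.get, reverse=True)
rank_of = {stat: position for position, stat in enumerate(ranked, start=1)}

# Show the top seven, which covers everything discussed in this section.
for position, stat in enumerate(ranked[:7], start=1):
    print(position, stat, corr_to_scoring_offense[stat])
```

Rushing Defense comes out third, behind only Total Offense and Pass Efficiency, while Rushing Offense comes out sixth, one spot ahead of Pass Offense.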
What is less mysterious but still up for conjecture is why Rushing Offense has only a slightly better correlation to scoring offense than to scoring defense. Some might suggest that the offense that is good at rushing is keeping the ball away from the opposition. Personally, I have always been suspicious of this reasoning. When you count the number of possessions per game, each team almost always gets between 9 (extreme Tressel-ball) and 14 (WAC ball), with the number of possessions seldom differing by more than 1. That comes out to about an equal number of chances to score.
But maybe there is something to the idea of letting the other team's QB get "cold". I have always admitted that this is a factor, but have wondered just how much.
On the other hand, isn't it true that old-school coaches who focus on Rushing Offense also tend to appreciate the importance of a stout D? And isn't it also true that WAC-a-doodle, pass-happy coaches often focus on the offense to the detriment of the defense? It seems to me that this might explain the data more than anything. Let's call this "The Coaching Effect" for future reference.
Passing
The questions don't stop there. Here are the data for passing, taken by themselves:
Category____________________corr. SO_______corr. SD
Scoring Defense_____________0.5781__________1.0000
Scoring Offense_____________1.0000__________0.5781
Pass Efficiency Defense_____0.5336__________0.8849
Pass Efficiency_____________0.8338__________0.4989
Pass Offense________________0.4914__________0.0550
Pass Defense________________0.0547__________0.5243
What makes these numbers most interesting is comparing them to the numbers for the rushing statistics. While Rushing Defense has a much higher correlation to Scoring Defense than does Pass Defense (yardage), Pass EFFICIENCY Defense BEATS Rushing Defense in its correlation to Scoring Defense (0.8849 to 0.8650). This, I would expect, is also due to "The Blow-Out Effect": You know the pass is coming, so the other team's QB is throwing the ball while running for his life into a defensive backfield that is less concerned with run support than it otherwise might be. This also explains Pass Efficiency Defense's correlation to Scoring Offense, though it is notable that Rushing Defense's correlation to Scoring Offense is higher still (0.5871 to 0.5336).
Similarly explainable, though with less certainty, is the looseness of the connection between passing yardage and scoring. Pass offense's relatively low correlation to scoring offense, and likewise pass defense's loose association with scoring defense, both seem attributable to "The Blow-Out Effect". You might throw less when you've scored a lot; and the other team will throw more when they're having a hard time scoring, because they're probably playing from behind for much of the game.
But look at the other side of that coin; and better yet, compare it to the situation with rushing offense and defense. Passing offense has seemingly NO CONNECTION AT ALL with scoring defense, and Passing defense likewise has NO CONNECTION AT ALL with scoring offense. Recall that this was clearly not the case with the rushing numbers, as each rushing category has a moderate positive correlation with the opposite scoring category.
It seems to me that both "The Blow-Out Effect" and "The Coaching Effect" would apply negative pressure to both of these correlations, and therefore the number I would expect would be moderately negative. I guess it is possible that very good teams being adept in all areas and very bad teams being uniformly awful might cancel that out though. It is also possible that Passing simply has no "Cross-Correlation" effect the way that rushing does.
For those of you who are engineers and are therefore familiar with a very different definition for "Cross-Correlation", spare me. No one who isn't already familiar with it wants to know the "classical" definition of that term.
Turnovers
Turnover Margin, it turns out, has a fairly significant impact on Scoring Offense (0.5832), and an even greater impact on Scoring Defense (0.6639). I wonder if the greater effect on defense is caused by the shift in momentum attendant to turnovers ("sudden change" is the generalized term used by tOSU coaching staff).
Special Teams
Net Punting and Punt Return Defense both have a moderate impact on defense, as you would expect. Their effect on the field position battle would seem to explain their small but clear effect on scoring offense. What is interesting about these two is that Net Punting has a higher correlation to winning percentage, but Punt Return Defense has a higher correlation to both Scoring Offense and Scoring Defense. As the difference is not great, it isn't clear if there is anything to be gained by employing inductive reasoning to fathom the cause.
It is notable that the effect Kickoff Return Defense has on the field position battle is seen more in Scoring Offense than in Scoring Defense, while the reverse was true for Punt Return Defense. It seems to me that this is attributable to the fact that a good Kickoff Return Defense will leave the opponent pinned deep in its own territory every time (punt defense only sometimes), ultimately resulting in the offense getting good field position; bad Kickoff Return Defense results in average field position much of the time, but bad Punt Return Defense can result in disastrous field position for your defense much of the time.
Penalties
Finally, there is the matter of the zebras. It is fascinating to me that penalties and penalty yards seem to have no effect whatsoever on defense. But look at the effect on offense. Bear in mind that this is the ranking for scoring offense correlated to the ranking for fewest-penalties and fewest-penalty-yards. So the lower number is better for both, thus resulting in an expectation of a positive correlation. But there is a small but measurable negative correlation. With this large a set of data, this correlation can be taken to be quite real, even if it is perplexing.
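To put a rough number on "quite real", here is a hedged Python sketch using the Fisher z-transformation to get an approximate 95% confidence interval around a correlation. Two things here are my assumptions rather than facts from the data set: the sample size of 700 team-seasons is only a ballpark for six years of I-A football, and the Fisher interval is strictly meant for ordinary Pearson correlations, so on rankings it is only an approximation.

```python
from math import atanh, tanh, sqrt

def correlation_interval(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a correlation r from n pairs,
    using the Fisher z-transformation."""
    z = atanh(r)                       # transform r to the z scale
    half_width = z_crit / sqrt(n - 3)  # standard error on the z scale is 1/sqrt(n-3)
    return tanh(z - half_width), tanh(z + half_width)

ASSUMED_N = 700  # assumption: roughly six seasons of I-A teams, not an exact count

# Correlations taken from the Penalties row of the table above.
penalties_vs_offense = correlation_interval(-0.1297, ASSUMED_N)
penalties_vs_defense = correlation_interval(0.0089, ASSUMED_N)

print(penalties_vs_offense)  # both ends fall below zero
print(penalties_vs_defense)  # the interval straddles zero
```

Under those assumptions, the interval for Penalties against Scoring Offense stays entirely below zero, while the one for Penalties against Scoring Defense comfortably includes zero, which fits the reading that the offensive effect is small but real and the defensive one is indistinguishable from nothing.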
My first thought was that this small correlation is a result of there being a difference in the way games are officiated from conference to conference, in conjunction with the fact that there are vast differences in the average offensive output of each conference. But wouldn't that then show an effect on the defense as well? Better football minds than mine will have to hash that one out too.
What seems very clear from this though is that the officials affect the game far more on calls regarding possession than on called (or not) penalties.