Thursday, September 10

Hail Mary Probabilities in College Football

In case you didn't hear my yelling all the way from Illinois this last Saturday, BYU beat Nebraska on a last second (literally) Hail Mary - a 42 yard pass from Mangum to Matthews. As a game winning play, it has to rank up there with McMahon to Brown in the 1980 Holiday Bowl and Beck to Harline against Utah. (Though it probably can't compete with the importance of the game - BYU's first bowl win ever and, well, any game against Utah.)

The question of the day is: how unlikely are these plays? It seems like these plays ought to be easy to defend. It's end zone or bust as far as the offense is concerned, so the defense really only has to worry about a long pass play, and it isn't easy to complete a pass when there are 3 or 4 receivers against 8 defenders in the end zone. Sadly, the internet is pretty lacking in data on this. The best I could find was this analysis using data from 2000 to 2011 in the NFL, which gave them only 223 data points.

Well, after some digging, I found this site, which has play by play data for the 2014 college football season. There are a lot of fun things that can be done with this data. And, because there are so many more college football teams, there is a lot of data there. So, I started filtering . . .

In the interest of transparency, I'm going to fully detail how I've selected this data. It's more than 145,000 plays, so you'll excuse me for not sorting through them by hand. (Feel free to skip this next bit if you want.) The data was filtered to include drives that ended the second or fourth quarter and whose final play was a pass. This catches the end of each half, where there is little reason not to try and get points if you can. I then excluded plays from the 4th quarter where the margin was more than 9 points - teams in this situation may not be playing their best players. (I assume no one has given up before halftime.) There are, of course, a few worthy plays that could slip through this screening. For instance, if a team throws a "hail mary" that happens to be caught while there is still one or two seconds left on the clock, I won't have included it. But I'm happy that this filtering will provide the vast majority of pertinent situations. Oh, and in case you're wondering, I'm excluding all OT possessions because of how the data is configured. (Some small percentage of OT games would have passing plays that qualify as short range hail marys.)
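For anyone who wants to try something similar, here is a minimal sketch of that filter in Python. It is not the code used for this post, and every field name in it (game_id, drive_id, quarter, play_type, margin, ends_period) is hypothetical; a real play by play file will be laid out differently. It also assumes the plays arrive in game order and that "pass" covers whatever the data set counts as a pass play.

```python
from dataclasses import dataclass


@dataclass
class Play:
    # All fields are hypothetical stand-ins for columns in a play by play file.
    game_id: str
    drive_id: int
    quarter: int        # 1-4 for regulation; 5 or more would be overtime
    play_type: str      # e.g. "pass" or "rush"
    margin: int         # absolute score difference before the play
    ends_period: bool   # True if this play's drive ended the quarter


def final_plays(plays):
    """Return the last play of each drive, assuming plays are in game order."""
    last = {}
    for play in plays:
        # Later plays in the same drive overwrite earlier ones.
        last[(play.game_id, play.drive_id)] = play
    return list(last.values())


def is_candidate(play, max_margin=9):
    """Apply the screening rules described above to a drive's final play."""
    if play.quarter not in (2, 4):
        return False    # also drops overtime possessions
    if not play.ends_period:
        return False    # the drive did not end the half or the game
    if play.play_type != "pass":
        return False
    if play.quarter == 4 and play.margin > max_margin:
        return False    # 4th quarter blowouts are excluded
    return True


def select_candidates(plays, max_margin=9):
    """Final plays of drives that pass every screening rule."""
    return [p for p in final_plays(plays) if is_candidate(p, max_margin)]
```

Keeping the rules in one small function makes it easy to change the 9 point cutoff and see how much the sample moves.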

Ok, so here's what I find: 210 plays met these criteria, which resulted in 11 touchdowns. That's 5.2%, which isn't great, but is a little higher than I would have guessed. Let's break it down more.



Comments:
- These numbers sure make it look like teams aren't trying very hard at the end of the second quarter.
- There was only 1 defensive touchdown scored in all these plays (in a 4th quarter blowout), so there isn't much of a reason to fear a pick-six (which could hurt you in the 2nd quarter, but can't in the 4th quarter unless you're tied).
- The huge fumble rate is due to teams lateraling the ball around trying to score, and mostly happens on plays from really far away. (Average field position for the 7 fumbles in the 4th quarter was the offense's own 27 yard line.)
- Look at the huge difference in sack rate between the 2nd and 4th quarters. On game ending hail mary plays, defenses usually rush 3 and offenses will block with 6. Apparently teams don't do that in the 2nd quarter. Perhaps pressuring the QB is more effective? (Bad things can happen when you give a quarterback all day. See: 2006, Beck to Harline.)
- "4th down, knock it down" applies here. ("4th quarter, knock it down"?) Unless it's a tie game and you are going to try and return it for the win, there's no reason to even try and catch the ball. Nearly all of these interceptions are a case of a defensive back deserving a ruler across the back of his hands.

Taking all 209 attempts with yardage data from competitive games together, we can break it down by distance:




More comments:

- The completions that don't result in a TD from less than 50 yards out confuse me. The QB should be able to get it to the end zone from there - I guess some of these are teams running out of time and not getting that "final play" off. In one sense those plays should be removed from the data set - they don't represent a "hail mary". But, in another sense, I like that they're there, because they represent another way that you can fail from the 40 yard line with 7 seconds left in the game.
- Completion percentage skyrockets beyond 70 yards; quarterbacks can't throw the ball that far, so it is more sensible to throw short and hope for some lateraling magic to happen. In this group we see the INT rate drop (6%), fumble rate climb (19%), and the sack rate drop (13%). This strategy is also far less likely to work (3.1% success rate), though, given the low chances of success, some of these 2nd quarter attempts may not represent true best efforts.
- It's easier to score from 11-20 yards out than it is from closer in. This probably makes sense as offenses have a little more room to operate.
- Why is 41-50 yards a sweet spot, compared to 21-40 yards? Also, why are there so many more attempts from the longer distance? One reason is that from the 45 yard line a hail mary is an option, while most teams won't even consider attempting a 62 yard FG. So even if your team is only down 1 or 2 points, the hail mary is probably the better option. Here's some data on historical NFL field goal kicking by distance. Apparently my next post needs to look at FG % by distance for college kickers, because I can't find anything very good for college.

Finally, here's a graph of what the success rate looks like by distance.


I'd love to have more data to see if this would smooth out a bit, or if this really is tri-modal. So, if you happen to have multiple years of play by play data for college football handy, let me know.
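If you do have more seasons on hand, here is a rough Python sketch of the kind of binning behind a success-rate-by-distance view. It is an illustration and not the code behind the graph above: the function names and the (yards_to_goal, scored_touchdown) input format are assumptions. It returns the attempt count next to each rate, since a bin with only a handful of attempts can make a bump look more real than it is.

```python
from collections import defaultdict


def distance_bin(yards_to_goal, width=10):
    """Label the bin a distance falls in, e.g. 42 -> '41-50'."""
    low = ((yards_to_goal - 1) // width) * width + 1
    return f"{low}-{low + width - 1}"


def success_by_distance(attempts, width=10):
    """Summarize (yards_to_goal, scored_touchdown) pairs by distance bin.

    Returns a dict mapping bin label -> (touchdowns, attempts, rate).
    """
    counts = defaultdict(lambda: [0, 0])    # bin -> [touchdowns, attempts]
    for yards, scored in attempts:
        entry = counts[distance_bin(yards, width)]
        entry[0] += int(scored)
        entry[1] += 1
    return {label: (tds, n, tds / n) for label, (tds, n) in counts.items()}
```

Pooling several seasons into one list of attempts before calling success_by_distance would be the simplest way to check whether the peaks survive a larger sample.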

Also, as I do have data for last year, if you can think of anything interesting you'd like me to look at, let me know. Essentially, what I have is the game state (teams, score, down, distance) and result (score, completion, INT, fumble, rush, sack, yards gained) for every play.
