Friday, March 14

March Math

March has arrived, which means it's time to do some math. This is a long-standing tradition for me. But I know that many* of you have been waiting two long years for more analysis of different NCAA tournament scoring systems. Wait no longer, your time has come!

To recap past work, the standard scoring system for most NCAA brackets doubles the point value per game each round (1, 2, 4, ...). The drawback here is that the final two wins by the champion account for 25% of all possible points. This means that if you have the champion correct, it is essentially impossible for anyone who does not have the champion correct to beat you. In any office- or friend-sized competition, you can pretty much just have people pick a final four and a championship final score and save yourself all that time agonizing over all those 6/11 matchups. They won't ever matter in this scoring system. Now that I've got the results of the last 29 years of NCAA tournaments in spreadsheet form (just the totals, not year by year), we can look at the value the various seeds provide.
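
For anyone who wants to check the 25% figure, here is a minimal sketch of the arithmetic. It assumes a 64-team bracket with six rounds (the play-in games are ignored), and the variable names are mine, not part of any bracket site.

```python
# Standard bracket scoring: the value of a correct pick doubles every round.
# Assumption: 64-team field, six rounds, play-in games not counted.
GAMES_PER_ROUND = [32, 16, 8, 4, 2, 1]
POINTS_PER_WIN = [1, 2, 4, 8, 16, 32]

# Every round hands out the same number of points in total.
round_totals = [games * pts for games, pts in zip(GAMES_PER_ROUND, POINTS_PER_WIN)]
total_points = sum(round_totals)

# The champion's last two wins are the semifinal and the final.
champion_last_two = POINTS_PER_WIN[4] + POINTS_PER_WIN[5]
share = champion_last_two / total_points

print(round_totals)                    # six rounds worth 32 points each
print(total_points, f"{share:.0%}")    # 192 total, 25% from two games
```

Each round is worth exactly 32 points, so two correct picks at the very end count as much as all 48 games of the first two rounds put together.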

1 seeds have never lost an opening round** game, so they're good for an automatic 4 points in the first round. They have an 87% winning percentage in the second round, so they're worth about 7 points in that round, then 11, 13, 15 and 20 points in the other rounds. In total, the 1 seeds produce about 69 points each year, out of 192 total points, or 36.1% of the total available value. Yes, over 1/3 of the whole contest is just knowing that the 1 seeds are the best teams. Not surprisingly, the 2 and 3 seeds also provide a lot of value, with 17.9% and 13.0% of the total points coming from those spots. Pictures say it better, so here's a graph.

[Graph: share of total points by seed, standard doubling system]

The key point here is that life drops off in a hurry. The championship game is worth 16.7% of the total, which is more points than the average total value of every 7, 8, 9, 10, 11, 12, 13, 14, 15 and 16 seed combined. This is what has always struck me as being lame. All those 5/12 upsets are what make us love the tournament, but the bracket contests measure something entirely different. It also means that even when 13 seeds do manage to win a game or two, they aren't producing any real value for those who managed to get it right anyway.
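
The per-seed values come from multiplying how often a seed wins in a round by what that win is worth. Here is a rough sketch using only the two 1 seed win rates quoted above (100% and 87%); the function name and the way I've laid out the inputs are my own assumptions, and the later-round rates aren't listed here, so they're left out.

```python
# Expected points a seed line produces per tournament, standard doubling system.
TEAMS_PER_SEED = 4                       # one team per region
POINTS_PER_WIN = [1, 2, 4, 8, 16, 32]
TOTAL_POINTS = 192

def expected_points(win_rates):
    """win_rates[i] is the fraction of all teams of this seed that win
    their game in round i + 1. Returns expected points for each round given."""
    return [TEAMS_PER_SEED * rate * pts
            for rate, pts in zip(win_rates, POINTS_PER_WIN)]

# Only the first two rounds for 1 seeds are quoted in the post.
one_seed_first_two = expected_points([1.00, 0.87])
print(one_seed_first_two)                # 4 points, then about 7

# The championship game on its own, as a share of everything.
championship_share = POINTS_PER_WIN[-1] / TOTAL_POINTS
print(f"{championship_share:.1%}")       # 16.7%
```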

Occasionally, you will see a bracket that doesn't crank up the value of each round as much, incrementing by one each round (1, 2, 3, ...). This decreases the value of the final games significantly, effectively increasing the value of the earlier rounds. Since the early rounds are where the lower-seeded teams can make some noise, those seeds see an increase in value with this scoring system.
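
To see how much weight shifts toward the early rounds, here is a small sketch comparing the share of total points each round carries under the two systems. It assumes the same 64-team, six-round bracket, and `round_shares` is just a helper name I made up.

```python
# Share of all available points carried by each round, for a given scoring scale.
GAMES_PER_ROUND = [32, 16, 8, 4, 2, 1]

def round_shares(points_per_win):
    totals = [games * pts for games, pts in zip(GAMES_PER_ROUND, points_per_win)]
    grand_total = sum(totals)
    return [t / grand_total for t in totals]

doubling = round_shares([1, 2, 4, 8, 16, 32])   # 192 points in total
linear = round_shares([1, 2, 3, 4, 5, 6])       # 120 points in total

for rnd, (d, l) in enumerate(zip(doubling, linear), start=1):
    print(f"round {rnd}: doubling {d:.1%}, linear {l:.1%}")
```

Under the linear scale the title game falls from 16.7% of the total to 5%, while the first round climbs from 16.7% to 26.7%.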

[Graph: share of total points by seed, incrementing by one each round]

The 1, 2 and 3 seeds all see a decrease, while every other seed goes up, except for the 16 seeds, which are stuck at zero. If one of those 16 seeds ever manages to pull off the upset, it would be huge. They've lost 116 straight, so if you manage to pick it right, you should be pretty stoked, but all it gets you is a single point. It does knock a 1 seed out, which will damage other people's brackets, but despite predicting the greatest upset in 30 years of tournament history, you're still going to get creamed by anyone who picked the winner correctly.

So, I proposed a system where you get bonus points for upsets, which brings value back to the lower seeds. Basically, you take the exponential point value per round (1, 2, 4, 8, ...) and add on the difference between the seeds, multiplied by the number of the round they're in. (12 over a 5 in the first round is worth 1 + (12-5)*1 = 8 points. 8 over a 1 in the second round is worth 2 + (8-1)*2 = 16 points.) Here's the graph:

[Graph: share of total points by seed, upset-bonus system]

A big win for the little guys! The top seeds take another hit, and every seed from 4 through 12 has a relatively equal value (except for the poor 9 seeds - they just suck). The system has downsides. It's a bit harder to implement (warning: math). Point values of games depend on who makes it there (and who wins), so numbers aren't comparable from year to year, and it is very difficult to look ahead and see what might happen in the bracket.
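
The math isn't that bad, though. Here is a minimal sketch of the upset-bonus rule as I described it: the doubling base value for the round, plus the seed difference times the round number. The function name is made up for illustration, and I'm assuming that a favorite beating a lower seed just gets the base value with no bonus or penalty.

```python
def upset_bonus_points(round_number, winner_seed, loser_seed):
    """Points for correctly picking a game under the upset-bonus system.

    round_number: 1 for the first round, 2 for the second, and so on.
    Assumption: the bonus only applies when the winner has the higher
    seed number (an upset); otherwise only the base value is awarded.
    """
    base = 2 ** (round_number - 1)               # 1, 2, 4, 8, ...
    seed_gap = max(winner_seed - loser_seed, 0)  # 0 when the favorite wins
    return base + seed_gap * round_number

print(upset_bonus_points(1, 12, 5))   # 1 + (12-5)*1 = 8
print(upset_bonus_points(2, 8, 1))    # 2 + (8-1)*2 = 16
print(upset_bonus_points(1, 1, 16))   # favorite wins, base value only: 1
```

Because the score depends on both teams in the game, you can't fill in a table of point values until you know who actually got there, which is exactly the look-ahead problem mentioned above.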

Of course, I can take this a step further. The idea I've been pushing here is basically that the system should be weighted such that the value of a win is inversely proportional to its likelihood. We could do that. In this system, you just need a table of values (which I'm looking at) that tells how many points each seed gets for each win in each round. For 1 seeds, that would be 1.0, 1.1, 1.5, 2.5, 4.3, and 6.4 points for each win. For 2 seeds it is 1.1, 1.5, 2.1, 4.6, 9.7 and 29.0. And so on and so forth. Of course, it doesn't work at all for projecting a value for an upset that has never happened before, but the system gives out 116 points when a seed wins in a round where that has only happened once, so that would be a place to start. But this method feels inelegant. Why? Because I say so. I realize that's no good reason at all, but that's true of all of these systems. The contest can be whatever you want it to be - or at least, whatever you can convince your friends to play with you.
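
As a rough sketch of how such a table could be built: each entry is the number of teams of that seed that have entered (4 per year over 29 years, so 116) divided by how many of them won in that round. The helper names are mine, and only the figures already quoted in this post are used; the rest of the spreadsheet isn't reproduced here.

```python
APPEARANCES = 29 * 4   # 116 teams per seed line over 29 tournaments

def points_from_counts(wins, appearances=APPEARANCES):
    """Inverse-likelihood value of a win: appearances / wins.
    Returns None when it has never happened, since the table can't price it."""
    if wins == 0:
        return None
    return appearances / wins

def points_from_rate(win_rate):
    """Same idea, starting from a win rate instead of raw counts."""
    if win_rate <= 0:
        return None
    return 1.0 / win_rate

print(points_from_counts(116))             # 1 seeds in round one: 1.0
print(round(points_from_rate(0.87), 1))    # 1 seeds in round two: 1.1
print(points_from_counts(1))               # something that happened once: 116.0
print(points_from_counts(0))               # never happened: None
```

The `None` case is the hole described above: a 16 seed winning its first game has no history to divide by, so you'd have to pick a value by hand.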


* if we define "many" as "greater than zero"
** I will never recognize the first 4 games as legitimate

1 comment:

tysqui said...

I agree with your comments completely. On espn.com, if you don't pick the correct team to win it all, then it's pretty likely that you're not going to win your bracket pool (unless it's quite small).

All growing up (when bracket scoring was done by hand), we would give 1 point for a win and 2 points for an upset, applicable for every round. It was a decent method, but I like your proposal better.