Scope Creep vs. the Full Story

Ready for more introduction to business intelligence, through the lens of NFL football? My blog on BI and football last week prompted some fun conversations, including a theory (which can be turned into an interesting business question) from Rob, one of my old Microsoft colleagues. 

To recap, my original question was "Given a deficit of X points, how much time does an NFL team need for a reasonable chance of a comeback?" I was able to purchase a fantastic set of data from ArmchairAnalysis.com, and discovered something unexpected: from the 2000 season through Super Bowl LI, the team that scored first in any given game went on to win 65% of the time.

That surprisingly high statistic leads to other interesting questions, including a bit of a puzzle: if scoring first affords a strong edge, why does the team that wins the coin toss almost always elect to kick off first in the first half?*

I'm not digging into that question quite yet, because Rob brought up that other interesting theory: the team that gets to 20 points first almost always wins. His question opens up a great conversation about scope creep in BI projects.

Look at every project manager's resume and you'll see "avoiding scope creep" listed as a primary skill.** BI managers live in fear of scope creep, mainly because of an associated phenomenon: people who add requirements during the execution of a project often forget that their additions contributed to the project failing to complete on time. The best defense against this, of course, is another of those great management skills, "setting proper expectations." A thorough project charter should clearly outline what's in scope and what isn't, and a major change should require stakeholder review and approval.

But an additional question can't always be dismissed as scope creep. My fifth rule of business intelligence is that each stakeholder has his or her own questions that must be strategically organized.*** Part of that organization is evaluation of the question. Does it belong in the project? If so, what's the priority, and how are resources impacted?

Since my NFL analysis hobby is blissfully free of stakeholders, I decided to take a diversion from my own questions and answer Rob's.**** 

First, I should mention that some time ago I migrated the original data (imported from .csv files to Access) to SQL Azure. When I started adding measures and dimensions to the data, I realized I'd much rather do it in SQL than VBA. Also, it gave me a chance to try out SQL Azure as a consumer. (And some day I'll post my critique of the migration process from Access to Azure.)

The question required some additional fields in the dataset. (DrUsual's Fourth Rule of Business Intelligence: the platform must evolve. This can apply to the data itself, the hosting environment, visualization capabilities, and much more.) For each game, I added fields to flag whether either team reached 20 points, whether the team that reached 20 first went on to win the game, how many minutes remained when 20 points were reached, and what the trailing team's deficit was when the other team hit the 20-point mark.
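
For the curious, here's a rough sketch of how fields like these could be derived from a game's scoring plays. This is not the SQL behind the numbers in this post; it's a small Python illustration, and the ScoringPlay layout, the race_to_twenty function, and the sample game between teams "A" and "B" are all made up for the example. It assumes plays arrive in chronological order and ignores overtime.

```python
from typing import NamedTuple


class ScoringPlay(NamedTuple):
    """Hypothetical record layout, not the real dataset's schema."""
    minutes_remaining: float  # game minutes left when the score happened
    team: str                 # team that scored
    points: int               # points added by this play


def race_to_twenty(plays, home, away, threshold=20):
    """Derive the per-game flags from chronologically ordered plays."""
    score = {home: 0, away: 0}
    first = None
    for play in plays:
        score[play.team] += play.points
        # Record only the first time either team crosses the threshold.
        if first is None and score[play.team] >= threshold:
            other = away if play.team == home else home
            first = {
                "first_to_20": play.team,
                "minutes_remaining": play.minutes_remaining,
                # Always positive: the other team is still under the threshold.
                "deficit": score[play.team] - score[other],
            }
    if first is None:
        return {"reached_20": False}
    # A tie leaves no winner, so the flag comes out False in that case.
    winner = None if score[home] == score[away] else max(score, key=score.get)
    return {
        "reached_20": True,
        **first,
        "first_to_20_won": winner == first["first_to_20"],
    }


# Invented game: A goes up 20-3 with 14 minutes left and hangs on, 20-17.
sample_plays = [
    ScoringPlay(52.0, "A", 7),
    ScoringPlay(40.5, "B", 3),
    ScoringPlay(33.0, "A", 7),
    ScoringPlay(14.0, "A", 6),
    ScoringPlay(9.0, "B", 7),
    ScoringPlay(2.0, "B", 7),
]
result = race_to_twenty(sample_plays, home="A", away="B")
print(result)
```

Computing the fields once per game and storing them, rather than recalculating inside every query, is what makes the pivots later on cheap to build.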

Time to pivot.

First conclusion: in a game where either team scores at least 20 points, the first team to reach 20 points wins the game 86% of the time. Bravo, Rob. However, we had an inkling this was true already, since the earlier analysis indicated that any lead is relatively hard to overcome. So, let's dive a bit deeper. What other conditions cause variance in the chance to win, despite having reached 20 points first?

The most impactful key driver (that I've found) is the trailing team's deficit.

  • If the trailing team is behind by 3 or less, the leading team has a 67% chance of winning.
  • If the trailing team is down by 7 to 10 points, that chance of winning leaps to 81%.
  • If the trailing team needs two scores or more to catch up (11 to 14 points), victory is 90% likely.
  • Reach 20 points with at least a 15-point lead, and it's pretty much game over -- you win 95% of the time.
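
A breakdown like the one above boils down to bucketing each game by deficit and counting wins per bucket. Here's a minimal sketch of that idea in Python. The function names and the four toy records are invented for illustration (they are not rows from the real dataset), and the "4 to 6" bucket is my assumption to keep the ranges contiguous, since the list above doesn't report a figure for it.

```python
from collections import defaultdict


def deficit_bucket(deficit):
    """Map the trailing team's deficit to a labeled range."""
    if deficit <= 3:
        return "3 or less"
    if deficit <= 6:
        return "4 to 6"  # assumed bucket; not reported in the list above
    if deficit <= 10:
        return "7 to 10"
    if deficit <= 14:
        return "11 to 14"
    return "15 or more"


def win_rate_by_deficit(games):
    """Return {bucket: (wins, games, win_rate)} for the first team to 20."""
    tally = defaultdict(lambda: [0, 0])  # bucket -> [wins, games]
    for game in games:
        bucket = deficit_bucket(game["deficit"])
        tally[bucket][0] += int(game["first_to_20_won"])
        tally[bucket][1] += 1
    return {b: (wins, total, wins / total) for b, (wins, total) in tally.items()}


# Toy records only, to show the shape of the input and output.
toy_games = [
    {"deficit": 3, "first_to_20_won": True},
    {"deficit": 2, "first_to_20_won": False},
    {"deficit": 8, "first_to_20_won": True},
    {"deficit": 17, "first_to_20_won": True},
]
rates = win_rate_by_deficit(toy_games)
for bucket, (wins, total, rate) in rates.items():
    print(f"{bucket}: {wins} of {total} ({rate:.0%})")
```

Keeping the raw counts next to the percentage matters: a 100% win rate over one game and over fifty games are very different claims.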

I was also interested in the impact of game time remaining. This should be significant, right? The more time left, the better chance of a comeback. The outcome here is interesting, though. Check out the results:

  • Leading team hits 20 points with 46 to 55 minutes remaining: 90% victory.
  • Leading team hits 20 points with 31 to 45 minutes remaining: 87% victory.
  • Leading team hits 20 points with 16 to 30 minutes remaining: 84% victory.
  • Leading team hits 20 points with 15 minutes or less remaining: 88% victory.

In addition to the tight range, there's an interesting anomaly: your best chance of winning comes when you reach 20 points with the MOST time available for the other team to come back. Two conclusions here. First, time remaining isn't as impactful as the actual deficit. Second, a team that makes it to 20 points in the first quarter of the game is probably truly outclassing the opponent. The leading team has won 52 of 58 games where this has happened. 

[Chart: RaceToTwenty.png]

The chart above shows a pivot of both time remaining and the deficit when one team reaches 20 points. I put this one together simply for an "eyeball check" -- scanning to see if anything interesting jumps out. One clear lesson: if you're going to let the other team jump to an early lead, make it a big lead. There are six instances of a team losing after reaching 20 points in the first 15 minutes. In five of those cases, they had at least a 15-point lead when reaching 20 points. We can probably label that the "Wake Up Call Phenomenon."
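
If you want to build a similar two-way pivot yourself, the idea is to label each game on both axes and count games and losses per cell. The sketch below is one way to do it in Python, not the tool I used for the chart. The bucket edges follow the ranges used in this post (with an assumed "4 to 6" deficit bucket and an open-ended "46 or more" time bucket), minutes are assumed to be whole numbers, and the four sample games are invented.

```python
from bisect import bisect_left
from collections import Counter

# Upper edges of each bucket; the last label catches everything above.
TIME_EDGES = [15, 30, 45]
TIME_LABELS = ["15 or less", "16 to 30", "31 to 45", "46 or more"]
DEFICIT_EDGES = [3, 6, 10, 14]
DEFICIT_LABELS = ["3 or less", "4 to 6", "7 to 10", "11 to 14", "15 or more"]


def label(value, edges, labels):
    """Pick the label for the first bucket whose upper edge is >= value."""
    return labels[bisect_left(edges, value)]


def pivot(games):
    """Count games and leader losses per (time, deficit) cell."""
    played, lost = Counter(), Counter()
    for minutes_remaining, deficit, leader_won in games:
        cell = (
            label(minutes_remaining, TIME_EDGES, TIME_LABELS),
            label(deficit, DEFICIT_EDGES, DEFICIT_LABELS),
        )
        played[cell] += 1
        if not leader_won:
            lost[cell] += 1
    return played, lost


# Invented rows: (minutes remaining at 20, deficit at 20, leader won?)
toy_games = [
    (50, 21, False),
    (50, 17, True),
    (14, 17, False),
    (20, 3, True),
]
played, lost = pivot(toy_games)
for cell in sorted(played):
    print(cell, f"{lost[cell]} losses in {played[cell]} games")
```

Counting losses rather than wins makes the rare comebacks stand out, which is the whole point of an eyeball check.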

(Incidentally, it was Buffalo that choked in two of these five games, giving up 21-point leads each time. In 2011 they ended up losing 49 to 21, via another phenomenon we'll simply call "Tom Brady Was Here.")

And in case you're wondering about the single instance of a team overcoming a deficit of more than 15 points when the opposition reached the 20-point mark with less than 15 minutes to go -- that was the St. Louis Rams, week 17 of 2002 versus San Francisco. The 49ers made it 20-3 with 14 minutes remaining, and the Rams managed four touchdowns in that last quarter, including a defensive TD on a fumble recovery.

Ah, sports. You make numbers so much fun.


* The most common theory is that there's an even greater advantage to scoring last in the first half, then scoring on the opening drive of the second half. I'll explore that in another blog.

** I'm surprised it's not a standard skill on LinkedIn. Then again, a lot of really good skills are missing from LinkedIn's list, including "Not wasting time," "Taking accountability," and "Acting like a grownup."  

*** I know, in the previous blog I gave you my first rule, and in this one I'm giving you my fifth and fourth. It's a narrative, not a recipe.

**** After all, knowledge for knowledge's sake is worthwhile, but conversations are more fun.