Business Intelligence 101: Infographics

Another appropriate title for this blog would be, "Buying Myself More Time." I'm eager to share the results of my NFL 1-2 Punch analysis with you, but the latest discovery does need in-depth validation. A large number of NFL fans take the 1-2 scoring strategy as gospel, and I'd like to be quite certain of my results before I contradict them.

That certainty reminds me of a conversation with my friend Roland, a few years back. I was rattling off a list of Swedish rock bands, and included Golden Earring. Keep in mind, I love music trivia. I'm very confident in my music trivia knowledge. And for the better part of 30 years*, I was certain that Golden Earring was from Sweden. Roland, who's Dutch, informed me that no, Golden Earring is from The Netherlands. It took me a while to believe him, despite his rather obvious authority on the matter, simply because I had been so certain they were Swedish.**

Back to the subject at hand: infographics. I used the term "infographic" last week with a non-IT, non-corporate friend and was asked, "What exactly is that?"

Simply put, an infographic is a visual representation of data or information. Seems straightforward. Aren't you just trying to impress me by using a fancier word than "chart?"

Fig. 1: So, who was part of the band in 1997? 

Not exactly. Consider this differentiation: when you look at the visual, no matter what you call it, do you get a sense of overwhelming detail, or do you immediately get a feel for the overall picture?*** A good infographic should produce the latter. More detail on infographic functions in a moment.

I stumbled across a great example while checking some items for yesterday's blog on band compositions. Wikipedia has some excellent infographics showing membership in popular bands over time. Check out the two screen captures: the first gives a simple textual list of people who have been part of Styx. The second is a graphical timeline.

Fig. 2: Much easier to read, and actually provides some insight beyond simple membership.

Want to know who was in the band in 1978, 1997, or 2003? You'll probably answer those questions far more quickly using the infographic. It also provides some immediate insights that you don't get easily from the text version. Apparently the band was inactive from 1984 through late 1990, and again from 1992 to 1995. And it prompts an interesting question: what was that brief stint in 1995?****

Those are the things a good infographic should or can do: first, give you a good feel for the overall picture. Second, provide a visual that helps commit the general picture to memory. Third, make it easy to answer basic questions. Fourth, call out significant insights, and finally, prompt questions that might be worth following up. And a good infographic should do these things more effectively than text or grid delivery.

If you're still with me, how about a little homework? Send me a link with an interesting infographic. If I receive a few of these, I'll follow up with another post to share people's favorites.


* 30 years is nothing compared to the astounding 48 year+ consistency of the band. George Kooymans and Rinus Gerritsen founded the band in 1961 and have stayed together the entire intervening time. The two other current members, Barry Hay and Cesar Zuiderwijk, have been with them since the late 60s or early 70s. Freaking amazing. I see you guys aren't planning on any Texas visits, probably because of my transgression with the whole "they're Swedish" thing, so maybe I need to visit Oss in July.

** I was right about ABBA, Europe, and Ace of Base, at least.

*** In my opinion, this is one of the top flaws with business intelligence delivery. Report consumers love big scorecards with dozens of KPIs, but the report author should always consider time to action. When a consumer sits down to look at this report, how long does he or she need to determine the action to be taken?

**** This was a brief reunion, re-recording songs for a greatest hits album. They planned to tour as well, but unfortunately, John Panozzo was in bad health and passed away. 

It's A Valid Question, Mr. Hunter!

Today's blog topic was planned to be analysis of the NFL's 1-2 Punch theory, but I'm invoking my Second Rule of BI: check your work. Stakeholder resistance to a result is directly proportional to how strongly that result contradicts the stakeholder's expected result.* So, when the outcome is dramatically different from "common knowledge," it pays to spend some time double-checking. Not to mention prepping a presentation, because you're going to have to go in-depth to reassure your audience.

So, I'm going to postpone the results of the 1-2 Punch analysis and post an article I started last fall, before life was derailed for a few months. This'll give me some time** to explore the NFL analysis more in-depth, so I don't present something that truly is flawed.

Alan Hunter is one of my favorite people to listen to, whether it's on SiriusXM or Twitter. In case you're not familiar, Hunter was one of the original MTV VJs, and is now a host on two of my favorite Sirius channels, 80's on 8 and Classic Rewind. He's a prolific Twitterer or Tweeter or whatever the hell we call it, and seems to be a genuinely nice guy, and the world could use a few more of those.

He's also very responsive to his fans on Twitter.*** Last fall, when I was giving some serious consideration to buying a ticket for the 2018 80's on 8 Cruise, I tweeted a question: would Fee Waybill be on the cruise? I think Mr. Hunter's response was giving me a small dose of good-natured sarcasm. (Like I said, genuinely nice guy.)

Still, it elicited some good-natured indignation from me. I know Fee Waybill is the lead singer for The Tubes.**** It's still a valid question. This is 2018, after all. If you buy tickets for an "80's band" concert, you'd better check the lineup closely.

Tubes Tweet.png

Case in point: a few years ago I saw Yes at WinStar Casino...and walked out thoroughly disappointed. The performance was excellent -- but they played almost none of my favorite songs. No "Leave It." No "It Can Happen." No "Love Will Find a Way." Why? Because there are two different Yes configurations today. If you're seeing Geoff Downes, Steve Howe, and Jon Davison, you're going to get a totally different set list than if you see Jon Anderson, Trevor Rabin, and Rick Wakeman. 

Or how about Fleetwood Mac? I remember being really excited a few years ago to see Fleetwood Mac tickets on pre-sale (a year ahead of time) until I read that Stevie Nicks wasn't in the lineup. I'm sure it was still an awesome concert, but to me, just not the same.

The Cars, Styx, A Flock of Seagulls, Journey...fair to say, whenever you see any 80's band today, you should ask, "Who exactly IS Styx today?" Or in the case of Van Halen, "So, who's singing this month?"

That in mind, I can't wait to see Jeff Lynne's ELO in August. Yes, this is a very different lineup from ELO Part II. (But at least in this case each version of the group performs the original ELO catalog, as far as I know. If I don't hear Jeff Lynne singing "Can't Get It Out Of My Head," I want my money back.)

Mr. Hunter, maybe I'll run into you at the United Center for the Bon Jovi concert in April? But I won't ask you if Jon Bon Jovi is going to be there.  :)


* Irony: when the analysis supports the pre-supposition, everyone's happy to say, "Looks good. It's just like we thought!" Contradict the assumptions, though, and the first argument is that either the data or the analysis is wrong. Or both.

** I don't actually have time for this. I've got job applications to fill out, a screenplay to finish writing, and an iOS development tutorial that I really want to complete. But this NFL data is so intriguing...

*** I find this particularly cool. Too many celebrities forget that without the fans...you're not really a celebrity.

**** If you didn't know this, don't be embarrassed. It just means you haven't spent enough time on music trivia with me. To impress and enlighten your other friends, just reference Waybill's excellent solo contribution to the St. Elmo's Fire soundtrack, "Saved My Life."

Business Models versus Data Models

This morning I got back to one of my high-priority questions regarding my NFL data analysis -- if statistically there's a significant advantage to scoring first, why do teams almost universally defer to the second half after winning the coin toss?

The most prevalent theory seems to be that there's a bigger advantage in a 1-2 punch: scoring at the end of the first half, then scoring again immediately after receiving the opening kick of the second half. Good news -- with my handy data set from ArmchairAnalysis.com, we can put that theory to the test.

Jacoby Ford.jpg

Time to augment the dataset!* I added a number of fields, tracking which team was on offense for the opening drive, which team was the offense for the final drive of the first half, and which team was on offense for the first drive of the second half. Next, how many points were scored in the H1 final drive, and how many points were scored in the H2 opening drive. This gives me all the data I need to test the hypothesis. 
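For the curious, here's a minimal sketch of how those fields could be derived from drive-level rows. The column layout, the sample rows, and the `summarize_games` helper are all my own assumptions for illustration; they aren't the ArmchairAnalysis schema or real game data.

```python
from collections import defaultdict

# Made-up sample rows: (game_id, drive_seq, half, offense, points).
# Column names and values are hypothetical, not the vendor's schema.
DRIVES = [
    (1, 1, 1, "AWAY", 0),  # opening drive of the game
    (1, 2, 1, "HOME", 7),
    (1, 3, 1, "AWAY", 0),
    (1, 4, 1, "HOME", 3),  # final drive of the first half
    (1, 5, 2, "HOME", 7),  # opening drive of the second half
    (1, 6, 2, "AWAY", 0),
]


def summarize_games(drives):
    """Derive per-game opening/closing drive fields from drive-level rows."""
    by_game = defaultdict(list)
    for game_id, seq, half, offense, points in drives:
        by_game[game_id].append((seq, half, offense, points))

    summary = {}
    for game_id, rows in by_game.items():
        rows.sort()  # order drives by sequence number
        first_half = [r for r in rows if r[1] == 1]
        second_half = [r for r in rows if r[1] == 2]
        if not first_half or not second_half:
            continue  # incomplete game record; skip it
        h1_last, h2_first = first_half[-1], second_half[0]
        summary[game_id] = {
            "opening_offense": first_half[0][2],
            "h1_final_offense": h1_last[2],
            "h1_final_points": h1_last[3],
            "h2_opening_offense": h2_first[2],
            "h2_opening_points": h2_first[3],
            # "1-2 punch": the same team scores on both of those drives.
            "one_two_punch": (
                h1_last[2] == h2_first[2]
                and h1_last[3] > 0
                and h2_first[3] > 0
            ),
        }
    return summary


SUMMARY = summarize_games(DRIVES)
print(SUMMARY[1])
```

With one row per game in hand, testing the hypothesis becomes a matter of comparing win rates for games with and without the `one_two_punch` flag.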

But wait! I compiled these new fields for ten games, just a small set for a quick check of the data logic. Good thing, because game #3 in my dataset showed the bane of the business intelligence professional: an anomaly that seems to contradict the business model.

Specifically, in Week 1 of the 2000 season, the Eagles somehow were on offense for the first drive of EACH half. According to the business model, this can't happen. The team that kicks off first in the first half will receive the first kickoff of the second half.**

So, what's going on here? Is my business model incorrect? Or is it a data issue?*** Fortunately, this dataset has the mother of all sports data: play-by-play information for all 4,523 games represented. It's a virtual ton of data. Analyst heaven.

And it provides the answer. Those crafty Eagles actually kicked and recovered an onside kick to start the game. Thus, in the dataset's [Drive] table, Philly is on offense for the first drive of the game. They still get to receive the ball first in the second half, and since Dallas did not try the same maneuver, Philly ended up on offense for both opening kicks in the game. Go figure.

Is this an error in the business model or the data model? Neither, in my opinion. The business rule is that each team must make one of the opening kicks. The data definition of a "drive" is a series of events demarcated by one team's possession of the ball. The kickoff is a specific business process event, and that event does not constitute a "drive."

To prevent confusion, our documentation (our data dictionary, information model definition, white papers, etc.) should include a thorough explanation of this important point. Analysts need to understand it so they handle the scenario correctly in calculated fields. Report consumers need to understand it so they don't make misguided decisions. Remember that documentation blog that I posted last week? (It's the one you skipped.)

But wait...this isn't the only scenario where the team receiving the opening kick doesn't also own the first drive, according to the [Drive] table. In the 2010 season, Miami kicked off to Oakland, and Jacoby Ford returned it 101 yards for a touchdown. In the data model, no actual drive occurred -- only the kickoff event. The first drive of the game occurred after Oakland kicked, and Miami started an actual drive from their own 1 yard line. They did go on to win, despite taking the more common route to scoring.

All right, I've discovered something crucial about my dataset. I've explained it to my stakeholders. But...the job isn't done. Before I can move on to my question about the 1-2 Punch, I must decide how to handle these scenarios during analysis. First step -- determine how often they occur.

Turns out there are 159 instances of the same team seemingly owning both "opening" drives. That's 3.5% of my total games, not an insignificant number. If this were a professional project, I'd have a governance process defined for an official decision on how to account for the scenario in reporting. Since it's my personal project, though, I only have to agree with myself. Best stakeholder ever.

The solution: add a flag to conveniently filter these games out of the analysis when I (finally) get to that question of the 1-2 Punch. But since today's blog was hijacked by my seeming anomaly, that work will have to wait until tomorrow. Such is BI.
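As a rough sketch, the flag and the filter might look something like this. The field names and the three sample games are hypothetical; only the 159 and 4,523 counts come from the analysis above.

```python
def flag_double_openers(games):
    """Mark games where one team has the first drive of both halves."""
    for game in games:
        game["same_team_opens_both"] = (
            game["opening_offense"] == game["h2_opening_offense"]
        )
    return games


# Made-up per-game rows; field names are illustrative only.
GAMES = flag_double_openers([
    {"game_id": 1, "opening_offense": "HOME", "h2_opening_offense": "AWAY"},
    {"game_id": 2, "opening_offense": "HOME", "h2_opening_offense": "HOME"},
    {"game_id": 3, "opening_offense": "AWAY", "h2_opening_offense": "HOME"},
])

# Filter the flagged games out before running the 1-2 Punch analysis.
CLEAN_GAMES = [g for g in GAMES if not g["same_team_opens_both"]]

# Sanity check on the share quoted above: 159 of 4,523 games.
SHARE = 159 / 4523
print(f"{SHARE:.1%}")  # 3.5%
```

Keeping the flag as a field, rather than deleting the rows, means the excluded games stay available if the decision gets revisited later.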


* Data analysts get so excited about adding calculated fields to the dataset. I think that's our version of creating the CGI fire from the dragons in Game of Thrones.

** Unless the kickoff rules in 2000 were as unclear as the current day catch rules are, that is...

*** Dear BI Analysts: you may as well assume it's a data issue, until you prove otherwise. Your stakeholders certainly will. Low customer satisfaction? Bad resource utilization? Blew through the budget in one quarter? Prove it isn't a data error, then we'll talk.****

**** Keep in mind, as irritating as that attitude can be, that should be the attitude of the BI professional. You need to be 100% confident in the data before you tell your stakeholder that the data's fine.

Game of Thrones Teaches Business Intelligence!

While waiting impatiently for the next season of The Man in the High Castle, I've been re-watching Game of Thrones. Given that HBO has put off the final season by a year, and George R. R. Martin is now tentatively scheduled to deliver the final books sometime in the year 2030, I'm picking up a few of the tidbits that I missed the first time around.

A great example came up sometime in Season Six. Strife in the Greyjoy household! Uncle Euron comes home, murders his brother, and assumes kingship of the Iron Islands. Theon and Yara, his nephew and niece, steal the best ships from the Iron Fleet and sail away, presumably with a significant number of the island warriors.

IronFleet.jpg

Uncle Euron is undeterred, however. He holds forth with a rousing, inspirational speech, which ends with, "Build me 1,000 ships, and I'll <something something something about conquering the world.>"

Um, say again? Build 1,000 ships? 

Yep. Euron commands all hands on deck. Every man felling trees, cutting lumber, and banging ships together. Every woman stitching sails and weaving rope. Apparently it's just that easy. Time is a bit murky in the GoT world, but it's certainly only weeks or months before Euron is sailing with a massive armada that decimates Yara's fleet.*

It occurred to me that I've actually seen this episode many times, and not just on HBO. Operations, and BI in particular, are often the subject of the "Let's just dive in and do it in no time!" mentality. We've got a business problem that we didn't anticipate. Someone thought of a major new initiative while showering. Or worse, someone attended a workshop, saw some cool data visualizations that a vendor spent months building in the app they're trying to sell, and wants something similar in the ecosystem -- tomorrow.

"Build me 1,000 KPIs, and I'll give you the Seven Kingdoms!"

Here's the problem: building a ship isn't that easy. There are plenty of skilled trades involved in shipbuilding, and even with motivation provided by leadership's "can do" attitude**, the application of massive manpower rarely makes up for a lack of the correct skill and experience. One hundred foot soldiers can't significantly speed up the proper shaping of a keel, and if the keel is faulty, the ship is ineffective.

Likewise, a dozen vendors who know nothing of your business and data can't slap a quality BI infrastructure together overnight. So, how about pouring on the labor from the opposite direction? Open up the BI platform to those who have the business knowledge, if not the technical skill. After all, your account managers, team managers, and sales folk can just pick up some data skills with an hour on Lynda.com, right?***

There's a big difference between using a tool daily and having the skills necessary to build the tool. Most sailors spend a great deal of time onboard a ship, one might expect, and they're probably aware that it's supposed to float. Put them in charge of building one, and it'll probably have as auspicious a career as the CSS Neuse.****

Of course, you may be thinking that this is a terrible analogy because Euron Greyjoy's team DID finish 1,000 ships in virtually no time, smashed the Iron Fleet, and captured a bunch of key enemy leaders. Don't forget -- Game of Thrones is fiction. And fantasy. Reality would have made that subplot a bit anti-climactic, when Euron's armada set forth and promptly sank.

Strong leadership hires strong specialists, then paves the way for those specialists to do a quality job. Cut corners and throw wrongly skilled labor at your projects, and you may as well wait for the dragons to fly in and save the day.


* I'm not apologizing for "spoiling" an episode that's been out for over a year.

** Or the motivation provided by Uncle Greyjoy's "do it or die" attitude.

*** I also don't believe that 1,000 monkeys on 1,000 typewriters will recreate the works of Shakespeare. However, they probably could come up with the NFL's rules on what constitutes a catch.

**** It's an interesting story. Look it up; you can thank me later.

Don't Forget the Documentation

Apparently it's going to be business intelligence week on my blog, since this entry on my fifth rule of business intelligence makes three in a row* on the subject. This time, I'm going to give you the rule right up front: documentation must be a requirement of BI development, not an option.

This might seem like a no-brainer -- if you've never worked in an actual IT or operations department.

In a decidedly non-scientific method of evaluation,** I'm going to hazard a guess that documentation is considered a "nice to have" in about 99% of BI (and general IT) operations. More specifically, documentation is an activity that leadership and management teams typically refuse to budget time for, yet lament the lack of when things aren't so clear later. The strongest adherence to a culture of good documentation tends to be found in project management, but there's far more to documentation than just the project charter, Gantt charts, and status reports.

On the developer-facing side you've got data source, acquisition, and transformation information. Development standards, style guides, platform strategy and history, data governance, retention policies, and relationship models. Facing the end user, you've got business rules references, metric and KPI guides, subject overviews, and access policies. And that's just a short list -- there are far more subjects that need quality documentation in order for your data to become a usable information asset.

Predictability is one of the major focal points as a BI practice matures. Every stakeholder wants to know when his or her request*** will be ready. The BI team and vendor resources grow more rigorous about organizing development into sprints, performing VROMs, and predicting the number of hours for each task. And that prediction rarely includes thorough, high-quality documentation.

After all, time is money, and it's bad enough the world has to wait 20 person-hours for that next customer satisfaction report. Two more hours for governance and documentation?**** If we just forgo those activities on the next ten projects, we could accomplish an additional project in the "time saved!" We'll come back and document everything when we have some "breathing room."

Breathing room, of course, tends to occur on the 20th of Never.

Even in the one-man show of my NFL analysis I keep a rudimentary set of documentation. A field name that seems quite descriptive today can be ambiguous in a very short time of non-use. The couple of minutes I spend tracking notes in OneNote or Excel are time better spent than an hour of reverse-engineering my code later, or worse, sharing incorrect analysis because I forgot a definition.

Rudimentary, sure, but sufficient for preventing mistakes and saving time in future development.
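If you'd rather keep that kind of note next to the code, here's one possible sketch: a tiny data dictionary plus a check for fields nobody has defined yet. The field names and the `undocumented_fields` helper are hypothetical, invented for this example rather than taken from my actual notes.

```python
# A bare-bones data dictionary: field name -> plain-language definition.
# The field names here are hypothetical examples.
DATA_DICTIONARY = {
    "h1_final_offense": "Team on offense for the last drive of the first half.",
    "h2_opening_offense": (
        "Team on offense for the first drive of the second half. "
        "A kickoff is an event, not a drive."
    ),
    "same_team_opens_both": (
        "True when one team has the first drive of both halves."
    ),
}


def undocumented_fields(record, dictionary=DATA_DICTIONARY):
    """Return the fields in a record that have no definition yet."""
    return sorted(set(record) - set(dictionary))


ROW = {"h1_final_offense": "HOME", "mystery_flag": 1}
print(undocumented_fields(ROW))  # ['mystery_flag']
```

Running a check like this whenever a calculated field is added is a cheap way to keep the definitions from falling behind the data.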

Ever wonder why the FDA requires ingredients to be listed on food packaging? It's so you can understand what's in your food, and avoid making bad decisions. Sure, people still make plenty of bad dietary decisions, but with better information, they have a better chance of making a good decision. No one ever looks at a Cheeto and says, "Wow, a bag of these will help me lose weight!"

In BI, lack of documentation or low-quality documentation precipitates significant mistakes. A developer can create a new measure with an incorrect calculation. Stakeholders can waste time debating results because their definitions of KPIs differ. Incorrect information can be issued publicly, or to external customers.

The solution is simple. Change the culture of your organization such that proper documentation is part of the development process. Budget the time and resources to include documentation activities. Don't allow the process to be put off until that non-extant breathing room appears. And once that high-quality meta-information is available, try reading the side of the package every once in a while.


* That's right, a perfect 5 for 7!

** I.e., my gut feel after 20 years in this discipline.

*** Keep in mind that each stakeholder's current request is always the most crucial, make-or-break-our-business request in the history of the company. And Earth itself.

**** Next you're going to be demanding bathroom breaks!

Scope Creep vs. the Full Story

Ready for more introduction to business intelligence, through the lens of NFL football? My blog on BI and football last week prompted some fun conversations, including a theory (which can be turned into an interesting business question) from Rob, one of my old Microsoft colleagues. 

To recap, my original question was "Given a deficit of X points, how much time does an NFL team need for a reasonable chance of a comeback?" I was able to purchase a fantastic set of data from ArmchairAnalysis.com, and discovered something unexpected: from the 2000 season through Super Bowl LI, the team that scored first in any given game went on to win 65% of the time.

That surprisingly high statistic leads to other interesting questions, including a bit of a puzzle: if scoring first affords a strong edge, why does the team that wins the coin toss almost always elect to kick off first in the first half?*

I'm not digging into that question quite yet, because Rob brought up that other interesting theory: the team that gets to 20 points first almost always wins. His question opens up a great conversation about scope creep in BI projects.

Look at every project manager's resume and you'll see "avoiding scope creep" listed as a primary skill.** BI managers live in fear of scope creep, mainly because of an associated phenomenon: people who add requirements during the execution of a project often forget that their additions contributed to the project failing to complete on time. The best defense against this, of course, is another of those great management skills, "setting proper expectations." A thorough project charter should clearly outline what's in scope and what isn't, and a major change should require stakeholder review and approval.

But an additional question can't always be dismissed as scope creep. My fifth rule of business intelligence is that each stakeholder has his or her own questions that must be strategically organized.*** Part of that organization is evaluation of the question. Does it belong in the project? If so, what's the priority, and how are resources impacted?

Since my NFL analysis hobby is blissfully free of stakeholders, I decided to take a diversion from my own questions and answer Rob's.**** 

First, I should mention that some time ago I migrated the original data (imported from .csv files to Access) to SQL Azure. When I started adding measures and dimensions to the data, I realized I'd much rather do it in SQL than VBA. Also, it gave me a chance to try out SQL Azure as a consumer. (And some day I'll post my critique of the migration process from Access to Azure.)

The question required some additional fields in the dataset. (DrUsual's Fourth Rule of Business Intelligence: the platform must evolve. This can apply to the data itself, the hosting environment, visualization capabilities, and much more.) For each game, I added fields to flag whether either team reached 20 points, whether the team that reached 20 first went on to win the game, how many minutes remained when 20 points were reached, and what the deficit was for the trailing team, when the other team hit the 20-point mark.
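Here's a minimal sketch of how those four fields could be computed for a single game. I did the real work in SQL; this Python version, the `race_to_twenty` name, and the event layout are assumptions for illustration, and the sample events are made up.

```python
def race_to_twenty(scores, target=20, game_minutes=60):
    """Compute the race-to-20 fields for one game.

    scores: chronological (minutes_elapsed, team, points) scoring events.
    Assumes regulation time only; overtime would need extra handling.
    """
    totals = {}
    first_team = minutes_remaining = deficit = None
    for minutes_elapsed, team, points in scores:
        totals[team] = totals.get(team, 0) + points
        if first_team is None and totals[team] >= target:
            # Best score among the other teams at this moment (0 if none).
            trailing = max(
                (v for t, v in totals.items() if t != team), default=0
            )
            first_team = team
            minutes_remaining = game_minutes - minutes_elapsed
            deficit = totals[team] - trailing
    if first_team is None:
        return {"reached_20": False}
    best_other = max(
        (v for t, v in totals.items() if t != first_team), default=0
    )
    return {
        "reached_20": True,
        "first_team": first_team,
        "minutes_remaining": minutes_remaining,
        "deficit": deficit,
        "first_team_won": totals[first_team] > best_other,
    }


# Made-up game: HOME reaches 20 at the 28-minute mark, leading 20-3.
EVENTS = [
    (5, "HOME", 7), (12, "HOME", 7), (20, "AWAY", 3),
    (28, "HOME", 6), (50, "AWAY", 7), (58, "AWAY", 7),
]
RESULT = race_to_twenty(EVENTS)
print(RESULT)
```

A tie at the final whistle counts as "did not win" in this sketch, which is a choice worth writing down in the documentation.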

Time to pivot.

First conclusion: in a game where either team scores at least 20 points, the first team to reach 20 points wins the game 86% of the time. Bravo, Rob. However, we had an inkling this was true already, since the earlier analysis indicated that any lead is relatively hard to overcome. So, let's dive a bit deeper. What other conditions cause variance in the chance to win, despite having reached 20 points first?

The most impactful key driver (that I've found) is the trailing team's deficit.

  • If the trailing team is behind by 3 or less, the leading team has a 67% chance of winning.
  • If the trailing team is down by 7 to 10 points, that chance of winning leaps to 81%.
  • If the trailing team needs two scores or more to catch up (11-14 points), victory is 90% likely.
  • Reach 20 points with at least a 15 point lead, and it's pretty much game over -- you win 95% of the time.

I was also interested in the impact of game time remaining. This should be significant, right? The more time left, the better chance of a comeback. The outcome here is interesting, though. Check out the results:

  • Leading team hits 20 points with 46 to 55 minutes remaining: 90% victory.
  • Leading team hits 20 points with 31 to 45 minutes remaining: 87% victory.
  • Leading team hits 20 points with 16 to 30 minutes remaining: 84% victory.
  • Leading team hits 20 points with 15 minutes or less remaining: 88% victory.

In addition to the tight range, there's an interesting anomaly: your best chance of winning comes when you reach 20 points with the MOST time available for the other team to come back. Two conclusions here. First, time remaining isn't as impactful as the actual deficit. Second, a team that makes it to 20 points in the first quarter of the game is probably truly outclassing the opponent. The leading team has won 52 of 58 games where this has happened. 
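Both breakdowns above boil down to the same move: bucket the games, then compute a win rate per bucket. A rough sketch for the time-remaining version follows. The bucket edges mirror the list above, but the sample games are made up and whole minutes are assumed; only the 52-of-58 figure comes from my data.

```python
from bisect import bisect_left
from collections import defaultdict

# Upper bounds (in whole minutes remaining) for the first three buckets.
EDGES = [15, 30, 45]
LABELS = ["15 or less", "16-30", "31-45", "46+"]


def time_bucket(minutes_remaining):
    """Map whole minutes remaining to a bucket label."""
    return LABELS[bisect_left(EDGES, minutes_remaining)]


def win_rate_by_bucket(games):
    """games: (minutes_remaining, leader_won) pairs, one per game."""
    tally = defaultdict(lambda: [0, 0])  # label -> [wins, games]
    for minutes_remaining, leader_won in games:
        cell = tally[time_bucket(minutes_remaining)]
        cell[0] += int(leader_won)
        cell[1] += 1
    return {label: wins / total for label, (wins, total) in tally.items()}


# Made-up sample, just to exercise the function.
SAMPLE = [(50, True), (50, True), (20, False), (20, True), (10, True)]
RATES = win_rate_by_bucket(SAMPLE)
print(RATES)

# Checking the first-quarter figure quoted above: 52 wins in 58 games.
FIRST_QUARTER_RATE = 52 / 58
print(f"{FIRST_QUARTER_RATE:.0%}")  # 90%
```

Swapping in a different set of edges and labels gives the deficit version of the same pivot.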

RaceToTwenty.png

The chart above shows a pivot of both time remaining and the deficit when one team reaches 20 points. I put this one together simply for an "eyeball check" -- scanning to see if anything interesting jumps out. One clear lesson: if you're going to let the other team jump to an early lead, make it a big lead. There are six instances of a team losing after reaching 20 points in the first 15 minutes. In five of those cases, they had at least a 15 point lead when reaching 20 points. We can probably label that the "Wake Up Call Phenomenon." 

(Incidentally, it was Buffalo that choked in two of these five games, giving up 21 point leads each time. In 2011 they ended up losing 49 to 21, via another phenomenon we'll simply call, "Tom Brady Was Here.")

And in case you're wondering about the single instance of a team overcoming a deficit of more than 15 points when the opposition reached the 20-point mark with less than 15 minutes to go -- that was the St. Louis Rams, week 17 of 2002 versus San Francisco. The 49ers made it 20-3 with 14 minutes remaining, and the Rams managed four touchdowns in that last quarter, including a defensive TD on a fumble recovery.

Ah, sports. You make numbers so much fun.


* The leading theory is amazingly common -- that there's an even greater advantage to scoring last in the first half, then scoring on the opening drive of the second half. I'll explore that in another blog.

** I'm surprised it's not a standard skill on LinkedIn. Then again, a lot of really good skills are missing from LinkedIn's list, including "Not wasting time," "Taking accountability," and "Acting like a grownup."  

*** I know, in the previous blog I gave you my first rule, and in this one I'm giving you my fifth and fourth. It's a narrative, not a recipe.

**** After all, knowledge for knowledge's sake is worthwhile, but conversations are more fun.

The First Rule of Business Intelligence

A bit of an advance apology; you're going to have to read on a bit to get to the actual rule. This is a blog, not a textbook, so you don't get to read the title, skim the subheadings, and call it a day. Also, I'll get back to societal problems and the way we treat people with special needs in a few days. For now I want to post some topics that I've had in the works for a while* but haven't gotten to yet.

Super Bowls 51 and 52 were both great games, for totally different reasons.** For those who don't remember (or don't follow football***) during Super Bowl 51 the Patriots were down by 28 to 3 with only eight and a half minutes left in the third quarter. They came back to tie the game, then win in overtime.

That made me wonder: for any given deficit in a football game, how much time is needed for a team to have a reasonable shot at a comeback victory? To the Business Intelligence cave!****

Football Chart.png

First stop: ArmchairAnalysis.com. I obtained an awesome data set, play-by-play data for every NFL game (including playoff games) from the 2000 season through Super Bowl LI. And I do mean play-by-play. What happened in the play on both offense and defense, plus environmental data, and more. 

Using Access and Excel, I added some measures and slicers to the main data, then set out to plot the relationship between deficits, comeback wins, and time. The primary question: given a deficit of X score, how much time is needed for the team that's behind to have at least a 50% chance of winning?
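To make that primary question concrete, here's one way it could be framed in code. This is a sketch under my own assumptions, not the Access and Excel work itself: the snapshot layout, the five-minute buckets, and the `minutes_needed` helper are all invented for illustration, and the observations are made up.

```python
from collections import defaultdict


def minutes_needed(observations, deficit, threshold=0.5, bucket_size=5):
    """Find the smallest time bucket where comebacks hit the threshold.

    observations: (deficit, minutes_remaining, trailing_team_won) snapshots.
    Returns the bucket's lower bound in minutes, or None if the trailing
    team never reaches the threshold for this deficit.
    """
    tally = defaultdict(lambda: [0, 0])  # bucket start -> [wins, total]
    for d, minutes, won in observations:
        if d != deficit:
            continue
        start = int(minutes // bucket_size) * bucket_size
        tally[start][0] += int(won)
        tally[start][1] += 1
    for start in sorted(tally):
        wins, total = tally[start]
        if wins / total >= threshold:
            return start
    return None


# Made-up snapshots: (deficit, minutes_remaining, trailing_team_won).
OBSERVATIONS = [
    (3, 4, False), (3, 4, False),
    (3, 12, True), (3, 13, False),
    (3, 27, True), (10, 27, False),
]
print(minutes_needed(OBSERVATIONS, deficit=3))   # 10
print(minutes_needed(OBSERVATIONS, deficit=10))  # None
```

A `None` result is informative in its own right: it means the trailing team never reached a coin-flip chance at that deficit, no matter how much time was left.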

I pictured a result showing something like, "A team behind by 3 only needs 5 minutes in the game, a team behind by 10 generally needs 12 minutes, etc." However, the results quickly exemplified one of the first rules of business intelligence: your first business question is probably not the right business question.

Why the focus change? According to the pivot, 65% of teams that scored first went on to win the game. And even more significantly, if that first score is a touchdown, the win percentage jumps to a whopping 70%.

That's seriously heavy information. Of course, the first thing I did was re-check the data, make sure I hadn't made any mistakes with calculated fields. Nope. Sure looks like a team that starts with the lead has a serious advantage.
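For anyone who wants to run the same kind of check on their own data, here's a rough sketch of the calculation. The event layout and function names are hypothetical, the three sample games are made up, and treating any first score of six or more points as a touchdown is my simplifying assumption.

```python
def first_score_outcome(scores):
    """scores: chronological (team, points) events for one game.

    Returns (first_score_was_td, first_scorer_won), or None when the
    game has no scoring events.
    """
    if not scores:
        return None
    totals = {}
    for team, points in scores:
        totals[team] = totals.get(team, 0) + points
    first_team, first_points = scores[0]
    best_other = max(
        (v for t, v in totals.items() if t != first_team), default=0
    )
    # Assumption: a first score worth 6+ points is a touchdown.
    return first_points >= 6, totals[first_team] > best_other


def rate(flags):
    """Share of True values, or None for an empty list."""
    return sum(flags) / len(flags) if flags else None


def first_score_win_rates(games):
    """Return (win rate after any first score, win rate after a first TD)."""
    outcomes = [o for o in map(first_score_outcome, games) if o is not None]
    overall = rate([won for _, won in outcomes])
    after_td = rate([won for was_td, won in outcomes if was_td])
    return overall, after_td


# Made-up games, each a list of (team, points) scoring events.
SAMPLE_GAMES = [
    [("A", 7), ("B", 3)],
    [("A", 3), ("B", 7)],
    [("B", 7), ("A", 3)],
]
OVERALL, AFTER_TD = first_score_win_rates(SAMPLE_GAMES)
print(OVERALL, AFTER_TD)
```

Re-deriving a surprising number through a second, independent path like this is one cheap way to rule out a calculated-field mistake.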

That fact changes the priority of my business question. I'm no longer concerned with the time needed to overcome a particular deficit -- I'm more concerned about the importance of the first score, and of making that score a touchdown. 

The unexpected information also raises a new business question: if scoring first is so significant, why does the team that wins the coin toss almost always elect to kick first? Is it because the entire NFL grossly misunderstands its own statistics? (Don't be so quick to scoff at that possibility -- go read Moneyball.)

Or is it because the situation is too complex to be explained by a single statistic? That's more likely the answer, and the subject of my next BI-related blog.

 


* Sure, you might consider a year a bit more than "a while," but...

** Despite the Minnesota Vikings not being in Super Bowl 52, as they should have been. Sorry, Tim VandeSteeg, it was a good try!

*** Yes, I said, "football." You know, a game characterized by an oblong goal, very large men, and not being rugby. The round-ball game played with no hands is called "soccer," Europe.

**** Okay, BI isn't really that exciting. We work hard, but we rarely literally spring into action, and we don't have theme music. 

Intelligent Planning for Showers and Data Systems

On a recent road trip my hotel room shower provided an excellent physical example of the poor planning behind many data and tool infrastructures.  Don't get me wrong; it was a very nice bathroom.  I prefer to stay at Hilton hotels, and this Hampton Inn was quite nice.  The bathroom was clean* and looked like a lot of care went into the visual appeal.  Each individual component did what it was supposed to do.  At first glance, the system looked solid.

The shower stall was spacious -- probably six feet deep, very comfortable.**  Here's the problem: the door was at the opposite end from the shower head and taps, and the shower head was fixed in place.  There was no way to turn on the water without being blasted immediately.***

Worst shower design ever.

Of course, the system owners can't easily remedy the problem.  Relocating the taps is an expensive, laborious process.  Changing the shower head is the most likely option, except that the supply pipe from the wall had no threads.  And this is a hotel; the change has to be replicated in about 400 locations.

So, the infrastructure and a major delivery system were either not planned together, or not planned with the end user in mind.  They do the job, but the user experience is poor.  Sound familiar?

Here's the cardinal rule of system planning: you must start with the desired outcome.  That's the bare minimum.  Really, you need to start with the desired outcome AND some conception of how the system will be extended later, but since many organizations can't even force themselves to start with the desired outcome, I'm going to pass right by the concept of extensibility for now...

I can hear many of my ex-colleagues at Microsoft thinking, "We're great now at starting with the desired outcome," to which I say "Bull dookey."****  After 16+ years of seeing systems developed at Microsoft, I'd say these are the most common starting points, in order of frequency:

  1. We need a data-origination tool!  Engineers need to track labor, we need to gather customer feedback, we need to track expenses.  Data input, build a tool!
  2. It's consolidation time.  We recognize (for the thousandth time) that we have dozens of non-communicative data systems, based on all these individual tools, and we're going to build (wait for it) a consolidated warehouse!  Oh, and our organization has convinced our new leader to fund a new consolidated warehouse, so we're going to ignore the eighteen other consolidated warehouses...
  3. The new boss wants a better scorecard, and wants it now.  We'll just build a cube for that, create some measures, and hell, if there's enough funding, how about a tool to go with it, with some "scrubbed" data?  Wait, the measures don't match the calculations from other tools?  Bummer.
  4. We are looking at the desired business outcome and planning the system accordingly.  However, time is of the essence.  Let's start with (wait for it) a new consolidated warehouse!  But this time is different.  This time we're going to plan on doing a serious data overhaul some day.  No, seriously, we will!
  5. (We're going to start with the outcomes we need to support, define the insights we'll need to provide that support, define the data structure needed to eventually deliver those insights, and build the tools on that new data structure.  We may even take this as an opportunity to totally redefine some of our key business concepts.)

I put #5 in parentheses because in over 16 years I've rarely seen it happen, despite being the correct approach.

The concept here applies to any business or system, whether it's a single-person sole proprietorship or a corporation with 100,000 employees: when planning your data and tool infrastructure, start with a full set of desired outcomes -- tactical and strategic, internal and customer-facing, immediate need and future-proofing.

As an example, my game company is working on the design of our first online app.  We've considered a few features that didn't make the cut for the first version: league play, private instances, and end-user administration, to name a few.  However, the data infrastructure includes placeholders for all these concepts.  If we choose to implement any of them later, we won't have to patch or overhaul the base system to do so.  Thanks for the lesson, Microsoft.
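To make "placeholders" concrete, here's a minimal sketch using Python's built-in sqlite3 module.  Every table and column name below is hypothetical and chosen only for illustration; this is not our actual design.  The idea is simply that version 1 ignores the extra columns, and they sit there with harmless defaults until a later version needs them.

```python
import sqlite3


def create_schema(conn):
    """Create a toy schema with placeholders for features that are not built yet.

    All table and column names are hypothetical, for illustration only.
    """
    conn.executescript(
        """
        CREATE TABLE league (
            league_id INTEGER PRIMARY KEY,
            name      TEXT NOT NULL
        );
        CREATE TABLE game_instance (
            instance_id   INTEGER PRIMARY KEY,
            is_private    INTEGER NOT NULL DEFAULT 0,           -- placeholder: private instances
            league_id     INTEGER REFERENCES league(league_id), -- placeholder: league play
            admin_user_id INTEGER                               -- placeholder: end-user administration
        );
        """
    )


conn = sqlite3.connect(":memory:")
create_schema(conn)

# Version 1 writes only the column it uses; the placeholders keep their defaults.
conn.execute("INSERT INTO game_instance (instance_id) VALUES (1)")
row = conn.execute(
    "SELECT is_private, league_id, admin_user_id FROM game_instance"
).fetchone()
```

Adding league play later then means filling in league_id, not rebuilding the table everything else depends on.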

Oh, and to the Hampton Inn: I know you can't do anything about that shower stall tap location, but perhaps you could put a towel hook somewhere near the door?


* Cleanliness is my #1 criterion for a hotel, particularly the bathroom and the bed.  I can put up with a lot in the way of noise, price, or lack of customer service, but if the bed looks unclean...I'll sleep in the Canyonero.

** Especially if you've been driving all day and haven't gotten much exercise.  The Fitbit Surge is water resistant, so you can get some steps in while showering.

*** I say there was "no way," but that's not quite true.  I did think outside the box, but the best immediate solution was to stand on the toilet, reach over the shower wall, and activate the taps.  I tested this method and found that at 5'9" I'm about four feet too short to make this work.

**** I don't really say "bull dookey," but I try to keep my blogs family-friendly, so...

Don't Duplicate Your Databases. Or Oscar Envelopes.

Not the Academy of Motion Picture Arts and Sciences.  If you watched the Oscars last night (or if you have a pulse and either an Internet connection or TV this morning) you've probably already seen the finale -- Warren Beatty announced that La La Land had won* Best Picture, the producers gave their acceptance speech, then the audience was told that a mistake had been made -- the Best Picture actually went to Moonlight.

Apparently PricewaterhouseCoopers has a person waiting on each side of the stage, and each of these people holds an identical set of envelopes for the announcements.  (Their names are all over the Internet, if you're that interested in personally vilifying them.)  When a presenter approaches, the person on that side hands the presenter an envelope.  See where this is going?

No Oscar for you!


After the Best Actress award was presented, one of the PwC folks was left with the unused envelope and accidentally handed it to Warren Beatty as he and Faye Dunaway approached to announce the Best Picture.  Beatty was confused by this, clearly.  I think he was trying to ask Ms. Dunaway about it; she saw La La Land next to Emma Stone's name and announced the film as Best Picture. 

Anyone with PMP certification could tell you this is a process just waiting for an error.  In fact, any perceptive 14-year-old could tell you that.  Or a BI person -- this is why we get so pissy with people who want their own replication of a database "so I can do my own reporting."
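If you've never been bitten by a private copy, here's the failure in miniature.  This is a minimal sketch in Python with entirely hypothetical data (a tiny order table held in a dictionary); it shows what happens when the source keeps changing after someone takes their own copy.

```python
import copy

# Hypothetical "source of truth": order_id -> status. Invented data.
source = {101: "shipped", 102: "pending"}

# A colleague takes a private copy "so I can do my own reporting."
replica = copy.deepcopy(source)

# The source keeps changing after the copy was taken.
source[102] = "cancelled"


def count_with_status(table, status):
    """Count the rows in a table that currently have the given status."""
    return sum(1 for value in table.values() if value == status)


# The same question now gets two different answers.
pending_per_source = count_with_status(source, "pending")
pending_per_replica = count_with_status(replica, "pending")
```

Two sets of data, two answers to the same question, and nobody at the microphone knows which one is current.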

"We have to have two envelopes; we don't know from which side the presenters will approach."  How can you NOT know this?  It's one of the most choreographed events outside of the Super Bowl Halftime Show.  The seat numbers aren't assigned in that auditorium?  Even Cinemark assigns seats now when you want to see Resident Evil: This Series Just Won't Die.

"It's never been a problem before."  So your success rate has been excellent in the past.  Good for the past!  Weigh the acceptable failure rate -- clearly, failing once for the Best Picture presentation isn't considered acceptable.  Likewise, an airline can make 4,999 safe landings, and nobody says about the one plane crash, "But it was only 0.02%!"**
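For anyone checking my math: 4,999 safe landings plus one crash is 5,000 flights.  A minimal sketch of the arithmetic in Python (the function name is mine, purely for illustration):

```python
def failure_rate_pct(failures, attempts):
    """Failure rate as a percentage of all attempts."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return 100.0 * failures / attempts


# 4,999 safe landings plus one crash is 5,000 flights in total.
crash_rate = failure_rate_pct(failures=1, attempts=4999 + 1)
```

That works out to 0.02%, which is a tiny number right up until the one failure is the one everybody is watching.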

"We suppose you have a better solution?"  Well, sure.  To start with, only print one set of data.  I mean, envelopes.  That'll reduce the chances of your end users giving conflicting information.  Second, plan ahead.  I know, strategic planning is so 1980s -- it's a lot easier to just hope for the best.  Third, if you really can't predict the direction the presenter will approach from, just have one envelope, and have Vanna White deliver it to the presenter after he/she has reached the microphone.  If anyone can make a superfluous activity seem vital and elegant, it's Vanna.

At the very least, invest in some operations management.


* Yes, I originally had "one" instead of "won" here.  Dragon NaturallySpeaking is great, but everyone wonce in a while it chooses the wrong homonym and I miss it.  Thanks Eileen!

** It's probably not as extreme a comparison as you think.  People today seem to take it as personal betrayal that George R. R. Martin is so far behind on the next Game of Thrones novel, or that Burger King changed their menu.  I'm waiting for the protest marches to start.  Shouldn't be long; the conspiracy theorists are already hard at work.