Moneyball or moneypit? When big data infatuation meets unintended consequences

By Kurt Marko, October 9, 2017
Are the troves of data now available leading businesses to trade long-term, sustainable profitability for short-term gain? Examples from baseball, airlines and hotels suggest 'yes.'

Are we looking at Moneyball or moneypit as more and more data is sucked into systems designed to help decision makers of all stripes, both consumer and business?

Arguing against the benefits and proliferation of technology is a fruitless endeavor, often undertaken by neo-Luddites who see a dystopian outcome in every advance. Nevertheless, only the naive refuse to face technology's unintended consequences. Many of the naysayers' arguments are highly debatable, such as the contention that smartphones hijack our minds (WSJ paywall), are deleterious to our intellect and stifle neurological development. Other effects are easier to measure, such as the changes brought by the proliferation of data and its increasingly sophisticated analysis across all manner of activities.

In this season of playoff baseball, one of the more glaring examples was described in a recent article detailing the Downside of Baseball's Data Revolution: Long Games, Less Action (WSJ paywall). In that article, the WSJ reported that Major League Baseball chiefs were worried that the obsession with data was bringing the game to a grinding halt, literally. More on that later.

While detrimental, at least to those watching, changes to a sport are a trivial example in the scheme of things (although perhaps not if you're an advertiser, broadcaster or team owner). Nevertheless, the baseball case is instructive for organizations using (or expecting to use) data to shape decisions, because the process of data collection and analysis can itself create unanticipated problems that negate the advantages of data-backed insights.

Baseball - failing at Moneyball?

Baseball is the sport with the longest history of data collection. Data's impact is profound, to the point where a player's legacy is often defined by numbers.

Whether it's 61 (home runs), .400 (batting average) or 20 (strikeouts in a game), fans and managers have long been preoccupied with all manner of statistics. ERAs, RBIs and batting averages have been joined by a slew of new measures that have evolved into a specialized field of baseball statistics called sabermetrics. Many of the new metrics, like home run launch angle and exit velocity, quality of pitch, spin rate and route efficiency, rely on a new generation of data collection technology called Statcast. It uses cameras and other sensors deployed by Major League Baseball in every ballpark to accurately measure the motion of players and the ball. In that sense, Statcast data resembles the vast array of new records businesses are collecting through IoT and mobile devices.

Most people are familiar with the management phenomenon named after Michael Lewis' perceptive book, Moneyball, in which Lewis describes the clever, pioneering use of pre-Statcast sabermetrics by the Oakland A's in the early 2000s. The team was losing baseball's talent race due to tightfisted ownership and a limited payroll. The A's found an equalizer in strict adherence to data analysis, using it to select players and develop hitting and pitching strategies that violated baseball's conventional wisdom. As the book details, these techniques worked and have subsequently been adopted by many other teams, largely eliminating the A's strategic advantage.

Fast forward to today, and the advent of vastly more detailed and accurate data sources, along with the ability to digest them in creative new ways, continues to shape the game in dramatic and unexpected ways. The WSJ article highlights this in describing a presentation at last year's owners' meeting:

Two Major League Baseball officials and a statistician told the group that the sport was being brought to a standstill by the very phenomenon that has revolutionized it in recent years—the embrace of data analytics to drive strategy.

Baseball has never been more beset by inaction. Games this season saw an average gap of 3 minutes, 48 seconds between balls in play, an all-time high. There were more pitcher substitutions than ever, the most time between pitches on record and longer games than ever.

For example, in the last 30 years, the average length of a game has increased 11.4% (19 minutes), the number of pitchers per game has gone up 45% (from 5.8 to 8.4) and both the number of strikeouts and home runs reached record levels this year as hitters swing for the fences and fresh pitchers keep them off balance.
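As a quick sanity check on those figures, the arithmetic can be sketched in a few lines of Python. The pitcher counts, the 11.4% figure and the 19 minutes come from the numbers quoted above; the baseline and current game lengths are not stated in the article and are only implied by those two figures, so treat them as derived estimates. The function name `percent_change` is illustrative.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100.0


# Pitchers per game, as quoted above: 5.8 then, 8.4 now.
pitchers_change = percent_change(5.8, 8.4)
print(f"Pitchers per game: +{pitchers_change:.1f}%")

# Game length: the article gives the increase both ways (11.4% and 19 minutes).
# The baseline is NOT stated in the article; it is implied by those two figures.
increase_minutes = 19.0
increase_pct = 11.4
implied_old_minutes = increase_minutes / (increase_pct / 100.0)
implied_new_minutes = implied_old_minutes + increase_minutes
print(f"Implied game length: {implied_old_minutes:.0f} -> {implied_new_minutes:.0f} minutes")
```

The pitcher figures work out to roughly the 45% quoted, and the implied game length rises from a little under 2 hours 50 minutes to a little over 3 hours.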

Any executive seeing this kind of significant change in their organization's KPIs takes note and digs into the broader business implications. Billy Beane, the brain behind the A's Moneyball strategy, is quoted in the article as seeing the changes as a natural evolution:

I wouldn’t call that bad. I would call that progress.

However, fans, particularly younger ones weaned on fast-paced video games and social network distractions, are put off. MLB viewership is plummeting. Thus, data and analysis that optimize game strategy and improve the odds of winning can end up harming the overall business. In this sense, the results baseball sees from its data revolution aren't unlike phenomena seen in other industries in which data-driven efficiencies lead to a degraded customer experience.

The sad fact is that this is a known problem. A MarketWatch article from February 2017 said:

Changes are being discussed within MLB to increase viewership and lifelong fandom, including rule changes, partnerships and cutting-edge streaming options, said sports media consultant Lee Berke. “While the flat screen is still the majority of your audience, there is a small but growing number of viewers consuming content on their phones, tablets and laptops, and it’s imperative for baseball — and any sport, for that matter — to make sure their content is on those screens.”

Is anyone among the MLB execs learning?

Lessons from other industries

Travel and leisure are industries where many people have felt the experience-sapping 'success' of applied big data analytics. Traffic data and predictive models have allowed airlines to perfect the art of route optimization and maximal seat utilization. Over the past 40 years, the fleet-wide load factor (a measure of seat utilization) for U.S. airlines has increased by more than 43 points, with planes now averaging about 83% full. Airlines have mastered the task of precisely matching aircraft capacity with seat demand while aggressively eliminating or reducing service on routes that don't consistently run near capacity.

The result is that airlines have rarely been more profitable while airline passengers have never been more uncomfortable. Customer satisfaction ranks down in the cellar alongside health insurers, among the worst of any industry, as seats get packed closer together and crowded flights eliminate empty middle seats.

The ability to measure market demand, competitive supply and temporal or externally driven variability with increasing granularity has led many businesses to adopt sophisticated price and revenue optimization (PRO) strategies that often leave customers unsure whether they got a good deal or were ripped off.

Hotel rates are a constantly moving target that varies by season, day of week or even the time one books a reservation. Do customers benefit? Yes and no. Kayak and will dutifully tell you how prices are shifting on flight routes and at hotels in chosen cities, but that's about as far as it goes. The user is left second-guessing the optimal time to book. That's a flat-out losing proposition in corporate T&E management today, but who knows where predictive analytics will lead us next?

Amazon is the master of data-driven pricing, continually monitoring demand and the competition while tweaking prices several times a day if necessary, a practice that goes into overdrive around the holidays. Unfortunately, its shifting prices can leave customers thinking they got a better deal than they actually did, a practice that cost the company C$1 million when Canada's Competition Bureau found that Amazon was inflating reported customer discounts. Oops!

While dynamic price and supply optimizations undoubtedly improve a company's profitability in the short term, when they degrade the customer experience, increase confusion and lead to distrust, such data-driven efficiencies are destructive in the long run, opening the way for disruptive new competition like JetBlue, Airbnb or (now part of Walmart).

Other potential pitfalls of data infatuation

Companies risk paralysis by analysis, or so-called data daze, a phenomenon that occurs when users let a flood of new data sources overwhelm the decision-making process. One survey found that 97% of organizations are struggling to use the data they already have. If executives delay decisions due to a preoccupation with having the latest data or running yet another analytic study, the competitive time value of data is lost.

Another problem occurs when the data doesn't conform to pre-existing theories or conventional wisdom, as exemplified by the Moneyball scenario. A mismatch between analytic models and gut feelings can create data distrust in which executives seek reasons to doubt the validity of the data and analytic models. While having a healthy skepticism about data quality is necessary, throwing irrational roadblocks in front of a fact-based analysis creates another form of organizational paralysis.

My take

The collection and analysis of vast new troves of business data often provide valuable insights. However, executives shouldn't limit the data's use to that of an efficiency tool that merely improves existing operations. Therein lies the path to turning Moneyball into a moneypit. As the baseball, airline and other examples illustrate, there's a significant risk of optimizing processes at the expense of long-term profitability. It's analogous to the operations research problem of getting trapped at a local maximum (short-term efficiency) and missing the global maximum (sustainable profitability).
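The local-maximum analogy can be made concrete with a toy hill-climbing sketch. Everything here is hypothetical and chosen only for illustration: the `profit` curve, the position of its two peaks, and the function names. It models no real business. The point is that a greedy optimizer making small incremental improvements stops at whichever peak is nearest to where it started.

```python
import math


def profit(x: float) -> float:
    """Toy 'profit' curve (illustrative only): a small peak near x=2
    and a taller peak near x=8."""
    short_term = 3.0 * math.exp(-((x - 2.0) ** 2))
    long_term = 5.0 * math.exp(-((x - 8.0) ** 2) / 2.0)
    return short_term + long_term


def hill_climb(f, x: float, step: float = 0.1, max_iters: int = 1000) -> float:
    """Greedy local search: move to the better neighbor until none improves."""
    for _ in range(max_iters):
        best = max((x - step, x, x + step), key=f)
        if best == x:  # no neighbor is better, so we are at a (local) peak
            return x
        x = best
    return x


# Starting near current operations, incremental tuning stalls on the small peak.
local_peak = hill_climb(profit, 1.0)
# Starting from a different position reaches the taller peak.
global_peak = hill_climb(profit, 6.0)

print(f"start=1.0 -> x={local_peak:.1f}, profit={profit(local_peak):.2f}")
print(f"start=6.0 -> x={global_peak:.1f}, profit={profit(global_peak):.2f}")
```

In this sketch no amount of further step-by-step tuning from the first starting point reaches the taller peak. Getting there requires starting the search somewhere else, which is the role the next paragraph assigns to strategic rather than operational use of data.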

Instead, management should spend more time using data to inspire new products and services that satisfy unmet and previously unknown needs, improve the overall customer experience and increase loyalty and repeat business. Executives should leave the low-level optimizations to line-of-business managers while using the same data to feed other models aimed at strategic understanding and decisions.