Tell me how good it will feel when I implement your solution

By Neil Raden, June 17, 2019
In pursuit of happiness with the promise of harmonized data.


Speaking with a client recently about their analytics initiatives, I mentioned an anecdote from long ago when my company built a data warehouse and associated BI applications for the finance department. When we demonstrated our final product to the CFO, his only comment was, “Where is the export to Excel button?” I thought he was making a joke. He did not see the humor in this at all. On the contrary, he said, “Export to Excel has always been a clear demonstration of the deficiencies in the whole BI (Business Intelligence) proposition.”

What are those deficiencies? Excel and traditional BI tools are terrible solutions for enterprise analytics, but their use is still pervasive because the whole analytical process in most organizations is disconnected today, and each recipient of data resorts to the tools they are most comfortable with to find insights. For someone to arrive at an analytical insight, there are multiple steps and technologies involved: managing data, adding context and meaning to it, developing iterations of queries and models, and finally, presenting or displaying findings in sharable reports, visualizations, and active “storyboards.” Punctuating the analytics process with one more step, one more tool, one more expert to rely on, doesn’t seem like such a sacrifice. Unless you don’t have to.

Software marketing is a confusing mess. Consumer product marketing hits right at the benefit or relief or prestige of buying the product. Software marketing buries you with features and technology. It’s backwards. Of course, vendors do address the benefit issue, but they just never lead with it. In my opinion, the most pressing need today for getting value from digital transformation (or whatever you want to call it) is crushing the data problem: making sense of distributed sources, multiple protocols, embedded semantics and even timing issues (when data from multiple sources arrives at different times with different values). But if you look at the content of vendors that attack different aspects of the problem, there is no clear line between their technology and the easing of that suffering.

From 80/20 to 20/80

There is a belief in the big data industry that analysts and data scientists spend 80 percent of their time preparing their data and only 20 percent of their time doing actual analysis. There is really no way to prove the accuracy of those numbers, but the fact that they have remained unchallenged for about a decade lends credence to the idea that data “prep” time is substantial. And really, that is all it is saying.

The root of the problem is that there is just too much data, arriving too fast, for old manual spreadsheet or data wrangling approaches or legacy data warehouses to cleanse and harmonize it. Data harmonization is an intelligent, machine-learning approach that preps and blends data from diverse sources without the complexity of traditional data modeling or reliance on IT experts. Even proven ETL tools are not adequate, because they are still based on conforming data into an existing model instead of allowing analysts to work flexibly with a dynamic, on-demand model.

But imagine this for a moment: the ability to “cycle” through analysis from start to finish without hand-offs, wait-states, errors, and constant requests to IT people. Instead, with AI and machine learning, continuous insights from your most complex data feed right into visual insights with business context. The machine discovers every detail and relationship across vast amounts of data and dimensions. The business can click on any information, discover new insights unconstrained, and collaborate in context to end team debates on whether or not the data is driving the right business decision.

Now if I were looking for a solution and the vendor started with this scenario instead of microservices, serverless, containers, Kubernetes and vague “AI washing” (claims of AI in the product that aren’t there), I’d buy it.

A CPG company exemplar

An interesting point about CPG companies (and manufacturing companies) is that their customers are not the ultimate customers who purchase (or choose not to purchase) their products. Some companies, such as cosmetics makers or tool manufacturers, have such a close relationship with their customers (department stores, for example) that the sharing of information about “sell through,” what was actually sold at retail, is quite common. But most CPG and manufacturing concerns are pretty much on their own to understand their ultimate customers and to figure out how to influence them while being many hops removed from the consumer.

The four analytics most often pursued by such companies are product affinity (also known as market basket analysis), response to promotions, customer micro-segmentation, and inventory right-sizing to adequately meet in-store and online demand. To address these analytics, companies need to gather their marketing data, sales data, demographic data, online click-stream data, sales and supply chain anomalies and competitive information. To this they add Nielsen or IRI data and external social media and brand response data.
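To make product affinity a little more concrete, here is a minimal Python sketch of the arithmetic behind market basket analysis. It is an illustration only, not any vendor's method: the function name pair_lift and the sample baskets are made up for this example, and "lift" is just one common affinity measure. A lift above 1 suggests two products are bought together more often than chance would predict; below 1, less often.

```python
from collections import Counter
from itertools import combinations


def pair_lift(baskets):
    """Return {(a, b): lift} for every item pair that co-occurs at least once.

    lift(a, b) = P(a and b) / (P(a) * P(b)), with probabilities estimated
    as the share of baskets containing the item(s).
    """
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # de-duplicate, fix pair ordering
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    lifts = {}
    for (a, b), together in pair_counts.items():
        support_ab = together / n
        lifts[(a, b)] = support_ab / ((item_counts[a] / n) * (item_counts[b] / n))
    return lifts


# Hypothetical receipts, invented purely for illustration.
baskets = [
    ["chips", "salsa", "soda"],
    ["chips", "salsa"],
    ["soda", "bread"],
    ["chips", "soda"],
]
lifts = pair_lift(baskets)
for pair, value in sorted(lifts.items(), key=lambda kv: -kv[1]):
    print(pair, round(value, 2))
```

The calculation itself is trivial. The hard part, and the point of this article, is getting clean, comparable baskets out of the many sources listed above in the first place.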

This assemblage of data sources across supply chain, marketing, sales, distribution, and retailers, harmonized by AI-driven analytics, enables continuous insights without the need for long IT cycles of prepping data, modeling it, and conforming it into dashboards. All of that produces nothing but a rearview mirror report, which in today’s competitive environment is not going to let you rapidly capture more sales and more market share. Instead, a better solution, powered by AI and machine learning, continuously reveals insights to business stakeholders, so they can capitalize daily on opportunities at every location the products are sold.

My take

Harmonized data is the holy grail of analytics. Everything else is just waiting.