Big data projects - what have we really learned?

Jon Reed, December 19, 2013
Companies are still in the early stages of getting value out of big data, but enough projects are completed to dig into lessons learned.

Some of us are sick of big data. Others would say the best lies ahead. Gartner, which makes a formal study of disillusionment, says that we are still early in the big data hype cycle and the big data trough of disillusionment is still ahead. Oh boy.

The reality is perhaps more interesting. Yes, companies are still in the early stages of getting value out of big data, but enough projects are completed that formal studies have been done - and lessons learned. Here are my takeaways from research of note.

Highlights from IBM's big data field research

Recently, the IBM Institute for Business Value released findings from a study of 900 business executives from 70 companies (research free with sign up). The IBM Institute for Business Value is a product-agnostic research team within IBM that has been publishing analytics research since 2009. In 2012, the group focused their research more closely on big data field research (they consider big data as a subset of analytics for their research purposes).

The latest research focuses on the 19 percent who identified as 'substantially outperforming their industry and the market'. They identified factors across companies that yielded better results. Here's a graphical overview of some of their findings:


On the value realization front, the study found that almost 40 percent of companies see a quick return on big data investment - typically within the first six months of analytics adoption. But on the flip side, when it came to the three factors needed to drive analytics transformations within organizations (sponsorship, trust and skills), the study found significant gaps in all three areas.

Analyzing the high-performing respondents, the study found nine interrelated 'levers' that seemed to distinguish the companies achieving the best big data results. Here are my definitions of the nine:

  • Culture: meaning the pervasive use and embrace of data within an organization
  • Data governance: formal structure related to data governance and security
  • Expertise: an effective plan for upskilling analytics capabilities within lines of business
  • Funding: a disciplined and structured approach to funding big data projects
  • Measurement: ability to numerically measure the business outcome of big data initiatives
  • Platform: comprehensive hardware and software capabilities supporting multiple projects
  • Source of value: collective understanding of the types of big data projects that produce results
  • Sponsorship: executive endorsement of big data projects and 'analytics culture'
  • Trust: organizations willing to share information across silos to get a better result

Most of these bullet points made sense to me, though I thought 'source of value' was a bit more vague than the rest. When you consider that these characteristics need to be working together to achieve a good result, you can see why big data projects can struggle. Cutting-edge technology doesn't solve most of these problems: culture change, sharing across silos, cultivating analytics skills, and building a data governance model that allows for data sharing while protecting data security - these are monster challenges for most companies.

Given the obstacles to building data-driven culture, it's easy to see how political infighting would quickly dampen such initiatives. During a podcast discussing the research, Rebecca Shockley, Global Research Leader for Analytics at IBV, made the point that another cultural obstacle is openness to data change - meaning that the results from big data might point to business decisions that are counter-intuitive to the current market direction. Even with systems in place, it can take fortitude to make the changes implied by the data.

Shockley brought out another point during the podcast: the crucial issue of big data expertise. I was expecting to hear her say that the exceptional respondents were way ahead of other companies on existing big data skill sets. However, she said that the companies with the best big data results had only 'slightly better' skills within their organizations. Shockley did see a skills advantage, but of a different kind. The companies excelling in big data are a lot better at developing skills internally, and at making sure skilled individuals are available where they are needed. Shockley:

It's really around the development of those skills, the focus they put on making sure that the folks within the organization get the training that they need, and that the organization as a whole has access to the skills and capabilities within the organization.

Research roundup: skills and data variety stand out

During my big data research, two factors stood out: skills and data variety. Lack of appropriate skills was frequently cited as an impediment to greenlighting big data initiatives. And when it comes to the 'three V's' of big data (velocity, volume, and variety), it was the third, variety, driving the most compelling use cases.

During a November 14 webinar on Big Data and Analytics Strategy Essentials, Gartner's Doug Laney reinforced those points with the following strategy guidelines:

  • By 2015, more than 30% of analytics projects will deliver insights based on structured and unstructured data.
  • By 2015, over 50% of Big Data solutions will make use of data streams from instrumented devices, applications, events and/or individuals.
  • By 2016, 30% of businesses will have begun directly or indirectly monetizing their information assets via bartering or selling them outright.
  • Through 2017, premiums for big data related technology and project skills will remain 20-30% above norms for traditional IM skills.

The exact nature of the skills shortfall varies based on who you talk to. Data scientists with the ability to build algorithms for industry-specific use cases get a frequent mention (many BI teams are not able to construct such algorithms, or need guidance moving from distributed to sequential algorithms).

On the techie side, companies might find their internal SQL skills need to be complemented by MapReduce and Java skills. Systems admins may also need some serious upskilling to work with big data computing environments (here's one recent piece elaborating on big data skills needs). One thing stood out to me: companies don't want to approach big data like classic waterfall ERP projects where they pay high-priced consultants to do the bulk of the work. They want to balance external experts with internal competencies - a healthy outlook for customers even as it inflicts change on systems integrators.

For data variety, there are all kinds of interesting use cases popping up. An early misconception is that 'variety' means social media data. While it's true that unstructured data plays a key role, and that getting closer to customer behavior is a key driver of early big data projects, variety has a much wider definition than social sentiment.

Variety includes public and private industry feeds, sensor and mobile data, and a sometimes-overlooked source: 'dark' corporate data that is trapped in legacy systems or used only by one corporate silo. Pulling in new data sources for better decision-making or improved customer experience is, in my view, the most compelling part of the big data conversation. From using sensor data from oil and gas rigs to predict blowouts, to applying sensors to beer taps to ensure the beer keeps flowing (an SAP HANA startup Den and I filmed), the data variety use cases keep coming.

Concluding thoughts

I still don't think much of big data terminology, but I find myself moving off of knee-jerk skepticism into unapologetic curiosity about new business models and real-time applications. Data privacy and security are not to be taken lightly, and we shouldn't underestimate the difficulty of the culture change and siloed thinking - both will put a drag on the transformational aspects of becoming a so-called data-driven organization.

On the adoption side, the convergence of big data and cloud will be a story to watch. In theory, cloud rollouts could ease adoption trauma and make skunkworks projects easier (that's clearly what Amazon is thinking with its streaming analytics service, Kinesis, now in public beta).

We can't accuse vendors of making up big data fantasies by themselves. They might fuel the hype for marketing purposes - ok, many of them absolutely do - but Gartner now tracks more than 300 companies with Chief Data Officers. On, 'big data' is the second most popular search term (I'll let you guess what the most popular one is).

One way to look at the upside is to think of successful big data projects as moving companies further down the path from products to services to customer experiences. That means data-personalized relationships that build loyalty and community, which in turn become the thread between formal transactions. That goes back to a better customer experience, which is a vision we can all get behind - even if we're reminded in the fever pitch of holiday shopping just how far most companies have to go in winning our loyalty.

Disclosure: SAP is a diginomica premier partner as of this writing.

Image credit: Big Data @ joreks -
