The battle for BI minds and wallets has shifted to the business user, with the explosive growth of data visualization vendors as proof points. But there's a problem. Until business users can process and cleanse their own data, they are back to the IT bottleneck. That ties hands and slows the widespread adoption of next-gen BI tools.
Or so says Paxata, a BI startup with big ambitions that formally launched on October 28. 'We are bringing self-service to the biggest bottleneck in self-service BI, which is data preparation,' Co-Founder and VP Products Nenshad Bardoliwalla told me in a follow-on to a briefing I attended with two members of Paxata's founding team, Bardoliwalla and Co-Founder and CEO Prakash Nanduri.
But is another best-of-breed tool needed in the BI portfolio? And how can a solution focused on business users help solve the 'dirty data' problem that stymies BI initiatives?
Why visualization isn't enough
Paxata's founding team, whose members come from enterprise backgrounds ranging from SAP to Tibco to Tableau, thinks it has the pedigree and the product to change the data transformation game. With $10 million in funding, the Paxata team points to Dannon, Box, and Pabst as early customers that vouch for the enterprise credibility of its solution - not to mention out-of-the-gate partnerships with Tableau, QlikTech, and Cloudera.
'Our goal is to upend and change the lives of business people who normally have had to spend their time waiting for IT to do a data services project or write SQL Queries for them,' says Bardoliwalla. With today's vast array of feeds and the need for real-time collaboration, Bardoliwalla doesn't think that spreadsheets and Access databases - the go-to environments for most business analysts - are adequate tools for coping with proliferating data sources.
They call their solution an 'Adaptive Data Preparation Platform', which is a fancy way of saying, 'This ain't your grandpappy's MDM (Master Data Management) and ETL.' Fully buzzword-compliant, Paxata is an HTML5-based multi-tenant solution that was built in 12 vigorous development sprints.
Why would seasoned BI executives invest their careers in this uncertain endeavor? Bardoliwalla:
The thing that was shocking to us was - there was this gaping hole. Nobody was tackling this notion of a next generation information management platform. Nobody was looking at how to rethink data integration, data quality, enrichment, governance and collaboration, and put it in the same platform.
Nanduri took the argument a step further. In the mad push to equip users with visualization tools, BI has gotten ahead of itself:
The biggest challenge to decisions in the enterprise is not in analyzing data, it's getting the right data in the first place. Getting the right data is what prevents all the other downstream stuff from happening.
Can business users really clean their own data?
I've never seen a business user clean and transform their own data, so I brought my skepticism with me to the Paxata product demo. Bardoliwalla and Nanduri walked me through an example not unlike what they did with Pabst Blue Ribbon. Paxata explained that Pabst used Paxata to mash up and organize their distributor, product and retailer information. The result? A single view of data - one that would help them slice and dice questions such as which distributors are excelling with which products, and through which retailers?
Paxata's appeal for this kind of use case: get the data sorting done on the fly, thereby avoiding a clunky, multi-year data services project. Here's a zoom in on a screen shot similar to those that business users at customers like Pabst would work with:
As you can see, the screen looks straightforward, not dissimilar to an Excel look and feel. But appearances can be deceiving. Within this screen, the Paxata user is provided with a set of contextual actions for filtering, transforming, clustering, and semantically reconciling the above data. It's all done on the fly, without any advance pre-definitions or dreaded aggregates.
The example I was shown involved harmonizing data from several sources that had chaotic and badly organized information on distributors. One example was data culled from an XML format - but all the geeky XML code was hidden, showing the business user only the data fields that would make sense to them.
Tools like visual pattern recognition and graph analysis allow users to organize data in ways that actually make sense for their needs. Machine learning and semantic algorithms reinforce the user's agenda, allowing the system to 'remember' the preferred field structure for future data pulls. When other data reconciliation methods fall short, full-text search is also available. Once users spot the patterns that impact them, they are off and running - with no script writing or map/reduce jobs. Beastly tasks like splitting columns (in the case of my demo, distributors organized by zip code) are not difficult.
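For readers who want a feel for what 'clustering' and 'splitting columns' mean in practice, here is a minimal Python sketch. It is not Paxata's algorithm, which was not disclosed in the demo; it assumes a simple key-collision approach (normalize each name, then group names that normalize to the same key), and the function names and sample distributor names are hypothetical.

```python
import re
from collections import defaultdict


def fingerprint(name: str) -> str:
    """Normalize a name: lowercase, drop punctuation, sort the unique tokens."""
    tokens = re.sub(r"[^\w\s]", "", name.lower()).split()
    return " ".join(sorted(set(tokens)))


def cluster_names(names):
    """Group names sharing a fingerprint; return only groups with 2+ variants."""
    clusters = defaultdict(list)
    for name in names:
        clusters[fingerprint(name)].append(name)
    return [group for group in clusters.values() if len(group) > 1]


def split_location_zip(value: str):
    """Split a 'location + 5-digit zip' cell into two columns.

    Returns (location, zip) or (value, None) when no trailing zip is found.
    """
    match = re.match(r"^(.*?)[,\s]+(\d{5})$", value.strip())
    if match:
        return match.group(1), match.group(2)
    return value.strip(), None


# Hypothetical sample data, for illustration only.
distributors = [
    "Acme Distributing, Inc.",
    "acme distributing inc",
    "Inc. Acme Distributing",
    "Badger Beverage",
]
variants = cluster_names(distributors)
```

A real tool would likely layer fuzzy matching on top of this, since exact key collisions miss misspellings; the point here is only that the user sees candidate groupings rather than writing the matching logic.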
This kind of data manipulation could pose a governance nightmare. Paxata thinks it has addressed this by recording every data change made by each user - a level of visibility that is very hard to achieve in a typical spreadsheet. Data can be reverted to a prior state, and multiple users in different locations can manipulate data while collaborating in real time.
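The record-and-revert idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of how Paxata implements it: the `AuditedTable` and `Change` classes are hypothetical, and the sketch stores a full snapshot before each change, which a production system handling large data sets would probably avoid.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Change:
    user: str          # who made the change
    description: str   # what the change was meant to do
    snapshot: list     # rows as they were *before* the change


@dataclass
class AuditedTable:
    rows: list
    history: list = field(default_factory=list)

    def apply(self, user, description, transform):
        """Record who did what, then apply `transform` to every row."""
        self.history.append(Change(user, description, copy.deepcopy(self.rows)))
        self.rows = [transform(dict(row)) for row in self.rows]

    def revert_to(self, version):
        """Restore the rows as they were before change number `version` (0-based).

        Later changes are dropped from the history in this simple sketch.
        """
        self.rows = copy.deepcopy(self.history[version].snapshot)
        del self.history[version:]


# Hypothetical usage: two users clean the same column, then one step is undone.
table = AuditedTable([{"name": " acme "}])
table.apply("ana", "trim whitespace", lambda r: {**r, "name": r["name"].strip()})
table.apply("ben", "uppercase names", lambda r: {**r, "name": r["name"].upper()})
table.revert_to(1)  # undo ben's change, keep ana's
```

Because every entry carries a user and a description, the history doubles as an audit trail, which is exactly what a shared spreadsheet lacks.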
The results? Paxata has already published a couple of case studies documenting its projects. In the case of Pabst, that meant making it more than five times faster for business analysts to reconcile Great Plains general ledger data with information from Margin Minder. According to the case study, Pabst sales and marketing analysts are able to provide new insights into sales effectiveness - without the need for additional IT resources.
Paxata is a provocative startup with plenty of nuances I didn't get into here. The partnerships with Tableau and QlikTech indicate that Paxata has staked out a sweet spot in between data sources and the leading visualization engines. Paxata's team sees a range of potential use cases, including sitting on top of Hadoop or SAP HANA. Considering the SAP background of some of the executive team, a Paxata-Lumira use case is also theoretically possible.
It will be interesting to see how the 'big BI' incumbents, from IBM to Oracle to SAP, respond to Paxata. Then there are information management focused vendors that will surely have something to say about their own capabilities versus what Paxata does. I will not be surprised to hear some of them say 'we can do the same thing Paxata does,' and it will be up to the enterprise customer to decide for themselves the veracity of such claims. Other startups and BI all-in-one tool vendors will have their say also.
I didn't cover pricing; the nice thing is I didn't have to. Paxata has published a pricing page that puts the details out in the open. As for the not-sexy UI, Paxata has intentionally designed it to feel very familiar to Excel users. Since the company's agenda is to serve up information to visualization tools, Paxata's user experience strategy is, by definition, different from a visualization vendor's.
When I asked Paxata about their biggest challenge next year, Bardoliwalla responded: 'Our number one challenge is ensuring effective sales execution, which requires laser-focused positioning. There is so much froth in the analytics/BI/Information Management market and so many vendors, that differentiation is very difficult.'
That sounds about right. The Cloudera partnership might be the most convincing illustration of how Paxata fits into the next-gen BI narrative. Cloudera wants to become the enterprise data hub, whereas Paxata wants to empower business users to make that data relevant. As Tim Stevens of Cloudera put it: 'Paxata will give our customers new ways to streamline the delivery of their original source data to a broader range of non-technical users.' It's a worthy ambition, that's for sure - and a story to watch.
Image credit: Business plan © Sergey Nivens - Fotolia.com
Disclosure: SAP is a diginomica premier partner. Bardoliwalla and I are both Enterprise Irregulars, a group of enterprise bloggers and practitioners, as is diginomica co-founder Phil Wainewright.