A year ago, diginomica looked at the potential at the enterprise level of ‘graph’ – an upcoming non-SQL based form of database, which sees itself as forming a distinct database category in its own right.
Twelve months on, has any of that promise and swagger translated into genuine market momentum? Is graph a genuine CIO option for someone wanting to work with Big Data, say?
As with many things in both life and IT, the answer is ambiguously yes and no.
Downsides first: it wasn’t Neo that went public first but another ‘NoSQL’ database contender, Cloudera (Hadoop), last month. You will still ask in vain for any kind of Gartner graph Magic Quadrant or market bulletin - Neo admits that until the global graph market hits $500 million, the analyst bean counters won’t get out of bed.
And unless we missed it, no Fortune 100 – or even 2000 – company has announced that it’s ripping out all its relational investment in favour of a data model allegedly based on the way human beings naturally manipulate information (i.e. things and relationships, not nasty old schemas and tables).
So on that basis, Oracle investors can rest easy. But there are other signs that point to reasons to be cheerful for the graph camp. For example, Neo announced at its London GraphConnect Europe conference that it absolutely plans to IPO by 2020 and is going to be profitable next year.
And while Gartner may shun graph, Forrester Research has opened a book on it, with one of its upcoming TechRadar briefings set to be on the topic. Meanwhile, although no-one’s replacing SQL with graph on an enterprise-wide level, some big names are starting to dabble in significant ways. The conference boasted presentations from graph fans at Airbnb, Deutsche Bahn and eBay.
To understand why big brands want to work with this still minor data play is to understand, perhaps, what’s happening in the market as a whole. As Neo’s founder and personable CEO Emil Eifrem told diginomica:
When I grew up as a professional programmer in the 1990s, the database selection was Sybase, Informix, DB2. But it wasn't a choice of data model, because they all have the same model; it was just a marketing choice.
That top-down push won’t cut it any more, because the size of the addressable market for databases has grown, which opens up opportunity for new entrants.
But its nature is changing, too. New use cases are coming on-stream that suit graph’s capacity for complex data mapping: cases where connections and links between entities that you didn’t know about before become visible, or where the data benefits more from being modelled as nodes and relations than as other abstractions like tuples and tables.
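To make the nodes-and-relations point concrete, here is a minimal Python sketch. It is not Neo4j code and not anything shown at the conference; the names, the data and the helper functions are all invented for illustration. It takes relationship data held as table-style rows, rebuilds it as a graph, and then asks a question that is awkward to express over rows but natural over links: how are two people connected?

```python
from collections import deque

# Hypothetical data: "who knows whom", stored as relational-style rows (tuples).
knows_rows = [
    ("alice", "bob"),
    ("bob", "carol"),
    ("carol", "dave"),
    ("erin", "frank"),
]


def build_graph(rows):
    """Turn (person, person) rows into an undirected adjacency map."""
    graph = {}
    for a, b in rows:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search: return the nodes on a shortest path, or None."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        # sorted() only makes the visiting order deterministic
        for neighbour in sorted(graph[node]):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


graph = build_graph(knows_rows)
print(shortest_path(graph, "alice", "dave"))   # ['alice', 'bob', 'carol', 'dave']
print(shortest_path(graph, "alice", "frank"))  # None
```

In a tuples-and-tables model, the same question would typically need one self-join per hop, with the number of hops fixed in advance; in the graph form the traversal simply follows links until it finds the target or runs out of them.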
Take Nordic telecoms giant Telia. It’s launching a new upgraded service to its 1.3 million broadband customers, Telia Zone, that it intends to grow to a smart home network, and its Head, Rickard Damm, told us that to do that, it uses Neo’s Neo4j product as a way to map and model what’s effectively an IoT network in utero:
Graph is future-proofing. We are using it as a way to get the information we want to have for developing future products. You have a social graph for Facebook and a page rank graph for Google, we’re building the first graph for IoT. The APIs we are publishing today are arguably not dependent on a graph, but the ones we are going to publish in the future are going to include machine learning algorithms, predictive algorithms, where we see the use of the graph. We are pre-empting that and deploying today to be ready.
But – and it’s an instructive but – Telia doesn’t only use graph. It’s key, but it’s not the only instrument in the Telia Zone orchestra. That’s actually the key to understanding what’s going on with graph, and also Hadoop, Lucene, In-Memory and all the other exotic data engines that have been popping up over the last few years; they are there because the older databases don’t do what they do well – but that doesn’t mean they automatically retire the others.
This sound IT sense was crystallised at the conference by Capgemini UK’s Dave Da Silva, whose presentation title was ‘Don’t Choose One Database: Choose Them All’. Da Silva told the overwhelmingly t-shirted developer audience:
Don’t try to find the perfect database for all your needs. It doesn’t exist. Use multiple ones for different problems, ideally together.
According to this thesis, you’d still want to use SQL for large, complex business queries, but Hadoop for probing very large datasets, as well as in-memory for fast, ad hoc querying and investigation, and Lucene-style data software for complex free-text data discovery and retrieval.
That makes sense, as you’d not get great results by pointing a conventional SQL tool at a Big Data heap, nor would you do in-memory work if your target was of a size that your hardware couldn’t cope with.
It’s worth noting that this idea of co-existence has in some quarters led to a belief that what you want is a single database that can switch into these other modes as needed. That is not a belief Eifrem shares at all:
I think many of the vendors today are confused and are starting to do this multi-model thing, where they want to add all the models into one database engine. You can do that with one or two, but after a while, ultimately, believe you me, when you lay out that information on disk or in memory it will be optimised for one data model. Unfortunately, there’s a lot of support for this in the analyst community.
What we really want, he argues, is pure-play graph, which is probably as cool as Neo says it is, but for a sub-set of problems.
These can be very interesting problems, for sure. Eifrem outlined three - the world’s biggest investigative journalism scoop, The Panama Papers; eight projects he claimed were making real progress in cancer research; and an intriguing historical database achievement by NASA, whereby graph tech found, hidden in ancient records, an engineering hack that fixed an issue with the 1960s Apollo capsules that was affecting the new Orion ones. This saved $1 million and potentially means knocking two years off the time it takes for a Mars mission to be mounted.
He was also able to cite the new eBay ShopBot as employing graph, as well as saying seven of the top 10 retailers in the world now use graph to track purchasing and hence provide Amazon-style recommendations, including no less a name than Walmart, while graph is now used in verticals like fraud detection, identity and access management (UBS), master data management (Cisco) and network IT ops (HP).
So Eifrem’s unapologetic; he stands by his claim that graph is going to be Trump-style ‘yuuge’, that it will carve out a significant niche of its own, and that Neo won’t turn into SAS Institute (as in being a privately held stalwart that gradually loses relevance over time):
We are now able to have a conversation with a line of business executive in any industry, and we can have a conversation that is very exciting to them. There is not one narrative that works for all, it is tailored per industry. So we think there is an opportunity to build this company to become a big public independent entity.
There are questions about graph, for sure. It’s never going to be what you rebuild CICS in. And even if you do believe that its underlying data model is ‘human like’, that’s still no justification for using it to crack every problem. That is why we invented binary computers, after all.
But if the idea of co-existence and applicability of new exotic data software to our new exotic problems pans out, then absolutely graph needs to be taken as seriously as the others.
And if what Eifrem says about the database wars holds water – that it was always a top-down decision, which database to use, not the programmer’s – then judging by the 1000 developers at GraphConnect, this time round the revolution’s bottom-up.
So in this version of the database wars, will the t-shirts and ponytails get to win? Maybe. Maybe not. It's not a grudge match of the kind we saw between stubborn Cullinet and disruptive Oracle, flat-file v relational. Nor is it the hype-driven blitzkrieg of the Oracle v Ingres v Sybase v Informix v IBM v Microsoft conflicts. But there are the makings of a new front opening up over a highly-lucrative piece of technological territory.