MongoDB CTO on when MongoDB works and when it doesn't

By Derek du Preez, November 9, 2014
Summary:
MongoDB may be taking aim at Oracle, but in what environments is it best suited? CTO Eliot Horowitz discusses.

Following diginomica's recent story about MongoDB taking aim at Oracle with its new global consulting arm, headed up by Vijay Vijayasankar, I snapped up an offer to sit down with MongoDB co-founder and CTO Eliot Horowitz to further explore the thinking behind the company's go-to-market strategy. Mostly I wanted to get a better idea of what this very young company needs to do to be successful in its (undeclared) mission to go after the very well-established RDBMS players.

Sitting down with Horowitz wasn't like sitting down with most vendor executives. Horowitz was a little bit awkward, his answers were quite concise and, as well as highlighting MongoDB's strengths, he was equally quick to admit the “huge problems” with the technology – all of which I applaud him for. Instead of giving me a sales pitch, he was happy to have a frank discussion about MongoDB's position in the market and where it is headed.

As Dennis rightly highlighted earlier in the week, MongoDB is most suitable for applications that support short-run or single processes – it's not really fit for the long-run processes found in the likes of an ERP environment. And this is a view that Horowitz readily supports. He admits that MongoDB isn't going to be suitable for all use cases and that Oracle may well be the answer for some environments. He said:

I think it's less about specific use cases and more about specific problems and environments. There are characteristics with use cases that really spring to mind, rather than just saying it works really well for CMS. If you think about it broadly, MongoDB works really well when your data doesn't fit a very rigid schema, but you still need to do complex kinds of operations on it. One of the ways I describe it is if your data doesn't naturally fit into Microsoft Excel, but you still need to do complex queries on it and analyse it and aggregate it and do realtime work on it, that's when MongoDB works really well for you.

MongoDB is designed for OLTP workloads, so more transactional, online, realtime workloads. People definitely do use it for batch processing, and it works okay in those areas, but it's really designed for OLTP. Where you've got a user, or a system, that is working with it in realtime. Sometimes it's people interacting with a website, sometimes it's monitoring data with machines talking to it, but it's very realtime and very action based, not really reporting based.

Horowitz described how a lot of the company's biggest and most successful clients are using MongoDB in situations where they have a number of different data sources, but all of the data coming in from those sources is similar in some way, so it makes sense to have it all sitting in MongoDB. He provided the example of MetLife, an insurance provider, which had been trying for two years to compile all of its policies for millions of customers into a single relational database – but had failed.

They tried for a couple of years to get it to work, but with MongoDB it works very naturally and they were able to get it live within a matter of months.

Horowitz said there isn't a 'typical' project that MongoDB is suitable for and he finds that he is dealing with companies that are starting afresh with MongoDB, as well as those that are ripping out parts of Oracle and replacing them with MongoDB.

MongoDB CTO: Eliot Horowitz

But he does admit that the largest enterprises out there are never going to do a straight swap between Oracle and MongoDB. Horowitz said:

Obviously a lot of enterprises have a lot of Oracle and they are moving things piece by piece. They are never going to be a full swap. For small companies yes, but for big companies that's an impossible proposition for them. What they can do is start moving piece by piece where there are pain points. What we tell every big client is that if they have an application and it's on Oracle and it's working fine, don't touch it. We will focus on the new projects and the projects that are causing problems.

So what are these 'pain points' that Horowitz refers to? He puts them into two categories – developer productivity and cost efficiency. Companies are either finding that their relational databases aren't allowing their developers to adapt and add features fast enough because there is something wrong with the data model, or they are finding the cost of maintaining their Oracle or mainframe installation prohibitive.

However, when I asked whether or not it was the cost element that really hooked customers in when having initial sales discussions, Horowitz was quick to disagree. He sees the developer productivity as the main win for MongoDB installations. He said:

When I talk to a lot of CIOs, if you think about what their cost centres are, developer cost is higher than operational cost. The people cost more than the hardware and the software. The developers want to be able to focus on making new products and making products, rather than having to think about working around infrastructure problems. That's what the business wants the developers to do and that's what the developers themselves want to do. The number two driver is operational cost.

I was keen though for Horowitz to give me an idea, or provide me with some examples, of how productive MongoDB is compared with a traditional RDBMS, but also how much cheaper it could be. The answers I got weren't definitive, but this is hardly surprising – I don't think MongoDB is trying to sell itself as the cheap alternative to Oracle. Rather it's positioning itself as the database for web apps. He said:

It's very use case dependent. We've got examples that are 50X better, 2X better. Then you can look at people on the other end of the spectrum like MetLife that failed for two years with Oracle and had made no progress, then in three months had it working with MongoDB. So is that 8X better or is that infinitely better?

There's two reasons why it might be cheaper. MongoDB is designed to scale horizontally across commodity hardware, so you don't need to have these high end big boxes. The second is the efficiency of the server. This completely changes the way you think about cost. You can start with 30 servers, then go to 50 or 100 when you need them, rather than those massive boxes for four or five years. It's hard to generalise [or be specific about cost savings] because it's very use case dependent.

There are plenty of examples where Oracle is going to be the perfect solution; MongoDB is not going to solve every database problem forever. But there's just some use cases that will make you completely change the way you think about an application.

Horowitz was also very honest about where and how MongoDB is lacking in its current offering – most notably in terms of integration capabilities and some areas of high performance. A lot of this will hopefully be addressed by the imminent v2.8 release, due at the end of this year or early next year, which will include a newly integrated storage engine (called WiredTiger) and some new automation tools.

He added that building up the capabilities of MongoDB will largely be done through consulting with clients to understand what is missing from the product and what is needed, as well as via partners and the ecosystem. Horowitz said:

There's a lot of little things we don't have, performance in certain kinds of use cases is a big deal. But if you think about it from a big picture standpoint, MongoDB has been in the market for a little over five years now. If you compare that to any other product that any of these other companies are looking at, it's a very, very young product. There is a lot of integrations and there is a lot of things that MongoDB just doesn't support yet – and I think that's something that will come in time and we need to keep innovating on those things and we have to keep making it better and better.

One of the biggest challenges today is on the certain use case performance, which is what we are addressing with v2.8 – it's not the panacea for all MongoDB problems in the world. We announced that we have integrated a new storage engine called WiredTiger, which can work in very high rate volume workloads, has support for compression and a lot of features that people are looking for.

In the relational world you've got a few big boxes, in the MongoDB world you could have 2,000 commodity servers, so you need really great management tools for that. That's a huge problem for us. The other problem is that MongoDB sometimes has issues in highly write-intensive workloads, and the new storage engine will make that a lot better.

The other big thing is automation, where you can have automation tools that let you manage very large clusters all from a very simple pane of glass. This morning we did an upgrade of a 30 server cluster via UI in a matter of minutes, as opposed to having to go in manually and do it. Upgrading a 30 node cluster with zero downtime can be a little tricky, but now the tool will do it for you with a few clicks.

My take

I think Horowitz summed it up when he said that “MongoDB's young enough that there's not a definitive playbook about following the exact rules”. But that's exactly where it needs to get to in order to be taken to the next level (well, as much as any technology can 'follow the rules') – it needs to figure out the integrations, the backups, the automation. All of the things that would make it an easy enterprise choice.

At the moment MongoDB is relying upon the fact that it is a cheap enough product to fail fast with and to learn from, which is fine. But when you go to the MongoDB events you still get the sense that it's a tech toy for developers. In every session I went to, people were asking very technical questions about the product rather than questions about the business benefit, which suggests to me that this is a product that is yet to really grab the attention of the CIO.

That being said, the customer sessions I went to suggest that when you get MongoDB right, it's a brilliant database to work with. In a session with the Home Office, the main guy working on the project (who has a number of years' experience using MongoDB) said that it is now just the obvious choice for most things he develops – he rarely looks elsewhere. That says a lot.