MongoDB hopes to succeed where Hadoop failed - launches Atlas Data Lake

By Derek du Preez, June 18, 2019
Summary:
We speak to Mat Keep, director of product marketing at MongoDB, about how the NoSQL vendor is extending services beyond the database.

MongoDB is making a number of interesting announcements this week at its annual MongoDB World event in New York City. It is making a pitch that extends beyond its core NoSQL database remit, and the vendor is aligning itself with strategic data decision making in the enterprise - a move that gives it an additional ‘value add’ compared to where it was a couple of years ago.

We already know that MongoDB has seen huge growth because of its focus on making an alternative to traditional, relational databases ‘easy’ for developers. Its strategy has been centred around the developer user experience. It is now hoping it can take this further by tapping into this differentiator and extending that user experience out to other strategic data services within the enterprise.

I got the chance to speak to Mat Keep, director of product marketing at MongoDB, ahead of the event this week to get a breakdown of the announcements - of which there are a slew. There is an update to the core database itself, with the announcement of MongoDB 4.2, which sees the extension of multi-document ACID guarantees from replica sets to sharded clusters (an update on last year’s ACID announcement) and enhanced security features, such as field level encryption.

However, the most notable announcement is that of MongoDB Atlas Data Lake, which is aimed at making it easier for enterprises to make full use of all the data flowing into the organisation. Atlas Data Lake will allow customers to query data on AWS S3 in any format, including JSON, BSON, CSV, TSV, Parquet and Avro, using the popular MongoDB Query Language.

MongoDB hopes that this will provide a useful alternative to Hadoop, which often requires heavy lifting, is expensive and resource intensive. Keep explains:

I think where a lot of the attention will come is how we are extending beyond the database into new use cases and new services. And at the core of that is the new MongoDB Atlas Data Lake, which allows you to take that MongoDB query language, which developers are already very familiar with, and just point it straight at data stored in S3.

There’s no need to load data into a specific data warehouse or a dedicated cluster. It’s a very quick and convenient way for developers and analysts to unlock more data that’s stored in their data lake - you don’t have to move data anywhere, it’s auto-scaling, you can have as many concurrent users as you like.
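To make the idea concrete, here is a minimal sketch in Python of the kind of MongoDB Query Language aggregation a developer might point at click stream data. It is an illustration only, not taken from MongoDB's announcement: the field names (`year`, `page`), the function names, and the connection string, database and collection passed to `run_query` are all hypothetical, and it assumes the service is reachable through a standard MongoDB driver such as PyMongo.

```python
from typing import Any, Dict, List


def build_clickstream_pipeline(min_year: int) -> List[Dict[str, Any]]:
    """Build an aggregation pipeline that ranks pages by view count.

    The field names `year` and `page` are hypothetical; real data
    would dictate its own schema.
    """
    return [
        # Keep only events from min_year onwards.
        {"$match": {"year": {"$gte": min_year}}},
        # Count one view per event, grouped by page.
        {"$group": {"_id": "$page", "views": {"$sum": 1}}},
        # Most viewed pages first.
        {"$sort": {"views": -1}},
        {"$limit": 10},
    ]


def run_query(uri: str, database: str, collection: str,
              min_year: int) -> List[Dict[str, Any]]:
    """Run the pipeline against whatever endpoint `uri` points at.

    Assumption: the endpoint accepts a standard MongoDB driver
    connection. PyMongo is imported here so the pipeline builder
    above works without the driver installed.
    """
    from pymongo import MongoClient

    client = MongoClient(uri)
    try:
        pipeline = build_clickstream_pipeline(min_year)
        return list(client[database][collection].aggregate(pipeline))
    finally:
        client.close()
```

The pipeline is plain data, so the same stages would read identically whether they are sent to a regular MongoDB collection or, under the assumption above, to files sitting in S3.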

Keep adds that MongoDB isn’t going after data warehouses, which he sees as a mature and sophisticated technology sector. Instead, MongoDB wants to make it easier for enterprises to tap into and get insights from the data that is being archived in S3 - data that doesn’t lend itself to mapping into neat rows, columns and tables of a traditional relational warehouse. And this is where the challenge to Hadoop lies, which Keep feels has failed to live up to what was promised in its early days. He says:

That’s where Hadoop came in, with massive hype around 10 years ago. But Hadoop has clearly failed to live up to the promises it inferred. We certainly see this as an alternative to those Hadoop based deployments, where you’re ingesting lots of streams of data into S3.

With that data you don’t know all the data structures in advance. Hadoop incurred huge complexity and engineering cost to do it. The vast majority of organisations don’t have that, but they still want to be able to extract insights from all this new data that’s flowing into the organisation - click streams, sensor data, social media feeds, mobile apps. That is what the Atlas Data Lake is designed to enable.

I really don’t think there’s a huge amount you could do in Hadoop that you couldn’t do in the Atlas Data Lake.

Democratising data with Charts

Whilst Charts, which MongoDB introduced at the end of last year, is a totally different offering to Atlas Data Lake, the two can be seen as two sides of the same coin. What I mean by that is that MongoDB is broadening its offering to ensure that it becomes a platform that is central to an enterprise’s strategic decision making.

Charts is essentially a data visualisation tool that forms part of the core MongoDB platform, potentially removing the need to integrate with other BI tools. Given the recent acquisitions of Tableau by Salesforce and Looker by Google, it’s clear that this idea of democratising data across the enterprise hasn’t run out of steam just yet.

This week Charts has introduced new capabilities that include embedded charts in external web applications and geospatial data visualisation with new map charts. Keep explains:

You’ve always been able to connect BI tools like Tableau and Looker to MongoDB using something called our BI Connector. What that does is act as a proxy between those visualisation tools that talk SQL. So the BI Connector mediated between those two layers.

But for a lot of users, they wanted something that was more simple and could work against MongoDB data, without having to flatten it into rows and columns. That’s what Charts gives them. It’s not going to be as rich or powerful as Tableau, but for people that are building Excel-like charts into their apps or into dashboards, MongoDB Charts gives you that capability.

There’s a huge desire to democratise data access and make it accessible to business users, so that they can very quickly start to slice and dice and get visualisations of data. There has been a huge appetite to make data much more accessible to business users, not just data scientists. Charts is following that stream of making it very easy to start to get insight out of all of the data that’s flowing into your enterprise.
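For readers unfamiliar with what flattening a document into rows and columns involves, the following is a generic Python sketch. It is not how the BI Connector or Charts works internally; the sample `order` document and the helper names `flatten` and `to_rows` are hypothetical and exist only for illustration.

```python
from typing import Any, Dict, List


def flatten(doc: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Turn nested sub-documents into dotted column names."""
    flat: Dict[str, Any] = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def to_rows(doc: Dict[str, Any], array_field: str) -> List[Dict[str, Any]]:
    """Emit one flat row per array element, repeating the parent fields.

    Assumes every element of `array_field` is itself a sub-document.
    """
    parent = {k: v for k, v in doc.items() if k != array_field}
    rows = []
    for item in doc.get(array_field, []):
        row = flatten(parent)
        row.update(flatten(item, prefix=f"{array_field}."))
        rows.append(row)
    return rows


# Hypothetical order document with a nested customer and an array of items.
order = {
    "_id": 1,
    "customer": {"name": "Ada", "city": "London"},
    "items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
}

rows = to_rows(order, "items")
for row in rows:
    print(row)
```

A single document becomes two rows here, with the customer details duplicated on each. That duplication is the step a tool working directly on documents can skip.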

Realm

Finally, and continuing along the same theme, MongoDB unveiled its ‘vision’ for Realm, a mobile database and synchronisation platform that it acquired just last month. Realm will merge with the serverless platform MongoDB Stitch, which effectively acts as an integration layer. With direct access to the database, Stitch pulls in standardised third-party services, allowing developers to spend less time coding integrations.

Realm’s synchronisation protocol will also connect with the MongoDB Atlas global cloud database on the back-end. The thinking is that this will create a useful way for developers to connect data to the devices running their applications. Keep says:

MongoDB completed acquisition of Realm back in May. We see lots of synergies between what Realm has done in the mobile space and what we have done in the back-end database layer. But what they didn’t provide was any kind of back-end service or any kind of sync service, so there was still a lot a developer had to do there.

So, bringing that together with one of the world’s most popular, embedded, modern database and providing a completely managed service with MongoDB Atlas and Stitch, just helps front-end developers to build these very reactive, responsive mobile apps and hook all of that into a very powerful, back-end database platform.

My take

Expanding its breadth and its depth, MongoDB is slowly but surely extending beyond its NoSQL database roots and becoming something more strategic. However, with that comes a lot of work too. MongoDB itself has to think and talk differently to customers. Rather than offering an easy-to-use tool that it can throw out there, it needs to be able to talk comprehensively about the challenges data and operations bring to the digital enterprise. It needs to become a trusted adviser and an authority on some of the most complicated topics out there. We are seeing signs of that already and it appears MongoDB is entering into a new phase.