Two years ago, the UK’s most successful car sales site had the right sort of problem. With 70 percent of the market, Auto Trader had already made the tricky transition from print magazine to a purely online service with 50 million page impressions a month and rising. But its on-premises tech infrastructure was beginning to feel the pressure of that growth.
After looking at different strategies, Auto Trader decided to migrate its entire operation to the public cloud. With the whole organization driven by data, a central pillar of the migration strategy was a seamless move of its databases, a journey that’s worked out surprisingly well. Here’s how they did it.
Mohsin Patel, principal database engineer at Auto Trader, says:
We have to constantly innovate to stay market leader. Of our 800 staff, a quarter are developers and the infrastructure has to support their need for agile engineering, fast feedback and swift releases.
The 42-year-old company got onto the web in 1996.
Ten years ago, we were Oracle and SQL. If someone wanted to create an app on the corporate database, they were given a schema and that was it.
That monolithic approach has fallen out of favour since then, with open source software in production and developers wanting support on their terms.
Other drivers were cost reduction, through shedding physical assets and per-core licensing fees, and speeding up provisioning. System management was soaking up resources, and the lead time on commissioning new hardware might have been suitable for waterfall projects but didn’t fit agile. Patel explains:
The monolithic approach was fine when we were moving slowly, but we put out 15,000 releases to production last year, 40 percent being tier one business drivers, and aim to hit 35,000 this year.
Migration and MongoDB
Two years ago, the need for migration became pressing. MongoDB had started life at Auto Trader in 2010 with one app that lets users bookmark cars they are interested in. Patel says:
Initially, we were seeing 2,000 bookmarks a month. It’s up to two million now.
Over time, 34 more apps were added to the platform, all running on on-premises servers.
The developers wanted more. He adds:
They spin up a Docker container with MongoDB, build their app and say ‘support us’. We don’t want to be a drag, we want to promote faster, better, business-need focused innovation.
Plus, it was time to move off the old, out-of-support version of MongoDB. Patel says:
Moving to Atlas wasn’t an overnight decision. We looked at other options, such as running Chef Cookbooks on a private cloud, but we were in transition as a company and didn’t want to commit. We knew the future was the public cloud.
In August 2018, Patel felt the time was right to try Atlas, MongoDB’s fully managed cloud database, on Google Cloud Platform (GCP), backed by Kubernetes and infrastructure-as-code. He says:
We evolved a proof-of-concept methodology, and it worked.
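Provisioning Atlas clusters as infrastructure-as-code, as the article describes, might look something like the following Terraform sketch using the MongoDB Atlas provider. This is an illustrative assumption, not Auto Trader's actual configuration: the project ID, cluster name, tier and region are all placeholders.

```hcl
# Hypothetical sketch: declaring an Atlas cluster on GCP as code.
# All identifiers below are placeholders, not Auto Trader's real values.
resource "mongodbatlas_cluster" "example" {
  project_id = var.atlas_project_id   # assumed Atlas project variable
  name       = "example-cluster"

  provider_name               = "GCP"
  provider_region_name        = "WESTERN_EUROPE"
  provider_instance_size_name = "M30" # dedicated tier; size per load analysis
}
```

Keeping cluster definitions in version control like this lets changes go through the same review and release pipeline as application code.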
Patel lists six things that made the proof-of-concept successful:
If you’re upgrading to a new version, don’t necessarily pick the latest. A recent, proven release may be better: it reduces risk and can serve as a stepping stone to later upgrades.
Choose a complex application for the first port. Stress the system. Learn as much as possible.
Set a deadline for the proof-of-concept – two or three months.
Latency is important. Plan for it.
MongoDB has great tools for migration, including live migration. Use them wisely.
Developer buy-in is essential. Pick an app where the developers are very keen to see upgrades and rope them in to help.
By October 2018, Patel knew that the combination of GCP and Atlas was going to work. He then evolved a methodology for the full migration of all apps and databases. He says:
We iterated through this process for each app, pushing it to QA before going live. Good monitoring and alerts are in place for performance and system usage.
Migrating the apps one at a time felt right for the company’s agile philosophy, he notes, and it turned out that most of the preparatory work could be done immediately before each was migrated. Any problems or system resource decisions were thus clearly linked to the appropriate app as it was migrated, instead of dealing with a monolithic migration followed by intensive QA.
The steps taken for each app were:
Find your database owners. This can be surprisingly hard, as developers move on from projects. Look at the last commits to the repo and work from there.
Decide on grouping. Clusters cost, so group multiple apps in each Atlas Cluster, selected to minimize risk if one goes wrong and to make use of shared data requirements.
Check data dependencies, and purge. Lean databases are quicker to move.
Analyze your infrastructure for intended load and decide on cluster size. Atlas manages authentication databases and in-flight encryption with SSL, which helps.
Budget for the cost hump when you’re running your private infrastructure in parallel with the public cloud during migration.
Plan your connection pools. Managing connections is different in the cloud than on premises.
Do an IO profile. Check disk IOPS during busy periods. This will help you plan connection pools.
Upgrade drivers ahead of time for the new database version. Liaise with your developers early about this.
Characterize your apps for sensitivity to downtime. You can then plan optimal migration strategies for them.
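The connection-pool and IO-profiling steps above can be sketched as a back-of-the-envelope calculation. This is a minimal illustration, not Auto Trader's method; all figures and the headroom factor are assumptions.

```python
# Back-of-the-envelope sizing for a per-app MongoDB connection pool,
# based on an IO profile taken during the busiest period.
# All figures here are illustrative assumptions, not Auto Trader's numbers.

def pool_size(peak_ops_per_sec: float,
              avg_op_latency_ms: float,
              headroom: float = 1.5) -> int:
    """Estimate concurrent connections needed at peak load.

    By Little's law, in-flight operations = arrival rate x latency;
    a headroom factor covers bursts and failover retries.
    """
    in_flight = peak_ops_per_sec * (avg_op_latency_ms / 1000.0)
    return max(1, round(in_flight * headroom))

# Example IO profile for one hypothetical app during its busy period:
# 400 ops/s at an average 25 ms per operation.
size = pool_size(peak_ops_per_sec=400, avg_op_latency_ms=25)
print(size)  # 10 in-flight ops x 1.5 headroom -> 15
```

Note that cloud latency is typically higher than on-premises latency, which is why the IO profile feeds directly into the pool plan: the same request rate holds more connections open for longer.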
Patel points out that many people overlook another essential step – decommissioning.
Turn off user access, check the logs for a bit, then close things down and hand the space back.
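The “check the logs for a bit” step can be sketched as a simple scan for recent client connections before a legacy database is finally shut down. The log line format and the grace window here are assumptions for illustration, not Auto Trader's tooling.

```python
# Minimal sketch: scan a legacy database log for recent client connections
# before decommissioning. The line format and grace window are assumptions.
import re
from datetime import datetime, timedelta, timezone

# Assumed log line shape: "<ISO timestamp> ... connection accepted from <host:port>"
CONN_RE = re.compile(r'^(?P<ts>\S+)\s+.*connection accepted from (?P<client>\S+)')

def recent_connections(log_lines, within_days=7, now=None):
    """Return the set of clients that connected within the grace window."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=within_days)
    clients = set()
    for line in log_lines:
        m = CONN_RE.match(line)
        if not m:
            continue
        ts = datetime.fromisoformat(m.group('ts'))
        if ts >= cutoff:
            clients.add(m.group('client'))
    return clients  # an empty set suggests it is safe to close things down
```

If the scan still turns up clients, those are the database owners to chase before handing the space back.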
Driven by results
With all application databases migrated by the end of October 2019, Patel says that running on MongoDB Atlas feels like “a world apart” from where they were.
We can deploy in seven minutes and scale up in 20. And if an app goes crazy and takes over the cluster, it’s so much easier to scale up temporarily by changing a single setting and trace the problem. With our old VM system, that could take a day.
Developers can build in access to the platform, including access to a central monitoring dashboard and alert structure that’s integrated with other systems, with just four lines of code. He adds:
It’s seamless for the developers, no matter what they’re developing in. We want the developers to be self-sufficient, and our DevOps model does that. And it’s quick – we have releases come in 2.5 minutes or under, 95 percent of the time.
Patel is in no doubt that the migration happened in the right way and at the right time.
There’s lots to do in the future, but shedding the on-premises monoliths with all their overheads leaves us free to concentrate on business benefits. The journey was absolutely worth it.
You can watch Mohsin’s full presentation here: Migrating a Monolith to MongoDB Atlas – Auto Trader’s Journey