Adidas runs from Docker to Kubernetes for its e-commerce platform

By Derek du Preez, May 1, 2018
Summary:
Speaking at KubeCon + CloudNativeCon in Copenhagen this week, Adidas explains how the retail company ended up using Kubernetes at scale.

Retail giant Adidas is using a cloud native architecture, underpinned by the CNCF-driven Kubernetes system, to support its front-end e-commerce platform, which now runs at scale during massive events such as Cyber Monday and Christmas.

Daniel Eichten, platform engineer at Adidas, was speaking at KubeCon + CloudNativeCon this week in Copenhagen, where he was joined by Oliver Thylmann, CCO of Giant Swarm, an agency that specialises in Kubernetes deployments in the enterprise. Thylmann and Giant Swarm have been helping Adidas on its journey.

For those unaware, a cloud native architecture is defined as one that relies on containers, distributed management and orchestration, and the use of microservices – all of which are specifically designed for cloud environments. It’s the next level of abstraction up from virtualisation and allows for greater utilisation, lower costs and better portability.

The open source technologies that underpin such an architecture – largely driven by the Kubernetes-focused CNCF – are growing hugely in popularity and have seen rapid adoption amongst buyers over the past three years.

Opening up his presentation on the move to Kubernetes, Eichten began by saying:

The Adidas journey to Kubernetes. It was sometimes a relation of love, sometimes it was just hate, and very randomly and seldom it was also, what the f**k?

The shift was prompted in late 2013, when Adidas needed to move away from HP as its hosting provider and bring all of its applications into an internal data centre. It asked its suppliers and partners to quote for moving an application from A to B, and the figure that came back was far higher than Eichten anticipated. He said:

Our first reaction was, what?! It was that tremendously high. Then we asked, which was even more important, why is the quote so high? They came back to us and said that a lot of things have to be done manually.

This was because over the years at Adidas, if you ever needed something technical done to a system, you had to fill in a request form, resulting in layers and layers of manual customisation.

To overcome this, Adidas looked to containerisation to reduce the costs of the move. Eichten said:

We looked into some alternatives and what we could do about it and at that point in time we found Docker. Our first reaction was that it was an awesome tool. So we thought we’d get it into our landscape, we tried it out on some local machines and everything was working at that point in time.

However, at that point in time Docker was only supported on Ubuntu, and Adidas’ corporate Linux environment was Red Hat. As a result, its journey to containerisation was delayed. Eichten said:

So we did the second best thing we could do. We took a bunch of VMs and orchestrated everything with Puppet. That was working. Two or three months down the line, the support for Docker on Red Hat turned green. So we took our VMs we had before and we took a container on every of those. It was working, but didn’t feel as neat as it should be.

Progress

Fast forward to July 2015, when Eichten attended his first KubeCon conference in Portland. Seeing first hand what Kubernetes could do, he was convinced that this was something Adidas needed in its organisation. However, his enthusiasm wasn’t well received back at HQ. Eichten explained:

I went back a week later and said we have to do this, but I learnt our big corporate was like a shipping container, it doesn’t move direction that easily. So my idea was killed. But later that year we got a new CIO.

The new CIO was keen to try new things often and wasn’t afraid of failure, so Eichten was finally allowed to pursue Kubernetes. It was also at this point that Eichten met Thylmann and the Giant Swarm team. Thylmann explained:

We met them at the right time and decided to set up a two-day workshop, to try to understand what the current challenges are and think about what we can move first. And that’s what we did.

We try to take a step back and really think about what the first project might be and what might be easily attainable.

We normally start with a playground cluster, because it allows us to let the company play. We open a Slack channel together. We have very direct interactions - I think there are 60 people currently in the Adidas Slack Channel. Really assess how we adapt the platform to fit their needs.

Pains

After this ‘playground’ phase, Adidas decided to start its first proper project - its e-commerce store, which would be built entirely on Kubernetes. Eichten said that whilst Adidas was ‘bold’, it also wasn’t stupid. As a result, it decided to start small and focus on a less critical market to begin with - Finland. Even so, things didn’t go to plan. Eichten said:

When we started, we started with very little loads, just a few users, ran them through the cluster, and everything was working fine. So we thought, okay let’s increase the load a little bit. And we put more into it - the result was a crash. Constantly. All over. Redoing the test - crash. Obviously, this was a panic. It was a panic on our side as engineers, our project managers’ side, our roll-out was in danger.

Eichten added:

At one point in time we found a smoking gun, which was that the proxy connection table was full. That was an easy fix. Just increase the number. At that time, we increased the number by four. Ran the test again. And guess what? Again there was a crash. What is going on? We dug a little bit deeper and found that the network interfaces of our AWS interfaces were saturated. And obviously the first reaction was, what is going on?

As a result, Eichten decided to sit side by side with the teams and follow the progress of the next load test. He soon spotted the problem. He said:

We did the next load test together, so I could sit and see the metrics and compare them. I think the assumption was that we’d test 30,000 concurrent users. When in actual fact it was 300,000 concurrent users.
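
The mismatch Eichten describes is the kind of thing a simple pre-flight check on a load test can catch. The following is only a minimal sketch and not anything Adidas or Giant Swarm described using: the function names and the 10% tolerance are hypothetical, and only the two user counts come from the talk.

```python
# Figures quoted in the talk; everything else in this sketch is hypothetical.
ASSUMED_USERS = 30_000   # what the team believed the load test simulated
ACTUAL_USERS = 300_000   # what the load test was actually generating


def load_ratio(assumed: int, actual: int) -> float:
    """Return how many times larger the actual load is than the assumed load."""
    if assumed <= 0:
        raise ValueError("assumed load must be positive")
    return actual / assumed


def check_load_target(configured: int, intended: int, tolerance: float = 0.1) -> None:
    """Raise ValueError if the configured user count is off by more than `tolerance`.

    `tolerance` is a fraction of the intended value (0.1 means 10%);
    the default is an arbitrary choice for illustration.
    """
    if intended <= 0:
        raise ValueError("intended load must be positive")
    deviation = abs(configured - intended) / intended
    if deviation > tolerance:
        raise ValueError(
            f"load test configured for {configured} users, "
            f"but {intended} were intended ({configured / intended:.1f}x)"
        )


if __name__ == "__main__":
    print(f"actual load was {load_ratio(ASSUMED_USERS, ACTUAL_USERS):.0f}x the assumed load")
    try:
        check_load_target(configured=ACTUAL_USERS, intended=ASSUMED_USERS)
    except ValueError as err:
        print(f"refusing to start load test: {err}")
```

Run before a test starts, a check like this would flag the tenfold gap between the intended and configured user counts rather than leaving it to surface as saturated infrastructure.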

With this problem fixed, the Adidas store was back up and running. Eichten and his team have since rolled it out to most countries, where it supports high-demand sales and peak periods such as Christmas and Cyber Monday. The Adidas front-end now runs on Kubernetes and is operating at scale.