Meinestadt.de pilots Hadoop platform in search of individualised customer experience

Jessica Twentyman, March 25, 2015
The German online classified ads provider begins Cloudera trial with a view to boosting real-time analytics and personalised customer experiences.

From Augsburg to Zossen, and over 11,000 German cities, towns and municipalities in between, whether you’re looking for a job, an apartment, a used car or a hot date, meinestadt.de is a pretty good place to start.

Through its web portal and apps, this classified ads platform receives over 7.17 million users each month, serving up 131 million individual page views.

It’s owned by German publishing giant Axel Springer, publisher of the Bild and Die Welt newspapers, through its Axel Springer Digital Classifieds arm, a joint venture with private equity firm General Atlantic.

Meinestadt.de’s promise to its visitors - and the advertisers trying to target them - is its ability to offer highly localised information, specific to their hometown or to another city or town they’re hoping to move to or simply visit.

In information management terms, that emphasis on local relevancy requires a lot of heavy lifting, especially when information comes from a wide pool of external and internal sources, explains Dr Katja Mueller, the company’s head of business intelligence:

Our goal is to aggregate information conveniently for customers, but it’s very complex to organise that information according to location. We need to tag a lot of information, so that whether a visitor to our site lives in Cologne or a small town in Bavaria, he or she will see ads and news and weather reports that are relevant to them.

Mueller, who has also worked in business intelligence at broadcaster Sky and online auction site eBay, is convinced that real-time, big data analytics hold the key to delivering a better, more individualised experience to every visitor to meinestadt.de.

Hadoop trial

With that in mind, she’s embarking on a trial of Cloudera’s Hadoop distribution, Cloudera Enterprise, and using the Apache Spark tool to handle user queries, in real time, against the company’s vast stores of ads. She says:

Anyone who has used a jobsite will know that, when you input a keyword specific to your job search, you’ll get back 200 results and none of them will be relevant to you, or maybe only one - and that could be on page ten of the search results. It’s frustrating. Eventually, you might give up if it’s too hard to find what you want.

Using Cloudera and Spark should help us to make the user experience much neater. If you’ve already been on our site, and we can predict what you’re likely to search for, then we can deliver relevant ads with greater accuracy.
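To make the idea concrete, here is a minimal sketch of history-aware ranking in plain Python. It is not the company's Spark implementation, which the article does not describe. The `rank_ads` function, the ad fields (`id`, `title`, `category`), the `history_weight` parameter and the sample data are all hypothetical, chosen only to illustrate how a visitor's earlier activity could push more relevant results to the top.

```python
from collections import Counter


def rank_ads(ads, keyword, viewed_categories, history_weight=0.5):
    """Return ads whose title contains the keyword, ordered so that
    categories the visitor has viewed before come first.

    All names and the scoring formula are illustrative assumptions.
    """
    counts = Counter(viewed_categories)
    total = sum(counts.values()) or 1  # avoid dividing by zero for new visitors
    matches = [ad for ad in ads if keyword.lower() in ad["title"].lower()]

    def score(ad):
        # 1.0 for the keyword match, plus a bonus proportional to
        # how often the visitor viewed this category before.
        return 1.0 + history_weight * counts[ad["category"]] / total

    # sorted() is stable, so ads with equal scores keep their original order.
    return sorted(matches, key=score, reverse=True)


# Hypothetical sample data.
ads = [
    {"id": 1, "title": "Sales manager", "category": "sales"},
    {"id": 2, "title": "Project manager, construction", "category": "construction"},
    {"id": 3, "title": "IT project manager", "category": "it"},
    {"id": 4, "title": "Nurse", "category": "health"},
]
history = ["it", "it", "construction"]

ranked = rank_ads(ads, "manager", history)
print([ad["id"] for ad in ranked])
```

With this sample history, the IT listing ranks ahead of the construction and sales listings, while a visitor with no history simply sees the keyword matches in their original order.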

[Image: meinestadt.de in action]

But she has other plans for the Cloudera Hadoop pilot, too. It could be a useful way of getting new insights into how visitors navigate the site, information that will feed back into future product development strategies, she says:

We can collect a lot of information on what people do on the site, a lot of activity-based data. Some people take a long time to go from one page to another before they click on a listing, for example. So what we want to be able to do is say to the product development team: ‘Here are the pain-points, here’s a part of the site where many visitors drop out, here’s where they get stuck.’ Those insights could make our web portal and apps much easier to use in future.
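The kind of drop-out analysis Mueller describes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the team's actual method: the `drop_off_rates` function, the page names, the `goal` parameter and the session data are all hypothetical, and a production version would presumably run over far larger activity logs on the Hadoop cluster.

```python
from collections import Counter


def drop_off_rates(sessions, goal="listing"):
    """For each page other than the goal page, return the share of its
    page views that were the last page view of a session.

    A session that ends on the goal page counts as a success,
    not a drop-out. Names and data layout are illustrative assumptions.
    """
    visits = Counter()
    exits = Counter()
    for pages in sessions:
        if not pages:
            continue  # skip empty sessions
        visits.update(pages)
        if pages[-1] != goal:
            exits[pages[-1]] += 1
    return {page: exits[page] / visits[page] for page in visits if page != goal}


# Hypothetical sessions: each is the ordered list of pages one visitor saw.
sessions = [
    ["home", "search", "listing"],
    ["home", "search"],
    ["home", "search", "filters"],
    ["home", "search", "filters", "listing"],
    ["home"],
    ["home", "search", "filters"],
]

rates = drop_off_rates(sessions)
# Print the pages where visitors most often give up, worst first.
for page, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{page}: {rate:.0%}")
```

In this made-up data the `filters` page loses the largest share of its visitors, which is the sort of pain-point a product development team could then investigate.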

The Cloudera Enterprise proof-of-concept will begin with six nodes - a typical trial-size cluster - and it should only take four or five months to get up and running, with one or two bespoke analytic applications running on top, she predicts.

We’re an SME [small and medium-sized enterprise] with 300 employees so we can move pretty fast. Nothing needs to go through five or six layers of approval. And we also have an agile development plan that allows us to deliver new software in two-week intervals.

We’ve already started to deploy and we’ll be getting the first data insights within the next few months. Our expectation is that, once [Cloudera] is installed, the thinking on the applications themselves and what we want to do with those applications will develop quite quickly, as business units start to realise what we can achieve with the new capabilities and begin to suggest their own ideas. These things can absolutely grow very speedily if you can show business value quickly.

Demand for business intelligence reports at meinestadt.de has never been greater. The Hadoop experiment in real-time big data, Mueller hopes, will help her team respond faster to the deluge of requests it receives. She concludes:

Reporting needs are such in our business that [managers] don’t want to wait another day for our data, they’d like it hourly. In many cases, they’d like it even sooner. And especially on the operational side, real-time information is increasingly important. So that’s our big focus for the next few months: real-time predictive analysis and real-time recommendations.
