Shoppermotion pinpoints customer locations and journeys in Hadoop


Spanish start-up takes big data approach to enabling retailers and consumer brands to identify in-store hot zones, blind spots and new opportunities to improve layouts.

Marco Doncel

Shoppermotion’s adventures in big data may have got off to a shaky start, but the Spanish start-up is now providing big-name retailers and brands with valuable information on in-store shopper behaviors and experiences – with much more to come, according to CTO Marco Doncel.

Working with customers that include IKEA, Carrefour and Unilever, Shoppermotion attaches Bluetooth tokens to shopping baskets and trolleys and installs sensors in the ceilings of stores. This IoT-based approach enables the firm to analyze the routes that shoppers take through a store, how long they spend in each aisle and where they tend to linger on their journeys. That information is delivered to retailers via a customized analytics dashboard, enabling them to identify hot zones, blind spots and new opportunities to improve store layouts.

The amount of data involved is a challenge, says Doncel. Each trolley or basket-based token sends a signal to the ceiling-based sensors every second, along with telemetry information that allows a shopper’s exact location to be calculated. For each store with which Shoppermotion works, that can amount to some 10 gigabytes of data every single day.
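To make that kind of analysis concrete, here is a minimal Python sketch of how dwell time per zone and a ranking of hot zones might be derived from one-per-second pings. The record layout (token ID, timestamp, zone), the zone names and both function names are illustrative assumptions, not Shoppermotion’s actual schema or code, and the sketch assumes each ping has already been mapped to a store zone.

```python
from collections import defaultdict


def dwell_seconds_by_zone(pings, interval=1):
    """Sum the time each token spends in each zone.

    pings: iterable of (token_id, timestamp_seconds, zone) tuples
           (hypothetical layout, assumed for illustration).
    interval: assumed seconds between pings (one per second in the article).
    """
    totals = defaultdict(lambda: defaultdict(int))
    for token_id, _timestamp, zone in pings:
        # Each ping counts as one interval spent in the zone it was seen in.
        totals[token_id][zone] += interval
    return {token: dict(zones) for token, zones in totals.items()}


def hot_zones(dwell, top_n=3):
    """Rank zones by total dwell time across all tokens, longest first."""
    combined = defaultdict(int)
    for zones in dwell.values():
        for zone, seconds in zones.items():
            combined[zone] += seconds
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Made-up sample data: two tokens, a few seconds each.
pings = [
    ("trolley-1", 0, "entrance"),
    ("trolley-1", 1, "dairy"),
    ("trolley-1", 2, "dairy"),
    ("basket-7", 0, "entrance"),
    ("basket-7", 1, "bakery"),
]
dwell = dwell_seconds_by_zone(pings)
ranking = hot_zones(dwell, top_n=2)
```

Zones that rarely or never appear in such a ranking would be candidates for the blind spots the dashboard is meant to expose.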

Early in Shoppermotion’s history, it became clear that a Hadoop-based approach to big data made sense. But the company’s initial work with the Hortonworks distribution of Apache Hadoop quickly ran into problems, as Doncel explains:

It was OK at first, but we weren’t able to scale up as fast as we wanted, and for a start-up, it’s so crucial to be able to focus on the important task of growing your company. I realised we were losing huge amounts of time to maintenance, configuration, adding new nodes to the cluster. That was awful, because many times, it was just me on my own, struggling with these issues.

After some time, we got in contact with Cloudera and the company happened to have a team close to us in Spain. We began working with them very closely and found we got awesome support and advice.

Shoppermotion got a new cluster up and running in pre-production mode, this time using Cloudera’s Hadoop distribution, so that it could test performance and compare it with Hortonworks. Doncel recalls:

It wasn’t just faster, it was also easier to scale up. What really struck me was the number of hours my team was now dedicating to the health of the cluster – it was reduced by about 50%, so we freed up a lot of time to spend on developing new features for the core Shoppermotion product, which is obviously good news for our customers. So we made the decision, ‘OK, we’ll continue with Cloudera.’ If I’m honest, it was an easy decision to make.


Today, Shoppermotion runs most of its core analytics on Spark, running on top of Hadoop. Spark ingests the data collected in Kafka, a platform for building real-time data pipelines and streaming apps. At night, Spark-based batch processes aggregate the analytics, and Spark’s MLlib (machine learning library) is used for more sophisticated analysis. All the data is stored in HBase and, once a week, backed up to Amazon S3 using Flume.

With his team of three developers now spending only one-fifth of their time on core maintenance of the cluster, Doncel is confident about Shoppermotion’s ability to move into providing new analyses to customers:

We might, for example, be providing more sophisticated recommendations to them. After all, when you have those data volumes, you’re able to cluster thousands of shopping experiences in different ways. We can see for instance if someone enters a store at a specific time of day and heads straight to the processed foods, it’s probably only going to be a short trip to the store. Or, if someone lingers somewhere else, maybe they’ll spend longer shopping.

Either way, we can start to predict when they’ll go to the checkouts, how quickly lines are likely to build up and the retailer has the chance to get an alert and open new checkouts. We’re already working with a client on that in machine learning mode.

Another thing we’re developing right now is on ways that retailers can send promotions and offers to customers. So if the customer has a loyalty card and we know they stop for five minutes in front of a Samsung Galaxy 8 in the technology department of a hypermarket, we can send them a special discount on Samsung products, for example.
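The rule Doncel describes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the company’s implementation: the function names, the (timestamp, zone) ping layout and the 300-second default threshold (taken from the five minutes in the quote) are all hypothetical.

```python
def longest_stay(pings, zone, interval=1):
    """Longest continuous stay, in seconds, that one shopper spends in `zone`.

    pings: time-ordered (timestamp_seconds, zone) tuples for a single shopper
           (hypothetical layout, assumed for illustration).
    interval: assumed seconds between pings.
    """
    longest = current = 0
    for _timestamp, seen_zone in pings:
        # Extend the current stay, or reset it when the shopper is elsewhere.
        current = current + interval if seen_zone == zone else 0
        longest = max(longest, current)
    return longest


def should_send_offer(pings, zone, has_loyalty_card, threshold_seconds=300):
    """True if a loyalty-card holder lingered in `zone` for the threshold or longer."""
    return has_loyalty_card and longest_stay(pings, zone) >= threshold_seconds


# Made-up example: five uninterrupted minutes in the technology department.
five_minutes = [(t, "technology") for t in range(300)]
send = should_send_offer(five_minutes, "technology", has_loyalty_card=True)
```

Requiring a continuous stay, rather than total time in the zone, is a design choice made for this sketch; a shopper who passes through the department several times would not trigger the offer.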

Shoppermotion may only be a start-up, and still less than five years old, but its impressive customer list is paying dividends in terms of bringing new companies into the fold. Doncel predicts:

Once you have those big names, others follow. We’re providing retailers with valuable research that they just can’t get elsewhere, and that’s something other retailers and brands are rapidly getting interested in.

Image credit - Shoppermotion