Brazil’s largest retailer automates data quality checks to improve decision making

By Gary Flood, May 3, 2022
The ‘Amazon of Brazil’, Lojas Americanas, says 51 million customers will benefit from the use of Soda’s data checking environment and SQL testing framework

In e-commerce, the ‘Garbage In, Garbage Out’ warning against making decisions based on inaccurate data always applies. With this in mind, the dominant retail and e-commerce player in Brazil, Lojas Americanas, has introduced automated data quality checking in order to improve operations and customer experience.

The aim is to make the discovery, prioritization, and resolution of data issues as easy as possible across all sales and customer data.

The tech is being applied to customer and sales data before it is ingested by the firm’s new Artificial Intelligence and Machine Learning digital support system, which now sits behind its huge physical and online sales processes.

The move will also enable simple data visualization across the firm to deliver ongoing operational efficiencies, the company’s head of Big Data, Analytics and AI, Tiago Andrade, says. He explains:

We are a very big company, so we have a lot of orders and a lot of customers buying at the same time. We need to ensure all the information that is going into the internal systems to deliver the product for the customer in time is always 100% accurate, and these days it's just not possible to do that level of real-time monitoring with people or any manual process.

Battle-tested with Black Friday

With 41,000 staff, Americanas serves the large Brazilian market across all of its 8.5 million square kilometers and 26 component States (the Federative Republic is the world’s fifth-largest country by area). Serving as a distribution platform, similar to Amazon, for 137 million items from over 120,000 third-party e-commerce vendors, the company also operates more than 3,500 physical shops and franchises, as well as 200 smart lockers in Brazilian gas stations, pharmacies, shopping malls and subways.

Because of this scale of operations, Americanas says it needed tooling to make working with all that data as frictionless as possible.

Andrade provides the example of a developer or a data analyst in the marketing team. To achieve better traffic acquisition, this individual has a budget which they need to use in the very smartest way to win (and keep) customers. He says:

These team members get a lot of channels you can spend this budget on - Google, Facebook, etc - and to keep their data flow running, they use the Americanas internal data platform to track how you’re spending this amount of money on Google, this amount of money on Facebook, etc.

But say an integration with Google breaks: now, that employee will instantly know there’s a data problem, as they will get an alert - 'Oh, your traffic acquisition integration with Google is broken. You need to fix it.’

In other words, Americanas wants all of its data analysts and operational staff to make decisions based on data that is as accurate as possible, in order to deliver on its brand promise of accurate Amazon-style product recommendations.

Another key driver, he says, was to ensure the right products always get sold to the right people, and to deliver a unified commerce experience.

The potential of this form of machine-supported data cleansing was first tested by a proof-of-concept pilot, set up for last November’s ‘Black Friday’ peak shopping period.

The proof of concept was convincing enough that the company says data-driven decisions using it could be trusted during what is one of the most important e-commerce sales periods.

As a result, this new data quality ‘filter’ is being rolled out enterprise-wide.

Checking for broken patterns in data

Andrade and his team are working with two tools from Belgian-headquartered data management specialist Soda. These are Soda Cloud, which offers data dashboards for holistic scans of data sets to surface issues, and Soda SQL, an open-source command-line environment.

Soda Cloud is the platform used by business analysts and line-of-business managers to get alerts when one of the data sets being used to improve sales in a specific region, or a specific sector, has errors in it. That means they can either raise it as a problem with their data engineers, or discard it from analysis.

Prior to automation, Andrade says, that either couldn’t be done at all, or would have required data engineers to be on hand 24/7 across 50 million-plus transactions a week. Soda Cloud is initially being offered to around 3,500 business intelligence users.

To complement that high-level analysis, Soda SQL is being used by developers to quickly test whether the data can be properly accessed by an SQL program, automating data quality checks that were previously done manually.
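
As a rough illustration of what an automated SQL data quality check can look like, here is a minimal Python sketch. It does not use Soda SQL's own configuration format or API, and it is not Americanas' implementation: the `orders` table, its columns, the check names and the helper functions are all hypothetical, chosen only to show the general idea of running SQL tests and raising an alert when a pattern is broken.

```python
import sqlite3

# Hypothetical checks: each name maps to a SQL query that counts offending rows
# (or, for duplicates, offending order_id values). The table and columns are
# assumptions made for this example only.
CHECKS = {
    "missing_customer_id": "SELECT COUNT(*) FROM orders WHERE customer_id IS NULL",
    "negative_amount": "SELECT COUNT(*) FROM orders WHERE amount < 0",
    "duplicate_order_id": (
        "SELECT COUNT(*) FROM ("
        "SELECT order_id FROM orders GROUP BY order_id HAVING COUNT(*) > 1)"
    ),
}


def run_checks(conn, checks=CHECKS):
    """Run every check and return {check_name: count} for those that found problems."""
    failures = {}
    for name, query in checks.items():
        (bad_count,) = conn.execute(query).fetchone()
        if bad_count > 0:
            failures[name] = bad_count
    return failures


def build_sample_db():
    """Create an in-memory database with a few deliberately broken rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            (1, 101, 59.90),   # valid row
            (2, None, 20.00),  # missing customer_id
            (3, 103, -5.00),   # negative amount
            (3, 104, 12.50),   # order_id 3 appears twice
        ],
    )
    return conn


if __name__ == "__main__":
    conn = build_sample_db()
    for name, count in run_checks(conn).items():
        print(f"ALERT: check '{name}' found {count} problem(s)")
```

In a setup like this, the checks would run on a schedule against the real warehouse, and any non-empty result would be routed to the team that owns the data set rather than printed.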

Used by business analysts and developers respectively, the two tools check for broken patterns in data to help speed up the solving of customer problems, such as late product delivery. Soda SQL is now available to 1,000 internal Americanas developers.

Decreasing data correction time

It is still early days for this new way of working at the company, but Andrade has specific ROI targets in mind. He says:

We have an internal metric that represents the time that teams spend handling data issues. It varies from team to team, but it could be up to 30%. And that 30% means you are not working on something creating value to the company.

Our desire is to decrease this to more like two percent to five percent. That means teams will be instead working on stuff that could bring more value to our customers, to our sellers, or even to our overall internal process. This is the most important thing we want to achieve in the next couple months.
