Mercedes - blocking out the data 'noise' to win the decision-making race
- Summary:
- For an F1 racing team, the smallest incremental gains in overall speed are often the difference between success and failure. Those gains come from understanding what the vast volumes of data the cars now generate actually mean.
But what may not be so obvious to many of the sport’s millions of fans worldwide is just how important the speed of handling data – huge volumes of it, in as near real time and as globally distributed as it is possible to get – has become to achieving the marginal improvements which are now the key to success or failure.
To get the best performance from the racing cars – which in the view of Matt Harris, head of IT for Mercedes AMG Petronas, is the equivalent of racing a brand new prototype specifically engineered for every race during the year, and doing it in front of a TV audience of several million viewers – now involves getting the best decisions possible from that vast pile of historical and real-time data in the shortest possible time. Harris says:
Every decision made around data rather than gut feel means that we can improve our performance overall as a team. What we are doing with Tibco is to see how we can improve our understanding of our data faster, and not have to look at so much of it. The important thing is to know when you have, or have not, missed something important.
The reference to Tibco is to the partnership between the two companies that was announced in April. At the time, it looked as though this could be about much more than front-of-house brand recognition for Tibco, and rather about both parties testing their technologies at the bleeding edge of their operational capabilities.
For example, Harris’ statement about knowing whether or not important data has been missed points to a fundamental roadblock in all efforts to exploit Big Data analytics – the Big Data itself.
What is normal?
The important aspect, as Harris points out, is building the ability to know what data can be ignored and understanding why it can be ignored. What is 'normal' for any given situation? What constitutes an anomaly, and why is it one? Answering those questions lets the team tune out the noise of normal data that says nothing except that things are still OK:
Our issue is finding what is 'normal' for the car so we can look at the anomalies that might hide a problem, or point to a possible improvement. We need to be able to predict what might happen and what new approaches the team might try.
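To make the idea of 'normal' concrete, here is a minimal sketch of one common way to separate anomalies from noise: compare each new reading with a rolling baseline of recent readings and flag only those that sit far outside it. This is an illustration, not a description of what Mercedes or Tibco actually run. The function name, the window size, the threshold and the sample temperatures are all assumptions made for the example.

```python
from statistics import mean, stdev


def find_anomalies(readings, window=20, threshold=3.0):
    """Return (index, value) pairs that sit far outside the recent baseline.

    Hypothetical helper: 'normal' is taken to be the mean of the previous
    `window` readings, and a reading is flagged when it lies more than
    `threshold` standard deviations away from that mean.
    """
    anomalies = []
    for i in range(window, len(readings)):
        baseline = readings[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # A perfectly flat baseline: any change at all is a deviation.
            if readings[i] != mu:
                anomalies.append((i, readings[i]))
            continue
        if abs(readings[i] - mu) / sigma > threshold:
            anomalies.append((i, readings[i]))
    return anomalies


# Made-up sensor trace: small regular variation plus one spike at index 30.
temps = [90.0 + 0.1 * (i % 5) for i in range(40)]
temps[30] = 97.5
print(find_anomalies(temps))
```

Everything inside the baseline is ignored, so only the spike is reported. A real system would need something more robust than a rolling mean, since an outlier inflates the baseline for the readings that follow it.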
While it is obvious how and why such capabilities are required when analysing the technologies that go into the car, Harris has realised that the same approaches can be applied to wider areas of operation of the business as well. Every F1 team is limited to a maximum of 60 people at each race, so once the team understands what is 'normal' for the car, those limited trackside resources can be deployed more effectively on identifying what isn’t normal, and what the impact of any abnormality might be.
Here, the capabilities of the underlying IT come into play, for this is not only about the analysis of real-time data generated at the track, but also about comparison with historical data back at the factory. So high-speed global communications capabilities, cloud services and the rest play an equal part in generating success or failure.
This also allows Mercedes to exploit external resources beyond its own – in this case a team of data scientists from Tibco itself:
That is really very important to us. They are able to become a virtual part of the team for us. We have already shared a couple of sets of data, without telling the scientists about them, and asked them to tell us what they can about the data and what they understand from it. One of them contains gearbox data, where the idea is to model gearbox wear metrics. This can then be used to determine when gearboxes should be changed.
As an example, during the famed Monaco Grand Prix, each car changes gear 200 times per lap, and each change is made under different loadings of acceleration, deceleration and many other factors. Each car is also only allowed to use five gearboxes per season without facing a start grid penalty. When each gearbox is used is then a decision for the team, based on its assessment of the demands made by each race.
So a Monaco gearbox might break in a subsequent race demanding heavy use, while it might readily survive a race with different characteristics and use patterns. This will also vary with the different styles of each driver, as each has their own set of engines, gearboxes, tyres and the rest, which must not be mixed between them.
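The gearbox decision can be pictured as a running wear budget. The sketch below is purely illustrative: it assumes each gear change consumes an amount of life that grows with the load at the moment of the shift, and that those amounts simply add up. The `Shift` class, the reference torque, the exponent and the budget figures are all hypothetical, and the real wear metrics being modelled by the Tibco data scientists are not described in this article.

```python
from dataclasses import dataclass


@dataclass
class Shift:
    torque_nm: float  # hypothetical load on the gearbox during the change


def wear_for_shift(shift, reference_torque_nm=400.0, exponent=3.0):
    """Wear units consumed by one gear change (assumed power-law in load)."""
    return (shift.torque_nm / reference_torque_nm) ** exponent


def remaining_life(shifts, wear_budget):
    """Budget left after the recorded shifts, assuming wear simply adds up."""
    return wear_budget - sum(wear_for_shift(s) for s in shifts)


def can_survive(shifts_so_far, expected_shifts, wear_budget):
    """True if the gearbox should last through the expected extra shifts."""
    return remaining_life(shifts_so_far + expected_shifts, wear_budget) >= 0


# A gearbox that has done many light shifts, then faces a heavier race.
used = [Shift(400.0)] * 50          # 50 wear units consumed
light_race = [Shift(400.0)] * 40    # 40 more units
heavy_race = [Shift(800.0)] * 10    # 80 more units (8 per shift)
print(can_survive(used, light_race, wear_budget=100.0))  # True
print(can_survive(used, heavy_race, wear_budget=100.0))  # False
```

The same gearbox passes one race and fails the other, which is the point made above: what matters is the pattern of use, not just the number of changes.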
The volume of data generated by the cars is huge, with some 45 TBytes of data being generated every week from races, testing sessions, simulators and computational fluid dynamics analysis of the car’s aerodynamics. There are 200+ sensors per car in race trim, each dumping data to the pit crew each lap.
That data also goes back to the factory, and the processing cycle is such that, should an anomaly be discovered, an analysis should be ready and returned to the pit crew – wherever they are in the world – before the next lap is completed. That usually means around 1 minute 25 seconds to get the data sent to the factory, analysed and an answer transmitted back to the track. Harris explains:
Some sensors also have multiple datapoints, so across a season we are looking at hundreds of millions of datapoints. In fact we currently don’t know the exact number, we haven’t been able to calculate it. But we are looking for a needle in a stack of needles, not a haystack.
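The one-lap turnaround described above is, in effect, a time budget shared between three stages: sending the data to the factory, analysing it, and sending the answer back. A small sketch of that arithmetic follows. The 85-second figure is the "around 1 minute 25 seconds" quoted earlier; the individual stage timings are invented for illustration.

```python
LAP_BUDGET_S = 85.0  # "around 1 minute 25 seconds", as quoted in the article


def fits_lap_budget(uplink_s, analysis_s, downlink_s, budget_s=LAP_BUDGET_S):
    """Return (fits, slack_s) for a track -> factory -> track round trip.

    The stage timings passed in are hypothetical; only the overall budget
    comes from the article.
    """
    total = uplink_s + analysis_s + downlink_s
    return total <= budget_s, budget_s - total


print(fits_lap_budget(5.0, 60.0, 5.0))    # (True, 15.0)
print(fits_lap_budget(10.0, 80.0, 10.0))  # (False, -15.0)
```

Seen this way, every second spent moving data around the world is a second taken away from the analysis itself.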
Doing more with the same
The partnership between the two companies is still very much in its early, proof-of-concept days, but looks to be on a fast ramp within Mercedes. This is because other areas of the business are looking at how analysis tools can be applied to their own work, and are starting to experiment, says Harris:
The Tibco guys asked our people if they had done any courses, but they hadn’t. But it is actually not that difficult for clever guys to understand.
So as well as detailed technological analysis, the Tibco tools are likely to soon find themselves supporting race strategy decisions, such as using the lap timings and car GPS position data – available to all teams about all cars – to help determine when pit stops should occur, or how strategy should change when the safety car is deployed. They are also being targeted at conducting regression testing on the specialist applications the company itself develops.
In all these areas, machine learning capabilities will play a part so that the tools learn what is normal and what is an anomaly:
The objective with the Tibco tools is to start making decisions that are based on the exceptions, and to get rid of the data noise. And this can be used in many other areas, such as discovering cyber attacks.
Whatever one thinks of F1 as business or sport, it is a microcosm of the fundamentals of most business activity, viewed under the microscope of 100 million or more TV viewers around the world. The most important task for each team is the prevention of a DNF (Did Not Finish), for that means the removal of any chance of gaining points in either the Constructors’ or Drivers’ championships.
That does pose a side issue as to whether the analysis tools can be applied by Mercedes to monitor its rivals more closely. Harris’ answer is, perhaps, an object lesson in how businesses can be suckered into a world of over-analysis of too many irrelevant variables – another contributor to the data 'noise':
We sort of do some of that, but if we worry about that then we are not worrying about what we should be doing. It is all about those incremental gains. But it is also about removing levels within the company structure. We have to work on the basis that if anyone sees anything wrong, they should say it, and say it quickly.
My take
As someone who has watched F1 since childhood, it is clear to me that the incremental gains are now the important targets. And while it may not be the spectacle of daring and machismo it used to be, it is an increasingly important demonstration of why microseconds are a great analogy for what is now important for so many other areas of business as well.