Data-driven, optimized driving strategies save US railroads millions

By Derek du Preez, September 23, 2015
Summary:
New York Air Brake is creating new machine learning platforms that use Splunk to better optimise driving strategies on the railroads, which can save millions of dollars.

Can an industry as old as the US railroads have a use case for action-based analytics? New York Air Brake, a manufacturer of air brake and train control systems that supplies approximately half the US market, is proving that optimized train driving strategies based on advanced data analytics save millions of dollars. The key to unlocking those savings is a fresh use case for Splunk products.

At Splunk's annual user conference this week in Las Vegas I spoke with Greg Hrebek, an R&D engineering director at New York Air Brake's Dallas facility. He explained that Splunk's operational intelligence platform, which analyses machine and behavioural data, is helping his company deliver railroad efficiencies that could not be realized by relying on driver experience and intuition.

New York Air Brake's tools range from driver assist, which delivers prompts and alerts, through to full automation. Hrebek said:

Traditionally an engineer (driver) would drive by the seat of his pants; he feels the buck of the train and he really doesn't know what's going on at the rear of his train. So what our control system does is model the in-train forces at every coupler, take that into account, and develop a driving strategy to minimize those in-train forces.

One of the benefits of creating a driving strategy for a train is that, since you have all this information – the topology, how long it is, how heavy it is, all this metadata – you can start doing things for fuel savings. That's the primary business case. There are some other benefits too, such as reducing wear and tear. But fuel savings is the most important thing, because if you look at the top three operating expenses for a railroad, it's labour, maintenance and fuel.

Union Pacific, for example, uses more fuel than the US Navy. And there are seven Class 1 railroads. So a one percent reduction in fuel can be multi-million dollars' worth of savings, even hundreds of millions of dollars. On average our product delivers a seven percent fuel reduction.

That's a lot of savings being delivered through the better use of data.
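A back-of-the-envelope sketch makes the scale concrete. The figures below are entirely hypothetical – the article quotes percentages, not fuel budgets – but they show why even a one percent reduction matters at railroad scale:

```python
# Hypothetical annual fuel spend for a large Class 1 railroad, USD.
# (Illustrative only; the article does not give dollar amounts.)
annual_fuel_spend = 3_000_000_000

def savings(spend: float, reduction_pct: float) -> float:
    """Dollars saved for a given percentage fuel reduction."""
    return spend * reduction_pct / 100

print(f"1% reduction: ${savings(annual_fuel_spend, 1):,.0f}")   # tens of millions
print(f"7% reduction: ${savings(annual_fuel_spend, 7):,.0f}")   # hundreds of millions
```

On a $3bn spend, the one percent case alone is $30m a year, which is consistent with Hrebek's "multi-million dollars" framing.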

Excel hell

New York Air Brake began using Splunk as part of the driver assist system, which issues prompts and alerts to help create better driving strategies. Hrebek explained that when proving the customer benefits, New York Air Brake typically installs the product on a train and lets it run for 45 days to create a baseline profile of the driver's behaviour. It then turns the system on and measures the impact of the technology, along with the savings delivered.
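The baseline-then-measure comparison described above can be sketched in a few lines. The per-trip fuel figures here are invented purely for illustration; the real measurement would draw on far richer telemetry:

```python
# Invented per-trip fuel burn (gallons) – illustrative only.
baseline_gallons_per_trip = [5200, 5100, 5350, 5250]   # ~45-day baseline period
assisted_gallons_per_trip = [4850, 4800, 4900, 4750]   # driver assist switched on

def mean(xs):
    return sum(xs) / len(xs)

# Percentage reduction of assisted burn relative to the baseline.
reduction_pct = 100 * (1 - mean(assisted_gallons_per_trip) / mean(baseline_gallons_per_trip))
print(f"Measured fuel reduction: {reduction_pct:.1f}%")
```

The same before/after calculation, applied across whole fleets and corrected for route and load, is what the Excel process described next used to produce by hand.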

That whole statistical process was traditionally done in Excel – Hrebek called it "spreadsheet hell." Splunk proved it could quickly and accurately turn this data into actionable insights. He said:

It was very prone to human error. And when we introduce this technology to a railroad there is a behavioural aspect. A lot of these guys have been driving their trains for 15 to 20 years; they don't want a computer telling them what to do. Obviously, if you don't use the technology, you're not going to get the benefit. We need to look at the driver's level of compliance versus fuel savings and put that story together.

Doing all that in Excel is prone to human error because of the different filtering we have to do for particular railroads. It was a three-week process. I was familiar with Splunk from previous roles, thought there had to be a better way of doing this, and figured Splunk could probably help.

Our initial business case for Splunk was around bringing in all the train data we collect and quickly generating reports for customers around these business cases. Rather than an engineer having to go through and manually filter things out, I could type a query and filter the results.
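The shift from manual spreadsheet filtering to a typed query can be illustrated with a stand-in in plain Python. The field names and records below are invented; in practice this would be a Splunk search over indexed train logs:

```python
# Invented records standing in for indexed train telemetry.
records = [
    {"railroad": "UP",   "driver": "A", "compliance": 0.92, "fuel_saved_pct": 7.5},
    {"railroad": "UP",   "driver": "B", "compliance": 0.40, "fuel_saved_pct": 2.1},
    {"railroad": "BNSF", "driver": "C", "compliance": 0.85, "fuel_saved_pct": 6.8},
]

# The "query": filter by railroad, then report compliance vs savings,
# replacing what used to be per-railroad manual filtering in Excel.
up = [r for r in records if r["railroad"] == "UP"]
for r in up:
    print(f'{r["driver"]}: compliance {r["compliance"]:.0%}, saved {r["fuel_saved_pct"]}%')
```

The point is not the three lines of filtering but that the query is repeatable – run it for any railroad and the same compliance-versus-savings story falls out without human error.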

Creating hyper-local strategies

Hrebek quickly realized that Splunk's capabilities could be used for critical analysis beyond data collection and report delivery. Splunk has a strong play in behavioural analysis, which is typically used for security monitoring, but it can be applied to other areas, like driving strategies. Hrebek said:

There is a lot of tribal knowledge in the railroads. For example, if you are coming down a hill and there is another hill on the other side of the bank, you might need to overspeed by two miles an hour. Otherwise you are not going to make it up the other side of the hill. The drivers know this. They're allowed to do it. But if there is a rule on a computer, the computer won't violate it. Using behavioural analysis tools to detect those anomalies and then having our engineers investigate them helped us rapidly improve the driving strategy and tune it for that particular customer.
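The anomaly detection Hrebek describes amounts to spotting locations where drivers consistently deviate from the computed strategy. A minimal sketch, with invented speed samples and a made-up flagging threshold:

```python
from collections import defaultdict

# (milepost, strategy_limit_mph, actual_mph) samples pooled across trips.
# Milepost 112 shows a habitual small overspeed before a grade – the kind
# of "tribal knowledge" an engineer would then investigate.
samples = [
    (112, 45, 47), (112, 45, 47), (112, 45, 46),
    (200, 60, 59), (200, 60, 60),
]

overspeeds = defaultdict(list)
for milepost, limit, actual in samples:
    if actual > limit:
        overspeeds[milepost].append(actual - limit)

# Flag mileposts with repeated overspeed observations (threshold is arbitrary).
flagged = {mp: diffs for mp, diffs in overspeeds.items() if len(diffs) >= 3}
print(flagged)
```

A consistently flagged milepost is a candidate for tuning the strategy – e.g. permitting the two-mile-an-hour overspeed the drivers already knew was needed.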

New York Air Brake is going further and will use collected data to introduce machine learning to refine driving strategies by including parameters such as weather conditions. The idea is that New York Air Brake can both localize and personalize driving strategies. The expectation is greater savings for customers, which creates a competitive advantage for New York Air Brake.

Splunk will become a de facto component by Q1/Q2 next year. Splunk has dramatically increased our ability to do the application work – to troubleshoot and investigate issues. We have decreased the number of issues we find in the field because we can catch them from the lab, rather than customers reporting them. We have taken all this information and our product and converted it into a Linux-type platform. We can put this in the cloud and start to apply machine learning to our driving strategies.

We will take information from the field into Splunk and normalise it; that will create a dataset, a single source of truth. We will take that information and generate config files for our system. Using Splunk's query language, we will pipe that data to a virtual machine, run a simulation, create the physics of the train, which we can then correlate to the log files we had, and then create a new dataset.
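The pipeline Hrebek describes – normalise field logs, derive config, simulate, correlate – can be sketched as a chain of functions. Every function below is a hypothetical placeholder, not New York Air Brake's actual code:

```python
def normalise(raw_logs):
    """Field logs -> canonical records: the 'single source of truth'."""
    return [dict(rec, speed_mph=round(rec["speed_mph"], 1)) for rec in raw_logs]

def generate_config(dataset):
    """Derive system config parameters from the normalised dataset."""
    avg = sum(r["speed_mph"] for r in dataset) / len(dataset)
    return {"target_speed_mph": round(avg, 1)}

def simulate(config):
    """Stand-in for the virtual-machine train-physics simulation."""
    return {"predicted_speed_mph": config["target_speed_mph"]}

# Field data in -> normalised dataset -> config -> simulation out,
# which would then be correlated back against the original logs.
raw = [{"speed_mph": 44.96}, {"speed_mph": 45.04}]
dataset = normalise(raw)
result = simulate(generate_config(dataset))
print(result)
```

The correlation step – comparing simulated behaviour against the original logs to produce a new dataset – is what closes the loop and makes the output usable as machine learning training data.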

We can then do more interesting things, like feed it to a machine learning system. That not only creates an automation tool for analysis that is going to power our next generation of products, but also lets us create deeper insights – driving strategies that aren't generic, but specific to that train on that particular day, based on anticipated environmental conditions. For example, we have a config file of 400 parameters that defines the driving strategy. We have some generic types of trains, and the driving strategy is generic for those.

But wind, for example, plays an important part in train handling, so being able to go out and get the wind direction at every milepost and feed that back into a machine learning system that also knows all those other parameters can create a driving strategy for that train on that day. That's where we are going: hyper-local driving strategies.
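The hyper-local idea can be illustrated by adjusting one parameter of a generic strategy per milepost using wind data. All names, numbers and thresholds below are invented; the real config has 400 parameters and a trained model rather than hand-written rules:

```python
# One invented parameter from a hypothetical generic strategy.
generic_strategy = {"throttle_notch": 6}

# Invented wind observations: milepost -> (direction, speed mph).
wind_by_milepost = {100: ("headwind", 18), 101: ("tailwind", 12), 102: ("headwind", 5)}

def localise(strategy, wind):
    """Per-milepost plan: bump throttle into a strong headwind, ease off with a tailwind."""
    plan = {}
    for mp, (direction, mph) in wind.items():
        notch = strategy["throttle_notch"]
        if direction == "headwind" and mph > 15:
            notch += 1
        elif direction == "tailwind" and mph > 10:
            notch -= 1
        plan[mp] = notch
    return plan

print(localise(generic_strategy, wind_by_milepost))
```

In the machine learning version, these hand-written rules would be replaced by a model that has learned the wind adjustment (and 399 other parameter adjustments) from the correlated simulation-and-log datasets described earlier.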