What do you do if your business needs to create a standardised approach to software development and a consistent model for data ingestion? That was the challenge that faced Barry Libenson, Global CIO at Experian, when he joined the financial data company in 2015.
Libenson – an experienced IT chief who has previously spent time in the retail, manufacturing and food production sectors – recognised the technology team needed to adopt a standardised approach to applications and data to support the business’s rapid growth. The firm’s 16,500 people work across 39 countries providing credit and analytics services to businesses and consumers.
Two key achievements during his time with Experian, says Libenson, have been the establishment of an application canvas and a data fabric. These two elements have helped create the consistency the business craves and provide a platform for future, digitally enabled growth.
Creating an application canvas
Libenson says the application canvas is the firm’s standardised software development methodology. This approach draws on a combination of containers, Red Hat’s OpenShift technology and DevOps tools. Libenson says this application canvas is best thought of as the kit his department provides to the firm’s software developers around the world so they're all building on the same platform:
One of the challenges three years ago was that if we needed to move resources around inside the company, it could be really challenging. It’s hard to take developers who have a specialism in one area and move them over to another team and have them be productive right away.
Libenson says these experts had to learn fresh coding standards or database dependencies as they moved between development techniques. Experian was keen to make it easier for developers. The firm has supported this shift through containers and the adoption of Red Hat’s OpenShift platform. The firm is also using the open-source system Kubernetes to help deploy and manage its containers. By getting everybody to use Red Hat's OpenShift platform, Libenson says the organisation can now use its application canvas to take a proactive approach to service creation:
I can now move developers across the organisation where we need them much more effectively than we were able to do three years ago. So, getting everybody on to the app canvas was a big deal. Any new products now are built using that technology. We on-boarded 700 of the development staff in the last 12 months and we're accelerating the pace at which we're doing that.
Libenson says software development through containers allows his IT professionals to tap into a range of cloud platforms. Rather than being tightly connected to a single provider, Experian’s developers create components that can be moved between cloud platforms to meet business requirements for new services. Developers at Experian can self-provision and stand up a new environment in a matter of minutes. Libenson says the application canvas approach is producing significant benefits and he has big plans:
I certainly hope to be able to say in two years’ time that instead of 700 that 2,000 of our developers are now on the application canvas, and are using open source and Red Hat OpenShift as their primary development platform. That aim might be a little tougher because, although all new product development is being done that way, we still have a whole slew of products that we support in the market that aren't necessarily going to be rewritten any time soon.
Designing a data fabric
Alongside his work on software development, Libenson says his other key achievement involves work on the firm’s data fabric. As diginomica has highlighted before, the firm’s data fabric is an architectural approach and a set of data services that provide consistent capabilities across endpoints and environments. The fabric now allows the company to run complex models and analytics. Libenson says a consistent approach to data is critical for digital businesses like Experian:
We live in a world where data accuracy is so critically important that three nines of reliability is not even close – we would be in a world of hurt in that scenario. It would mean one out of every 1,000 credit reports you produce could potentially be erroneous and that is totally unacceptable. So, the issue for us has been that an enormous amount of information gets pumped into our systems daily. We have roughly 20 thousand different data sources that we ingest on a regular basis, and some are more complicated than others.
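The arithmetic behind Libenson's "three nines" remark can be checked with a short sketch. The function names and the one-million-report volume below are illustrative assumptions, not Experian figures:

```python
from fractions import Fraction

def error_rate(nines):
    """Worst-case error rate implied by N nines of reliability.

    Three nines means 99.9% correct, i.e. an error rate of 1 in 10**3.
    """
    return Fraction(1, 10 ** nines)

def expected_errors(nines, reports):
    """Expected number of erroneous reports at that reliability level."""
    return reports * error_rate(nines)

# Three nines: one report in every 1,000 could be wrong.
print(expected_errors(3, 1_000))
# The same reliability across a hypothetical one million reports.
print(expected_errors(3, 1_000_000))
# Five nines cuts that hypothetical volume's errors by a factor of 100.
print(expected_errors(5, 1_000_000))
```

Each additional nine divides the expected number of erroneous reports by ten, which is why a level that sounds high in other contexts falls short for credit data.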
Historically speaking, the firm’s data was ingested linearly. Experian would receive a file from a large bank or financial institution and would start to read the data through a single pipeline. If the data didn't make sense, the system would stop processing until somebody could resolve the issue. The data fabric provides a new approach, using an Apache Hadoop cluster to create parallel versions of the ingestion process, setting up multiple engines instead of a single pipeline of information. Libenson says Experian can now process information quickly:
There’s an expectation from customers of not only accuracy but speed. We want to be as real-time as we can possibly be on the information we provide to our customers. The data fabric is much better at handling exceptions, so if one of the pipelines finds erroneous data, the data fabric will take and move aside a data element so that it can be handled as an exception and keep processing the rest of the pipeline.
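The pattern Libenson describes, parallel pipelines that set bad records aside instead of halting, can be illustrated with a minimal sketch. This is not Experian's implementation and does not use Hadoop; it uses Python's standard thread pool, and the record format (`account_id,balance`) and all function names are hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor

def parse_record(raw):
    """Hypothetical parser: expects a line of the form 'account_id,balance'."""
    account_id, balance = raw.split(",")  # raises ValueError on a wrong field count
    return {"account_id": account_id.strip(), "balance": float(balance)}

def ingest_partition(lines):
    """Process one partition, moving bad records aside rather than stopping."""
    good, exceptions = [], []
    for line in lines:
        try:
            good.append(parse_record(line))
        except ValueError as err:
            # Quarantine the record for later review and keep going.
            exceptions.append((line, str(err)))
    return good, exceptions

def ingest(partitions, workers=4):
    """Run every partition through its own worker and merge the results."""
    good, exceptions = [], []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for part_good, part_bad in pool.map(ingest_partition, partitions):
            good.extend(part_good)
            exceptions.extend(part_bad)
    return good, exceptions

if __name__ == "__main__":
    partitions = [["a1, 10.5", "bad line"], ["a2, x", "a3, 7"]]
    records, quarantined = ingest(partitions)
    print(len(records), "records ingested;", len(quarantined), "set aside")
```

In the older linear model described above, the first malformed line would have blocked everything behind it. Here, each partition finishes regardless, and the quarantined records can be resolved separately.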
Libenson says Hadoop scales linearly and almost infinitely, something that is crucial given his firm’s huge set of data sources and its massive performance requirements. The effect on the data ingestion process has been transformative. By running engines in parallel and handling exceptions, the data fabric has helped Libenson slash data ingestion times:
A data set that used to take six months to ingest is now down to about six hours – and we could have got it down potentially to six minutes if we wanted to continue to increase the size of the cluster. We are kind of like the poster child for Hadoop – we do big, complex analytics models on data that gets augmented but that remains relatively fixed. In other words, it's a growing data model – people keep adding to their credit history. And we can run complex, large-scale analytical models.
CIOs often like to say data is the crown jewels of the business – in the case of Experian, it’s more than just a buzz phrase: information is absolutely critical to the work of the firm and its customers. Libenson has helped shape a nimble platform for application development and data use that can scale with demand. He has big plans for service creation in the coming months and it will be interesting to see how his growing development team realises them.