Public Health England - open source and containerisation key to tech agenda
Summary:
PHE is using Red Hat’s open hybrid cloud technologies to support modern digital public health services in the UK.
When it comes to public health, the more you can do in the lab first, the better, and better still if it can be done quickly and at scale - so you want to make sure the lab guys have the best tools possible.
That’s certainly the thinking over at Public Health England (PHE), an executive agency sponsored by the Department of Health and Social Care, whose job is to use world-leading science, research, knowledge and intelligence, advocacy, partnerships and specialist public health services to protect and improve the UK’s health and wellbeing, and reduce health inequalities.
To that end, the organisation issues ongoing reports and guidance on everything from ‘flu to cardiovascular disease, and from supporting the NHS on reducing antibiotic prescribing and informing the public about sugar in food and drink to using whole genome sequencing for rapid identification of potentially aggressive pathogens. High performance computing (HPC) is, unsurprisingly, seen as one of the major weapons in the PHE arsenal - and it’s a weapon now very firmly based on Open Source, according to its Head of HPC and Infrastructure, Francesco Giannoccaro.
Indeed, back in April, Giannoccaro and the rest of the internal PHE team went public with a bold move on just that basis - a decision to make extensive use of Linux, Open Source, and Red Hat software. As he explained of the organisation’s vision at the time:
Our ability to take better advantage of the opportunities created by Open Source technology is vital in our work of keeping the nation safe and healthy, and ensuring that public health principles are maintained and developed. And in turn, PHE’s success in implementing multi- and hybrid cloud is based on the success of the Open Source community and the Open Source model.
PHE’s vision was to pursue an information and communication technology strategy that embraced modern computing architectures and solutions, including high-performance computing (HPC) and multi-cloud operations. It also wanted to evolve an open, automation-centric ecosystem that could integrate a fragmented, proprietary set of existing systems.
That fragmentation stems from the fact that PHE is actually a combination of something like 70 different bodies, brought together by the Government in 2013. Giannoccaro now says:
The challenge, of course, is then to bring that large number of organisations into one single research unit. From a technology point of view, we went through a number of obstacles, and at the start we had to be very much focused on supporting ‘Business As Usual’ - simple services like email, intranet, which were essentially all based on the Microsoft technology stack.
With integration achieved at the back office end, however, the next step is better support for the vital scientific work the organisation has to do to fulfil its mission, as Giannoccaro explains:
To deliver modern public services and make those services easily accessible, we saw that making use of the same cutting-edge technologies that are used by academia and research institutes around the world made a lot of sense. If you look at the top 500 supercomputing sites, a hundred percent of them run Linux now, but to ensure the easy access part of what we want to do, being based on open standards and cloud, and thus our work being as reproducible and easy to share as possible, was also a vital ingredient.
Since April, then, Giannoccaro and his team have been trying to deliver on that vision, building an open, scalable, enterprise-grade Linux platform allied with private cloud infrastructure to drive PHE’s HPC work.
Another priority is new virtualisation tools to host PHE’s existing applications without relying on proprietary tech, as well as hybrid/multi-cloud management and automation capabilities to host cloud-native workloads and applications.
As noted, PHE has been bedding in Red Hat Enterprise Linux to do all this, starting with its two main cluster farms, located in North London and Salisbury, complemented by other Red Hat software such as its infrastructure-as-a-service, storage, virtualisation and data centre management tools.
Contained
Another area of focus is containerisation, seen as key in improving data and code portability, reusability and data sharing, says Giannoccaro:
A container approach gives you a much greater level of foreseeability in terms of consistency from one environment, one platform to another, from private/on-premise cloud to public infrastructure. We’re working on tools to easily manage the scalability of those containers and also get a nice elastic way to allocate resources only when they’re needed, making those systems very cost-effective.
As to whether this means public health is safer, Giannoccaro won’t go that far - but he is happy to say important scientific work is certainly happening a lot faster than with his previous topology:
We analyse dangerous, potentially very aggressive, viruses, and we deliver these services to a number of hospitals around the country. The use of containers is all about making those pipelines run smoother, and our scientific applications be shared and very easily used by other organisations around the country. And the size of the datasets here, like the screening programme we had to run during, for instance, the Ebola outbreak, can be in the multi-Petabyte range. It used to take between ten and twelve hours to do something like that, whereas the scalability we get from Open Source means we can now perform this massive analysis in less than one hour or 90 minutes.
In terms of future steps, Giannoccaro’s priority is a significant reduction in the time PHE scientists need to start running tests, which he sees as depending on greater use of Machine Learning (ML):
Again, the Open Source community is very active in this space. We have started looking into ML frameworks such as TensorFlow and Kubernetes so as to create an environment able to use them in a very scalable and effective way. So in six months, what I would like to see is our scientists being able to use that level of Open Source technology in a very easy way in terms of our overall infrastructure here at Public Health England.