An independent review into how the NHS can deliver better, broader, safer use of healthcare data for analysis and research has recommended that the health service embrace a ‘small number of secure platforms’ to ‘unlock all the untapped potential in NHS data’.
Interestingly, the review also recommends that NHS data policies actively acknowledge the shortcomings of ‘pseudonymisation’ and ‘trust’ as techniques for managing patient privacy, an approach that underpinned a string of failed data projects in recent years.
The findings have been published by Professor Ben Goldacre, a clinical researcher at the University of Oxford, who was tasked with the job at the beginning of last year. His final report includes a whopping 185 recommendations to the government.
The NHS has some of the most comprehensive and valuable health data in the world, collected over decades across various organisations and systems. However, collecting and parsing this data for valuable use has its challenges, from both practical and privacy viewpoints.
In recent years the government has attempted to collate wide-ranging datasets for analysis - through its care.data and GP data sweep initiatives - but most have been scrapped due to inadequate communication with the public and serious privacy concerns. Privacy campaigners have long argued that pseudonymisation - removing names and addresses from records - isn’t sufficient to protect people.
Instead, the review argues for the use of ‘Trusted Research Environments’: secure analytics platforms that should become the norm for all analysis of NHS patient records by academics, NHS analysts and innovators, unless those patients have consented to their data flowing elsewhere.
It says that there should be as few of these platforms as possible, with a strong culture of openness and re-use around all code and platforms.
The report states:
At present the system relies on multiple small data projects that do not join up, distributing large volumes of the same patient records to an uncountable range of very different sites for different projects and teams. This duplicates implementation costs, data preparation costs, governance costs, and risks; it fosters monopolies, and obstructs transfer of ideas and analyses between settings.
It obliges the system to rely excessively on weak security practices such as ‘pseudonymisation’ (removing names and addresses from detailed health records) without always acknowledging the shortcomings; and to build complex systems of governance, contracts and trust that can only manage the security risks inherent in data dissemination by acting in a slow and risk averse manner. This approach has arisen from decades of ‘getting by’: but it can never scale to the kind of access needed for a world leader in data science.
Instead, the report argues:
By investing in a coherent approach to data curation, and a small number of secure platforms, the nation can unlock all the untapped potential in NHS data. The full text of this review contains detailed background and practical recommendations, reflecting the technical complexity of this space. The high level recommendations below give an overview of the key risks and opportunities. The system should act now, starting with small teams of Pioneers to capitalise on existing pockets of excellence, building capacity and new ways of working in parallel to old approaches; after this, a full transition can come quickly.
This is a generational opportunity. We need a brief, rapier-like focus on platforms, creating teams and ideally institutions who are tasked solely with facilitating analytic work by other people. For less than the cost of digitising one hospital the system can have the secure data platforms and workforce needed to realise the full value of NHS data, driving research, health service improvement, and innovation. COVID-19 has brought fresh urgency: but future pandemics and waves may bring bigger challenges; and there were always lives waiting to be saved through better, broader, faster, safer use of NHS data.
As noted above, the review includes 185 recommendations, which are worth reading in full but are too many to list here. Some of the standouts:
Map all current bulk flows of pseudonymised NHS GP data, and then shut these down, wherever possible, as soon as Trusted Research Environments for GP data meet all reasonable user needs.
Ensure all code for data curation and analysis paid for by the state through academic funders and NHS procurement is shared openly, with appropriate technical documentation, to all data users. Data preparation, analysis and visualisation is complex technical work, requiring collaboration by many individuals, who may never meet, in a range of organisations, across the NHS and other sectors.
Bridge the gap between health research and software development: train academic researchers and NHS analysts in contemporary computational data science techniques.
Stop doing data curation differently, to variable and unseen standards, duplicatively in every team, data centre, and project: recognise NHS data curation as a complex, standalone, high status technical challenge of its own.
Embrace modern, open working methods for NHS data analysis by committing to Reproducible Analytical Pipelines (RAP) as the core working practice that must be supported by all platforms and teams; make this a core focus of NHS analyst training.
Have a frank public conversation about commercial use of NHS data for innovation, but only after privacy issues have been addressed through adoption of Trusted Research Environments.
Build impatiently, but incrementally, accepting that new ways of working are overdue, but cannot replace old methods overnight.
Commenting on the findings, Health and Social Care Secretary Sajid Javid said:
In some ways, health data is unlike other data. Concerns about privacy take on an even bigger life when it concerns our personal medical data. Moreover, the systems across the NHS and medical research can feel intimidatingly complex.
Yet in other ways, healthcare is more suited to data and the innovation that follows than almost any other sector – with the depth and coverage of NHS data providing unique opportunities. Navigating complexity can come with even greater gains, and the number of applications for medical data in health research are seemingly never-ending. The rewards of getting it right are profound, with not just lives saved but longer, healthier and happier lives too.
This report shows that we need to be as thoughtful as we are innovative, guided by safe ethical frameworks for providing access to data, as well as systems that ensure under-represented groups are well represented. It also makes clear that we have all the building blocks we need for success, including an unrivalled wealth of experience in using health data. However, it also shows areas where we must boost our capability and capacity if we are to reach our full potential.
This is the most comprehensive, thoughtful report on healthcare data I’ve seen produced by this government - one that takes into account the failings of the past and is honest about what is needed for effective data use in the NHS. It doesn’t shy away from discussions around privacy and potential commercial use of healthcare data, but puts user need and privacy at the heart of all recommendations. Whether or not it is taken seriously by the current administration remains to be seen, but it’s an effective blueprint that could pave the way for an exciting future for the NHS.