NHS data sold to consultants and uploaded to Google servers – Twitter explodes

Derek du Preez, March 2, 2014
Privacy experts are going to run with this, but we need to take a more balanced look at what the benefits and risks are

Another day, another NHS data debacle. It was only last week that NHS England landed itself with a serious PR disaster after it handed a contract to build an already controversial patient information database to Atos – which is currently under fire for the handling of a separate contract to carry out health checks for citizens claiming out-of-work benefits.

Now it has emerged that management consultants PA Consulting have bought “the entire start to finish HES [hospital episode statistics] dataset across all three areas of collection – inpatient, outpatient and A&E”, which is essentially a database that collects patient information across all NHS Trusts in the UK. Not only was the data handed to a commercial organisation, PA Consulting then uploaded the entire database onto Google's servers in the US to understand how “cloud can transform the way the NHS collects and uses data”.

This was revealed on Twitter by Tory MP Sarah Wollaston, who has a prominent position on the influential health select committee in the UK.

Cue privacy campaigners, political commentators and the general public going into Twitter meltdown (one witty user even sent Google a question asking for the results of her pregnancy test!). This reaction is to be expected in a post-PRISM world, where even if enterprises are not particularly fazed by NSA snooping allegations (see Stuart's piece on data in the Euro region), the general public has serious concerns. This is particularly true when discussing health data.

So what do we know?

PA Consulting's report on how it was using the data outlines that the company has an existing relationship with Google and as a result decided to upload the NHS data to the internet giant's cloud using tools such as Google Storage and BigQuery. The benefits for PA were clear:

“We found that queries that took all night on our servers were returned in under 30 seconds using BigQuery. This was the performance on the raw uploads with no optimisation. This stunning improvement in speed applied even to more sophisticated analysis.

“Within two weeks of starting to use the Google tools we were able to produce interactive maps directly from HES queries in seconds. In the old days it would have taken more than a month to produce just one clever map.”

PA Consulting said that these “exceptional results” should make a huge difference to users, where the slowness of the current process “severely limits the quality and number of interesting results”. It also believes that the analysis should be highly influential in changing the way the NHS budget is allocated. The report states:

“By the standards of IT projects, this could be easy to achieve as it would not require a big upfront commitment of capital.”

Whilst the public backlash gets into full swing, HSCIC (Health & Social Care Information Centre), the body responsible for signing the deal with PA Consulting, has issued a statement saying that no Google staff would have been allowed to access the data.

“The agreement obliged PA Consulting to abide by conditions to protect the confidentiality of the data, including restricting the data to a named list of individuals, a prohibition on sharing any information with the risk of identifying individuals and a requirement to destroy the data after the agreement end date.

“The NHS IC had a written confirmation from PA Consulting prior to the agreement being signed that no Google staff would be able to access the data; access continued to be restricted to the individuals named in the data sharing agreement.”

So on the upside, PA Consulting didn't go rogue and carry out the deal with Google without the HSCIC knowing, and there seem to have been controls in place governing how the data was used and who had access to it. However, this is still unlikely to be much comfort to those concerned by Snowden's allegations, where controls and agreements seem to have been ignored by most.

PA Consulting's line is...

"Over the past two years we have run a project to show the NHS how insight can be quickly and cost-effectively generated from large volumes of health data, enabling better care for patients. PA purchased the commercially available Hospital Episode Statistics dataset from the Health and Social Care Information Centre. The dataset does not contain information that can be linked to specific individuals and is held securely in the cloud in accordance with conditions specified and approved by HSCIC. Access to the dataset is tightly controlled and restricted to the small PA project team.  

"Our new approach to extracting insight from large volumes of data can help the NHS improve patient care. We have shown where services are needed most by patients and identified previously unseen side effects of drugs and treatments.  Our approach protects patient confidentiality and allows insights to be derived at significantly lower cost, and a hundred times faster, than any traditional approach."

Google needs to prove itself

I spoke to Charlotte Davies, Ovum's lead analyst on healthcare life sciences, who was equally surprised by the revelations. Davies believes that although a lot of the recent concern about extracting patient data from GP systems could be avoided by better marketing by the NHS, better opt-out options for the public and a clear campaign explaining the benefits – a commercial organisation handing public health records to Google is a step too far.

She stated that although cloud computing will play a part in NHS data management in the future, more needs to be done to understand how population-level data can be protected.

“I don't think Google has proven itself enough on having the capabilities to protect healthcare data. That's the pendulum going completely the other way - PA Consulting purchased the data, so people will want to know who really gains value from this? On paper [the database facing controversy elsewhere] is within national boundaries and can help the UK become a centre of excellence for health. But this looks like it might be about commercial interests and it is a shoddy approach to data sovereignty.

“It's not a case of saying you can't put data in the cloud – to my mind it's just another example of a hasty move and I am not surprised by the privacy backlash. The public are going to focus on the way commercial companies make themselves money. You have to have the involvement of government and national bodies with this sort of work.”

My take

I take a balanced view on this. I don't think we should rule out Google and the use of cloud computing to store and analyse health data, if there are real benefits to be had and the proper controls are put in place. PA Consulting seemed to find that there could be not only significant cost and time savings for the NHS using cloud, but that the use of tools provided by the likes of Google could also deliver enhanced products for Trusts and the health industry.

But equally I don't agree with selling population-level health data to a management consultancy firm, which then carries out work with a cloud computing company – especially one based in the US. The interests of the firms may outweigh the interests of the general public and that could lead to a number of risks. I think if the NHS wants to use cloud computing, then it should do it. But the NHS should work directly with the companies that it thinks are best suited to the interests of the public, and they should go through the appropriate channels (possibly the G-Cloud?), where they know that the correct certifications are in place and controls are used to ensure the best possible protection of privacy.

Finally, do all this transparently. Explain to Trusts, the public, the media, the companies involved what the benefits are and how the data is being protected. Having a story like this exposed by a Tory MP and then written up by the Guardian is only going to set back the use of cloud computing in this already very sensitive sector.
