This is turning into a slow-motion political train wreck, with the care.data scandal and the revelation that the hospital episode statistics data sold to numerous companies contained patient postcodes and dates of birth, so the anonymity claims were simply false.
UK government departments and their agents store reams of personal information about us. They have to, to do their job.
That data is kept confidential. There are certain uses to which it can legitimately be put. Beyond that – verboten.
There are always poachers circling the game reserve. Most recently, it was Stephan Shakespeare. Then Tim Kelsey. And then David Gauke.
They all want to make more personal data available to researchers or entrepreneurs, to improve policy-making, to improve administration, to stimulate growth in the economy or to make medical breakthroughs.
It is questionable whether any of those objectives would be achieved.
Even if the case for releasing more personal data could be made, there remains the problem of privacy/confidentiality. And Messrs Shakespeare, Kelsey and Gauke all offer the same safeguard – anonymisation.
If the research data is anonymised, then people can't be identified, so their privacy isn't breached, no confidence has been broken. True? Or false? Does anonymisation work? Is your privacy safeguarded by it?
The answer isn't clear.
Messrs Shakespeare and Gauke both recognise that it isn't easy to anonymise people's personal data. You can remove all sorts of details from a file of research data in the name of anonymisation, and yet the data subject can still be identified by cross-referencing what's left against other files.
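To make the cross-referencing point concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the names, postcodes, dates of birth and diagnoses are invented, and the `link` helper is hypothetical rather than anything the taskforce or the departments use. It only shows that a file with the names stripped out can be matched back to people when both files share a postcode and a date of birth.

```python
# Hypothetical toy data: every name, postcode and diagnosis here is invented.
# "Anonymised" research extract: names removed, postcode and date of birth kept.
research_rows = [
    {"postcode": "AB1 2CD", "dob": "1970-03-14", "diagnosis": "asthma"},
    {"postcode": "AB1 2CD", "dob": "1985-11-02", "diagnosis": "diabetes"},
    {"postcode": "EF3 4GH", "dob": "1970-03-14", "diagnosis": "depression"},
]

# A second file that does carry names, for example a marketing list.
named_rows = [
    {"name": "Alice Example", "postcode": "AB1 2CD", "dob": "1970-03-14"},
    {"name": "Bob Example", "postcode": "EF3 4GH", "dob": "1970-03-14"},
]


def link(research, named, keys=("postcode", "dob")):
    """Return (name, diagnosis) pairs where exactly one research row matches."""
    # Index the research rows by the fields the two files have in common.
    index = {}
    for row in research:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)

    matches = []
    for person in named:
        candidates = index.get(tuple(person[k] for k in keys), [])
        # A single candidate means the "anonymous" row points at one person.
        if len(candidates) == 1:
            matches.append((person["name"], candidates[0]["diagnosis"]))
    return matches


reidentified = link(research_rows, named_rows)
print(reidentified)
# [('Alice Example', 'asthma'), ('Bob Example', 'depression')]
```

In this toy example, matching on date of birth alone identifies nobody, because two research rows share the same date. Add the postcode and both people are picked out.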
They both cite work done by the Administrative Data Taskforce to improve the reliability of anonymisation. Mr Shakespeare tells us that the Information Commissioner's Office is working on the same problem and so is the Office for National Statistics.
But are they getting anywhere? Or can your identity still be deduced by cross-referring anonymised data against other files?
Professor Martyn Thomas sounded a note of caution when he gave evidence to the House of Commons Science and Technology Committee a year ago, on 5 June 2013. "Anonymised research data" is an oxymoron, he said. If the data has really been anonymised, then it's no use for research, and if it is useful for research, then it can't have been anonymised.
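One rough way to see the trade-off Professor Thomas describes is to count how many records share the same identifying details, before and after those details are blurred. The sketch below is an illustration under stated assumptions, not his method: it borrows the idea of k-anonymity (the size of the smallest group of indistinguishable records), the records are invented, and the `smallest_group`, `exact` and `coarse` helpers are hypothetical.

```python
from collections import Counter

# Hypothetical records: (postcode, year of birth). All values are invented.
records = [
    ("AB1 2CD", 1970), ("AB1 2EF", 1971), ("AB1 3GH", 1974),
    ("AB1 4JK", 1978), ("EF3 4GH", 1970), ("EF3 5LM", 1979),
]


def smallest_group(rows, generalise):
    """Size of the smallest set of rows that look identical once generalised."""
    groups = Counter(generalise(row) for row in rows)
    return min(groups.values())


def exact(row):
    # Keep full detail: useful for research, but every row may be unique.
    return row


def coarse(row):
    # Blur the detail: first half of the postcode and decade of birth only.
    postcode, year = row
    return (postcode.split()[0], year // 10 * 10)


print(smallest_group(records, exact))   # 1: each person stands alone
print(smallest_group(records, coarse))  # 2: nobody is unique, but detail is lost
```

With full detail, each record here is unique and so open to the kind of cross-referencing described above. Blur the postcode and the birth year and nobody stands alone any more, but a researcher can no longer tell one street or one birth year from another.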
Then last month, on 4 April 2014, Professor Ross Anderson gave a lecture to the Open Data Institute (ODI) entitled Why anonymity fails. The ODI are obviously convinced by his arguments and describe the current travails of Tim Kelsey's care.data as a "slow-motion political train wreck".
So does anonymisation work or doesn't it?
Article 29 of the European data protection directive (95/46/EC) establishes a working party to monitor and update the directive. It published an opinion on anonymisation techniques on 10 April 2014 (hat tip: Pinsent Masons).
Yes, anonymisation does work, says the working party ...
"The Opinion concludes that anonymisation techniques can provide privacy guarantees and may be used to generate efficient anonymisation processes ..."
... although it's still risky ...
"Finally, data controllers should consider that an anonymised dataset can still present residual risks to data subjects."
... and even if it does work at one point, it can stop working later:
"... anonymisation should not be regarded as a one-off exercise and the attending risks should be reassessed regularly by data controllers."
No doubt the Administrative Data Taskforce, the Information Commissioner's Office and the Office for National Statistics have done all sorts of good work. Nevertheless, when you hear the gung-ho Messrs Shakespeare, Kelsey and Gauke or anyone else assuring us that our anonymised personal data can be safely released for research without identifying us, unless you enjoy train crashes, it's best to listen sceptically.