Wednesday 28 May 2014

David Gauke MP and the UK's tax revolution 2


This is turning into a slow-motion political train wreck, with the care.data scandal and the revelation that the hospital episode statistics data sold to numerous companies contained patient postcodes and dates of birth, so the anonymity claims were simply false.

UK government departments and their agents store reams of personal information about us. They have to, to do their job.

That data is kept confidential. There are certain uses to which it can legitimately be put. Beyond that – verboten.

There are always poachers circling the game reserve. Most recently, it was Stephan Shakespeare. Then Tim Kelsey. And then David Gauke.

They all want to make more personal data available to researchers or entrepreneurs, to improve policy-making, to improve administration, to stimulate growth in the economy or to make medical breakthroughs.

It is questionable whether any of those objectives would be achieved.

Stephan Shakespeare
An Independent Review of Public Sector Information
May 2013

Recommendation 2

...

Detail:

i) We should define 'National Core Reference Data' as the most important data held by each government department and other publicly funded bodies ...

ii) Every government department and other publicly funded bodies should make an immediate commitment to publish their Core Reference Data ...

iii) Alongside this high-quality core data, departments and other public sector bodies should commit to publishing all their datasets (in anonymised form) ...

(pp.11-2)

----------

Tim Kelsey
Long live the database state
July 2009

If the next government, of whichever party, wants a better public sector it must encourage more use of personal data; not less. What should be done? Data sharing must be made easier, first by removing the legislative obstacles to sharing government databases. The government should also pledge to publish as much new anonymised data as possible ...

----------

David Gauke
HM Revenue & Customs
Sharing and publishing data for public benefit – Consultation document
July 2013

Q3 Do you agree that HMRC should be able to share anonymised individual level data for the purposes of research and analysis to deliver public benefits wider than HMRC’s own functions? Please give reasons for your answer.

Q4 Do you agree with the proposed safeguards on the proposal to share anonymised individual level data? Should any further controls be considered on what can be shared, with whom or how?

Q5 How should the generation and release of anonymised or aggregated data be funded? Please give reasons for your answer.

(p.27)

Even if the case for releasing more personal data could be made, there remains the problem of privacy/confidentiality. And Messrs Shakespeare, Kelsey and Gauke all offer the same safeguard – anonymisation.

If the research data is anonymised, then people can't be identified, so their privacy isn't breached, no confidence has been broken. True? Or false? Does anonymisation work? Is your privacy safeguarded by it?

The answer isn't clear.

Messrs Shakespeare and Gauke both recognise that it isn't easy to anonymise people's personal data. You can remove all sorts of details from a file of research data in the name of anonymisation, and yet the data subject can still be identified by cross-referencing what's left against other files.
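To see how that cross-referencing can work in practice, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the two files, the names, the postcodes, the dates of birth and the function name reidentify are all invented, and a real attack would use messier data and fuzzier matching. The point is only that a file with the names stripped out can be joined to a second file that still has names, using whatever details the two have in common.

```python
# Hypothetical, invented records. No real people, postcodes or diagnoses.

# The "anonymised" research file: names removed, other details left in.
released = [
    {"postcode": "AB1 2CD", "dob": "1970-03-14", "diagnosis": "asthma"},
    {"postcode": "EF3 4GH", "dob": "1985-11-02", "diagnosis": "diabetes"},
    {"postcode": "IJ5 6KL", "dob": "1962-07-30", "diagnosis": "depression"},
]

# Some other file that does carry names, for example a marketing list.
other_file = [
    {"name": "Alice Example", "postcode": "AB1 2CD", "dob": "1970-03-14"},
    {"name": "Bob Sample", "postcode": "IJ5 6KL", "dob": "1962-07-30"},
    {"name": "Carol Test", "postcode": "ZZ9 9ZZ", "dob": "1990-01-01"},
]


def reidentify(released, other_file, keys=("postcode", "dob")):
    """Return (name, diagnosis) pairs for released rows that match
    exactly one named person on the shared key fields."""
    # Index the named file by the shared fields.
    index = {}
    for person in other_file:
        key = tuple(person[k] for k in keys)
        index.setdefault(key, []).append(person["name"])

    matches = []
    for row in released:
        names = index.get(tuple(row[k] for k in keys), [])
        # Only a unique match pins the record to a single person.
        if len(names) == 1:
            matches.append((names[0], row["diagnosis"]))
    return matches


matches = reidentify(released, other_file)
for name, diagnosis in matches:
    print(name, "->", diagnosis)
```

In this made-up example two of the three "anonymised" records are tied back to a named person, using nothing more than a postcode and a date of birth.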

They both cite work done by the Administrative Data Taskforce to improve the reliability of anonymisation. Mr Shakespeare tells us that the Information Commissioner's Office is working on the same problem and so is the Office for National Statistics.

But are they getting anywhere? Or can your identity still be deduced by cross-referring anonymised data against other files?

Professor Martyn Thomas sounded a note of caution when he gave evidence to the House of Commons Science and Technology Committee a year ago on 5 June 2013. "Anonymised research data" is an oxymoron, he said. If the data has really been anonymised, then it's no use for research and if it is useful for research, then it can't have been anonymised.

Then last month, on 4 April 2014, Professor Ross Anderson gave a lecture to the Open Data Institute (ODI) entitled Why anonymity fails. The ODI are obviously convinced by his arguments and describe the current travails of Tim Kelsey's care.data as a "slow-motion political train wreck".

So does anonymisation work or doesn't it?

Article 29 of the European data protection directive (95/46/EC) establishes an advisory working party on data protection. It published an opinion on anonymisation techniques on 10 April 2014 (hat tip: Pinsent Masons).

Yes, anonymisation does work, says the working party ...

The Opinion concludes that anonymisation techniques can provide privacy guarantees and may be used to generate efficient anonymisation processes ...

... although it's still risky ...

Finally, data controllers should consider that an anonymised dataset can still present residual risks to data subjects.

... and even if it does work at one point, it can stop working later:

... anonymisation should not be regarded as a one-off exercise and the attending risks should be reassessed regularly by data controllers.

No doubt the Administrative Data Taskforce, the Information Commissioner's Office and the Office for National Statistics have done all sorts of good work. Nevertheless, when you hear the gung-ho Messrs Shakespeare, Kelsey and Gauke or anyone else assuring us that our anonymised personal data can be safely released for research without identifying us, unless you enjoy train crashes it's best to listen sceptically.
