Saturday, 21 May 2016

"Data Science Ethical Framework" – contempt for the public

Housewives as a whole cannot be trusted to buy all the right things, where nutrition and health are concerned. This is really no more than an extension of the principle according to which the housewife herself would not trust a child of four to select the week’s purchases. For in the case of nutrition and health, just as in the case of education, the gentleman in Whitehall really does know better what is good for people than the people know themselves.

That was Douglas Jay in 1937, writing in The Socialist Case. How much has changed 79 years later?

-----  o  O  o  -----

The Government Digital Service (GDS) have just invented data. Apparently we didn't have data before. Don't tell anyone, but policy used to be made by reading Whitehall tea leaves.

And now we have something called "data science" to go with it. Data science is very important.

It's going to create innovation and thereby cause the economy to expand. It is unknown how we ever managed to have innovation before data science. It is equally unknown when data science will cause the economy to expand and how much the economy will expand by.

Data science is also going to allow Whitehall to perfect customised public services. It will revolutionise the relationship between people and the state. It will improve our lives.

Hard to believe, but there are objections to Whitehall's claims. Objections which caused the Cabinet Office to publish a Data Science Ethical Framework the other day. "This framework is a first iteration - a beta, if you like - of a set of principles wider than the legal framework, to help stimulate innovative and responsible action", says Rt Hon Matt Hancock MP, Minister for the Cabinet Office and Paymaster General, rather airily in the introduction.

It might have been nice if the Cabinet Office had taken a little more care. Been a little less slapdash. It might have shown that they recognise the importance of these ethical considerations and that they take public concern seriously. But at the end of the day it's only the public. And a first iteration – a beta, if you like – is all they really need, these people.

"All they really need" is No.1 of the six key principles that constitute the data science ethical framework (p.3):
  1. Start with clear user need and public benefit
  2. Use data and tools which have the minimum intrusion necessary
  3. Create robust data science models
  4. Be alert to public perceptions
  5. Be as open and accountable as possible
  6. Keep data secure
GDS have ten design principles for all their work, including data science. No.1:

Note that GDS have to try somehow to maintain their empathy with users even though the users are so hopeless that what they ask for isn't always what they need. Thank goodness for the gentleman in Whitehall who knows better.

That first key principle of the data science ethical framework is amplified by the time you get to p.5 of the Cabinet Office document where the first factor that needs to be assessed is "How does the department and public benefit". It's not just user needs. Government departments have needs, too, and don't you forget it.

The data science framework advocates obeying the law. Which is good of GDS. "The law (e.g. the Data Protection and Intellectual Property Acts) sets out some important principles about how you can use data" (p.3).

But we shouldn't expect that principle and GDS's patience to last forever. We have it on the authority of Stephen Foreshew-Cain, executive director of GDS, that by 2030:
The way that the law is made will have changed. Today we are often blocked by the stuff written on the faces of bills about which we have limited understanding of feasibility, but by 2030 we will have legislation that supports service delivery, not blocks it.

White papers & green papers would be replaced by public prototypes of new or iterated services ...
By then, Mr Foreshew-Cain believes, the law will be the product of data science and not vice versa. He follows the lead there of Francis-now-Lord Maude, Matt Hancock's predecessor, who believed that the laws forbidding data-sharing between departments are just so many "myths" that need to be "busted".

Sod the law, Lord Maude more or less said, "we’re the JFDI school of government". You have been warned.

As long as the government's use of data science operates on obviously open data, there's no ethical debate to be had. Which could account for the absence of any ethical debate in GDS's ethical framework.

Government departments collect personal information from us for a specific purpose. Under what circumstances is it ethically acceptable to share that personal information with others for other purposes? You won't find out by reading GDS's jottings on the subject. They simply assume that personal information is open data, see key principle No.2, "use data and tools which have the minimum intrusion necessary".

There is one exception of course. Who wrote this document? We don't know. Their names have been withheld. GDS can sometimes show a little discretion. And who are we, the public, to intrude?

Actually, the document may not have been written by human beings at all. We learn on p.3 that:
Digital advances are producing huge amounts of new forms of data, allowing computers to more quickly process this data and makes decisions without human oversight [emphasis added]. This creates new opportunities and many new challenges we have not had to consider before.
Perhaps it was written by a slapdash robot with no interest in ethics.

You may say that we're being a bit unfair. There is an ethical consideration in the document. Key Principle No.4, "be alert to public perceptions". But don't forget, "what they ask for isn't always what they need". Only GDS know what people really need.

"Be as open and accountable as possible" (Key Principle No.5) but otherwise GDS's data scientists have carte blanche to carry on intruding.

The ethical framework document may not include any ethical considerations but it does list a few successful open data projects. The Office for National Statistics project to use mobile phone data to manage traffic congestion, for example (p.8). There are two more examples on p.10, three on p.13, one on p.14 and two on p.16. That's nine successful and innovative UK open data projects to date. Successful in GDS's eyes.

If those successes are possible with the present unbusted myths/laws against data-sharing, why do we need to invert the Constitution and make personal information open by default?

The question isn't posed in GDS's ethical framework and unsurprisingly therefore it is unanswered. What are the putative benefits of data science which could outweigh the risks?

GDS advocate key principle No.3, "create robust data science models". The table opposite shows a complete list of all the robust data science models GDS have built and demonstrates incontrovertibly their huge net benefits.

It's all a bit of a mystery, isn't it. A mystery to us, the public, at least. GDS know what they're talking about, of course, but not us. How could we?
The public cannot easily distinguish between the ethics of data science (the production of the insight) and the decision or intervention taken as a result. They are more likely to be content [dear lambs] if it is a supportive intervention rather than a punitive one (unless someone has broken the law) ... (p.8) This is really no more than an extension of the principle according to which the housewife herself would not trust a child of four to select the week’s purchases.

Updated 30.9.16

According to LinkedIn, Paul Maltby has spent three years and nine months since January 2013 as the Cabinet Office's director of open data and transparency. What is there to show for it?

Today he published How does data fit with digital?. "... in some areas we are removing barriers for data access", he says, "elsewhere we will need to consider new protections for how we store, access and use data".

What sort of "new protections"? Answer: the data science ethical framework published four months ago, in May 2016, and discussed above. It took three-and-a-half years to produce that defective first draft devoid of ethics and there's been no progress since.

We're talking here about personal information. Mr Maltby advocates "removing barriers" to sharing personal information, the ethical framework provides no replacement protection and meanwhile he mocks any critic as a libertarian.

"... consent is not on its own a viable protection", he says, and "we should be wary of a purely consent-based approach". Government based on consent is no match in Maltbyworld for "collective interests", the "collective good", "collective decisions" and "collective interests".

Mr Maltby is obviously an expert in public administration as well as data science.

He was deputy director of Tony Blair's strategy unit for 4½ years (2003-08, no known results) and director of strategy at the Home Office for the next three-and-a-bit years (2008-11, no known results). He is a gentleman in Whitehall and he is a user – you must empathise with him.

But is he right? Remember rule #1, what users ask for isn't always what they need.

"Strengthen working discipline in collective farms"
– Soviet propaganda poster issued in Uzbekistan, 1933

Updated 1.10.16

Her Majesty's Revenue and Customs (HMRC) was created by the Commissioners for Revenue and Customs Act 2005 (CRCA).

By default government withholds personal information
CRCA makes it a criminal offence for HMRC to disclose taxpayer records "except in limited circumstances" (p.32):
CRCA prohibits the disclosure of information held by HMRC in connection with its functions except in limited circumstances set out in legislation. This prohibition applies to all information held by HMRC in connection with its functions and reflects the importance placed on ‘taxpayer confidentiality’ by Parliament when the department was created. There is additional protection for information that relates to an individual or legal entity whose identity is specified in the disclosure or can be deduced from it (‘identifying information’), in the form of a criminal sanction for unlawful disclosure.
By default government discloses personal information
In April 2014 HMRC announced that it was planning to start sharing its information with other organisations. Please see You are for sale 2 and David Gauke MP and the UK's tax revolution 1 and 2.

Why did HMRC want to stand CRCA on its head?

The answer goes back to the June 2013 G8 summit at Lough Erne when the delegates agreed that "data held by Governments will be publicly available unless there is good reason to withhold it". That is an inversion of the status quo, at HMRC and throughout the UK's public administration.

We identified a number of interested parties – not just the G8 and David Gauke MP but also Rt Hon Francis-now-Lord Maude MP, Stephan Shakespeare, Tim Kelsey, Professor Sir Nigel Shadbolt, Kieron O'Hara and The Hon Bernard Jenkin MP.

Lord Maude has pulled out and Mr Kelsey has been transported to Australia but to that list must be added GDS's Paul Maltby. The Guardian newspaper published an interview with him on 13 June 2013:
You joined the Cabinet Office as director of open data and transparency in January. What will be your biggest challenge?
Promoting open data on the international stage. The UK is president of the G8 this year, and forming a collective, international agreement on open data is one of our central aims.
Open data v. personal information
Disclosing open data is a Good Thing. It is a Bad Thing that GDS are unenthusiastic about disclosing their own data, please see GDS yet to decide over DOS spend data publication from two days ago.

Not all data is open data. Some data needs to be withheld. National security data, for example. And, until Lough Erne, personal information. The interested parties listed above, however, make no distinction between open data and personal information.

Open data will cure cancer, make children happier and expand the economy
We have previously reported on Stephan Shakespeare's zany review of public sector information:
Is that exciting? It couldn't be more exciting: from data we will get the cure for cancer as well as better hospitals; schools that adapt to children’s needs making them happier and smarter; better policing and safer homes; and of course jobs (p.5) ...

Forecasting future benefits is also hard to predict. How businesses and individuals might use datasets in the future to generate new products and services and by implication impact economic growth, is equally unknown (p.30) ...
Forecasting future benefits was precisely Mr Shakespeare's job. He couldn't do it.

Mr Maltby refers to that farrago in his Guardian interview:
What's your next priority?
The Shakespeare review of public sector information has given us fire in our belly to face the next challenge – opening up more data to domestic businesses so that British companies can really succeed.
Neither Mr Shakespeare nor Mr Maltby can present a coherent case for making personal information held by the government publicly available/open to domestic businesses. How would that make British companies "really succeed"? They can't tell you. It's "hard to predict". The dynamics are "equally unknown".

What would constitute a "good reason to withhold it [personal information]"? They can't tell you. They can't distinguish personal information from open data. They aren't interested in doing so.

What causes Mr Maltby to persevere with his mission to overturn established practice and incontinently to share personal information willy-nilly between government departments and with other organisations? It's a mystery. It isn't logic. The decision isn't based on data. It's not scientific. Or business-like. Or responsible. The only answer offered is ... fire in his belly. But fire in the belly is unprecedented as a rule of inference in public administration.

Progress to date
Luckily the fire isn't very hot:
  • Mr Maltby has acquired five million of our pounds to try to duplicate the Royal Mail's postcode address file. That was six months ago and there has been no published progress report since then.
  • He has caused other people to produce two registers: a register of countries; and a register of English local authorities. There are just two of them, they're not live yet, you can't rush these things, they're still being tested ...
  • ... and that's just as well. He has set up the Register Design Authority (1.4.16) with "domain control for the domain" which would put GDS in control of all Whitehall information if they actually had any registers and if Government as a Platform ever became a reality (18.12.15).
  • He has made some contribution to the frivolous Digital Economy Bill. Specifically, he has tried to make the ethical problems go away by changing the word "data-sharing" to "data access".
  • And he has produced the ethics-free data science ethical framework above which places no barriers whatever between publication/sharing and our personal information.
There are other inversion threats – the identity assurance programme, Companies House, ... – but at this rate our privacy will be safe at least from Mr Maltby for some time to come.

Updated 12.10.16

The Digital Economy Bill Committee took evidence yesterday from, among others, Mike Bracken and Jeni Tennison (roughly 11:00-11:25). Both of them criticised the Bill for its lack of clarity about access to data. When is data freely available for researchers and innovators to use? Not clear. When can departments share data? Not clear. Can departments be forced to make data available? Not clear. How do we avoid another failure like Not clear.

The Digital Economy Bill is a facetious bit of work put together by Ed Vaizey MP and now coming apart in the hands of Matt Hancock MP. Paul Maltby's contributions to the Bill have not helped.

It should be noted that he is now on the way out: "The [Committee] session was held following the announcement by the Cabinet Office that GDS director of data Paul Maltby would be stepping down from his role once his contract expires in late December ... A Cabinet Office insider indicated Maltby’s leaving was due to the end of his contract and not an indication that he is being replaced".

Updated 17.10.16

The Digital Economy Bill Committee took further evidence on 13 October 2016. They heard from Jerry Fishenden between about 11:30 and 12:00 and then from the Information Commissioner's Office (ICO) between about 12:20 and 12:50.

Mr Fishenden is the father of the Government Gateway and was representing the Privacy and Consumer Advisory Group, of which he is co-chair. His oral advice is amplified in written evidence submitted to the Committee.

He warned that the Digital Economy Bill moves control of our personal information out of our hands and into the hands of officials. This is proposed in the putative interests of "data-sharing", which is nowhere defined in the Bill. The management of our personal information would depend on the codes of practice adopted by officials but these are not included in the Bill – so how can anyone know how the law would work?

One member of the Committee, Calum Kerr, clearly took the point:
Dr Fishenden, your exasperation with what is in the Bill is shared by other witnesses. We are faced with whether we can strengthen it in such a way that it is workable, or whether we should just oppose it, despite all the benefits. (Q228)
The ICO was represented by the Commissioner herself, Elizabeth Denham, and Steve Wood, the Deputy Commissioner. The Commissioner warned that the Digital Economy Bill proposes to share people's personal information without their consent. In the absence of consent, she said, there have to be other safeguards. This slovenly piece of draft legislation doesn't contain any.

Once again, the danger is that the Committee is wasting its time: the Bill isn't yet ready for the Committee's attentions.

Updated 20.10.16

Government Computing:
GDS new director general Kevin Cunnington has been giving further information about how he sees the organisation developing under his leadership. The overall GDS strategy is still being worked on, he said, but is expected to be out by Christmas.

He indicated that he plans to create a profession for digital, data and technology and he is also going to get a grip of the GOV.UK Verify identity assurance scheme.

“Two things that the [GDS] Advisory Board asked us to concentrate on are sort out Verify and get it to scale and the other is to tackle the really hard data issues” ...

Updated 8.12.16

Today is GDS's fifth birthday. By way of celebration, Kevin Cunnington, Director General, published Now we are 5. "Here are some of the things I am looking forward to us working on in the next year", he says. Here is the first of those things:
Fixing data

To make things that are truly better for citizens, we know that we need to fix how data is stored and used in government. Current structures prevent departments from giving each other access to information. The creation of joined up services across government is inhibited by legacy structures. GDS will work to lower these barriers, and help to establish secure, ethical ways for working with data for the benefit of the citizen. As part of this work, we will be publishing a roadmap of open APIs (application programming interfaces) for data.
How will GDS "establish secure, ethical ways for working with data for the benefit of the citizen"?

There's been no sign of any understanding of the ethics of data-sharing so far, please see above.

There still isn't.

We're promised a "roadmap of open APIs". That might make it easier to overcome the obstacles to sharing data. The obstacles decreed by a supreme parliament and deployed by a so far obedient administration. But those APIs will just make it harder, not easier, to maintain an ethical approach and, indeed, to maintain security.

Updated 21.2.17

John Manzoni, chief executive of the UK Civil Service, gave a speech this morning, Big data in government: the challenges and opportunities.

Sharing our personal information all across government and beyond will improve our lives, he assumes. Here we go again with the "single source of truth" (18.12.15).

The Electric Kool-Aid Acid Test
Mr Manzoni recognises that "public trust is absolutely critical to achieving our ambition for a data-driven government".

How is that trust to be retained/reclaimed?

"In partnership with civil society, GDS has published an ethical framework for data science in government ...". They have indeed published a skimpy first draft paper with many of those words in it, Data Science Ethical Framework, please see above ...

... but it doesn't amount to a data science ethical framework and it undermines public trust.

Mr Manzoni recognises also that the on-line government he craves needs on-line identity assurance: "Verify [GOV.UK Verify (RIP)] - the government identity service for citizens - is enabling people to access a whole range of online government services easily, securely and in a way which builds their trust", he says. He's wrong.

And he says "by 2020, we are aiming to get 25 million people using the service". Wildly unrealistic, to coin a phrase.

Updated 12.4.17

Paul Maltby left his job as director of data in December 2016. In January 2017 he contributed to an exciting pioneers-pushing-back-the-frontiers blog post, Growing a government data science community: "... this is the story of how this joint project team overcame these hurdles, developed a community in government of more than 350 individuals with a data science capability, and started to put this capability to use to drive value for citizens".

The data pioneer corps had a four-point strategy, led by the imperative "to ‘show not tell’, by doing some practical demonstration projects, as opposed to writing strategy papers to explain in the abstract what the project might mean".

Have they achieved that? You be the judge:
  • "From these early beginnings to build a community has grown a wide range of opportunities to share and connect. These include a dedicated messaging app, where code and frustrations can be shared, and an assortment of data drinks, lunches and dedicated community groups within departments".
  • Also, "there are now tens of data science case studies – a mixture of prototypes (some successful, some less so) and increasingly serious value-adding propositions [no examples given, not shown, not even told]".
The fourth imperative in the strategy was "to ground this work in an ethical approach that, from the start, aimed to consider what we should do with these potentially powerful tools, not just what we could do with them". This required "an updated policy and legislative framework, not only to remove unnecessary friction (through data access provisions [i.e. data-sharing provisions] in the Digital Economy Bill, for example), but also to put in place new rules and procedures, for instance, on the ethical application of these new tools".

The "ethical application" link takes you to Data Science Ethical Framework, which is the subject of this very blog post here and which we have seen is neither ethical nor a framework.

As to the Digital Economy Bill, we have recorded a number of criticisms above, to which you may care to add:
  • Privacy groups urge dropping entire Digital Economy Bill data clause and
  • The Thirteenth Report/demolition job of the House of Lords Delegated Powers and Regulatory Reform Committee covering Parts 5-7 of the Digital Economy Bill, e.g.:
    • "21. We consider it inappropriate for Ministers to have the almost untrammeled powers given by clause 30".
    • "23. ... a higher level scrutiny cannot justify the delegation of a power which is inappropriately wide".
    • "37. We regard this as a wholly unconvincing reason for excluding Parliamentary scrutiny".
    • "69. ... We consider that this provision is inappropriate in the absence of a convincing explanation as to why it is needed".
    • And again, later, "82. ... We consider that this provision is inappropriate in the absence of a convincing explanation as to why it is needed".
Kevin Cunnington, director general at the Government Digital Service (GDS), "plans to create a profession for digital, data and technology". Based on the evidence above, GDS's data analytics achievements seem puny and anti-democratic, and you might conclude that he's got his work cut out ...

... but that's wrong.

Mr Cunnington can't create the profession – it's already there.

As Mr Maltby notes, but doesn't take on board, "there were some who dismissed the new data science agenda as 'trying to pretend it invented maths' and claimed data science had been practised in government since the time of the experimental physicist Patrick Blackett and the amazing innovations in operational research during and since World War II" (cf. @gdsteam invent the right angle).

GDS will surely find it easier to succeed and "drive value for citizens" by collaborating with their colleagues if they recognise that Whitehall wasn't a thickly-wooded island inhabited by primitive and superstitious tribes before GDS brought MacBooks and civilisation. There's a data science profession already there and there's grown-up legislation with prudent helpings of friction already governing data-sharing.
