"There are literally multiple lists of the same things"
Here's a selection of Government Digital Service (GDS) posts and a film in the week leading up to purdah:
Janet Hughes and Stephen Dunn
Martha Lane Fox
Let's take a look at Public Servant of the Year ex-Guardian man Mike Bracken CBE CDO CDO's 28 March 2015 offering, Data as a public asset.
Mr Bracken is the executive director of GDS. He is the senior responsible owner of GOV.UK Verify (RIP). And he was also already the UK government's chief digital officer (CDO) when they unexpectedly made him chief data officer (CDO) as well.
What is a chief data officer?
Mr Bracken explained all to his faithful amanuensis Alex Howard, who reports the interview in UK's first chief data officer to focus on making data a public asset:
When we open up well-structured data sets to the market, great things happen ...
Only one example of these "great things" was given and that doesn't work.
That was just a part of the CDO's answer to one of Mr Howard's tougher questions, Do governments perhaps need a "Chief Wisdom Officer?". Mr Bracken's answer also included:
In the absence of standards, we have allowed a growing number of competing registers of data. There are literally multiple lists of the same things. We haven't settled on canonical register of business data, for instance. Companies House has something perceived to be canonical, but no one uses it properly inside government--yet organizations like DueDil use it to transform the market for business services. We need to make decisions about what registers and data are canonical, and we need to work out some basics, like what an open address format should look like.
Some readers may still be mystified as to what a chief data officer does all day. "Canonical registers"?
Be assured you are not alone.
As a first step in the attempt to get our ducks in a row, note that chief data officers don't approve of "multiple lists of the same things". Tidy-minded, orderly and well-organised, chief data officers believe that there should be just one list.
Take for example GOV.UK Verify (RIP). There are three "identity providers" at the moment, each beavering away compiling their own register of Brits. That makes three lists. If and when the other six "identity providers" go live, we'll be up to nine lists of the same things/people. To a CDO, that's an unconscionable mess – expect him to use all his powers to consolidate them into just one canonical register.
The CDO is an advocate of open data, freely available to all. As long as it's "canonical" and "well-structured", once the market gets hold of it "great things happen".
Strangely, Mr Bracken gives the example of DueDil:
We are a growth-stage technology company on a mission to organise the world’s private company information.
"Strangely", because DueDil are brimming with innovative ideas, which they have worked to implement, collating and analysing the data stored by Companies House, while nevertheless paying for that data just like the rest of us.
What's more, if and when Companies House's data is made freely available to everyone, that could have an adverse, or even lethal effect on DueDil.
This problem has been spotted by Stephan Shakespeare, the co-founder of YouGov, who wrote a report for the government on open data, or "PSI" as he calls it, Public Sector Information. What should DueDil do when they get torpedoed by the raw materials of their business suddenly becoming free? All heart, according to Mr Shakespeare they should "embrace the change".
Innovation is not created by open data, pace Professor Sir Nigel Shadbolt, Chairman and Co-Founder of the Open Data Institute and of whom more anon, but it certainly can be inhibited.
It's not just DueDil who are affected. Mr Shakespeare's method relies on expropriation. The rights to data are taken from one organisation and given by the state to innovators. The data is a public asset. That's habit-forming. How long before all innovators have their rights expropriated in turn?
Data processing skills
Why does the UK government chief data officer pick on addresses as the most telling example, the very exemplar of his mission? There should be a single, standard format for addresses, he says. Of all the objects to light on, why addresses?
Perhaps the Electoral Commission give us a clue.
GDS worked with the Commission on the implementation of Individual Electoral Registration. That involved matching records on the electoral roll with records on databases maintained by the Department for Work and Pensions (DWP), the Department for Education (DfE), the Welsh Department for Education and Skills, Royal Mail and the Student Loans Company.
It didn't go well. There were problems with matching addresses. The Electoral Commission said in their report (p.5):
The addresses [on the chosen DfE database] appeared to be more complete than those held in other national databases but a poor data specification from Cabinet Office [the home of GDS] meant that the format was inconsistent ...
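Why would an inconsistent format wreck the matching? A minimal sketch may help, assuming nothing about how GDS or the Commission actually did it. The function names, the abbreviation table and the sample addresses are all hypothetical, invented purely for illustration. Two records can only be paired automatically if their addresses reduce to the same key, and that reduction is exactly what a data specification is supposed to pin down:

```python
import re

# Hypothetical abbreviation table, for illustration only. A real data
# specification would be far longer and would have to settle ambiguities
# such as "st" meaning "street" or "saint".
ABBREVIATIONS = {"rd": "road", "st": "street", "ave": "avenue"}


def normalise_address(raw: str) -> str:
    """Reduce an address to a crude comparable key."""
    text = raw.lower()
    # Replace anything that is not a letter, digit or whitespace with a space.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    # Expand known abbreviations and collapse runs of whitespace.
    words = [ABBREVIATIONS.get(word, word) for word in text.split()]
    return " ".join(words)


def match_records(register, other):
    """Pair up names whose normalised addresses are identical.

    Both arguments are lists of (name, address) tuples.
    """
    index = {}
    for name, address in other:
        index.setdefault(normalise_address(address), []).append(name)

    matches = []
    for name, address in register:
        for other_name in index.get(normalise_address(address), []):
            matches.append((name, other_name))
    return matches


# Invented sample data: the same address written two ways, plus one miss.
electoral_roll = [("A. Voter", "12 High St., Anytown"),
                  ("B. Voter", "3 Station Rd, Anytown")]
other_database = [("A VOTER", "12 high street anytown"),
                  ("C PERSON", "99 Nowhere Avenue, Elsewhere")]

pairs = match_records(electoral_roll, other_database)
```

Even this toy version only matches addresses that differ in case, punctuation and a handful of abbreviations. A missing flat number or a misspelt street name defeats it, which is why the format has to be agreed before the data is collected rather than repaired afterwards.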
The move from household to individual electoral registration was legislated for by the Electoral Registration and Administration Act 2013.
While the new law was still being debated, an impact assessment determined that it would be illegal for Electoral Registration Officers to cross-reference their data with the data held by other government departments – "Data matching – national rollout would require primary legislation".
How to overcome that Constitutional protection afforded by the wisdom of ages? Easy. Modernise it away. The 2013 Act simply declared this data-matching to be legal.
The chief data officer's GOV.UK Verify (RIP) pan-government identity assurance programme raises unanswered questions about cross-referencing the personal/private/confidential information about you held by government departments and by the credit referencing agencies. How can GDS's "identity providers" obtain your consent to perform this data-matching before they know that it's you on-line giving that consent?
They can't. But perhaps we're asking the wrong question. Perhaps your consent will become irrelevant. Perhaps another Constitutional protection will be erased and the "identity providers" will be authorised to proceed willy-nilly by some impending stroke of the legislative pen.
So much was data-matching anathema to the 2010-15 UK government that Rt Hon Francis "JFDI" Maude MP, Cabinet Office Minister and managing director of Morgan Stanley, demanded a retraction from the Guardian newspaper when they claimed that he was promoting it:
This is not a question of increasing the volume of data-sharing that takes place across government, but ensuring an appropriate framework is in place so that government can deliver more effective, joined-up and personalised public services, through effective data-linking.
That was back in April 2012. Whether you call it "data-sharing" or "data-linking", it doesn't matter – both GOV.UK Verify (RIP) and the open data initiative depend on lots of it and Mr Maude has been intent throughout his time in office on clearing up the nation's confusion about data-matching:
I want to bust the myths around the complexities of data sharing ... we aim to find effective ways of using and sharing data for the good of everyone.
Are you convinced?
Open data doesn't automatically cause better public services any more than it automatically causes innovation (Shadbolt, above) or finds cures for cancer (Kelsey, below). Just think of the Child Support Agency. They had unparalleled access to detailed information about the families in their care and managed nevertheless to increase their misery.
It's defective magic. Mr Maude's argument is meretricious. You could give up your privacy and yet not reap the promised reward. Then you'd look silly, wouldn't you.
"Just a minute", you may say, "there's a distinction between open data and the rest, and sharing open data can be a good thing". Agreed. Making the details of government expenditure open to everyone can help to hold the executive to account.
But where is the line between open data and personal data? When does it stop being personal data and become a public asset? The answer is that it's hard to say. It takes a lot of effort by a lot of fair-minded people to decide the answer, case by case.
Too much effort, according to David Gauke MP, until recently Exchequer Secretary to the Treasury. The Constitutional position that taxpayer information, for example, held by Her Majesty's Revenue and Customs, is confidential by default and can only be disclosed after due deliberation, is holding back the economy.
Similarly for medical records, according to Tim Kelsey, the national director for patients and information at NHS England, confidentiality is a brake on research and privacy campaigners are murderers:
No one who uses a public service should be allowed to opt out of sharing their records.
Tim Kelsey, David Gauke, Francis Maude, Stephan Shakespeare, ... if they have their way, personal information will become open data, the distinction will be lost.
"Not a chance", you may say.
Really? Take a look at Mr Gauke's Ministerial Foreword to Sharing and publishing data for public benefit. In the interests of fighting tax evasion, the UK has convinced the G8 to reverse the settled Constitutional position (p.4):
... the UK helped secure the G8’s Open Data Charter, which presumes that the data held by Governments will be publicly available unless there is good reason to withhold it.
Previously, a lot of personal information was withheld because there wasn't time to discuss the merits of disclosing it. Now, a lot of personal information will be disclosed because there isn't time to discuss the merits of withholding it ...
... just as you probably don't have time to carry on reading this post.
Winding up, what was that Professor Sir Nigel said back in 2008 when he and his PhD student Kieron O'Hara published The Spy in the Coffee Machine? Oh yes:
... sharing information across government databases will dramatically increase governmental powers – otherwise the UK government [1997-2010] wouldn't have proposed it. (p.95)
... we should never forget that bureaucracies are information-thirsty, and will never stop consuming. Indeed, they will never even cut down. They will break or bend their own rules, and any prior specification of how information use will be limited, or data not shared, is not worth the paper it is printed on. (p.212)
And to end, just note that no reference has been made to Mr Bracken's Data as a public asset since the opening paragraphs. The UK's chief data officer is interested in "the need for government-wide data standards and a mechanism for enforcing them". But silent and apparently uninterested in Constitutional protection.
[Oops – meant to link to here, apologies]
Two days ago on 6 May 2015 the Guardian newspaper published The seats to stay up for – and the numbers that matter – on election night. On the basis of the best political polling available, they predicted that the Labour party would win 273 seats and that the Conservatives would win the same, 273 seats.
Yesterday we had the election.
Today, Labour have 232 seats in the House of Commons and the Conservatives 331. 41 fewer than predicted and 58 more, respectively.
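For anyone who wants to check the sums, here is a minimal sketch using only the seat counts quoted above. The variable names are illustrative, nothing more:

```python
# Seat counts as quoted in this post: the Guardian's 6 May 2015 prediction
# and the position after the election.
predicted = {"Labour": 273, "Conservative": 273}
actual = {"Labour": 232, "Conservative": 331}

# Forecast error per party: actual minus predicted.
# A negative number means fewer seats than predicted.
errors = {party: actual[party] - predicted[party] for party in predicted}

for party, error in errors.items():
    print(f"{party}: {error:+d} seats against the prediction")
```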
Understanding the data you have collected is clearly difficult, even for experienced professionals with the eyes of the nation on them.
How much more so for the rest of us, including GDS?
At the time of writing, GDS operate 797 service dashboards crammed full of data collected automatically to tell viewers all about the performance of GDS's digital public services.
It's hard to collect meaningful data and hard to understand it. Ask the ONS. Ask a chief data officer.
We all need to be sceptical when interpreting those figures on the digital service dashboards. Everyone. Including GDS.
One day you may take a set of figures on a dashboard to imply the digital equivalent of a Labour-led coalition only to find yourself facing a Conservative majority government the next day.