Although I now live in Washington DC, I’m from Cambridge in England. I still like to keep track of my Member of Parliament (MP) back home, Julian Huppert, using a site first developed in 2004 by mySociety called TheyWorkForYou.
It presents an aggregation of data on MPs from sources including official transcripts of parliamentary discussions (Hansard), election results, the Register of Interests and Wikipedia entries. In short, it’s a “digital dossier” on all of the country’s parliamentarians.
But take another look at Julian’s page and the “Numerology” section halfway down. Isn’t it reassuring to see that he: “Has used three-word alliterative phrases (e.g. "she sells seashells") 244 times in debates — average amongst MPs.”
Hacking existing online behaviours
Why do they include this seemingly irrelevant statistic? If you click on the link next to it, you see their response titled: “Why should I read in more depth than just the numbers?”
They put that number there to catch your attention and to pass on this reminder: “Our advice — when you're judging your MP, read some of their speeches, check out their website, even go to a local meeting and ask them a question. Use TheyWorkForYou as a gateway, rather than a simple place to find a number measuring competence.”
They’re doing a good job of hacking a behaviour that’s all too easy to fall back on: looking at a number published on a data website and using it verbatim to make a point, perform an analysis or make a decision. We need to stop doing this.
The responsibilities of data publishers and data users
Nicolas Kayser-Bril showed the importance of providing and understanding context with data in his blog post yesterday. Data providers often make assumptions about users: that people know which data are computed estimates, which are derived from surveys, which come from administrative records, and so on. It’s the responsibility of publishers to be as open as possible about the sources and methods used so that users can work with data in an informed manner.
I think we do a good job of this at the Bank: we share metadata and accompanying notes both on the web and via APIs. But there’s plenty more detailed work we’re pushing to do.
But it’s also the responsibility of users to treat data websites as a “gateway” to the full picture. Data literacy requires an understanding not only of statistics and data handling but also of the provenance and, frankly, the politics of data. What do you think?