November 2016

Open data, closed algorithms, and the Black Box of Education

hey, what's going on in there?
Education is a ‘black box’ -- or so a prevailing view among many education policymakers and researchers goes.

For all of the recent explosion in data related to learning -- as a result of standardized tests, etc. -- remarkably little is known at scale about what exactly happens in classrooms around the world, and outside of them, when it comes to learning, and what impact this has.

This isn't to say that we know nothing, of course:

The World Bank (to cite an example from within my own institution) has been using standardized classroom observation techniques to help document what is happening in many classrooms around the world (see, for example, reports based on modified Stallings Method classroom observations across Latin America which seek to identify how much time is actually spent on instruction during school hours; in many cases, the resulting data generated are rather appalling).

Common sense holds various tenets dear when it comes to education, and to learning; many educators profess to know intuitively what works, based on their individual (and hard won) experience, even in the absence of rigorously gathered, statistically significant 'hard' data; the impact of various socioeconomic factors is increasingly acknowledged (even if many policymakers remain impervious to them); and cognitive neuroscience is providing many interesting insights.

But in many important ways, education policymaking and processes of teaching and learning are constrained by the fact that we don't have sufficient, useful, actionable data about what is actually happening with learners at a large scale across an education system -- and what impact this might have. Without data, as Andreas Schleicher likes to say, you are just another person with an opinion. (Of course, with data you might be a person with an ill-considered or poorly argued opinion, but that’s another issue.)
side observation: Echoing many teachers (but, in contrast to teaching professionals, usually with little or no formal teaching experience themselves), I find that many parents and politicians also profess to know intuitively ‘what works’ when it comes to teaching. When it comes to education, most everyone is an ‘expert’, because, well, after all, everyone was at one time a student. While not seeking to denigrate the ‘wisdom of the crowd’, or downplay the value of common sense, I do find it interesting that many leaders profess to have ready prescriptions at hand for what ‘ails education’ in ways that differ markedly from the ways in which they approach making decisions when it comes to healthcare policy, for example, or finance – even though they themselves have also been patients and make spending decisions in their daily lives.

One of the great attractions of educational technologies for many people is their potential to help open up and peer inside this so-called black box. For example:
  • When teachers talk in front of a class, there are only imperfect records of what transpired (teacher and student notes, memories of participants, what's left on the blackboard -- until that's erased). When lectures are recorded, on the other hand, there is a data trail that can be examined and potentially mined for related insights.
  • When students are asked to read in their paper textbook, there is no record of whether the book was actually opened, let alone whether or not to the correct page, how long a page was viewed, etc. Not so when using e-readers or reading on the web.
  • Facts, figures and questions scribbled on the blackboard disappear once the class bell rings; when this information is entered into, say, Blackboard™ (or any other digital learning management system, for that matter), it can potentially live on forever.
And because these data are, at their essence, just a collection of ones and zeroes, it is easy to share them quickly and widely using the various connected technology devices we increasingly have at our disposal.
A few years ago I worked on a large project where a government was planning to introduce lots of new technologies into classrooms across its education system. Policymakers were not primarily seeking to do this in order to ‘transform teaching and learning’ (although of course the project was marketed this way), but rather so that they could better understand what was actually happening in classrooms. If students were scoring poorly on their national end-of-year assessments, policymakers were wondering: Is this because the quality of instruction was insufficient? Because the learning materials used were inadequate? Or might it be because the teachers never got to that part of the syllabus, and so students were being assessed on things they hadn’t been taught? If technology use was mandated, at least they might get some sense about what material was being covered in schools – and what wasn’t. Or so the thinking went ....

Yes, such digital trails are admittedly incomplete, and can obscure as much as they illuminate, especially if the limitations of such data are poorly understood and data are investigated and analyzed incompletely, poorly, or with bias (or malicious intent). They also carry with them all sorts of very important and thorny considerations related to privacy, security, intellectual property and many other issues.

That said, used well, additional data points hold out the tantalizing promise of new and/or deeper insights than have so far been possible within 'analogue' classrooms.

But there is another 'black box of education' worth considering.

In many countries, there have been serious and expansive efforts underway to compel governments to make available more ‘open data’ about what is happening in their societies, and to utilize more ‘open educational resources’ for learning – including in schools. Many international donor and aid agencies support related efforts in key ways. The World Bank is a big promoter of many of these so-called ‘open data’ initiatives, for example. UNESCO has long been a big proponent of ‘open education resources’ (OERs). To some degree, pretty much all international donor agencies are involved in such activities in some way.

There is no doubt that increased ‘openness’ of various sorts can help make many processes and decisions in the education sector more transparent, as well as have other benefits (by allowing the re-use and ‘re-mixing’ of OERs, teachers and students can themselves help create new teaching and learning materials; civil society groups and private firms can utilize open data to help build new products and services; etc.).

That said:
  • What happens when governments promote the use of open education data and open education resources but, at the same time, refuse to make openly available the algorithms (formulas) that are utilized to draw insights from, and make key decisions based on, these open data and resources?
  • Are we in danger of opening up one black box, only to place another, more inscrutable black box inside of it?

