Discomfort and Sharing
April 30, 2017
In mid-April, I spent a couple of days at the Governor’s Data Analytics Summit and pre-summit workshop. About 400 people attended, representing state government, higher education, local government, nonprofits, and tech companies. (You can see the relevant tweets here: https://twitter.com/hashtag/gdas2017?f=tweets&vertical=default) I gave a couple of presentations about VLDS (an overview and one on data governance). Overall, it was a good event with plenty of food for thought. What impressed me most during the two days was the thoughtfulness about data governance and how data analytics are changing the way organizations work. By and large, the attendees are all trying to use data to improve the world.
What we are trying to do with VLDS — using shared data to improve policy recommendations and help all Virginians find the level of excellence they seek in their lives — is a challenging endeavor, because sharing can be scary, or at least uncomfortable. There is always discomfort in the unknown. Within our agencies, we get to know our data pretty well. We understand its context, provenance, and limits. We have a sense of what it can do. When we share data with other agencies, especially when we share only de-identified data consistent with our privacy promise and with state and federal law, there is no telling what we might learn. The worst possibility is that long-held policy initiatives might be producing exactly the wrong results. Or that we have spent five times more money than needed to accomplish the same outcome. While we have not uncovered such results, we recognize these findings are possible. If they exist, we hope to find them and develop alternatives.
Those are big, scary things. On a smaller scale, we might learn that we have been measuring the wrong things. We actually assume this to be the case to some degree — that is why we are sharing data: to improve what we are measuring. None of us wants to learn that the things we have historically measured or counted just don’t matter. The numbers might look good on paper and seem to make sense, but could still be wrong. We are open to this possibility, and we challenge ourselves to learn these things, but it just isn’t always comfortable.
The important thing is that we are actively looking. We are up to the challenge. We are also pushing to expand our knowledge of the data. Our self-imposed deadlines for our first suite of reports — reports that will expand our understanding of our shared data and of citizen interactions across agencies — have come and gone. The reason for this is simple: we are learning more and more about the differences in how the agencies collect and define data. There are also challenges in how we think about merging academic-year data with fiscal-year data. And there are dozens of other trivial challenges that become less trivial across millions of records. Each new iteration brings us closer to the final product. It is exacting work requiring patience and a willingness to grasp tedious complexity.
Finally, understanding why something is what it is takes time. When we run across something that doesn’t fit our expectations, we have to investigate. We spend hours tracing aggregations back to their source calculations, running models to find the flaw or verify what surprised us. It’s detective work, and it’s fun. We’re learning every single time we work with the data. This is what we are about — learning. And trying to make a difference.