Inside out data

August 29, 2021 4 minute read

There is a good general principle for understanding data and information: data in, information out. Things get a lot more complex than that, and it’s hardly a perfect rule, but I like it.

Data analysis should follow this pattern. Adding a feedback loop allows for continuous improvement. You start with your data, process it, and end up with information. With expertise you can hopefully gain some insight. Then you adjust things and go back to the start.

I’ve noticed that discussions involving public library data rarely consider data as an input. Key data for public libraries is often described as an output, and always as one used to determine the quality of the service.

Is it just because these tend to be discussions about library statistics, rather than data, and the term data is mistakenly used interchangeably with statistics? Statistics are different from data, and can legitimately be argued to be an output. But in library-world these stats tend to be so basic that they are often just counts of the raw data collected in systems. For example, they might be a count of loans per year, rather than the actual table of loans.

And perhaps that is the problem - we should have moved on from simple aggregation and counts of data being the limit of our data analysis. A report of the count of loans per library may previously have been advanced management information, but it now just looks like a loss of detail from the raw data.
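To make that loss of detail concrete, here is a minimal Python sketch. The loan records, field names, and library names are all invented for illustration and don’t come from any real library system. It suggests that the yearly count can always be derived from the raw table, while the raw table can also answer questions the count can’t.

```python
from collections import Counter
from datetime import date

# Hypothetical raw loans table: one row per loan (all values invented).
loans = [
    {"library": "Central", "date": date(2021, 3, 1), "item_type": "book"},
    {"library": "Central", "date": date(2021, 3, 6), "item_type": "audiobook"},
    {"library": "Northside", "date": date(2021, 3, 6), "item_type": "book"},
    {"library": "Northside", "date": date(2020, 11, 14), "item_type": "book"},
]

# The traditional statistic: a count of loans per library per year.
loans_per_year = Counter((row["library"], row["date"].year) for row in loans)

# A question the yearly count cannot answer: which weekdays are busiest?
# date.weekday() returns 0 for Monday through 6 for Sunday.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]
loans_per_weekday = Counter(WEEKDAYS[row["date"].weekday()] for row in loans)

print(dict(loans_per_year))
print(dict(loans_per_weekday))
```

Going from the table to the count is a one-liner. Going from the count back to the table is not possible, which is why the count is better treated as one of many things you might derive from the input.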

This also comes from a prevalent outdated performance methodology. The theory goes that you decide what is good performance and you report on it. So if you think more loans is good, and fewer loans is bad, then all you need is a regular count of loans. And people then consider that an output.

Basic counts of library use are still relevant for understanding how well the service is used. But performing more sophisticated data analysis is far easier than it used to be. The tools exist to use the complete raw data on library loans, alongside data from all manner of other sources. That makes it impossible to pre-define what you are looking for. You treat the data sources as inputs and explore them for insight.

Provide people with library data to explore. They can then make decisions on the information from that analysis process. Data in, information out.

Examples

But is this all in my imagination? Here are a couple of examples.

Bruce says statistics around libraries often involve outputs, like numbers of books loaned. “What we are really interested in are not just outputs but outcomes.”

Bruce Leeke, from Suffolk Libraries, in the article How Suffolk libraries have turned the page to a new chapter

This was from a recent article about Suffolk libraries. I’d agree with Bruce here that number of loans is limited as an output. But it’s not really an output at all, it’s an input. If you come at it from that angle, the task is to start processing it in detail in order to get real insight. That means a greater focus on loans data, not a lesser one.

An output is a measurement of activity size and scope. An outcome shows the social value added. An output is a quantitative measurement. An outcome is generally a qualitative measurement.

Public Libraries Online, the article Inputs, Outputs, and Outcomes - Oh My!

I’m not sure this is correct about what an output is either: it presents outputs as simplistic aggregations of data (activity size).

In both these examples ‘outcomes’ are raised as preferable to outputs. This doesn’t make sense, because an outcome taken from qualitative measurement is still an output.

Why is this happening?

I think it’s fair to say that there is a confused picture of inputs, outputs, and outcomes across the public library sector.

There isn’t really a mystery about why this is happening:

Historic performance measures are the few reportable outputs from library services
These display poor performance, as library use is declining
Alternatives to this data are sought - ‘outcomes’. This happens primarily in initiatives to communicate the value of libraries for funding/survival purposes

My full sympathy is with library leaders having to go through this, when there are so many more useful things to be doing. But we are in an age of accessible and powerful data analysis tools. Within minutes you can take library data and other open data sources to gain useful operational insight to actually tackle declining use.

That could be information such as which communities visited the library yesterday, and which didn’t. You can then start understanding why that was the case. Understanding those who aren’t using libraries would be of huge practical use, while the idea of outcomes seems more concerned with what existing users get from the service, when they’re already using it.
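As a rough sketch of that kind of question, the Python below compares the areas a library serves with the areas that yesterday’s visitors came from. Every name and value here is invented, and it assumes a home area is recorded against each visit, which a real system may not do.

```python
# Hypothetical data: every community area the library serves, and the
# home area recorded against each of yesterday's visits (all invented).
served_areas = {"Ashfield", "Brookside", "Castle Hill", "Dunmore"}
visits_yesterday = ["Ashfield", "Castle Hill", "Ashfield"]

# Areas with at least one visit yesterday.
visited = set(visits_yesterday) & served_areas

# Areas the library serves that did not appear at all.
not_visited = served_areas - visited

print(sorted(not_visited))  # prints ['Brookside', 'Dunmore']
```

The second list is the interesting one. It can’t be produced from a total visitor count, and it points at people the service isn’t currently reaching.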

Libraries Hacked
