National public library data

There is agreement on the need to compile open data to give an accurate picture of UK public libraries: their locations, staff, type, financial details, stock, and what goes on in them.

How can we encourage all UK public libraries to publish accurate and timely data to a set standard? Some thoughts here.

Think local

National library statistics provide high-level comparative stats, such as issues per year, per authority. They say little about individual libraries.

The problem with this data is that it has no obvious use. No local authority would accept data without the detail of each library. So why compile such data at a national level?

Although the end goal is to have national data, the starting point needs to be local data in a standard form.

Task 1: Create a schema for library data that is usable at a local level

Make it useful

National stats are simple for several reasons. One is to make the data more manageable: easier to compile and publish on paper, or in a simple spreadsheet. Having borrowing data per library may be OK for a single service, but at a national level? Excel would explode!

Except it wouldn’t, not any more. Computers are powerful and Excel will be fine. The more detail the better. Even monthly data isn’t enough. What does that tell us about how Monday compares to Friday? Nothing.

Task 2: Make the data schema as comprehensive and detailed as possible
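
As a rough illustration of what daily detail makes possible, here is a minimal Python sketch that compares Monday with Friday. The dates and issue counts are invented for the example, and `average_by_weekday` is a hypothetical helper, not part of any existing schema or library system.

```python
from collections import defaultdict
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Invented daily issue counts for one library: (ISO date, items issued).
daily_issues = [
    ("2024-03-04", 120),  # a Monday
    ("2024-03-08", 210),  # a Friday
    ("2024-03-11", 100),  # a Monday
    ("2024-03-15", 190),  # a Friday
]

def average_by_weekday(rows):
    """Return the mean number of issues for each weekday present in rows."""
    grouped = defaultdict(list)
    for iso_day, issues in rows:
        # date.weekday() returns 0 for Monday through 6 for Sunday.
        name = WEEKDAYS[date.fromisoformat(iso_day).weekday()]
        grouped[name].append(issues)
    return {name: sum(counts) / len(counts) for name, counts in grouped.items()}

averages = average_by_weekday(daily_issues)
print(averages)
```

With monthly totals, all four rows would collapse into a single figure of 620 and the weekday pattern would be lost.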

Dogfooding is a term for people using their own product.

In open data, this means testing and using the data. Don’t publish what you know wouldn’t be useful for yourself.

We want published data to be used, so let’s make it useful, with library expertise. Then get early feedback from the public and data analysts as to what else they’d like to see.

Task 3: Trial data exports with volunteer authorities. Run public events to get feedback on the data

Promote difference

Libraries do things differently, and their data will be different. For example, some services may use 100 item types to categorise their stock, others may have 20.

Attempts at standardising data often try to make these uniform. For example, CIPFA provide a set of item types, and each library has to convert its own types to these when reporting its data. That is a waste of time, and ultimately a loss of data, as those differences are one of the interesting parts of the data.

Task 4: Ensure the schema enforces a standard structure, but allows for interesting differences in the content
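
One way to picture this is a check that is strict about structure and relaxed about content. The sketch below is only an assumption about what such a schema might contain: the field names (`library`, `date`, `item_type`, `issues`) and the `validate_row` helper are hypothetical, chosen for illustration.

```python
# Hypothetical required fields and their types. The structure is fixed,
# but item_type is free text, so each service keeps its own categories.
REQUIRED_FIELDS = {"library": str, "date": str, "item_type": str, "issues": int}

def validate_row(row):
    """Return a list of problems; an empty list means the row fits the structure."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in row:
            problems.append(f"missing field: {field}")
        elif not isinstance(row[field], expected_type):
            problems.append(f"{field} should be {expected_type.__name__}")
    return problems

# Two services with very different item types both pass.
row_a = {"library": "Central", "date": "2024-03-04",
         "item_type": "Adult graphic novels", "issues": 12}
row_b = {"library": "Branch", "date": "2024-03-04",
         "item_type": "Fiction", "issues": 40}
# A row with a missing field and a wrongly typed count does not.
row_c = {"library": "Branch", "date": "2024-03-04", "issues": "40"}

for row in (row_a, row_b, row_c):
    print(validate_row(row))
```

Nothing here maps "Adult graphic novels" onto a shared list of item types. The local category survives into the published data.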

If you tell services that you want comparable data, the assumption may be that you want to see which are ‘good’ or ‘bad’.

That’s a goal only served by over-simplified data. People interested in libraries will want to understand differences. We know that some library services have seen increases in borrowing, while others have seen a decrease. That could be for all sorts of reasons: differences in opening hours, socioeconomic factors, location of libraries, shifting trends in high street use, etc.

It’s important to emphasise that more is not better. The numbers don’t matter, the reasons for them do. Some libraries may have an increase in borrowing due to their location. Good for them, but they’re no more important than a library with a decrease in borrowing that serves a particularly deprived area.

Task 5: Commission data analysis to look at key questions using trial authority data. Make that process available to any authority publishing using the data schema.

Tell stories

Data has a bad reputation in public libraries, and the people tasked with doing it will be sick of it.

At one local authority, data reporting meant keying measures into a ‘key performance indicator’ system every month. This included book issues and visits per library. Data from other council services also went into the system. It would be a dull job even for someone who loved data. But for libraries, reporting performance to senior leaders and councillors as a slow downward trend must be soul-destroying.

That’s not how to use data. Data tells us stories about things we are curious about. We don’t need it to prove worth, but it’s useful for improving services, and making data-informed decisions.

A culture of curiosity about libraries should be enough to encourage data publishing. When is the library most busy? Is that the same everywhere? Is it affected by weather? What about local events and footfall in the surrounding town?

Data can inform policies. Ever wondered if fines are effective? If you can’t remove fines, would lowering them actually bring in more money? What about loan periods for different types of items? Are they based on data?

An Open Data Literacy project in Seattle has been working with interns to publish stories about library data, and they are wonderful. See this one about using census data in public libraries or the hipster reading list of books that haven’t been taken out for 10 years.

Task 6: Blog and experiment with the data. Create interesting stories and make them reproducible for those with the same data

Automate it

No-one likes doing the same thing with data all the time. Extracting and publishing datasets should be as automated as possible.

We don’t have that many library systems in public libraries. 6 or so? Capita, Civica, Axiell, SirsiDynix, Infor, Koha. Probably something else as well, and Durham seem to have their own. But that should be 6 pieces of work to automate those datasets, not 200.

Task 7: Pay a reasonable fee to library suppliers, or individuals experienced with each system, to extract the data on an automated basis. Document it and ensure it’s made available to every library service
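
The "6 pieces of work, not 200" idea can be sketched as one small adapter per library system, each mapping that system's export into the shared structure. Everything named below is hypothetical: `system_a` and `system_b` stand in for real products, and the export column names are invented rather than taken from any supplier's actual format.

```python
def from_system_a(record):
    """Map an invented 'system A' export row to the shared structure."""
    return {"library": record["branch_name"],
            "date": record["txn_date"],
            "issues": int(record["loans"])}

def from_system_b(record):
    """Map an invented 'system B' export row to the shared structure."""
    return {"library": record["site"],
            "date": record["day"],
            "issues": int(record["checkouts"])}

# One adapter per library system, reused by every service on that system.
ADAPTERS = {"system_a": from_system_a, "system_b": from_system_b}

def standardise(system, records):
    """Convert a list of export rows using the adapter for the given system."""
    adapter = ADAPTERS[system]
    return [adapter(record) for record in records]

rows_a = standardise("system_a", [
    {"branch_name": "Central", "txn_date": "2024-03-04", "loans": "120"},
])
rows_b = standardise("system_b", [
    {"site": "Branch", "day": "2024-03-04", "checkouts": "40"},
])
print(rows_a + rows_b)
```

Once an adapter exists for a system, any service using that system could run the same export without further development.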

Provide training

Most importantly, library staff need the skills to do data work themselves: to choose what they want to find out about their library, and to get away from boring data reporting that doesn’t serve their needs.

There are some excellent resources for data training. Artefacto publish listings of free resources to help library staff level up and learn new skills, and Library Carpentry lessons and workshops are freely available for anyone to use.

But having standard data would allow for training materials that taught library staff data skills with their actual data. Workshops and training events would give staff the opportunity to use their own data with set examples.

Task 8: Create training materials with the data. Run workshops across the UK to provide hands-on training opportunities for those authorities publishing it
