Book thoughts: Data-driven decisions
Data-Driven Decisions: A Practical Toolkit for Library and Information Professionals, written by Amy Stubbing, is available from Facet Publishing, and all good bookshops (but maybe call ahead). I was kindly given access to a temporary review copy by the publisher, and bought the ebook.
Firstly, I wouldn’t describe this post as a ‘review’. The book is good and worth buying, so do so if you can, or encourage your organisation to. Here are some thoughts on the content, mostly subjective opinion.
Whatever your reason for picking up this tome today, I hope it allows you to love and use data a little bit more than you did before.
Data-Driven Decisions. 1. Introduction
The introduction makes the book accessible to those from a non-data background. You don’t need to be good at maths, or even to enjoy it, to be able to use data. You don’t need to love data, but hopefully you will by the end of the book.
This seems particularly important in the library sector, from my experience. People who are very good at their jobs, but not confident in their data skills, are often given the task of data reporting. This can be a destructive situation, where they are left to figure it out for themselves and are handed fairly dull tasks. That doesn’t create a situation in which they can begin to love data, which is a shame. If you are curious about the operation of your library, and the people who use it, then data is a way to explore those things - loving library data is about loving libraries.
The introduction is also powerful in tying the better use of data to the challenges library services are facing, and advocating for data in tackling these problems.
What we can do though, is prepare, stay ahead of the curve, and make our services as effective, innovative and relevant as possible
Data-Driven Decisions. 1. Introduction
The book is jargon-free, and in a genuine way, rather than by trying to remove or explain jargon. Perhaps that is because the author learned the techniques and methods through experience, rather than formal training, so the language comes naturally.
I dislike the term ‘data-driven’, though I’m sure few people are completely driven by data over other factors. But I’d love for library services to promote ‘data-informed’ decisions as an alternative phrase. It’s a minor distinction, but emphasises that decisions should be made by people, informed by data.
The toolkit takes the form of six steps, each presented as a chapter: Identify, Collect, Map, Analyse, Act, and Review. This is similar to other guides you’ll find and is a good structure for targeted data and performance analysis. The steps are all well covered, with plenty of good examples, in the same consistent conversational style.
There are some aspects of the guidance that feel like bad practice, or at least practice that could lead to bad habits. There is repeated emphasis on ‘save as you go’, and on storing data at its different states to potentially reuse later.
Well, while it’s not wrong that you could pull the data you need again at a later date, when you start using data regularly, redownloading reports and organising data when you have already worked on it is a monumental waste of time.
Data-Driven Decisions. 4. Collect
I’d take an alternative view to this. Creating duplicate copies of underlying source data leads to trouble in terms of data retention and archive policies, as well as being bad for data storage and carbon footprint.
I’m sure it’s not the practice of the author to be inefficient with data, but I’d even go as far as to stress that you should prioritise deleting data over saving it. Delete delete delete. Don’t use the recycle bin - go straight to a hard delete. I’ve seen library services routinely saving data extracts onto shared drives (including personal data), where that data was already held and backed up in the main Library Management System. They’ve then created their own additional backups of that data (despite the shared drives also having backup schedules), which are again backed up by IT processes. The exponential growth of data storage becomes horrific.
On the point about re-doing things you’ve done in the past, this is where I’d suggest saving your steps as you go, but not data. For example, that could be an Excel formula you used which could be saved in a template. If there are 10 steps to convert the data from source then you want to save those steps to make that process repeatable, not the 10 copies of the data at each stage.
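To make that concrete, here is a rough sketch of what ‘save the steps, not the data’ could look like in Python rather than Excel. Everything in it is made up for illustration: the column names (borrower_name, branch, item), the three step functions, and the sample extract are my assumptions, not anything from the book or from a real Library Management System export.

```python
import csv
import io

# Hypothetical cleaning steps. Each one takes the output of the previous
# step and returns a new value, so nothing is written to disk in between.

def drop_personal_data(rows):
    # Remove the (assumed) borrower_name column before doing anything else.
    return [{k: v for k, v in row.items() if k != "borrower_name"} for row in rows]

def normalise_branch(rows):
    # Tidy up inconsistent branch names, e.g. " central " becomes "Central".
    return [{**row, "branch": row["branch"].strip().title()} for row in rows]

def count_loans_by_branch(rows):
    # Final step: reduce the rows to a simple count per branch.
    counts = {}
    for row in rows:
        counts[row["branch"]] = counts.get(row["branch"], 0) + 1
    return counts

# The saved, repeatable part is this ordered list of steps.
STEPS = [drop_personal_data, normalise_branch, count_loans_by_branch]

def run_pipeline(source, steps=STEPS):
    # Read the source extract once, then apply each step in order.
    data = list(csv.DictReader(source))
    for step in steps:
        data = step(data)
    return data

# Made-up extract standing in for a fresh download from the source system.
sample = io.StringIO(
    "borrower_name,branch,item\n"
    "A. Reader, central ,Book 1\n"
    "B. Reader,central,Book 2\n"
    "C. Reader,NORTH,Book 3\n"
)

result = run_pipeline(sample)
print(result)
```

The only thing worth keeping here is the script itself. Next month you would point run_pipeline at a fresh extract and get the same treatment, without any intermediate copies sitting on a shared drive.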
Anyway, what about the overall approach (identify, collect, map, analyse, act, review)? I think it’s reliable the majority of the time, and likely most useful for what people need in general. And in providing many practical examples, the book undoubtedly includes the exact things people reading it will be doing.
I personally don’t know if those linear steps are as useful as they once were. I’d like us to develop a single approach to data systems, management, stewardship, and analysis that looks at what data is being held, minimises it as much as possible, and sets out steps for generating real-time dynamic insight from that data.
Organisations I’ve worked with that are advanced in data analysis think less about identifying what they need to collect, and a lot more about what they hold. Data held is the data you accumulate, primarily through day to day operations. And if you don’t have an operational need to hold that data then you probably shouldn’t have it. Again - delete delete delete.
I appreciate this is likely to be beyond the power of the majority of library staff, for whom the Library Management System is a primary data source, but not one they have a great deal of control over.
It’s all very well for me to go on about alternative approaches, but the book also explores lots of these. Guest authors tackle various subjects: moving from a transactional to a transformational service, collection mapping, user experience and qualitative data, alternative data sources (like social media), building a data culture, and data visualisation.
All of these are great. They keep up the same tone, which is impressive given the variety of authors. I particularly enjoyed the collection mapping section, which showed (among many other things) how important it is to share open data between libraries when assessing a collection.
These additional sections are really interesting, but to some extent are both too brief and too detailed. They go a bit further than an overview, but not far enough for practical use. The digital and social media chapter, for example, covers social media data, sentiment analysis, altmetrics, and web analytics. This is easily enough for a book - you could have a book (and there are probably many) on just web analytics.
Go and get it!