Clay Shirky, Fantasy Football, and Using Data to Glean the Future of
Library Collections
Greg Raschke and John Vickery,
North Carolina State University
Charleston Conference
November 3, 2010
Assumptions
Economics are not sustainable
Collections budgets will not grow at the rate of the past 30 years
Unit growth and growth in cost per unit are not sustainable
Need to lower costs of overall system
Lower unit costs
Use data and users to be more precise
Therefore collection practices and strategies must change
This change will be hard – much reason for optimism
Supply-Side Collections
Print-based, unpredictable demand, and legitimate need for just-in-case collections
Led to judging quality by size (as in the ARL rankings); libraries were then held captive to this standard
Contributed to inelastic demand for journals and a combination of speculative and package buying
Use is secondary to size, dollars expended, and other input measures
Credit to David Lewis (http://ulib.iupui.edu/users/dlewis)
Supply-Side Cannot Continue
Demand-Driven Collections
Make information easily, widely, and cheaply available
Collections as drivers of research, teaching, and learning
To make special or unique collections held/managed by the library available to the user community and the world
Demand-Driven – Changing Practice
Tension between time-honored role as custodians of scholarship versus enabling digital environment for scholars
Not just PDA – portfolio of approaches, but certainly more responsive
Utilize new tools and techniques to become advanced analysts
Truly embrace evidence-based decision making
Look at how collections are actually used, not at expressed need
Demand-Driven – More Assumptions
Less tolerance for and less investment in lower use general collections
Resource management based increasingly on use
Modify collecting based on changes in the actual use
Risks of doing nothing – newspapers
Demand-Driven – Assertions
Rewards of adapting – more used and vital than ever
Use based and user driven collecting models will take growing share of budget
Bet on numbers
Bet on good and quick
Put resources into enabling digital environment for scholars and custodian role will come out of that strategy
Why So Much Data?
Data analysis is a key component in solving/managing:
Increasing pressure for accountability
Increasing capability to gather and analyze data
Increasing precision in the way we build collections and expend resources
Advocacy
Changing practice and data analysis at NCSU
Serials Review 2009 – Open, Data-Driven, and Real-Time Analysis
Standardized usage data (where available)
Bibliometrics - publication data and citation patterns (e.g., LJUR)
Impact factor and eigenfactor
User community feedback via interactive, database-driven applications
Weigh/calculate/quantify user feedback
Weigh price against multiple data points
Usage: ((07 usage + 08 usage)/2) + (publications*10) + (citations*5) + (Impact Factor)
Community Feedback: ((Weighted Ranking x % Match) x Total # Rankings) + 0.1 x # of "1s"
Price/feedback value
Price/use
Merge results to filter out top 20% and bottom 20%
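The scoring steps on this slide can be sketched in Python. This is a minimal illustration, not the NCSU implementation. It assumes the first usage term is the mean of the 2007 and 2008 counts, that "% Match" is supplied as a fraction between 0 and 1, that "Price/use" means price divided by the usage score, and that "top 20% and bottom 20%" means the best and worst fifth of titles by that ratio. All names (`Title`, `usage_score`, `feedback_score`, `price_per_use`, `split_by_value`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Title:
    name: str
    price: float
    usage_07: int
    usage_08: int
    publications: int
    citations: int
    impact_factor: float


def usage_score(t: Title) -> float:
    # Assumption: the first term averages the two years of usage.
    return ((t.usage_07 + t.usage_08) / 2
            + t.publications * 10
            + t.citations * 5
            + t.impact_factor)


def feedback_score(weighted_ranking: float, pct_match: float,
                   total_rankings: int, num_ones: int) -> float:
    # ((Weighted Ranking x % Match) x Total # Rankings) + 0.1 x # of "1s"
    return (weighted_ranking * pct_match) * total_rankings + 0.1 * num_ones


def price_per_use(t: Title) -> float:
    # Assumption: price divided by the usage score; lower is better value.
    score = usage_score(t)
    return float("inf") if score == 0 else t.price / score


def split_by_value(titles, fraction=0.2):
    """Return (best, worst) groups of titles ranked by price per use."""
    ranked = sorted(titles, key=price_per_use)
    k = max(1, int(len(ranked) * fraction))
    return ranked[:k], ranked[-k:]
```

Titles in the worst group would be candidates for closer review, and the feedback score can be weighed against price in the same way before results are merged.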
Looking closer – Finding balance
An example - a closer look at print item usage
Traditional ILS reporting tools can make this difficult
Advanced analytical tools can help
What types of questions can we ask?
Should Patron-Driven records not purchased be purged after 2 years?
How does print item usage break down?
Do print items even get used?
If it’s not used after 2 years…
Should PDA records be purged?
Maybe…
We haven’t even hit 50% usage
But what if we take a longer view…
If it’s not used after 2 years…
Things begin to look different
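The two-year versus longer-view comparison can be sketched as the share of items that circulated at least once within N years of being added. This is a minimal sketch, assuming each item record carries the year it was added and the year of its first circulation (or `None` if it never circulated). The function name `share_used_within` and the sample records are hypothetical, not NCSU data.

```python
def share_used_within(items, years):
    """Fraction of items first circulated within `years` of being added.

    `items` is a list of (year_added, year_first_circ) pairs, where
    year_first_circ is None for items that have never circulated.
    """
    if not items:
        return 0.0
    used = sum(
        1 for added, first_circ in items
        if first_circ is not None and first_circ - added <= years
    )
    return used / len(items)


# Hypothetical records: (year added, year of first circulation or None).
items = [(2003, 2003), (2003, 2004), (2003, 2007), (2003, None),
         (2003, 2009), (2003, None), (2003, 2005), (2003, 2010)]

for horizon in (2, 5, 7):
    print(horizon, share_used_within(items, horizon))
```

Running the same calculation at several horizons shows how a purge rule tied to a short window can look very different once later first uses are counted.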
Looking even closer…
How does print item use break down?
Single circ usage is consistently ~14%
Would this change in a PDA only world?
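A breakdown of print item use by number of circulations can be sketched as follows. This is an illustration only: it assumes a list of per-item circulation counts exported from the ILS, and the bucket labels (`"0"`, `"1"`, `"2+"`) and the function name `circ_breakdown` are hypothetical choices.

```python
from collections import Counter


def circ_breakdown(circ_counts):
    """Return the share of items with 0, 1, and 2 or more circulations."""
    def bucket(n):
        if n == 0:
            return "0"
        return "1" if n == 1 else "2+"

    total = len(circ_counts)
    if total == 0:
        return {}
    tally = Counter(bucket(n) for n in circ_counts)
    return {label: tally[label] / total for label in ("0", "1", "2+")}


# Hypothetical per-item circulation counts.
sample = [0, 0, 1, 3, 5, 0, 2, 1, 0, 4]
print(circ_breakdown(sample))
```

Tracking the `"1"` share by acquisition year is one way to check whether single-circulation use stays steady over time.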
Expenditures to University Data
Measurable Uses of the Collection 2009/2010
Full-text journal downloads* 3,672,600
Database use 1,989,972
Print book circulations/renewals 525,430
Digital collections requests 471,403
E-books 149,815
Reserves** 327,267
Total Uses 7,136,487
* Includes use of NC LIVE full-text content
** Includes textbook, print, and e-reserves usage
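The figures in the table above can be checked and expressed as shares of total use with a few lines of Python. The counts are copied from the table; the variable names and output layout are illustrative.

```python
# Counts as reported for 2009/2010.
uses = {
    "Full-text journal downloads": 3_672_600,
    "Database use": 1_989_972,
    "Print book circulations/renewals": 525_430,
    "Digital collections requests": 471_403,
    "E-books": 149_815,
    "Reserves": 327_267,
}

total = sum(uses.values())

# Print each category, largest first, with its share of all measured uses.
for label, count in sorted(uses.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label:<34} {count:>10,} {count / total:6.1%}")
print(f"{'Total Uses':<34} {total:>10,}")
```

The categories sum to the reported total of 7,136,487, and full-text journal downloads account for roughly half of all measured uses.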
Challenges
Have ability to be more precise, more used, and more relevant than ever – need to make the necessary changes
Apps are a risk – silo(ing) networked, web environment – connections where libraries can excel
Not enough data - still lack much of the comprehensive data we need – must improve quickly
Data can punish niche areas, disciplinary variation, and titles without data
Open resources impact ability to control and command data