Using the Power of Excel…
To help with cutting a budget
About
Karen Harker
MLS, MPH
Collection Assessment
University of North Texas
About you
Individual: 66%
Group: 34%
Registrations
Registrants (chart values, in the order shown: 34%, 32%, 12%, 10%, 12%)
Categories: Academic-Public, Academic-Private, Public Library, Community/Technical, Other
Where you are
Objectives
Poll on your Excel skills
Take advantage of Excel’s features & functions
Key features & functions to cover:
• VLookup() – for organizing data
• PercentRank.inc() – for ranking items
• Conditional formatting – for visualizing data
• Pareto distributions (80/20) – for evaluating Big Deals
About the Webinar
Intermediate to advanced
Function Wizard
Pauses built-in
Criteria Used
Purpose
• Select resources to cut from budget based on evidence of value.
Objective
• Rank resources from least to most value based on criteria
• Price
• Use
• Cost per use
• Pareto number (80/?)
• Inflation factor
• Subject librarians’ ratings
About the Data
Usage
• 3 year average annual usage
• Highest & best measure of usage for type of resource
Pareto Distribution
• Distribution of usage across titles in a package
• Benchmark: 80% of usage from 20% of titles
• Comparisons of the second number (80/??)
Highest & Best Uses
Individual journals
• Full-text downloads
Ejournal packages
• Full-text downloads
• Distribution of usage across titles
Audiovisual
• Items streamed/full-text downloads
Literature databases
• Abstracts/record views
Full-text databases
• Abstract/record views
• Full-text downloads
Online reference (miscellaneous)
• Abstract/record views
Data Sources
Integrated Library System (ILS)
• Sierra
• Bibliographic information
• Order record number (“o999999”)
• Key Identifier
COUNTER Reports of Usage
• JR1: Full-Text
• DB1: Abstracts
Export
• Excel
• CSV
Functions
Making Excel do the Work
Functions
What are functions?
• Mini programs that return a value
What are they made of?
• Equal sign (=)
• Tag or label
• Inputs or parameters, in parentheses and separated by commas
Example
• =Sum($E$2:E10)
• adds the numbers or the values of cell references and returns the total.
Order matters
• The order of the inputs or parameters matters.
Two Key Functions
VLOOKUP()
• Look something up
PERCENTRANK.INC()
• Distribution
What does VLOOKUP() do?
Master List
• ID
• Title
• Price
• Usage
• Inflation
• Ratings
Resource Type
• ID
• Title
• Price
• Usage
• Inflation
• Ratings
• V is for Vertical
• Looks down the first column of a list for a specific value, then…
• …returns the value of a specific column in that row.
• Allows you to link lists by an ID number
VLOOKUP() Parameters Decoded
=VLOOKUP(A2, 'Master List'!$A$1:$D$305, 2, FALSE)
Lookup_value (A2)
• What are you looking up?
• Number or cell reference
Table_array ('Master List'!$A$1:$D$305)
• Where are you looking it up?
• The range that has the data you need
• File or worksheet name ('Master List'!), then the cell range ($A$1:$D$305)
Col_index_num (2)
• What do you want to return?
• Column number; these are numbers, NOT letters
Range_lookup (FALSE)
• How precise do you want to be?
• TRUE: approximate match is OK
• FALSE: only exact match
Simple Example
• You want to look up an ID (#39) and return the name:
• Lookup value: 39
• Lookup range: A1:C10
• Column index number: 3 (column C, or Full Name)
• Range lookup: FALSE (exact matches only)
• =VLOOKUP(39,A1:C10,3,FALSE) returns "Suroor Fatima"
• Column C is the 3rd column
Challenges
• Challenge questions: what do these return?
• =VLOOKUP(42,A1:C10,2,FALSE)
• Returns: Operations
• =VLOOKUP(35,A1:C10,3,FALSE)
• Returns: Yossi Banai
• =VLOOKUP(38,A1:C10,2,FALSE)&", "&VLOOKUP(38,A1:C10,3,FALSE)
• Returns: Operations, Axel Delgado
• =VLOOKUP(54,A1:C10,3,FALSE)
• Returns: #N/A
IFERROR()
• =IFERROR(some function, what to return if there is an error)
• Embedded functions
• Excel processes innermost functions and works outward
• =IFERROR(VLOOKUP(52,A1:C10,3,FALSE),"N/A")
• If “52” can’t be found, returns “N/A”
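For readers who think in code, the lookup and the error wrapper can be sketched in Python. This is only an illustration of the logic, not how Excel implements it: the function names `vlookup` and `iferror` are stand-ins, the table holds only the cells the slides reveal (`None` marks cells not shown), and only exact matching is covered.

```python
NA = "#N/A"  # stand-in for Excel's #N/A error value

# Rows of (ID, column B, column C / Full Name). Only values shown on the
# slides are filled in; None marks cells the slides do not reveal.
TABLE = [
    (35, None, "Yossi Banai"),
    (38, "Operations", "Axel Delgado"),
    (39, None, "Suroor Fatima"),
    (42, "Operations", None),
]


def vlookup(lookup_value, table, col_index_num):
    """Exact-match lookup (Range_lookup = FALSE).

    Scans the first column for lookup_value, then returns the value in
    the requested column (1-based, like Excel) of the matching row.
    """
    for row in table:
        if row[0] == lookup_value:
            return row[col_index_num - 1]
    return NA


def iferror(value, value_if_error):
    """Return the fallback when the inner result is the error marker."""
    return value_if_error if value == NA else value


print(vlookup(39, TABLE, 3))                   # ID 39, 3rd column
print(iferror(vlookup(52, TABLE, 3), "N/A"))   # ID 52 is not in the table
```

As in Excel, the inner call (`vlookup`) is evaluated first and its result is handed to the outer call (`iferror`).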
Applying VLOOKUP()
Setting up the files
Multiple Files or Worksheets
Master List Columns
A. Order # (ID)
B. Title
C. Renewal Price
D. Type
Resource Type Worksheets
1. Filter Master List on Type
2. Copy & paste Order #
• From Master List
• To worksheet
3. Use VLOOKUP()
• Title
• Renewal Price
4. Add other data
• Usage
• Ratings
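The four set-up steps can be sketched in Python to show how the order number ties the lists together. Everything in this sketch is hypothetical: the order numbers, titles, prices and usage figures are made up, and a dictionary keyed on the order number plays the role of VLOOKUP().

```python
# Hypothetical Master List rows: (order #, title, renewal price, type)
master_list = [
    ("o0000001", "Journal A", 1200.00, "Ejournal"),
    ("o0000002", "Database B", 5400.00, "Database"),
    ("o0000003", "Journal C", 800.00, "Ejournal"),
]

# Index the Master List by its key identifier (the first column).
by_order = {row[0]: row for row in master_list}

# Steps 1-2: filter the Master List on Type and copy the order numbers.
ejournal_ids = [row[0] for row in master_list if row[3] == "Ejournal"]

# Step 3: look up Title and Renewal Price by order number.
ejournal_sheet = [
    {"order": oid, "title": by_order[oid][1], "price": by_order[oid][2]}
    for oid in ejournal_ids
]

# Step 4: add other data (hypothetical 3-year average usage).
usage = {"o0000001": 350, "o0000003": 90}
for record in ejournal_sheet:
    record["usage"] = usage.get(record["order"])

for record in ejournal_sheet:
    print(record)
```

The design point is the same as in the workbook: the order number is copied once, and every other column is pulled in by that key rather than retyped.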
Master List Worksheet
ID in first column
Resource Type
Ejournal
• Individual titles
Database
• Literature
• A&I or Full-text
Package
• Big Deals
Reference
• Online reference source
Resource Type Worksheets
1. Filter Master List on Type
2. Copy Order #
3. Paste in resource type worksheet (E-journal)
Use VLOOKUP() in Resource Type sheet to get Title from Master List
Title = B = Col. 2
Filling in the cells
• Copying & pasting the formula is quick & easy, BUT…
• Use relative cell references for the Lookup_value: A2
• Use absolute cell references for the Table_array: $A$1:$D$305
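To see why the mix of reference styles matters, the short Python sketch below prints what the formula looks like in each row after it is copied down. The helper name `fill_down` is hypothetical; the formula text is the one used earlier in the deck.

```python
def fill_down(first_row, last_row):
    """Build the formula each row would hold after copy & paste."""
    # Absolute reference ($A$1:$D$305): identical in every row.
    table_array = "'Master List'!$A$1:$D$305"
    return [
        # Relative reference (A2, A3, ...): shifts with the row.
        f"=VLOOKUP(A{row}, {table_array}, 2, FALSE)"
        for row in range(first_row, last_row + 1)
    ]


for formula in fill_down(2, 4):
    print(formula)
```

If the Table_array were relative as well, it would slide down one row with each paste and eventually miss the top of the Master List.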
Use VLOOKUP() to Get Price
Price in 4th column
Add Usage Data to Resource Type Worksheet
VLOOKUP() from Master List
VLOOKUP() to Master List
Titles & Price from Master List to Resource Type Worksheets
• Master List → Ejournals, Database, Package
Usage Data & Rankings to Master List from Resource Type Worksheets
• Ejournal, Database, Package → Master List
It’s all relative
Using PercentRank.inc() to Compare Resources
Comparing Resources Against Each Other
Relativity
• How a resource "stacks up" against others of its kind.
Sort by some value
• CPU (cost per use)
• Usage
• Cost
Distributions vary
• Wide
• Inconsistent
Use percentiles
• Understand the distribution
Check it out
Figure: Histogram of Usage (x-axis: Bin, 10 through 1000 and More; y-axis: Frequency, 0 to 12)
Title A has 45 uses
Title B has 155 uses
Where does Title A fall relative to all the other titles? Title B?
PERCENTRANK.INC()
Returns
• The rank of a value as a percentage
• 0 to 1.00 inclusive
Parameters
• Array: Column of interest
• X: The value of interest
• Significance: # of significant digits
Example
• =PercentRank.inc(E:E,E2,2)
PercentRank() of CPU, 2 digits past the decimal
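The calculation behind the percent rank can be sketched in Python. This is a simplified illustration rather than Excel's full behavior: it assumes the value of interest is present in the column (Excel interpolates when it is not), and the usage figures in the example are made up. The rank is the count of values below X divided by the number of values minus one, truncated to the requested digits.

```python
def percentrank_inc(array, x, significance=3):
    """Percent rank of x within array, from 0 to 1 inclusive.

    Simplified sketch: x must be a value that appears in array.
    """
    if len(array) < 2:
        raise ValueError("need at least two values to rank")
    if x not in array:
        raise ValueError("this sketch only ranks values present in array")
    below = sum(1 for value in array if value < x)
    scale = 10 ** significance
    # Integer floor division truncates (does not round) to the
    # requested number of digits past the decimal.
    return (below * scale // (len(array) - 1)) / scale


# Hypothetical usage column containing Title A (45) and Title B (155).
usage = [10, 45, 60, 155, 300]
print(percentrank_inc(usage, 45, 2))
print(percentrank_inc(usage, 155, 2))
```

With these made-up numbers, Title A sits above one of the four other titles and Title B above three of them, which is the "where does it fall" question from the histogram slide.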
Compare the Ranks of Different Measures
50th percentile for usage
80th percentile for CPU
CPU: Lower is better.
Usage: Higher is better.
Directions of Comparisons
Comparisons should be in the same direction
Decide…
• High = good
• Low = bad
…Or
• Low = good
• High = bad
Original Ranks
• Cost: Low is good
• Use: High is good
• CPU: Low is good
• Inflation: Low is good
• Ratings: High is good
Transformed Ranks
• Transformed Cost: High is good
• Use: High is good
• Transformed CPU: High is good
• Transformed Inflation: High is good
• Ratings: High is good
Compare the (Transformed) Ranks
1 minus % Rank for CPU
50th percentile for usage
20th percentile for CPU
Higher is Better
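The "1 minus % rank" transformation can be sketched in Python as follows. The function name `to_high_is_good` and the `LOW_IS_GOOD` set are illustrative names only; the set mirrors the measures the slides list as "low is good".

```python
# Measures where a LOW value is good, per the Original Ranks slide.
LOW_IS_GOOD = {"Cost", "CPU", "Inflation"}


def to_high_is_good(measure, percent_rank):
    """Flip low-is-good ranks so that higher is better for every measure."""
    if measure in LOW_IS_GOOD:
        # Rounded to 2 digits to match the 2-digit percent ranks.
        return round(1 - percent_rank, 2)
    return percent_rank


ranks = {"Use": 0.50, "CPU": 0.80}
transformed = {m: to_high_is_good(m, r) for m, r in ranks.items()}
print(transformed)
```

Once every measure points the same way, the transformed ranks can be compared side by side or averaged without one measure cancelling out another.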
Efficiency of ‘big deals’
Distribution of Usage Across Titles Within a Package
Power Law Distribution
• In statistics, a power law is a functional relationship between two quantities, where one quantity varies as a power of another.
• Source: Wikipedia
Pareto Distribution in Libraries
AKA The 80/20 Rule
• 80% of the usage is from 20% of the collection.
• 80% of the uses are from 20% of the users.
Efficiency of an Ejournal Package
• 80% of usage is from ??% of the titles.
• 20% is a benchmark.
• Higher is better.
1. List titles in package.
2. Gather usage data.
3. Sort by usage Z-A.
Title Name 2011 2012 2013 3yr. Avg.
Package 12, Title 45 3783 4094 4562 4146.33
Package 12, Title 57 1722 1226 1162 1370.00
Package 12, Title 29 1313 1351 1252 1305.33
Package 12, Title 53 1263 1242 1335 1280.00
Package 12, Title 43 1081 1255 1250 1195.33
Package 12, Title 50 1076 986 1364 1142.00
Package 12, Title 32 1572 918 765 1085.00
Package 12, Title 13 949 1156 1010 1038.33
Package 12, Title 20 740 921 1018 893.00
Package 12, Title 58 1002 805 789 865.33
Package 12, Title 31 970 902 680 850.67
Package 12, Title 9 568 675 1148 797.00
Package 12, Title 40 703 731 870 768.00
Package 12, Title 46 599 846 838 761.00
Package 12, Title 24 583 709 844 712.00
Package 12, Title 21 639 590 568 599.00
Package 12, Title 42 585 592 459 545.33
Package 12, Title 36 517 459 491 489.00
Package 12, Title 1 466 469 450 461.67
Calculations for Pareto Distribution
% of Uses
• Cumulative sum ⁄ total uses
• =SUM($E$2:E2)/SUM(E:E)
• Locate the value closest to your benchmark (e.g. 80%)
% of Titles
• Cumulative count ⁄ total # titles
• =COUNT($E$2:E2)/COUNT(E:E)
• Read the value next to the benchmark % uses
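The two running calculations can be sketched in Python to show how the Pareto number falls out of them. The function name `pareto_percent` and the sample usage figures are hypothetical, and the sketch assumes total usage is greater than zero.

```python
def pareto_percent(usage, benchmark=0.80):
    """Share of titles at the row whose cumulative share of uses is
    closest to the benchmark (the ?? in 80/??)."""
    ordered = sorted(usage, reverse=True)  # sort by usage, Z-A
    total_uses = sum(ordered)              # assumed to be > 0
    total_titles = len(ordered)

    running_uses = 0
    smallest_gap = None
    share_of_titles = None
    for count, uses in enumerate(ordered, start=1):
        running_uses += uses
        # % of Uses: cumulative sum / total uses
        gap = abs(running_uses / total_uses - benchmark)
        if smallest_gap is None or gap < smallest_gap:
            smallest_gap = gap
            # % of Titles: cumulative count / total # titles
            share_of_titles = count / total_titles
    return share_of_titles


# Hypothetical package of five titles.
print(pareto_percent([50, 30, 10, 5, 5]))
```

In this made-up package, the two most-used titles supply 80% of the uses, so the package's Pareto number is 40%.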
Pareto Distribution
Title Name 3yr. Avg. % Uses % Titles
Package 12, Title 45 4146.33 17.59% 1.75%
Package 12, Title 57 1370.00 23.40% 3.51%
Package 12, Title 29 1305.33 28.94% 5.26%
Package 12, Title 53 1280.00 34.37% 7.02%
Package 12, Title 43 1195.33 39.44% 8.77%
Package 12, Title 50 1142.00 44.29% 10.53%
Package 12, Title 32 1085.00 48.89% 12.28%
Package 12, Title 13 1038.33 53.29% 14.04%
Package 12, Title 20 893.00 57.08% 15.79%
Package 12, Title 58 865.33 60.75% 17.54%
Package 12, Title 31 850.67 64.36% 19.30%
Package 12, Title 9 797.00 67.74% 21.05%
Package 12, Title 40 768.00 71.00% 22.81%
Package 12, Title 46 761.00 74.23% 24.56%
Package 12, Title 24 712.00 77.25% 26.32%
Package 12, Title 21 599.00 79.79% 28.07%
Package 12, Title 42 545.33 82.11% 29.82%
Package 12, Title 36 489.00 84.18% 31.58%
Package 12, Title 1 461.67 86.14% 33.33%
Title 45 has over 17% of uses.
In this package, 20% of titles account for about 2/3 of total uses.
About 80% of uses come from nearly 30% of titles.
Compare Distributions of All Packages
ORDER # Title Renewal Price # Titles Cost/ Title 3 yr Avg Uses CPU Pareto %
o1044667 Package 13 $ 1,974.97 6 $ 329.16 69 $ 28.62 50%
o4518731 Package 26 $ 3,919.83 8 $ 489.98 1305 $ 3.00 50%
o3099891 Package 268 $ 7,214.26 14 $ 515.30 89 $ 81.06 50%
o3679408 Package 87 $ 4,168.51 41 $ 101.67 1482 $ 2.81 47%
o3462341 Package 17 $ 12,305.61 39 $ 315.53 4817 $ 2.55 45%
o3874291 Package 89 $ 2,383.44 7 $ 340.49 1577 $ 1.51 43%
o1638543 Package 240 $ 22,557.40 355 $ 63.54 13756 $ 1.64 35%
o3906115 Package 25 $ 15,400.53 39 $ 394.89 509 $ 30.26 34%
o4616935 Package 262 $ 217,544.85 599 $ 363.18 63401 $ 3.43 33%
o4203276 Package 28 $ 3,794.65 22 $ 172.48 685 $ 5.54 30%
o2978969 Package 12 $ 64,795.21 59 $ 1,098.22 23585 $ 2.75 28%
o4081791 Package 227 $ 55,241.67 315 $ 175.37 26803 $ 2.06 27%
o3014782 Package 126 $ 137,240.35 5766 $ 23.80 28400 $ 4.83 24%
o2741003 Package 280 $ 288,666.48 1718 $ 168.02 25830 $ 11.18 24%
o1653441 Package 9 $ 38,135.83 12 $ 3,177.99 6870 $ 5.55 23%
o380186x Package 260 $ 12,332.98 37 $ 333.32 6032 $ 2.04 23%
o3768284 Package 239 $ 3,661.66 42 $ 87.18 47 $ 77.91 21%
o4096083 Package 295 $ 485,336.56 1571 $ 308.93 75883 $ 6.40 21%
o3798161 Package 43 $ 55,446.98 437 $ 126.88 9035 $ 6.14 19%
o2612380 Package 177 $ 39,781.00 2062 $ 19.29 230620 $ 0.17 19%
o3933416 Package 20 $ 2,189.88 52 $ 42.11 2171 $ 1.01 17%
o3006785 Package 143 $ 116,987.74 110 $ 1,063.52 4789 $ 24.43 17%
o3244064 Package 301 $ 22,390.00 37 $ 605.14 292 $ 76.68 14%
o1745232 Package 5 $ 5,529.10 1249 $ 4.43 2463 $ 2.24 2%
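The derived columns in this table appear to be simple ratios: Cost/Title looks like the renewal price divided by the number of titles, and CPU like the renewal price divided by the 3-year average uses. The slides do not spell these formulas out, so the Python sketch below is an inference from the table values, with hypothetical function names.

```python
def cost_per_title(renewal_price, n_titles):
    """Renewal price spread across the titles in the package."""
    return round(renewal_price / n_titles, 2)


def cost_per_use(renewal_price, avg_uses):
    """Renewal price divided by the 3-year average annual uses."""
    return round(renewal_price / avg_uses, 2)


# Package 13 from the table: $1,974.97, 6 titles, 69 average uses.
print(cost_per_title(1974.97, 6))
print(cost_per_use(1974.97, 69))
```

Because the table shows average uses rounded to whole numbers, recomputing CPU from the displayed figures may differ by a cent or so for some rows.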
Conditional formatting
Quick way to highlight outliers or visually represent distributions
Ways to Use Conditional Formatting
• Highlight based on a specific value
• Usage measure (e.g. abstracts, FTDs, etc.)
• Greater than .7, .3-.7, and lower than .3
• Visually represent distributions
• A visualization of PercentRank.inc()
• CPU
• Pareto
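The three highlight bands (greater than .7, .3 to .7, lower than .3) amount to a small classification rule, sketched here in Python. The band labels "high", "middle" and "low" and the function name are illustrative; in Excel the same rule would be set up as three conditional formatting rules on the percent rank column.

```python
def rank_band(percent_rank):
    """Assign a percent rank (0 to 1) to one of three highlight bands."""
    if percent_rank > 0.7:
        return "high"
    if percent_rank < 0.3:
        return "low"
    return "middle"  # .3 through .7, boundaries included


for value in (0.85, 0.50, 0.10):
    print(value, rank_band(value))
```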
Conditional Formatting CPU
Set the Conditional Formatting
• Highlight the CPU column
• Select Conditional Formatting->3 color scale
• Red – Yellow – Green (High – Medium – Low)
• Highest is red; Lowest is green
Highest & Lowest 10th Percentile
• Conditional Formatting->Manage Rules->Edit Rule
• Change “Highest” and “Lowest” to “Percentile”.
Changing Rule to Percentile
Change “Lowest” to “Percentile”
Change “Highest” to “Percentile”
Altogether, Now
Master List - Summary columns
• Use VLOOKUP to "grab" the summary data from your Resource Type worksheets
• 3-yr avg uses
• CPU
• Pareto Distribution (Packages only)
Use Conditional Formatting
• Highlight important text
• Visualize distributions
Compiled Master List
Imported from ILS
VLookUp() from Resource Type Worksheets
Master List
Visualizing Use Rank by 3 categories.
CPU by Percentile Rank
Caveats
Use Table Formatting
• Can name your tables
• Automatically copies & pastes formulas
• Easier to add columns
• Adjusts formulas for absolute & relative cell ranges
Don't rename your files
• References to a renamed file will not update
Save all of your files in one folder
• Preserves relationships
Use the same structure in all of the worksheets
• Easier to set up
What (I hope) you’ve learned
• VLookup() – for organizing data
• PercentRank.inc() – for ranking items
• Conditional formatting – for visualizing data
• Pareto distributions (80/20) – for evaluating the efficiency of Big Deals
Questions and Comments
• Libraries are for Use: Librariesareforuse.wordpress.com
• UNT Faculty Profile
• Karen Harker in UNT Scholarly Works
• Charleston Pre-Conference Workshop: Keeping it Real: A Comprehensive and Transparent Evaluation of Electronic Resources
• Cost: $150
• Presenters: Karen R. Harker, Laurel Crawford, Todd Enoch