Bibliographic Information Visualization and Analysis
Chitra Madhwacharyula
Colleen Whitney
Lulu Guo
Background
California Digital Library serves the entire UC system: more than 35 million bibliographic records in the combined catalog
Need to be able to find patterns in this data in order to develop better services and tools
Project Goal
Facilitate exploration of items in bibliographic collections using brushing and linking techniques
Key concepts of interest
  Circulation patterns by general subject area
  Circulation patterns by time
  The relationship of circulation to holdings
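The brushing and linking idea named in the project goal can be sketched as follows. This is a minimal illustration, not the prototype's code: the record fields (`id`, `subject`, `circulations`, `holdings`) and the functions `brush` and `linked_highlight` are hypothetical names chosen for this example.

```python
# Hypothetical item records; the field names are assumptions for illustration.
items = [
    {"id": "i1", "subject": "History", "circulations": 12, "holdings": 3},
    {"id": "i2", "subject": "History", "circulations": 4, "holdings": 1},
    {"id": "i3", "subject": "Science", "circulations": 7, "holdings": 2},
]

def brush(records, predicate):
    """Return the set of item ids selected by a brush in one view."""
    return {r["id"] for r in records if predicate(r)}

def linked_highlight(view_records, selected_ids):
    """Pair each record id in another view with whether it should be highlighted."""
    return [(r["id"], r["id"] in selected_ids) for r in view_records]

# Brush all History items in a subject view ...
selected = brush(items, lambda r: r["subject"] == "History")

# ... and carry the same selection into a view ordered by holdings.
holdings_view = sorted(items, key=lambda r: r["holdings"])
highlighted = linked_highlight(holdings_view, selected)
```

The point of the design is that every view shares one selection set keyed by item id, so a selection made in any view can be reflected in all the others.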
Targeted Audience
California Digital Library staff
Characteristics
Develop tools and programs to serve all UC campuses
Not involved with day-to-day workings of the libraries, but many are ex-librarians
One related project: improving the quality of information retrieval
Related Work
PaperLens
  Overview of collection along defined axes
  Ability to zoom in and see patterns within a subset
  Ability to highlight an item and see details, additional patterns within the collection
Dataset
Sample data (from UCLA)
  320 randomly selected items
  Related circulation data
    Mid-1999 to mid-2005
  All these items circulate (which is only true for about 25% of the entire collection)
  This small set is being used to test the general visualization approach; we will be working with a much larger set eventually
Our Prototype
Visual Components
Bar graphs
  Small multiples segment data by subject
  Bars indicate number of circulation transactions, by time
Scatter plots
  More detail, new dimension within subject/time
Item detail
  Temporal pattern at item level
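The data behind these components can be sketched as a simple aggregation. This is an illustrative sketch only: the transaction tuples `(item_id, subject, year)` and the function names are assumptions, and yearly bins are one possible choice of time unit that the slides do not specify.

```python
from collections import Counter, defaultdict

# Hypothetical circulation transactions: (item_id, subject, year).
transactions = [
    ("i1", "History", 1999),
    ("i1", "History", 2000),
    ("i2", "History", 2000),
    ("i3", "Science", 2001),
    ("i3", "Science", 2001),
]

def counts_by_subject_and_year(rows):
    """Return {subject: Counter({year: count})}, one entry per small-multiple panel."""
    panels = defaultdict(Counter)
    for _item_id, subject, year in rows:
        panels[subject][year] += 1
    return dict(panels)

def item_timeline(rows, item_id):
    """Return per-year transaction counts for one item (the item detail view)."""
    return Counter(year for iid, _subject, year in rows if iid == item_id)

panels = counts_by_subject_and_year(transactions)
timeline = item_timeline(transactions, "i1")
```

Each key of `panels` corresponds to one bar graph in the small multiples, with one bar per year; `timeline` holds the same kind of counts restricted to a single item.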
The Software Debate
Tableau
  Small multiples but no brushing and linking
Spotfire
  Brushing and linking, but no small multiples
Our choice
  JpGraph: PHP graphics package backed by a MySQL database
Demo
Issues
Size of dataset
Complexity of metadata
Data transformations
What sampling technique to use?
Limitations of JpGraph
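On the open sampling question, one candidate is to sample within each subject area so that small subjects are not lost when the dataset grows. The sketch below is only an illustration of that option, not a technique the project has chosen; `stratified_sample`, its parameters, and the record layout are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, fraction, seed=0):
    """Draw roughly `fraction` of the records from each stratum defined by `key`.

    Every non-empty stratum contributes at least one record.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)

    sample = []
    for group in strata.values():
        k = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, k))
    return sample

# Hypothetical records: 80 items in subject "A" and 20 in subject "B".
records = [{"id": n, "subject": "A" if n < 80 else "B"} for n in range(100)]
sample = stratified_sample(records, key=lambda r: r["subject"], fraction=0.1)
```

With a simple random sample, a subject with few items could end up with no items at all; stratifying by subject keeps every small multiple populated, at the cost of needing the subject assignment before sampling.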
Future Work
Expand the dataset
Make interface more intuitive
  Support filtering by subject and time at the outset
  Make timeline presentation of item-level circulation pattern
Solicit and incorporate feedback from target audience
Questions?