Team Lightning presents:
LAPL Photo Collection: A case study in Information Retrieval
Presented December 8th, 2009 by Dalena Hunter, Michael Mocciaro, Shelly Ray, Dan Schell, Chris Salvano, Teresa Soleau
Team Lightning: LAPL Photo Collection
LAPL Photo Collection: A case study in Information Retrieval
I. Background: About the photo collection and the system that organizes it.
II. Problem Statement: Three specific information retrieval problems and solutions:
1. Sessions timing out
2. Ranking search results
3. Interface issues
III. Going forward …
Background on the collection: Materials and System
Background: What is the Los Angeles Public Library Photo Collection ?
Consists of: the Herald Examiner photo collection, Shades of LA, and the Security Pacific National Bank Collection.
The Security Pacific National Bank Collection comprises eight sub-collections:
a. Los Angeles Chamber of Commerce Collection;
b. Turn of the Century Los Angeles;
c. Hollywood Citizen News/Valley Times Newspaper Collection;
d. Central Library’s Historical California Photographs;
e. Portrait Collection;
f. Federal Writers Project;
g. Ralph Morris Archives;
h. William Reagh Collection
Background: The collection and the system
Collection is part of LAPL’s online catalog
Items are described using the MARC metadata schema
This results in truncated keyword search results.
Rich indexing and descriptive elements are only available to staff working with the items themselves.
Background: System constraints
IT department is stretched thin and unable to devote time to backend or UI capability issues.
Only one photo archivist working on the project
Processing memory is limited
Results in system crashes (on a weekly basis) and timeouts
This may affect any attempt to add information or functionality to the system.
Problem Statement
What are the impediments to good information retrieval?
Lots of them …
1. Session timeouts
2. Ranking of search results
3. User interface
Problem #1: Session Timeouts
Problem: Session timeouts
Users are interrupted by a message that their session has “timed out”
A major disruption
When did they “time in”?
We suggest: Remove the automated time out feature and allow users to perform more elaborate, linked searches.
Eliminating timeouts is our #1 recommendation
This will enhance information retrieval by:
Allowing users to progress further in their search in the course of a session
Allowing for the addition of greater user interface capabilities, such as a "View Personal List" feature
Acting as a form of search memory, so that users do not have to remember or record their past searches
Problem #2: Ranking search results
Problem: Ranking Search Results
The current ranking system (keyword searching):
Keyword search picks up hits in all descriptive fields of a photo’s metadata record
Favors “Subject” and “Summary,” often to the detriment of good recall and precision
Example 1: “Airport” as keyword search
[Screenshots: results Page 1 and Page 38]
Comparative analysis of “Airport” returns: Records #1 and #379
Example 2: “Raymond Chandler” as keyword search
Comparative analysis of “Raymond Chandler” returns: Records #1 and #6
What’s going on here?
A keyword search favors the “Summary” and “Subject” fields and sorts returned photos in reverse chronological order.
Therefore, a photo with 1 “airport” hit in the “Summary” or “Subject” fields and a photo date will be returned ahead of a photo with 3 “airport” hits that has no photograph date (n.d.).
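The ordering effect described above can be illustrated with a minimal sketch. It is not the catalog's actual code: the record layout (`title`, `year`, `hits`) and the function name `current_order` are assumptions made for illustration, and an undated (n.d.) photo is represented by `year = None`.

```python
# Minimal sketch of the ordering described in the slide: dated photos come
# first (newest year first) and undated photos fall to the end, regardless
# of how many keyword hits they contain. Field names are hypothetical.

def current_order(records):
    # Sort key: undated records (True) sort after dated ones (False);
    # among dated records, a larger year sorts earlier.
    return sorted(records, key=lambda r: (r["year"] is None, -(r["year"] or 0)))

records = [
    {"title": "Los Angeles International Airport", "year": None, "hits": 3},
    {"title": "George W. Bush", "year": 1999, "hits": 1},
]

for record in current_order(records):
    print(record["title"], record["year"], record["hits"])
```

In this sketch the hit count plays no part in the sort key, which is why the record with a single hit and a date is listed first.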
How can Team Lightning bring some rationality to a keyword search?
Behold, the proposed ranking system…
Metadata Element | Metadata Value | Point Value
Click for Images | Direct link to photo | --
Title(s) | Title of photograph | 3
Photographer | Name of photographer | 1
Order Number | Control number for ordering purposes | --
Filing Information | Filing box location / name | 1
Publisher | Date of photograph | --
Description | Item’s physical description | --
Series | Associated Series Name (Name files) | 1
Notes | LAPL control number | --
Summary | Photo description | 1
Subjects | Controlled vocabulary (LCSH) | 2
Other Entries | Other entry names associated with item | 2
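One way to read the point table is as a field-weighted score: a record earns a field's points once if the search term appears anywhere in that field. The sketch below is written under that assumption and is not Team Lightning's implementation. The names `WEIGHTS`, `score`, and `rank` are hypothetical, matching is a simple case-insensitive substring test, and the two records are abbreviated from the “Airport” slides. Because the sketch uses the table's value of 1 for Summary, its total for the airport photo (7) differs from the slide's worked total (8), which awards that Summary 2 points.

```python
# Field weights copied from the proposed point table; fields marked "--"
# in the table carry no points and are omitted.
WEIGHTS = {
    "Title(s)": 3,
    "Photographer": 1,
    "Filing Information": 1,
    "Series": 1,
    "Summary": 1,
    "Subjects": 2,
    "Other Entries": 2,
}

def score(record, query):
    """Add a field's points once if the query occurs anywhere in that field."""
    q = query.lower()
    return sum(
        points
        for field, points in WEIGHTS.items()
        if q in record.get(field, "").lower()
    )

def rank(records, query):
    """Return records ordered by descending score, ignoring photo dates."""
    return sorted(records, key=lambda r: score(r, query), reverse=True)

bush = {
    "Title(s)": "George W. Bush [graphic]",
    "Photographer": "Leonard, Gary",
    "Filing Information": "Portraits-Bush, George W.",
    "Summary": "Closeup view of George W. Bush, Republican presidential "
               "candidate, taken at the Los Angeles International Airport.",
    "Subjects": "Los Angeles International Airport; "
                "Airports--California--Los Angeles",
}
lax = {
    "Title(s)": "Los Angeles International Airport [graphic]",
    "Filing Information": "S-002-348.3 4x5 Transportation-Aviation-Airports-"
                          "L.A. International Airport.",
    "Summary": "Aerial view of Los Angeles International Airport and "
               "surrounding area.",
    "Subjects": "Aerial views; Airports--California--Los Angeles",
}

for record in rank([bush, lax], "airport"):
    print(score(record, "airport"), record["Title(s)"])
```

Under these assumptions the aerial airport photo is listed ahead of the George W. Bush portrait, which is the ordering the slides argue for.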
The “Airport” example using Team Lightning’s Relevancy Ranking: RECORD #1
Elements | Metadata Value | Point Value
Click for Images | Link | --
Title(s) | George W. Bush [graphic] | --
Photographer | Leonard, Gary | --
Filing Information | Portraits-Bush, George W. | --
Publisher | 1999 | --
Description | 1 photograph : b&w | --
Summary | Closeup view of George W. Bush, Republican presidential candidate, taken at the Los Angeles International Airport. Photo dated: September 1, 1999. | 1
Subjects | Bush, George W. (George Walker), 1946-; Los Angeles International Airport; Presidential candidates--United States; Airports--California--Los Angeles; Westchester (Los Angeles, Calif.) | 2
Total Point Value = 3
The “Airport” example using Team Lightning’s Relevancy Ranking: RECORD #379
Elements | Metadata Value | Point Value
Click for Images | Link | --
Title(s) | Los Angeles International Airport [graphic] | 3
Filing Information | S-002-348.3 4x5 Transportation-Aviation-Airports-L.A. International Airport. | 1
Publisher | [n.d.] | --
Description | 1 photograph : b&w | --
Summary | Aerial view of Los Angeles International Airport and surrounding area. | 2
Subjects | Los Angeles International Airport and surrounding area; Aerial views; Airports--California--Los Angeles; Westchester (Los Angeles, Calif.) | 2
Total Point Value = 8
Analysis: This photo should appear before the photo of George W. Bush when doing a keyword search for “Airport”
The “Raymond Chandler” example using TL’s Relevancy Ranking: RECORD #1
Elements | Metadata Value | Point Value
Click for Images | Link | --
Title(s) | Appian Way Apartments | --
Photographer | Solomon, Cliff | --
Filing Information | HE Box Raymond Chandler | 1
Publisher | 1986 | --
Description | 1 photograph : b&w | --
Series | Herald Examiner Collection | --
Summary | Front view of the Appian Way Apartments with windows and trim in need of a paint job. Possibly used for location shooting in Robert Altman's version of "The Long Goodbye". Photo dated: Jul. 18, 1986. | --
Subjects | Marlowe, Philip (Fictitious character); Apartment houses--California--Los Angeles; Motion picture locations | --
Other Entries | Altman, Robert; Chandler, Raymond | 2
Total Point Value = 3
The “Raymond Chandler” example using TL’s Relevancy Ranking: RECORD #6
Elements | Metadata Value | Point Value
Click for Images | Link | --
Title(s) | Raymond Chandler [graphic] | 3
Filing Information | HE Box… | --
Publisher | 1939 | --
Description | 1 photograph : b&w | --
Series | 8389 Chandler, Raymond | 1
Summary | Novelist Raymond Chandler in 1939 | 2
Subjects | Chandler, Raymond, 1888-1959; Authors | 2
Total Point Value = 8
Analysis: Though photographs of filming locations of “The Long Goodbye” may be useful for a user, photos of Raymond Chandler should appear first in a search for “Raymond Chandler”
Problem: Ranking Search Results
Final Analysis:
Incorporating a metadata “point” system can help improve recall and precision (within a keyword search)
Search results should be based on content across all fields, irrespective of reverse chronological order
LAPL won’t fool me twice
Problem #3: Interface issues
User Interface: Revised Main Search Screen
New Search Options
Subject Browse By Letter
Simplified Year Limit Options
User Interface: Revised Advanced Search Screen
Advanced Search Options
Added Year Options
Added Boolean search options
User Interface: LAPL Results Screen
User Interface: Google Life Results Screen
User Interface: LAPL item listing
Very small image on initial record
Detailed summary provided
Can browse by Subject
User Interface: Google Life item listing
Large Picture on initial record
Limited metadata provided
Can browse related images
Can browse by “label”
One click to purchase screen
Future enhancements
Conclusions
Going forward …
Future enhancements we recommend:
Dynamic term suggestion/real-time query expansion
Cross-walking to Dublin Core for inclusion in an aggregate
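As a rough illustration of what such a crosswalk involves, the sketch below maps the catalog's display labels (as they appear in the record slides) onto Dublin Core element names. The mapping itself is an assumption made for illustration, not a mapping proposed in the slides or published by LAPL, and the names `CROSSWALK` and `to_dublin_core` are hypothetical. A production crosswalk would normally work from the underlying MARC fields rather than display labels.

```python
# Hypothetical label-to-element mapping; only the Dublin Core element names
# (title, creator, date, format, relation, description, subject, contributor)
# are standard. Labels with no obvious target are left unmapped.
CROSSWALK = {
    "Title(s)": "title",
    "Photographer": "creator",
    "Publisher": "date",        # the slides show this field holding the photo date
    "Description": "format",    # physical description, e.g. "1 photograph : b&w"
    "Series": "relation",
    "Summary": "description",
    "Subjects": "subject",
    "Other Entries": "contributor",
}

def to_dublin_core(record):
    """Convert a label-keyed record into a dict of Dublin Core element lists."""
    dc = {}
    for label, value in record.items():
        element = CROSSWALK.get(label)
        if element is None:
            continue  # skip labels with no mapping, such as "Click for Images"
        values = value if isinstance(value, list) else [value]
        dc.setdefault(element, []).extend(values)
    return dc

# Abbreviated from the "Raymond Chandler" Record #6 slide.
record_6 = {
    "Click for Images": "Link",
    "Title(s)": "Raymond Chandler [graphic]",
    "Publisher": "1939",
    "Description": "1 photograph : b&w",
    "Summary": "Novelist Raymond Chandler in 1939",
    "Subjects": ["Chandler, Raymond, 1888-1959", "Authors"],
}

for element, values in to_dublin_core(record_6).items():
    print(element, values)
```

Keeping every element as a list reflects the fact that Dublin Core elements are repeatable, which suits multi-valued fields such as Subjects.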
Conclusions
Questions??