Individualized Knowledge Access
David Karger, Lynn Andrea Stein, Mark Ackerman, Ralph Swick
Information Access
A key task in Oxygen: help people manage and retrieve information
Three overlapping projects:
  Haystack: information storage and retrieval; application clients
  Semantic Web: next-generation metadata
  Volt: collaborative access
Presentation Overview
Motivation
  Information access behavior and goals
System Design & Architecture
  Data model
  Interacting data and UI components
Working applications
  Base Haystack
  FrontPage
  Volt
Motivation
Problem Scenario
I try solving problems using my data:
  Information gathered personally
  High quality, easy for me to understand
  Not limited to publicly available content
My organization:
  Personal annotations and metadata
  Choose own subject arrangement
  Optimize for my kind of searching
  Adapts to my needs
Then Turn to a Friend
Leverage
  They organize information for their own use
  Let them find things for me too
Shared vocabulary
  They know me and what I want
Personal expertise
  They know things not in any library
Trust
  Their recommendations are good
Last to Library/web
Answer usually there
But hard to find
Wish: rearrange to suit my needs
Wish: help from my friends in looking
Lessons
Individualized access
  Best tools adapt to individual ways of organizing and seeking data
Individualized knowledge
  People know more than they publish
  That knowledge is useful to them and others
Collaborative use
  Right incentives lead to sharing and joint use
Haystack
Individualized access
  My data collection, organization
  Search tools tuned for me
Collaborate to leverage individual knowledge
  Access unpublished information in others’ haystacks
  Self interest leads to public benefit
Lens to personalize access to the world library
  Rearrange presentation to suit my personal needs
Example
Info on probabilistic models in data mining
My haystack doesn’t know, but “probability” is in lots of email I got from Tommi Jaakkola
Tommi told his haystack that “Bayesian” refers to “probability models”
Tommi has read several papers on Bayesian methods in data mining
Some are by Daphne Koller
I read/liked other work by Koller
My haystack queries “Daphne Koller Bayes” on Yahoo
Tommi’s haystack can rank the results for me…
System Design
Gathering Data
Haystack archives anything
  Web pages browsed, email sent and received, address book, documents written
And any properties, relationships
  Text of object (for text search)
  Author, title, color, citations, quotations, annotations, quality, last usage
Users freely add types, relationships
Semantic Web
Arbitrary objects, connected by named links
No fixed schema
User extensible
Sharable by any application
A new “file system”?
[Figure: example graph. Node labels: Doc, Haystack, D. Karger, Outstanding, HTML. Link labels: title, author, quality, says, type.]
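The data model described above (arbitrary objects connected by named links, with no fixed schema) can be illustrated with a minimal in-memory sketch. The `TripleStore` class, its method names, and the identifier `doc1` are hypothetical and are not taken from Haystack itself. The pairing of link names with values (title with "Haystack", author with "D. Karger", and so on) is an assumption based on the labels in the slide's figure.

```python
class TripleStore:
    """Hypothetical in-memory store of (subject, link, object) triples."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, link, obj):
        # Any link name is accepted: there is no schema to check against.
        self._triples.add((subject, link, obj))

    def match(self, subject=None, link=None, obj=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (link is None or t[1] == link)
            and (obj is None or t[2] == obj)
        )


store = TripleStore()
store.add("doc1", "title", "Haystack")
store.add("doc1", "author", "D. Karger")
store.add("doc1", "type", "HTML")
# User extensible: a new link name can be introduced at any time.
store.add("doc1", "quality", "Outstanding")

outgoing = store.match(subject="doc1")                    # all arrows leaving doc1
by_karger = store.match(link="author", obj="D. Karger")   # lookup by property
```

A set of plain tuples keeps the sketch short. A real store would likely index by subject and by link so that lookups avoid scanning every triple.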
Gathering Data
Active user input
  Interfaces let user add data, note relationships
Mining data from prior data
  Plug-in services opportunistically extract data
Passive observation of user
  Plug-ins to other interfaces record user actions
[Figure: architecture diagram. Components: Other Users, Data Extraction Services, Web Observer Proxy, Triple Store, Mail Observer Proxy, Machine Learning Services, Web Viewer, Volt Viewer/Editor, Spider.]
Sample Applications
Because everything uses the Semantic Web constructions, a variety of application clients can share information
  Web Browser: data viewer
  FrontPage: personalized information filter
  Volt: collaboration tool
Haystack via Web
Web server interface
Basic operations:
  Insert objects
  View objects
  Queries
Haystack via Web
Viewer shows one node and its associated arrows
A service notices that a directory has been archived, so it archives the objects the directory contains (and so on…)
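The "and so on" behavior of the directory service can be sketched as a worklist loop that keeps archiving until nothing new turns up. The function name `expand_directories` and the `listing` dictionary (a stand-in for a real file system or crawl) are hypothetical. The slides do not say how Haystack schedules its services.

```python
def expand_directories(archived, listing):
    """Archive the contents of every archived directory until nothing new is found.

    archived: set of object names already in the store (modified in place).
    listing:  dict mapping a directory name to the names it contains.
    """
    pending = list(archived)
    while pending:
        name = pending.pop()
        for child in listing.get(name, []):
            if child not in archived:
                archived.add(child)
                pending.append(child)  # the child may itself be a directory
    return archived


listing = {
    "papers/": ["papers/haystack.html", "papers/drafts/"],
    "papers/drafts/": ["papers/drafts/volt.txt"],
}
archived = expand_directories({"papers/"}, listing)
```

Checking `child not in archived` before queueing means the loop terminates even if directories refer to each other.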
Haystack via Web
Services detect document type, extract relevant metadata
Output can specialize by type of object
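One way to picture type detection followed by type-specific extraction is a small registry of extractor functions keyed by document type. Everything here is an illustrative assumption: the names `detect_type`, `EXTRACTORS`, and `describe`, the crude tag-sniffing rule, and the choice of the first non-empty line as a plain-text title.

```python
import re


def detect_type(text):
    """Very rough type sniffing: anything containing an <html> tag is HTML."""
    return "HTML" if re.search(r"<html\b", text, re.IGNORECASE) else "text"


def extract_html(text):
    # Pull the contents of the <title> element, if there is one.
    match = re.search(r"<title>(.*?)</title>", text, re.IGNORECASE | re.DOTALL)
    return {"title": match.group(1).strip()} if match else {}


def extract_text(text):
    # Assumption: use the first non-empty line as the title of plain text.
    for line in text.splitlines():
        if line.strip():
            return {"title": line.strip()}
    return {}


# One extractor per detected type; new types are added by registering a function.
EXTRACTORS = {"HTML": extract_html, "text": extract_text}


def describe(text):
    doc_type = detect_type(text)
    metadata = {"type": doc_type}
    metadata.update(EXTRACTORS[doc_type](text))
    return metadata


html_meta = describe("<html><head><title>Haystack</title></head></html>")
text_meta = describe("\nMeeting notes\nDiscuss Volt demo")
```

The resulting dictionaries map naturally onto named links from the document object, and a viewer could pick its output format from the `type` entry.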
Mediation
Haystack can be a lens for viewing data from the rest of the world
  Stored content shows what user knows/likes
  Selectively spider “good” sites
  Filter results coming back
    Compare to objects user has liked in the past
  Can learn over time
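Filtering incoming results by comparing them to objects the user has liked could be sketched with bag-of-words cosine similarity. This technique is swapped in for illustration only, since the slides do not name a specific similarity measure or learning method. The function names and sample strings are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_results(results, liked):
    """Order incoming results by their best similarity to any liked object."""
    liked_vectors = [bag_of_words(text) for text in liked]

    def score(result):
        vector = bag_of_words(result)
        return max((cosine(vector, lv) for lv in liked_vectors), default=0.0)

    return sorted(results, key=score, reverse=True)


liked = ["bayesian networks for data mining", "probabilistic models of text"]
results = [
    "celebrity gossip roundup",
    "bayesian methods in data mining",
    "stock market report",
]
ranked = rank_results(results, liked)
```

Adding each newly liked object to `liked` is the simplest way such a filter could learn over time.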
Example - personalized news service
News Service
Scavenges articles from your favorite news sources
  HTML parsing/extracting services
Over time, learns types of articles that interest you
  Prioritizes those for display
Content provider no longer controls viewing experience
  No more ads
Personalized News Service
Collaborative Access
Want to leverage others’ work in organizing information
No need to “publish” expertise
Exposed automatically, without effort
Self interest helps others
Volt
Volt is about collaboration between people
The Haystack architecture allows easy collaboration among individuals
  Semantic Web references to Haystack objects
  Individuals share parts of their Haystack
  Group spaces and shared notebooks
Collaborators
Those I interact with
  Frequent mail contact
  Frequent visits to their home page
Those with shared content
  And who have same opinions about content
  Collaborative filtering techniques
Referrals
Expertise search engine
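Finding people "with shared content and who have same opinions about content" is the core idea of collaborative filtering. A minimal sketch, assuming each user's opinions are recorded as liked/disliked flags per object, is shown below. The function names, the 0.5 threshold, and the sample users are hypothetical.

```python
def agreement(mine, theirs):
    """Fraction of commonly rated objects on which two users agree.

    Ratings are dicts mapping an object name to True (liked) or False.
    Returns 0.0 when the users have rated nothing in common.
    """
    shared = set(mine) & set(theirs)
    if not shared:
        return 0.0
    return sum(mine[o] == theirs[o] for o in shared) / len(shared)


def likely_collaborators(me, others, threshold=0.5):
    """Names of users whose agreement with me meets the threshold, best first."""
    scored = [(agreement(me, ratings), name) for name, ratings in others.items()]
    return [name for score, name in sorted(scored, reverse=True) if score >= threshold]


me = {"paper_a": True, "paper_b": False, "paper_c": True}
others = {
    "alice": {"paper_a": True, "paper_b": False},  # agrees on both shared items
    "bob": {"paper_a": False, "paper_c": False},   # disagrees on both
    "carol": {"paper_d": True},                    # nothing in common
}
collaborators = likely_collaborators(me, others)
```

Signals such as frequent mail contact or home-page visits could be folded into the same score, but the slides do not say how they are weighted.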
Expertise Beacon
Volt Expertise Beacons
Group spaces and shared notebooks
  Create individual and group profiles
Profiles can be used to find other people
  Allows targeted search
  “Who else is working on this project?”
User controls visibility/privacy
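A targeted search over profiles that respects user-controlled visibility might look like the following sketch. The `Profile` fields, the rule that an empty `visible_to` set means private, and the function name `who_works_on` are all assumptions made for illustration, not a description of Volt's actual design.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    owner: str
    topics: set
    # Assumption: the owner lists who may see the profile; empty means private.
    visible_to: set = field(default_factory=set)


def who_works_on(topic, profiles, asker):
    """Return owners whose profile lists the topic and is visible to the asker."""
    return sorted(
        p.owner for p in profiles
        if topic in p.topics and asker in p.visible_to
    )


profiles = [
    Profile("alice", {"haystack", "semantic web"}, visible_to={"dave", "erin"}),
    Profile("bob", {"haystack"}, visible_to={"erin"}),
    Profile("carol", {"volt"}, visible_to={"dave"}),
]
found = who_works_on("haystack", profiles, asker="dave")
```

Here `bob` also works on the topic but has not made his profile visible to `dave`, so he is left out of the answer.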
Summary
Next-generation information access
Semantic Web provides a language and capabilities for metadata
Haystack teases out individual knowledge, stores it in a coherent fashion, and allows a variety of application clients to leverage individual metadata
Volt turns individual knowledge into a community resource
More Info
http://haystack.lcs.mit.edu/
http://www.w3c.org/2001/[email protected]@[email protected]@w3.org