Surviving the Information Explosion Jaime Teevan, MIT with Christine Alvarado, Mark Ackerman and David Karger

Page 1: Surviving the Information Explosion

Jaime Teevan, MIT
with Christine Alvarado, Mark Ackerman and David Karger

Page 2: Let Me Interview You!

Web:
– What’s the last Web page you visited? How did you get there?
– Have you looked for anything on the Web?

Email:
– What’s the last email you read? What did you do with it?
– Have you gone back to an email you’ve read before?

Files:
– What’s the last file you looked at? How did you get to it?
– Have you looked for a file?

Page 3: Overview

Introduction
Related Work
Study Methodology
Results: Search
Discussion

Page 4: Overview

Introduction
Related Work
Study Methodology
Results: Search
Discussion

Page 5: The Information Explosion

You must extract information from:
– 3 billion Web pages (Google)
– Dozens of incoming emails daily
– Hundreds of files on your personal computer

Page 6: Haystack: Personal Information Storage

[Diagram labels: Email, Web pages, Files, Calendar, Contacts, Haystack]

Page 7: Haystack: Personal Information Storage

What was that paper I read last week about Information Retrieval?

Haystack

Page 8: Haystack: Personal Information Storage

Ah yes! Thank you.

Haystack

Page 9: Supporting Information Interaction

Treat different corpora the same?
Provide access to meta-data?
– Keyword search (XP, advanced search)
– Browse (Hearst)

We don’t really know …

Understand access in the wild!

Page 10: Overview

Introduction
Related Work
– Interaction by corpus
– How people search
Study Methodology
Results: Search
Discussion

Page 11: Interaction By Corpus

Paper documents
– [Malone, 1983], [Whittaker & Hirschberg, 2001]

Files
– [Barreau & Nardi, 1995]

Web
– [Abrams et al., 1998], [Byrne et al., 1999]

Email/Calendar
– [Whittaker & Sidner, 1996], [Bellotti & Smith, 2000]

Page 12: How People Look for Information

Focus: Web

Log analysis
– [Catledge & Pitkow, 1995], [Tauscher & Greenberg, 1997]

Controlled tasks/environment
– [Baldonado & Winograd, 1997], [Spool, 1998]

Situated navigation
– Micronesian islanders [Suchman, 1987]
– Electronic [Marchionini, 1995], [Hearst, 2000]
– Information scent [Chi, Pirolli, Chen & Pitkow, 2001]

Page 13: Overview

Introduction
Related Work
Study Methodology
Results: Search
Discussion

Page 14: Method

Subjects
– 15 MIT CS graduate students (5 women, 10 men)

Setup
– 10 short interviews (~ 5 min.)
– 1 long interview (~ 45 min.)

Topics
– Web, Email, Files

Page 15: Short Interviews

Modified diary study [Palen, 2002]
Randomly interrupted participant
Two question types
– Last email/file/Web page looked at
– Last email/file/Web page looked for

Goal: Discover patterns in searching and browsing

Page 16: Long Interviews

“Guided tour” of subject’s Web space, email, and file system

Goals:
– Discover organizational patterns
– Discover problems in organizational structure
– Relate organization to search/browse behavior

Page 17: Overview

Introduction
Related Work
Study Methodology
Results: Search
– What and how
– Relating what and how
– Individual strategies
Discussion

Page 18: Complex Information Spaces

People had complex spaces
Felt in control

“That’s an interesting question. I think my email is the worst, because I have so much of it. And there are people on the other end who expect me to reply to it. My file system is pretty well organized. I have to go through it every once in a while, every couple of months and just kind of push things into the right folders and delete the old stuff. The Web just works, usually.”

Page 19: What People Look For

Specific Information
– A small fact
– E.g., URL, phone number, appointment time

General Information
– A broad set of information
– E.g., good sneakers to buy, info on cancer

Specific Document
– The actual document
– E.g., a file to print, an email to reply to

Page 20: How People Look For Information

The last thing you looked for on the Web
– Did you use a search engine?

Search is more than just keyword search
– Browse, use bookmarks, type URLs

“I was looking to figure out where Glaris was. When I lived in Switzerland there were only a few reasonable mapping places of the country. And so I had bookmarked [the Switzerland map site].”

Page 21: Strategies for Looking for Information

Teleporting
– Traditional search
– Jump directly to target
– Specify everything up front

Orienteering
– Use local navigation
– [O’Day and Jeffries, 1993]
– Could include keyword search

Page 22: Example: Orienteering

Interviewer: Have you looked for anything on the Web today?
Jim: I had to look for the office number of the Harvard professor.
I: So how did you go about doing that?
J: I went to the homepage of the Math department at Harvard.
[…]
J: I knew that she had a very small Web page saying, “I’m here at Harvard. Here’s my contact information.”
[…]
I: So you went to the Math department, and then what did you do over there?
J: It had a place where you can find people and I went to that page and they had a dropdown list of visiting faculty, and so I went to that link and I looked for her name and there it was.

Page 23: Example: Teleporting

What if Jim had teleported instead?

Could have typed into a search engine: “Connie Monroe, office number”

Page 24: “Keyword Search” and “Browse”

Teleporting (“Keyword Search”)
– Traditional search
– Jump directly to target
– Specify everything up front

Orienteering
– Use local navigation
– [O’Day and Jeffries, 1993]
– Could include keyword search

Page 25: Relating How and What

People orienteer a lot
What people look for is related to how they look

             Specific   General   Document
Orienteer       47         19        41
Teleport        34         23        17

Surprise: Orienteer to specific information
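The table can be turned into shares with a little arithmetic. The sketch below assumes the entries are counts of reported searches (the slide does not say whether they are counts or percentages) and computes, for each kind of target, the fraction of searches that were orienteered. The names `how_by_what` and `orienteer_share` are made up for this illustration.

```python
# Values copied from the slide's table. Treating them as counts is an assumption.
how_by_what = {
    "Orienteer": {"Specific": 47, "General": 19, "Document": 41},
    "Teleport": {"Specific": 34, "General": 23, "Document": 17},
}


def orienteer_share(table):
    """Return, per target type, the fraction of searches that were orienteered."""
    shares = {}
    for kind, orienteered in table["Orienteer"].items():
        teleported = table["Teleport"][kind]
        shares[kind] = orienteered / (orienteered + teleported)
    return shares


shares = orienteer_share(how_by_what)
for kind, share in shares.items():
    print(f"{kind}: {share:.0%} orienteered")
# Specific: 58% orienteered
# General: 45% orienteered
# Document: 71% orienteered
```

Under that assumption, more than half of the searches for a specific fact were orienteered, which is the "surprise" the slide points to.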

Page 26: Why So Much Orienteering?

Your last email search
– What were you looking for?
– Did you know what email contained that information?

People look for the information source
– Specific information searches
– Document searches

Page 27: Looking for the Source: Example

“I was looking to figure out where Glaris was. When I lived in Switzerland there were only a few reasonable mapping places of the country. And so I had bookmarked [the Switzerland map site].”

Page 28: Looking for the Source: Example

Interviewer: Have you looked for anything on the Web today?
Jim: I had to look for the office number of the Harvard professor.
I: So how did you go about doing that?
J: I went to the homepage of the Math department at Harvard.
[…]
J: I knew that she had a very small Web page saying, “I’m here at Harvard. Here’s my contact information.”
[…]
I: So you went to the Math department, and then what did you do over there?
J: It had a place where you can find people and I went to that page and they had a dropdown list of visiting faculty, and so I went to that link and I looked for her name and there it was.

Page 29: Individual Strategies

Search strategies varied by individual
– Pilers: Pile information
– Filers: File information

Where was the last email you found?
– Inbox?
– Elsewhere?

Page 30: File or Pile Email

[Chart: # of searches (axis 0 to 8) against % found in Inbox (axis 0 to 100), with Filer and Piler labeled]

Page 31: How Individuals Search For Files

[Chart: one bar per participant, A through M, on an axis from 0 to 9; legend: Keyword Search (Teleport) and Orienteering (Orienteer); participants grouped as Filers and Pilers]

Page 32: Overview

Introduction
Related Work
Study Methodology
Results
Discussion
– Understanding and applying what we learn
– Future work

Page 33: Understanding Teleporting v. Orienteering

Why was orienteering chosen over teleporting?
– Teleporting doesn’t work
– Teleporting requires too much cognitive effort
– Risk of over-specifying target
– Orienteering gives knowledge of the source

Teleporting as a failure mode
– Can’t associate information with source
– Can’t find the information source

Page 34: Understanding Filers v. Pilers

Why do filers teleport more than pilers?
Irony: Those with good organization don’t take advantage of it

Filers have strictly organized information
– Are used to defining meta-data for their information

Pilers loosely organize their information
– Are used to associative navigating

Page 35: Haystack: Applying What We Learn

Using meta-data: Support orienteering
– Not about having the perfect search interface
– Need ability to prompt

Individualized support
– Pilers/filers
– Learning individual behaviors

Page 36: Future Work: Search

– Previously viewed information
– Causes of failure
– Searches across corpus
– Getting help from others

Page 37: Future Work: Organization

– Consistency of organization across corpus
– Corpora boundaries
– Context used in organization
– Organization’s effect on search

Page 38: Conclusion

Look at search in the wild
Strategies: Teleport/Orienteer
Individual strategies
Future systems should:
– Support orienteering
– Provide individualized support

Page 39: Questions?

To learn more about Haystack:

http://haystack.lcs.mit.edu

Contact us with comments:

- [email protected]

- [email protected]

Page 40: Relating How and Corpus

Email and files: Almost always orienteered
– Easy to associate information with document
Web: Teleported much more often

             Email   Files   Web
Orienteer      59      42     19
Teleport        6      10     64
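A short worked calculation shows how lopsided these numbers are. It is only a sketch: it assumes the table entries are counts of reported searches, which the slide does not state, and the names `how_by_corpus`, `orienteer_rate` and `web_teleport_share` are illustrative.

```python
# Values copied from the slide's table. Treating them as counts is an assumption.
how_by_corpus = {
    "Email": {"Orienteer": 59, "Teleport": 6},
    "Files": {"Orienteer": 42, "Teleport": 10},
    "Web": {"Orienteer": 19, "Teleport": 64},
}

# Fraction of searches within each corpus that were orienteered.
orienteer_rate = {
    corpus: row["Orienteer"] / (row["Orienteer"] + row["Teleport"])
    for corpus, row in how_by_corpus.items()
}

# Fraction of all teleporting, across corpora, that happened on the Web.
total_teleports = sum(row["Teleport"] for row in how_by_corpus.values())
web_teleport_share = how_by_corpus["Web"]["Teleport"] / total_teleports

for corpus, rate in orienteer_rate.items():
    print(f"{corpus}: {rate:.0%} orienteered")
print(f"Web share of all teleporting: {web_teleport_share:.0%}")
# Email: 91% orienteered
# Files: 81% orienteered
# Web: 23% orienteered
# Web share of all teleporting: 80%
```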

Page 41: Relating What and Corpus

             Email   Files   Web
Specific       39       7     33
General        10       7     30
Document        8      35     14

Email searches were primarily for specific information
File searches were primarily for documents
Web searches were more evenly distributed
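The three statements about this table can be checked by finding, for each corpus, the most common kind of target and its share of that corpus's searches. The sketch assumes the entries are counts of reported searches, which the slide does not state; `what_by_corpus` and `dominant_type` are hypothetical names.

```python
# Values copied from the slide's table. Treating them as counts is an assumption.
what_by_corpus = {
    "Email": {"Specific": 39, "General": 10, "Document": 8},
    "Files": {"Specific": 7, "General": 7, "Document": 35},
    "Web": {"Specific": 33, "General": 30, "Document": 14},
}


def dominant_type(row):
    """Return the most frequent target type in a corpus and its share of the total."""
    total = sum(row.values())
    kind = max(row, key=row.get)
    return kind, row[kind] / total


summary = {corpus: dominant_type(row) for corpus, row in what_by_corpus.items()}
for corpus, (kind, share) in summary.items():
    print(f"{corpus}: mostly {kind} ({share:.0%})")
# Email: mostly Specific (68%)
# Files: mostly Document (71%)
# Web: mostly Specific (43%)
```

On these numbers one type accounts for roughly two thirds or more of email and file searches, while no type reaches half of Web searches.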