Speaker: Nattiya Kanhabua L3S Research Center / University of Hannover Concise Preservation by combining Managed Forgetting and Contextualized Remembering Research Talk, May 9, 2014 University of Twente, Enschede

Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Page 1: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Speaker: Nattiya Kanhabua, L3S Research Center / University of Hannover

Research Talk, May 9, 2014, University of Twente, Enschede

Page 2: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

ForgetIT Project Consortium

An interdisciplinary team of experts in:
• Preservation, information management, information extraction
• Multimedia analysis, storage computing, cognitive psychology

Page 3: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Outline

Overview of the ForgetIT project
• Motivation
• Example use cases

Work Package 3: Managed forgetting
• Objective
• Achievements in Year 1

Page 4: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

A computer that forgets? Intentionally? And in the context of preservation?

However, we are facing:
• Dramatic increase in content creation (e.g. digital photos)
• Increasing use of mobile devices with restricted capacity
• Information overload and changing professional and private lives
• Inadvertent forgetting for lack of systematic preservation

Forgetting plays a crucial role in human remembering and life (focus, stress on important information, forgetting of details).

Shouldn't there be something like forgetting in digital memories as well?

ForgetIT

Page 5: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Motivation

Opportunities
• major progress in preservation technology
• maturing information extraction technology
• storage as a service (e.g. clouds)

Needs
• increasing amount of digital content handled over decades
• more or less systematic backup strategies used
• non-paper practices for long-term perspective required

Major Obstacles
• large gap for adoption
• high up-front cost
• no established practices
• lack of understanding of benefit
• reluctance to invest

Page 6: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Vision: Building a Bridge

Opportunities
• major progress in preservation technology
• maturing information extraction technology
• storage as a service (e.g. clouds)

Needs
• increasing amount of digital content handled over decades
• more or less systematic backup strategies used
• non-paper practices for long-term perspective required

ForgetIT (the bridge)
• Enabling smooth transition to preservation
• Creating immediate benefit + reducing effort
• Opening alternatives to “keep it all” and “forgetting by accident”
• Easing interpretation in the long run
• Taking inspiration from and complementing human memory

Major Obstacles
• large gap for adoption
• high up-front cost
• no established practices
• lack of understanding of benefit
• reluctance to invest

Page 7: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Building the Bridge

Managed Forgetting
• inspired by human forgetting
• as opposed to the current “forgetting by accident”

Synergetic Preservation
• couples information management and preservation management

Contextualized Remembering
• bringing back information into active use in a meaningful way

Page 10: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Simple Example: Holidays

Timeline after the trip: +1 month, +1 year, +5-10 years, +20 years

Trip
• Trip to Paris with friends
• Thousands of pictures

+1 month
• High awareness of trip details
• Showing of pictures
• Sorting out redundant pictures
• Sub-grouping and sorting

+1 year
• Life goes on
• Pictures go out of focus
• Creation of a small diverse subset for showing occasionally
• Creation of summary page (e.g. “February 2015, Paris. Team: Me, Mary, Christine, Tom”)
• Addition of context info
• Further reduction of redundancy
• Rest of pictures into archive

+5-10 years
• Changes in life (e.g. marriage)
• Addition/update of context information (e.g. girlfriend → wife)
• Dealing with preservation issues

+20 years
• Revisiting of trip photos
• Re-integration into overall photo collection (link into context)

Page 11: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Managed Forgetting

Inspired by the central role of human forgetting:
• help in identifying and focusing on relevant information
• support preservation content selection
• replace inadvertent forgetting

Based on:
• Careful information value assessment
• Forgetting strategies via policies
• Forgetting options to integrate final manual checking before deletion
• Combination with multi-tier storage solution possible

Managed forgetting ≠ automatic deletion. Instead: a range of forgetting options, e.g.
• resource condensation
• change of indexing & ranking
• reduction of redundancy

[Figure: use of storage tiers with decreasing memory buoyancy]
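The idea of a policy that chooses among forgetting options, rather than deleting automatically, can be sketched as follows. This is only an illustration: the `Resource` type, the option names and all thresholds are hypothetical and are not taken from the ForgetIT project.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    memory_buoyancy: float     # assumed short-term value in [0, 1]
    preservation_value: float  # assumed long-term value in [0, 1]


def forgetting_option(res, mb_low=0.2, pv_high=0.7, pv_low=0.3):
    """Pick one forgetting option for a resource (hypothetical policy).

    Thresholds are illustrative assumptions, not project values.
    Nothing is ever deleted without a manual check.
    """
    if res.memory_buoyancy >= mb_low:
        return "keep_active"  # still in current use
    if res.preservation_value >= pv_high:
        return "move_to_archive_tier"  # out of focus, but worth preserving
    if res.preservation_value >= pv_low:
        return "condense_and_demote_in_ranking"
    return "propose_for_deletion_after_manual_check"


# Usage with made-up values
trip_photo = Resource("paris_042.jpg", memory_buoyancy=0.05, preservation_value=0.9)
print(forgetting_option(trip_photo))  # move_to_archive_tier
```

The point of the sketch is that low memory buoyancy alone never leads to deletion: the preservation value decides between archiving, condensation and a manually checked deletion proposal.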

Page 12: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Contextualized Remembering

Aim:
• Bring back information into active use in a meaningful way even if a lot of time has passed
• Aim for semantic level of preservation

Based on:
• Take into account relevant parts of context when moving to archive
• Increase contextualization of preserved content
• Consider context evolution over time (evolution-aware contextualization)

A. Ceroni, N. K. Tran, N. Kanhabua and C. Niederée, Bridging Temporal Context Gaps using Time-Aware Re-Contextualization, (to appear) SIGIR'2014

Page 13: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Evolution-aware Contextualization & Re-contextualization

[Figure: diagram along a time axis t. The context of interpretation changes from C to C', C'' and C''' (human forgetting, change in focus, structural changes). A document D in the Information System is contextualized and passed by context-aware preservation to the Archival Information System, which holds Pres(D') with Pres(C'). Semantic evolution detection (semantic, structural and terminology evolution) feeds evolution-aware contextualization, giving Pres(D') with Pres(C''). Re-contextualization returns D to the Information System.]

Page 14: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Work Package 3: Managed Forgetting

V. Mayer-Schönberger. Delete: The Virtue of Forgetting in the Digital Age. Princeton University Press, 2009.

Page 15: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Objectives of WP3 and Year 1 Focus

WP3 Objectives
• Conceptual model for managed forgetting
  - Foundations of human-brain inspired managed forgetting
• Development of managed forgetting methods
  - Information value assessment
  - Set of methods for Preserve-or-Forget
  - Policy-driven approach to managed forgetting (Y2)

Focus of Year 1
• Conceptual model for managed forgetting
• Design and implement the core managed forgetting process
• Exploratory research of information value assessment

Page 16: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Role in Preserve-or-Forget Architecture

Page 17: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Achievements in Year 1

Research questions and first ideas for complementing human memory (joint work with WP2, D3.1)
• Episodic memory: reconstruct lifetime memories and support reminiscence
• Working memory: better focus in current information use

Information value assessment (joint work with WP9, D3.2)
• Data model and a computation method based on Semantic Web technologies
• Integration into the PIMO semantic desktop and the Preserve-or-Forget middleware

Exploratory studies (D3.2)
• Analyzing collective memory of public events in Wikipedia
• Analyzing high-impact features for content retention in the Social Web
• Feature selection for efficiency and scalability

Page 18: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Complementing Human Memory: Our First Ideas

Goal: understand how to complement human memory processes

Focus on two types of memories:
• Episodic memory: support reminiscence of long-term autobiographical events
• Working memory: better focus in current information use, e.g. de-cluttering personal information spaces

Two information values: memory buoyancy and preservation value

Page 19: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Information Value Assessment

Memory buoyancy
• Information objects sink down with decreasing importance, usage, etc.
• Short-/mid-term: current interests
• E.g. meeting or travel documents

Preservation value
• Used to decide which information objects will be preserved or archived
• Long-term: need for future use
• E.g. important life events

Subjective metrics
+ usage logs (views, edits, modifications)
+ time, e.g., aging or recency
+ social context, external influences

Objective metrics
+ diversity, coverage, quality

Page 20: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Forgetting in Episodic Memory

Human episodic memory:
• Rapidly forgets details -> “less redundancy”
• Reconstructs from similar events and context
• Relies on common patterns -> “false memory”

Our first ideas:
• Store the details that differ among similar event types and that human memory forgets
• Event-centric organization of digital items can play an important role

Page 21: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Triggering of Memories

Memory bumps or peaks in the forgetting curve

The original memory is recalled or triggered by:
• A physical object (e.g. a printed photo)
• A digital memory system
• Different subsequent events

Our ideas:
• Propagate increased interest in an event to related events
• Consider common things, e.g., same entities or similar event types
• Increase relevance level or use of memory buoyancy

Page 22: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Analyzing Collective Memory in Wikipedia

Identify catalysts for reviving memories

Analyze re-visiting behaviors

• Page views of a large set of events

• Time series analysis

11 Wikipedia categories

• Number of triggering events

• Number of events possibly triggered

Page 23: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Temporal and Spatial Distributions

• Strong focus on more recent events
• Better coverage with increasing popularity
• Most frequent locations depending on event types

Page 24: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Our Approach and Results

• Remembering score as a function of revisiting behavior (e.g., detecting co-peaks in page views)
• Correlate remembering scores with time and location similarities

[Figure: page-view plots for Hurricane Sandy and the 2011 Christchurch earthquake]

Findings:
• Hurricane Sandy triggers the 1991 Perfect Storm, which initially formed around the Canada area and is likewise a high-impact (most destructive and costly) event
• The 2011 Christchurch earthquake triggers recent events in the same region, i.e., the 2010 Canterbury earthquake
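A remembering score based on co-peaks could look roughly like the sketch below. The peak rule (views above a multiple of the series mean) and the score definition are assumptions made for illustration; the slides do not specify the function actually used, and the view counts are made up.

```python
def peaks(series, factor=2.0):
    """Indices where views exceed `factor` times the series mean (assumed peak rule)."""
    mean = sum(series) / len(series)
    return {i for i, views in enumerate(series) if views > factor * mean}


def remembering_score(trigger_views, past_views, factor=2.0):
    """Fraction of the triggering event's peaks that coincide with peaks of the past event."""
    trigger_peaks = peaks(trigger_views, factor)
    if not trigger_peaks:
        return 0.0
    return len(trigger_peaks & peaks(past_views, factor)) / len(trigger_peaks)


# Made-up daily page views, aligned on the same days
recent_event = [10, 12, 11, 300, 250, 20, 15]
related_past_event = [5, 6, 5, 60, 40, 7, 6]
unrelated_past_event = [5, 50, 5, 6, 5, 6, 5]

print(remembering_score(recent_event, related_past_event))    # 1.0
print(remembering_score(recent_event, unrelated_past_event))  # 0.0
```

Scores computed this way for many event pairs could then be correlated with time and location similarity, as the slide describes.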

Page 26: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Memory Buoyancy: Simplified Computation

Compute: MB(D, t)

[Figure: two plots over time: memory buoyancy vs. time, and access logs vs. time with marked time points t1 and t2]

Page 29: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Memory Buoyancy Assessment

Proposed MB assessment framework:

• Initialize MB values of resources using a time-decay forgetting function:

  $mb^{(t)}(r) = \mathit{DecayRate}^{\,|t - t'|}$

• Incrementally update MB using a Random Walk on the resource graph. Example for a resource $r$ with two in-links, from $e_4$ and $e_6$:

  $mb^{(t+1)}(r) = \frac{1}{2}\left(\frac{mb^{(t)}(e_4)}{\mathrm{outlinks}(e_4)} + \frac{mb^{(t)}(e_6)}{\mathrm{outlinks}(e_6)}\right)$

  - Averaged value over the two in-linked resources
  - Less propagation from a resource with two out-links (the denominators on the slide are 1 and 2)

[Figure: example resource graph with nodes e1 to e6. Node labels: Shortcut folder @ desktop, Folder @ computer, Photos @ iPhone, Edfringe photo (2011), Whiskey photo (2012), Whiskey Tour (2009), Photo @ ForgetIT Meeting (2013). Edge labels: contains, hasSamePlace, hasEntity.]
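The two steps of the framework, time-decay initialization followed by propagation over the resource graph, can be sketched as below. This is a simplified single update step under stated assumptions, not the project's random-walk implementation: the function names, the decay rate, the edge list and the buoyancy values are all hypothetical.

```python
def initial_mb(now, last_access, decay_rate=0.9):
    """Time-decay forgetting function: decay_rate ** |now - last_access|."""
    return decay_rate ** abs(now - last_access)


def propagate(mb, edges):
    """One simplified update step on a directed resource graph.

    Each source splits its buoyancy evenly over its out-links; every
    resource with in-links takes the average of the shares it receives.
    Resources without in-links keep their current value.
    """
    out_degree = {}
    for src, _ in edges:
        out_degree[src] = out_degree.get(src, 0) + 1

    incoming = {}
    for src, dst in edges:
        incoming.setdefault(dst, []).append(mb[src] / out_degree[src])

    updated = dict(mb)
    for node, shares in incoming.items():
        updated[node] = sum(shares) / len(shares)
    return updated


# Hypothetical values: e4 has one out-link, e6 has two
mb = {"e3": 0.2, "e4": 0.8, "e5": 0.1, "e6": 0.6}
edges = [("e4", "e5"), ("e6", "e5"), ("e6", "e3")]
print(propagate(mb, edges)["e5"])  # average of 0.8/1 and 0.6/2
```

In this toy graph e5 rises because two related resources were accessed recently, which mirrors the idea of propagating increased interest to related items.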

Page 30: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Content Retention in Social Web Applications

• Social Web apps gain popularity
• Personal Web archives

Study: Identifying memorable content
• 20 participants, 15 male and 5 female
• Participants rated 3,330 posts by relevance for the future

[Figure: “Year in Review”; photo from the Internet]

Page 31: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Learning to Classify Memorable Content

Machine learning techniques
• Support vector machine, Bayesian network, and decision tree (J48)

80 features from categories:
• Content types + metadata
• Social interactions
• Temporal
• Privacy
• Graph

Correlation-based feature selection (CFS)
• Temporal: highest impact features
• Graph: low impact for memorable posts
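For readers unfamiliar with CFS, the standard merit heuristic it optimizes can be sketched as follows: a feature subset scores well when its features correlate with the class but not with each other. The correlation values below are made up, and the sketch covers only the scoring of one subset, not the subset search or the features used in the study.

```python
from math import sqrt


def cfs_merit(feature_class_corr, feature_feature_corr):
    """CFS merit of a feature subset.

    merit = k * r_cf / sqrt(k + k * (k - 1) * r_ff), where
    r_cf is the mean feature-class correlation and
    r_ff is the mean pairwise feature-feature correlation.
    """
    k = len(feature_class_corr)
    r_cf = sum(feature_class_corr) / k
    if feature_feature_corr:
        r_ff = sum(feature_feature_corr) / len(feature_feature_corr)
    else:
        r_ff = 0.0  # a single feature has no pairs
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


# Two equally predictive features: independent vs. fully redundant
print(cfs_merit([0.5, 0.5], [0.0]))  # higher merit
print(cfs_merit([0.5, 0.5], [1.0]))  # no better than one feature alone
```

A redundant second feature adds nothing to the merit, which is why the method tends to produce small subsets.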

Page 32: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Classification Results

• Baseline features (CS): number of likes, comments, and shares
• Baseline: 69% (F-Measure)
• Top 9 features: 79% (F-Measure)
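As a reminder of what the reported metric measures, the sketch below computes the F-measure from confusion counts, assuming the balanced F1 variant (the slides do not say which variant was used). The counts are hypothetical and unrelated to the study's data.

```python
def f_measure(tp, fp, fn):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for the "memorable" class
print(round(f_measure(tp=8, fp=2, fn=2), 2))  # 0.8
```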

Page 34: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Publications

1. M. Georgescu, D. D. Pham, N. Kanhabua, S. Zerr, S. Siersdorfer and W. Nejdl, Temporal Summarization of Event-Related Updates in Wikipedia (demo), Proceedings of the 22nd International World Wide Web Conference (WWW'13), May, 2013.

2. M. Georgescu, N. Kanhabua, D. Krause, W. Nejdl and S. Siersdorfer, Extracting Event-Related Information from Article Updates in Wikipedia, Proceedings of the 35th European conference on Advances in Information Retrieval (ECIR'13), March, 2013.

3. N. Kanhabua and C. Niederée, Preservation and Forgetting: Friends or Foes?, In the First International Workshop on Archiving Community Memories (in conjunction with iPRES'2013), September, 2013.

4. N. Kanhabua, C. Niederée and W. Siberski, Towards Concise Preservation by Managed Forgetting: Research Issues and Case Study, Proceedings of the 10th International Conference on Preservation of Digital Objects (iPRES'2013), September, 2013.

5. K. D. Naini and I.S. Altingovde, Exploiting Result Diversification Methods for Feature Selection in Learning to Rank, Proceedings of the 36th European conference on Advances in Information Retrieval (ECIR'2014), April, 2014.

6. A. Ceroni and M. Fisichella, Towards an Entity-based Automatic Event Validation, Proceedings of the 36th European conference on Advances in Information Retrieval (ECIR'2014), April, 2014.

7. T. N. Nguyen and N. Kanhabua, Leveraging Dynamic Query Subtopics for Time-aware Search Result Diversification, Proceedings of the 36th European conference on Advances in Information Retrieval (ECIR'2014), April, 2014.

8. K. D. Naini, R. Kawase, N. Kanhabua and C. Niederée, Characterizing High-impact Features for Content Retention in Social Web Applications (poster), Proceedings of the 23rd International World Wide Web Conference (WWW'2014), Seoul, Korea, April, 2014.

9. T. A. Tran, M. Georgescu, X. Zhu and N. Kanhabua, Ars longa, vita brevis: Analysing the Duration of Trending Topics in Twitter Using Wikipedia (poster), (to appear) Proceedings of the ACM Web Science 2014 Conference (WebSci'2014), Bloomington, USA, June, 2014.

Page 35: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering

Thank you for your attention!