Intelligent Information and Knowledge Infrastructures
Daniel Olmedilla, L3S Research Center & Hannover University
Intelligent Access to Digital Heritage Conference, 19 Oct. 2007, Tallinn, Estonia
19 Oct. 2007 2Daniel Olmedilla
Outline
L3S Background
Introduction & Motivation
Personalized Search & Ranking
Privacy & Access Control
EU Projects Summary
L3S Background: Mission and Focus
L3S research focuses on innovative and cutting-edge methods and technologies for three key enablers for the European Information Society:
- Knowledge
- Information
- Learning
L3S projects focus on digital resources and their technological underpinnings:
- Digital libraries and Search
- Semantic Web and Knowledge Sharing
- Distributed Systems, Networks and Grids
and on the use of these resources in eLearning and eScience contexts
L3S Background: Area “Semantic Web & Digital Libraries”
- provide personalized access to distributed information resources and advanced search and recommendation functionalities
- provide enhanced search on the desktop, in companies, on the Web
- enhance traditional libraries with digital content and personalized library services
Introduction & Motivation: Conference Theme
Intelligent Access to Digital Heritage
Introduction & Motivation: UNESCO E-Heritage (I)
Digital heritage comprises resources of human knowledge or expression, whether cultural, educational, scientific and administrative, or embracing technical, legal, medical and other kinds of information
Digital materials include texts, databases, still and moving images, audio, graphics, software, and web pages, among a wide and growing range of formats
[ http://portal.unesco.org/ci/en/ev.php-URL_ID=1539&URL_DO=DO_TOPIC&URL_SECTION=201.html, http://portal.unesco.org/ci/en/files/13367/10700115911Charter_en.pdf/Charter_en.pdf ]
Introduction & Motivation: UNESCO E-Heritage (II)
Born-digital heritage available on-line, including electronic journals, World Wide Web pages or on-line databases, is now part of the world’s cultural heritage
Using computers and related tools, humans are creating and sharing digital resources - information, creative expression, ideas, and knowledge encoded for computer processing - that they value and want to share with others over time as well as across space
Introduction & Motivation: UNESCO E-Heritage (& III)
The purpose of preserving the digital heritage is to ensure that it remains accessible to the public. (…) At the same time, sensitive and personal information should be protected from any form of intrusion.
Introduction & Motivation: Focus of this talk
Intelligent Access to Digital Heritage
• Personalized search and ranking of media
• Access to sensitive information and resources
Introduction & Motivation: Information growth
In today's society, individuals and organisations are, on one hand, confronted with an ever growing load of information and content and, on the other, with increasing demands for knowledge and skills.
To cope with this, we need to link content, knowledge and learning, making content and knowledge more accessible, interactive and usable over time by humans and machines alike.
Introduction & Motivation: Not only textual resources
Introduction & Motivation: The 1 TB life (Gordon Bell)
1 TB gives you 65+ years of:
- 100 email messages a day (5 KB each)
- 100 web pages a day (50 KB each)
- 5 scanned pages a day (100 KB each)
- 1 book every 10 days (1 MB each)
- 10 photos per day (400 KB JPEG each)
- 8 hours per day of sound, e.g. telephone, voice annotations, and meeting recordings (8 Kb/s)
- 1 new music CD every 10 days (45 min each at 128 Kb/s)
It will take you 10 years to fill up your 160 GB drive
Want video? Buy more cheap drives (1 TB/year lets you record 4 hours/day of 1.5 Mb/s video)
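The figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 KB = 1000 bytes, 1 Kb/s = 1000 bits per second) and a 365-day year. The slide does not state these conventions, so the result only needs to land in the same ballpark as the quoted numbers.

```python
# Back-of-the-envelope check of the "1 TB life" figures.
# Assumption: decimal units and a 365-day year (not stated on the slide).
KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12
SECONDS_PER_HOUR = 3600


def stream_bytes(bits_per_second, seconds):
    """Bytes produced by a constant-bitrate stream."""
    return bits_per_second / 8 * seconds


daily_bytes = {
    "email": 100 * 5 * KB,
    "web pages": 100 * 50 * KB,
    "scanned pages": 5 * 100 * KB,
    "books": 1 * MB / 10,                                  # one book every 10 days
    "photos": 10 * 400 * KB,
    "sound": stream_bytes(8_000, 8 * SECONDS_PER_HOUR),
    "music CDs": stream_bytes(128_000, 45 * 60) / 10,      # one CD every 10 days
}

bytes_per_year = sum(daily_bytes.values()) * 365
years_per_tb = TB / bytes_per_year
years_per_160gb = 160 * GB / bytes_per_year
video_bytes_per_year = stream_bytes(1_500_000, 4 * SECONDS_PER_HOUR) * 365

print(f"{years_per_tb:.1f} years per TB")
print(f"{years_per_160gb:.1f} years per 160 GB drive")
print(f"{video_bytes_per_year / TB:.2f} TB of video per year")
```

Under these assumptions the non-video load is about 43 MB per day, which gives roughly 63 years per terabyte, about 10 years for a 160 GB drive, and just under 1 TB per year for the video stream. Binary units push the first figure closer to 70 years, so the slide's "65+" sits between the two conventions.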
Introduction & Motivation: Main Objectives
1. Search for textual and audiovisual content
2. Rank results according to relevance
3. Personalize such search and ranking
   - Not all users are the same
   - Find what they are interested in
4. While protecting private information and resources
Personalized Search & Ranking: Representing context by SW metadata
Metadata for resources can be created by appropriate metadata generators
Ontologies specify context metadata for, e.g.:
- Emails
- Files
- Web pages
- Publications
Metadata have to be application-independent!
Store metadata as RDF
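As an illustration of application-independent metadata, the sketch below turns a small email record into RDF triples serialized as N-Triples lines. The namespace `http://example.org/desktop#`, the property names and the helper functions are hypothetical; the slides do not name a concrete ontology or generator.

```python
# Minimal metadata generator sketch: email fields -> N-Triples lines.
# Assumption: EX is a made-up ontology namespace used only for illustration.
EX = "http://example.org/desktop#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def literal(value):
    """Serialize a plain string literal, escaping backslash, quote and newline."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def email_to_ntriples(uri, email):
    """Return one N-Triples line per known field of the email."""
    lines = [f"<{uri}> <{RDF_TYPE}> <{EX}Email> ."]
    for key in ("sender", "subject", "date"):
        if key in email:
            lines.append(f"<{uri}> <{EX}{key}> {literal(email[key])} .")
    return lines


if __name__ == "__main__":
    mail = {"sender": "alice@example.org", "subject": "My paper", "date": "2006-03-25"}
    for line in email_to_ntriples("urn:example:mail-1", mail):
        print(line)
```

Because the output is plain RDF, any application that understands the ontology can consume it without knowing which mail client produced the original message.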
Personalized Search & Ranking: Personalization in the SW
- gather online information, integrate heterogeneous sources, syndicate according to user’s preferences
- embed resources with a personalized context
- enable users to choose which kind of personalized guidance in what combination they appreciate as support (plug & learn)
Realization:
- semi-automated extraction of information from heterogeneous sources
- re-usable personalization algorithms reason about distributed data sources (user data, course descriptions, ontologies, etc.)
- personalization rules reason about resources, e.g. to make recommendations
[ Baumgartner, Henze, Herzog. The Personal Publication Reader: Illustrating Web Data Extraction, Personalization and Reasoning for the Semantic Web. ESWC’05 ]
Personalized Search & Ranking: User Knowledge and Interests
Competence: “an effective performance within a domain / context at different levels of proficiency”
Can be explicitly defined by the user or inferred automatically
[Diagram elements: Competence, Proficiency Level, Context, Competency]
Personalized Search & Ranking: Expanding User Queries with Local Context
1. User-related documents (desktop documents) containing the query
2. Score and extract keywords
3. Top query-dependent, user-biased keywords
4. Extract query expansion or re-ranking terms
[ Chirita, Firan, Nejdl. Summarizing local context to personalize global web search. CIKM 2006 ]
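The pipeline on this slide can be sketched as follows. Scoring here is plain term frequency over the desktop documents that contain the query, which is an assumption made for brevity; the cited paper evaluates its own scoring methods, and the stopword list and function names below are hypothetical.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stopword list, not a real linguistic resource.
STOPWORDS = {"the", "a", "an", "and", "of", "for", "in", "on", "to", "is", "with"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def expansion_terms(query, desktop_docs, k=3):
    """Return the k most frequent non-query terms from documents containing the query."""
    query_terms = set(tokenize(query))
    scores = Counter()
    for doc in desktop_docs:
        tokens = tokenize(doc)
        if not query_terms <= set(tokens):
            continue  # keep only user documents that contain every query term
        for token in tokens:
            if token not in query_terms and token not in STOPWORDS:
                scores[token] += 1
    return [term for term, _ in scores.most_common(k)]


if __name__ == "__main__":
    docs = [
        "jaguar engine repair manual for the jaguar engine",
        "notes on jaguar engine tuning",
        "recipe for apple pie",
    ]
    print(expansion_terms("jaguar", docs, k=1))
```

For a user whose desktop is full of car documents, the ambiguous query "jaguar" is expanded with "engine", which biases a web search towards the car rather than the animal.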
Personalized Search & Ranking: Data heterogeneity
Characteristics
- A lot of text (unstructured information)
- A lot of structures, e.g. title, author, creation-date, …
- Heterogeneity in structure
  - Different holders (applications) use different schemas
  - In nature, the structure of a domain is too complex for us to give it a clear and certain definition
Classical Data Integration
- Transform data into a clear and uniform structure before we use it
- Intensive human intervention – very laborious and not scalable
Malleable Schema (X. Dong & A. Halevy ’05)
- Allow overlapping and vague elements to be defined in a single schema
Personalized Search & Ranking: Malleable Schemas: Example Data
[Figure: example data graph with Person, Doc and email nodes. Edge labels: author, writer, sender, first name, sur name, name, title, subject, body, contents, attachment, date, Isa book, Isa paper. Sample values: “xml search”, “Jack”, “Pan”, “John Gary”, “My paper”, “Desktop Search”, “25.03.2006”, “Xml is the standard for data exchange …”, “Dear Sergey, Please find attached the file …”, “We have many data …”, True, False]
Personalized Search & Ranking: Querying Malleable Schemas
For example, a user issues the query:
Q1: Select Person Where first_name Contains “Philip”
To obtain the complete results, we should relax the query to:
Q2: Select Person Where first_name Contains “Philip” Or name Contains “Philip”
A query has to be relaxed to related schema elements
But how to discover the correlation between schema elements?
[Figure: two Person schemas, one with the attributes first name and sur name, the other with a single attribute name]
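The difference between Q1 and Q2 can be made concrete with a small sketch. The records, the `CORRELATED` table and the function names are hypothetical, and the correlations are simply assumed to be known here; how to discover them is the open question on the slide.

```python
# Hypothetical records: two applications describe people with different schemas.
PERSONS = [
    {"first_name": "Philip", "sur_name": "Miller"},
    {"name": "Philip Jones"},
    {"name": "Mary Smith"},
]

# Assumption: attribute correlations are given in advance for this example.
CORRELATED = {"first_name": {"name"}, "sur_name": {"name"}}


def relax(attribute, correlated=CORRELATED):
    """Expand one attribute into itself plus all correlated attributes."""
    return [attribute] + sorted(correlated.get(attribute, ()))


def select_contains(records, attributes, needle):
    """Return records where any of the given attributes contains the needle."""
    return [r for r in records if any(needle in r.get(a, "") for a in attributes)]


q1 = select_contains(PERSONS, ["first_name"], "Philip")         # strict query
q2 = select_contains(PERSONS, relax("first_name"), "Philip")    # relaxed query

if __name__ == "__main__":
    print(len(q1), "result(s) for Q1;", len(q2), "result(s) for Q2")
```

The strict query misses the record stored under the single `name` attribute, while the relaxed query finds both people called Philip.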
Personalized Search & Ranking: Discover Schema Correlations (I)
Solution: find duplicates which use different attributes.
Observation:
1. more duplicates – better schema correlation discovery
2. more accurate schema correlations – better duplicate detection
Solution: Let schema correlation discovery and duplicate detection reinforce each other to achieve improved results
Personalized Search & Ranking: Discover Schema Correlations (& II)
title subject author writer Pub-date Rec-date
E1 XML Daniel Jan 1999
E2 XML Daniel Dec 2003
E3 DB Ullman Jul 1994
E4 DB Ullman Nov 2001
E5 AI Stuart Nov 2001
E6 Logic Stuart Nov 2001
duplicates: {E1, E2}, {E3, E4}, {E5, E6}
attribute matches: {title, subject}, {author, writer}, {pub-date, rec-date}
[ Xuan Zhou, Julien Gaugaz, Wolf-Tilo Balke, Wolfgang Nejdl. Query Relaxation Using Malleable Schema. SIGMOD’07 ]
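The mutual reinforcement between duplicate detection and schema correlation discovery can be illustrated with a toy fixpoint loop. This is not the method of the cited paper: the seeding rule (identical value sets), the threshold of two agreeing attributes, and the records themselves (including the `venue`, `conference` and `year` attributes) are all assumptions chosen so that each round unlocks the next.

```python
from itertools import combinations

# Hypothetical records, loosely modelled on the slide's table.
RECORDS = {
    "E1": {"title": "XML", "author": "Daniel"},
    "E2": {"subject": "XML", "writer": "Daniel"},
    "E3": {"title": "DB", "author": "Ullman", "venue": "VLDB"},
    "E4": {"subject": "DB", "writer": "Ullman", "conference": "VLDB", "year": "2001"},
    "E5": {"title": "AI", "venue": "IJCAI"},
    "E6": {"subject": "AI", "conference": "IJCAI", "year": "2005"},
}


def agreeing_pairs(r1, r2):
    """Pairs of differently named attributes that hold the same value."""
    return {frozenset((a, b)) for a, v in r1.items() for b, w in r2.items()
            if v == w and a != b}


def is_duplicate(r1, r2, matches, min_agree=2):
    """Duplicates agree on at least min_agree identical or matched attributes."""
    agree = sum(1 for a, v in r1.items() for b, w in r2.items()
                if v == w and (a == b or frozenset((a, b)) in matches))
    return agree >= min_agree


def discover(records, min_agree=2):
    """Alternate both steps until neither finds anything new."""
    ids = sorted(records)
    # Seed (assumption): records with identical value sets are duplicates.
    duplicates = {frozenset((x, y)) for x, y in combinations(ids, 2)
                  if set(records[x].values()) == set(records[y].values())}
    matches = set()
    while True:
        new_matches = set(matches)
        for pair in duplicates:                      # duplicates -> correlations
            x, y = sorted(pair)
            new_matches |= agreeing_pairs(records[x], records[y])
        new_duplicates = set(duplicates)
        for x, y in combinations(ids, 2):            # correlations -> duplicates
            if is_duplicate(records[x], records[y], new_matches, min_agree):
                new_duplicates.add(frozenset((x, y)))
        if new_matches == matches and new_duplicates == duplicates:
            return duplicates, matches
        matches, duplicates = new_matches, new_duplicates


if __name__ == "__main__":
    dups, attr_matches = discover(RECORDS)
    print(sorted(sorted(d) for d in dups))
    print(sorted(sorted(m) for m in attr_matches))
```

In this toy data only {E1, E2} is found by the seed. Its attribute matches expose {E3, E4}, which in turn yields the venue/conference match needed to detect {E5, E6}.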
Privacy & Access Control: Access Control in Open Systems (I)
Privacy & Access Control: Access Control in Open Systems (& II)
Assumption: I already know you, i.e. you have a local account!
Not a member?
Privacy & Access Control: Policy Examples
Give customers younger than 26 a 20% discount
Up to 15% of network bandwidth can be reserved by paying with an accepted credit card
Customers can rent a car if they are 18 or older, and exhibit a driving license and a valid credit card
[ Bonatti, Olmedilla. Driving and Monitoring Provisional Trust Negotiation with Metapolicies. IEEE Policies for Distributed Systems and Networks, 2005 ]
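The three example policies above can be written as simple predicates over a customer's properties and credentials. This is only a sketch: the `Customer` class and the credential names are hypothetical, and paying by credit card is modelled as holding a credential rather than as the provisional action a real policy language would express.

```python
from dataclasses import dataclass, field


@dataclass
class Customer:
    age: int
    credentials: set = field(default_factory=set)  # hypothetical credential names


def discount(customer):
    """Give customers younger than 26 a 20% discount."""
    return 0.20 if customer.age < 26 else 0.0


def may_reserve_bandwidth(customer, requested_share):
    """Up to 15% of bandwidth, provided an accepted credit card is presented."""
    return requested_share <= 0.15 and "accepted_credit_card" in customer.credentials


def may_rent_car(customer):
    """18 or older, with a driving license and a valid credit card."""
    required = {"driving_license", "valid_credit_card"}
    return customer.age >= 18 and required <= customer.credentials


if __name__ == "__main__":
    alice = Customer(age=25, credentials={"driving_license", "valid_credit_card"})
    print(discount(alice), may_rent_car(alice), may_reserve_bandwidth(alice, 0.10))
```

Note that none of these rules asks who the customer is, only which properties and credentials they can show, which is what makes such policies usable in open systems.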
Privacy & Access Control: Use Credentials
Privacy & Access Control: Negotiations
Step 1: Alice requests a service from Amazon
Step 2: Amazon discloses its policy for the service
Step 3: Alice discloses her policy for VISA
Step 4: Amazon discloses its BBB credential
Step 5: Alice discloses her VISA card credential
Step 6: Amazon grants access to the service
[Diagram labels: Alice, Bob, Service]
[Winsborough, Seamons, Jones. Automated trust negotiation. DARPA Information Survivability Conference and Exposition, 2000 ]
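The six steps can be simulated with a small loop in which each party releases a credential as soon as the other side has shown what that credential's release policy requires. The sketch uses an eager strategy that discloses everything that becomes unlockable; the credential names and the `negotiate` function are hypothetical and far simpler than the negotiation strategies in the cited work.

```python
def negotiate(service_policy, client_creds, server_creds):
    """Exchange credentials until the service policy is met or no progress is made.

    client_creds and server_creds map each credential to the set of credentials
    the other party must have disclosed before it may be released.
    """
    client_shown, server_shown, log = set(), set(), []
    progress = True
    while progress and not service_policy <= client_shown:
        progress = False
        for cred, needs in server_creds.items():
            if cred not in server_shown and needs <= client_shown:
                server_shown.add(cred)
                log.append(("server", cred))
                progress = True
        for cred, needs in client_creds.items():
            if cred not in client_shown and needs <= server_shown:
                client_shown.add(cred)
                log.append(("client", cred))
                progress = True
    return service_policy <= client_shown, log


if __name__ == "__main__":
    service_policy = {"visa_card"}                  # step 2: policy for the service
    alice = {"visa_card": {"bbb_membership"}}       # step 3: Alice's policy for VISA
    amazon = {"bbb_membership": set()}              # freely disclosable
    granted, log = negotiate(service_policy, alice, amazon)
    print(granted, log)
```

With these inputs the server first shows its BBB credential, Alice then releases her VISA card, and access is granted. If the server holds no BBB credential, the loop stops without progress and access is denied.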
Privacy & Access Control: User awareness and Control
- Explain policies and system decisions
- Make rules & reasoning intelligible to the common user
Use natural language?
“Academic users can download the files in folder historical_data whenever their creation date precedes 1942”
- Suitably restricted to avoid ambiguities
- Fortunately, users spontaneously formulate rules
Privacy & Access Control: Cooperativeness & Verbalization
Suppose Alice's request is rejected
She may want to ask questions like:
- Why didn't you accept my credit card?
Other possible queries
- How-to queries
- What-if queries
  - Would I get the special discount on financial products X if I were locally employed?
[ Bonatti, Olmedilla, Peer. Advanced policy explanations on the web. ECAI 2006 ]
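Why-not and what-if queries can be sketched by attaching a readable description to every condition of a rule. The policy, the fact names and both functions below are hypothetical illustrations and do not reflect the explanation facility described in the cited paper.

```python
# Hypothetical policy: each goal lists (explanation text, condition) pairs.
POLICY = {
    "special_discount": [
        ("you are locally employed", lambda facts: facts.get("locally_employed", False)),
        ("your credit card is accepted", lambda facts: facts.get("card_accepted", False)),
    ],
}


def why_not(goal, facts):
    """Answer a why-not query: list the conditions that are not satisfied."""
    return [text for text, holds in POLICY[goal] if not holds(facts)]


def what_if(goal, facts, **changes):
    """Answer a what-if query: re-evaluate the goal under hypothetical facts."""
    return not why_not(goal, {**facts, **changes})


if __name__ == "__main__":
    alice = {"locally_employed": False, "card_accepted": True}
    print("Missing:", why_not("special_discount", alice))
    print("If locally employed:", what_if("special_discount", alice, locally_employed=True))
```

The what-if query never modifies Alice's real facts; it evaluates the same rule against a copy with the hypothetical change applied.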
Privacy & Access Control: Sample Screenshot (I)
Privacy & Access Control: Sample Screenshot (& II)
EU Projects Summary: EU IP Nepomuk: Social Semantic Desktop
- Desktop: Help individuals in managing information on their PC
- Semantic: Make content available to automated processing
- Social: Enable exchange across individual boundaries
Personal Semantic Web: a semantically enlarged intimate supplement to memory
Social protocols and distributed search
[Diagram: NEPOMUK-enabled peers (colleague, friend, acquaintance); resource types Person, Topic, WebSite, Document, Image, Event]
EU Projects Summary: EU IP PHAROS
PHAROS will move audiovisual search forward from a point-solution search engine paradigm to an integrated search platform paradigm.
PHAROS will integrate future user and search requirements in a living laboratory for innovation.
PHAROS partners are from 9 European countries and will integrate their development with their nationally funded projects. SMEs, academia and large industrial players will ensure maximum impact on the business scenario.
PHAROS will use an open approach in integrating external experiences and contributions, and will exchange results through the PHAROS Federation.
PHAROS will use a specifically designed management structure, integrating the different PHAROS “streams”.
[Diagram labels: Vision, Integration, Openness & Federation, High-Impact]
EU Projects Summary: EU NoE REWERSE
REasoning on the WEb with Rules and SEmantics
Web reasoning languages & processing
- Define set of reasoning languages
  - Coherent
  - Inter-operable
  - Functionality and application independent
- For Advanced Web systems and applications
Advanced Applications as testbeds for languages
- Context-adaptive Web systems
- Web-based decision support systems
EU Projects Summary: EU IP TENCompetence
EU Projects Summary: L3S Project Leaders (http://www.L3S.de)
NEPOMUK - http://nepomuk.semanticdesktop.org/ - Dr. Claudia Niederee
PHAROS - http://www.pharos-audiovisual-search.eu/ - Dr. Bhaskar Mehta
REWERSE - http://rewerse.net/ - Prof. Dr. Nicola Henze
TENCompetence - http://www.tencompentece.org/ - Dr. Daniel Olmedilla