Sparsity Technologies & DAMA-UPC
Aules d’empresa 2011: DEX Use Cases
Index
• Information Retrieval
• OCP
• GrafMED
• ConceptBrowser
• DifPubMed
• SocialMedia
• RecerCaixa
• Reviewers Recommender
Information Retrieval: IMDB
Goal
• Apply DEX in a well-known social network
• Perform different kinds of information retrieval queries: link analysis, social-oriented queries, pattern recognition, keyword search
IMDB Use Case
• www.imdb.com
• Inherent network structure of the data
Information Retrieval: IMDB
Source Data
• 10 entities, 12 relationships
• More than 845,000 titles and 2,000,000 people
• Auxiliary tables with casts, roles, genres, and extra movie and person info
DEX Graph
• Built in less than 21 minutes
• More than 25 million nodes
• Less than 1.14 GB of DEX data
Information Retrieval: IMDB
Link Analysis
• Focus: relationships between the entities of a network
• Example: get all the information of a movie
• Result (stress settings): 845,573 movies exploded at an average speed of 0.01 sec. per movie
Social Networks
• Focus: relationships between different groups of nodes with the same affinity
• Example: find the full relationship network of all the partners of an actor or actress
• Result (stress settings): one large result graph with 1,052 nodes connected by 552,826 edges
Pattern Recognition
• Focus: identify the potential result graphs that match a certain pattern
• Example: find all the directors that have worked with the same actress in ‘X’ different movies made within a period of ‘Y’ years
• Result (stress settings): 4,705 directors selected at an average speed of 0.09 sec. per result
Keyword Search
• Focus: the user is assumed not to know anything about the organization of the data
• Example: return all the context information of all the entities containing the keyword ‘X’
• Result (stress settings): 3,308 graphs from a potential of 24,547,488 at an average speed of 0.005 sec.
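The pattern recognition example can be sketched in plain Python on a toy data set. This is an illustration only, not DEX code: the CREDITS tuples, the function name directors_with_repeat_actress, and the reading of "a period of Y years" as "release years at most Y apart" are all assumptions made for the sketch.

```python
from collections import defaultdict

# Hypothetical toy data: (movie, year, director, actress)
CREDITS = [
    ("M1", 2000, "Dir A", "Act 1"),
    ("M2", 2002, "Dir A", "Act 1"),
    ("M3", 2009, "Dir A", "Act 1"),
    ("M4", 2001, "Dir B", "Act 1"),
    ("M5", 2003, "Dir B", "Act 2"),
]


def directors_with_repeat_actress(credits, min_movies, max_years):
    """Return the (director, actress) pairs that share at least
    `min_movies` distinct movies released at most `max_years` apart."""
    movies_by_pair = defaultdict(dict)  # (director, actress) -> {movie: year}
    for movie, year, director, actress in credits:
        movies_by_pair[(director, actress)][movie] = year

    matches = set()
    for pair, movies in movies_by_pair.items():
        years = sorted(movies.values())
        # Slide a window of `min_movies` consecutive release years and
        # check whether the window fits in `max_years`.
        for i in range(len(years) - min_movies + 1):
            if years[i + min_movies - 1] - years[i] <= max_years:
                matches.add(pair)
                break
    return matches


if __name__ == "__main__":
    print(directors_with_repeat_actress(CREDITS, min_movies=2, max_years=3))
```

On a graph database the same query would start from the actress nodes and navigate the cast and direction relationships instead of scanning a flat list.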
OCP
Requesters: Spanish Patrimonial Control Office
Goal: Detect fraud in real patrimonial transactions
Data Model
• People, societies
• Patrimonial transactions
Procedure
• An expert defines a fraudulent graph pattern
• The graph pattern allows the user to find fraudulent people/societies
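The slides do not say which patterns the experts define. As a purely hypothetical illustration, the sketch below flags one made-up pattern: an asset that returns to a previous owner within a few transfers. The pattern itself, the (seller, buyer, asset) tuple layout, and the name round_trip_owners are assumptions, not part of the OCP project.

```python
def round_trip_owners(transactions, max_hops):
    """Flag (owner, asset) pairs where the asset comes back to a previous
    owner within `max_hops` transfers.

    `transactions` is a chronologically ordered list of
    (seller, buyer, asset) tuples. Hypothetical pattern, for illustration.
    """
    flagged = set()
    owners = {}  # asset -> list of successive owners
    for seller, buyer, asset in transactions:
        chain = owners.setdefault(asset, [seller])
        chain.append(buyer)
        # Owners from the last `max_hops` transfers, excluding the new buyer.
        recent = chain[-(max_hops + 1):-1]
        if buyer in recent:
            flagged.add((buyer, asset))
    return flagged


# Hypothetical transactions between people and societies.
TRANSACTIONS = [
    ("Ana", "Soc1", "flat"),
    ("Soc1", "Soc2", "flat"),
    ("Soc2", "Ana", "flat"),
    ("Ben", "Cal", "plot"),
]

if __name__ == "__main__":
    print(round_trip_owners(TRANSACTIONS, max_hops=3))
```

In a graph setting, this pattern corresponds to a short cycle of transaction edges that passes through the same asset.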
GrafMED
Requesters: Catalan Institute of Oncology
Goal: Support application to identify patterns (rules) in the procedures applied to cancer patients
Data Model
• 50,000 patients from the Bellvitge hospital (1994–2006)
• 67 types of tumors
Why DEX?
• Querying capability, multiple data sources, navigational characteristics
• Larger amount of data, with hundreds of thousands of patients
ConceptBrowser
Requesters: Havas Media
Goal: Support application for brainstorming tasks. Finds non-obvious conceptual relations among words and concepts.
Data
• 440,882 concepts
• 117,278 groups of synonymous words
• 116,988 words
• 10,922,306 relations among words and concepts
Why DEX?
• Querying capability, navigational characteristics
DifPubMed
Requesters: Havas Media
Goal: Bibex extension focused on medical researchers. Identifies researchers' social networks, the scientific evolution of a particular medication, and researchers' influence.
Data
• 1,502,599 publications
• 2,136,184 researchers
• 194,991 medications
• 3,437,476 references
SocialMedia
Requesters: Havas Media
Goal: Tool for analyzing information propagation in any social network. Identifies useful information such as how fast and how far information propagates over time.
Social Networks Used
• YouTube: users, videos, comments, etc.
• Enron: users, e-mails, etc.
• Flickr: users, photos, comments
• Orkut: users, media (photos, music, etc.), messages, etc.
• Twitter: users, messages, etc.
• Vi.vu: medical professionals, non-professional users, questions, answers, references, etc.
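"How fast and how far" can be made concrete with a small sketch over timestamped interactions. It is an assumption-laden illustration, not the tool's actual method: the (time, sender, receiver) event layout and the name propagation are hypothetical, and the hop count follows the earliest-arrival path, which is not necessarily the path with the fewest hops.

```python
def propagation(events, origin, start=0):
    """Return {user: (arrival_time, hops)} for every user reached from `origin`.

    `events` is an iterable of (time, sender, receiver) tuples. Events are
    processed in time order, so the first time a user is reached is also
    the earliest possible arrival time.
    """
    reached = {origin: (start, 0)}
    for time, sender, receiver in sorted(events):
        if (
            sender in reached
            and receiver not in reached
            and time >= reached[sender][0]
        ):
            reached[receiver] = (time, reached[sender][1] + 1)
    return reached


# Hypothetical interaction log.
EVENTS = [
    (1, "a", "b"),
    (2, "b", "c"),
    (5, "c", "d"),
    (3, "x", "y"),  # never reached: "x" did not receive the information
    (4, "a", "c"),  # ignored: "c" was already reached at time 2
]

if __name__ == "__main__":
    result = propagation(EVENTS, "a")
    how_far = max(hops for _, hops in result.values())
    how_fast = max(time for time, _ in result.values())
    print(result, how_far, how_fast)
```

Here "how far" is the largest hop count reached and "how fast" is the time at which the last user was reached.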
SocialMedia: Results
[Figure: Influential Persons Distribution]
RecerCaixa
Motivation: Selected in the Call for Applications for Research Grants (RecerCaixa).
Goal: Support tool for exploring and recommending audiovisual content, intended for use in primary and secondary education.
Data Contribution: Catalan public broadcaster Televisió de Catalunya (TV3)
Reviewers Recommender
Requesters: Ministry of Science and Innovation of the Spanish Government (MICINN)
Goal: Tool for identifying and recommending experts in a particular topic.
• Experts in a topic: people who contribute heavily to documents related to that topic
Data Model
• Document contributors
• Documents
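The definition of an expert as a heavy contributor to documents on a topic suggests a simple ranking by document count. The sketch below is one possible reading, not the tool's actual scoring: the document layout (a "topics" set and a "contributors" list) and the name recommend_experts are hypothetical.

```python
from collections import Counter


def recommend_experts(documents, topic, top_n=3):
    """Rank contributors by the number of documents on `topic` they
    contributed to. Ties are broken alphabetically so the result is
    deterministic."""
    counts = Counter()
    for doc in documents:
        if topic in doc["topics"]:
            # Count each contributor at most once per document.
            counts.update(set(doc["contributors"]))
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:top_n]]


# Hypothetical documents.
DOCUMENTS = [
    {"topics": {"graphs"}, "contributors": ["Ana", "Ben"]},
    {"topics": {"graphs", "db"}, "contributors": ["Ana"]},
    {"topics": {"db"}, "contributors": ["Cal"]},
    {"topics": {"graphs"}, "contributors": ["Ben", "Ana"]},
]

if __name__ == "__main__":
    print(recommend_experts(DOCUMENTS, "graphs"))
```

In the graph data model above, this amounts to counting, for each contributor node, its edges to document nodes related to the topic.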
Thanks for your attention
Any questions?

DAMA-UPC. DATA MANAGEMENT (UPC)
Departament d'Arquitectura de Computadors
Edifici C6-S103. Campus Nord. Jordi Girona, 1-3. 08034 Barcelona
www.dama.upc.edu

SPARSITY-TECHNOLOGIES
Jordi Girona, 1-3, Edifici K2M. 08034 Barcelona
info@sparsity-technologies.com
http://www.sparsity-technologies.com