When Metadata is the Content
From Articles to Knowledge
SSP 2009 Annual Meeting
Chris Beguel – Director of Sales – TEMIS
Baltimore, MD – May 09
Copyright © 2009 TEMIS – All rights reserved
Where are we? Semantic Age!
Term
Entity
Fact
Knowledge
From Words to Meaning…
Product Dosing Action Target State Event Action
Potential Adverse Effect
Drug = Trimilax
Dosing = 500 mg
Symptom = Tiredness
When = After administration
Drug Symptom Condition
Prop. Num. Abbrev. Verb/3rd Pron. Verb Adj. Prep. Noun
Trimilax 500 mg makes me feel dizzy after ingestion
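The step from words to a fact can be sketched in a few lines of Python. This is only a toy illustration, not TEMIS technology: the lexicons `DRUGS` and `SYMPTOMS`, the two regular expressions, and the function `extract_adverse_effect` are all hypothetical names invented here, and the symptom value comes from the toy lexicon rather than from the slide.

```python
import re

# Hypothetical mini-lexicons (assumptions for this sketch only).
DRUGS = {"trimilax"}
SYMPTOMS = {"dizzy": "Dizziness", "tired": "Tiredness", "nauseous": "Nausea"}

# Hypothetical patterns for a dosing expression and a temporal condition.
DOSE = re.compile(r"(\d+)\s*(mg|g|ml)\b", re.IGNORECASE)
WHEN = re.compile(r"\bafter\s+(\w+)", re.IGNORECASE)


def extract_adverse_effect(sentence):
    """Return a 'Potential Adverse Effect' fact as a dict, or None."""
    # Term level: split the sentence into word and number tokens.
    tokens = re.findall(r"[A-Za-z]+|\d+", sentence)

    # Entity level: look tokens up in the lexicons.
    drug = next((t for t in tokens if t.lower() in DRUGS), None)
    symptom = next(
        (SYMPTOMS[t.lower()] for t in tokens if t.lower() in SYMPTOMS), None
    )
    dose = DOSE.search(sentence)
    when = WHEN.search(sentence)

    # Fact level: a fact needs at least a drug and a symptom.
    if drug is None or symptom is None:
        return None
    return {
        "type": "Potential Adverse Effect",
        "Drug": drug,
        "Dosing": f"{dose.group(1)} {dose.group(2)}" if dose else None,
        "Symptom": symptom,
        "When": f"after {when.group(1)}" if when else None,
    }


fact = extract_adverse_effect("Trimilax 500 mg makes me feel dizzy after ingestion")
print(fact)
```

A production system would rely on linguistic analysis and large domain vocabularies instead of two small dictionaries; the sketch only shows how terms, entities, and facts build on one another.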
Metadata? Understand!
Title: Google gives drivers a hand at the gas pumps
Source: InformationWeek
Author: Antone Gonsalves
Date: November 7, 2007
Metadata
Entities
Facts
Metadata? Understand!
Locations: United States, Atlanta
Organizations: National Association of Conveni…
Persons: Lucy Sackett, Sackett
Technologies: Linux, Opensource …, Internet
Companies: Gilbarco Veeder-Root, Gilbarco, InformationWeek, T-Mobile, HTC, Qualcomm, Motorola
Entities
Facts
Metadata
Product: New Service, Google Service
Metadata? Understand!
Launch: Gilbarco, Google Service
Announcement: Gilbarco, New service
Partnership: Gilbarco, Google
Announcement: Sackett, InformationWeek
Function: Sackett, Gilbarco
Alliance: Google, HTC, Qualcomm, Motorola, T-Mobile
Entities
Facts
Metadata
Announcement
  Who: Gilbarco; Whom: unknown; What: New Service; When: unknown
Launch
  Who: Gilbarco; What: Google Service; When: early next week
Function
  Who: Sackett; Company: Gilbarco; Function: spokeswoman
Partnership
  Who: Gilbarco; With whom: Google; When: unknown; State: Negative
Alliance
  Who: Google; With whom: T-Mobile, HTC, Qualcomm, Motorola; When: unknown
Announcement
  Who: Sackett; Whom: InformationWeek; When: unknown; What: unknown
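Once facts are stored as structured records, they can be queried directly. The sketch below is a hedged illustration only: the `Fact` class and the `facts_about` helper are hypothetical, and role values marked "unknown" on the slide are simply left out.

```python
from dataclasses import dataclass, field


@dataclass
class Fact:
    """A typed fact with named roles (hypothetical structure)."""
    kind: str
    roles: dict = field(default_factory=dict)


# The six facts from the slide; "unknown" roles are omitted.
FACTS = [
    Fact("Announcement", {"Who": "Gilbarco", "What": "New Service"}),
    Fact("Launch", {"Who": "Gilbarco", "What": "Google Service",
                    "When": "early next week"}),
    Fact("Function", {"Who": "Sackett", "Company": "Gilbarco",
                      "Function": "spokeswoman"}),
    Fact("Partnership", {"Who": "Gilbarco", "With whom": "Google",
                         "State": "Negative"}),
    Fact("Alliance", {"Who": "Google",
                      "With whom": ["T-Mobile", "HTC", "Qualcomm", "Motorola"]}),
    Fact("Announcement", {"Who": "Sackett", "Whom": "InformationWeek"}),
]


def facts_about(entity, facts=FACTS):
    """Return every fact in which `entity` fills at least one role."""
    def mentions(value):
        # A role holds either a single entity or a list of entities.
        return entity in value if isinstance(value, list) else value == entity

    return [f for f in facts if any(mentions(v) for v in f.roles.values())]


for f in facts_about("Google"):
    print(f.kind, f.roles)
```

Matching is exact on entity names, so "Google Service" (a product) is not returned for the query "Google" (a company).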
From Metadata to Knowledge!
What is Text Mining?
• Text Mining is an information access technology…
• Text Mining generates Knowledge
• Text Mining serves information consumers & producers
Text Mining Back-End
Data Repository
Text Mining Front-End (Text Analytics)
1. Enhanced Search Experience
Simple recognition of words…
From standard keyword search…
1. Enhanced Search Experience
… to Entity & Fact search!
End-User Benefits
• Run comprehensive and precise searches
• Get more relevant documents
• Find what you don’t know!
2. Faceted Navigation
From “narrow your search”…
2. Faceted Navigation
… to multidimensional faceted navigation
Self-adjusting filters to refine the search
Ability to combine several filters at once (and/or)
Point & click filtering
End-User Benefits
• Get a quick view of document content
• Navigate within context-relevant information
• Rapidly focus on targeted documents
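Combining filters with and/or, and recomputing the facet values after each selection, can be sketched as follows. This is an illustrative sketch only: the three documents in `DOCS` are made-up sample data, and `apply_filters` and `facet_counts` are hypothetical helper names, not part of any product.

```python
from collections import Counter

# Hypothetical documents, each annotated with extracted entities per facet.
DOCS = [
    {"id": 1, "Companies": {"Gilbarco", "Google"}, "Locations": {"Atlanta"}},
    {"id": 2, "Companies": {"Google", "HTC"}, "Locations": {"United States"}},
    {"id": 3, "Companies": {"Motorola"}, "Locations": {"United States"}},
]


def apply_filters(docs, filters, mode="and"):
    """Keep documents matching all ("and") or any ("or") of the filters.

    `filters` is a list of (facet, value) pairs.
    """
    if not filters:
        return list(docs)
    combine = all if mode == "and" else any
    return [
        d for d in docs
        if combine(value in d.get(facet, set()) for facet, value in filters)
    ]


def facet_counts(docs, facet):
    """Count facet values over the current result set (self-adjusting filter)."""
    return Counter(value for d in docs for value in d.get(facet, set()))


selected = [("Companies", "Google"), ("Locations", "United States")]
print([d["id"] for d in apply_filters(DOCS, selected, mode="and")])
print([d["id"] for d in apply_filters(DOCS, selected, mode="or")])
print(facet_counts(apply_filters(DOCS, [("Companies", "Google")]), "Locations"))
```

Recomputing `facet_counts` on the filtered result set is what makes the filters "self-adjusting": only values that still occur in the remaining documents are offered.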
3. Data Analysis and Reporting
From bug’s-eye view…
3. Data Analysis and Reporting
… to bird’s-eye view!
End-User Benefits
• Visualize key Entities & Facts (pie/bar charts)
• Detect dependencies between Entities & Facts (matrix charts)
• Zoom in & out by drilling anywhere
4. Information Discovery
From a flat list of documents…
4. Information Discovery
… to information network
Search Panel
Discovery Tools
Entities
Facts
Proofs
End-User Benefits
• Search in knowledge, not in documents
• Get a graphical representation of knowledge
• Discover information by navigating within Facts
Semantic Enrichment at the Core
Product Management
Web Content Management
Text Mining Content Enrichment
Related Topics Extraction
Smart Linking
Sentiment Analysis
Trends Analysis & Charting
Similarity Detection
Content Annotation
Metadata Extraction
Taxonomy Management
Automatic Categorization
Entity & Facts Extraction
Original Content
Journal Scans
Expert Interviews
Event Reports
Visitors & customers
Content Editors
Editorial & Content Management
Benefits to Information Producers
• Create more engaging, longer-lasting user visits
  • Richer user experience with context-sensitive information
  • More page views per visit
  • Exposing the “long tail” through suggestions and linking
  • Integrate more content at a fraction of the cost
• Establish your web properties as a community gateway
  • “70% of all searches do NOT start on Google/MSN/Yahoo,” says Sue Feldman at IDC Research
  • Smart search and navigation are critical to the user experience
Increase the stickiness of your website to maximize ad revenue or subscription utilization!
Re-Packaging Content – Elsevier
• Objective
  • Develop a revolutionary database indexing the last 28 years of chemistry patents
  • Provide an exceptional user experience by using “smart content”
• Results
  • ~20 million chemistry patent documents
  • Searchable by chemical reactions, solvents, and reactants extracted directly from the documents
  • Released by Elsevier MDL in Nov. 2004
• Currently
  • TEMIS distributes the Chemical Entities Relationships Annotator in partnership with Elsevier
Exposing the Long Tail – Springer
• Objective
  • Map meaningful words and phrases in journal articles to encyclopedia entries
  • Identify related documents in a pool of over three million journal articles
• Solution
  • Index incoming journal articles to link each article with the related encyclopedia entry
  • Create a semantic fingerprint for each journal article so that the search engine can calculate the degree of relationship
  • Integrate with Springer’s search engine
• Benefits
  • Increased product sales through improved content linking
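One common way to compare two documents by a "fingerprint" is cosine similarity over term counts. The slides do not say how the Springer fingerprint is actually computed, so the sketch below is an assumption chosen for illustration; `fingerprint` and `relatedness` are hypothetical names and the sample texts are invented.

```python
import math
import re
from collections import Counter


def fingerprint(text):
    """Assumed fingerprint: a bag of lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def relatedness(a, b):
    """Cosine similarity between two fingerprints, from 0.0 to 1.0."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented sample texts standing in for an article and two encyclopedia entries.
article = fingerprint("protein folding in yeast cells")
entry = fingerprint("protein folding")
other = fingerprint("medieval trade routes")

print(relatedness(article, entry))
print(relatedness(article, other))
```

A search engine could rank encyclopedia entries or related articles by this score; real fingerprints would typically weight terms and use extracted concepts rather than raw words.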
Answering Burning Questions – EFL
• Objectives
  • Extract numerical data from case law to enhance information access for lawyers
• Solution
  • Luxid® with custom annotators (address, activity, compensation, age, turnover…)
  • Export numerical data as metadata to a search engine
• Benefits
  • Productivity gains in extracting and validating metadata
  • Ability to process large volumes of case law
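The idea of pulling numerical values out of case-law text and exporting them as metadata can be sketched with regular expressions. This is not the Luxid® annotator API: the `PATTERNS` table, the `extract_numeric_metadata` function, and the sample sentence are all hypothetical and only cover two of the annotator types named above.

```python
import re

# Hypothetical patterns for two annotator types (age, compensation).
PATTERNS = {
    "age": re.compile(r"\baged\s+(\d{1,3})\b", re.IGNORECASE),
    "compensation": re.compile(
        r"\bcompensation\s+of\s+(?:EUR|€)\s?([\d,]+)", re.IGNORECASE
    ),
}


def extract_numeric_metadata(text):
    """Return the first match of each pattern as an integer metadata field."""
    metadata = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            # Strip thousands separators before converting to int.
            metadata[name] = int(match.group(1).replace(",", ""))
    return metadata


# Invented sample sentence, for illustration only.
sample = "The claimant, aged 47, was awarded compensation of EUR 12,500."
print(extract_numeric_metadata(sample))
```

The resulting dictionary is the kind of record that could be attached to a document as metadata, so a search engine can filter by range (for example, compensation above a threshold).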
Questions?
Thank you!