This is the vision of Recognos about the future of Semantic Technology in Document Management. The presentation was created for the SemTech Conference in November, 2011 in Washington DC.
Applications of Semantic Technology in Document Management
Washington DC, November 2011
George Roth, Adonis Damian
www.recognos.com
What is Document Management
A document management system (DMS) is a computer system (or set of computer programs) used to track and store electronic documents and/or images of paper documents. It is usually also capable of keeping track of the different versions created by different users (history tracking). The term has some overlap with the concepts of content management systems. It is often viewed as a component of enterprise content management (ECM) systems and related to digital asset management, document imaging, workflow systems and records management systems.
Goal: make non-formatted documents equivalent to formatted data!
November 2011
DMS is changing !!!!
CLASSICAL
- Metadata
- Integration
- Capture
- Indexing
- Storage
- Retrieval
- Distribution
- Security
- Workflow
- Collaboration
- Versioning
- Search
- Publishing
- …
NEW
- Compliance
- Accessibility
- Interactivity
- Augmentation
- Translation
- Linking – Relationships
- Sentiment Analysis
- New Search (Semantic Tagging, Deep Search, NL Questions)
Processing documents is getting harder and harder !!!!
- Volume
- Labor intensive
- The “research project”: 40% – 60% data gathering
- Metadata independent of content
- Shallow search
- Hard for non-experts to understand
New Tools: Semantic Technologies
- NLP (Natural Language Processing) – understands the meaning of documents (statistical, machine learning, hybrid, graph based)
- Semantic Search – tagging
- Data Integration
- Sentiment Analysis
- Linked Open Data – Linked Data
- Inference – Reasoning
Semantic Technologies – Outside and Inside the Enterprise
- Inside – Controlled Environment – TRUST
- Inside – Security issues
- Same techniques as outside the enterprise
- Integrates non-formatted with formatted data
- Easy to measure the effects – ROI
- Add-on to the existing KM models
- Emerging area – Semantic technologies started on the WWW
Document Management is changing !!!!
New features will become commodity in 2-3 years
- Compliance
- Data Extraction, Comparison, Change Analysis
- Interactivity
- Augmentation
- Translation
- Linking – Relationships
- Sentiment Analysis
- New Search (Semantic Tagging, Deep Search, NL Questions)
Biggest Acquisitions
- Microsoft: Powerset (Bing), Fast Search, Jinni
- Google: Freebase, Needlebase
- Apple: SIRI
- Etc…
New Document Management
Embedded Compliance Rules
Compliance Rules
Example rule (email):
- Rule 0134C: “Not allowed to mention a percentage as a profit promise investing with the firm”
- In an email: “Dear John, Our company has an amazing method to invest, so that you will make at least 10% profit in 3 months !!!!”
- The email was stopped and sent to Compliance with the message: “Violation of the Rule 0134C”
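The Rule 0134C example can be approximated with a very small check. This is only a sketch: the slide does not say how the rule is evaluated, so a regular expression stands in for whatever semantic rule engine was actually used. The names `PROFIT_PROMISE`, `Verdict`, and `check_email` are hypothetical.

```python
import re
from dataclasses import dataclass

# Assumption: a "profit promise" is a percentage within a few words of "profit".
# A real semantic rule would rely on NLP, not on a pattern like this one.
PROFIT_PROMISE = re.compile(
    r"\b\d+(?:\.\d+)?\s*%\s+(?:\w+\s+){0,3}?profit\b"
    r"|\bprofit\w*\s+(?:\w+\s+){0,3}?\d+(?:\.\d+)?\s*%",
    re.IGNORECASE,
)


@dataclass
class Verdict:
    allowed: bool
    message: str


def check_email(body: str) -> Verdict:
    """Stop the email if it appears to promise a percentage profit."""
    if PROFIT_PROMISE.search(body):
        return Verdict(False, "Violation of the Rule 0134C")
    return Verdict(True, "OK")


email = (
    "Dear John, Our company has an amazing method to invest, "
    "so that you will make at least 10% profit in 3 months !!!!"
)
verdict = check_email(email)
print(verdict.allowed, verdict.message)
```

A pattern like this misses paraphrases ("double your money") and can flag harmless mentions of past profit figures, which is why the deck argues for semantic rules over keyword matching.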
Data Compliance
- MFIP data extraction
- Link to the original document
New Document Management
Data Extraction, Comparison, Change Analysis
Data Extraction – Semantic Rules
Data Comparison – Semantic Rules
Change Analysis - Alarms - Semantic
- Create an alarm when the Trading Policy changes
- Create an alarm when Commissions change (fields)
- Create an alarm when a member of the Board changes
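One way to picture these alarms is as a comparison of the fields extracted from two versions of the same document. The sketch below assumes the extraction step has already produced plain dictionaries. The field names, the `WATCHED` table, and `change_alarms` are hypothetical and do not come from the slides.

```python
from typing import Any

# Hypothetical watched fields, one per alarm listed on the slide.
WATCHED = {
    "trading_policy": "Trading Policy changed",
    "commissions": "Commissions changed",
    "board_members": "Board membership changed",
}


def change_alarms(old: dict[str, Any], new: dict[str, Any]) -> list[str]:
    """Return one alarm per watched field whose extracted value differs."""
    alarms = []
    for field, label in WATCHED.items():
        if old.get(field) != new.get(field):
            alarms.append(f"ALARM: {label}")
    return alarms


previous = {
    "trading_policy": "No short selling",
    "commissions": {"equity": 0.5},
    "board_members": {"A. Smith", "B. Jones"},
}
current = {
    "trading_policy": "No short selling",
    "commissions": {"equity": 0.75},
    "board_members": {"A. Smith", "B. Jones"},
}
print(change_alarms(previous, current))
```

Board members are held in a set so that a reordered list does not raise a false alarm. Comparing extracted values rather than raw text is what separates this from a plain text diff.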
New Document Management
Interactivity
New Document Management
Augmentation
New Document Management
Automated Translation
Translation
- Google Translate: great for simple translation (emails, non-technical documents)
- Language Weaver: specialized translation through machine learning; train the system per domain
New Document Management
Sentiment Analysis
- Media Sentry
- Open Amplify, Expert Systems, Lymbix
- NLP and machine learning
New Document Management
Search
Shallow Search vs. Deep Search
Deep Document Search
Faceted Search
Ask Questions – Document Adviser
NLP Search
New Document Management
Complex App Samples
Complex Apps: Media Monitoring – Media Sentry
Media Sentry – Under the Hood
[Architecture diagram. Labels recovered from the slide:]
- WWW, Google Alerts, Meltwaters Alerts, Twitter, Facebook, Forums / Blogs, Websites
- External Data Pull
- Twitter Adapter, Facebook Adapter, 80legs Adapter, Diffbot Adapter, Exchange Adapter
- Exchange Server, File Server
- Internal Message Storage
- Natural Language Processing
- Uploaded Taxonomy, ESSEX
- Data Storage, MS SQL Server
- Web User Interface
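The diagram labels suggest one adapter per external source feeding a shared internal message store. The sketch below illustrates that general pattern only. It is not Media Sentry's actual design, it calls no real Twitter, Facebook, or Exchange API, and every class and function name in it is hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol


@dataclass(frozen=True)
class Message:
    source: str  # which adapter produced the message
    text: str    # raw text handed on to later processing


class Adapter(Protocol):
    """Anything that can pull messages from one external source."""

    def pull(self) -> Iterable[Message]: ...


class StaticAdapter:
    """Stand-in adapter that replays canned items instead of calling a real API."""

    def __init__(self, source: str, items: list[str]) -> None:
        self.source = source
        self.items = items

    def pull(self) -> Iterable[Message]:
        return [Message(self.source, text) for text in self.items]


def collect(adapters: Iterable[Adapter]) -> list[Message]:
    """Gather messages from every adapter into one in-memory store."""
    store: list[Message] = []
    for adapter in adapters:
        store.extend(adapter.pull())
    return store


store = collect([
    StaticAdapter("twitter", ["Great service from the firm"]),
    StaticAdapter("exchange", ["Complaint about fees", "Request for a statement"]),
])
print(len(store), [m.source for m in store])
```

Keeping every source behind the same `pull` interface means that later steps, such as NLP and storage, only ever see `Message` objects, so a new source can be added without touching them.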
CRM – Intelligent Call Center
Amdocs AIDA (AMDOCS Intelligent Decision Automation)
CRM - Event Extraction
Interactive Book
Interactive Books - sirBook
- Display Linked Data
- Ask a question – semantic search
- Entity Lookup
Counterparty Risk
Discovery through Inference (1)
Discovery through Inference (2)
Ultimate Goal – The Smart Document
- Interactive – Exists
- Search – Semantic Search, Q&A
- Semantic Tagging – Summarization
- LOD with domains
- Linked: People, Companies, Locations, Specific Terms
- Example: a travel book
Used technologies
The following technologies were used:
- iQser – GIN
- Clark & Parsia – Spanner, StarDog
- Expert System – NLP
- GATE
- Smart Logic – Enterprise Query Platform – Fast Search – Microsoft Sharepoint 11
- Revelytix
- Cognition
- Franz Systems
- DiffBot
- Ontotext
Contact
George Roth
President and CEO, Recognos Inc.
San [email protected]
Drew Warren
CEO, Recognos Financial
New [email protected]