Smartlogic, Semaphore and Semantically Enhanced Search – For “Discovery”
Zoeken & Vinden, 3rd March 2016, Amsterdam
Paul Gunstone, Sales Director
The Company & The Opportunity
We believe there is huge business value locked away in content, because content contains the majority of the organization's intelligence. Organizations that unlock that value can outperform competitors in their market. Content augments the value that most organizations have already found in structured data - it is the untapped frontier of competitive advantage.
To realize this value we know you must understand your content, the information and knowledge it contains, how it can be applied in the context of your operations and how it enhances the insights from structured data. The content must be described completely and consistently with metadata.
We focus all our energy on creating value from unstructured content – something we call Content Intelligence
Jeremy Bentley, CEO
Some of Smartlogic’s customers
Search “Maturity” - why am I searching?
Search Volume
Search Value
Document Search
Subject Search
“Eureka” Search
€0
€1M?
SEARCH
REFINERS
ENTITIES
CLUSTERING
RELATIONSHIPS
FACT EXTRACTION
• Digital universe is growing dramatically
• Most of this information is unstructured
• Only a small fraction of the digital universe has been explored for analytical value
• Valuable knowledge and relationships are hidden in this data
The Challenges … Part 1
Relentless growth in content volumes ……
The Challenges …. Part 2
We don’t all speak the same language ……
The Challenges …. Part 3
The proliferation of systems within and beyond the “Hyper-Connected Enterprise” is creating a HUGE range of sources of ‘content’:
• File shares
• DMS
• ECMS
• ERP
• HR (HCM)
• Finance & Legal
• Email
• Knowledge Base
• CRM
• SFA
• Twitter (etc.)
• LIMS (e.g.)
• Call Centre Logs
Board & Meeting Minutes, Engineers’ Reports, H&S Audits (and Actions), Annual Appraisals, Processes & Procedures, Newsletters, Marketing & Product Materials, Maintenance & Repair Manuals, Contracts, Letters of Credit, Insurance Policies, Supply Chain Information, Strategy Documents, Business Plans, Annual Report, Regulatory Submissions, Management Reports, Performance Management Documents, Grievance Procedure Evidence ……..
Contextual metadata driven experience: User interfaces that leverage the ontologies to deliver the richest experience for users when publishing, using and analyzing content
Semaphore delivers these capabilities – enterprise scale
Build and manage semantic models: Simplify the ingestion, development or customization of ontologies
Assisted and automated metadata enrichment: Automatically describe all your content with rich metadata
What is Semaphore?
Conceptual architecture
SharePoint 2013 integration
SharePoint Online integration
Solr native integration
MarkLogic integration
Generic CMS integration
A presenter’s nightmare …
I will now demonstrate three components to show how each plays a part in making content discoverable.
Search and Publishing Enhancement
Executing a Solr Search
The screenshot shows the Semaphore SAF (Search Application Framework) front-end, where the user wants to search NASA content for information on the “moon buggy”.
The search box is prompting with suggestions from the model, but we’re going to ignore these to illustrate the benefits of using Semaphore to enhance Solr search.
Standard Search Results
Standard Solr Search Results
You can see there are over 2000 results. By default, Solr joins two query words with ‘OR’; as a result, most of the results mention “Moon” but are not about the “Moon Buggy”.
More sophisticated searching still doesn’t get better results
A more knowledgeable user might search for the phrase “moon buggy”, which should return more relevant results but may not return ALL the relevant results, as there may be other ways to describe this item.
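The difference between the two query styles can be sketched in a few lines of Python. This is a toy matcher over three invented documents, not Solr's actual query parser or ranking; the function names and the sample texts are assumptions made for illustration.

```python
import re

def tokenize(text):
    """Lowercase a string and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def matches_any(query, doc):
    """OR semantics: the document matches if it contains any query word."""
    doc_tokens = set(tokenize(doc))
    return any(word in doc_tokens for word in tokenize(query))

def matches_phrase(phrase, doc):
    """Phrase semantics: the query words must appear adjacent and in order."""
    p, d = tokenize(phrase), tokenize(doc)
    return any(d[i:i + len(p)] == p for i in range(len(d) - len(p) + 1))

# Hypothetical documents, invented for illustration.
docs = [
    "The Moon has no atmosphere.",
    "Astronauts drove the moon buggy across the surface.",
    "The LRV was deployed on the lunar surface.",
]

or_hits = [d for d in docs if matches_any("moon buggy", d)]
phrase_hits = [d for d in docs if matches_phrase("moon buggy", d)]

print(len(or_hits), "OR hits;", len(phrase_hits), "phrase hits")
```

The OR query picks up the first document, which is about the Moon but not the vehicle, while neither query finds the third document, which describes the same item under a different name.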
Standard Search Results
Ontology Driven Widgets provide “Did You Mean?”
Each set of results includes some suggested terms, extracted from the Semantic Model using a process called “Concept Mapping”. The most common use for this is to provide a “Did You Mean” panel.
The user can hover over terms and see information such as a description and images surfaced from the Model. In this case, the user has selected the preferred term “Lunar Roving Vehicle”, as the picture matches what they call the “Moon Buggy”.
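As a rough illustration of the idea behind a “Did You Mean” panel, the sketch below maps the words a user typed onto preferred terms through the alternative labels held in a model. It is a simplified stand-in, not Semaphore's Concept Mapping implementation; the `MODEL` fragment and both function names are hypothetical.

```python
import re

# Hypothetical fragment of a semantic model: preferred term -> alternative labels.
MODEL = {
    "Lunar Roving Vehicle": ["LRV", "Moon Buggy"],
    "Lunar Module": ["LM", "Lunar Excursion Module"],
}

def build_label_index(model):
    """Map every label (preferred or alternative), lowercased, to its preferred term."""
    index = {}
    for preferred, alternatives in model.items():
        for label in [preferred, *alternatives]:
            index[label.lower()] = preferred
    return index

def did_you_mean(query, index):
    """Return the preferred terms whose labels occur as whole words in the query."""
    lowered = query.lower()
    hits = set()
    for label, preferred in index.items():
        if re.search(r"\b" + re.escape(label) + r"\b", lowered):
            hits.add(preferred)
    return sorted(hits)

INDEX = build_label_index(MODEL)
suggestions = did_you_mean("photos of the moon buggy", INDEX)
print(suggestions)
```

Matching on whole words avoids false suggestions from short labels such as “LM” occurring inside unrelated words.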
Model Assisted Search Result
Search Results enhanced by Semantic Model
In this case, the user has selected the preferred term “Lunar Roving Vehicle” (either when prompted in the search box or via the “Did you mean” panel).
The search engine is now returning the 59 results that were categorised as being relevant to the Lunar Roving Vehicle, using the rules built automatically from the Semantic Model. The evidence used is the term, its acronyms (‘LRV’), its synonyms (such as ‘Moon Buggy’) and the context of the related missions (Apollos 15, 16 and 17).
Results returned in this type of search will be more relevant, as the match is determined by a linguistic analysis of the content – not by a search algorithm.
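To make the idea of evidence-based categorisation concrete, here is a minimal sketch in which each kind of evidence contributes a weight and a document is tagged with the concept once the total reaches a threshold. The rule format, the weights and the threshold are all invented for illustration; the slides do not describe how Semaphore's generated rules are actually expressed or scored.

```python
import re

# Hypothetical rule for one concept; weights and threshold are invented.
RULE = {
    "concept": "Lunar Roving Vehicle",
    "evidence": {
        "lunar roving vehicle": 1.0,  # preferred term
        "lrv": 0.8,                   # acronym
        "moon buggy": 0.8,            # synonym
        "apollo 15": 0.3,             # related mission context
        "apollo 16": 0.3,
        "apollo 17": 0.3,
    },
    "threshold": 0.8,
}

def score(text, rule):
    """Sum the weights of every evidence phrase found as whole words in the text."""
    lowered = text.lower()
    total = 0.0
    for phrase, weight in rule["evidence"].items():
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
            total += weight
    return total

def classify(text, rule):
    """Tag the text with the concept when the evidence reaches the threshold."""
    return score(text, rule) >= rule["threshold"]

print(classify("The LRV was unloaded during Apollo 15.", RULE))
print(classify("Apollo 17 launched in December 1972.", RULE))
```

In this sketch a mission name alone is too weak to tag a document, but it strengthens a match when it appears alongside the term, its acronym or a synonym.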
Search refiners augment the Semantic Model
The search results page includes refiners, populated from document metadata, which can be obtained from the document itself or by classification against the Semantic Model.
These refiners can be used to supplement the Semantic Model; for example, you could use an author refiner to identify experts on the subject that you are researching.
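A refiner is essentially a count of metadata values across the current result set, plus a filter on the chosen value. The sketch below shows that idea on three invented results; the field names, titles and authors are hypothetical and do not come from the demonstration.

```python
from collections import Counter

# Hypothetical search results with metadata; all values are invented.
results = [
    {"title": "LRV deployment notes", "author": "A. Smith", "doc_type": "Report"},
    {"title": "Crew debrief transcript", "author": "B. Jones", "doc_type": "Transcript"},
    {"title": "LRV wheel design", "author": "A. Smith", "doc_type": "Report"},
]

def refiner_counts(results, field):
    """Count how often each value of a metadata field occurs, most frequent first."""
    counts = Counter(r[field] for r in results if field in r)
    return counts.most_common()

def apply_refiner(results, field, value):
    """Keep only the results whose metadata field equals the selected value."""
    return [r for r in results if r.get(field) == value]

author_refiner = refiner_counts(results, "author")
by_smith = apply_refiner(results, "author", "A. Smith")
print(author_refiner)
```

Sorting authors by the number of matching documents is one simple way to surface likely experts on the searched subject.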
Auto Categorisation (or Classification)
Document for Categorisation
This slide shows how you can apply the Semantic Model to documents (in this case a transcription of an Apollo crew de-brief) to automatically identify the areas of the model that are discussed in the document. These are stored as items of metadata, in this case when the document is uploaded to SharePoint, although Semaphore integrates with many other systems.
Semaphore has also identified the type of document, and this can be used to drive additional workflow, such as compliance.
Lastly, the Model is interactive – document authors can browse the model for relevant terms, or use search-as-you-type.
Entity Extraction
In this example a document (taken from Wikipedia) is not only being categorised for Subject (in this case topics from a civilian government taxonomy); Semaphore is also extracting Organisations and People found in the document using Natural Language Processing. These names can be included as metadata even though they aren’t part of the Semantic Model.
Fact Extraction
In this example Semaphore is being used to process legal documents to automatically extract key pieces of information such as party names, amounts, and terms and conditions. Where these items can be extracted explicitly they can be stored as metadata properties; where they cannot be extracted explicitly, the clauses referring to these items can be stored for manual processing.
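The two outcomes described above, explicit metadata where a value can be extracted and a stored clause where it cannot, can be sketched with simple pattern matching. This is a deliberately naive illustration using regular expressions, not the linguistic processing Semaphore performs; the patterns, field names and sample clauses are assumptions.

```python
import re

# Naive, hypothetical patterns; real contracts need far more robust handling.
AMOUNT = re.compile(r"(?:US\$|\$|€|£)\s?\d+(?:,\d{3})*(?:\.\d+)?")
PARTIES = re.compile(r"between\s+(.+?)\s+and\s+(.+?)[,.]")

def extract_facts(clause):
    """Return explicit facts as metadata, or keep the clause for manual review."""
    facts = {}
    amount = AMOUNT.search(clause)
    if amount:
        facts["amount"] = amount.group(0)
    parties = PARTIES.search(clause)
    if parties:
        facts["parties"] = [parties.group(1), parties.group(2)]
    if not facts:
        # Nothing could be extracted explicitly: store the clause itself.
        facts["review_clause"] = clause
    return facts

explicit = extract_facts(
    "This agreement is made between Acme Ltd and Beta GmbH, for a fee of US$25,000."
)
vague = extract_facts("The supplier shall be paid a reasonable sum on completion.")
print(explicit)
print(vague)
```

The second clause mentions payment without a figure, so the sketch keeps the whole clause for a person to process.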
Model Creation and Management (Taxonomies/Ontologies)
Model: High Level Concepts
Browsing the Semantic Model
Semaphore provides a collaborative environment for managing semantic models, capitalising on subject matter experts within an organisation.
This illustration shows the Semaphore Workbench being used to browse the NASA Model. The user can browse by top-level category, or can type a search, which will be matched to terms in the model.
Concept Relationships (Collaboration Tool)
Term information
The Semaphore Workbench shows how each term fits into the model, including related terms, synonyms and term properties. All this information can be used in document categorisation and in search enhancement, as illustrated in this presentation.
Obtaining feedback
The Semaphore Workbench also allows collaboration: subject matter experts can contribute to the quality of the Semantic Model by suggesting additional terms, synonyms and related terms.
The Value of a Semantic Solution
Our clients describe the value they derive in a number of ways; here are just three:
Cost Efficiency: One organisation, which has a largely engineering and scientific workforce, indicates that it saves the equivalent of c. US$700 per employee per year due to the reduction in time taken to find the right content from across many content repositories ($700 / $45 hourly salary ≈ 15.5 hours/year saved ≈ 19 minutes/week saved). With over 10,000 employees, the equivalent savings exceed US$7 million per year.
Cost Savings: Another organisation calculated the cost of classifying documents manually at US$3 per document (based on staff costs, office space, etc.). With over 500,000 documents needing to be classified, the Return on Investment was 10-fold, and would continue to increase as more documents are produced. They also cited the quality and consistency of auto-classification as significantly better than that of human-classified content.
Risk Reduction: Financial Services companies that cannot prove compliance with a host of regulations are being fined millions of Euros/Pounds/Dollars. One reason they cannot prove compliance is that the evidence they need is lost or locked away in textual content, in a file-share or in a Content Repository, poorly classified. Our semantic solution makes the evidence readily available and provides consistency over time. Looking for the same evidence at a later date will still deliver the same results.
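The arithmetic behind the first two figures can be checked with a short calculation. The dollar amounts and volumes come from the slide; the number of working weeks is an assumption, since the slide does not state one. With 52 weeks the saving comes to about 18 minutes a week, and the quoted 19 minutes corresponds to roughly 48 working weeks. The 10-fold Return on Investment cannot be reproduced here because the cost of the solution is not given.

```python
# Figures taken from the slide.
saving_per_employee = 700.0   # US$ per employee per year
hourly_salary = 45.0          # US$ per hour
employees = 10_000            # "over 10,000", used as a lower bound
documents = 500_000           # "over 500,000", used as a lower bound
manual_cost_per_doc = 3.0     # US$ per manually classified document

# Time saved per employee.
hours_per_year = saving_per_employee / hourly_salary
minutes_per_week_52 = hours_per_year * 60 / 52   # calendar weeks
minutes_per_week_48 = hours_per_year * 60 / 48   # assumed working weeks

# Lower bounds on the totals.
total_saving = saving_per_employee * employees
manual_classification_cost = documents * manual_cost_per_doc

print(f"{hours_per_year:.1f} hours/year per employee")
print(f"{minutes_per_week_52:.0f} to {minutes_per_week_48:.0f} minutes/week")
print(f"US${total_saving:,.0f} saved per year across all employees")
print(f"US${manual_classification_cost:,.0f} to classify the backlog manually")
```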
SMARTLOGIC – EMEA & APAC
200 ALDERSGATE
LONDON, EC1A 4HD
TEL: +44 (0)203 176 4500
FAX: +44 (0)207 785 7014
SMARTLOGIC – AMERICAS
111 N MARKET ST.
SAN JOSE, CALIFORNIA, 95113
TEL: 408 213 9500
FAX: 408 572 5601
[email protected] © 2016 SMARTLOGIC INC
Questions …
And
Thank You!
Paul Gunstone, Sales Director
[email protected]
+44 7739 310343