Smartlogic, Semaphore and Semantically Enhanced Search – For “Discovery”


Zoeken & Vinden, 3rd March 2016, Amsterdam

Paul Gunstone, Sales Director

The Company & The Opportunity

We believe there is huge business value locked away in content, because content contains the majority of the organization's intelligence. Organizations that unlock that value can outperform competitors in their market. Content augments the value that most organizations have already found in structured data - it is the untapped frontier of competitive advantage.

To realize this value we know you must understand your content, the information and knowledge it contains, how it can be applied in the context of your operations and how it enhances the insights from structured data. The content must be described completely and consistently with metadata.

We focus all our energy on creating value from unstructured content – something we call Content Intelligence

Jeremy Bentley, CEO

Some of Smartlogic’s customers

Search “Maturity” - why am I searching?

[Chart: Search Volume plotted against Search Value (from €0 to €1M?) for Document Search, Subject Search and “Eureka” Search. Capabilities shown: Search, Refiners, Entities, Clustering, Relationships, Fact Extraction.]

• Digital universe is growing dramatically

• Most of this information is unstructured

• Only a small fraction of the digital universe has been explored for analytical value

• Valuable knowledge and relationships are hidden in this data

The Challenges … Part 1

Relentless growth in content volumes ……

The Challenges … Part 2

We don’t all speak the same language ……

The Challenges … Part 3

The proliferation of systems within and beyond the “Hyper-Connected Enterprise” is creating a HUGE range of sources of ‘content’:

• File shares
• DMS
• ECMS
• ERP
• HR (HCM)
• Finance & Legal
• Email
• Knowledge Base
• CRM
• SFA
• Twitter (etc.)
• LIMS (eg)
• Call Centre Logs

Board & Meeting Minutes, Engineers’ Reports, H&S Audits (and Actions), Annual Appraisals, Processes & Procedures, Newsletters, Marketing & Product Materials, Maintenance & Repair Manuals, Contracts, Letters of Credit, Insurance Policies, Supply Chain Information, Strategy Documents, Business Plans, Annual Report, Regulatory Submissions, Management Reports, Performance Management Documents, Grievance Procedure Evidence ……..

Contextual metadata driven experience: User interfaces to leverage the ontologies to deliver the richest experience for users when publishing, using and analyzing content

Semaphore delivers these capabilities – enterprise scale

Build and manage semantic models: Simplify the ingestion, development or customization of ontologies

Assisted and automated metadata enrichment: Automatically describe all your content with rich metadata

What is Semaphore?

Conceptual architecture

SharePoint 2013 integration

SharePoint Online integration

Solr native integration

MarkLogic integration

Generic CMS integration

A presenter’s nightmare …

I will now demonstrate three components to show how each plays a part in making content discoverable.

Search and Publishing Enhancement

Executing a Solr Search

The screenshot shows the Semaphore SAF (Search Application Framework) front end, where the user wants to search NASA content for information on the “moon buggy”.

The search box prompts with suggestions from the model, but we are going to ignore these to illustrate the benefits of using Semaphore to enhance Solr search.

Standard Solr Search Results

You can see there are over 2000 results. The standard Solr method for joining two words is to use an ‘OR’; as a result, the majority of the results mention “Moon” but are not about the “Moon Buggy”.

More sophisticated searching still doesn’t get better results

A more knowledgeable user might search for the phrase “moon buggy”, which should return more relevant results but may not return ALL the relevant results, as there may be other ways to describe this item.
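The difference between the two searches can be sketched in a few lines of Python. This is a toy illustration only, not Solr itself: the three documents and the helper names (`or_search`, `phrase_search`) are invented for the example.

```python
# Toy in-memory corpus; the document texts are invented for illustration.
DOCS = {
    "d1": "The moon buggy carried two astronauts across the lunar surface.",
    "d2": "Orbit data for the Moon were collected over several months.",
    "d3": "The Lunar Roving Vehicle (LRV) was deployed on later Apollo missions.",
}


def tokens(text):
    """Strip basic punctuation from each word and lowercase it."""
    return [w.strip(".,()\"'").lower() for w in text.split()]


def or_search(query, docs):
    """Match documents containing ANY query word (the 'OR' behaviour)."""
    wanted = set(tokens(query))
    return sorted(d for d, text in docs.items() if wanted & set(tokens(text)))


def phrase_search(query, docs):
    """Match documents containing the query words as a consecutive phrase."""
    q = tokens(query)
    hits = []
    for d, text in docs.items():
        t = tokens(text)
        if any(t[i:i + len(q)] == q for i in range(len(t) - len(q) + 1)):
            hits.append(d)
    return sorted(hits)


print(or_search("moon buggy", DOCS))      # d1 and d2: d2 only mentions "Moon"
print(phrase_search("moon buggy", DOCS))  # d1 only: d3 is relevant but missed
```

The 'OR' search is too broad and the phrase search is too narrow: neither finds the document that uses the name “Lunar Roving Vehicle”.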

Standard Search Results

Ontology Driven Widgets provide “Did You Mean?”

Each set of results includes some suggested terms, extracted from the Semantic Model using a process called “Concept Mapping”. The most common use for this is to provide a “Did You Mean” panel.

The user can hover over terms and see information such as a description and images surfaced from the Model. In this case, the user has selected the preferred term “Lunar Roving Vehicle”, as the picture matches what they call the “Moon Buggy”.
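As a rough idea of what mapping a query onto model concepts involves, here is a minimal sketch. The `Concept` class, the two-entry `MODEL` and the `did_you_mean` function are hypothetical names made up for illustration; they are not Semaphore's API, and a real model holds far richer information.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One concept in a toy semantic model (hypothetical structure)."""
    preferred: str
    synonyms: list = field(default_factory=list)
    acronyms: list = field(default_factory=list)
    description: str = ""


# Illustrative model content only.
MODEL = [
    Concept("Lunar Roving Vehicle", synonyms=["Moon Buggy"], acronyms=["LRV"],
            description="Battery-powered rover driven on the lunar surface."),
    Concept("Lunar Module", acronyms=["LM"],
            description="Apollo landing spacecraft."),
]


def did_you_mean(query, model):
    """Return preferred terms whose labels match the query, ignoring case."""
    q = query.strip().lower()
    suggestions = []
    for concept in model:
        labels = [concept.preferred] + concept.synonyms + concept.acronyms
        if any(q == label.lower() for label in labels):
            suggestions.append(concept.preferred)
    return suggestions


print(did_you_mean("moon buggy", MODEL))  # suggests the preferred term
```

Exact label matching keeps the sketch short; the point is only that the user's wording is translated into the model's preferred term.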

Model Assisted Search Result

Search Results enhanced by Semantic Model

In this case, the user has selected the preferred term “Lunar Roving Vehicle” (either when prompted in the search box or via the “Did you mean” panel).

The search engine is now returning the 59 results that were categorised as being relevant to the Lunar Roving Vehicle. The categorisation uses rules built automatically from the Semantic Model, taking as evidence the term, its acronyms (‘LRV’), its synonyms (such as ‘Moon Buggy’) and the context of the related missions (Apollo 15, 16 and 17).

Results returned in this type of search will be more relevant, as the match is determined by a linguistic analysis of the content, not by a search algorithm.
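To give a feel for evidence-based categorisation, the sketch below scores a document against one concept. It is an assumption-laden toy: the weights, the threshold and the `classify` function are invented for this example, and real rules generated from a model would be considerably more sophisticated.

```python
import re

# Hypothetical evidence weights for one concept (illustration only).
EVIDENCE = {
    "Lunar Roving Vehicle": {
        "lunar roving vehicle": 1.0,  # preferred term
        "lrv": 0.8,                   # acronym
        "moon buggy": 0.8,            # synonym
        "apollo 15": 0.2,             # related mission context
        "apollo 16": 0.2,
        "apollo 17": 0.2,
    }
}
THRESHOLD = 0.8  # assumed cut-off: context alone is not enough


def classify(text, evidence=EVIDENCE, threshold=THRESHOLD):
    """Return concepts whose summed evidence reaches the threshold."""
    lowered = text.lower()
    results = {}
    for concept, clues in evidence.items():
        score = sum(
            weight
            for clue, weight in clues.items()
            if re.search(r"\b" + re.escape(clue) + r"\b", lowered)
        )
        if score >= threshold:
            results[concept] = round(score, 2)
    return results


print(classify("The LRV was first driven on Apollo 15."))
print(classify("Apollo 15 launched in July 1971."))  # mission context only
```

A document that mentions only a related mission stays below the threshold, while an acronym or synonym is enough to categorise it even though the preferred term never appears.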

Search refiners augment the Semantic Model

The search results page includes refiners, populated from document metadata which can be obtained from the document itself or by classification against the Semantic Model.

These refiners can be used to supplement the Semantic Model. For example, you could use an author refiner to identify experts on the subject that you are researching.
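Conceptually, a refiner is a count of metadata values across the current result set. A minimal sketch, with invented result records and a hypothetical `refiners` helper:

```python
from collections import Counter

# Invented result metadata for illustration.
RESULTS = [
    {"title": "LRV deployment notes", "author": "A. Example", "doc_type": "Report"},
    {"title": "Rover traverse planning", "author": "A. Example", "doc_type": "Memo"},
    {"title": "Crew debrief transcript", "author": "B. Sample", "doc_type": "Transcript"},
]


def refiners(results, fields):
    """Count the values of each metadata field across the result set."""
    return {
        f: Counter(r[f] for r in results if f in r).most_common()
        for f in fields
    }


for field_name, counts in refiners(RESULTS, ["author", "doc_type"]).items():
    print(field_name, counts)
```

The author with the highest count for a subject is a reasonable first guess at who the expert is.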

Auto Categorisation (or Classification)

Document for Categorisation

This slide shows how you can apply the Semantic Model to documents (in this case a transcript of an Apollo crew debrief) to automatically identify the areas of the model that are discussed in the document. These are stored as items of metadata, in this case when the document is uploaded to SharePoint, although Semaphore integrates with many other systems.

Semaphore has also identified the type of document, and this can be used to drive additional workflow, such as compliance.

Lastly, the Model is interactive: document authors can browse the model for relevant terms, or use search-as-you-type.
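Search-as-you-type over model terms can be pictured as a prefix lookup on a sorted list of labels. The sketch below is one simple way to do it; the label list and the `suggest` function are assumptions for illustration, not how Semaphore implements the feature.

```python
import bisect

# Illustrative model labels, lowercased and sorted for prefix lookup.
LABELS = sorted(
    label.lower()
    for label in ["Lunar Module", "Lunar Roving Vehicle", "Launch Vehicle", "Moon Buggy"]
)


def suggest(prefix, labels=LABELS, limit=5):
    """Return up to `limit` labels that start with the typed prefix."""
    p = prefix.lower()
    start = bisect.bisect_left(labels, p)  # first label >= prefix
    matches = []
    for label in labels[start:]:
        if not label.startswith(p) or len(matches) >= limit:
            break
        matches.append(label)
    return matches


print(suggest("lun"))
```

Because the labels are sorted, all matches for a prefix sit next to each other, so the lookup stays fast as the model grows.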

Entity Extraction

In this example a document (taken from Wikipedia) is not only being categorised for Subject (in this case topics from a civilian government taxonomy); Semaphore is also using Natural Language Processing to extract the Organisations and People found in the document. These names can be included as metadata even though they are not part of the Semantic Model.

Fact Extraction

In this example Semaphore is being used to process legal documents to automatically extract key pieces of information such as party names, amounts, and terms and conditions. Where these items can be extracted explicitly they can be stored as metadata properties; where they cannot, the clauses referring to them can be stored for manual processing.
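The split between explicit values and clauses kept for manual processing can be sketched with a couple of regular expressions. Everything here is a simplifying assumption: the patterns, the review keywords, the sample clauses and the `extract_facts` function are invented for illustration, and real contract language needs far more robust linguistic rules.

```python
import re

# Hypothetical patterns for illustration only.
AMOUNT = re.compile(r"(?:US\$|\$|€|£)\s?\d+(?:,\d{3})*(?:\.\d+)?")
PARTIES = re.compile(r"between\s+(.+?)\s+and\s+(.+?)[,.]", re.IGNORECASE)
REVIEW_KEYWORDS = ("terminate", "termination", "liability")


def extract_facts(clauses):
    """Store explicit facts as metadata; queue unclear clauses for review."""
    facts = {"parties": [], "amounts": [], "for_review": []}
    for clause in clauses:
        parties = PARTIES.search(clause)
        if parties:
            facts["parties"].extend([parties.group(1), parties.group(2)])
        amounts = AMOUNT.findall(clause)
        facts["amounts"].extend(amounts)
        # No explicit value found, but the clause covers a topic of interest.
        if not parties and not amounts and any(
            k in clause.lower() for k in REVIEW_KEYWORDS
        ):
            facts["for_review"].append(clause)
    return facts


SAMPLE = [
    "This agreement is made between Acme Ltd and Example Corp.",
    "The fee payable is US$25,000 per year.",
    "Either party may terminate on reasonable notice.",
]
print(extract_facts(SAMPLE))
```

The termination clause contains no explicit value to extract, so the whole clause is kept for a person to read.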

Model Creation and Management (Taxonomies/Ontologies)

Model: High-Level Concepts

Browsing the Semantic Model

Semaphore provides a collaborative environment for managing semantic models, capitalising on subject matter experts within an organisation.

This illustration shows the Semaphore Workbench being used to browse the NASA Model. The user can browse by top-level category, or type a search, which will be matched to terms in the model.

Concept Relationships (Collaboration Tool)

Term information

The Semaphore Workbench shows how each term fits into the model, including related terms, synonyms and term properties. All this information can be used in document categorisation and in search enhancement, as illustrated in this presentation.

Obtaining feedback

The Semaphore Workbench also allows collaboration: subject matter experts can contribute to the quality of the Semantic Model by suggesting additional terms, synonyms and related terms.

The Value of a Semantic Solution

Our clients describe the value they derive in a number of ways; here are just three:

Cost Efficiency: One organisation, which has a predominantly engineering and scientific workforce, indicates that it saves the equivalent of approximately US$700 per employee per year due to the reduction in time taken to find the right content from across many content repositories. ($700/$45 (hourly salary) = 15.5 hours/year saved = 19 minutes/week saved). With over 10,000 employees the equivalent savings are huge.

Cost Savings: Another organisation calculated the cost of classifying documents manually at US$3 per document (based on staff costs, office space, etc). With over 500,000 documents needing to be classified, the Return on Investment was tenfold, and would continue to increase as more documents are produced. They also cited the quality and consistency of auto-classification as significantly better than human-classified content.

Risk Reduction: Financial Services companies that cannot prove compliance to a host of regulations are being fined millions of Euros/Pounds/Dollars. One reason they cannot prove compliance is that the evidence they need is lost or locked away in textual content, in a file-share or in a Content Repository, poorly classified. Our semantic solution makes the evidence readily available and provides consistency over time. Looking for the same evidence at a later date will still deliver the same results.
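The arithmetic behind the first two figures can be checked quickly. The number of working weeks is not stated on the slide; 48 is an assumption that reproduces the quoted 19 minutes per week, and the employee and document counts are taken at their stated minimums.

```python
# Figures quoted in the cost-efficiency example.
ANNUAL_SAVING_USD = 700      # per employee per year
HOURLY_SALARY_USD = 45
EMPLOYEES = 10_000           # "over 10,000", so a lower bound
WORKING_WEEKS = 48           # assumption: not stated in the source

hours_per_year = ANNUAL_SAVING_USD / HOURLY_SALARY_USD
minutes_per_week = hours_per_year * 60 / WORKING_WEEKS
total_annual_saving = ANNUAL_SAVING_USD * EMPLOYEES

# Figures quoted in the cost-savings example.
COST_PER_DOCUMENT_USD = 3
DOCUMENTS = 500_000          # "over 500,000", so a lower bound
manual_classification_cost = COST_PER_DOCUMENT_USD * DOCUMENTS

print(f"Hours saved per employee per year: {hours_per_year:.1f}")
print(f"Minutes saved per employee per week: {minutes_per_week:.0f}")
print(f"Total annual saving: US${total_annual_saving:,}")
print(f"Manual classification cost: US${manual_classification_cost:,}")
```

On these figures the time saving is worth at least US$7 million a year, and classifying the documents manually would cost at least US$1.5 million.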

SMARTLOGIC – EMEA & APAC
200 ALDERSGATE
LONDON, EC1A 4HD
TEL: +44 (0)203 176 4500
FAX: +44 (0)207 785 7014

SMARTLOGIC – AMERICAS
111 N MARKET ST.
SAN JOSE, CALIFORNIA, 95113
TEL: 408 213 9500
FAX: 408 572 5601

WWW.SMARTLOGIC.COM
INFO@SMARTLOGIC.COM
© 2016 SMARTLOGIC INC

Questions … and Thank You!

Paul Gunstone, Sales Director

paul.gunstone@smartlogic.com
+44 7739 310343