FAST Search-webinar-06-29-2010

Earley & Associates, Inc. | Classification: PUBLIC USE Copyright © 2010 Earley & Associates, Inc. All Rights Reserved.

Technology Showcase – FAST Search

June 29, 2010

Creating Innovative Enterprise Search Applications to enable Business Productivity

Technology Showcase FAST Search Webinar Schedule

This webinar will be presented twice to maximize coverage across time zones worldwide

Session 1:

June 29, 2010

• 6:00am Pacific Daylight Time
• 9:00am Eastern Daylight Time
• 2:00pm British Summer Time
• 3:00pm Central European Time

Session 2:

June 29, 2010

• 11:00am Pacific Daylight Time
• 1:00pm Eastern Daylight Time
• 6:00pm British Summer Time
• 7:00pm Central European Time

Upcoming Taxonomy Community of Practice Webinars

July 7, 2010 Rights Management Taxonomy – A Key to Digital Content Value

August 4, 2010 Getting Greater Business Value from Taxonomy Projects

Communities of Practice

SharePoint IA Group: http://tech.groups.yahoo.com/group/SharePointIACoP/

Taxonomy Group: http://finance.groups.yahoo.com/group/TaxoCoP

Search Group: http://tech.groups.yahoo.com/group/SearchCoP

AIIM Information Organization and Access (IOA) Certificate Program by Earley & Associates

• July 26 - July 29 Chicago, IL

• Aug 9 - Aug 12 Toronto, Canada

• Sep 13 - Sep 16 Washington, DC

• Sep 20 - Sep 23 San Francisco, CA

• Oct 4 - Oct 7 Philadelphia, PA

• Oct 18 - Oct 21 Houston, TX

Course details at www.earley.com/training/aiim-courses

Housekeeping

• Calls last approximately 60 minutes

• Questions will be taken at the end of the session; email your questions to sachie@earley.com

• During the session send chats with questions to Earley & Associates

If you have problems with your audio please contact Adam at 1-719-785-5674

Speakers & Sponsors

• Nate Treloar, Principal Search Evangelist, Microsoft

• Toby Conrad, Senior Consultant, Smartlogic

• Seth Earley, President/CEO, Earley & Associates, Inc.

Earley & Associates Highlights

Founded: 1994

Focus Areas: Holistic approach to specific business contexts and goals for:
• Retail
• Manufacturing
• Public Sector
• Pharmaceuticals & Life Sciences
• Media & Entertainment

Personnel: Core team of 25 consultants

Locations: Concord, MA headquarters; consultants in US, UK & Canada; global projects

Services:
• Taxonomy & Information Architecture
• Search Strategy for Enterprise & Web
• ECM, DAM & Information Lifecycle
• Program Management & Governance

About the Series

• Earley & Associates has sponsored a call series for the past 5 years on content management, search, taxonomy, and information architecture

• The Technology Showcase Series arose from requests by customers to explore tools and technologies that help bridge the gap between strategy and tactics

• Today’s call is sponsored by:

• Smartlogic, our partner in taxonomy management and integration technologies

• Microsoft FAST, provider of search application infrastructure

Session Themes

• We need to redefine our understanding and approach to search

• Search is an application and needs to be integrated into the user experience, not just bolted on or considered an afterthought

• Search can be made much more effective by leveraging metadata and taxonomies

The Challenge of Search

Five basic truths about search

Search as Utility

• “search as a utility has become deeply ingrained into people's everyday lives.” – Study by Nielsen/Net Ratings

• “search software, hardware, and support bundle or search appliance has become very popular since being introduced in early 2002” – Goebel Group

These are misleading concepts. Search is used as a utility, but contexts vary so widely that “plugging search in” does not always produce satisfactory results.

Search is Heterogeneous

Search/Classification/Taxonomy Integration Framework

Search Mechanisms
• Appliances
• Federated Search
• Auto categorization/Clustering
• Entity Extraction
• Faceted Search
• Semantic Search

Data Sources
• Business Intelligence
• Customer Relationship Mgt
• Document repositories
• Custom databases and applications
• Intranets/web pages

Truth #1:

We have to change our definition of search.

• Search is no longer just a white box.
• Search is an experience.
• Search is about information access & capabilities.

Truth #2:

Search algorithms are getting better, but they cannot infer human context and intent.

• A search engine doesn’t know if I’m an engineer, an attorney, or a high school student.

• Perspective has an impact on whether a set of search results is useful and appropriate.

Truth #3:

Taxonomy, metadata and information architecture are key aspects of search.

• Search is fundamentally about metadata
• Some content is structured, some isn’t and needs help
• Advanced search functionalities require taxonomy

Truth #4:

Search is increasingly looking like navigation.

• What happens when you click on a link?
• Guided navigation and faceted search are really the same thing.
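One way to read this point: clicking a navigation link can be treated as running a query with a metadata filter already applied. The sketch below is illustrative only; the documents, field names and values are hypothetical and are not taken from the webinar or from any product.

```python
from collections import Counter

# Hypothetical tagged documents; field names and values are illustrative only.
DOCS = [
    {"title": "Q2 proposal", "function": "Sales", "format": "DOC"},
    {"title": "Hiring policy", "function": "HR", "format": "PDF"},
    {"title": "Pricing sheet", "function": "Sales", "format": "XLS"},
    {"title": "Benefits guide", "function": "HR", "format": "PDF"},
]

def refine(docs, **filters):
    """Return the documents whose metadata matches every selected facet value."""
    return [d for d in docs if all(d.get(k) == v for k, v in filters.items())]

def facet_counts(docs, field):
    """Count how many documents carry each value of one facet."""
    return Counter(d[field] for d in docs if field in d)

# "Clicking" an HR navigation link is the same operation as filtering on function=HR.
hr_docs = refine(DOCS, function="HR")
print([d["title"] for d in hr_docs])
print(facet_counts(hr_docs, "format"))
```

The counts returned by `facet_counts` are what a refinement panel would display next to each remaining filter value.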

Truth #5:

Search is messy.

• Knowledge is messy, information is messy.
• People find answers through haphazard and chaotic processes.

“…search terms are short, ambiguous and an approximation of the searcher’s real information need…”

Source: http://research.microsoft.com/~ryenw/papers/WhiteCONTEXT2002.pdf Ryen W. White, Joemon M. Jose and Ian Ruthven

Nate Treloar, Microsoft

The Search Experience

Enterprise Search Matters

Search is the key to engaging information experiences

Create experiences that combine the magic of software with the power of Internet services across a world of devices

DESKTOP

ENTERPRISE

ONLINE DEVICES

Connecting people to information, driving better outcomes

Enterprise Search is Transforming Business

Search helps your customers get what they want

Search helps your employees get their jobs done

cutting costs

increasing revenue

…and a common platform

Solutions for Business Productivity

Solutions for Internet Sites

Rapidly respond to business needs

Cut costs with a common infrastructure

Connect and empower people

SharePoint 2010: Next Generation Platform

Communities

Search

Sites

Composites

Content

Insights

Today’s Choices Force Compromise

Out-of-the-Box Products: easy to manage, but limited in capability

OR

High-end Products: highly capable, but hard to manage, expensive

Our Vision: A New Choice in Search

Enterprise Search from Microsoft

Best of SharePoint

Best of High-end

Best of Microsoft

FAST Search for SharePoint User Interface

Out-of-the-box features and controls

Deep Refinement

Thumbnails

Previews

Sorting

Similar Results

Federation

People Search

FAST Content Enrichment “Pipeline”

A systematic approach to interpreting your content

Sequential stages perform specific tasks while ingesting content
• Breaks down content to the smallest addressable chunks to build meaning
• Understands file encoding, data formats, and written languages
• Supports 400+ file formats, 80+ languages

Process your content to make it searchable
• Normalizes content so that a consistent relevancy model can be applied
• Identifies structured and unstructured metadata in your content
• Maps document metadata to SharePoint Crawled Properties

Entity Extraction

Language Detection

Format Conversion

Custom Stage
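The idea of sequential stages can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the FAST pipeline or its API: the stage functions, their order, the stopword check and the list of known entity names are all hypothetical.

```python
import re

def convert_format(doc):
    """Format conversion stage: here, just strip tags from a toy HTML body."""
    doc["text"] = re.sub(r"<[^>]+>", " ", doc["raw"]).strip()
    return doc

def detect_language(doc):
    """Language detection stage: a toy stopword check, not a real detector."""
    words = set(doc["text"].lower().split())
    doc["language"] = "en" if words & {"the", "and", "of"} else "unknown"
    return doc

def extract_entities(doc):
    """Entity extraction stage: match a small hypothetical list of known names."""
    known = ["Microsoft", "SharePoint", "NASA"]
    doc["entities"] = [name for name in known if name in doc["text"]]
    return doc

def custom_stage(doc):
    """Custom stage: any business-specific step, here a word-count property."""
    doc["word_count"] = len(doc["text"].split())
    return doc

# Assumed stage order; each stage receives the document enriched by the previous one.
PIPELINE = [convert_format, detect_language, extract_entities, custom_stage]

def process(raw):
    """Run one raw document through every stage in sequence."""
    doc = {"raw": raw}
    for stage in PIPELINE:
        doc = stage(doc)
    return doc

result = process("<p>The SharePoint team and Microsoft</p>")
print(result["language"], result["entities"], result["word_count"])
```

Because every stage has the same signature, adding a custom stage is a matter of appending one more function to the list.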

Search Driven Applications

Meet all the search application needs you have across your business

“How do I support the unique search needs of teams and work that impact our business?”

To do so, you need a search platform that has
• A deep understanding of your information
• Flexible relevance to meet diverse needs
• A customizable UX to increase user efficiency

Sales: 360° Customer Insight

Analytics: Risk Management

Marketing: Competitive Intelligence

Research & Development: Innovation Portal

Support: Call Center Advisor

Operations: Systems/Logistics Portal

Legal, HR, IT, Finance, …

Enterprise Search from Microsoft

UX | IT | DX

Go beyond the search box

Eliminate compromise

Do more with search

www.microsoft.com/enterprisesearch/

© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.

The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

microsoft.com / Enterprise Search

Thank You.

Building powerful applications using semantic classification

Toby Conrad

toby.conrad@smartlogic.com

+1 773 251 0824

Introduction

• Is it useful to know what content you have?

• Will it help me improve search?

• Will it deliver powerful search applications?

• Examples

• Conclusion

Knowing what you have

What about your content?

How do you catalogue your content?

• Creation Date
• Modified Date
• Author
• Format (PDF, DOC, XLS)
• Subject
• Location
• Project
• Function (IT, HR, Finance)
• Expert
• Protective Marker
• Retention Expiry
• Publisher
• Site

How do I structure it?

• Creation Date
• Modified Date
• Author
• Format (PDF, DOC, XLS)
• Subject
• Location
• Project
• Function (IT, HR, Finance)
• Expert
• Protective Marker
• Retention Expiry
• Publisher
• Site

Groupings: Structural, Process, Information

Information Seeking Behaviour

• Creation Date
• Modified Date
• Author
• Format (PDF, DOC, XLS)
• Subject
• Location
• Project
• Function (IT, HR, Finance)
• Expert
• Protective Marker
• Retention/Expiry
• Publisher
• Department

SharePoint 2010 Search: Metadata Refinement Panel

What is the benefit?

• Precision
• Recall
• Confidence
• Discovery
• Find
• Re-Use
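For readers unfamiliar with the first two terms: precision is the share of returned results that are relevant, and recall is the share of relevant items that were returned. A minimal worked example, using made-up document IDs:

```python
def precision_recall(retrieved, relevant):
    """Compute (precision, recall) for one query from two collections of IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical query: 4 documents returned, 5 documents are actually relevant,
# and 2 of the returned documents are among the relevant ones.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d7", "d8", "d9"])
print(p, r)  # 0.5 0.4
```

Filtering on well-maintained metadata tends to raise precision, while synonyms in a taxonomy tend to raise recall; the numbers above are illustrative and not measurements from any system.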

• Perfectly formed filters organised by facet (NASA screenshot)
• Explore relationships (NASA screenshot)
• Topic overviews (NASA screenshot)
• Concept Mapping (NASA screenshot)
• Explore broader and narrower topics (NASA screenshot)
• Hierarchical faceted search (NHS screenshot)
• Visual content navigation (NHS screenshot)

Metadata enrichment process 1

Earth > Americas > North America > USA > Florida > Tampa
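The enrichment step on this slide can be read as: a document tagged with a specific place also inherits every broader term above it. A minimal sketch, assuming the place names on the slide form a single chain from Tampa up to Earth (that ordering is an assumption, and the dictionary below is hypothetical data, not a Smartlogic or FAST structure):

```python
# Hypothetical child -> parent links, using the place names shown on the slide.
BROADER = {
    "Tampa": "Florida",
    "Florida": "USA",
    "USA": "North America",
    "North America": "Americas",
    "Americas": "Earth",
}

def with_broader_terms(term, broader=BROADER):
    """Return the term followed by each of its ancestors, nearest first."""
    path = [term]
    while path[-1] in broader:
        path.append(broader[path[-1]])
    return path

print(with_broader_terms("Tampa"))
```

With these extra tags in the index, a search scoped to "North America" can return a document that only mentions Tampa.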

Automating the process with Semantics

Doing it with FAST

[Architecture diagram; labelled components:]
• FAST ESP Server
• Query Pipeline
• Index
• User Requests
• Corpus
• Search and Navigation Enhancement
• Classification and Text Mining
• Indexing Pipeline
• Document Processor
• Portal
• Search Application Framework
• Sample Interface Code
• Ontology Lifecycle Management

Doing it with SharePoint

[Architecture diagram; labelled components:]
• Microsoft Office SharePoint Server
• MOSS Search
• Crawler
• User Requests
• Web Application
• Document Library
• Search Web Part
• Mapped Property
• Search and Navigation Enhancement
• Classification and Text Mining
• Ontology Lifecycle Management
• Publishing Site
• Semaphore Solution (WSP)

...and as part of the SharePoint load

Migration Engine

Open Semantic Platform

In summary

• Classifying your content tells you exactly what you have, where to locate it and what to use it for

• It is valuable in managing, re-purposing and finding content – and most of all in supporting key processes

• With text mining, content classification, off-the-shelf integration and UI frameworks it is now easy enough to be mainstream

Seth Earley, Earley & Associates

Mobilizing a Semantic Classification Project

Mobilizing for Semantic Classification

• Define the information management problem you are attempting to solve
  • Not enough to say “make the information easier to use”

• Focus on a specific process, user type, audience, context, information source
  • For example: proposal development process for sales organization in North America

• Enroll the appropriate business stakeholders, executive sponsor, content or repository owners
  • Quantify business impact of the problem

Mobilizing for Semantic Classification

• Ask where and how taxonomy and information architecture can improve the user experience

• Consider overall site organizing principles as well as search functionality

• Develop both active and passive personalization to surface content to users in the context of their task

Leveraging Taxonomy to Improve Findability

Faceted Search

Navigational Search via

Topic Taxonomy

Scoped Search

Organization & Access

Site Map & Navigation

Wireframes & Template Design

Audience Analysis & IM Lifecycle Automation

User Roles & Groups

Personalization Design

User Scenarios

Workflow Design

Possible Approaches

• Option 1 – Small scale technology proof of concept

• Option 2 – More detailed current state assessment of content, processes and organizing principles

• Option 3 – Full information strategy and architecture development

Determine the correct overall approach through a Semaphore Planning Engagement

Semaphore Planning Engagement

1. Review your findability objectives and the overall context in which Semaphore will be deployed (technology, organization, processes, and representative assets)

2. Plan and configure a proof of concept that helps you assess the potential of Semaphore
  • Demonstrating how it will improve findability
  • Evaluating the potential of automated taxonomy creation and tagging given a selection of representative documents or other digital assets

3. Document results and identify opportunities and challenges to succeeding with a Semaphore deployment in your organization

4. Collaborate to develop a roadmap for meeting your findability objectives
  • Planning out technology integration, taxonomy development, knowledge transfer, sustainable governance and other aspects of a successful initiative

Answering Key Questions for Initiative Success

• Is our current taxonomy sufficient to meet our findability objectives?
  • If not, where do we need to expand to more facets for search and navigation?
  • Do we need to custom craft high-value areas of taxonomy to support our core business operations?

• Does our taxonomy have the rich synonyms and relationships required to ensure automated tagging?
  • If not, will automated tagging do a good job with the metadata present in our unstructured content?
  • Can we “extract” semantic metadata using Semaphore’s ability to text mine and generate an ontology?
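To see why synonym coverage matters for automated tagging, consider a deliberately simple tagger that only assigns a preferred term when one of its synonyms appears in the text. This is a sketch for illustration, not how Semaphore classifies content; the taxonomy fragment and the matching rule are assumptions.

```python
import re

# Hypothetical taxonomy fragment: preferred term -> synonyms (illustrative values only).
TAXONOMY = {
    "Proposal": ["proposal", "bid", "tender"],
    "Human Resources": ["human resources", "hr", "personnel"],
}

def auto_tag(text, taxonomy=TAXONOMY):
    """Return preferred terms whose synonyms appear as whole words in the text."""
    lowered = text.lower()
    tags = []
    for preferred, synonyms in taxonomy.items():
        if any(re.search(r"\b" + re.escape(s) + r"\b", lowered) for s in synonyms):
            tags.append(preferred)
    return tags

print(auto_tag("Draft bid for the North America sales team"))
print(auto_tag("Three new hires joined through HR"))
```

If "bid" were missing from the synonym list, the first document would receive no tag at all, which is the gap the question above is probing.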

Answering Key Questions for Initiative Success

• Do we need to improve our governance model to maintain a deep and broad taxonomy over time?
  • Can we leverage our existing governance model for data management or do we need a different approach?

• What are the skills and resources needed to successfully maintain a Semaphore solution over time?
  • How much training will our organization need?
  • Given our environment and culture what is the best way to develop the skills we need?

• How will we leverage semantic classification to further improve findability and reuse in our IT initiatives?

In Summary - Creating Innovative Enterprise Search Applications

• Design the user experience of information access, not simply ‘white box’ search

• Think of search throughout the design of your application, not as an afterthought

• Allow the user to make choices that will “disambiguate” their searches – help them think more precisely

• Carefully evaluate your environment – people, process, content, tools – in order to improve the search experience

Questions?

Please fill out the survey that should be in your inbox.

Let us know what topics you are interested in and how we can improve the series.

Seth Earley

seth@earley.com

www.earley.com

781-820-8080