Search, APIs, Capability Management and the Sensis Journey Craig Rees

DESCRIPTION

See conference video: http://www.lucidimagination.com/devzone/events/conferences/revolution/2011

Earlier this year, Sensis launched its Business Search API, which allows publishers to develop local search propositions powered by the two million business listings contained in the Australian Yellow Pages® and White Pages® directories. This case study will explore Sensis’ strategic direction for search and explain how the framework and metrics by which search is managed at Sensis were used to define our search roadmap. Key architectural decisions, including our use of Solr and MongoDB, will be discussed, as well as our approach to real-time search tuning and quality management.

Page 1: Search, APIs, capability management and the Sensis journey - By Rees Craig

Search, APIs, Capability Management and the Sensis Journey

Craig Rees

Page 2: Today’s menu

• Project background
• Platform selection
• Search capability
• Relevance
• Architecture
• Quality management
• Hurdles
• What’s next

Page 3: Sensis

• Sensis helps Australians find, buy and sell
• From print directories to a cross-platform lead generator
• Sensis publishes over 1.8 million business listings
• Two of the top 10 visited online sites in Australia (WhitePages.com.au and YellowPages.com.au)

Page 4: Project background

Business objectives

• Drive presence in the local search marketplace
• Open up the largest database of business listings in Australia
• Reduce the effort required from local search developers
• Free to use; we are after the reporting

Technology objectives

• Develop a total search platform
• Relevancy testing as part of the development lifecycle
• A framework to identify problem spaces
• Manageable platform
• Continuous deployments

Page 5: Developer portal

Page 6: Platform selection

• Support for the search capability team
• Structured vs non-structured data
• Deterministic vs black box
• Non-proprietary code base
• Community backing

Page 7: The Sensis Search capability maturity model*

Lvl 1: Unmanaged
• No resources
• No reporting
• Out of the box features

Lvl 2: Adhoc
• Adhoc processes
• Part time team
• Static dictionaries
• Individual led innovation

Lvl 3: Monitored
• Defined team
• Regular monitoring
• Static autosuggest
• Basic linguistics

Lvl 4: Managed
• Online dashboards
• Test environments
• Dynamic search refinements
• Targets and metrics

Lvl 5: Optimized
• A/B testing
• Machine learning
• External collaboration
• Multiple contexts

*Courtesy of Pete Crawford & Craig Lonsdale

Page 8: Context is key

• Intent: Name, Type, Product, Spatial
• Location
• Chronology
• Social Graph
• Individual
• Device

Page 9: Our architecture

[Architecture diagram. Components shown: Historical search data, MongoDB, Business data, Geo Service, Index, Name Query Handler, Type Query Handler, Search Service, Reporting Service, Reporting Events, Publisher, Solr, API, Ontologies, Mashery.]

Page 10: Data staging

[Same architecture diagram as Page 9.]

Page 11: Search

[Same architecture diagram as Page 9.]
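The diagram places Solr behind a Search Service, but the deck does not show what a request between the two looks like. As a rough illustration only, the sketch below builds a Solr `/select` URL for a keyword search restricted to a radius around a point. The host, the field names (`name`, `category`, `location`), the boosts, and the choice of the dismax parser with a `geofilt` filter are all assumptions made for this example, not details of the Sensis implementation.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_solr_url(base_url, terms, lat, lon, radius_km, rows=10):
    """Build a Solr /select URL for a keyword search near a point.

    Field names and boosts are hypothetical placeholders.
    """
    params = {
        "q": terms,
        "defType": "dismax",          # standard Solr query parser option
        "qf": "name^2 category",      # hypothetical fields, name boosted
        # Spatial filter: keep only documents within radius_km of the point.
        "fq": f"{{!geofilt sfield=location pt={lat},{lon} d={radius_km}}}",
        "rows": rows,
        "wt": "json",
    }
    return f"{base_url}/select?{urlencode(params)}"


# Example: plumbers within 5 km of a point in central Melbourne.
url = build_solr_url("http://localhost:8983/solr", "plumber", -37.81, 144.96, 5)
print(url)
```

Keeping the spatial restriction in `fq` rather than in `q` means it filters the result set without contributing to the relevance score, which is one common way to separate "where" from "what" in a local search query.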

Page 12: API

[Same architecture diagram as Page 9.]

Page 13: API proxy

[Same architecture diagram as Page 9.]

Page 14: Evolution of search management

• Moved from a black box solution to a manageable platform
• Deliver search improvements without major code changes
• Understand how results were calculated
• Identify problems scientifically
• Continuously tune and test relevance

[Timeline: Yesterday, Today, Tomorrow]

Page 15: Problem spaces, quality management & tuning

Path analysis used to identify problem spaces

“Gold sets” used to define overall quality score (TREC)

Features signed off only when they make a positive impact on the quality score

Specific gold sets for each problem space: Intent, Spelling & stemming, Location, Phrase parsing
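To make the gold-set idea concrete, here is a minimal sketch of a quality score and a sign-off check. The deck only says that gold sets define a TREC-style quality score; the specific metric used here (mean precision at k), the sample queries, and the listing IDs are assumptions chosen for illustration.

```python
def precision_at_k(results, relevant, k):
    """Fraction of the top-k results that the gold set marks as relevant."""
    return sum(1 for doc_id in results[:k] if doc_id in relevant) / k


def quality_score(gold_set, search, k=2):
    """Mean precision@k over every query in the gold set."""
    scores = [precision_at_k(search(query), relevant, k)
              for query, relevant in gold_set.items()]
    return sum(scores) / len(scores)


# Hypothetical gold set: query -> set of listing IDs judged relevant.
gold_set = {
    "plumber melbourne": {"b1", "b2"},
    "pizza": {"b7"},
}

# Hypothetical ranked results before and after a tuning change.
baseline_results = {"plumber melbourne": ["b1", "b9", "b8"], "pizza": ["b3", "b7"]}
candidate_results = {"plumber melbourne": ["b1", "b2", "b8"], "pizza": ["b7", "b3"]}

baseline = quality_score(gold_set, baseline_results.__getitem__)
candidate = quality_score(gold_set, candidate_results.__getitem__)

# Sign-off rule from the slide: accept only if the score improves.
signed_off = candidate > baseline
print(baseline, candidate, signed_off)
```

Running one such score per problem space (intent, spelling and stemming, location, phrase parsing) would show which area a change helps or harms, rather than only the overall effect.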

Page 16: Search quality analysis and testing

Page 17: Results examiner

Page 18: Score analysis

Page 19: Tuning

Page 20: Lather, rinse, repeat

Page 21: Hurdles along the way

• Data redundancy and homogeneity
• Solr ranking of rare terms
• Intent differentiation
• Contextual synonyms
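The deck does not say what went wrong with rare terms. One plausible reading is the inverse document frequency (IDF) component of Lucene's classic TF-IDF scoring, which gives a term more weight the fewer documents contain it. The sketch below uses the classic Lucene IDF formula; the corpus size comes from the 1.8 million listings quoted earlier, and the document frequencies are made-up values for illustration.

```python
import math


def classic_idf(doc_freq, num_docs):
    """IDF as defined by Lucene's classic TF-IDF similarity."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


NUM_DOCS = 1_800_000  # listing count quoted in the deck

# Hypothetical document frequencies for a common and a rare term.
common = classic_idf(50_000, NUM_DOCS)
rare = classic_idf(3, NUM_DOCS)

print(f"common term idf: {common:.2f}")
print(f"rare term idf:   {rare:.2f}")
```

Under this formula a term that appears in only a handful of listings, such as an unusual business name or a misspelling, can outweigh the more meaningful common term in a multi-word query, which is the kind of effect relevance tuning has to compensate for.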

Page 22: Where next?

• Query engine
• Facets / autosuggest
• Real time tuning
• Machine learning
• Multi term queries
• Scoring thresholds
• Content Value

Page 23: Questions?

Email: [email protected]
developers.sensis.com.au
Twitter: @SensisAPI
@ablebagel