Site Search Analytics in a Nutshell

  • Published on
    21-Apr-2017

  • Site Search Analytics in a Nutshell

    Louis Rosenfeld

    lou@louisrosenfeld.com @louisrosenfeld

    Webdagene 10 September 2013

  • Hello, my name is Lou

    www.louisrosenfeld.com | www.rosenfeldmedia.com

  • Let's look at the data

  • No, let's look at the real data

    Critical elements in bold: IP address, time/date stamp, query, and # of results:

    XXX.XXX.X.104 - - [10/Jul/2006:10:25:46 -0800] "GET /search?access=p&entqr=0&output=xml_no_dtd&sort=date%3AD%3AL%3Ad1&ud=1&site=AllSites&ie=UTF-8&client=www&oe=UTF-8&proxystylesheet=www&q=lincense+plate&ip=XXX.XXX.X.104 HTTP/1.1" 200 971 0 0.02

    XXX.XXX.X.104 - - [10/Jul/2006:10:25:48 -0800] "GET /search?access=p&entqr=0&output=xml_no_dtd&sort=date%3AD%3AL%3Ad1&ie=UTF-8&client=www&q=license+plate&ud=1&site=AllSites&spell=1&oe=UTF-8&proxystylesheet=www&ip=XXX.XXX.X.104 HTTP/1.1" 200 8283 146 0.16

    What are users searching?

    How often are users failing?
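
As a rough illustration, the four critical elements could be pulled out of a log line with a few lines of Python. This is only a sketch: it assumes every line follows the layout of the two samples above, and that the third number after the quoted request (0 and 146 in the samples) is the result count. `LOG_PATTERN` and `parse_search_log_line` are hypothetical names, not part of any search engine's tooling.

```python
import re
from urllib.parse import parse_qs

# Assumed layout, modeled on the sample lines above:
# IP - - [timestamp] "GET /search?<params> HTTP/1.1" status bytes results seconds
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"GET /search\?(?P<params>\S*) HTTP/[\d.]+" '
    r'(?P<status>\d+) (?P<size>\d+) (?P<results>\d+) (?P<seconds>[\d.]+)$'
)

def parse_search_log_line(line):
    """Return the IP, timestamp, query, and result count, or None if no match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    # parse_qs decodes the URL parameters, turning "+" back into spaces.
    params = parse_qs(match.group("params"))
    return {
        "ip": match.group("ip"),
        "timestamp": match.group("timestamp"),
        "query": params.get("q", [""])[0],
        "results": int(match.group("results")),
    }

# Shortened copy of the first sample line (most URL parameters dropped for readability).
sample = (
    'XXX.XXX.X.104 - - [10/Jul/2006:10:25:46 -0800] '
    '"GET /search?access=p&q=lincense+plate&ip=XXX.XXX.X.104 HTTP/1.1" 200 971 0 0.02'
)
record = parse_search_log_line(sample)
print(record["query"], record["results"])
```

Collecting the `query` values answers the first question; counting records where `results` is 0 gives a first estimate for the second.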

  • SSA is semantically rich data, and...

    Queries sorted by frequency

  • ...what users want--in their own words

  • A little goes a long way

    A handful of queries/tasks/ways to navigate/features/documents meet the needs of your most important audiences

    Not all queries are distributed equally

    Nor do they diminish gradually

    80/20 rule isn't quite accurate

  • (and the tail is quite long)

    The Long Tail is much longer than you'd suspect

  • The Zipf Distribution, textually

  • Some things you can do with SSA

    1. Make it harder to get lost in deep content

    2. Make search smarter

    3. Reduce jargon

    4. Learn how your audiences differ

    5. Know when to publish what

    6. Own and enjoy your failures

    7. Avoid disaster

    8. Predict the future

  • #1 Make it harder to get lost

  • Start with basic SSA data: queries and query frequency

    Percent: volume of search activity for a unique query during a particular time period

    Cumulative Percent: running sum of percentages
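
The two columns defined above can be computed directly from a list of raw queries. The sketch below is one way to do it, assuming queries are compared case-insensitively; `frequency_report` is a hypothetical helper and the sample queries are made up.

```python
from collections import Counter

def frequency_report(queries):
    """Rank unique queries by count, with percent and cumulative percent."""
    counts = Counter(q.strip().lower() for q in queries)
    total = sum(counts.values())
    report = []
    cumulative = 0.0
    for rank, (query, count) in enumerate(counts.most_common(), start=1):
        percent = 100.0 * count / total   # share of all search activity
        cumulative += percent             # running sum of the shares
        report.append((rank, query, count, round(percent, 2), round(cumulative, 2)))
    return report

# Made-up queries for illustration only.
log = ["map", "map", "map", "map", "library", "library", "library",
       "parking", "parking", "jobs"]
for row in frequency_report(log):
    print(row)
```

Reading down the cumulative column shows how few unique queries it takes to cover a given share of all search activity.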

  • Tease out common content types

    Took an hour to...

    Analyze top 50 queries (20% of all search activity)

    Ask and iterate: what kind of content would users be looking for when they searched these terms?

    Add cumulative percentages

    Result: prioritized list of potential content types

    #1) application: 11.77%

    #2) reference: 10.5%

    #3) instructions: 8.6%

    #4) main/navigation pages: 5.91%

    #5) contact info: 5.79%

    #6) news/announcements: 4.27%
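
Once each top query has been hand-labeled with a likely content type, the prioritized list is just a sum of query shares per label. A minimal sketch, assuming the labeling has already been done by a person; the queries, shares, and labels below are invented, and `prioritize_content_types` is a hypothetical helper.

```python
from collections import defaultdict

def prioritize_content_types(query_percents, query_to_type):
    """Sum each query's share of search activity under its assigned content type."""
    totals = defaultdict(float)
    for query, percent in query_percents.items():
        # Queries nobody has labeled yet are kept visible rather than dropped.
        content_type = query_to_type.get(query, "unclassified")
        totals[content_type] += percent
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

# Hypothetical queries, shares (in percent), and labels, invented for illustration.
query_percents = {
    "apply online": 2.5,
    "application form": 1.5,
    "campus map": 1.0,
    "phone directory": 0.75,
}
query_to_type = {
    "apply online": "application",
    "application form": "application",
    "campus map": "reference",
    "phone directory": "contact info",
}
for content_type, percent in prioritize_content_types(query_percents, query_to_type):
    print(f"{content_type}: {percent:.2f}%")
```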

  • Clear content types lead to better contextual navigation

    artist descriptions

    album reviews

    album pages

    artist bios

    discography

    TV listings

  • #2 Make search smarter

  • Clear content types improve search performance

    Content objects related to products

    Raw search results

  • Contextualizing advanced features

  • Session data suggest progression and context

    search session patterns: 1. solar energy 2. how solar energy works

    search session patterns: 1. solar energy 2. energy

    search session patterns: 1. solar energy 2. solar energy charts

    search session patterns: 1. solar energy 2. explain solar energy

    search session patterns: 1. solar energy 2. solar energy news
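
Patterns like these can be recovered from a query log by grouping each visitor's queries into sessions. The sketch below assumes a visitor is identified by IP address and that a gap of more than 30 minutes starts a new session; both are assumptions, not something the talk specifies. `build_sessions` and `describe_refinement` are hypothetical helpers, and the word-set comparison is a deliberately crude way to label a refinement.

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)  # assumed cutoff, not from the talk

def build_sessions(events):
    """Group (ip, timestamp, query) events into per-visitor query sessions."""
    sessions = []
    last_seen = {}  # ip -> (timestamp of last query, index into sessions)
    for ip, timestamp, query in sorted(events, key=lambda e: e[1]):
        previous = last_seen.get(ip)
        if previous is not None and timestamp - previous[0] <= SESSION_GAP:
            index = previous[1]
            sessions[index].append(query)
        else:
            sessions.append([query])
            index = len(sessions) - 1
        last_seen[ip] = (timestamp, index)
    return sessions

def describe_refinement(first, second):
    """Label how the second query relates to the first, by comparing word sets."""
    a, b = set(first.lower().split()), set(second.lower().split())
    if a == b:
        return "repeated"
    if a < b:
        return "narrowed"   # the second query adds words
    if b < a:
        return "broadened"  # the second query drops words
    return "changed"

# Made-up events; the IP addresses and times are placeholders.
events = [
    ("10.0.0.1", datetime(2013, 9, 10, 9, 0, 0), "solar energy"),
    ("10.0.0.2", datetime(2013, 9, 10, 9, 0, 30), "solar energy"),
    ("10.0.0.1", datetime(2013, 9, 10, 9, 1, 0), "how solar energy works"),
    ("10.0.0.2", datetime(2013, 9, 10, 9, 2, 0), "energy"),
]
for session in build_sessions(events):
    print(session, describe_refinement(session[0], session[1]))
```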

  • Recognizing proper nouns, dates, and unique ID#s

  • #3 Reduce jargon

  • Saving the brand by killing jargon at a community college

    Jargon related to online education: FlexEd, COD, College on Demand

    Marketing's solution: expensive campaign to educate public (via posters, brochures)

    The Numbers (from SSA):

    query rank    query
    #22           online*
    #101          COD
    #259          College on Demand
    #389          FlexTrack

    * "online" part of 213 queries

    Result: content relabeled, money saved

  • #4 Learn how your audiences differ

  • Who cares about what?

  • Why analyze queries by audience?

    Fortify your personas with data

    Learn about differences between audiences

    Open University Enquirers: 16 of 25 queries are for subjects not taught at OU

    Open University Students: search for course codes, topics dealing with completing program

    Determine what's commonly important to all audiences (these queries better work well)
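
If queries can be segmented by audience, comparing each segment's top queries separates what is shared from what is specific to one group. A minimal sketch, assuming the segmentation already exists; `top_queries` and `compare_audiences` are hypothetical helpers, and the sample queries are invented rather than taken from the Open University data.

```python
from collections import Counter

def top_queries(queries, n):
    """Return the set of the n most frequent queries."""
    return {query for query, _ in Counter(queries).most_common(n)}

def compare_audiences(queries_by_audience, n=2):
    """Split each audience's top queries into shared and audience-specific sets."""
    tops = {aud: top_queries(qs, n) for aud, qs in queries_by_audience.items()}
    shared = set.intersection(*tops.values()) if tops else set()
    distinct = {aud: top - shared for aud, top in tops.items()}
    return shared, distinct

# Invented queries for two audiences, for illustration only.
queries_by_audience = {
    "enquirers": ["fees", "fees", "fees", "law", "law", "medicine"],
    "students": ["exam dates", "exam dates", "exam dates", "fees", "fees", "library"],
}
shared, distinct = compare_audiences(queries_by_audience)
print("shared:", shared)
print("distinct:", distinct)
```

The shared set is the list of queries that "better work well"; the distinct sets are raw material for personas.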

  • #5 Know when to publish what

  • Interest in the football team:

    going...

    ...going...

    gone

    Time to study!

  • Before Tax Day

  • After Tax Day

  • #6 Own and enjoy your failures

  • Failed navigation? Examining unexpected searching

    Look for places searches happen beyond main page

    What's going on?

    Navigational failure?

    Content failure?

    Something else?

  • Where navigation is failing (Professional Resources page)

    Do users and AIGA mean different things by Professional Resources?

  • Comparing what users find and what they want

  • Failed business goals? Developing custom metrics

    Netflix asks

    1. Which movies most frequently searched? (query count)

    2. Which of them most frequently clicked through? (MDP views)

    3. Which of them least frequently added to queue? (queue adds)
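
The three questions form a funnel: searches, then MDP views, then queue adds. The sketch below shows one way such a custom metric might be computed; it is not Netflix's actual method. The field names, the thresholds, and the titles are all assumptions made up for illustration.

```python
def queue_funnel(stats):
    """Rank titles by search volume and attach click-through and queue-add rates."""
    rows = []
    for title, s in stats.items():
        click_rate = s["mdp_views"] / s["searches"] if s["searches"] else 0.0
        add_rate = s["queue_adds"] / s["mdp_views"] if s["mdp_views"] else 0.0
        rows.append({"title": title, "searches": s["searches"],
                     "click_rate": click_rate, "add_rate": add_rate})
    return sorted(rows, key=lambda r: r["searches"], reverse=True)

def underperformers(stats, top_n, max_add_rate):
    """Among the top_n most-searched titles, return those rarely added to the queue."""
    return [r["title"] for r in queue_funnel(stats)[:top_n]
            if r["add_rate"] <= max_add_rate]

# Placeholder titles and counts, invented for illustration.
stats = {
    "Title A": {"searches": 1000, "mdp_views": 800, "queue_adds": 600},
    "Title B": {"searches": 900, "mdp_views": 720, "queue_adds": 72},
    "Title C": {"searches": 50, "mdp_views": 10, "queue_adds": 1},
}
print(underperformers(stats, top_n=2, max_add_rate=0.25))
```

A title that is searched and clicked often but seldom added points at a gap between what the search promised and what the page delivered.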

  • #7 Avoid disasters

  • The new and improved search engine that wasn't

    Vanguard used SSA to help benchmark the existing search engine's performance and help select a new engine

    New search engine performed poorly

    But IT needed convincing to delay launch

    Information Architect & Dev Team Meeting

    "Search seems to have a few problems"

    "Nah."

    "Where's the proof?"

    "You can't tell for sure."

  • What to do? Test performance of common queries

    Before and after testing using two sets of metrics

    1. Relevance: how reliably the search engine returns the best matches first

    2. Precision: proportion of relevant results clustered at the top of the list
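
One possible reading of the two metrics, sketched in Python. It assumes relevance is scored as the position of a hand-picked best match in the result list (so lower is better) and precision as the fraction of the top k results a person judged relevant (so higher is better). The talk does not give exact formulas, so `relevance_rank`, `precision_at_k`, and the cutoff k=5 are illustrative assumptions, and the document IDs are placeholders.

```python
def relevance_rank(results, best_match):
    """Position of the best match in the result list (1 = top); None if absent."""
    for position, doc in enumerate(results, start=1):
        if doc == best_match:
            return position
    return None

def precision_at_k(results, relevant, k=5):
    """Fraction of the top k results that were judged relevant."""
    top = results[:k]
    if not top:
        return 0.0
    return sum(1 for doc in top if doc in relevant) / len(top)

# Placeholder result lists for the same common query on two engines.
best_match = "d1"
relevant = {"d1", "d2", "d4"}
old_engine = ["d1", "d2", "d3", "d4", "d5"]
new_engine = ["d9", "d3", "d1", "d8", "d2"]

for name, results in (("old", old_engine), ("new", new_engine)):
    print(name, relevance_rank(results, best_match), precision_at_k(results, relevant))
```

Running the same common queries through both engines and averaging the scores gives the kind of before-and-after evidence a launch decision needs.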

  • Old engine (target) and new compared

    Note: low relevance and high precision scores are optimal

    More on Vanguard case study: http://bit.ly/D3B8c

    uh-oh better

  • #8 Predict the future

  • Shaping the Financial Times editorial agenda

    FT compares these

    Spiking queries for proper nouns (i.e., people and companies)

    Recent editorial coverage of people and companies

    Discrepancy? Breaking story?!

    Let the editors know!

    Seed your
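
The comparison described on this slide can be sketched as two steps: find names whose query counts have jumped against a baseline, then drop the ones editors have already covered. This is not the FT's actual process; the thresholds, the add-one smoothing, and the sample names and counts are assumptions for illustration, and `spiking_terms` and `uncovered_spikes` are hypothetical helpers.

```python
def spiking_terms(baseline_counts, recent_counts, min_ratio=3.0, min_count=10):
    """Terms whose recent query count is at least min_ratio times their baseline."""
    spikes = []
    for term, recent in recent_counts.items():
        baseline = baseline_counts.get(term, 0)
        # Add 1 to the baseline so brand-new terms do not divide by zero.
        if recent >= min_count and recent / (baseline + 1) >= min_ratio:
            spikes.append(term)
    return sorted(spikes)

def uncovered_spikes(baseline_counts, recent_counts, recently_covered):
    """Spiking terms that do not appear in the set of recently covered names."""
    return [term for term in spiking_terms(baseline_counts, recent_counts)
            if term not in recently_covered]

# Invented names and counts, for illustration only.
baseline = {"acme corp": 5, "jane doe": 40, "globex": 2}
recent = {"acme corp": 60, "jane doe": 45, "globex": 30}
covered = {"globex"}

print(uncovered_spikes(baseline, recent, covered))
```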

  • Can SSA bring us together?

  • Lou's TABLE OF OVERGENERALIZED DICHOTOMIES

    What they analyze
      Web Analytics: Users' behaviors (what's happening)
      User Experience: Users' intentions and motives (why those things happen)

    What methods they employ
      Web Analytics: Quantitative methods to determine what's happening
      User Experience: Qualitative methods for explaining why things happen

    What they're trying to achieve
      Web Analytics: Helps the organization meet goals (expressed as KPI)
      User Experience: Helps users achieve goals (expressed as tasks or topics of interest)

    How they use data
      Web Analytics: Measure performance (goal-driven analysis)
      User Experience: Uncover patterns and surprises (emergent analysis)

    What kind of data they use
      Web Analytics: Statistical data ("real" data in large volumes, full of errors)
      User Experience: Descriptive data (in small volumes, generated in lab environment, full of errors)

  • Lands' End and SKUs

    SKU: # 39072-2AH1

  • Use SSA to start work on a site report card

    SSA helps determine common information needs

  • Read this

    Search Analytics for Your Site: Conversations with Your Customers by Louis Rosenfeld (Rosenfeld Media, 2011)

    www.rosenfeldmedia.com

    Use code WEBDAGENE2013 for 20% off all Rosenfeld Media books

  • Louis Rosenfeld

    lou@louisrosenfeld.com

    www.louisrosenfeld.com | www.rosenfeldmedia.com | www.slideshare.net/lrosenfeld

    @louisrosenfeld | @rosenfeldmedia

    Say hello