Search Analytics for Fun and Profit An Event Apart Chicago, Illinois August 27, 2007 Lou Rosenfeld www.rosenfeldmedia.com


DESCRIPTION

Lou Rosenfeld's presentation on local site search analytics; An Event Apart Chicago, August 27, 2007.


Page 1: Search Analytics for Fun and Profit

Search Analytics for Fun and Profit

An Event Apart
Chicago, Illinois
August 27, 2007

Lou Rosenfeld
www.rosenfeldmedia.com

Page 2: Search Analytics for Fun and Profit

Who I Am

Information architecture consultant to Fortune 500s

Publisher and founder, Rosenfeld Media

Blog at www.louisrosenfeld.com

Co-author, Information Architecture for the World Wide Web (3rd ed., 2006; O’Reilly)

New book: Search Analytics for Your Site: Conversations with your customers (2008; Rosenfeld Media): www.rosenfeldmedia.com/books/searchanalytics

Page 3: Search Analytics for Fun and Profit

Anatomy of a Search Log (from Google Search Appliance)

Critical elements in pink: IP address, time/date stamp, query, and # of results:

XXX.XXX.X.104 - - [10/Jul/2006:10:25:46 -0800] "GET /search?access=p&entqr=0&output=xml_no_dtd&sort=date%3AD%3AL%3Ad1&ud=1&site=AllSites&ie=UTF-8&client=www&oe=UTF-8&proxystylesheet=www&q=lincense+plate&ip=XXX.XXX.X.104 HTTP/1.1" 200 971 0 0.02

XXX.XXX.X.104 - - [10/Jul/2006:10:25:48 -0800] "GET /search?access=p&entqr=0&output=xml_no_dtd&sort=date%3AD%3AL%3Ad1&ie=UTF-8&client=www&q=license+plate&ud=1&site=AllSites&spell=1&oe=UTF-8&proxystylesheet=www&ip=XXX.XXX.X.104 HTTP/1.1" 200 8283 146 0.16

XXX.XXX.XX.130 - - [10/Jul/2006:10:24:38 -0800] "GET /search?access=p&entqr=0&output=xml_no_dtd&sort=date%3AD%3AL%3Ad1&ud=1&site=AllSites&ie=UTF-8&client=www&oe=UTF-8&proxystylesheet=www&q=regional+transportation+governance+commission&ip=XXX.XXX.X.130 HTTP/1.1" 200 9718 62 0.17
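A minimal sketch of pulling the four critical elements out of one such line. The field layout (status, bytes, number of results, elapsed seconds after the request) is an assumption inferred from the three sample lines above, and the name `parse_search_log_line` is hypothetical; a real appliance log may differ.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumed layout, inferred from the sample lines on this slide:
# client IP, two unused fields, [timestamp], "request", HTTP status,
# bytes sent, number of results, elapsed seconds.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+) (?P<results>\d+) (?P<seconds>[\d.]+)$'
)

def parse_search_log_line(line):
    """Return IP, timestamp, query, and result count, or None if no match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    # parse_qs decodes "+" and %XX escapes in the query string.
    params = parse_qs(urlsplit(match.group("url")).query)
    return {
        "ip": match.group("ip"),
        "timestamp": match.group("timestamp"),
        "query": params.get("q", [""])[0],
        "results": int(match.group("results")),
    }
```

Run over the first two sample lines, this would pair the misspelled query with 0 results and the corrected query with 146, which is the kind of signal the rest of the deck builds on.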

Page 4: Search Analytics for Fun and Profit

The Zipf Curve: Short Head, Middle Torso, Long Tail

Page 6: Search Analytics for Fun and Profit

What’s the Sweet Spot?

Rank   Cumul. %   Count   Query
   1       1.40    7218   campus map
  14      10.53    2464   housing
  42      20.18    1351   webenroll
  98      30.01     650   computer center
 221      40.05     295   msu union
 500      50.02     124   hotels
7877      80.00       7   department of surgery
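A table like the one above could be produced along these lines. This is a sketch only: the function name `rank_queries` and the normalization (trim and lowercase) are assumptions, not the method used for the MSU data.

```python
from collections import Counter

def rank_queries(queries):
    """Return (rank, cumulative_percent, count, query) rows, most frequent first."""
    # Assumption: queries differing only in case or outer whitespace are the same.
    counts = Counter(q.strip().lower() for q in queries if q.strip())
    total = sum(counts.values())
    rows = []
    running = 0
    for rank, (query, count) in enumerate(counts.most_common(), start=1):
        running += count
        rows.append((rank, round(100.0 * running / total, 2), count, query))
    return rows
```

Reading down the cumulative column shows how few distinct queries cover a given share of all searches, which is one way to pick the sweet spot.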

Page 7: Search Analytics for Fun and Profit

Topical Patterns and Seasonal Changes

Page 8: Search Analytics for Fun and Profit

Where will you Capture Search Queries?

1. The search logs that your search engine naturally captures and maintains as searches take place

2. Search keywords or phrases that your users execute, that you capture into your own local database

3. Search keywords or phrases that your commercial search solution captures, records, and reports on (Mondosoft, Visual Sciences, Ultraseek, Google Appliance, etc.)

Page 9: Search Analytics for Fun and Profit

Querying your Queries: Getting started

1. What are the most frequent unique queries?
2. Are frequent queries retrieving quality results?
3. Click-through rates per frequent query?
4. Most frequently clicked result per query?
5. Which frequent queries retrieve zero results?
6. What are the referrer pages for frequent queries?
7. Which queries retrieve popular documents?
8. What interesting patterns emerge in general?
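Questions 1 and 5 are the easiest to answer mechanically. A minimal sketch, assuming each log record has already been reduced to a dict with `query` and `results` keys (both names are assumptions for illustration):

```python
from collections import Counter

def _normalize(query):
    # Assumption: case and outer whitespace do not distinguish queries.
    return query.strip().lower()

def frequent_queries(records, top_n=10):
    """Question 1: the most frequent unique queries."""
    counts = Counter(_normalize(r["query"]) for r in records)
    return counts.most_common(top_n)

def frequent_zero_result_queries(records, top_n=10):
    """Question 5: frequent queries that retrieved zero results."""
    counts = Counter(
        _normalize(r["query"]) for r in records if r["results"] == 0
    )
    return counts.most_common(top_n)
```

The other questions need data this sketch does not have, such as click-through and referrer fields.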

Page 10: Search Analytics for Fun and Profit

Tune your Questions: From generic to specific

Netflix asks
1. Which movies most frequently searched?
2. Which of them most frequently clicked through?
3. Which of them least frequently added to queue?

Page 11: Search Analytics for Fun and Profit

Diagnose This: Fixing and improving the UX

1. User Research
2. Content Development
3. Interface Design: search entry interface, search results
4. Retrieval Algorithm Modification
5. Navigation Design
6. Metadata Development

Page 12: Search Analytics for Fun and Profit

User Research: What do they want?…

SA is a true expression of users’ information needs (often surprising: e.g., SKU #s at clothing retailer; URLs at IBM)

Provides context by displaying aspects of single search sessions

Page 13: Search Analytics for Fun and Profit

User Research: …what else do they want?…

BBC provides reports to determine other terms searched within same session (tracked by cookies)
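The slide does not say how the BBC report is computed. As a sketch of the general idea, assuming each record is a `(session_id, query)` pair where the session id comes from a cookie (the function name and record shape are hypothetical):

```python
from collections import Counter, defaultdict

def co_searched_terms(records, target):
    """Count other queries issued in the same sessions as `target`."""
    sessions = defaultdict(set)
    for session_id, query in records:
        sessions[session_id].add(query.strip().lower())
    target = target.strip().lower()
    related = Counter()
    for queries in sessions.values():
        if target in queries:
            # Each session contributes at most one count per related query.
            related.update(queries - {target})
    return related
```

Calling `co_searched_terms(records, "housing").most_common(5)` would list the five queries most often seen alongside "housing".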

Page 14: Search Analytics for Fun and Profit

User Research: …who wants it?…

Specific segments’ needs as determined by:
  Security clearance
  IP address
  Job function
  Account information

Alternatively, you may be able to extrapolate segments directly from SA
  Pages they initiate searches from

Page 15: Search Analytics for Fun and Profit

User Research: …who wants it?…

BBC’s top queries report from children’s section of site

Page 16: Search Analytics for Fun and Profit

User Research: …and when do they want it?

Time-based variation (and clustered queries) from MSU

By hour, by day, by season

Helps determine “best bets” development

Also can help tune main page and other editorial content
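Time-based variation can be explored by bucketing queries on the log timestamp. A minimal sketch, assuming records are `(timestamp, query)` pairs with timestamps in the format shown on the search log slide (for example `10/Jul/2006:10:25:46 -0800`) and an English locale for the month abbreviation:

```python
from collections import Counter
from datetime import datetime

# Assumed timestamp format, taken from the sample search log lines.
TIMESTAMP_FORMAT = "%d/%b/%Y:%H:%M:%S %z"

def queries_by_hour(records):
    """Count searches per hour of day (0-23) in the log's own UTC offset."""
    by_hour = Counter()
    for timestamp, _query in records:
        when = datetime.strptime(timestamp, TIMESTAMP_FORMAT)
        by_hour[when.hour] += 1
    return by_hour
```

Swapping `when.hour` for `when.weekday()` or `when.month` gives the by-day and by-season views.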

Page 17: Search Analytics for Fun and Profit

Content Development: Do we have the right content?

From www.behaviortracking.com

Analyze 0 result queries
  Does the content exist?
  If so, there are titling, wording, metadata, or indexing problems
  If not, why not?

Page 18: Search Analytics for Fun and Profit

Content Development: Are we featuring the right stuff?

Track clickthroughs to determine which results should rise to the top (example: SLI Systems)

Also suggests which “best bets” to develop to address common queries

BBC removes navigation pages from search results

Page 19: Search Analytics for Fun and Profit

Search Entry Interface Design: “The Box” or something else?

Identify “dead end” points (e.g., 0 hits, 2000 hits) where assistance could be added

Query syntax helps you select search features to expose (e.g., use of Boolean operators)

OR

Page 20: Search Analytics for Fun and Profit

Search Results Interface Design: Which results where?

#10 result is clicked through more often than #s 6, 7, 8, and 9 (ten results per page)

From SLI Systems (www.sli-systems.com)

Page 21: Search Analytics for Fun and Profit

Search Results Interface Design: How to sort results?

Financial Times has found that users often include dates in their queries

Obvious but effective improvement: allow users to sort by date

Page 22: Search Analytics for Fun and Profit

Search System: What to change?

Add functionality: Financial Times added spell checking

Retrieval algorithm modifications
  Financial Times weights company names higher
  Netflix determines better weighting for unique terms and phrases

Deloitte, Barnes & Noble, Vanguard demonstrate that basic improvements (e.g., Best Bets) are insufficient (and justify increased $$$)

Page 23: Search Analytics for Fun and Profit

Navigation: Any improvements?

Michigan State University builds A-Z index automatically based on frequent queries
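The slide does not describe how MSU generates its index. One way the idea might be sketched, with a hypothetical `build_az_index` function and an assumed frequency threshold:

```python
from collections import Counter, defaultdict

def build_az_index(queries, min_count=2):
    """Group queries seen at least `min_count` times under their first letter."""
    counts = Counter(q.strip().lower() for q in queries if q.strip())
    index = defaultdict(list)
    for query, count in counts.items():
        if count >= min_count:
            first = query[0].upper()
            # Assumption: queries not starting with a letter go under "#".
            key = first if first.isalpha() else "#"
            index[key].append(query)
    return {letter: sorted(entries) for letter, entries in sorted(index.items())}
```

A production index would still need each entry mapped to a destination page, which this sketch leaves out.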

Page 24: Search Analytics for Fun and Profit

Navigation: Where does it fail?

Track and study pages (excluding main page) where search is initiated
  What do they search? (e.g., acronyms, jargon)
  Are there other issues that would cause a “dead end”? (e.g., tagging and titling problems)

Are there user studies that could test/validate problems on these pages? (e.g., “Where did you want to go next?”)

Page 25: Search Analytics for Fun and Profit

Metadata Development: How do searchers express their needs?

Tone and jargon (e.g., “cancer” vs. “oncology,” “lorry” vs. “truck,” acronyms)

Syntax (e.g., Boolean, natural language, keyword)

Length (e.g., number of terms/query; Long Tail queries longer and more complex than Short Head)

Everything we know from analyzing folksonomic tags applies here, and vice versa
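Syntax and length can be tallied per query with very little code. A rough sketch; the heuristics here (uppercase AND/OR/NOT as Boolean operators, a trailing question mark as natural language) and the name `describe_query` are assumptions, not rules from the talk:

```python
import re

# Assumption: only uppercase operators count as deliberate Boolean syntax.
BOOLEAN_OPERATOR = re.compile(r"\b(AND|OR|NOT)\b")

def describe_query(query):
    """Summarize the length and syntax of a single query string."""
    terms = query.split()
    return {
        "terms": len(terms),
        "boolean": bool(BOOLEAN_OPERATOR.search(query)),
        "quoted_phrase": '"' in query,
        "question": query.strip().endswith("?"),
    }
```

Averaging `terms` separately over Short Head and Long Tail queries is one way to check the length difference noted above.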

Page 26: Search Analytics for Fun and Profit

Metadata Development: Which values and attributes?

Uncover hierarchy and identify
  Metadata values (e.g., mobile vs. cell)
  Metadata attributes (e.g., genre, region)
  Content types (e.g., spec, price sheet)

SA combines with AI tools for clustering, enabling concept searching and thesaurus development

Page 27: Search Analytics for Fun and Profit

Metadata Development: Leveraging differences in the curve

Variations in information needs emerge between Short Head and Long Tail

Example: Deloitte intranet’s “known-item” queries are common; research topics are infrequent

[Figure labels: known-item queries; research queries]

Page 28: Search Analytics for Fun and Profit

Organizational Impact: Educational opportunities

“Reverse engineer” performance problems

Vanguard
  Tests “best” results for common queries
  Determines why these results aren’t retrieved or clicked-through
  Demonstrates problem and solutions to content owners/authors benefits

Sandia Labs does same, only with top results that are losing rank in search results pages

Page 29: Search Analytics for Fun and Profit

Organizational Impact: Reexamining assumptions

Financial Times learns about breaking stories from their logs by monitoring spikes in company names and individuals’ names and comparing with their current coverage

Discrepancy = possible breaking story; reporter is assigned to follow up

Next step? Assign reporters to “beats” that emerge from SA
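The Financial Times tooling is not described on the slide. The spike-monitoring half of the idea might be sketched as below; the thresholds, the function name `find_spikes`, and the input shapes (today's counts and a per-query daily average) are all assumptions, and the comparison against current coverage is left out.

```python
def find_spikes(today, baseline_daily_avg, ratio=3.0, min_count=10):
    """Return (query, today_count, usual_count) rows for queries that spiked.

    A query spikes when it was searched at least `min_count` times today and
    at least `ratio` times its usual daily count (floored at 1).
    """
    spikes = []
    for query, count in today.items():
        usual = baseline_daily_avg.get(query, 0.0)
        if count >= min_count and count >= ratio * max(usual, 1.0):
            spikes.append((query, count, usual))
    # Largest spikes first.
    return sorted(spikes, key=lambda row: row[1], reverse=True)
```

Each row returned is only a candidate; an editor would still have to check it against what has already been published.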

Page 30: Search Analytics for Fun and Profit

SA as User Research Method: Sleeper, but no panacea

Benefits
  Non-intrusive
  Inexpensive and (usually) accessible
  Large volume of “real” data
  Represents actual usage patterns

Drawbacks
  Provides an incomplete picture of usage: was user satisfied at session’s end?
  Difficult to analyze: where are the commercial tools?

Complements qualitative methods (e.g., persona development, task analysis, field studies)

Page 31: Search Analytics for Fun and Profit

SA Headaches: What gets in the way?

Problems*
  Lack of time
  Few useful tools for parsing logs, generating reports
  Tension between those who want to perform SA and those who “own” the data (chiefly IT)
  Ignorance of the method
  Hard work and/or boredom of doing analysis

Most of these are going away…

* From summer 2006 survey (134 responses), available at book site.

Page 32: Search Analytics for Fun and Profit

Please Share Your SA Knowledge: Visit our book in progress site

Search Analytics for Your Site: Conversations with your Customers by Louis Rosenfeld and Richard Wiggins (Rosenfeld Media, 2008)

Site URL: www.rosenfeldmedia.com/books/searchanalytics/

Feed URL: feeds.rosenfeldmedia.com/searchanalytics/

Page 33: Search Analytics for Fun and Profit

Contact Information

Louis Rosenfeld
Rosenfeld Media, LLC
705 Carroll Street, #2L
Brooklyn, NY 11215 USA

+1.718.306.9396

[email protected]

www.louisrosenfeld.com
www.rosenfeldmedia.com