Data Mining in Business Intelligence 7 March 2013, Ben-Gurion University
Analysing Search Trends
Yair Shimshoni, Google R&D center
Outline
● What are search trends?
● The Google Trends tool
● Search Trends Analysis
● Using Search Trends for Business Intelligence
● Nowcasting with Google Trends & Google Correlate
Search Trends
● More than a billion people around the globe search the web using search engines. Through search, users express their interests: to know, to see, to hear, to do, to buy, to sell, to compare, to study, to travel, ...
● Therefore, the aggregated search patterns of Internet users w.r.t. various terms convey valuable information on people’s interests, preferences and intentions.
● These search patterns are represented by time series called Search Trends.
● Each data point represents the aggregated amount (or frequency) of searches for a given search term, in a given place, on each day/week/month.
● Search Trends represent what people are searching for, when, where and how much. Therefore, business questions can be examined through the proxy of Search Trends, which can help answer questions like:
○ Was my TV campaign effective?
○ Is my brand’s awareness growing?
○ What is my share of voice w.r.t. my competitors?
○ What do customers associate with my brand?
○ What competitor products are considered by buyers along with my product?
○ Where should I open my new store?
Search Trends Characteristics
● Search Trends share some fundamental characteristics, which may vary depending on the context of the corresponding search terms.
● Often Search Trends show regular behaviors like a constant increase, a seasonal pattern or recurring spikes, while other Trends may be highly irregular.
● We have conducted experiments that associated such characteristics with the notion of Predictability of Search Trends in Google. We showed that over half of the Trends of the most popular search terms are predictable (Shimshoni, Efron & Matias, 2009).
Seasonal Search Trends (can you guess...?)
Seasonal Search Trends (examples)
ski
world cup
Kokhav Nolad (כוכב נולד, "A Star Is Born") , Hisardut (הישרדות, "Survivor")
Non-Seasonal Search Trends (can you guess...?)
Non-Seasonal Search Trends (examples)
FTP
facebook , myspace
madonna , spice girls , lady gaga
The Google Trends Tool
● Google shares with the public the aggregated and anonymized search trends of users' searches across its domains worldwide, and exposes this information via a free analytic tool called Google Trends (f.k.a. Google Insights for Search).
● Google Trends analyzes a portion of all Google searches (from 2004 to the present) to show trends and patterns of what the world is searching for...
● Google Trends is updated on a daily basis with a delay of 1-2 days.
● Any exposed data is strictly based on multiple distinct users to protect privacy.
● Google Trends exposes search trends data and analysis in several dimensions:
○ Time: Daily/Weekly/Monthly granularity.
○ Geo Location: Countries, Regions and Cities (including Metro Areas in the US).
○ Category: 1000+ Search categories in a hierarchical taxonomy.
○ Search property: Web, Image, News & Product Search.
○ Associated searches: Top & Rising related searches.
● Google Trends is simple & interactive; it can be embedded as a gadget and allows data exports to CSV files.
● This tool and data can be useful to anyone, especially to:
○ Analysts & researchers
○ Journalists & bloggers
○ Marketers & advertisers
○ Scholars
The Google Trends Tool www.google.com/trends
Recurring Spikes of Search Interest
Alternating seasonality across geographies
YoY Seasonality: The Football season in the US
Classifying searches to categories
Taxonomy of Categories + ML classification engine
● Each search query is classified into a hierarchical taxonomy of 1000+ search categories depending on its nearby context (queries before & after).
● Filtering the trends by category makes it possible to 'clean' the data and observe the underlying patterns of search behavior in a specific category/market.
● Often, the aggregated trends of the entire search category represent the major underlying dynamics of the category.
(C) Google Inc.
Aggregated Category Trends (can you guess...?)
Automotive
Computers & Electronics
Sports
Entertainment
Aggregated Category Trends (examples)
Search Trends Research
● Examine the regularity, seasonality & predictability of Search Trends.
● Conduct correlation analysis, clustering & profiling of the Trends space.
● Use time series prediction methodologies to forecast Search Trends.
● Forecast users' interest & analyze business cycles using search trends.
● Examine the dynamics of Co-Searching of search terms.
● Define relatedness metrics & investigate association between search terms.
● Compare & integrate query data with other online/offline data sources.
● Examine the 'flow' of a web phenomenon and analyse its geo-propagation.
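The correlation analysis and clustering mentioned in the list above can be illustrated with a minimal sketch. It assumes each trend is a plain list of monthly values, standardizes each series to z-scores, and groups terms greedily by Pearson correlation. The greedy grouping rule, the 0.9 threshold, the function names and the sample data are all illustrative assumptions, not the clustering method used in the talk.

```python
from statistics import mean, pstdev


def standardize(series):
    """Rescale a series to zero mean and unit variance (z-scores)."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        raise ValueError("a constant series cannot be standardized")
    return [(v - mu) / sigma for v in series]


def correlation(a, b):
    """Pearson correlation of two equal-length series."""
    za, zb = standardize(a), standardize(b)
    return sum(p * q for p, q in zip(za, zb)) / len(za)


def group_by_correlation(trends, threshold=0.9):
    """Greedy grouping (an assumption of this sketch): a term joins the
    first cluster whose seed term it correlates with at or above the
    threshold; otherwise it starts a new cluster."""
    clusters = []  # each cluster is a list of term names; item 0 is the seed
    for term, series in trends.items():
        for cluster in clusters:
            if correlation(trends[cluster[0]], series) >= threshold:
                cluster.append(term)
                break
        else:
            clusters.append([term])
    return clusters


# Made-up monthly values, for illustration only.
trends = {
    "summer_term_a": [10, 20, 40, 60, 40, 20, 10, 20, 40, 60, 40, 20],
    "summer_term_b": [12, 22, 41, 63, 39, 21, 11, 19, 42, 58, 41, 22],
    "declining_term": [60, 55, 50, 45, 40, 35, 30, 25, 20, 15, 10, 5],
}
clusters = group_by_correlation(trends)
print(clusters)
```

Standardizing first matters because search volumes of different terms differ in scale; after z-scoring, only the shape of the pattern drives the grouping.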
Clustering Analysis: Common Patterns of Search Behavior
(standardized monthly search data for 30 popular US queries - automatically clustered)
Example of yearly seasonality with peaks before the holidays (+ upward trend):
Example of yearly seasonality with peaks in the summer and lows in the winter (+ downward trend):
Example of a long term downward trend:
Predictability of Search Trends
● Formulation of Predictability w.r.t. a model, an error definition and thresholds.
● Explored trends from various countries and search categories.
● Examined the influence of different trends' characteristics.
On the Predictability of Search Trends / Y. Shimshoni, N. Efron, Y. Matias, 2009
Predictability of Search Trends (results)
Which queries are predictable?
● Over half of the most popular Google search queries were found predictable in a 12-month-ahead forecast, with a mean absolute prediction error of approximately 12% on average.
● Some categories have a particularly high fraction of predictable queries:
○ Health (74%)
○ Food & Drink (67%)
○ Travel (65%)
● Some categories have a particularly low fraction of predictable queries:
○ Entertainment (35%)
○ Online Communities & Social Networks (27%)
● Trends of aggregated categories were found even more predictable:
○ 88% of the aggregated category trends of over 600 categories are predictable, with a mean absolute prediction error of less than 6% on average.
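To make the idea of a 12-month-ahead forecast scored by mean absolute prediction error concrete, here is a minimal sketch. It assumes a seasonal-naive forecaster (repeat last year's values) as a stand-in model and a hypothetical 25% error threshold; neither the model nor the threshold is taken from the cited paper, and the data is made up.

```python
def seasonal_naive_forecast(history, horizon=12, season=12):
    """Forecast each future month with the value seen one season earlier.
    This stand-in model is an assumption of the sketch."""
    if len(history) < season:
        raise ValueError("need at least one full season of history")
    last_season = history[-season:]
    return [last_season[h % season] for h in range(horizon)]


def mean_absolute_percentage_error(actual, predicted):
    """Mean of |actual - predicted| / |actual| over the forecast window."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def is_predictable(series, horizon=12, max_error=0.25):
    """Hold out the last `horizon` months, forecast them from the rest,
    and compare the error with a hypothetical threshold."""
    train, held_out = series[:-horizon], series[-horizon:]
    forecast = seasonal_naive_forecast(train, horizon)
    error = mean_absolute_percentage_error(held_out, forecast)
    return error <= max_error, error


# Made-up monthly data: a stable yearly pattern that grows 10% in year three.
pattern = [50, 55, 60, 70, 80, 100, 110, 100, 80, 70, 60, 55]
seasonal = pattern * 2 + [v * 1.1 for v in pattern]
seasonal_flag, seasonal_error = is_predictable(seasonal)

# Made-up irregular data: flat history followed by erratic swings.
irregular = [50] * 24 + [10, 200, 10, 200] * 3
irregular_flag, irregular_error = is_predictable(irregular)

print(seasonal_flag, round(seasonal_error, 3))
print(irregular_flag, round(irregular_error, 3))
```

The seasonal series passes the threshold because last year's shape still describes this year's, while the erratic series does not.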
Analysing Search Trends for Business Intelligence on:
● Competitive Landscape & Market Share Analysis
● Measuring Brand Awareness and other brand tracking metrics
● Attributes Associations
● Targeting & Positioning Strategies
● Profiling of Customers and Understanding their Preferences
● Planning, Media Mixing & Econometric Modeling
● Campaign Effectiveness & Benchmarking
● Business Development & New Product Decisions
● Detection of Market Changes and New Trends
● Prediction of Demand & Sales
● Nowcasting and Macroeconomic Monitoring
Examples: Brand & Campaign Analysis
We use the proxy of searches for brand names/terms to estimate Market Shares:
We can demonstrate the Campaign Effect and compare it to benchmark trends:
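The market-share proxy described above can be sketched as a "share of search" calculation: each brand's search volume in a period divided by the total across the compared brands. The function name, brand names and numbers below are hypothetical, and the sketch assumes the volumes are on a common scale (for example, taken from one joint comparison).

```python
def share_of_search(brand_volumes):
    """Map each brand to its per-period share of the combined search volume.

    brand_volumes: dict of brand name -> list of volumes, one per period.
    """
    brands = list(brand_volumes)
    periods = len(brand_volumes[brands[0]])
    if any(len(brand_volumes[b]) != periods for b in brands):
        raise ValueError("all brands need the same number of periods")
    totals = [sum(brand_volumes[b][t] for b in brands) for t in range(periods)]
    return {
        b: [brand_volumes[b][t] / totals[t] if totals[t] else 0.0
            for t in range(periods)]
        for b in brands
    }


# Made-up volumes for two hypothetical brands over three periods.
volumes = {"brand_a": [60, 50, 40], "brand_b": [40, 50, 60]}
shares = share_of_search(volumes)
print(shares["brand_a"])
```

Tracking these shares over time shows whether a brand is gaining or losing search interest relative to its competitors, which is the proxy the slide uses for market share.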
Analyzing deviations from modeled prediction
● Model the aggregated trends of 4 sub-categories in the US Auto Industry.
● Forecast a period around the economic crisis: Aug 2008 to July 2009.
● Examine the deviations of the prediction model in these 4 sub-categories.
● It seems evident that this industry went through a significant change:
○ More search traffic than predicted: Vehicle Maintenance, Auto Parts
○ Less search traffic than predicted: Vehicle Shopping, Auto Financing
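A deviation analysis of this kind can be sketched as follows: for each category, average the relative gap between the actual and the predicted series over the forecast window, then label the category by the sign of that gap. The 5% tolerance, the category names and the numbers are hypothetical and do not reproduce the data behind the slide.

```python
def mean_relative_deviation(actual, predicted):
    """Average of (actual - predicted) / predicted over the forecast window."""
    return sum((a - p) / p for a, p in zip(actual, predicted)) / len(actual)


def classify_deviations(categories, tolerance=0.05):
    """Label each category by how its actual traffic compares with the forecast.

    categories: dict of name -> (actual series, predicted series).
    tolerance: hypothetical band within which traffic counts as 'as predicted'.
    """
    labels = {}
    for name, (actual, predicted) in categories.items():
        deviation = mean_relative_deviation(actual, predicted)
        if deviation > tolerance:
            label = "more than predicted"
        elif deviation < -tolerance:
            label = "less than predicted"
        else:
            label = "as predicted"
        labels[name] = (label, deviation)
    return labels


# Made-up actual vs. predicted values for three hypothetical categories.
categories = {
    "category_up": ([110, 120, 115], [100, 100, 100]),
    "category_down": ([80, 85, 90], [100, 100, 100]),
    "category_flat": ([101, 99, 100], [100, 100, 100]),
}
labels = classify_deviations(categories)
for name, (label, deviation) in labels.items():
    print(name, label, round(deviation, 3))
```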
Nowcasting with Search Trends
● Many real-life time series, like macro-economic indicators and many market signals, can be modeled using Search Trends.
● Choi and Varian (2009, 2011) have used Search Trends of single queries and aggregated search categories to ‘predict the present’: they estimate the present levels of indicators like car sales or unemployment before their official publication.
● For example, the rate of unemployment y may be predicted by the query x "job search", or new auto purchases may be predicted by the search category "vehicle shopping". So we can use an AR model with the search series as an extra regressor: $y_t = a_0 + a_1 y_{t-1} + a_{12} y_{t-12} + b x_t + e_t$
● Such Nowcasting models are much faster & cheaper to build and run than traditional measurement methodologies.
● In recent years over 60 academic articles were published on using Google Search Trends for nowcasting in various fields (including by several central banks). The overall finding is that Search Trends significantly improve the forecast accuracy of traditional models.
● Since there are millions of search queries and hundreds of search categories, compared to about 100 monthly data observations in most real-life time series, one can use the Google Correlate tool to suggest correlating Trends.
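The AR model with a search-trend regressor given above can be fitted by ordinary least squares. The sketch below is one possible way to do so in plain Python, solving the normal equations directly; the function names are hypothetical, and the data is synthetic and noise-free, generated from known coefficients purely so the fit can be checked. It is not the estimation code used by Choi and Varian.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; regressors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_nowcast_model(y, x):
    """Least-squares fit of y_t = a0 + a1*y_{t-1} + a12*y_{t-12} + b*x_t.
    Returns the coefficients as [a0, a1, a12, b]."""
    rows = [[1.0, y[t - 1], y[t - 12], x[t]] for t in range(12, len(y))]
    targets = [y[t] for t in range(12, len(y))]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def nowcast(coeffs, y_history, x_now):
    """Estimate the current y from past y values and the current search index."""
    a0, a1, a12, b = coeffs
    return a0 + a1 * y_history[-1] + a12 * y_history[-12] + b * x_now


# Synthetic, noise-free data generated from known coefficients (illustration only).
true_a0, true_a1, true_a12, true_b = 1.0, 0.5, 0.3, 2.0
x = [float((t * 37) % 21 - 10) for t in range(60)]  # stand-in search index
y = [float((t * 5) % 12 - 6) for t in range(12)]    # arbitrary first year
for t in range(12, 60):
    y.append(true_a0 + true_a1 * y[t - 1] + true_a12 * y[t - 12] + true_b * x[t])

coeffs = fit_nowcast_model(y[:-1], x[:-1])  # fit on all but the latest month
estimate = nowcast(coeffs, y[:-1], x[-1])   # the search index arrives first
print([round(c, 4) for c in coeffs], round(estimate, 4), round(y[-1], 4)
)
```

The point of the last two lines is the timing: the search index for the current month is available before the official value of y is published, so the fitted model can produce an estimate in the meantime.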
Google Correlate
www.google.com/trends/correlate