#OnCrawlBreakfast
(and make Technical SEO great again)
@FrancoisGoube, CEO @Oncrawl
How to optimize your crawl budget?
Who am I?
Francois Goube, Founder @OnCRAWL
15 years of SEO experience, serial entrepreneur
French Majestic Ambassador
Semantic nerd
Data addict & SEO maniac
Make your inner SEO super hero grow up
Super-powers = Knowledge + Tools
What Google says about « Crawl Budget »
If new pages tend to be crawled the same day they're published, crawl budget is not something webmasters need to focus on.
[…] if a site has fewer than a few thousand URLs, most of the time it will be crawled efficiently.
[…] we don't have a single term that would describe everything that "crawl budget" stands for externally.
https://webmasters.googleblog.com/2017/01/what-crawl-budget-means-for-googlebot.html
Understanding Google’s crawl budget
Your Website What Google really knows!
This is what a log file looks like
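For reference, here is a minimal sketch of reading one line of such a file, assuming the server writes the common Apache/Nginx "combined" log format. The sample line, the regular expression and the name `parse_line` are illustrative assumptions, not taken from the slides. Matching on the user-agent string alone can be spoofed; Google recommends confirming Googlebot with a reverse DNS lookup, which is left out here.

```python
import re

# Combined log format:
# ip ident user [time] "method url protocol" status size "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    hit = match.groupdict()
    hit["status"] = int(hit["status"])
    # Naive check on the user-agent string only (assumption: no DNS verification).
    hit["is_googlebot"] = "googlebot" in hit["agent"].lower()
    return hit

# Made-up sample line, for illustration only.
sample = (
    '66.249.66.1 - - [12/Mar/2017:06:25:24 +0000] '
    '"GET /category/shoes HTTP/1.1" 200 5123 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
)
hit = parse_line(sample)
print(hit["url"], hit["status"], hit["is_googlebot"])
```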
What your log files look like in OnCrawl
• All bots data
• Status codes
• Crawl frequency
• List of URL fetched by bots
• All referring traffic data
• Active pages
• Freshrank
• …
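To make a few of these metrics concrete, the sketch below counts bot hits per status code and per URL and derives an average number of hits per day. The input tuples and the function name `summarize` are hypothetical; this is not how OnCrawl computes its reports, only a small illustration of the idea.

```python
from collections import Counter

# Hypothetical pre-parsed bot hits: (date, url, status code).
bot_hits = [
    ("2017-03-12", "/", 200),
    ("2017-03-12", "/category/shoes", 200),
    ("2017-03-12", "/old-product", 404),
    ("2017-03-13", "/", 200),
    ("2017-03-13", "/category/shoes", 301),
    ("2017-03-14", "/", 200),
]

def summarize(hits):
    """Count bot hits per status code and per URL, plus average hits per day."""
    status_counts = Counter(status for _, _, status in hits)
    url_counts = Counter(url for _, url, _ in hits)
    days = {date for date, _, _ in hits}
    hits_per_day = len(hits) / len(days) if days else 0.0
    return status_counts, url_counts, hits_per_day

status_counts, url_counts, hits_per_day = summarize(bot_hits)
print(status_counts)
print(url_counts.most_common(2))
print(hits_per_day)
```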
You know what Google did!
• 100% of your GSC properties show crawl statistics
• With log analysis you can detect errors in bots' behaviour
• Bad internal linking structure, pagination, faceting, orphan pages or spider traps can affect Google's ability to crawl your website properly.
For all our customers, an optimisation of Crawl
Budget leads to better rankings
Every webmaster should keep an eye on their crawl budget
Google Crawl Budget
“Taking crawl rate and crawl demand together we define crawl budget
as the number of URLs Googlebot can and wants to crawl.”
So every day Google crawls a set number of pages
As an SEO you need to help Google crawl your Money Pages
Google patents about Crawl rate & Demand
• US 8666964 B1 : Managing items in crawl schedule
• US 8707312 B1 : Document reuse in a search engine crawler
• US 8037054 B2 : Web crawler scheduler that utilizes sitemaps from websites
• US 7305610 B1 : Distributed crawling of hyperlinked documents
• US 8407204 B2 : Minimizing visibility of stale content in web searching including revising web crawl intervals of documents
• US 8386459 B1 : Scheduling a recrawl
• US 8042112 B1 : Scheduler for search engine crawler
Crawl Scheduling is the big thing!
Page Importance
« Page Importance » is not the PageRank
• Where is the page in my architecture? – Depth influences Crawl ratio
• Page Rank: TF/CF of a page - Majestic
• Internal Page Rank – InRank OnCrawl
• Type of document: PDF, HTML, TXT
• Inclusion in sitemap.xml
• Quality of anchors
• Quality of content: number of words, near duplicates
• …
Combine all these pieces of Data with your logs!
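As one illustration of combining crawl data with logs, the sketch below computes each page's click depth with a breadth-first search over the internal link graph, then the share of pages at each depth that Googlebot requested. The link graph, the set of crawled URLs and both function names are hypothetical, and "crawl ratio" is assumed here to mean crawled pages divided by known pages at that depth, which the slides do not define explicitly.

```python
from collections import deque, defaultdict

# Hypothetical internal link graph: page -> pages it links to.
links = {
    "/": ["/cat-a", "/cat-b"],
    "/cat-a": ["/product-1", "/product-2"],
    "/cat-b": ["/product-3"],
    "/product-1": [],
    "/product-2": ["/product-4"],
    "/product-3": [],
    "/product-4": [],
}
# Hypothetical set of URLs that Googlebot requested, taken from the logs.
crawled_by_google = {"/", "/cat-a", "/cat-b", "/product-1", "/product-3"}

def click_depths(graph, start="/"):
    """Breadth-first search: depth is the minimum number of clicks from start."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in graph.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

def crawl_ratio_by_depth(depths, crawled):
    """Share of pages at each depth that appear in the bot logs."""
    total = defaultdict(int)
    seen = defaultdict(int)
    for page, depth in depths.items():
        total[depth] += 1
        if page in crawled:
            seen[depth] += 1
    return {depth: seen[depth] / total[depth] for depth in sorted(total)}

depths = click_depths(links)
ratios = crawl_ratio_by_depth(depths, crawled_by_google)
print(ratios)
```

The same grouping can be repeated with any other attribute from the list (inclusion in the sitemap, document type, word count) in place of depth.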
What are the Factors influencing
Google’s Crawl Budget?
All websites are not born equal
Which Ranking Factor affects Crawl Ratio?
Crawler: 3 000 000 (all pages available from your linking structure)
Google: 7 000 000 (all pages known by Google)
Take care of your Orphan Pages
Orphan pages are pages that are not linked from your internal linking structure, but that Google knows about
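Under this definition, orphan pages can be found as a set difference: URLs seen in Googlebot's log hits minus URLs reached by a crawl of the site structure. The sketch below assumes both lists are already available; the URLs and the function name `find_orphans` are hypothetical.

```python
def find_orphans(urls_in_structure, urls_hit_by_google):
    """Orphan pages: URLs Googlebot requested that the site crawl never reached."""
    return sorted(set(urls_hit_by_google) - set(urls_in_structure))

# Hypothetical inputs: one set from a site crawl, one from the server logs.
urls_in_structure = {"/", "/cat-a", "/product-1"}
urls_hit_by_google = {"/", "/product-1", "/old-product", "/archive/2015"}

orphans = find_orphans(urls_in_structure, urls_hit_by_google)
print(orphans)
```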
Orphan pages: generic cases
• Ecommerce:
  • Out-of-stock products
  • Products no longer available
  • Revamping of menus, pagination, faceting, …
• Media:
  • Bad internal linking structure
  • Archives only available through sitemap.xml
Main problems:
• Those pages don't receive any link juice, so chances are they can't rank!
• You are wasting Google's crawl budget
How to deal with your Orphan pages?
[Decision flowchart with yes/no branches. Questions: "Is it normal?", "Do they receive organic traffic?", "Is the page valuable for my current business?". Outcomes: 301 redirect; noindex via robots.txt; add a link from the structure. Can't answer the questions? Ask an expert!]
What ROI can I expect?
• Fewer pages crawled (useless & inactive)
• More useful pages & active pages
• Better indexation
• Better internal popularity
Boost your organic traffic
Grow your Organic Traffic with fewer Pages
Thank you!
@OnCrawl – Booth 29
Try to win a 1-year Pro Subscription!