@earnedMarketing EPISODE WASTE

Large Site SEO Architecture - #BrightonSEO 2015


Tomas Vaitulevicius (@earnedMarketing), Head of Digital Marketing @ JustPark

[Chart: SEO Traffic, Apr-15 to Oct-17; y-axis 0 to 2,000,000]

[Chart: SEO Traffic, Aug-08 to Feb-15; y-axis 0 to 25,000,000]

Big thanks for help in putting this content together:

Richard Baxter (@richardbaxter)

Benjamin Johnson (@d00berry)

Dean Rowe (+DeanRoweSEO)

SEO ARCHITECTURE


Search Demand

Topic Coverage

Top of Class Content

Dedicated Pages

Flat Prioritised Linking

Monitoring

Devising an SEO architecture follows a very similar set of steps at websites of all sizes

Monitoring

Search Demand

Topic Coverage

Top of Class Content

Dedicated Pages

Flat Prioritised Linking

Index management

Thin / duplicate content

Maintenance

Waste

Crawl budget

Cannibalisation

Content churn

Technical Complexity

But at large ones there are a number of other SEO complications that need to be dealt with. This deck focuses primarily on Waste (of crawl budget, Google index & internal link equity)

SEO TOOLKIT

Small Sites

Nofollow: keep Googlebot away from parts of site

Robots.txt: keep Googlebot away from parts of site

Noindex: get parts of site out of Google’s index

Canonical: solve duplicate content issues

Another difference between small and large site SEO architecture is that the basic SEO tools…

SEO TOOLKIT BAND AIDS

Nofollow
  Small Sites: keep Googlebot away from parts of site
  Large Sites: burn internal link equity

Robots.txt
  Small Sites: keep Googlebot away from parts of site
  Large Sites: burn internal link equity & block inbound link equity

Noindex
  Small Sites: get parts of site out of Google’s index
  Large Sites: waste crawl budget and burn 15% of link equity

Canonical
  Small Sites: solve duplicate content issues
  Large Sites: waste crawl budget, burn 15% of link equity & add uncertainty

…become pretty damaging at scale

This matters because PageRank is still the foundation of Google’s crawl and indexation

[Diagram: a small website made up of Home, Category, Search and Listing pages]

Let’s say we have a small website

[Diagram: the same four pages (Home, Category, Search, Listing) with 1 unit of PageRank entering at Home; the values shown along the links are 1, 0.85, 0.72 and 0.61]

With 1 unit of PageRank arriving at the homepage and cascading down through links

[Diagram: the same site with two dead-end links added; the values shown are 1, 0.85, 0.36 and 0.15]

If we add a couple of links patched with the SEO band-aids (nofollow or Robots.txt Disallow), we’ll make half of the link equity of the Category and Search pages evaporate from our site

[Diagram: the same site with the two dead-end links; the Listing page is marked -75%]

Making the listing page 75% weaker. Inefficiencies like these are killing large site SEO as pages with little PageRank don’t get crawled and indexed, and obviously won’t get any traffic
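The arithmetic behind the last three slides can be reproduced in a few lines. This is a minimal sketch under assumptions read off the slide figures rather than stated in the deck: the pages form a chain Home > Category > Search > Listing, each hop passes on 0.85 of a page's PageRank, and that share is split evenly across the page's outgoing links. The function name `cascade` is made up for illustration.

```python
DAMPING = 0.85  # share passed on per hop, taken from the slide figures


def cascade(out_links_per_page, start=1.0, damping=DAMPING):
    """PageRank reaching each page of a linear chain.

    out_links_per_page[i] is the number of outgoing links on page i.
    Exactly one of them leads to the next page in the chain; any others
    are treated as dead ends (nofollow or robots.txt Disallow targets).
    """
    values = [start]
    for out_links in out_links_per_page:
        values.append(values[-1] * damping / out_links)
    return values


pages = ["Home", "Category", "Search", "Listing"]
clean = cascade([1, 1, 1])    # one link per page, nothing wasted
patched = cascade([1, 2, 2])  # Category and Search each gain a dead-end link

for name, before, after in zip(pages, clean, patched):
    print(f"{name:<9} {before:.2f} -> {after:.2f}")

loss = 1 - patched[-1] / clean[-1]
print(f"Listing loses {loss:.0%} of its PageRank")
```

Two dead-end links, each halving what flows onward, compound to a quarter of the original value at the Listing page, which is where the 75% figure comes from.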


IT IS HARD!

Huge amounts of waste and damaging effects of SEO Band Aids do make Large Site SEO Architecture pretty d*mn hard


Page Oriented Architecture

Destination Oriented Architecture

Single Page Application

But we found inspiration in the new technology of Single Page Applications for a new approach to SEO architecture which fixes the problems rather than patching them up

justpark.com/london/              page    destination
justpark.com/london/?page=2       page?   destination
justpark.com/london/?sort=price   page?   destination
justpark.com/listing-1/           page    destination
justpark.com/listing-1/photos     page?   destination
justpark.com/listing-1/save       page?   destination
justpark.com/listing-1/enquire    page?   destination
justpark.com/listing-1/book       page?   destination
justpark.com/forgot-password      page?   destination

PAGES vs DESTINATIONS

In Destination Oriented Architecture we want to identify the canonical pages/URLs that represent real Destinations targeting SEO Topics and “kill” all of the other publicly available URLs


1 SEO Topic = 1 Destination

1 Destination = 1 SEO Topic

No SEO Topic = No Destination

We want to have as many distinct Destinations as we have different SEO Topics we’re targeting, with all the supplementary content and functionality living within these Destinations

DESTINATION TO CRAP RATIOS

[Bar chart: Destination vs Crap share, 0% to 100%, for Usage, Internal links, Index and Crawl]

We use Destination to Crap ratios to gauge how well we’re doing on the journey to a fully Destination Oriented Architecture (it’s also helpful in getting buy-in from the different stakeholders as no one wants to think of their platform as being 80% crap or waste)


Crawl: split of Googlebot crawl hits in your access logs between (exact) destination URLs and everything else

Index: all of your destinations should be in the sitemaps you submit to Google Search Console. The Index ratio is Indexed Destinations (GSC > Crawl > Sitemaps) vs Total Indexed (GSC > Google Index > Index Status) minus Indexed Destinations

Internal Links: all internal links from a web crawl (Screaming Frog, Deep Crawl, etc.) split between the ones pointing to (exact) destination URLs and the rest

Usage: page views of your users (web analytics) split between (exact) destination URLs and the rest

METHODOLOGY
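The Crawl and Index ratios above could be computed roughly as follows. This is a sketch, not the tooling used for the deck: it assumes access logs in the common "combined" format, identifies Googlebot by user-agent string only (which does not verify that the hit is genuine), and uses made-up sample data. The function names and the sample figures are hypothetical.

```python
import re

# Matches the request part of a combined-format log line, e.g. "GET /london/ HTTP/1.1"
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[^"]*"')


def googlebot_paths(log_lines):
    """Yield the requested path of every log line whose user agent mentions Googlebot."""
    for line in log_lines:
        if "Googlebot" in line:
            match = REQUEST_RE.search(line)
            if match:
                yield match.group(1)


def destination_share(urls, destinations):
    """Fraction of urls that are exact destination URLs (the rest is waste)."""
    urls = list(urls)
    if not urls:
        return 0.0
    return sum(url in destinations for url in urls) / len(urls)


def index_destination_share(indexed_destinations, total_indexed):
    """Indexed Destinations as a fraction of everything Google has indexed."""
    return indexed_destinations / total_indexed if total_indexed else 0.0


# Hypothetical sample data
BOT = '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
log = [
    f'66.249.66.1 - - [16/Aug/2015:10:00:00 +0000] "GET /london/ HTTP/1.1" 200 512 "-" {BOT}',
    f'66.249.66.1 - - [16/Aug/2015:10:00:01 +0000] "GET /london/?page=2 HTTP/1.1" 200 512 "-" {BOT}',
    f'66.249.66.1 - - [16/Aug/2015:10:00:02 +0000] "GET /listing-1/ HTTP/1.1" 200 512 "-" {BOT}',
    '10.0.0.7 - - [16/Aug/2015:10:00:03 +0000] "GET /london/ HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]
destinations = {"/london/", "/listing-1/"}

crawl_share = destination_share(googlebot_paths(log), destinations)
index_share = index_destination_share(indexed_destinations=800, total_indexed=4000)
print(f"Crawl: {crawl_share:.0%} destination, Index: {index_share:.0%} destination")
```

The same `destination_share` helper applies to the Internal Links and Usage ratios: feed it link targets from a crawl export or page-view URLs from analytics instead of log paths.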


REAL WORLD EXAMPLES


rightmove.co.uk/fees.html?listing_id=165467654
justpark.com/parking-spaces/…/callout-snippet/

> Js-off: host content within a relevant destination and link with in-page anchors
> Js span trigger for preloaded or AJAX lightbox

! Crawl Waste ! Littered Index ! Duplicate Content ! Thin Content ! Wasted Internal Link Equity ! Scattered Inbound Link Equity

SUPPLEMENTARY CONTENT

rightmove.co.uk/property/London.html/svr/2124;jsessionid=9BE1415794CEDC5590B1FA11B8817DE0

> Exclude for bots
> Move to cookies
> Go stateless (in extreme circumstances carrying the state in POST form hidden fields)

! Crawl Waste ! Littered Index ! Duplicate Content ! Thin Content ! Wasted Internal Link Equity ! Scattered Inbound Link Equity

SESSION PARAMETERS

http://ww.just-park.co.uk/uk/parking/London >> https://www.justpark.com/uk/parking/london/

> Catch-all 301 redirects in server config
  > http <> https
  > non-www <> www < unrecognised subdomains
  > upper case > lower case
  > no trailing slash <> trailing slash

! Crawl Waste ! Littered Index ! Duplicate Content ! Thin Content ! Wasted Internal Link Equity ! Scattered Inbound Link Equity

ALTERNATIVE URLs
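The catch-all redirect rules on this slide normally live in server config, but the logic is easier to read as a function. The sketch below is an illustration only: it assumes every request, whatever host it arrived on, should end up on a single canonical host (the `CANONICAL_HOST` constant is an assumption), and that every path takes a trailing slash, which would need an exception for file-like paths such as `.html`.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.justpark.com"  # assumption: the only host that serves content


def canonical_url(url, host=CANONICAL_HOST):
    """Return the one canonical form: https, canonical host, lower-case path, trailing slash."""
    parts = urlsplit(url)
    path = parts.path.lower() or "/"
    if not path.endswith("/"):
        path += "/"
    return urlunsplit(("https", host, path, parts.query, ""))


def redirect_for(url):
    """Return (301, target) when the requested URL is not canonical, otherwise None."""
    target = canonical_url(url)
    return None if url == target else (301, target)


print(redirect_for("http://ww.just-park.co.uk/uk/parking/London"))
print(redirect_for("https://www.justpark.com/uk/parking/london/"))
```

Resolving all variations in one step matters: a chain of separate redirects (http to https, then non-www to www, then lower-casing) costs a crawl hit per hop, whereas a single function like this yields one 301 straight to the destination.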


instagram.com/accounts/login/?next=%2Fabout…
distilled.net/store/profile/login/?next=/resources/

> Hash parameter for Js
> HTTP Referrer
> Cookies / LocalStorage
> Lightbox login form

! Crawl Waste ! Littered Index ! Duplicate Content ! Thin Content ! Wasted Internal Link Equity ! Scattered Inbound Link Equity

FORWARDING PARAMETERS

ufc.com/fightweek?utm_campaign=Intl+Fight+…
ted.com/?utm_medium=email&utm_source=Oxford…

> Special URL tracking redirect loop
> Hash (#) parameter based traffic source tracking

! Crawl Waste ! Littered Index ! Duplicate Content ! Thin Content ! Wasted Internal Link Equity ! Scattered Inbound Link Equity

TRACKING PARAMETERS
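Hash-based tracking can be illustrated by rewriting a campaign URL so that the tracking pairs sit after the `#` instead of in the query string. The sketch below is one possible way to do it, not the deck's implementation: the function name, the `utm_` prefix default and the example URLs are assumptions, and it presumes the analytics script on the page is set up to read the fragment.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def move_tracking_to_hash(url, prefix="utm_"):
    """Move query parameters starting with prefix into the URL fragment."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    tracking = [(key, value) for key, value in pairs if key.startswith(prefix)]
    kept = [(key, value) for key, value in pairs if not key.startswith(prefix)]
    # Keep an existing fragment when there is nothing to move.
    fragment = urlencode(tracking) if tracking else parts.fragment
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), fragment))


# Hypothetical campaign links
print(move_tracking_to_hash("https://example.com/fightweek?utm_campaign=spring&page=2"))
print(move_tracking_to_hash("https://example.com/?utm_medium=email&utm_source=news"))
```

Because the fragment is never sent to the server, every tagged link resolves to the same destination URL for crawlers, while the tracking data stays available to client-side scripts.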


justpark.com/london/…/garden-car-park/?start_date=2015-08-16&end_date=2015-08-16&start_time=…

> Omit on default
> Server session (but better not)
> Hash parameter for Js
> Cookies / LocalStorage

! Crawl Waste ! Littered Index ! Duplicate Content ! Thin Content ! Wasted Internal Link Equity ! Scattered Inbound Link Equity

FUNCTIONAL PARAMETERS
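"Omit on default" means a link only carries a parameter when its value differs from what the page would use anyway. A minimal sketch, with hypothetical parameter names and defaults (`sort`, `page`) and a made-up helper `build_url`:

```python
from urllib.parse import urlencode

DEFAULTS = {"sort": "relevance", "page": "1"}  # hypothetical default values


def build_url(base, params, defaults=DEFAULTS):
    """Build a link that leaves out every parameter still at its default value."""
    non_default = {key: value for key, value in params.items()
                   if str(value) != defaults.get(key)}
    # Sort the keys so the same state always produces the same URL.
    query = urlencode(sorted(non_default.items()))
    return f"{base}?{query}" if query else base


print(build_url("https://example.com/london/", {"sort": "relevance", "page": 1}))
print(build_url("https://example.com/london/", {"page": 1, "sort": "price"}))
```

With this in place the default view of a page is only ever linked by its clean destination URL, so the parameterised variants stop multiplying in internal links.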


worldbank.org/…/modules/economic/gnp/print.html
rightmove.co.uk/…/print.html?listingId=47812940

> Print Stylesheet

! Crawl Waste ! Littered Index ! Duplicate Content ! Thin Content ! Wasted Internal Link Equity ! Scattered Inbound Link Equity

PRINT VERSION

justpark.com/…/book/?listing_id=148685&…
rightmove.co.uk/addtoshortlist.html?listing_id=478129

> Logged-out version link to /login#forwarding=xxx
> Logged-out version Js span trigger login lightbox
> POST to the product URL
> AJAX for logged-in

! Crawl Waste ! Littered Index ! Duplicate Content ! Thin Content ! Wasted Internal Link Equity ! Scattered Inbound Link Equity

LOGGED-IN FUNCTIONALITY

justpark.com/uk/parking/brighton/?page=2
rightmove.co.uk/…/London.html?sortType=1

> In-page-only AJAX manipulations
> Cookies
> Hash parameters and on load AJAX processing

! Crawl Waste ! Littered Index ! Duplicate Content ! Thin Content ! Wasted Internal Link Equity ! Scattered Inbound Link Equity

SEARCH PAGINATION & FILTERS

gumtree.com/search?q=car&tq=%7B%22i%22%3A...
ebay.co.uk/sch/i.html?_nkw=car&_from=R40&_tr…

> Canonicalising redirects
> Js search form pointing to the canonical URL
> AJAX search with pushState canonical URLs (SPA)

! Crawl Waste ! Littered Index ! Duplicate Content ! Thin Content ! Wasted Internal Link Equity ! Scattered Inbound Link Equity

DYNAMIC SEARCH URLS

rightmove.co.uk/…/terms-of-use-and-privacy-policy
justpark.com/uk/airport-parking/

> Js span triggered AJAX lightbox
> Shortlisting only relevant resources by page type (homepage, search, etc.)
> Merge multiple site-wide-linked pieces into a single location with hash deep links

! Crawl Waste ! Littered Index ! Duplicate Content ! Thin Content ! Wasted Internal Link Equity ! Scattered Inbound Link Equity

SITE-WIDE LINKS (HEADER / FOOTER)

And, please, crawl your sites to make sure you’re not linking to URLs that redirect or canonicalise to other URLs!
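One way to run that check, sketched under the assumption that a crawler export has already been loaded into two plain structures: a list of `(source, target)` internal links and a dict of crawled pages with their HTTP status and canonical URL. The structure, the field names and the sample URLs are all hypothetical; real exports from tools such as Screaming Frog will need mapping into this shape.

```python
def wasteful_links(links, pages):
    """Return internal links whose target redirects or canonicalises to another URL.

    links: iterable of (source_url, target_url)
    pages: dict mapping url -> {"status": int, "canonical": str or None}
    """
    flagged = []
    for source, target in links:
        page = pages.get(target)
        if page is None:
            continue  # target was not crawled, so nothing can be said about it
        if 300 <= page["status"] < 400:
            flagged.append((source, target, "redirects"))
        elif page.get("canonical") and page["canonical"] != target:
            flagged.append((source, target, "canonicalises elsewhere"))
    return flagged


# Hypothetical crawl data
pages = {
    "/london/": {"status": 200, "canonical": "/london/"},
    "/London": {"status": 301, "canonical": None},
    "/london/?sort=price": {"status": 200, "canonical": "/london/"},
}
links = [
    ("/", "/london/"),
    ("/", "/London"),
    ("/london/", "/london/?sort=price"),
]

flagged = wasteful_links(links, pages)
for source, target, reason in flagged:
    print(f"{source} -> {target}: {reason}")
```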