How to Spot a Bear - An Intro to Machine Learning for SEO

@TomAnthonySEO

April 2015 - BrightonSEO

HOW TO SPOT A BEAR: A Machine Learning Introduction for SEOs

Can you define a list of rules for spotting bears?

Let’s start with:

1) Four legs.

List of rules (first half, from when I asked in the office):

1. Four legs. 2. Breathes. 3. Furry. 4. Long snout.

List of rules:

1. Four legs. 2. Breathes. 3. Furry. 4. Long snout.

5. Brown. 6. Not always brown. 7. Mammal. 8. No tail.

(how do you spot a mammal?!)

Let’s check our rules…

Rules say:

Harmless Furry Thing (fewer than 4 legs)

Rules say:

Odd Grey Creature (no long snout)

Remove ‘long snout’, and rules say:

Bear (Extra-terrestrial bear?!)

Our rules suck.
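To make the failure concrete, here is a minimal sketch of the rule list applied literally. The field names and the two example animals are hypothetical illustrations, not data from the talk, and the contradictory colour rules and the hard-to-check "mammal" rule are left out.

```python
def looks_like_bear(animal, require_long_snout=True):
    """Apply the hand-written rules literally; every rule must pass."""
    rules = [
        animal["visible_legs"] == 4,
        animal["breathes"],
        animal["furry"],
        not animal["has_tail"],
    ]
    if require_long_snout:
        rules.append(animal["long_snout"])
    return all(rules)


# Hypothetical observations (assumptions for illustration only).
bear_standing_upright = {
    "visible_legs": 2,  # a real bear, but only two legs are on the ground
    "breathes": True,
    "furry": True,
    "long_snout": True,
    "has_tail": False,
}
guinea_pig = {
    "visible_legs": 4,
    "breathes": True,
    "furry": True,
    "long_snout": False,
    "has_tail": False,
}

print(looks_like_bear(bear_standing_upright))                 # real bear rejected
print(looks_like_bear(guinea_pig))                            # rejected by the snout rule
print(looks_like_bear(guinea_pig, require_long_snout=False))  # non-bear accepted
```

Tightening the rules rejects real bears, and loosening them lets non-bears through, which is the trap the slides describe.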

A different bear: Google’s Panda

Can you define a list of rules for spotting spammy pages?

Same problem as bears!

Good page

Commercial page, still good.

Hrm…

Seems legit…

Google can’t write rules.

What we can do is identify spammy or non-spammy attributes.

Some Possible Spam Signals

Are there adverts on the page?

Are there lots of spelling mistakes?

Is there little text content?

Are there Calls To Action in ALL CAPS?

Smooth segue to:

Machine Learning

List of pages we’ve manually classified.

List of attributes that we believe are important to classifying pages.

Example Data

Site    Adverts on page?  More than 5 spelling mistakes?  Fewer than 200 words of content?  CTA in ALL CAPS?  Classification
site A  Y                 Y                               Y                                 Y                 Spam Site
site B  N                 N                               Y                                 Y                 Good Site
site C  Y                 N                               N                                 N                 Spam Site
site D  N                 Y                               N                                 Y                 Spam Site
site E  N                 Y                               N                                 N                 Good Site
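As a quick check on the example data, the sketch below encodes the table as 1/0 feature vectors and counts how often each attribute on its own agrees with the manual classification. The short attribute names are hypothetical labels chosen for this sketch.

```python
# Columns follow the table: adverts, >5 spelling mistakes,
# <200 words of content, CTA in ALL CAPS. Names are hypothetical.
ATTRIBUTES = ["adverts", "spelling", "thin_content", "caps_cta"]

# (features, is_spam) for each manually classified site.
SITES = {
    "site A": ([1, 1, 1, 1], True),
    "site B": ([0, 0, 1, 1], False),
    "site C": ([1, 0, 0, 0], True),
    "site D": ([0, 1, 0, 1], True),
    "site E": ([0, 1, 0, 0], False),
}


def single_signal_accuracy(index):
    """Count sites where 'attribute present' matches 'site is spam'."""
    return sum(
        (features[index] == 1) == is_spam
        for features, is_spam in SITES.values()
    )


scores = {name: single_signal_accuracy(i) for i, name in enumerate(ATTRIBUTES)}
for name, correct in scores.items():
    print(f"{name}: {correct}/{len(SITES)} sites correct")
```

No single attribute classifies all five sites correctly, which is why the attributes need to be combined and weighted.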

Neural Networks: A Perceptron

Inputs → Neuron → Output

Neuron rule: if the sum of weighted inputs >= 1, output TRUE

1 x 0.5 = 0.5
0 x 0.5 = 0
1 x 0.5 = 0.5
0 x 0.5 = 0
______
Total: 1
Output: TRUE

1 x 0.5 = 0.5
0 x 0.5 = 0
0 x 0.5 = 0
0 x 0.5 = 0
______
Total: 0.5
Output: FALSE

1 x 0.5 = 0.5
0 x 0.5 = 0
1 x 0.4 = 0.4
0 x 0.5 = 0
______
Total: 0.9
Output: FALSE
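The three worked examples can be reproduced with a few lines of code. This is a minimal sketch of a single neuron with a fixed threshold of 1; the function name `perceptron` is an assumption for illustration.

```python
def perceptron(inputs, weights, threshold=1.0):
    """Return (total, fired): the weighted sum and whether it reaches the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return total, total >= threshold


# (inputs, weights) for the three worked examples on the slides.
examples = [
    ([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5]),
    ([1, 0, 0, 0], [0.5, 0.5, 0.5, 0.5]),
    ([1, 0, 1, 0], [0.5, 0.5, 0.4, 0.5]),
]

results = [perceptron(inputs, weights) for inputs, weights in examples]
for total, fired in results:
    print(f"Total: {total:g}  Output: {'TRUE' if fired else 'FALSE'}")
```

Note how the third case differs from the first only in one weight (0.4 instead of 0.5), which is enough to flip the output. Adjusting weights like this is what training does.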

Untrained Neuron

Inputs: adverts; >5 spelling mistakes; <200 words content; CTA in ALL CAPS

Neuron: if the sum of weighted inputs >= 1, output TRUE

Output: Is site spam?

Training

Inputs: adverts; <200 words content; CTA in ALL CAPS

Neuron: if the sum of weighted inputs >= 1, output TRUE

After training: 4/5 sites correct

Inputs: adverts; <200 words content; CTA in ALL CAPS

Neuron: if the sum of weighted inputs >= 1, output TRUE

Output: Is site spam?
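The slides do not show how the weights were adjusted, so the following is only a sketch of one common approach, the classic perceptron learning rule, run on the example data. Everything specific here is an assumption: the starting weights of 0.5, the learning rate of 0.25, the fixed threshold of 1, and the use of all four attributes (the trained neuron on the slide shows three), so its result need not match the 4/5 reported above.

```python
# (name, [adverts, >5 spelling mistakes, <200 words, CTA in ALL CAPS], is_spam)
SITES = [
    ("site A", [1, 1, 1, 1], 1),
    ("site B", [0, 0, 1, 1], 0),
    ("site C", [1, 0, 0, 0], 1),
    ("site D", [0, 1, 0, 1], 1),
    ("site E", [0, 1, 0, 0], 0),
]
THRESHOLD = 1.0


def predict(weights, features):
    """Return 1 (spam) if the weighted sum reaches the threshold, else 0."""
    total = sum(w * x for w, x in zip(weights, features))
    return 1 if total >= THRESHOLD else 0


def accuracy(weights):
    """Number of sites classified the same way as the manual label."""
    return sum(predict(weights, f) == label for _name, f, label in SITES)


def train(weights, learning_rate=0.25, max_epochs=20):
    """Perceptron rule: nudge the weights of active inputs after each mistake."""
    weights = list(weights)
    for _epoch in range(max_epochs):
        mistakes = 0
        for _name, features, label in SITES:
            error = label - predict(weights, features)  # +1, 0 or -1
            if error:
                mistakes += 1
                weights = [
                    w + learning_rate * error * x
                    for w, x in zip(weights, features)
                ]
        if mistakes == 0:  # a full pass with no errors: stop
            break
    return weights


untrained = [0.5, 0.5, 0.5, 0.5]
trained = train(untrained)
print(f"Untrained: {accuracy(untrained)}/{len(SITES)} correct")
print(f"Trained:   {accuracy(trained)}/{len(SITES)} correct, weights {trained}")
```

Under these assumptions the neuron goes from 3/5 to 5/5 on this toy data, ending with the largest weight on adverts. Five hand-picked rows prove nothing about real pages; the point is only that the weights are learned from classified examples rather than written by hand.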

ANNs typically have many neurons (source: http://www.teco.edu/~albrecht/neuro/html/node18.html)

Deep Learning

Humans are good at pattern matching

We’re better than machines… (source: Pawan Sinha, http://web.mit.edu/bcs/sinha/papers/sinha_recog_review_NN.pdf)

ML can learn to recognise cats from examples

Deep Learning learns more like us

Ok, so what does this have to do with Google?

Panda: ML-based algorithm updates

Old index Caffeine

Caffeine - Infrastructure Update (we believe this made Panda+Penguin possible)

Hummingbird is to ??? as Caffeine is to Panda+Penguin

Hummingbird: Is it similar to Caffeine? Is it the basis for new natural language algorithms?

Where is Google going next with ML?

Image Search 2.0

Image Labelling

Video Labelling

ML Generated Image Descriptions

“Two pizzas sitting on top of a stove top oven”

Natural Language Faceted Search

‘show me Olympic athletes’ ‘show me the women’

“Find well rated vegetarian cooking books written after 1990”

How about:

Factual Accuracy as a Ranking Factor

Fact Checking: Knowledge Vault

Idea: Bad Facts

(Placeholder: screenshot of Google talking about this)

Estimating ‘Trustworthiness’

Entirely ML Generated Algorithm?

http://dis.tl/ml-algo

Thanks! :)

@TomAnthonySEO
