Inform: Targeting the Interest Graph


Personalization of content and ad selection using the Inform Service


Targeting the Interest Graph:

Marc Hadfield, CTO, Inform
Semantic Technology Conference, 2011


Introduction

Marc Hadfield is CTO of Inform Technologies.

Interests: Natural Language Processing, Semantics, Life Science, Graph Algorithms, Machine Learning, Big Data

Inform Technologies is a semantic technology company.

Inform provides semantic technology (NLP and analytics) to publishers, and operates the user-generated forum site Yuku.com.

We at Inform have been extending our technology into the user-generated content space. We’ve adapted it to different kinds of content such as informal text, photos, videos, and questions.

We’ve recently addressed Ad Selection, Video Selection, and Personalization.

I’ll discuss some of our results with the Interest Graph.


Inform Service

Semantic Software-as-a-Service for Publishers

Advantage: ~30% boost in engagement on “traditional” publisher websites.

Tracks 4,000+ Subjects and 320,000+ Entities: Inform Topics

Inform Service:
–  In-Article links to Topic Pages
–  Related Articles from the Archive
–  Related Articles around the Web
–  Related Photos
–  Related Videos
–  Topic Pages including mix of content sources
–  Tools (Publishing Tools, etc.)


Inform Publisher Customers


Yuku Forums

•  Forum Content
–  “Old School” user generated content
–  ~40,000 forums
–  Top 100 forums account for about 50% of traffic
–  ~1 Billion short form content pieces
–  ~1 Million monthly unique users
–  ~150K new content objects per day
–  ~1 Million Page Views per Day
•  Subscription / Advertising Revenue
•  Inform is adapting / integrating our Semantic Tech
•  Great laboratory for testing algorithms / theories
–  Apply more broadly than Yuku platform
•  Nice A/B testing environment
•  Testing new algorithms on our ForumFind search engine
–  And embedded widgets in Yuku
•  Good reason to improve Ad Selection


Occam

Today: Personalization for Enhanced Targeting

•  Capturing the Interest Graph

•  Personalized experience
–  Help People find interesting content
–  Make Ads relevant


Inform Content & Analytics Platform

[Diagram. Labels: Publisher site, Widgets, Yuku; Licensed / Crawled Content; 3rd Party / Activity Data; Core Engine “Occam”; Content Distribution; Content / Data Ingestion; Text Analysis; Categorization / Personalization; Algorithms]

Inform “Occam” Architecture

Example Workflow:

Receive Message
•  REST Webservice Call
•  Queue

Extract
•  Get URL
•  Extract Document Features
•  Extract Text

NLP
•  NLP Features (Machine Learning)
•  Inference Engine (Prolog / Frame Logic)
•  Discourse / Behavior / Sentiment Models (Prolog / Frame Logic) (New)

Analysis
•  Trend Analysis (incremental data)
•  Graph Analysis (incremental data)

Reply
•  Store in Semantic Repository (if needed)
•  Send Reply Message (via Queue or Webservice)
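The workflow stages above can be read as a simple pipeline in which each stage enriches a message and passes it on. The following is only a minimal sketch of that shape: the `Message` class, the stage functions, and the capitalized-word "NLP" placeholder are all hypothetical and are not part of the Inform API or the Occam engine.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Message:
    """A unit of work moving through the pipeline (hypothetical shape)."""
    url: str
    text: str = ""
    features: Dict[str, object] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)


def extract(msg: Message) -> Message:
    # Stand-in for "Get URL / Extract Text": the text is already supplied here.
    msg.features["length"] = len(msg.text)
    msg.log.append("extract")
    return msg


def nlp(msg: Message) -> Message:
    # Stand-in for NLP features: naive capitalized-word spotting, not real NER.
    words = (w.strip(".,") for w in msg.text.split())
    msg.features["candidates"] = sorted({w for w in words if w[:1].isupper()})
    msg.log.append("nlp")
    return msg


def analysis(msg: Message) -> Message:
    # Stand-in for incremental trend / graph analysis.
    msg.features["candidate_count"] = len(msg.features["candidates"])
    msg.log.append("analysis")
    return msg


def reply(msg: Message) -> Message:
    # Stand-in for storing the result and sending the reply message.
    msg.log.append("reply")
    return msg


STAGES: List[Callable[[Message], Message]] = [extract, nlp, analysis, reply]


def run_pipeline(msg: Message) -> Message:
    """Run every stage in order, as in Receive -> Extract -> NLP -> Analysis -> Reply."""
    for stage in STAGES:
        msg = stage(msg)
    return msg


result = run_pipeline(
    Message(url="http://example.com/a", text="Obama visits Chicago today.")
)
```

Keeping the stages as a plain list makes it easy to add a step (for example graph analytics) without touching the others.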

Inform API

•  REST Based
•  Queue for high volume content exchange
•  Returns data in RDF, XML, or JSON
•  All Content has a URI
•  All Inform Topics have URIs (can be dereferenced)
•  Insert Content, Update Content, Delete Content
•  Login / Logout
•  Change Status of Content (Published, Unpublished)
•  Content can be retrieved (“GET”)
–  Associated Topics (Subjects and Entities) returned
–  Include scores
•  Search Inform Topics
•  Semantic Search
–  Simplified queries (not full SPARQL)
–  Typical Query: Get Content of Type “Article” about “Barack Obama” ranked by score
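To make the "typical query" concrete, here is a rough sketch of what a simplified semantic search request and its score-ranked result might look like from a client. The host `api.example.org`, the parameter names `type`, `topic`, and `limit`, and the result fields are assumptions for illustration only; the slides do not specify the actual request format.

```python
from urllib.parse import urlencode

# Placeholder host and path: NOT the real Inform endpoint.
BASE = "http://api.example.org/search"


def build_query(content_type, topic, limit=10):
    """Build a simplified (non-SPARQL) query URL; parameter names are assumed."""
    params = {"type": content_type, "topic": topic, "limit": limit}
    return BASE + "?" + urlencode(params)


def rank_by_score(items):
    """Order result items from highest to lowest topic score."""
    return sorted(items, key=lambda item: item["score"], reverse=True)


url = build_query("Article", "Barack Obama")

# Hypothetical result items, as a client might see them after parsing JSON.
ranked = rank_by_score([
    {"uri": "http://example.org/content/1", "score": 0.42},
    {"uri": "http://example.org/content/2", "score": 0.91},
    {"uri": "http://example.org/content/3", "score": 0.67},
])
```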


Inform API (2)

•  Related Content
–  Articles, Messages, Photos, Videos, Questions, Web
•  AdContext™ (new)
–  URL → IAB Topics + Inform Topics
•  VideoContext™ (new)
–  URL → Inform Topics
–  Related Videos
•  InterestGraph (new)
–  Parameters: user-id / session-id → Inform Topics
•  Personalized AdContext™ (new)
–  URL + session-id / user-id (anonymized) → IAB Topics + Inform Topics


AdContext™: IAB Ad Standards

The IAB (Interactive Advertising Bureau) standard returns a set of metadata about a website, webpage, or section of a webpage to assist advertising within web content.

Defines how a Topic may be associated with web content.

Defines a set of standard upper level Topics such as “Science”, “Sports”, and “Business”, and mid-level Topics such as “Golf” and “Fashion”. These are tier-1 and tier-2.

Inform has aligned the IAB Topics with Inform’s Topics. Inform can deliver more specific Topics (the full set of Inform Topics) as “tier-3” IAB Topics.

The AdContext™ service returns this metadata. Ad Networks may use the service to assist in ad selection.
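The tiered alignment described above can be pictured as a small hierarchy in which each more specific Topic points to its parent. This is a sketch under assumptions: the `Topic` class and `tier_path` helper are hypothetical, and "Tiger Woods" is only an illustrative tier-3 entity, not a documented Inform Topic mapping.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Topic:
    name: str
    tier: int                     # 1 = upper level, 2 = mid level, 3 = Inform-specific
    parent: Optional[str] = None  # name of the parent Topic, if any


# Illustrative data only; the tier-3 entry is an assumed example.
TOPICS: Dict[str, Topic] = {
    "Sports": Topic("Sports", 1),
    "Golf": Topic("Golf", 2, parent="Sports"),
    "Tiger Woods": Topic("Tiger Woods", 3, parent="Golf"),
}


def tier_path(name: str, topics: Dict[str, Topic]) -> List[str]:
    """Return the chain from the tier-1 Topic down to the named Topic."""
    path: List[str] = []
    current: Optional[str] = name
    while current is not None:
        topic = topics[current]
        path.append(topic.name)
        current = topic.parent
    return list(reversed(path))


path = tier_path("Tiger Woods", TOPICS)
```

An ad network that only understands tier-1 and tier-2 Topics could use the first two elements of the path and ignore the rest.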

Semantic Ad Selection may improve yield 2X – 5X (as per various external studies).


Aside: rNews RDFa Standard

rNews: embedding metadata in online news

rNews is a proposed standard for using RDFa to annotate news-specific metadata in HTML documents. The rNews proposal has been developed by the IPTC, a consortium of the world's major news agencies, news publishers and news industry vendors. rNews is currently in draft form and the IPTC welcomes feedback on how to improve the standard in the rNews Forum.

http://dev.iptc.org/rNews

Why? SEO, Rich Snippets, reduced “scraper” error, better metadata.

The Inform API returns rNews metadata ready to embed in news articles (in testing).


Publisher Customer Example:


Inform automatically tags entities (people, places, companies, and organizations) and provides related topics, articles, and media

The Related News Widget pulls in the most relevant and recent articles from within the New York Daily News Archive

Customer Example:


Inform’s tags can be brought together in numerous ways to create a richer experience for consumers

Inform also generates highly engaging and relevant slideshows

Demo Inform API w/Facebook


How to connect Inform to the social graph?


Inform Topics mapped to Wikipedia Pages and thus to other Concepts – including the Facebook “Like” Graph

Interest Graph

•  Inform Topics
–  4,000+ Subjects in Hierarchy (SKOS)
–  320,000+ Entities
–  Wikipedia Pages
–  Wikipedia Categories
–  Inform “same-as” links to Wikipedia
•  1 Million+ Monthly Unique Users
•  ~1 Billion content pieces total
–  Forum Messages, Replies, Photos, Videos
•  150K new content pieces per day
•  1 Million+ PageViews per Day
•  ~5 Million ads served per Day


Goal: Link Users to Topics for selection of content and ads

Personalization Signals

•  Content is “about” a Topic (subject or entity)
•  User submits Content (“write”)
–  Message, Reply, Photo, Video, Question, …
•  User reads Content (“view”)
–  Message, Reply, Photo, Video, Question

Trends / Global Aggregation:

•  Importance Metric
•  Bursty / Velocity
•  Sentiment (“:-)”, “LOL”, …)
–  “Like” the topic? “Dislike” the topic? Context?
–  e.g. a user dislikes a Football Team, so “likes” to hear when they lose (negative sentiment)
•  Other features…
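As a rough illustration of how the “write” and “view” signals above could link Users to Topics, the sketch below accumulates a per-user weight for every Topic a piece of content is “about”. The action weights (a write counting twice as much as a view) and all names are assumptions made for the example, not values from Inform.

```python
from collections import defaultdict

# Assumed weights: submitting content is treated as a stronger signal than reading it.
ACTION_WEIGHT = {"write": 2.0, "view": 1.0}


def user_topic_weights(events, content_topics):
    """Aggregate (user, action, content_id) events into user -> topic -> weight.

    content_topics maps content_id -> {topic: score}, i.e. what each piece
    of content is "about" and how strongly.
    """
    weights = defaultdict(lambda: defaultdict(float))
    for user, action, content_id in events:
        for topic, score in content_topics.get(content_id, {}).items():
            weights[user][topic] += ACTION_WEIGHT[action] * score
    return {user: dict(topics) for user, topics in weights.items()}


# Hypothetical data: two forum messages with topic scores.
content_topics = {
    "m1": {"Gymnastics": 0.5, "Olympics": 0.25},
    "m2": {"Gymnastics": 1.0},
}
events = [("jjb", "write", "m1"), ("jjb", "view", "m2")]

interests = user_topic_weights(events, content_topics)
```

Sentiment and other features could enter as extra multipliers on each contribution; they are left out here to keep the sketch small.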

Interest Graph Algorithms

Criteria:
•  Near Real-Time
•  Highly parallel to allow for scaling
•  Fuzzy Data, Flexible data model

Implementation:
•  General Graph Representation
–  Node Weights, Edge Weights, Node Types, Edge Types
•  Graph walk to extract a User’s Interest Graph
•  Parallel Message-Passing Algorithms for Graph Analysis
–  Importance, PageRank, Centrality
–  Spreading Activation
–  Pregel-like implementation (Signal/Collect)
•  Add Graph Analytics to Workflow
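To show what a message-passing formulation of spreading activation can look like, here is a small single-process sketch organized in supersteps with a signal phase and a collect phase. It is not Inform's implementation: the function name, the `decay` factor, the fixed step count, and the node labels are assumptions, and a real Pregel-like system would run the per-node work in parallel.

```python
def spreading_activation(edges, seeds, decay=0.5, steps=2):
    """Spread activation from seed nodes along weighted edges.

    edges: {node: {neighbor: edge_weight}}
    seeds: {node: initial_activation}
    Each superstep has two phases:
      signal  - every active node sends value * edge_weight * decay to its neighbors
      collect - every node adds up the signals it received
    """
    activation = dict(seeds)
    frontier = dict(seeds)
    for _ in range(steps):
        inbox = {}
        # Signal phase: independent per node, so it could run in parallel.
        for node, value in frontier.items():
            for neighbor, weight in edges.get(node, {}).items():
                inbox[neighbor] = inbox.get(neighbor, 0.0) + value * weight * decay
        # Collect phase: fold received signals into each node's activation.
        for node, incoming in inbox.items():
            activation[node] = activation.get(node, 0.0) + incoming
        # Only nodes that received something signal in the next superstep.
        frontier = inbox
    return activation


# Hypothetical graph: a user linked to one topic, which is linked to another.
edges = {
    "user:jjb": {"topic:Sneakers": 1.0},
    "topic:Sneakers": {"topic:Basketball": 0.5},
}
activated = spreading_activation(edges, {"user:jjb": 1.0})
```

Topics reached only through longer paths end up with lower activation, which gives a natural ranking of a user's nearby interests.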

Neighborhood around JJB User


Niketalk User Interest Graph (local)

Without global importance metric.

Niketalk User Interest Graph (global)

With global importance metric: Recommendations can be made reflecting the shifting interests of the global community.
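One simple way to picture the difference between the local and global views is a linear blend of a user's own topic weights with a community-wide importance score. This is purely an illustrative assumption, not necessarily how Inform combines the two; `blend`, `alpha`, and the sample numbers are hypothetical.

```python
def blend(local, global_importance, alpha=0.5):
    """Rank topics by alpha * local weight + (1 - alpha) * global importance.

    alpha = 1.0 reproduces the purely local view; lower values let globally
    important topics surface even if the user has not touched them yet.
    Ties are broken alphabetically so the ordering is deterministic.
    """
    topics = set(local) | set(global_importance)
    scores = {
        t: alpha * local.get(t, 0.0) + (1 - alpha) * global_importance.get(t, 0.0)
        for t in topics
    }
    return sorted(scores, key=lambda t: (-scores[t], t))


# Hypothetical weights for one user and for the whole community.
local = {"Sneakers": 1.0, "Basketball": 0.5}
global_importance = {"Basketball": 1.0, "NBA Finals": 1.0}

local_only = blend(local, global_importance, alpha=1.0)
with_global = blend(local, global_importance, alpha=0.5)
```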

Example Yuku Forum - Gymnastics


ForumFind – “laboratory”


ForumFind – Topic, Ad, Content


ForumFind – MyForumFind (user: jjb2 )


Interest Graph – User Insights

•  “Everybody Lies” (“House” TV Show)
–  The only way to know a user’s interests is to have an implicit channel to detect interests without impacting user behavior
•  People have broad / dynamic interests
•  People read “trash”
–  e.g. everyone reads Celebrity Gossip
–  If convenient / no one looking
•  Global Data can be used to make recommendations
–  No surprise, but nice to have confirmation
•  People move on
–  “Likes” need to expire
•  Recommendations for content and ads can be implemented in a highly dynamic and parallel fashion, running in real time with reasonable resources, using graph analysis
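The point that “Likes” need to expire can be sketched with an exponential decay on each observed interest. The slides do not say which expiry scheme is used, so the half-life, the cut-off `floor`, and the function names below are assumptions chosen for illustration.

```python
def decayed_weight(weight, age_days, half_life_days=30.0):
    """Halve the weight of an observed interest every half_life_days."""
    return weight * 0.5 ** (age_days / half_life_days)


def current_interests(observations, now_day, half_life_days=30.0, floor=0.05):
    """Sum decayed weights per topic and drop topics that have faded out.

    observations: iterable of (topic, weight, day_observed)
    """
    totals = {}
    for topic, weight, day in observations:
        age = now_day - day
        totals[topic] = totals.get(topic, 0.0) + decayed_weight(
            weight, age, half_life_days
        )
    return {topic: w for topic, w in totals.items() if w >= floor}


# Hypothetical history: an old interest and a more recent one.
observations = [("Gymnastics", 1.0, 0), ("Sneakers", 1.0, 270)]
interests_now = current_interests(observations, now_day=300)
```

With a 30-day half-life, the interest observed 300 days ago has decayed below the floor and is dropped, while the one observed 30 days ago is kept at half its original weight.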


Interest Graph – Conclusion

•  Using a User’s Graph of Interests can dramatically improve the user’s engagement
–  Data still being gathered within Inform as to percentage increase, but so far very encouraging numbers!

•  The Inform Service can be used to implement a more personalized content and ad experience with minimal implementation effort.

•  Talk to me about using our API!


Thank You!

Questions?

Marc Hadfield
CTO, Inform Technologies
marc@inform.com


Example CMS Integration


Published Article:
