November 18th, 2014 Confidential DataDriven NYC

Bitly // Data Driven NYC // November 2014

Bitly.is/SocialData

HOW DOES Data Architecture SUPPORT OUR MISSION?

MESSAGE BASED SYSTEM

APP

Messages

MESSAGING DESIGNS

Messages

NSQ

DISTRIBUTION

Messages

NSQ

Worker A

Worker A

Worker A

Worker B

All the Worker A instances share the workload and, in aggregate, process a single copy of all the messages

Scale out Data Processing
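
A minimal sketch of the idea above, assuming nothing about NSQ itself: a standard-library queue stands in for the message stream, and three threads stand in for the Worker A instances. The names `run_workers` and `worker_a_N` are illustrative only. Each message is handled by exactly one worker, so the group as a whole processes a single copy of the stream.

```python
import queue
import threading


def run_workers(messages, worker_count=3):
    """Drain one shared queue with several workers; return what each one handled."""
    shared = queue.Queue()
    for msg in messages:
        shared.put(msg)

    # One private result list per worker (hypothetical names).
    handled = {f"worker_a_{i}": [] for i in range(worker_count)}

    def work(name):
        while True:
            try:
                msg = shared.get_nowait()
            except queue.Empty:
                return  # queue was filled up front, so empty means done
            handled[name].append(msg)

    threads = [threading.Thread(target=work, args=(name,)) for name in handled]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return handled


handled = run_workers([f"click-{n}" for n in range(100)])
total = sorted(m for msgs in handled.values() for m in msgs)
```

How the 100 messages split across the three workers varies from run to run; what stays fixed is that no message is processed twice and none is lost. Adding workers raises throughput without changing the publisher.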

DECOUPLING

Worker A and Worker B each get a copy of all the messages

Messages

NSQ

Worker A

Worker B

Publish / Subscribe

AKA Multicast
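
The two behaviours can be sketched together in a small in-memory model. This is an illustration under stated assumptions, not NSQ's implementation or client API: the `Topic` class, its `subscribe` and `publish` methods, and the round-robin choice inside a channel are all hypothetical simplifications. Every channel receives its own copy of each message; within a channel, one consumer handles it.

```python
from collections import defaultdict


class Topic:
    """Toy model: copy each message to every channel, one consumer per channel."""

    def __init__(self):
        self.channels = defaultdict(list)  # channel name -> consumer callbacks
        self.published = 0

    def subscribe(self, channel, handler):
        self.channels[channel].append(handler)

    def publish(self, message):
        for handlers in self.channels.values():
            # Round-robin inside the channel (a simplification).
            handler = handlers[self.published % len(handlers)]
            handler(message)
        self.published += 1


seen = defaultdict(list)
topic = Topic()

# Two consumers share the "worker_a" channel; one consumer owns "worker_b".
for name in ("a1", "a2"):
    topic.subscribe("worker_a", lambda m, name=name: seen[name].append(m))
topic.subscribe("worker_b", lambda m: seen["b"].append(m))

for n in range(4):
    topic.publish(f"click-{n}")
```

Here `seen["b"]` ends up with all four messages, while `seen["a1"]` and `seen["a2"]` hold two each. A new consumer group only needs a new channel name; existing consumers are untouched.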

IN PRACTICE @ Bitly

Bitly’s Data Science team wants to research the correlation between where a brand’s audience is active and conversion.

Can you set them up to access our data?

IN PRACTICE @ Bitly

NSQ

Metrics

Archive to Disk

Realtime Data Analysis

HDFS for Offline Analysis

Decoupling independent data needs makes this easy to solve

ENRICHMENT

NSQ Worker A

Workers enrich messages for further processing

NSQ

NSQ

Worker B

ENRICHMENT

Raw Decode

{ .... "bitly_user_hash_identifier": "1xTDx93", "LongURL": "http://espn.com/", "timestamp": "1416331248", … }

Annotated Decode

{ .... "bitly_user_hash_identifier": "1xTDx93", "LongURL": "http://espn.com/", "timestamp": "1416331248", "Geo_region": "NY", "Topic": "news,sports", … }
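
A rough sketch of what an enrichment worker might do to turn a raw decode into an annotated one. The field names come from the slide; everything else is an assumption: the `enrich` function and the two lookup tables are hypothetical stand-ins, since the slides do not say how geo region or topic are actually derived.

```python
from urllib.parse import urlparse

# Hypothetical lookup tables standing in for real enrichment sources.
GEO_BY_USER = {"1xTDx93": "NY"}
TOPICS_BY_DOMAIN = {"espn.com": "news,sports"}


def enrich(raw):
    """Return an annotated copy of a raw decode; the input is left unchanged."""
    annotated = dict(raw)
    domain = urlparse(raw["LongURL"]).netloc
    annotated["Geo_region"] = GEO_BY_USER.get(
        raw["bitly_user_hash_identifier"], "unknown"
    )
    annotated["Topic"] = TOPICS_BY_DOMAIN.get(domain, "unknown")
    return annotated


raw = {
    "bitly_user_hash_identifier": "1xTDx93",
    "LongURL": "http://espn.com/",
    "timestamp": "1416331248",
}
annotated = enrich(raw)
```

The worker would read raw decodes from one topic and publish the annotated result to another, so downstream consumers can choose whichever stream suits them.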

INTEGRATION

NSQ

NSQ

NSQ

NSQ

Bitly Brand Tools Customers R&D

In House DMP

Third party analytics

Marketing Cloud

THANK YOU.

@markjosephson @orbitalsander