Near-Real-time Processing over HBase
Ryan Brush
@ryanbrush
Topics
- The story so far
- Complementing MapReduce with stream-based processing
- Techniques and lessons
- Query and search
- The future
The story so far...
Chart Search
Chart Search
- Information extraction
- Semantic markup of documents
- Related concepts in search results
- Processing latency: tens of minutes
Medical Alerts
Medical Alerts
- Detect health risks in incoming data
- Notify clinicians to address those risks
- Quickly include new knowledge
- Processing latency: single-digit minutes
Exploring live data
Exploring live data
- Novel ways of exploring records
- Pre-computed models matching users' access patterns
- Very fast load times
- Processing latency: seconds or faster
And many others
- Population analytics
- Care coordination
- Personalized health plans
- Data sets growing at hundreds of GBs per day
- Approaching 1 petabyte of total data
- Rate is increasing; expecting multi-petabyte data sets
- Analyze all data holistically
- Quickly apply incremental updates
A trend towards competing needs
MapReduce
- (Re-)process all data
- Move computation to data
- Output is a pure function of the input
- Assumes a static set of inputs

Stream
- Incremental updates
- Move data to computation
- Needs to clean up outdated state
- Input may be incomplete or out of order
Both processing models are necessary and the underlying logic must be the same
A trend towards competing needs
Speed Layer
Batch Layer
http://nathanmarz.com/blog/how-to-beat-the-cap-theorem.html
Speed Layer
- Low latency (seconds to process)
- Move data to computation
- Hours of data
- Incremental updates

Batch Layer
- High latency (minutes or hours to process)
- Move computation to data
- Years of data
- Bulk loads
A trend towards competing needs
http://nathanmarz.com/blog/how-to-beat-the-cap-theorem.html
Realtime Layer: Storm (stream-based)
Batch Layer: MapReduce (Hadoop)
A trend towards competing needs
Into the rabbit hole
- A ride through the system
- Techniques and lessons learned along the way
Data ingestion
- Stream data into an HTTPS service
- Content stored as Protocol Buffers
- Mirror the raw data as simply as possible (a write-path sketch follows the example row keys below)
/source:1/document:123
/source:2/allergy:345
/source:2/document:456
/source:2/order:234
…
/source:n/prescription:789
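For illustration, the collector's write path can be as small as one Put per record. A minimal sketch, assuming the HBase 1.x client API; the `raw` family and `content` qualifier are hypothetical names, not from the talk:

```java
import java.io.IOException;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class RawDataWriter {
  // Hypothetical family and qualifier names; not specified in the talk.
  private static final byte[] RAW_FAMILY = Bytes.toBytes("raw");
  private static final byte[] CONTENT = Bytes.toBytes("content");

  /** Mirror one record from a source system, e.g. key "source:2/allergy:345". */
  public static void writeRaw(Table table, String rowKey, byte[] protoBytes)
      throws IOException {
    Put put = new Put(Bytes.toBytes(rowKey));
    put.addColumn(RAW_FAMILY, CONTENT, protoBytes);  // serialized Protocol Buffer
    table.put(put);
  }
}
```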
[Diagram: Source Systems 1…N → HTTPS → Collector Service → HBase]
Scan for updates / Process incoming data
- Initially modeled after Google Percolator
- "Notification" records indicate changes
- Scan for notifications
Data Table
source:1/document:123
source:2/allergy:345
source:2/document:456
. . .
source:150/order:71
Notification Table
source:1/document:123
source:150/order:71
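A minimal sketch of pairing each data write with a notification record in the two-table layout above; the table handles, the `n` family, and the `pending` qualifier are assumptions:

```java
import java.io.IOException;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class NotifyingWriter {
  /** Write a record, then mark it changed so consumers can find it by scanning. */
  public static void writeWithNotification(Table dataTable, Table notificationTable,
                                           String rowKey, byte[] content)
      throws IOException {
    Put data = new Put(Bytes.toBytes(rowKey));
    data.addColumn(Bytes.toBytes("raw"), Bytes.toBytes("content"), content);
    dataTable.put(data);

    // Same key in the notification table; the cell value is just a marker.
    Put notification = new Put(Bytes.toBytes(rowKey));
    notification.addColumn(Bytes.toBytes("n"), Bytes.toBytes("pending"),
        Bytes.toBytes(System.currentTimeMillis()));
    notificationTable.put(notification);
  }
}
```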
But there's a catch…
- Percolator-style notification records require external coordination
- More infrastructure to build and maintain
- …so let's use HBase's primitives
Scan for updates / Process incoming data
- Consumers scan for items to process
- Atomically claim lease records (CheckAndPut; sketch after the table)
- Clear the record and notifications when done
- ~3,000 notifications per second per node
Row Key | Qualifiers (lease record and keys of updated items)
split:0 | 0000_LEASE, source:2/allergy:345, source:150/order:71, …
split:1 | 0000_LEASE, source:4/problem:78, source:205/document:52, …
. . .
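One way the CheckAndPut claim could look with the HBase 1.x client; the `n` family name is an assumption, and a production version would also need lease expiry so a crashed consumer's split can be reclaimed:

```java
import java.io.IOException;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class LeaseClaimer {
  // Hypothetical family name; the 0000_LEASE qualifier is from the slide above.
  private static final byte[] FAMILY = Bytes.toBytes("n");
  private static final byte[] LEASE = Bytes.toBytes("0000_LEASE");

  /** Atomically claim one notification split; returns true if we won the lease. */
  public static boolean claim(Table table, String splitKey, String ownerId)
      throws IOException {
    byte[] row = Bytes.toBytes(splitKey);  // e.g. "split:0"
    Put put = new Put(row);
    put.addColumn(FAMILY, LEASE, Bytes.toBytes(ownerId));
    // checkAndPut applies the Put only if the lease cell does not yet exist
    // (a null expected value checks for absence), so exactly one consumer wins.
    return table.checkAndPut(row, FAMILY, LEASE, null, put);
  }
}
```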
Advantages
- No additional infrastructure
- Leverages HBase guarantees
- No lost data
- No stranded data due to machine failure
- Robust to volume spikes of tens of millions of records
Downsides
- Weak ordering guarantees
- Processing must be idempotent
- Lots of garbage from deleted cells
- Schedule major compactions!
- Must split to avoid hot regions
- Potentially better options emerging: Apache Kafka with replication
Measure Everything
- Instrumented the HBase client to see effective performance
- We use Coda Hale's Metrics API and Graphite Reporter
- Revealed the impact of hot HBase regions on clients
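A sketch of that instrumentation with Coda Hale's Metrics library; the metric name, Graphite port, and reporting interval are illustrative:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.concurrent.TimeUnit;
import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.Timer;
import com.codahale.metrics.graphite.Graphite;
import com.codahale.metrics.graphite.GraphiteReporter;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;

public class InstrumentedClient {
  static final MetricRegistry metrics = new MetricRegistry();
  static final Timer getLatency = metrics.timer("hbase.client.get");

  /** Ship all metrics to Graphite once a minute. */
  static void startReporting(String graphiteHost) {
    GraphiteReporter.forRegistry(metrics)
        .convertDurationsTo(TimeUnit.MILLISECONDS)
        .build(new Graphite(new InetSocketAddress(graphiteHost, 2003)))
        .start(1, TimeUnit.MINUTES);
  }

  /** Time each Get so dashboards show the latency clients actually see. */
  static Result timedGet(Table table, Get get) throws IOException {
    try (Timer.Context ignored = getLatency.time()) {
      return table.get(get);
    }
  }
}
```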
The story so far
[Diagram: Source Systems 1…N → HTTPS → Collector Service → HBase (data + notifications); Incremental Processors scan for updates and load data back into HBase]
Into the Storm
- Storm: scalable processing of data in motion
- Complements HBase and Hadoop
- Guaranteed message processing in a distributed environment
- Notifications scanned by a Storm Spout (sketch below)
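A rough sketch of such a spout against the org.apache.storm API; NotificationScanner is a hypothetical helper wrapping the lease-claiming scans shown earlier:

```java
import java.util.List;
import java.util.Map;
import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;

public class NotificationSpout extends BaseRichSpout {

  /** Hypothetical wrapper around the lease-claiming notification scans. */
  public interface NotificationScanner extends java.io.Serializable {
    List<String> claimBatch();     // claim and return updated row keys
    void markDone(String rowKey);  // clear the notification
    void release(String rowKey);   // give the claim back for retry
  }

  private final NotificationScanner scanner;
  private transient SpoutOutputCollector collector;

  public NotificationSpout(NotificationScanner scanner) {
    this.scanner = scanner;
  }

  @Override
  public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
    this.collector = collector;
  }

  @Override
  public void nextTuple() {
    // Emit each claimed row key, anchored by a message id so Storm can
    // replay it if a downstream bolt fails.
    for (String rowKey : scanner.claimBatch()) {
      collector.emit(new Values(rowKey), rowKey);
    }
  }

  @Override public void ack(Object msgId)  { scanner.markDone((String) msgId); }
  @Override public void fail(Object msgId) { scanner.release((String) msgId); }

  @Override
  public void declareOutputFields(OutputFieldsDeclarer declarer) {
    declarer.declare(new Fields("rowKey"));
  }
}
```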
Processing with Storm
[Diagram: Source Systems 1…N → HTTPS → Collector Service → HBase raw data → Spout → Bolts → processed data → Apps and Services]
Challenges of incremental updates
- Incomplete data
- Outdated state
- Difficult to reason about changing state and timing conditions
Handling Incomplete Data

Row Key    | Summary Family | Staging Family
document:1 |                | page:1

- Process (map) components into a staging family
Handling Incomplete Data

Row Key    | Summary Family | Staging Family
document:1 |                | page:1, page:3

- Process (map) components into a staging family
Handling Incomplete Data

Row Key    | Summary Family | Staging Family
document:1 |                | page:1, page:2, page:3

- Process (map) components into a staging family
Handling Incomplete Data

Row Key    | Summary Family   | Staging Family
document:1 | document_summary | page:1, page:2, page:3

- Process (map) components into a staging family
- Merge (reduce) components when everything is available (sketch below)
- Many cases need no merge phase; consuming apps simply read all of the components
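A minimal sketch of that merge step, assuming hypothetical `staging` and `summary` family names, a known expected page count, and a placeholder summarize function standing in for the real domain logic:

```java
import java.io.IOException;
import java.util.NavigableMap;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class DocumentMerger {
  private static final byte[] STAGING = Bytes.toBytes("staging");  // hypothetical
  private static final byte[] SUMMARY = Bytes.toBytes("summary");  // hypothetical

  /** Merge staged pages into a summary once all expected pages have arrived. */
  public static void mergeIfComplete(Table table, byte[] row, int expectedPages)
      throws IOException {
    Result result = table.get(new Get(row).addFamily(STAGING));
    NavigableMap<byte[], byte[]> pages = result.getFamilyMap(STAGING);
    if (pages == null || pages.size() < expectedPages) {
      return;  // incomplete: leave the staged components and wait for more data
    }
    Put put = new Put(row);
    put.addColumn(SUMMARY, Bytes.toBytes("document_summary"), summarize(pages));
    table.put(put);
  }

  /** Placeholder reduce step: concatenate the staged page contents. */
  private static byte[] summarize(NavigableMap<byte[], byte[]> pages) {
    StringBuilder sb = new StringBuilder();
    for (byte[] page : pages.values()) sb.append(Bytes.toString(page));
    return Bytes.toBytes(sb.toString());
  }
}
```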
Outdated State
Time 0: Alice lives in Chicago
Time 1: Alice lives in New York

[Diagram: Incoming Data flows into both the Chicago and New York resident indexes (Processed Data)]
- Big Data (MapReduce): rebuild processed data; outdated state is simply ignored
- Fast Updates (ACID database): simply update Alice's location
- Big and Fast: it gets complicated
Outdated State: Reconcile on Read
[Diagram: Historical Data (MapReduce output) + Incremental Updates → Merge → Application]
- Akin to Marz's Lambda Architecture
- Data stores optimized for specific workloads
- Keeps processing models independent
- Adds complexity at read time, but simpler overall
- Marz's Lambda Architecture is not available in commodity app stacks
- Probably the best approach when and if higher-level abstractions emerge
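An in-memory sketch of the read-time merge rule (newest record per key wins); the real inputs would be the MapReduce output store and the incremental store, but the merge logic is the same:

```java
import java.util.HashMap;
import java.util.Map;

/** Read-time merge: the newest record per key wins across both stores. */
public class MergingReader {

  public static final class Record {
    final String value;
    final long timestamp;
    Record(String value, long timestamp) {
      this.value = value;
      this.timestamp = timestamp;
    }
  }

  /** Combine the batch (historical) view with incremental updates on read. */
  public static Map<String, Record> read(Map<String, Record> batchView,
                                         Map<String, Record> incrementalView) {
    Map<String, Record> merged = new HashMap<>(batchView);
    incrementalView.forEach((key, update) ->
        merged.merge(key, update,
            (oldRec, newRec) -> newRec.timestamp >= oldRec.timestamp ? newRec : oldRec));
    return merged;
  }
}
```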
Outdated State: Reconcile on Write
Time 0: Alice lives in Chicago
Time 1: Alice lives in New York

[Diagram: Incoming Data flows into both the Chicago and New York resident indexes (Processed Data)]
- Keep a history of your incoming data
- When the event at Time 1 occurs, read that history and update both indexes (sketch below)
- Works with many existing data stores
- Adds complexity to processing logic
- Data store must handle both MapReduce and realtime loads -- may not be optimal
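A toy sketch of reconcile-on-write for the Alice example; the Index interface and in-memory history map are stand-ins for real stores:

```java
import java.util.Map;

/**
 * Toy reconcile-on-write: when a move event arrives, read the history of
 * incoming data and fix up every derived index it touches.
 */
public class ResidencyIndexer {

  public interface Index {
    void add(String person);
    void remove(String person);
  }

  private final Map<String, Index> indexByCity;   // e.g. "Chicago" -> its index
  private final Map<String, String> cityHistory;  // person -> last known city

  public ResidencyIndexer(Map<String, Index> indexByCity,
                          Map<String, String> cityHistory) {
    this.indexByCity = indexByCity;
    this.cityHistory = cityHistory;
  }

  public void onMove(String person, String newCity) {
    // The history tells us which previously written state is now outdated...
    String previousCity = cityHistory.put(person, newCity);
    // ...so we update *both* indexes: drop the stale entry, add the new one.
    if (previousCity != null && !previousCity.equals(newCity)) {
      indexByCity.get(previousCity).remove(person);
    }
    indexByCity.get(newCity).add(person);
  }
}
```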
Different models, same logic
- Incremental updates are like a rolling MapReduce
- Functions are the center of the universe (not InputFormats or Messages)
- Write logic as pure functions; coordinate with higher-level libraries such as Storm and Apache Crunch (sketch below)
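To make that concrete: one illustrative pure function, wrapped once for batch (an Apache Crunch DoFn) and once for streaming (a Storm bolt). The summarize body is a placeholder for real domain logic:

```java
// The domain logic is a pure function: same input, same output, no side effects.
public class SummaryLogic {
  public static String summarize(String document) {
    return document.length() > 80 ? document.substring(0, 80) + "..." : document;
  }
}

// Batch path: an Apache Crunch DoFn that just delegates.
class SummarizeFn extends org.apache.crunch.DoFn<String, String> {
  @Override
  public void process(String doc, org.apache.crunch.Emitter<String> emitter) {
    emitter.emit(SummaryLogic.summarize(doc));
  }
}

// Streaming path: a Storm bolt delegating to the same function.
class SummarizeBolt extends org.apache.storm.topology.base.BaseBasicBolt {
  @Override
  public void execute(org.apache.storm.tuple.Tuple input,
                      org.apache.storm.topology.BasicOutputCollector collector) {
    collector.emit(new org.apache.storm.tuple.Values(
        SummaryLogic.summarize(input.getString(0))));
  }

  @Override
  public void declareOutputFields(org.apache.storm.topology.OutputFieldsDeclarer d) {
    d.declare(new org.apache.storm.tuple.Fields("summary"));
  }
}
```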
Getting complicated?
- Incremental logic is complex and error-prone
- Use MapReduce as a failsafe
[Diagram: Source Systems 1…N → HTTPS → Collector Service → HBase raw data → Spout → Bolts → processed data → Apps and Services, with MapReduce also reading raw data and writing processed data as a failsafe]
Reprocess during uptime
- Deploy new incremental processing logic
- "Older" timestamps produced by MapReduce (sketch after the table)
- The most recently written cell in HBase need not be the logical newest
Row Key    | Document Family
document:1 | {doc, ts=50}, {doc, ts=200}
document:2 | {doc, ts=100}, {doc, ts=300}, {doc, ts=200}

The {doc, ts=300} cell is a real-time incremental update; the {doc, ts=200} cells are MapReduce outputs. The MapReduce cells may be physically written last, yet reads still see the logically newest data.
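A sketch of how the MapReduce side can write with an explicit, deliberately old timestamp using the HBase client; the family and qualifier names are assumptions:

```java
import java.io.IOException;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class BatchOutputWriter {
  private static final byte[] FAMILY = Bytes.toBytes("d");  // hypothetical name

  /**
   * MapReduce output carries an explicit, older timestamp (e.g. the job's
   * start time). Any cell the real-time path wrote since then has a newer
   * timestamp, so it keeps winning reads even though the batch cell was
   * physically written later.
   */
  public static void writeBatchCell(Table table, byte[] row, byte[] doc,
                                    long jobStartTime) throws IOException {
    Put put = new Put(row);
    put.addColumn(FAMILY, Bytes.toBytes("doc"), jobStartTime, doc);
    table.put(put);
  }
}
```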
Completing the Picture
[Diagram: the full pipeline again: raw data → Storm (Spout → Bolts) and MapReduce → processed data → Apps and Services]
Completing the Picture
[Diagram: the full pipeline, now also feeding Search Indexes from the processed data]
Building indexes with MapReduce
- A shard per task
- Build index in Hadoop
- Copy to index hosts
[Diagram: three Map Tasks, each running Embedded Solr to build its own Index Shard]
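The per-task indexing loop might look roughly like this with SolrJ's EmbeddedSolrServer; the field names and the surrounding core setup are assumed:

```java
import java.io.IOException;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class ShardBuilder {
  /**
   * Inside a map task: each task owns an embedded Solr core, indexes records
   * into a local shard, and the finished shard directory is then copied out
   * to the serving index hosts.
   */
  public static void indexRecord(EmbeddedSolrServer shard, String id, String body)
      throws SolrServerException, IOException {
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", id);     // hypothetical schema fields
    doc.addField("body", body);
    shard.add(doc);             // in-process indexing: no network hop per record
  }
}
```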
Pushing incremental updates
- POST new records
- Bursts can overwhelm target hosts
- Consumers must deal with transient failures
[Diagram: a Processor pushes a stream of POSTs to three Solr Shards, each with a Replica]
Pulling indexes from HBase
- Custom Solr plugin scans a range of HBase rows
- Time-based scan to get only updates (sketch after the diagram)
- Pulls items to index from HBase
- Cleanly recovers from volume spikes and transient failures
[Diagram: Solr shards issue Scans against HBase row ranges person:1 … person:n and person:n+1 … person:m]
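A sketch of the pull loop using a time-ranged HBase scan; the row-key bounds are illustrative and indexDocument stands in for the Solr plugin's conversion logic:

```java
import java.io.IOException;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class IndexPuller {
  /**
   * Scan only this shard's key range, with a time-range filter so each pass
   * picks up just the cells written since the last successful pull. A failed
   * pull simply reruns with the same start time, giving clean recovery.
   */
  public static void pullUpdates(Table table, long lastPullTime) throws IOException {
    Scan scan = new Scan(Bytes.toBytes("person:1"), Bytes.toBytes("person:n"));
    scan.setTimeRange(lastPullTime, System.currentTimeMillis());
    try (ResultScanner results = table.getScanner(scan)) {
      for (Result row : results) {
        indexDocument(row);  // hand each updated row to the local Solr core
      }
    }
  }

  private static void indexDocument(Result row) {
    // Domain-specific: convert the HBase row into a Solr document and add it.
  }
}
```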
A note on schema: simplify it!
- Heterogeneous row keys are efficient but hard to reason about
- Must inspect a row key to know what it is
- Mismatches tools like Pig or Hive
Row Key           | Qualifiers
person:1/name     | <content>
person:1/address  | <content>
person:1/friend:1 | <content>
person:1/friend:2 | <content>
person:2/name     | <content>
…
person:n/name     | <content>
person:n/friend:m | <content>
Logical parent per row
- The row is the unit of locality
- Tabular layout is easy to understand
- No lost efficiency for most cases
- See "HBase Schema Design", Ian Varley at HBaseCon 2012
Row Key  | Qualifiers
person:1 | name:<…>, address:<…>, friend:1:<…>, friend:2:<…>
person:2 | name:<…>, address:<…>, friend:1:<…>
. . .
person:n | name:<…>, address:<…>, friend:1:<…>
The path forward
This pattern has been successful… but complexity is our biggest enemy
We may be in the assembly language era of big data
Higher-level abstractions for these patterns will emerge
It’s going to be fun