Datacenter@Night: How Big Data Technologies Power Facebook

DESCRIPTION

Karthik Ranganathan, former lead engineer at Facebook and now at Nutanix, explains modern datacenters in his presentation, using Facebook as a business use case.

1

How Big Data Technologies Power Facebook

Kannan Muthukkaruppan & Karthik Ranganathan, Jun/20/2013
Karthik Ranganathan, September 2013

2

Introduction

Email: karthik@nutanix.com
Twitter: @KarthikR
Current: Member of Technical Staff, Nutanix
Background: Technical Engineering Lead at Facebook. Co-built Cassandra for Facebook Inbox Search and improved the performance and resiliency of HBase for Facebook Messages and Search Indexing.

3

Agenda

Big data at Facebook

HBase use cases
• OLTP
• Analytics

Operating at scale

The Nutanix solution

4

Big Data at Facebook

OLTP
• User databases (MySQL)
• Photos (Haystack)
• Facebook Messages, Operational Data Store (HBase)

Warehouse
• Hive Analytics
• Graph Search Indexing

5

HBase in a nutshell

Apache project, modeled after BigTable
Distributed, large-scale data store
Built on top of Hadoop DFS (HDFS)
Efficient at random reads and writes (see the sketch below)
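To make the random read/write model concrete, here is a minimal sketch using the modern HBase Java client API (the 2013-era API differed slightly); the table, family, row, and column names are hypothetical, not Facebook's actual schema:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseBasics {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table table = conn.getTable(TableName.valueOf("messages"))) {  // hypothetical table

                // Random write: one Put can carry several key-values,
                // spanning multiple column families.
                Put put = new Put(Bytes.toBytes("user123"));
                put.addColumn(Bytes.toBytes("body"), Bytes.toBytes("msg1"), Bytes.toBytes("hello"));
                put.addColumn(Bytes.toBytes("meta"), Bytes.toBytes("ts"), Bytes.toBytes(System.currentTimeMillis()));
                table.put(put);

                // Random read: fetch a single row directly by key.
                Result result = table.get(new Get(Bytes.toBytes("user123")));
                System.out.println(Bytes.toString(result.getValue(Bytes.toBytes("body"), Bytes.toBytes("msg1"))));
            }
        }
    }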

6

FB’s Largest HBase Application

Facebook Messages

7

The New Facebook Messages

8

Why HBase?

Evaluated a bunch of different options
• MySQL, Cassandra, building a custom storage system for messages

Horizontal scalability
Automatic failover and load balancing
Optimized for write-heavy workloads
HDFS already battle-tested at Facebook
HBase’s strong consistency model

9

Quick stats (as of Nov 2011)

Traffic to HBase
• Billions of messages per day
• 75B+ RPCs per day

Usage pattern
• 55% reads, 45% writes
• Average write: 16 key-values (KVs) across multiple column families (CFs)

10

Data Sizes

7PB+ online data
• ~21PB with replication
• LZO compressed
• Excludes backups

Growth rate
• 500TB+ per month
• ~20PB of raw disk per year!
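(As a sanity check, these figures are consistent if one assumes HDFS's default 3-way replication: 7 PB × 3 ≈ 21 PB with replication, and 500 TB/month × 3 × 12 months ≈ 18 PB of raw disk per year, in line with the ~20 PB figure above.)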

11

Growing with size

Constant need for new features as the system grows

Read and write path improvements
• Performance optimizations
• IOPS reduction
• New database file format

Intelligent data and compute placement
• Shard-level block placement
• Locality-based load balancing

12

Other OLTP use cases of HBase

Operational Data Store
Multi-tenant KeyValue store
Site integrity – fighting spam

13

Warehouse use cases of HBase

Graph Search Indexing
• Complex application logic
• Multiple verticals

Hive over HBase
• Real-time data ingest
• Enables real-time analytics

14

Real-time monitoring and anomaly detection

Operational Data Store

15

ODS: Facebook’s #1 Debugging Tool

Collects metrics from production servers

Supports complex aggregations and transformations

Really well-designed UI

16

Quick stats

Traffic to HBase
• 150B+ ops per day

Usage pattern
• Heavy reads of recent data
• Frequent MapReduce jobs for rollups
• TTL to expire older data (see the sketch below)
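Expiring old data is a native HBase feature, set per column family. A minimal sketch using the HBase 2.x admin API (the 0.9x-era HColumnDescriptor API the deck predates differs); table name, family name, and the 30-day TTL are hypothetical:

    import java.io.IOException;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
    import org.apache.hadoop.hbase.client.TableDescriptor;
    import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
    import org.apache.hadoop.hbase.util.Bytes;

    public class OdsTableSetup {
        static void createMetricsTable(Admin admin) throws IOException {
            TableDescriptor desc = TableDescriptorBuilder
                .newBuilder(TableName.valueOf("ods_metrics"))   // hypothetical table
                .setColumnFamily(ColumnFamilyDescriptorBuilder
                    .newBuilder(Bytes.toBytes("d"))             // hypothetical family
                    .setTimeToLive(30 * 24 * 3600)              // cells older than ~30 days are dropped at compaction
                    .build())
                .build();
            admin.createTable(desc);
        }
    }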

17

Real-time Analytics

Facebook Insights

18

Real-time URL/Domain Insights

Deep analytics for websites
• Facebook widgets

Massive scale
• Billions of URLs
• Millions of increments/sec (counter sketch below)
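HBase serves this kind of counter workload with atomic increments applied server-side, so writers never read-modify-write. A hedged sketch using the standard client API; table, family, and metric names are hypothetical:

    import java.io.IOException;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class InsightsCounters {
        static void recordClick(Connection conn, String url) throws IOException {
            try (Table table = conn.getTable(TableName.valueOf("url_insights"))) {
                // Atomically bump the "clicks" counter for this URL by 1.
                table.incrementColumnValue(
                    Bytes.toBytes(url),        // row key: the URL
                    Bytes.toBytes("m"),        // column family holding metrics
                    Bytes.toBytes("clicks"),   // qualifier: the metric name
                    1L);
            }
        }
    }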

19

Detailed Insights

Tracks many metrics
• Clicks, likes, shares, impressions
• Referral traffic

Detailed breakdown
• Age buckets, gender, location

20

Controlled Multi-tenancy

Generic KeyValue Store

21

A Multi-tenant solution on HBase

Generic Key-Value store
• Multiple apps on the same cluster
• Transparent schema design
• Simple API (see the sketch below):

put(appid, key, value)
value = get(appid, key)
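One plausible way to back this API with HBase is to namespace every row key by appid, so all tenants share one table. This is only a sketch under that assumption, with hypothetical table and family names, not Facebook's actual implementation:

    import java.io.IOException;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class TenantKVStore {
        private static final byte[] CF = Bytes.toBytes("v");    // single value family
        private static final byte[] QUAL = Bytes.toBytes("q");  // single qualifier
        private final Table table;

        public TenantKVStore(Connection conn) throws IOException {
            this.table = conn.getTable(TableName.valueOf("generic_kv"));  // hypothetical table
        }

        // Prefixing with appid keeps each tenant's keyspace disjoint.
        private static byte[] rowKey(String appid, byte[] key) {
            return Bytes.add(Bytes.toBytes(appid + ":"), key);
        }

        public void put(String appid, byte[] key, byte[] value) throws IOException {
            Put p = new Put(rowKey(appid, key));
            p.addColumn(CF, QUAL, value);
            table.put(p);
        }

        public byte[] get(String appid, byte[] key) throws IOException {
            Result r = table.get(new Get(rowKey(appid, key)));
            return r.getValue(CF, QUAL);
        }
    }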

22

Architecture

[Diagram: writes flow as put(appid, key, value) directly into HBase; reads flow as get(appid, key) through a Memcache tier that fronts HBase.]
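The diagram implies a cache-aside read path: check Memcache first, fall back to HBase on a miss, then populate the cache. A hedged sketch reusing the TenantKVStore sketch above; the MemcacheClient interface is a hypothetical stand-in for a real memcached client library:

    import java.io.IOException;
    import org.apache.hadoop.hbase.util.Bytes;

    public class CachedReads {
        // Hypothetical stand-in for a real memcached client.
        interface MemcacheClient {
            byte[] get(String cacheKey);
            void set(String cacheKey, byte[] value);
        }

        static byte[] cachedGet(MemcacheClient cache, TenantKVStore store,
                                String appid, String key) throws IOException {
            String cacheKey = appid + ":" + key;
            byte[] value = cache.get(cacheKey);               // 1. try the cache
            if (value == null) {
                value = store.get(appid, Bytes.toBytes(key)); // 2. miss: read HBase
                if (value != null) {
                    cache.set(cacheKey, value);               // 3. populate for later readers
                }
            }
            return value;
        }
    }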

23

Multi-tenancy Issues

Not a self-service model
• Each app is reviewed

Global and per-app metrics
• Monitor RPCs by type, latencies, errors
• Friendly names for apps

If things go wrong
• Per-app kill switch (see the sketch below)
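A per-app kill switch can be as simple as a shared flag consulted on every request; a hedged sketch with hypothetical names:

    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;

    public class AppKillSwitch {
        // Apps an operator has disabled; checked on every RPC.
        private final Set<String> disabledApps = ConcurrentHashMap.newKeySet();

        public void disable(String appid) { disabledApps.add(appid); }
        public void enable(String appid)  { disabledApps.remove(appid); }

        // Called at the top of put()/get(): rejects traffic from a misbehaving
        // app without affecting other tenants on the cluster.
        public void checkEnabled(String appid) {
            if (disabledApps.contains(appid)) {
                throw new IllegalStateException("app disabled: " + appid);
            }
        }
    }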

24

Powering FB’s Semantic Search Engine

Graph Search Indexing

25

Framework to build search indexes

Multiple, independent input sources
HBase stores document info
Output is the search index image (see the MR sketch below)

rowKey = document id
value = terms, document data
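HBase ships MapReduce input integration that fits this pipeline; a hedged sketch of wiring a scan over the document table into an MR job, where the table name and the IndexMapper stub are hypothetical:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;

    public class IndexBuildJob {
        // Hypothetical mapper: receives rows where rowKey = document id and the
        // cells hold terms plus document data, and emits index postings.
        public static class IndexMapper extends TableMapper<Text, Text> {
            @Override
            protected void map(ImmutableBytesWritable row, Result value, Context ctx)
                    throws IOException, InterruptedException {
                // ... emit (term, posting) pairs per document ...
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            Job job = Job.getInstance(conf, "build-search-index");
            job.setJarByClass(IndexBuildJob.class);

            Scan scan = new Scan();
            scan.setCaching(500);         // batch rows per RPC for scan throughput
            scan.setCacheBlocks(false);   // don't pollute the block cache with a full scan

            TableMapReduceUtil.initTableMapperJob(
                "documents",              // hypothetical source table
                scan, IndexMapper.class, Text.class, Text.class, job);
            // ... a reducer/output format would write the index image files ...
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }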

26

Architecture

[Diagram: document source 1 and document source 2 feed the HBase cluster; an MR cluster scans HBase and writes out the search index image files.]

27

Do’s and Don’ts From Experience

Operating at Scale

28

Design for failures(!)

Architect for failures and manageability

No single point of failure
• Killing any process is legit

Minimize manual intervention
• Especially for frequent failures

Uptime is important
• Rolling upgrades are the norm
• Need to survive rack failures

29

Dashboard and Metrics

Single place to graph/report everything

RPC calls and SLA misses
• Latencies, p99, errors
• Per-request profiling

Cluster and node health
Network utilization

30

Health Checks

Constantly monitor nodes

Auto-exclude nodes on failure
• Machine not ssh-able
• Hardware failures (HDD failure, etc.)
• Do NOT exclude on rack failures

Auto-include nodes once repaired
Rate-limit remediation of nodes (see the sketch below)
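A hedged sketch of the exclusion policy described above: failed health checks exclude a node, but remediation is rate-limited so a correlated event (like a rack failure) cannot drain the cluster. The threshold and method names are hypothetical:

    import java.time.Duration;
    import java.time.Instant;
    import java.util.ArrayDeque;
    import java.util.Deque;

    public class NodeExcluder {
        private static final int MAX_EXCLUSIONS_PER_HOUR = 3;  // hypothetical remediation budget
        private final Deque<Instant> recentExclusions = new ArrayDeque<>();

        // Returns true if the node may be auto-excluded right now.
        public synchronized boolean tryExclude(String node, boolean rackWideFailure) {
            if (rackWideFailure) {
                return false;  // never auto-exclude a whole rack; alert a human instead
            }
            Instant cutoff = Instant.now().minus(Duration.ofHours(1));
            while (!recentExclusions.isEmpty() && recentExclusions.peekFirst().isBefore(cutoff)) {
                recentExclusions.pollFirst();  // slide the one-hour window forward
            }
            if (recentExclusions.size() >= MAX_EXCLUSIONS_PER_HOUR) {
                return false;  // rate limit hit: likely correlated failure, stop and alert
            }
            recentExclusions.addLast(Instant.now());
            // ... mark `node` excluded so its load is redistributed ...
            return true;
        }
    }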

31

In a nutshell…

Use commodity hardware
Scaling out is #1
Efficiency is #2
• though pretty close behind scale-out

Design for failures
• Frequent failures must be auto-handled

Metrics, Metrics, Metrics!

32

Overview through comparison

The Nutanix Solution

33

Nutanix compared with HBase

Evaluated a bunch of different options
• MySQL, Cassandra, building a custom storage system for messages

Horizontal scalability
• Just add more nodes to scale out

Automatic failover and load balancing
• When a node goes down, others take its place automatically
• Load of the node that went down is distributed across many others

34

Nutanix compared with HBase: philosophy

Optimized for write-heavy workloads (HBase)
• Nutanix: optimized for virtualized environments and for both read- and write-heavy workloads
• Nutanix: transparent use of flash to boost performance

HDFS already battle-tested at Facebook
• Nutanix is also quite battle-tested

HBase’s strong consistency model
• Nutanix is also strongly consistent

35

Other aspects of Nutanix

Architected for failures and manageability
• No single point of failure
• Minimal manual intervention for frequent failures

Uptime is important
• Rolling upgrades are the norm
• Need to survive rack failures

Single place to graph/report everything
• Prism UI to report and manage the entire cluster

Constantly monitor nodes
• Auto-exclude nodes on failure

36

In a nutshell about Nutanix…

Runs on commodity hardware

Scaling out is #1
• Drop-in scale-out of nodes

Efficiency is #2
• Constant work on performance improvements

Design for failures
• Frequent failures auto-handled
• Alerts in the UI for many other states

Metrics, Metrics, Metrics!
• Prism UI gives insight into cluster health

37

Questions?

38

Thank You

NUTANIX INC. – CONFIDENTIAL AND PROPRIETARY
