Operating HBase – Things You Need to Know Christian Gügi

ApacheCon Europe 2012: Operating HBase - Things You Need to Know

DESCRIPTION

If you’re running HBase in production, you have to be aware of many things. In this talk we will share our experience in running and operating an HBase production cluster for a customer. To help you avoid common pitfalls, we’ll discuss the problems and challenges we’ve faced as well as practical, real-world techniques for repair. Even though HBase provides internal tools for diagnosing and repairing issues, keeping a cluster healthy can still be challenging for an administrator. We'll cover some background on these tools as well as on HBase internals such as compaction, region splits and region distribution. We'll also introduce the tool we recently open sourced to visualize region sizing and distribution in the cluster.

Operating HBase – Things You Need to Know

Christian Gügi

Outline

● HBase internals

● Overview of HBase utilities

● HBase split visualisation with Hannibal

● Challenges & lessons learned

● Resources to get started

About me

● Software Architect @ Sentric

● Founder and organizer of the Swiss Big Data User Group

– http://www.bigdata-usergroup.ch

● Contact:

– [email protected]

– http://www.sentric.ch

– @chrisgugi

HBase Internals

Data Model

● A sparse, multi-dimensional, sorted map

● A table consists of rows, each with a row key

● Each row may have any number of columns

● Rows are sorted lexicographically based on row key

● Column = Column Family : Column Qualifier

– Cell: {rowkey, column, timestamp} → value

● Region: contiguous set of sorted rows

● Region: unit of distribution and availability

[Bigtable: A Distributed Storage System for Structured Data]
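
The "sorted map" idea above can be made concrete with a toy in-memory table. This is only an illustrative sketch, not how HBase stores data: the class name `SortedTable` and its methods are hypothetical, and timestamps are assumed to be plain integers.

```python
from bisect import bisect_left, insort


class SortedTable:
    """Toy sorted map: (row key, column, timestamp) -> value."""

    def __init__(self):
        self._keys = []    # sorted list of (row, column, -timestamp)
        self._values = {}  # same key tuple -> value

    def put(self, row, column, timestamp, value):
        key = (row, column, -timestamp)  # negated so the newest version sorts first
        if key not in self._values:
            insort(self._keys, key)
        self._values[key] = value

    def get(self, row, column):
        """Return the newest value of a cell, or None if the cell is absent."""
        idx = bisect_left(self._keys, (row, column, float("-inf")))
        if idx < len(self._keys) and self._keys[idx][:2] == (row, column):
            return self._values[self._keys[idx]]
        return None

    def scan(self, start_row, stop_row):
        """Yield (row, column, timestamp, value) for start_row <= row < stop_row."""
        for row, column, neg_ts in self._keys:
            if start_row <= row < stop_row:
                yield row, column, -neg_ts, self._values[(row, column, neg_ts)]


table = SortedTable()
table.put("com.cnn.www", "content:html", 5, "<html>v5")
table.put("com.cnn.www", "content:html", 6, "<html>v6")
table.put("com.apache.www", "content:html", 1, "<html>a")
```

Because the keys are kept in lexicographic order, a contiguous key range (what a region holds) is simply a slice of the sorted list.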

Physical Data Organization

[Figure: a Region with its HLog (WAL on HDFS) and one Store per column family ("content", "anchor"); each Store has a Memstore and HFiles on HDFS]

● Column families are stored separately on disk

– Unit of access control with different patterns

● Writes are held (sorted) in memory until flush

● Sorted on disk in predictable order

– By row key, column key, descending timestamp

Flushes and Compaction

● Flushing/compaction per Region

– One thread (CompactSplitThread) per region server

● Minor compaction

– Merges two or more HFiles into one

● Major compaction

– Picks up all HFiles in the region, merges them and removes deleted k/v

● Regions are split when they grow too large
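
The difference between flush, minor compaction and major compaction can be sketched with sorted lists standing in for HFiles. This is a simplified model under stated assumptions, not HBase's implementation: `flush`, `compact` and the `TOMBSTONE` marker are hypothetical names, each key has a single version per file, and file selection policies are ignored.

```python
import heapq

TOMBSTONE = None  # hypothetical delete marker used only in this sketch


def flush(memstore):
    """Turn an in-memory dict into an immutable, key-sorted 'HFile' (a list)."""
    return sorted(memstore.items())


def compact(hfiles, major=False):
    """Merge key-sorted files (ordered oldest first) into one file.

    For duplicate keys the entry from the newest file wins. A major
    compaction additionally drops deleted key/values.
    """
    # Tag entries with -age so that, for equal keys, the newest file sorts first.
    tagged = [
        [(key, -age, value) for key, value in hfile]
        for age, hfile in enumerate(hfiles)
    ]
    merged = []
    last_key = object()  # sentinel that equals no real key
    for key, _, value in heapq.merge(*tagged):
        if key == last_key:
            continue  # older version of a key that was already handled
        last_key = key
        if major and value is TOMBSTONE:
            continue  # deleted key/value is physically removed
        merged.append((key, value))
    return merged


older = flush({"row-a": "1", "row-b": "2"})
newer = flush({"row-b": TOMBSTONE, "row-c": "3"})
minor_result = compact([older, newer])              # keeps the delete marker
major_result = compact([older, newer], major=True)  # row-b is gone
```

A minor compaction has to keep the delete marker because files outside the merge may still contain older versions of the deleted key.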

System Architecture

[Figure: HBase system architecture: Master and RegionServers (API, Write-Ahead Log, Memstore, HFile) on top of HDFS and ZooKeeper]

[HBase: The Definitive Guide]

Key Design & Distribution

● Bad idea: continuous number or timestamp (sequential row keys)

– RegionServer hot-spotting

● Better: use hash function and/or composite key

– Distribute keys over random regions

– Uniform reads/writes across key space

● Proper key design is essential

– E.g. reversed URL (Bigtable paper)
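
Both key-design ideas above can be sketched in a few lines. The helper names `salted_key` and `reversed_url_key`, the bucket count of 16 and the `NN-` prefix format are assumptions made for illustration; they are not part of HBase.

```python
import hashlib
from urllib.parse import urlsplit


def salted_key(key, buckets=16):
    """Prefix a sequential key with a hash-derived bucket to spread writes."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    bucket = digest[0] % buckets
    return f"{bucket:02d}-{key}"


def reversed_url_key(url):
    """Reverse the host name so pages of one domain sort next to each other."""
    parts = urlsplit(url)
    host = ".".join(reversed(parts.hostname.split(".")))
    return host + parts.path


timestamp_keys = [salted_key(f"2012110512{minute:02d}00") for minute in range(60)]
url_key = reversed_url_key("http://www.cnn.com/index.html")  # "com.cnn.www/index.html"
```

The trade-off of a hash prefix is that a range scan over the original key order has to visit every bucket, so it fits write-heavy workloads better than scan-heavy ones.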

Overview of HBase Utilities

Useful Tools

● hbck – checks and fixes table integrity and region consistency

● HFile – examine contents of HFile

● HLog – examine contents of HLog file

● OfflineMetaRepair – rebuild meta table from file system

● HBase web interfaces

– Master

– RegionServer

Monitoring Tools

● Ganglia

● Nagios

● OpenTSDB

● …

All tools use metrics provided through JMX

Manual Splitting

● Via master web interface

– Split

● HBase shell split command

● RegionSplitter

– Create table with pre-split regions

– Rolling split of all regions on existing table

– ./bin/hbase org.apache.hadoop.hbase.util.RegionSplitter
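
To illustrate what pre-splitting means, the sketch below computes evenly spaced split keys for a table whose row keys are assumed to be fixed-width lowercase hex strings (for example hash prefixes). It is not RegionSplitter's code; the function name `hex_split_keys` and the default width of 8 are hypothetical.

```python
def hex_split_keys(num_regions, width=8):
    """Return num_regions - 1 split keys dividing a hex key space evenly."""
    if num_regions < 2:
        return []  # a single region needs no split keys
    space = 16 ** width
    return [
        format(space * i // num_regions, f"0{width}x")
        for i in range(1, num_regions)
    ]


split_keys = hex_split_keys(4)  # ["40000000", "80000000", "c0000000"]
```

Each split key becomes the start key of a new region, so four regions need three split keys.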

Disable Automatic Splitting

● Determined by hbase.hregion.max.filesize

● Set to max. 100 GB

● OK, but:

– How do I monitor my region growth?

– Where do I split when I have irregular data growth?
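
One possible way to think about the second question is to pick the split key by stored bytes rather than by key count. The sketch below assumes you already have approximate sizes per row key or key range (for example from sampling); that input, and the function name `balanced_split_key`, are hypothetical and not something HBase provides in this form.

```python
from itertools import accumulate


def balanced_split_key(key_sizes):
    """Given (row_key, size_in_bytes) pairs sorted by key, return the key at
    which to split so that both halves hold roughly the same number of bytes."""
    if len(key_sizes) < 2:
        return None  # nothing to split
    total = sum(size for _, size in key_sizes)
    running = accumulate(size for _, size in key_sizes)
    for (key, _), bytes_before in zip(key_sizes[1:], running):
        # bytes_before is the amount of data stored in front of `key`
        if bytes_before * 2 >= total:
            return key
    return key_sizes[-1][0]


# Irregular growth: most of the data sits at the end of the key range.
sizes = [("a", 5), ("b", 5), ("c", 5), ("d", 5), ("e", 40), ("f", 40)]
split_at = balanced_split_key(sizes)  # "f": 60 bytes before it, 40 bytes from it on
```

Splitting at the middle key by count ("d") would leave 20 bytes on one side and 80 on the other, which is the situation the slide is asking about.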

HBase Split Visualisation with Hannibal

Hannibal

● Open source project on GitHub

– https://github.com/sentric/hannibal

● Web based

● Implemented in Scala

● Compatible with HBase 0.90

● Support for HBase > 0.92 to be added soon

● Check it out!

How well are regions balanced over the cluster?

How well are the regions split for the table?

How did the region evolve over time?

Future Plans

● HBase 0.92 client API changes allow querying the compaction state of regions through HBaseAdmin → differentiate major from minor compactions

● Add tool to find best region-key for irregular data growth

● Expose metrics through JMX

Challenges & Lessons Learned

Challenges

● Everyone is still learning

● Some issues only appear at scale

– At scale, nothing works as advertised

● Production cluster configuration

– Hardware issues

– Tuning cluster configuration to our workloads

● HBase stability

● Monitoring health of HBase

Lessons Learned

● Schema & key design

– What’s queried together should be stored together

● Monitoring/Operational tooling is most important

● Forget “emergency actions”, it takes some time

● You need DevOps in production

● Steep learning curve: you need to know the whole ecosystem

– Hadoop, HDFS, MapReduce, ZooKeeper

Resources to get started

● https://github.com/sentric/hannibal

● http://hbase.apache.org/book.html

● https://github.com/jmhsieh/hbase-repair-scripts

● http://www.sentric.ch/blog/best-practice-why-monitoring-hbase-is-important

● HBase: The Definitive Guide

Questions?

@chrisgugi

Thank you!