
Building/Administering Large DB Clusters
LinuxCon Europe 2012

Who are Palomino?
Bespoke Services: we work with and like you.
Production Experienced: senior DBAs, admins, and engineers.
24x7: globally-distributed on-call staff.
No-lock-in contracts.

Professional Services (DevOps):
➢ Chef,
➢ Puppet,
➢ Ansible.

Big Data Cluster Administration (OpsDev):
➢ MySQL, PostgreSQL,
➢ Cassandra, HBase,
➢ MongoDB, Couchbase.

Building/Administering Large DB Clusters
LinuxCon Europe 2012

Who am I?
Tim Ellis
CTO/Principal Architect, Palomino

Achievements:
➢ Palomino Big Data Strategy.
➢ Datawarehouse Cluster at Riot Games.
➢ Back-end Storage Architecture for Firefox Sync.
➢ Led DB teams at Digg for four years.
➢ Harassed the Reddit team at one of their parties.

Ensured Successful Business for:
➢ Digg, Friendster,
➢ Riot Games,
➢ Mozilla,
➢ StumbleUpon.

Questions?

Ask questions during the presentation. No need to hold your questions until the end.

Building/Administering Large DB Clusters
What is this Talk?

Building a Large Database Cluster
➢ Practical Concerns
➢ The tools and choices

Monitoring the Cluster
➢ How distributed DBs are different
➢ Getting the data and acting on it

Rules of Thumb
➢ How to size your cluster
➢ Cluster architecture

Prerequisite: Build a Large Cluster
Allocating the Hardware

Getting Hardware – your own company's:
➢ Can be politically-charged.
➢ Get a small batch first.
➢ Build small demonstration cluster.
➢ Get everyone on-board with the demo.

Renting/Leasing Hardware – the Cloud:
➢ Allocate hardware in EC2 or elsewhere.
➢ Usually easier, but possibly harder admin:
  ➢ Hardware failure more common.
  ➢ Hardware/network flakiness more common.

Prerequisite: Build a Large Cluster
Building the Cluster

Okay, I've got the hardware. What next?

Prerequisite: Build a Large Cluster
Building the Cluster

Configuring the Hardware. The old dilemma:
➢ Spend days to install/configure DB software?
  Subsequent management is painful.
➢ Use SSH in “for” loops?
  Rolling your own configuration management tools is a lot of work.
➢ Learn a configuration management tool?
  Obvious choice in 2012. Well-documented tools like Chef, Puppet, Ansible.

Configuration Management Tools
My Experience

Puppet: 6 years ago at Digg
➢ Manage/Deploy hundreds of servers.
➢ Painful, but not as bad as hand-coding it all.

Chef: 2 years ago at Drawn to Scale and Riot
➢ Manage/Deploy dozens of servers.
➢ Learning Ruby is a “joy” of its own.

Ansible: 6 months ago at Palomino
➢ Manage/Deploy dozens of servers.
➢ First Palomino Cluster Tool subset built.

Prerequisite: Build a Large Cluster
Configuration Management Options

Pick your Configuration Management:
➢ Chef: Popular, use Ruby to “code your infrastructure.” Must learn Ruby.
➢ Puppet: Mature, use data structures to “define your infrastructure.” Less coding.
➢ Ansible: Tiny and modular, similar to Puppet, but with ordering for deployment. Pragmatic.

Write/Get Recipes, Manifests, Playbooks?
➢ Writing is tedious. Can take >1 week.
➢ Get from internet? Often incomplete.

Prerequisite: Build a Large Cluster
The Palomino Cluster Tool

Palomino's tool for building large DB clusters:
➢ Chef, Puppet, Ansible modules.
➢ Open-source on Github.
  ➢ https://github.com/time-palominodb/PalominoClusterTool
  ➢ Google: “Palomino Cluster Tool.”
➢ Will build a large cluster for you in hours:
  ➢ HA Master(s)/Slaves – hundreds as easy as two
  ➢ HBase fully-distributed mode
➢ Previously this would take days.


The Palomino Cluster Tool
Building the Management Node

Cluster Management Node:
➢ Will build the initial cluster.
➢ Will do subsequent cluster management.

Tool for Initial Cluster Build:
➢ Palomino Cluster Tool (Ansible subset).

Tools for Trending and Alerting:
➢ Graphite or OpenTSDB
➢ Nagios or Icinga


Palomino Cluster Tool (Ansible subset).

Why Ansible?
➢ No server to set up, simply uses SSH.
➢ Easy-to-understand non-code Playbooks.
➢ Use a language you know for modules.
➢ For demo purposes, obvious choice.
➢ Also production-worthy:
  ➢ Built by Michael DeHaan, long-time configuration management guru.


Management node lives alongside your cluster.
➢ We are building our cluster in EC2.
➢ Thus management node in EC2.
➢ This tutorial assumes Ubuntu 12.04.
➢ t1.micro is fine for management node.

Install basic tools:
➢ apt-get install git (for Ansible/P.C.T.)
➢ apt-get install make python-jinja2 (for Ansible)

The Palomino Cluster Tool
Configuring the Management Node

Install Ansible:
➢ git clone git://github.com/ansible/ansible.git
➢ cd ansible && make install

Install Palomino Cluster Tool:
➢ git clone git://github.com/time-palominodb/PalominoClusterTool.git

I think we just finished the management node!

Picking a Distributed DBMS
The Single Point of Failure?

Typical Reasons Clusters Fail:
➢ Cascading failure (distributed fail)
➢ Network failure (distributed fail)
➢ Bad query executed (distributed fail)
➢ NameNode failing? (single point of failure)

NameNode failure is not a typical cause of cluster failure. Still, it's good to plan for it:
➢ All critical filesystems RAID 1+0
➢ Redundant PSUs and NICs

Building an HBase Distributed Database
Hardware and Architecture

NameNode as mentioned: highly redundant.

All other nodes: commodity hardware.
➢ RAID-0 or, preferred, JBOD.
➢ Spindles++: 8 HDDs in 1U good starting point.
➢ 7200RPM SATA: nice, 15KRPM: overkill.
➢ Many TB of storage. ← lots of this!
➢ 8-24GB RAM.
➢ Good/fast/multiple NICs.

Hadoop/HBase want lots of disk & network.
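A rough sizing sketch, for illustration only. It combines the 8-disks-per-node starting point above with the talk's later rule of thumb of keeping the cluster under 70% full. The replication factor of 3 is the HDFS default; the 2 TB disk size and the 100 TB dataset are hypothetical numbers, and the function name is made up for this example. It ignores compression, temporary space, and OS overhead.

```python
import math

def datanodes_needed(data_tb, replication=3, disks_per_node=8,
                     disk_tb=2.0, max_fill=0.70):
    """Estimate how many data nodes are needed to hold `data_tb` of data.

    replication: copies of each block (3 is the HDFS default).
    disk_tb:     size of one disk in TB (hypothetical value).
    max_fill:    fraction of raw disk we allow ourselves to use.
    """
    raw_tb = data_tb * replication                       # total bytes on disk
    usable_per_node = disks_per_node * disk_tb * max_fill
    return math.ceil(raw_tb / usable_per_node)

# Hypothetical dataset: 100 TB before replication.
print(datanodes_needed(100))
```

Treat the result as a floor for planning conversations, not a purchase order: growth rate and workload shape usually matter more than today's dataset size.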

Building a Distributed DBMS
Network and Rack Considerations

Network Within the Rack (Top-of-Rack Switch)
➢ Bandwidth for 30 machines going full-tilt.
➢ Multiple TOR switches for redundancy.
➢ Bridging on nodes.

Network Between Racks
➢ Better than 2GB desirable.
➢ Network instability causes cluster instability.

Enlist help of your in-house Networking Pros.

Monitoring a Distributed DB Cluster
Picking a Trending Tool

Tool must allow correlation of statistics.
➢ Pick any N stats,
➢ Put on a graph of log/linear scale,
➢ Pick colours of each stat.

Tools that have these characteristics:
➢ OpenTSDB,
➢ Graphite,
➢ Others?

Monitoring a Distributed DB Cluster
Trending

Which stats should I capture? In doubt? Graph all of them.
➢ Every Hadoop statistic,
➢ Every HBase statistic,
➢ Every OS-level statistic.

How?
➢ CollectD has JMX plugin.
➢ HBase/Hadoop have Ganglia stats export.
➢ Ganglia/gmond can store into Graphite.
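For stats that none of the collectors above cover, one option (an addition here, not something the talk prescribes) is to push values straight to Graphite using its plaintext protocol: one "path value timestamp" line per metric, sent to the Carbon listener, which defaults to TCP port 2003. A minimal sketch; the metric path and host name are hypothetical.

```python
import socket
import time

def graphite_line(path, value, timestamp=None):
    """Format one metric in Graphite's plaintext protocol."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"{path} {value} {ts}\n"

def send_metrics(lines, host, port=2003, timeout=5.0):
    """Send pre-formatted metric lines to a Carbon plaintext listener."""
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)

# Formatting only; nothing is sent over the network here.
print(graphite_line("cluster.node01.log.errors_per_hour", 15692, 1343541600), end="")
```

To actually ship the line, call `send_metrics([...], "graphite.example.com")` with your own Graphite host.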

Monitoring a Distributed DB Cluster
Distributed Databases are Different

Cross-node Correlation of Events:
➢ Node X instability? Could be Node Y's fault.
➢ ERRORs across all nodes?
➢ Correlation of WARNINGs and ERRORs?
➢ Log events correlate to graph anomalies?
➢ Size of error logs changing at a new rate?

Outliers cause problems:
➢ Slow nodes causing cascading failures.
➢ Network instability causing cluster failure.
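One simple way to spot the slow-node outliers mentioned above is to compare every node against the cluster median. This is only a sketch: the node names and latency figures are hypothetical, and the 3x factor is an arbitrary starting point to tune.

```python
from statistics import median

def slow_nodes(latency_by_node, factor=3.0):
    """Return nodes whose latency exceeds `factor` times the cluster median."""
    if not latency_by_node:
        return []
    typical = median(latency_by_node.values())
    if typical <= 0:
        return []
    return sorted(node for node, value in latency_by_node.items()
                  if value > factor * typical)

# Hypothetical per-node request latency in milliseconds.
latencies = {
    "node01": 12.0,
    "node02": 11.5,
    "node03": 13.1,
    "node04": 95.0,
    "node05": 12.4,
}
print(slow_nodes(latencies))
```

The median is used instead of the mean so that one very slow node does not drag the baseline up and hide itself.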

Troubleshooting Distributed DB Cluster
By Scientific Method: Procedure

Problems on the cluster?

Formulate hypothesis from input:
➢ Graphs
➢ Logs

Test hypothesis (tweak config)

Check you're graphing everything and go to the start.

Distributed DB Cluster Trending
Graphing your Logs

You need to graph everything. Are you graphing your logs?

➢ grep ERROR | cut [dt/hr part] | uniq -c

2012-07-29 06 15692
2012-07-29 07 30432
2012-07-29 08 76943
2012-07-29 09 54955
2012-07-29 10 15652

That's close, but what if it's hundreds of lines? You can use a spreadsheet, but that slows the iteration cycle.
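The same per-hour count can be sketched in Python when the shell pipeline gets awkward. This assumes each log line starts with a "YYYY-MM-DD HH:MM:SS" timestamp; the sample lines and their messages are hypothetical.

```python
import re
from collections import Counter

# Assumed log line shape: "YYYY-MM-DD HH:MM:SS,mmm LEVEL message"
HOUR_PREFIX = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}):")

def count_per_hour(lines, level="ERROR"):
    """Return {"YYYY-MM-DD HH": count} for lines containing `level`."""
    counts = Counter()
    for line in lines:
        if level not in line:
            continue
        match = HOUR_PREFIX.match(line)
        if match:
            counts[match.group(1)] += 1
    return dict(sorted(counts.items()))

# Hypothetical log lines, for illustration only.
sample = [
    "2012-07-29 06:01:12,101 ERROR RegionServer: hypothetical message",
    "2012-07-29 06:59:40,007 ERROR RegionServer: hypothetical message",
    "2012-07-29 07:03:05,250 WARN  RegionServer: hypothetical message",
    "2012-07-29 07:15:33,900 ERROR RegionServer: hypothetical message",
]
for hour, count in count_per_hour(sample).items():
    print(hour, count)
```

Pass an open file object instead of `sample` to run it over a real log, and change `level` to count WARNINGs.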


Graphing logs (terminal output) is easier with Palomino's terminal tool “distribution,” OSS on Github:

# grep ERROR | cut <date/hour part> | distribution

2012-07-29 06|15692 ++++++++++
2012-07-29 07|30432 +++++++++++++++++++
2012-07-29 08|76943 ++++++++++++++++++++++++++++++++++++++++++++++++
2012-07-29 09|54955 ++++++++++++++++++++++++++++++++++
2012-07-29 10|15652 ++++++++++

On a quick iteration cycle in the terminal, this is very useful. For presentation to the suits later you can import the data into another prettier tool.
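To show the idea behind that kind of output, here is a minimal text-histogram sketch. It is not the “distribution” tool's implementation, just an illustration that scales each count to a bar of plus signs; the function name and the default width of 48 are choices made for this example.

```python
def render_histogram(counts, width=48):
    """Return one "key|count bar" line per entry, bars scaled to `width`."""
    if not counts:
        return []
    largest = max(counts.values())
    lines = []
    for key, count in counts.items():
        bar_len = round(width * count / largest) if largest else 0
        lines.append(f"{key}|{count} " + "+" * bar_len)
    return lines

# ERROR counts per hour, taken from the example output above.
slide_counts = {
    "2012-07-29 06": 15692,
    "2012-07-29 07": 30432,
    "2012-07-29 08": 76943,
    "2012-07-29 09": 54955,
    "2012-07-29 10": 15652,
}
print("\n".join(render_histogram(slide_counts)))
```

Scaling against the largest value keeps the busiest hour at full width, so the eye goes straight to the spike.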


You want to characterise your logs.
➢ How many ERRORs per hour?
➢ How many WARNINGs per hour?
➢ How many log lines per hour?

Look for patterns and ratios.
➢ Alert on deltas.

System is imperfect, but it's a good start.
➢ Good start >> Unstarted perfection.
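A sketch of what "alert on deltas" and "look for ratios" could mean in practice. The ERROR counts are the ones from the earlier example; the total-lines-per-hour figures are hypothetical, and both thresholds are illustrative assumptions to be tuned against your own baseline.

```python
def hourly_alerts(error_counts, total_counts,
                  max_growth=2.0, max_error_ratio=0.05):
    """Flag hours where ERRORs jumped sharply or dominate the log.

    max_growth:      alert when ERRORs are >= this multiple of the previous hour.
    max_error_ratio: alert when ERRORs exceed this fraction of all log lines.
    """
    alerts = []
    previous = None
    for hour in sorted(error_counts):
        errors = error_counts[hour]
        total = total_counts.get(hour, 0)
        if previous and errors / previous >= max_growth:
            alerts.append((hour, f"ERRORs grew {errors / previous:.1f}x over previous hour"))
        if total > 0 and errors / total > max_error_ratio:
            alerts.append((hour, f"ERRORs are {errors / total:.1%} of log lines"))
        previous = errors
    return alerts

errors = {
    "2012-07-29 06": 15692,
    "2012-07-29 07": 30432,
    "2012-07-29 08": 76943,
    "2012-07-29 09": 54955,
    "2012-07-29 10": 15652,
}
# Hypothetical: one million log lines in every hour.
totals = {hour: 1_000_000 for hour in errors}

for hour, message in hourly_alerts(errors, totals):
    print(hour, message)
```

The absolute ERROR count matters less than the change: a cluster that always logs 15,000 ERRORs an hour is in a different state from one that just doubled.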

Distributed DB Cluster Alerting
Tools

The tools are already well-known:
➢ Whatever you already use probably works.
➢ Nagios/Icinga are very capable.

Alerting rules are simpler than RDBMS:
➢ Daemon not responding on port?

And more complex:
➢ Increased ERROR/WARNING frequency?
➢ Different CPU/Network characteristics?
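The "daemon not responding on port?" rule is simple enough to sketch as a Nagios-style check. Nagios and Icinga plugins report state through the exit code (0 OK, 2 CRITICAL, 3 UNKNOWN) plus one line of output. The script name in the usage text is hypothetical, and both tools already ship a TCP check, so this is only to show how little is involved.

```python
import socket

# Standard Nagios plugin exit codes.
OK, CRITICAL, UNKNOWN = 0, 2, 3

def check_port(host, port, timeout=3.0):
    """Return (exit_code, message) based on whether a TCP connect succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return OK, f"OK - {host}:{port} accepting connections"
    except OSError as exc:
        return CRITICAL, f"CRITICAL - {host}:{port} not responding ({exc})"

def main(argv):
    """Usage: check_port.py HOST PORT (the script name is hypothetical)."""
    if len(argv) != 3:
        print("UNKNOWN - usage: check_port.py HOST PORT")
        return UNKNOWN
    code, message = check_port(argv[1], int(argv[2]))
    print(message)
    return code
```

To use it as a plugin, finish the script with `sys.exit(main(sys.argv))` and point it at each daemon's host and port.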

Administering Hadoop
Rules of Thumb

Don't let cluster get >70% full.
➢ Disk throughput suffers.
➢ HBase compactions slower or impossible.

Watch your network!
➢ Network saturated? Perhaps reduce-heavy.
➢ Disks saturated? Perhaps map-heavy.

Logs have WARNINGs and even ERRORs.
➢ Act only if ERRORs translate to problems.
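The 70% rule is easy to turn into a check. The sketch below looks at local filesystems on one node using Python's standard library; it does not query HDFS for the cluster-wide figure, and the path used in the demo is just the current directory standing in for your data disks.

```python
import shutil

MAX_FILL = 0.70  # rule of thumb from this talk

def fill_fraction(path):
    """Return the used fraction (0.0 to 1.0) of the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total if usage.total else 0.0

def over_threshold(paths, max_fill=MAX_FILL):
    """Return {path: used_fraction} for filesystems fuller than `max_fill`."""
    result = {}
    for path in paths:
        fraction = fill_fraction(path)
        if fraction > max_fill:
            result[path] = fraction
    return result

# "." stands in for the data directories you would really check.
for path, fraction in over_threshold(["."]).items():
    print(f"{path} is {fraction:.0%} full")
```

Run it per data directory on each node and feed the result into whatever alerting you already have.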

Administering any Distributed DBMS
Rules of Thumb

The Cloud is flaky:
➢ Dramatically variant performance (>30x!).
➢ Cascading failures more common.
➢ Cannot choose network topology.

EC2 Brazil is currently most stable.

EC2 US-East is currently most flaky.

Building/Administering Large DB Clusters
Q&A

Questions? Suggestions:
➢ Interesting stuff. Got a job for me?
➢ Well I got a job for you. Interested?
➢ Average flight speed of a laden sparrow?
➢ What's the meaning of Donnie Darko?

Thank you! Emails to domain palominodb, username time. LinuxCon Europe 2012 in Barcelona. Enjoy the rest of the show!
