This session discusses how to represent your complete cluster in one config file and deploy it to cloud or bare metal. Infochimps' Ironfan builds on Opscode Chef to let you specify and orchestrate every aspect of your cluster's deployment, monitoring and growth: not just the core HBase/HDFS/MapReduce/Hive/Flume components, but also the web and app servers, MySQL, Redis, RabbitMQ and whatever other servers are needed to implement your service. The same tools can manage variations for development, staging and R&D, as well as the target “rendering” to various clouds, bare metal or even Vagrant VMs.
Orchestrating Clusters with Ironfan and Chef
Robert J. Berger - CTO, Runa
Hassles of Big Data Stack Deployments
• Lots of Moving Parts
• Hadoop/HBase just one sub-system
• Heterogeneous Tech
• Monitoring & Metrics Everywhere
• Details obscure the big picture
• Need repeatability & variations on themes
The Forest for the Trees
Forest
(Diagram of the top-level subsystems: Dashboard, App Servers, Monitoring, ElasticSearch, AMQP, Hadoop M/R, Session Store, HBase, MySQL)
Trees
(Diagram: the subsystems expanded into individual servers: ELB, Web, Rails App, Clj App, Graphite, Ganglia, Logstash, Sensu, Statsd, GDash, ES Server, Rabbit, Redis, ZooKeeper, HB Master, Regionsrvr, Master, Sec Master and Slave instances, grouped under Dashboard, App Servers, Monitoring, ElasticSearch, AMQP, Hadoop M/R, Session Store, HBase and MySQL)
Leaves
Dashboard, App Servers, Monitoring, Elastic Search, AMQP, Hadoop M/R, Session Store, HBase, MySQL
(Diagram: the packages and services installed across the subsystems)
• Nginx, Reverse Proxy, Unicorn, Rails, Dashboard App, Java, Postfix, Cron jobs, Sensu client, Sensu plugins
• Elastic Load Balancer, Nginx, Reverse Proxy, Swarmiji, Java, Leiningen, Jark, Clojure Apps, HBase Client, Postfix, Cron jobs, Sensu client, Sensu plugins
• MySQL Client, Upstart Config, Logstash Client
• MySQL Client, Upstart Config, Logstash Client
• Nginx, Reverse Proxy, Java, Leiningen, Jark, Postfix, Cron jobs, Sensu Server, Sensu Web, Sensu Client, Sensu Plugins
• Ganglia Server, Ganglia Web, Statsd Server, Graphite Server, Graphite Web, Python, Logstash, MySQL Client, Upstart Config, Logstash Client
• Elastic Search Server, Java, Cron jobs, Sensu client, Sensu plugins, Upstart Config
• RabbitMQ, RabbitMQ Plugins, Cluster Config, Erlang, Cron jobs, Sensu client, Sensu plugins, Upstart Config
• Namenode, Secondary Namenode, Tasktracker, Jobtrackers, Bootstrap Namenode, Java, JMX, Cron jobs, Sensu client, Sensu plugins, Ganglia Client, Upstart Config
• Namenode, Secondary Namenode, Tasktracker, Jobtrackers, Datanodes, Bootstrap Namenode, Java, JMX, Cron jobs, Sensu client, Sensu plugins, Ganglia Client, Upstart Config
• Redis, Cron jobs, Sensu client, Sensu plugins, Upstart Config
• Zookeeper, HBase Master, Regionserver
• MySQL Master, MySQL Slaves, Cluster Setup, Cron jobs, Sensu client, Sensu plugins, Upstart Config
Molecules
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://ip-10-17-57-58.ec2.internal:8020/hadoop/hbase</value>
    <description>The directory shared by region servers. Should be fully-qualified to include the filesystem to use. E.g: hdfs://NAMENODE_SERVER:PORT/HBASE_ROOTDIR</description>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
    <description>The mode the cluster will be in. Possible values are
      false: standalone and pseudo-distributed setups with managed Zookeeper
      true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
    </description>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>master0-cluster0.runa.com,regionserver0-cluster0.runa.com,regionserver1-cluster0.runa.com</value>
    <description>Comma separated list of servers in the ZooKeeper Quorum. For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com". By default this is set to localhost for local and pseudo-distributed modes of operation. For a fully-distributed setup, this should be set to a full list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh this is the list of servers which we will start/stop ZooKeeper on.</description>
  </property>
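Property files like the one above are the "molecules" that configuration management generates rather than hand-edits. As a rough illustration only, here is a minimal plain-Ruby sketch that renders Hadoop-style properties from a hash. The `render_hadoop_xml` and `xml_escape` helpers and the host names are hypothetical; they are not part of Chef or Ironfan, where a cookbook template would normally do this job.

```ruby
# Hypothetical helpers: plain Ruby, not a Chef or Ironfan API.

# Escape the characters that would break XML element content.
def xml_escape(text)
  text.to_s.gsub('&', '&amp;').gsub('<', '&lt;').gsub('>', '&gt;')
end

# Render a hash of name => value pairs as a Hadoop-style configuration file.
def render_hadoop_xml(properties)
  lines = ['<configuration>']
  properties.each do |name, value|
    lines << '  <property>'
    lines << "    <name>#{xml_escape(name)}</name>"
    lines << "    <value>#{xml_escape(value)}</value>"
    lines << '  </property>'
  end
  lines << '</configuration>'
  lines.join("\n")
end

# Illustrative values only (placeholder host names).
hbase_site = {
  'hbase.rootdir'             => 'hdfs://NAMENODE_SERVER:8020/hadoop/hbase',
  'hbase.cluster.distributed' => true,
  'hbase.zookeeper.quorum'    => %w[zk0.example.com zk1.example.com zk2.example.com].join(',')
}

puts render_hadoop_xml(hbase_site)
```

Keeping the values in a hash means the same renderer can serve development, staging and production with different inputs.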
Config Management: Leaves & Molecules
• Chef, Puppet, Cfengine
• Much better than shell scripts or CLI jocks
• Infrastructure as code
• Still No Forest Perspective
Ironfan: Forest, Trees, Leaves and Molecules
• Builds on top of Chef
• Cluster Description in Single File
• Your System Diagram Come to Life
• Components announce capabilities
• Service Discovery automates interconnects
• Knife (CLI) extension controls cluster and component life-cycles
Basic Chef
(Diagram with the following elements)
• Chef Development Host: Community + Own Cookbooks
• Chef Knife: Upload Cookbooks to Chef Server; Launch Instances / Bootstrap Servers
• Chef Server: Roles / Cookbooks, Data Bags, Nodes, Attributes, Auth & ACLs, Search
• Nodes (N): VMs and/or Servers Running Chef-Client, sharing Node Attributes and Search through the Chef Server
Chef + Ironfan
(Diagram with the following elements)
• Chef Development Host: Community + Own Cookbooks, plus Ironfan Pantry
• Chef Knife + Ironfan Gem: Launch / Bootstrap / Manage Whole Clusters, All Facets or Specific Instances
• Chef Server: Roles / Cookbooks, Data Bags, Nodes, Attributes, Discovery, Auth & ACLs, Search / Discovery
• Nodes (N): VMs and/or Servers Running Chef-Client
Ironfan Components
• Ironfan Gem:
• Knife Plugins to orchestrate clusters
• Logic to sync Chef Server & Cloud[s]
• Silverware: Coordinate Discovery of Services
• Ironfan-Homebase: Ironfan tuned Chef-Repo
• Ironfan-Pantry: Cookbooks tuned for Clusters
• Ironfan-CI: Testing of Ironfan clusters and Cookbooks
Cluster Config: Forest View
ClusterChef.cluster 'base0-cluster0' do
  setup_role_implications
  cloud :ec2 do
    region 'us-east-1'
    availability_zones ['us-east-1b']
    backing 'ebs'
    image_name 'natty'
    security_group(cluster_root) do
      authorize_port_range(22)
    end
  end

  role "base_role"
  role "chef_client"
  role "base0-cluster0"
  role "production"
  role "runastack"
  role "ebs_volumes_attach"
  role "ebs_volumes_mount"

  facet 'master' do
    instances 1
    cloud.image_id 'ami-93c31afa'
    cloud.flavor "cc1.4xlarge"

    role "big_package"
    role "hadoop_master"
    role "hbase_master"
    recipe "cluster_chef::cluster_webfront"
    recipe "hbase::utils"
    recipe "route53::runa"
    role "monitored_client"
  end
end
• Global Cloud & Cluster Configs
• Facets are the “Trees”
• Roles & Recipes are the “Leaves”
Global Cloud & Recipe Configs
ClusterChef.cluster 'base0-cluster0' do   # Cluster Name
  setup_role_implications
  cloud :ec2 do                           # Cloud Configs
    region 'us-east-1'
    availability_zones ['us-east-1b']
    backing 'ebs'
    image_name 'natty'
    security_group(cluster_root) do       # Configure Security Group
      authorize_port_range(22)
    end
  end

  # Shared Roles
  role "base_role"
  role "chef_client"
  role "base0-cluster0"
  role "production"
  role "runastack"
  role "ebs_volumes_attach"
  role "ebs_volumes_mount"

  # facet definitions follow (see "Facets add Specifics")
end
Facets add Specifics
facet 'master' do                         # Facet Name
  instances 1                             # Number of Copies
  cloud.image_id 'ami-93c31afa'           # Cloud Overrides
  cloud.flavor "cc1.4xlarge"

  # Facet Roles & Recipes
  role "hadoop_master"
  role "hbase_master"
  recipe "cluster_chef::cluster_webfront"
  recipe "hbase::utils"
  recipe "route53::runa"
  role "monitored_client"
end

facet 'regionserver' do                   # Facet Name
  instances 7                             # Number of Copies
  cloud.image_id 'ami-93c31afa'           # Cloud Overrides
  cloud.flavor "cc1.4xlarge"

  # Facet Roles & Recipes
  role "hadoop_slave"
  role "hbase_regionserver"
  recipe "hbase::utils"
  recipe "route53::runa"
  role "monitored_client"

  server 0 do                             # Make one instance special
    role "zookeeper_server"
  end
end
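To make the cluster / facet / instance hierarchy concrete, here is a toy model in plain Ruby of how a cluster description could expand into individual servers. It is a sketch only: `ToyCluster` and `Facet` are hypothetical names, this is not the ClusterChef/Ironfan DSL, and the cluster-facet-index naming scheme is an assumption made for illustration.

```ruby
# Toy model of a cluster description; NOT the Ironfan/ClusterChef DSL.
Facet = Struct.new(:name, :instances, :roles)

class ToyCluster
  attr_reader :name, :facets

  def initialize(name, shared_roles)
    @name = name
    @shared_roles = shared_roles
    @facets = []
  end

  # Register a facet: a named group of identical servers with extra roles.
  def facet(name, instances:, roles: [])
    @facets << Facet.new(name, instances, roles)
    self
  end

  # One entry per server: an assumed cluster-facet-index name plus the
  # shared roles followed by the facet's own roles.
  def servers
    @facets.flat_map do |f|
      (0...f.instances).map do |i|
        { name: "#{@name}-#{f.name}-#{i}", roles: @shared_roles + f.roles }
      end
    end
  end
end

cluster = ToyCluster.new('base0-cluster0', %w[base_role chef_client])
cluster.facet('master', instances: 1, roles: %w[hadoop_master hbase_master])
cluster.facet('regionserver', instances: 7, roles: %w[hadoop_slave hbase_regionserver])

cluster.servers.each { |s| puts "#{s[:name]}: #{s[:roles].join(', ')}" }
```

The point of the model is that the shared roles are written once at the cluster level and each facet only adds what differs.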
Facets Composed of Components
• Components are Services
• Nginx, MySQL server, Zookeeper, HBMaster, Namenode, etc.
• Chef Cookbooks manage components
• Ironfan Silverware for service discovery
• Auto-Connects components together
Silverware Service Discovery
• A recipe that creates a service announces it
announce(:hadoop, :namenode)
• A recipe that requires a service discovers it
hbase_config = Mash.new({
  :namenode_fqdn   => discover(:hadoop, :namenode).private_hostname,
  :jobtracker_addr => discover(:hadoop, :jobtracker).private_ip,
  :zookeeper_addrs => discover_all(:zookeeper, :server).map(&:private_ip).sort,
  :ganglia         => discover(:ganglia, :server),
  :ganglia_addr    => discover(:ganglia, :server).private_hostname,
  :private_ip      => private_ip_of(node)
})
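The announce/discover pattern can be illustrated with a toy in-memory registry. This is a sketch under stated assumptions, not Silverware's implementation: `ToyRegistry` is a hypothetical class, the node names and IP addresses are made up, and the real system keeps its announcements with the Chef Server rather than in a local hash.

```ruby
# Toy in-memory service registry; purely illustrative, not Silverware code.
class ToyRegistry
  Entry = Struct.new(:node_name, :private_ip)

  def initialize
    @services = {}
  end

  # A provider records that it offers (system, component).
  def announce(system, component, node_name:, private_ip:)
    (@services[[system, component]] ||= []) << Entry.new(node_name, private_ip)
  end

  # All providers of a service, or an empty list if none announced.
  def discover_all(system, component)
    @services.fetch([system, component], [])
  end

  # A single provider: the most recent announcement, or nil.
  # (Picking the latest is an arbitrary choice for this sketch.)
  def discover(system, component)
    discover_all(system, component).last
  end
end

registry = ToyRegistry.new
registry.announce(:hadoop, :namenode, node_name: 'master-0', private_ip: '10.0.0.10')
registry.announce(:zookeeper, :server, node_name: 'zk-1', private_ip: '10.0.0.22')
registry.announce(:zookeeper, :server, node_name: 'zk-0', private_ip: '10.0.0.21')

config = {
  namenode_addr:   registry.discover(:hadoop, :namenode).private_ip,
  zookeeper_addrs: registry.discover_all(:zookeeper, :server).map(&:private_ip).sort
}
puts config.inspect
```

Because consumers look services up by name instead of hard-coding hosts, adding or replacing a server needs no change to the consuming recipe.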
Aspects enable Zeroconf Amenities
• A log aspect would enable the following amenities
• logrotated to manage its logs
• flume to archive logs to a location
• A port aspect would enable
• Configuration of firewall
• Monitoring of port uptime & latency
• Remote checks that firewalled ports do NOT respond
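One way to picture aspects is as a lookup from a declared aspect to the amenities it switches on. The sketch below is hypothetical: the `AMENITIES` table and `amenities_for` helper are invented for illustration, the log path is a placeholder, and Ironfan's own aspect handling may look quite different.

```ruby
# Hypothetical aspect-to-amenity table, following the two examples above.
AMENITIES = {
  log:  ['logrotated config', 'flume log archiving'],
  port: ['firewall rule', 'port uptime & latency monitor', 'remote closed-port check']
}.freeze

# A component declares its aspects; the amenities are derived from them
# without any further per-component configuration.
def amenities_for(component)
  component[:aspects].flat_map do |kind, detail|
    AMENITIES.fetch(kind, []).map do |amenity|
      "#{amenity} for #{component[:name]} (#{detail})"
    end
  end
end

# Placeholder log path; port 8020 matches the namenode URL shown earlier.
namenode = { name: 'namenode', aspects: { log: '/var/log/hadoop', port: 8020 } }
puts amenities_for(namenode)
```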
Knife Cluster: Lifecycle Management
• A Plugin for Opscode Chef Knife
• Deployment & Lifecycle Operations:
• launch, bootstrap, kill, start, stop
• Access and Chef Operations:
• ssh, kick, proxy
• Utilities:
• show, sync
Knife Launch Cluster, Facet or Instance[s]
• Launch all the nodes in a cluster
  knife cluster launch base0-master0
• Launch just a single instance of a single facet
  knife cluster launch base0-master0 master 0
• Launch all the instances of a facet
  knife cluster launch base0-master0 regionserver
Stop/Start Cluster, Facet or Instance[s]
• Stop the whole cluster
  knife cluster stop base0-master0
• Stop a single instance of a single facet
  knife cluster stop base0-master0 master 0
• Stop all instances of a facet
  knife cluster stop base0-master0 regionserver
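The launch and stop commands share one targeting scheme: a cluster name, optionally narrowed by facet and then by instance index. A minimal sketch of that selection logic in plain Ruby follows; `select_servers` and the server hashes are hypothetical and only mimic the argument shape of the commands, not the plugin's actual code.

```ruby
# Toy selector mirroring the argument shape CLUSTER [FACET [INDEX]].
# Not Ironfan code; the server records below are made up.
def select_servers(servers, cluster, facet = nil, index = nil)
  servers.select do |s|
    s[:cluster] == cluster &&
      (facet.nil? || s[:facet] == facet) &&
      (index.nil? || s[:index] == Integer(index))
  end
end

servers = [{ cluster: 'base0-master0', facet: 'master', index: 0 }] +
          (0...7).map { |i| { cluster: 'base0-master0', facet: 'regionserver', index: i } }

whole_cluster = select_servers(servers, 'base0-master0')
one_facet     = select_servers(servers, 'base0-master0', 'regionserver')
one_instance  = select_servers(servers, 'base0-master0', 'master', '0')

puts "cluster: #{whole_cluster.size}, facet: #{one_facet.size}, instance: #{one_instance.size}"
```

Each extra argument only narrows the previous selection, which is why the same command form works at all three levels.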
Same Knife Command to Launch Vagrant[s]
• Can use the same cluster configurations and knife command to launch Vagrants
  knife cluster vagrant up base0-master0
• Still Experimental
Ironfan-CI
• Jenkins based Continuous Integration of Clusters
• Still Experimental
• Uses Discovery to automate baseline test creation
• Leverages Vagrant to create clean test environments
References
• Basic Chef Stuff: http://wiki.opscode.com/display/chef/Home
• Ironfan Screencast: http://vimeo.com/37279372
• Ironfan Wiki for the most complete info: https://github.com/infochimps-labs/ironfan/wiki
• The Forest for the Trees Photo - Ame Otoko: http://www.flickr.com/photos/ameotoko/5383225925/