Our experience joining a Hadoop deployment competition in Taiwan.
How We Lost the Etu Hadoop Competition
Evans Ye
2014.6.16
04/07/2023 Confidential | Copyright 2013 TrendMicro Inc. 1
This April, a Hadoop Competition hosted by Etu was announced
It’s about Hadoop deployment
I have a dream… to win that 150 grand
Our Team
• Fann Wu, Mammi Chang
  – Solid hardware-related knowledge
  – Know well how to tune performance on Hadoop clusters
• Evans Ye
  – Has some experience developing an automatic Hadoop deployment tool
Agenda
• The preliminary
  – Winning criteria
  – What we’ve prepared
• The final
  – Winning criteria
  – What we’ve prepared
• Why we lost the competition
• Lessons learned
The preliminary
• Deploy an all-in-one Hadoop EC2 instance
Criteria to win the preliminary
• NameNode daemon exists
• Put a 100MB file up to HDFS
• YARN daemons exist
• Run a pi job
• ZooKeeper daemon exists
• HBase daemon exists
• Run HBase put and scan
• Run a Pig script
• Run a Hive query
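A checklist like this lends itself to a scripted smoke test. The following is only a minimal sketch, assuming the `hdfs` and `hive` command-line tools are on the PATH of the instance. `run_check`, `preliminary_checks` and the test file path are hypothetical names, and this is not the organizers' actual grader.

```bash
#!/usr/bin/env bash
# Hypothetical smoke-test helper; not the organizers' actual grader.

# run_check NAME CMD...: run CMD quietly and report PASS or FAIL for NAME.
run_check() {
  local name=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS $name"
  else
    echo "FAIL $name"
    return 1
  fi
}

# Example checks mirroring a few of the criteria. Call this on the instance
# itself; the 100MB test file path is an assumption.
preliminary_checks() {
  local failed=0
  dd if=/dev/zero of=/tmp/100mb.bin bs=1M count=100 2>/dev/null
  run_check "namenode daemon" pgrep -f NameNode || failed=1
  run_check "hdfs put" hdfs dfs -put /tmp/100mb.bin /tmp/ || failed=1
  run_check "hive query" hive -e 'SHOW TABLES;' || failed=1
  return "$failed"
}
```

The remaining criteria (pi job, HBase put and scan, Pig script) would follow the same pattern with whatever commands your distribution provides.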
And the most important one: finish time
Prepare for the fight
What we prepared
• To achieve the fastest finish time, we needed to practice over and over.
  – Vagrant-based scripts to simulate the AWS environment
  – A shell script that automatically provisions an all-in-one Hadoop
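A practice round of this kind can be timed with a small wrapper. This is a sketch only, assuming the vagrant-aws plugin is installed and a Vagrantfile with a provisioner is in the current directory. `practice_run` and the `VAGRANT` override are hypothetical names used for illustration.

```bash
#!/usr/bin/env bash
# Sketch of a timed practice run. VAGRANT is overridable so the function can
# be dry-run; the "aws" provider name comes from the vagrant-aws plugin.
VAGRANT=${VAGRANT:-vagrant}

# practice_run: bring the VM up (which also runs the provisioner), print the
# elapsed seconds, then destroy the VM so the next run starts from scratch.
practice_run() {
  local start=$SECONDS
  "$VAGRANT" up --provider=aws || return 1
  echo "elapsed: $((SECONDS - start))s"
  "$VAGRANT" destroy -f
}
```

Calling `practice_run` repeatedly gives a rough finish time for each change to the provisioning script.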
Vagrant
• An open source command-line VM provisioning tool
  – http://www.vagrantup.com/
• Supports VirtualBox, VMware, AWS and more as VM providers
• Supports shell, Puppet and Chef for provisioning
• Previous sharing
Vagrant-aws plugin
• https://github.com/mitchellh/vagrant-aws
• Vagrantfile
Provision script
• Jazz Wang already leaked the script to provision an all-in-one Hadoop on Ubuntu at OSDC.TW
  – Package-based deployment (you can also start from tarballs)
Our hack #1
• Use a self-cloned S3 repo instead of worldwide public repos
  – Avoid SPOF
  – Co-located in the Singapore region to speed up network transmission
Our hack #2
• The evil /usr/lib/hadoop/libexec/init-hdfs.sh
Our hack #2
• /usr/lib/hadoop/libexec/init-hdfs.sh
  – An HDFS directory bootstrap script
    • /user/hbase, /tmp, /var/log/hadoop-yarn/apps…
  – Executes lots of hadoop shell commands
    • HELL SLOW!
  – BIGTOP-952 attempts to solve it by calling the HDFS API directly using Groovy
  – Our hack is to concatenate similar commands into one command
    • hadoop fs -mkdir -p /tmp /var/log /tmp/hadoop-yarn
    • 50 → 15 calls
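The saving comes from the fact that every `hadoop fs` invocation starts a new JVM, so fewer invocations means fewer JVM startups. A minimal sketch of the two patterns follows; the function names and the `HADOOP` override are hypothetical, and this is not the actual patch we applied to init-hdfs.sh.

```bash
#!/usr/bin/env bash
# Sketch of the batching idea; HADOOP is overridable for dry runs.
HADOOP=${HADOOP:-hadoop}

# One JVM launch per directory: the slow pattern.
mkdir_one_by_one() {
  local dir
  for dir in "$@"; do
    "$HADOOP" fs -mkdir -p "$dir"
  done
}

# One JVM launch for all directories: the batched pattern.
mkdir_batched() {
  "$HADOOP" fs -mkdir -p "$@"
}
```

For example, `mkdir_batched /tmp /var/log /tmp/hadoop-yarn` issues the single command shown on the slide.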
Our hack #3
• Run HDFS, HBase, Pig and Hive test cases in parallel
  – (hdfs test case here) &
  – (hbase test case here) &
  – (…) &
  – wait
  – send my score
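The fan-out and wait pattern can be sketched as follows. The `test_*` functions are hypothetical placeholders for the real test cases, and `run_parallel` is a name introduced here for illustration. One detail worth noting: a bare `wait` with no arguments always returns zero, so this sketch waits on each PID separately to notice a failed test.

```bash
#!/usr/bin/env bash
# Sketch of the fan-out/wait pattern. The test_* functions are hypothetical
# placeholders for the real HDFS, HBase, Pig and Hive test cases.
test_hdfs()  { sleep 1; echo "hdfs done"; }
test_hbase() { sleep 1; echo "hbase done"; }
test_pig()   { sleep 1; echo "pig done"; }
test_hive()  { sleep 1; echo "hive done"; }

# run_parallel FUNC...: start every function in the background, then wait for
# each PID so that a failure in any of them is noticed.
run_parallel() {
  local pids=() pid failed=0 fn
  for fn in "$@"; do
    "$fn" &
    pids+=("$!")
  done
  for pid in "${pids[@]}"; do
    wait "$pid" || failed=1
  done
  return "$failed"
}
```

A run would then look like `run_parallel test_hdfs test_hbase test_pig test_hive && echo "send my score"`.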
Pretty good result on the preliminary
The Final
Evans: GJ, let’s get some rest
• 2 weeks gone
The Final
Criteria to win the final
• held on 5/31 at Etu’s building
Criteria to win the final
• Deployment completeness (20%)
  – ZooKeeper, HDFS, YARN deployed
• High availability verification (20%)
  – NameNode HA using JournalNodes
• System security verification (10%)
  – Kerberos enabled
• Runtime performance (30%)
  – DFSIO (write throughput)
  – TeraSort (sort speed)
  – HBaseEvaluation (HBase write throughput)
Environment
• Hardware
• Software
Summarize things we need to do
• This time, finish time doesn’t matter. We need to focus on correctness and performance
  – Choose a Hadoop deployment tool that supports
    • NameNode HA
    • Kerberos
    • YARN
  – Figure out how to get the best performance on YARN and VirtualBox
Choosing the deployment tool
• Cloudera Manager
  – You need to install/configure Kerberos yourself
• Ambari
  – “Claims” to support Kerberos, but actually does not
• Bigtop
  – Does have Kerberos and NameNode HA Puppet recipes, but they are currently kind of buggy
• Hadooppet
  – Need to implement YARN deployment
Cloudera Manager
…Kerberos installation/configuration is on your own
Ambari has great UI design, but…
Comparison
Deployment Tool | Namenode HA | Kerberos | YARN | Hadoop distro | Troubleshooting
Cloudera Manager | YES | NO | YES | Hadoop 2.3.0 (CDH5) | HARD
Ambari | YES | NO (enable failed) | YES | Hadoop 2.4.0 (HDP2.1) | HARD
Bigtop | NO (NFS) | NO (buggy) | YES | Hadoop 2.0.6-alpha (bigtop-0.7.0) | MIDDLE
Hadooppet | YES | YES | NO | Hadoop 2.3.0 (CDH5) | EASY
勝 勝 (win, win)
Getting our deployment tool ready
Trap#1
• Got “connection refused” from JournalNodes while formatting NameNodes
• The root cause
  – When a hostname is defined in the Vagrantfile
  – Vagrant sets up the VM’s hostname AND /etc/hosts
  – This leads JournalNodes to listen on 127.0.0.1, which results in the connection refused error while formatting NameNodes
• The fix
  – cat /dev/null > /etc/hosts
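To spot this trap before deploying, one can check whether the machine's hostname is mapped to a loopback address. A minimal sketch follows; `maps_to_loopback` is a hypothetical helper, not part of Vagrant or Hadoop, and the hosts file path is a parameter so the check can be tried on a copy.

```bash
#!/usr/bin/env bash
# Sketch: detect a hostname that resolves to loopback via the hosts file.

# maps_to_loopback HOSTNAME [HOSTS_FILE]: succeed if HOSTNAME appears on a
# line whose address starts with 127. in the hosts file.
maps_to_loopback() {
  local host=$1 file=${2:-/etc/hosts}
  awk -v h="$host" '
    /^[[:space:]]*#/ { next }   # skip comment lines
    $1 ~ /^127\./ {
      for (i = 2; i <= NF; i++) if ($i == h) found = 1
    }
    END { exit (found ? 0 : 1) }
  ' "$file"
}
```

For example, `maps_to_loopback "$(hostname)" && echo "hostname resolves to loopback"` flags the situation described above.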
Trap#2
• Kerberos database initialization failed because a timeout was exceeded
• The root cause
  – VirtualBox has poor entropy performance (Ticket #11297)
  – Kerberos DB init cannot get enough random data
  – Entropy is often collected from hardware sources for use in cryptography
Trap#2
• A quick test to get entropy
  – A Xen VM
  – A VirtualBox VM
• The fix
  – Set up the havege package, which improves entropy performance
    • havege official site, Installation
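On Linux, the kernel exposes its available-entropy estimate in /proc/sys/kernel/random/entropy_avail, which makes for a quick check of this kind. The sketch below is an illustration only: `entropy_ok` is a hypothetical helper and the 1000-bit threshold is an arbitrary assumption, not a value from our tests.

```bash
#!/usr/bin/env bash
# Sketch: read the kernel's available-entropy estimate. The default threshold
# of 1000 bits is an arbitrary assumption for illustration.
ENTROPY_FILE=${ENTROPY_FILE:-/proc/sys/kernel/random/entropy_avail}

# entropy_ok [MIN_BITS]: print the current estimate and succeed if it is at
# least MIN_BITS.
entropy_ok() {
  local min=${1:-1000} avail
  avail=$(cat "$ENTROPY_FILE") || return 2
  echo "entropy_avail=$avail"
  [ "$avail" -ge "$min" ]
}
```

Running `entropy_ok` before and after installing the entropy daemon shows whether the fix took effect.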
Performance Tuning
OS tuning
• Disable Transparent Huge Page (THP) compaction
  – echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
  – Impact reported
    • Hadoop, Oracle Linux and Splunk…
• Set vm.swappiness to zero
  – sysctl -w vm.swappiness=0
  – Avoids processes being swapped out even though free memory is available
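Neither setting survives a reboot when applied by hand, and the THP path differs between kernels: RHEL/CentOS 6 uses the redhat_ prefix, mainline kernels do not. A minimal sketch that handles both follows. The function names are hypothetical, and `apply_os_tuning` must run as root, for example from a boot script.

```bash
#!/usr/bin/env bash
# Sketch: apply both OS settings, picking whichever THP path exists.

# thp_enabled_file [SYSFS_MM_DIR]: print the THP "enabled" file that exists
# on this kernel, trying the RHEL 6 name first and the mainline name second.
thp_enabled_file() {
  local base=${1:-/sys/kernel/mm} name
  for name in redhat_transparent_hugepage transparent_hugepage; do
    if [ -e "$base/$name/enabled" ]; then
      echo "$base/$name/enabled"
      return 0
    fi
  done
  return 1
}

# apply_os_tuning: disable THP (if present) and set swappiness to zero.
apply_os_tuning() {
  local f
  f=$(thp_enabled_file) && echo never > "$f"
  sysctl -w vm.swappiness=0
}
```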
Virtualbox tuning
• Raw hard disk access
  – Directly access host disks from the guest VM
  – Create a VMDK file to represent the disk/partition
  – Mount it on the guest through the VirtualBox GUI
  – fdisk the newly added disk in the guest VM
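The VMDK creation step can be done from the command line with `VBoxManage internalcommands createrawvmdk`, as documented for the VirtualBox 4.x series current at the time. The wrapper below is a sketch: `create_raw_vmdk` and the `VBOXMANAGE` override are hypothetical names, and the paths in the usage example are placeholders.

```bash
#!/usr/bin/env bash
# Sketch: create the VMDK wrapper for a raw host disk from the command line.
# VBOXMANAGE is overridable for dry runs.
VBOXMANAGE=${VBOXMANAGE:-VBoxManage}

# create_raw_vmdk VMDK_FILE HOST_DISK: write a small VMDK descriptor that
# points at HOST_DISK. No data is copied. The calling user needs read/write
# access to the host disk.
create_raw_vmdk() {
  local vmdk=$1 disk=$2
  "$VBOXMANAGE" internalcommands createrawvmdk -filename "$vmdk" -rawdisk "$disk"
}
```

For example, `create_raw_vmdk /path/to/disk1.vmdk /dev/sdb` produces a file that can then be attached to the guest.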
YARN tuning
• HDFS cache for reads (available since 2.3.0)
• YARN:
  – yarn.nodemanager.resource.memory-mb
• MapReduce:
  – io.sort.mb
  – mapreduce.map.memory.mb
  – mapreduce.map.java.opts
  – mapreduce.map.speculative
  – …
  – Most properties are job specific
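The memory properties are linked by simple arithmetic: roughly, the NodeManager memory divided by the per-map container size gives the number of map containers that fit on a node, and the map JVM heap has to fit inside its container. The sketch below illustrates this. The function names are hypothetical and the 80% heap ratio is a common rule of thumb assumed here, not a value from our setup.

```bash
#!/usr/bin/env bash
# Sketch of the arithmetic linking the memory properties. The 80% heap ratio
# is an assumed rule of thumb.

# containers_per_node NM_MEMORY_MB MAP_MEMORY_MB: roughly how many map
# containers fit into yarn.nodemanager.resource.memory-mb at once.
containers_per_node() {
  echo $(( $1 / $2 ))
}

# map_java_opts MAP_MEMORY_MB: a heap size for mapreduce.map.java.opts that
# leaves headroom inside the container for non-heap memory.
map_java_opts() {
  echo "-Xmx$(( $1 * 80 / 100 ))m"
}
```

For example, `containers_per_node 12288 1024` prints 12 and `map_java_opts 1024` prints -Xmx819m.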
Deployment Architecture
VMs configuration
VM | RAM | CPU | DISK | daemons
VM1 | 7G | 3 vcpus | Local disk | Namenode, Resourcemanager
VM2 | 7G | 3 vcpus | Local disk | Namenode, Resourcemanager
VM3 | 15G | 8 vcpus | 1T raw disk *2 | Datanode, Nodemanager
VM4 | 15G | 8 vcpus | 1T raw disk *2 | Datanode, Nodemanager
total | 44G | 22 vcpus | 4T for hdfs | -
5/31 The Day
The check we’re so eager to win
And the result
WE LOST
The reason we lost
• VirtualBox performs sluggishly with hyper-threading
• To avoid that:
  – Disable hyper-threading
  – Set an equal number of cores for host and guest
• VMs != physical machines
  – We all assumed that hyper-threading helps performance a lot; at least it does on our Hadoop cluster
Poor support for multi-cores
• VMs with multiple vCPUs require that all allocated cores be free before processing can begin
  – Do not configure too many vCPUs for a single VM
  – A strong VM will not perform as well as you expect
The better architecture
VM | RAM | CPU | DISK | daemons
VM1 | 10G | 4 vcpus | 1T raw disk *1 | Namenode, Resourcemanager, Datanode, Nodemanager
VM2 | 10G | 4 vcpus | 1T raw disk *1 | Namenode, Resourcemanager, Datanode, Nodemanager
VM3 | 10G | 4 vcpus | 1T raw disk *1 | Datanode, Nodemanager
VM4 | 10G | 4 vcpus | 1T raw disk *1 | Datanode, Nodemanager
total | 40G | 16 vcpus (equal to physical cores) | 4T for hdfs | -
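The rule behind this layout, never hand out more vCPUs than the host has physical cores, is easy to check before booting the VMs. A minimal sketch follows; the function names are hypothetical, and the 16 physical cores are taken from the total row of the table above.

```bash
#!/usr/bin/env bash
# Sketch: compare the vCPUs handed out to VMs with the host's physical cores.

# total_vcpus VCPUS...: sum the vCPU counts of all VMs.
total_vcpus() {
  local sum=0 n
  for n in "$@"; do
    sum=$((sum + n))
  done
  echo "$sum"
}

# oversubscribed PHYSICAL_CORES VCPUS...: succeed if the VMs together claim
# more vCPUs than there are physical cores.
oversubscribed() {
  local cores=$1
  shift
  [ "$(total_vcpus "$@")" -gt "$cores" ]
}
```

With 16 physical cores, `oversubscribed 16 3 3 8 8` succeeds for our competition layout (22 vCPUs), while `oversubscribed 16 4 4 4 4` does not.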
How about Hadoop performance tuning?
• Everybody was pretty much using defaults, including the team that won the competition
• …
Lessons learned
• Don't judge too soon
• Don’t stay up for a week. If you do, you can’t make decisions wisely
• We need better project management
  – We spent too much time on tuning our deployment tool
  – We didn’t do many tests on different deployment architectures
Acknowledgments
• Thanks to Fann for sorting out all the small tasks
  – Packaging the box
  – Cloning repositories
  – Preparing the testing environment
• Thanks to Mammi for the great presentation on that day
Q&A