Oozie towards zero downtime
Hadoop Summit 2015/04/15
Purshotam Shah purushah@yahoo-inc.com
Ryota Egashira egashira@yahoo-inc.com
Agenda
● Introduction
  ■ Scale at Yahoo
  ■ Use Cases
  ■ Why zero downtime matters?
● Architectural Overview
● Technical Challenge
  ■ Security
  ■ Log Streaming
  ■ HCatalog Integration in HA
● Experiences
● Future Work
Yahoo Confidential & Proprietary
Why Oozie?
The Problem
▪ Doing something on the grid often required multiple steps
  ▪ MapReduce job
  ▪ Pig job
  ▪ Streaming job
  ▪ HDFS operation (mkdir, chmod, etc)…
▪ Multiple ad-hoc solutions existed
  ▪ custom job control
  ▪ cron…
The Need
▪ Workflow scheduler with better support for grid jobs (native integration with Hadoop)
  ▪ orchestrate dependency between jobs
  ▪ execute at specific time or on data availability
  ▪ retry jobs in the event of failures (reliable)
▪ Common framework for communication and execution of production process
  ▪ sync (clocked dataset) awareness
  ▪ async (unspecified freq) data awareness
A server-based workflow scheduling system to manage Hadoop jobs
Scale at Yahoo
▪ Deployed on all clusters (production, non-production)
  ▪ One instance per cluster
▪ 75 products / 2000+ projects
  ▪ 255 monthly users
▪ 28.9 million Hadoop jobs monthly (Jan 2015, total)
  ▪ 72% from Oozie (including launcher jobs)
▪ 108,000 workflow jobs daily (Feb 2015, one busy cluster)
  ▪ Between 1-8 actions; avg. 4 actions/workflow
  ▪ Extreme use case: submit 100-200 workflow jobs per min
▪ 1,700 coordinator jobs daily (Feb 2015, one busy cluster)
  ▪ Frequency: 5, 10, 15 mins, hourly, daily, weekly, monthly (25%: < 15 min)
  ▪ 67% of workflow jobs kicked from coordinator
▪ 60 bundle jobs daily (Feb 2015, one busy cluster)
Hadoop Jobs on the Platform
[Figure: job distribution (Jan 2015)]
Y! business processed by Oozie
Advertisement: Ad Exchange, Ad Latency, Search Advertising
Content: Content Agility, Content Optimization, Content Personalization, Flickr Video
Targeting: Audience Targeting, Behavioral Targeting, Partner Targeting, Retargeting, Web Targeting
Y! business processed by Oozie
Email: Anti Spam, Content, Retargeting
Data Intelligence: Research, Dashboards & Reports, Forecasting
Data Management: Audience Event Pipeline
Use Case - Data pipeline
Number of actions created hourly
[Figure: number of actions created per hour over one day; x-axis is time of day]
Number of actions created per minute
SCALE
▪ At certain points in time, all of the 5 min, 15 min, 30 min, hourly,
daily, and monthly coordinator jobs fire together, producing a
burst of coordinator actions that a single host can't handle.
▪ We noticed processing delays, and customers complained
about slowness.
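The collision described above can be illustrated with a little arithmetic. The following is a minimal sketch, not Oozie code: it assumes one hypothetical coordinator job per frequency, all starting at minute 0 (daily is taken as 1440 minutes, and monthly is left out because its length varies), and counts how many actions would be created at each minute of a day.

```python
from collections import Counter

# Coordinator frequencies in minutes, taken from the slide.
# Assumption: one hypothetical job per frequency, all starting at minute 0.
FREQUENCIES_MIN = [5, 15, 30, 60, 1440]

def actions_per_minute(frequencies, horizon_min):
    """Count coordinator actions created at each minute in [0, horizon_min)."""
    counts = Counter()
    for freq in frequencies:
        for minute in range(0, horizon_min, freq):
            counts[minute] += 1
    return counts

counts = actions_per_minute(FREQUENCIES_MIN, horizon_min=1440)
peak_minute, peak = max(counts.items(), key=lambda kv: kv[1])
print(f"peak: {peak} actions at minute {peak_minute}")
print(f"actions at minute 60: {counts[60]}, at minute 5: {counts[5]}")
```

In this toy model every frequency lines up at the day boundary, and most of them line up again at the top of each hour. With many jobs per frequency, the same alignment turns into the burst described on the slide.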
Why Downtime matters? Downtime needed
Oozie upgrade (major release: > 1 per quarter; minor release: > 1 per month)
Why Downtime matters? Downtime needed
Dependent Hadoop Projects Upgrade (YARN, HDFS, Hive, HBase, etc)
[Diagram: Oozie and the Hadoop projects it depends on: YARN, HDFS, Hive, HBase, Pig, HCatalog]
Why Downtime matters? Downtime needed
Configuration error / change
Why Downtime matters? Downtime needed
Hardware error / upgrade
Why Downtime matters? Customers
Revenue-impacting applications need to run all the time, with no delay!
Why Downtime matters? Ops
Ops: under pressure to minimize downtime
Solution: High Availability
● Definition: failure of a component != failure of the entire
system
o achieved by removing single points of failure
● Requirement: transparency to users
o Users should not need to know whether it is HA or not
o No change in API or usage pattern
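Architecture
[Diagram: a submitted request goes to the load balancer, which redirects it to one of Oozie Server 1 to Oozie Server n. The Oozie servers communicate with each other, share an RDB, coordinate through Zookeeper (via Curator), and run jobs on the Hadoop cluster.]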
Architectural Overview: Database
● Oozie stores most of its state in a database
o (submitted jobs, workflow definitions, etc)
● An Oracle database (2 racks) in HA (hot-warm) is used.
● Zookeeper (Curator) is used for coordination
Architectural Overview: Access
● Users and client programs need a single address to
connect to
o Web UI, REST/Java API,
JobTracker/ResourceManager callbacks, etc
● A virtual IP (VIP) is used as the user-facing URL.
Architectural Overview: Security
We use Kerberos and some internal security systems for communication
among components.
Architectural Overview: Authentication
[Diagram: the HA architecture (load balancer, Oozie Server 1 to Oozie Server n, RDB, Zookeeper/Curator, Hadoop cluster) with each link annotated with its security mechanism. Labels on the diagram: "https + kerberos / cookie-based auth", "https + kerberos", and "Kerberos". Inter-server communication is used for log streaming etc.; Zookeeper is used for locks and management.]
Technical Challenge: Log Streaming
● Each Oozie server only has access to its own logs
● Jobs can execute on any server
o Job execution can switch among servers
● Users need to see one sequential log rather than separate
server1 and server2 logs.
Architecture: Log Streaming
1. User request comes to server1
2. server1 fetches the server list from ZK
3. server1 calls all other servers to fetch their logs and merges
them using the log timestamps
4. The merged log is displayed to the user
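The merge in step 3 can be sketched as a k-way merge of per-server log streams that are each already sorted by time. This is only an illustration: the log line format (a leading `YYYY-MM-DD HH:MM:SS,mmm` timestamp), the server names, and the helper names are assumptions, not Oozie's actual implementation.

```python
import heapq
from datetime import datetime

def parse_timestamp(line):
    """Extract the leading 'YYYY-MM-DD HH:MM:SS,mmm' timestamp (assumed format)."""
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")

def merge_logs(logs_by_server):
    """Merge per-server log lines (each list sorted by time) into one stream."""
    streams = logs_by_server.values()
    return list(heapq.merge(*streams, key=parse_timestamp))

# Hypothetical logs for one job whose execution moved between two servers.
logs = {
    "server1": [
        "2015-04-15 10:00:01,120 INFO action A started",
        "2015-04-15 10:00:09,450 INFO action B started",
    ],
    "server2": [
        "2015-04-15 10:00:05,300 INFO action A finished",
        "2015-04-15 10:00:12,010 INFO action B finished",
    ],
}

merged = merge_logs(logs)
for line in merged:
    print(line)
```

Because `heapq.merge` only looks at the head of each stream, the sketch would also work with lazily fetched streams, provided the server clocks are reasonably synchronized.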
Caveat: Log Streaming
If an Oozie server goes down, any logs from it will be unavailable.
Technical Challenge: HCatalog Integration
• Hive Metastore (HCatalog): manages metadata for datasets
– Oozie registers for datasets with HCatalog
– Oozie receives notifications from HCatalog through JMS (e.g., ActiveMQ)
– Oozie starts the job immediately after the data becomes ready
[Diagram: Oozie Server 1, Oozie Server 2, JMS (e.g., ActiveMQ), and the job, with the steps "Register Topic", "Push notification <New Partition>", and "Notify New Partition"]
Technical Challenge: Hive Metastore Integration
• Oozie maintains an in-memory list of datasets that need
notification.
• A notification comes to only one server.
• When a notification comes to one server, Oozie needs to
invalidate the cache in all other servers.
• This is done by a periodic task on each server that checks
the job status for each dataset; if the job is no longer
waiting, it removes the dataset from the cache.
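The periodic cleanup described above can be sketched as follows. This is a simplified illustration, not Oozie's code: the cache layout, the status strings, and the `get_status` callback (standing in for a lookup in the shared database) are all assumptions.

```python
WAITING = "WAITING"  # assumed name for the status of an action waiting for data

def prune_cache(cache, get_status):
    """Drop cached dataset entries whose actions are no longer waiting.

    cache: dict mapping dataset URI -> set of action ids waiting on it.
    get_status: callable returning the current status of an action id,
                e.g. by reading the shared database.
    Returns the list of dataset URIs removed from the cache.
    """
    removed = []
    for dataset in list(cache):  # copy the keys so we can delete while iterating
        still_waiting = {a for a in cache[dataset] if get_status(a) == WAITING}
        if still_waiting:
            cache[dataset] = still_waiting
        else:
            del cache[dataset]
            removed.append(dataset)
    return removed

# Hypothetical state: another server already handled the notification for
# the first dataset, so its action has moved past WAITING in the database.
statuses = {"action-1": "RUNNING", "action-2": WAITING}
cache = {
    "hcat://db/table/dt=20150415": {"action-1"},
    "hcat://db/table/dt=20150416": {"action-2"},
}

removed = prune_cache(cache, statuses.get)
print(removed)
```

In this sketch the check relies only on shared state, so no server-to-server invalidation message is needed; each server would simply run such a task on a timer.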
Technical Challenge: Hive Metastore Integration
[Diagram: Oozie Server 1, Oozie Server 2, JMS (e.g., ActiveMQ), and the job, with the steps "2. Register Topic", "3. Push notification <New Partition>", and "4. Notify New Partition"; a periodic check on the servers removes the registration]
Challenges
● Distributed Job ID
o Maintain a distributed sequence number for Job IDs using
Apache Curator + Zookeeper
● Zookeeper Failure Handling
o Oozie servers automatically shut down when Zookeeper is
down
● Sharelib
o Support sharelib update in HA
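A distributed sequence number of this kind is commonly built on an atomic compare-and-set loop. The sketch below is an in-memory stand-in written for illustration only: `VersionedStore` is a hypothetical substitute for a shared Zookeeper node, and the job ID format is invented for the example; it is not Curator's API or Oozie's actual ID scheme.

```python
import threading

class VersionedStore:
    """Hypothetical stand-in for a shared node holding a versioned value."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value, self._version = 0, 0

    def read(self):
        with self._lock:
            return self._value, self._version

    def compare_and_set(self, expected_version, new_value):
        """Write only if nobody else has written since we read."""
        with self._lock:
            if self._version != expected_version:
                return False
            self._value, self._version = new_value, self._version + 1
            return True

def next_job_id(store, server_name):
    """Claim the next sequence number, retrying on concurrent updates."""
    while True:
        value, version = store.read()
        if store.compare_and_set(version, value + 1):
            # Invented ID format, for illustration only.
            return f"{value:07d}-{server_name}-W"

store = VersionedStore()
ids = []

def submit(server_name, n):
    for _ in range(n):
        ids.append(next_job_id(store, server_name))

# Two "servers" (threads) hand out job IDs concurrently.
threads = [threading.Thread(target=submit, args=(f"server{i}", 100)) for i in (1, 2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

sequence_numbers = sorted(int(job_id.split("-")[0]) for job_id in ids)
print(len(ids), sequence_numbers[:3])
```

The point of the retry loop is that two servers can never claim the same number: whichever write lands second sees a stale version and tries again with the updated value.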
More Challenges
• SLA support
– Oozie has an in-memory data structure to track SLA status
for each job (start/duration/end met/miss and
notifications)
– add a check of SLA status against the database
– use a ZK lock to synchronize updates on the same job
from multiple servers
• Distributed Locks
– Reentrant distributed lock using Apache Curator +
Zookeeper
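The locking pattern can be illustrated locally. The sketch below uses Python's `threading.RLock` as a single-process stand-in for a reentrant distributed lock; the per-job lock table, the `sla_db` dictionary (standing in for the database), and the function names are assumptions made for the example.

```python
import threading

_table_guard = threading.Lock()
_job_locks = {}       # job id -> reentrant lock (stand-in for a ZK lock per job)
sla_db = {}           # stand-in for the SLA table in the database
notifications = []

def lock_for(job_id):
    """Return the single reentrant lock associated with a job id."""
    with _table_guard:
        return _job_locks.setdefault(job_id, threading.RLock())

def notify(job_id, event, status):
    # Re-acquiring a lock already held by the same thread does not
    # deadlock; that is what "reentrant" means.
    with lock_for(job_id):
        notifications.append((job_id, event, status))

def record_sla_event(job_id, event, status):
    """Record an SLA event once, even if several servers process it."""
    with lock_for(job_id):
        # Check the shared state instead of trusting local memory.
        if (job_id, event) in sla_db:
            return False
        sla_db[(job_id, event)] = status
        notify(job_id, event, status)
        return True

# Two "servers" (threads) race to record the same event for the same job.
results = []

def worker():
    results.append(record_sla_event("job-1", "START", "MET"))

threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(results), notifications)
```

Only one of the two callers records the event and sends the notification; the other finds the entry already present once it obtains the lock.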
Experiences
• HA has been running on all production grids for > 7 months at Yahoo!
– Stable!
Issues
– Zookeeper down
(when upgrading zk quorum h/w)
– Server going out of sync
(during upgrade, sharelib)
Benefits
▪ Zero downtime for applications
▪ Rolling upgrade (zero downtime)
› Maintenance upgrade
› Configuration upgrade
▪ No more materialization delay
Workflow Job Submission Throughput
Future work
• Faster job fail-over
– currently waits for a thread (Recovery Service) to pick up
non-progressing jobs every few minutes
– an Oozie server should immediately notice when another
server is down and fail over its jobs (e.g., using a ZK
watcher)
• Improve log streaming
Acknowledgement
Robert Kanter
Olga L. Natkovich
Rohini Palaniswamy
Michelle Chiang
Jacob Tolar
Sumeet Singh