Oozie meetup Hadoop Summit 2014


by Bowen Zhang (Hortonworks)


Oozie Meetup

Bowen Zhang

Agenda

● Oozie Log HA
● Oozie cron scheduling
  ○ Use cases
  ○ Troubleshooting
● Prospective 4.1 release

Oozie Log HA

● HA implemented already
  ○ Server
  ○ HCat
  ○ SLA
  ○ Sharelib
● Remaining piece
  ○ Log streaming: if a node is down, all oozie.log content on that node is not accessible

Proposed Solution

● YARN faced the same issue with log streaming and retrieval
● Currently, YARN puts container logs on HDFS when log aggregation is enabled
● Oozie can duplicate its logs onto HDFS
  ○ Log directly to HDFS
  ○ Copy the log onto HDFS during log rotation

Direct logging to HDFS

● Pros
  ○ Complete log HA with 100% accuracy of the content
● Cons
  ○ Could be hard to implement and may need significant changes to the Oozie logging mechanism
  ○ Introduces a strict dependency on HDFS
  ○ Potential server performance issues

Copy log rotation

● Pros
  ○ Easy to implement without significant changes to the Oozie logging structure
  ○ Fewer performance issues
● Cons
  ○ There is always a window of less than one hour during which the newest log content is unavailable, because of the rotation schedule
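The copy-on-rotation idea can be sketched roughly as follows. This is only an illustration, not the proposed Oozie implementation: the rotated file naming (`oozie.log-YYYY-MM-DD-HH`), the HDFS target directory, and the helper name `plan_copies` are all assumptions. The sketch only builds `hdfs dfs -put` command lines, it does not run them.

```python
import re

# Assumed naming scheme for hourly rotated logs, e.g. "oozie.log-2014-06-03-14".
ROTATED = re.compile(r"^oozie\.log-\d{4}-\d{2}-\d{2}-\d{2}$")


def plan_copies(local_files, already_on_hdfs, host, hdfs_dir="/oozie/logs"):
    """Return `hdfs dfs -put` commands for rotated logs not yet copied.

    local_files     -- file names found in the local Oozie log directory
    already_on_hdfs -- set of file names already present under hdfs_dir/host
    host            -- server name, used to keep each node's logs apart
    hdfs_dir        -- hypothetical HDFS base directory for Oozie logs
    """
    commands = []
    for name in sorted(local_files):
        if not ROTATED.match(name):
            continue  # skip the active oozie.log, which is still being written
        if name in already_on_hdfs:
            continue  # copied during an earlier rotation
        dest = f"{hdfs_dir}/{host}/{name}"
        commands.append(["hdfs", "dfs", "-put", name, dest])
    return commands
```

Because the active oozie.log is skipped, the most recent (less than one hour of) content stays only on the local node, which is exactly the unavailability window listed under Cons.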

Other ideas?

Other ideas are always welcome.
E.g. putting the logs into the DB?
E.g. integrating ZooKeeper to solve this problem?

Coordinator Cron Scheduling

Various cron syntaxes exist; Unix cron syntax is only one of them.
● Oozie cron has 5 fields, since Oozie operates on a per-minute basis
● Weekdays start at 2, which is Monday
● Complicated overflowing ranges are discouraged
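To make the field layout and the weekday numbering concrete, here is a small sketch. The helper names are hypothetical. The numbering 1 = SUN through 7 = SAT is an assumption inferred from "Monday is 2", and the field order (minute, hour, day of month, month, day of week) is inferred from the example expressions in this deck.

```python
# Assumed day-of-week numbering: 1 = SUN, 2 = MON, ..., 7 = SAT.
DAY_NUMBER = {
    name: number
    for number, name in enumerate(
        ["SUN", "MON", "TUE", "WED", "THU", "FRI", "SAT"], start=1
    )
}

# Assumed field order of a 5-field frequency expression.
FIELDS = ["minute", "hour", "day_of_month", "month", "day_of_week"]


def split_frequency(expr):
    """Split a frequency string into its five named fields."""
    parts = expr.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {expr!r}")
    return dict(zip(FIELDS, parts))


def names_to_numbers(dow_field):
    """Replace day names in a day-of-week field with their numbers."""
    out = dow_field.upper()
    for name, number in DAY_NUMBER.items():
        out = out.replace(name, str(number))
    return out
```

For example, `names_to_numbers("MON-FRI")` gives `"2-6"`, the numeric weekday range.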

Use cases

● A job running at 9am every weekday
  ○ frequency="0 9 * * 2-6" or "0 9 * * MON-FRI"
  ○ Notice that the first expression uses 2-6 instead of 1-5
● A job running every 15 minutes from 9 to 11am every day
  ○ frequency="0/15 9,10 * * *" or "0,15,30,45 9,10 * * *"
  ○ Notice that the hour field should be 9,10 instead of 9-11
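The reason the hour field is 9,10 rather than 9-11 becomes visible when the minute and hour fields are expanded into actual fire times. The following is a simplified sketch, not Oozie's parser: it handles only the forms used on these slides (`*`, single values, `a-b`, `a/step`, and comma lists), and the function names are made up for illustration.

```python
def expand(field, lo, hi):
    """Expand one cron field into the sorted list of values it matches.

    Supports '*', 'a', 'a-b', 'a/step', '*/step' and comma-separated
    combinations of those. lo and hi are the field's valid bounds.
    """
    values = set()
    for part in field.split(","):
        if "/" in part:
            start, step = part.split("/")
            first = lo if start == "*" else int(start)
            values.update(range(first, hi + 1, int(step)))
        elif part == "*":
            values.update(range(lo, hi + 1))
        elif "-" in part:
            a, b = part.split("-")
            values.update(range(int(a), int(b) + 1))
        else:
            values.add(int(part))
    return sorted(values)


def fire_times(minute_field, hour_field):
    """List the HH:MM times within one day at which the job would fire."""
    return [
        f"{h:02d}:{m:02d}"
        for h in expand(hour_field, 0, 23)
        for m in expand(minute_field, 0, 59)
    ]
```

With `fire_times("0/15", "9,10")` the last entry is 10:45, so the job stops before 11am. With an hour field of `9-11` the list would continue through 11:45.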

Use cases continued

● A job running at 9am on the last Friday of every month
  ○ frequency = "0 9 * * 6L"
● A job running at 9am on the 2nd Friday of every month
  ○ frequency = "0 9 * * 6#2"
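To check which dates `6L` and `6#2` are meant to select, the last and the n-th Friday of a month can be computed with the Python standard library. This is only a cross-check for the expressions above. It does not use or emulate Oozie, and the function names are illustrative.

```python
import calendar
from datetime import date


def fridays(year, month):
    """Return every Friday of the given month as date objects."""
    days_in_month = calendar.monthrange(year, month)[1]
    return [
        date(year, month, day)
        for day in range(1, days_in_month + 1)
        if date(year, month, day).weekday() == calendar.FRIDAY
    ]


def last_friday(year, month):
    """Date that a '6L' day-of-week field is meant to select."""
    return fridays(year, month)[-1]


def nth_friday(year, month, n):
    """Date that a '6#n' field is meant to select, or None if absent."""
    candidates = fridays(year, month)
    return candidates[n - 1] if n <= len(candidates) else None
```

For June 2014, the last Friday is the 27th and the 2nd Friday is the 13th. A month with only four Fridays has no 5th Friday, so `nth_friday` returns `None` for it.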

General mistakes

● The Oozie timezone is UTC by default, so your cron expression must account for the timezone difference
  ○ If you live in LA and want to run a job at 9am every day:
    ■ frequency = "0 9 * * *" is wrong!!!
    ■ frequency = "0 16 * * *" is the right one (9am PDT is 16:00 UTC)
  ○ We understand the inconvenience, but that’s the Oozie server timezone.
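The LA example can be worked out with fixed UTC offsets. This sketch assumes PDT = UTC-7 and PST = UTC-8 and uses hypothetical helper names. Note that a UTC-based cron expression does not follow daylight saving time, so the same local hour maps to a different UTC hour in winter.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for Los Angeles: daylight time (summer) and standard time.
PDT = timezone(timedelta(hours=-7))
PST = timezone(timedelta(hours=-8))


def utc_hour(local_hour, tz):
    """Convert a local wall-clock hour to the corresponding UTC hour."""
    # The calendar date is arbitrary; only the offset matters here.
    local = datetime(2014, 6, 3, local_hour, 0, tzinfo=tz)
    return local.astimezone(timezone.utc).hour


def daily_frequency(local_hour, tz):
    """Build a run-every-day frequency for the given local hour."""
    return f"0 {utc_hour(local_hour, tz)} * * *"
```

`daily_frequency(9, PDT)` gives `"0 16 * * *"`, matching the slide, while `daily_frequency(9, PST)` gives `"0 17 * * *"`. For expressions that restrict the day fields, a conversion that crosses midnight would also shift the day, which this sketch does not handle.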

General mistakes continued

● Day of month and day of week form a union, not an intersection
  ○ "0 10 12-18 * 2-6" DOES NOT MEAN running a job at 10 am on weekdays between the 12th and 18th of the month
  ○ It MEANS running a job at 10 am on every weekday AND on every day from the 12th to the 18th of the month. Only one of the two needs to be satisfied for the job to fire.
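The union rule can be illustrated with a small date check for the case where both fields are restricted. This is a sketch of the semantics described above, not Oozie code. It assumes the day-of-week numbering 1 = SUN through 7 = SAT, and the function names are hypothetical.

```python
from datetime import date


def cron_dow(d):
    """Map a date to the assumed cron numbering: 1 = SUN ... 7 = SAT."""
    # isoweekday() gives MON = 1 ... SUN = 7, so shift by one and wrap.
    return d.isoweekday() % 7 + 1


def fires_on(d, days_of_month, days_of_week):
    """Union semantics: either field matching is enough to fire."""
    return d.day in days_of_month or cron_dow(d) in days_of_week


def fire_days(year, month, last_day, days_of_month, days_of_week):
    """Days of the month (1..last_day) on which the job fires."""
    return [
        day
        for day in range(1, last_day + 1)
        if fires_on(date(year, month, day), days_of_month, days_of_week)
    ]
```

For "0 10 12-18 * 2-6" in June 2014, `fire_days(2014, 6, 30, range(12, 19), range(2, 7))` contains 23 days: all 21 weekdays plus Saturday the 14th and Sunday the 15th. An intersection would have produced only 5 days.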

General Mistakes Continued

● An overflow range goes wild
  ○ "0 23-1 * * *" is reasonable
  ○ "0 23-1 * DEC-MAR FRI-MON". What does this mean?
    ■ Cron can produce many different edge cases in the above scenario. Do it at your own peril!
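One plausible reading of an overflow range is that it wraps around the end of the field's valid values. The sketch below expands ranges under that assumption. It is not a statement of how Oozie actually resolves them, and the function name is made up for illustration.

```python
def expand_wrapping(start, end, lo, hi):
    """Expand 'start-end', wrapping past hi back to lo when start > end."""
    if start <= end:
        return list(range(start, end + 1))
    return list(range(start, hi + 1)) + list(range(lo, end + 1))


# Hour field 23-1, valid hours 0..23.
hours = expand_wrapping(23, 1, 0, 23)      # [23, 0, 1]
# Month field DEC-MAR, valid months 1..12.
months = expand_wrapping(12, 3, 1, 12)     # [12, 1, 2, 3]
# Day-of-week field FRI-MON, assuming 1 = SUN ... 7 = SAT.
weekdays = expand_wrapping(6, 2, 1, 7)     # [6, 7, 1, 2]
```

Even under this simple reading, the second expression spans two calendar years, and its 23:00 run falls on a different day than its 00:00 and 01:00 runs. That makes the combined schedule hard to reason about.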

4.1 Release

We have around 200 patches for this release and it’s been a while!
● All HA related work
● Cron scheduling for coordinator jobs
● Consolidation of JPAExecutors (not user facing)
● Introduction of SharelibService (some user-facing backward incompatibility issues)

4.1 Release continued

● Better integration with YARN RM HA and restart
● Oozie Sqoop CLI functionality
● More versatile coordinator job functionality
● Major overhaul of coordinator job execution order
