Stream Upload And Asynchronous Job Processing System
Lê Bá Minh – [email protected] – Technical Manager – Zalo Team - VNG

Stream upload and asynchronous job processing in large scale systems


DESCRIPTION

Presentation at Barcamp Saigon 2013 - RMIT 7th July Presenter: Lê Bá Minh (VNG)


Page 1: Stream upload and asynchronous job processing  in large scale systems

Stream Upload And Asynchronous Job Processing System

Lê Bá Minh – [email protected] – Technical Manager – Zalo Team - VNG

Page 2: Stream upload and asynchronous job processing  in large scale systems

Agenda

• 1/ Why do we need an Asynchronous Job Processing System?
• 2/ How does it work?
• 3/ Applications
• 4/ Q&A

Page 3: Stream upload and asynchronous job processing  in large scale systems

Parallel Stream Upload

• Data is split into chunks, and the chunks are uploaded in parallel
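
A minimal sketch of the idea on this slide, written in Python for brevity: the payload is cut into fixed-size chunks and the chunks are sent concurrently. The chunk size, the `upload_chunk` function and the in-memory `received` store are assumptions made for illustration; the talk does not describe Zalo's actual upload API.

```python
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4  # bytes; a tiny, hypothetical value so the example is easy to follow

received = {}  # chunk index -> chunk bytes; stands in for the server side


def split_into_chunks(data, size=CHUNK_SIZE):
    """Return (index, chunk) pairs so the receiver can restore the order."""
    return [(i // size, data[i:i + size]) for i in range(0, len(data), size)]


def upload_chunk(indexed_chunk):
    """Hypothetical upload call; a real client would send this over the network."""
    index, chunk = indexed_chunk
    received[index] = chunk
    return index


def parallel_upload(data, workers=3):
    chunks = split_into_chunks(data)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(upload_chunk, chunks))  # chunks may complete in any order
    # Reassemble by index, not by arrival order.
    return b"".join(received[i] for i in range(len(chunks)))
```

Carrying the index with each chunk is what lets the receiver rebuild the data regardless of the order in which the parallel uploads finish.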

Page 4: Stream upload and asynchronous job processing  in large scale systems

Facts

• Zalo Stream Upload
  • Background continuous voice upload
  • Background image upload
  • …
• Facts (now)
  • 1M voices/day
  • 800K images/day
  • Peak: 500 chunks/second
• Expected
  • Scalable (more than 5000 chunks/second)
  • High performance

Page 5: Stream upload and asynchronous job processing  in large scale systems

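What we need

• Asynchronous Job Processing System

[Diagram: a synchronous flow (Collect Data, Processing Data, Response in sequence) contrasted with an asynchronous flow in which the Response follows Collect Data and Processing Data is handed to Workers]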

Page 6: Stream upload and asynchronous job processing  in large scale systems

What we need

• Asynchronous Job Processing System
  • Batch jobs
  • Big data jobs
  • Highly reliable: no job missed
  • Distributed job-processing workers
  • High performance
  • Persistent
  • Load balancing, failover, recoverable

Page 7: Stream upload and asynchronous job processing  in large scale systems

Open-source solutions

• Shared-memory workers
  • All workers on one physical server
  • No failover
  • Not scalable
• Gearman
  • Good, but does not completely fit our requirements
  • No batch job support
  • Not fully reliable (jobs can be lost)
  • Not fully load-balanced
  • Unstable above 2000 jobs/second

Page 8: Stream upload and asynchronous job processing  in large scale systems

Zalo Async Job Processing System

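[Architecture diagram: Clients and Workers 1 to 3 connect to the Job Server over TCP; the connections are labelled Short Connection and Long Connection. Components inside the Job Server: Worker Manager, Job Caching, Job Manager, Persistent Manager and Job Clean-Up. Z Database appears as the storage component.]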

Page 9: Stream upload and asynchronous job processing  in large scale systems

Implementation

• C/C++ for the Job Server
• C/C++ and Java for clients and workers
• Binary protocol
• Z-Database
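
The slide only says that the protocol is binary; the wire format is not given. As an illustration of what binary framing can look like, the sketch below uses a hypothetical layout (1-byte message type, 8-byte job id, 4-byte payload length, then the payload). It is written in Python for brevity, whereas the talk states that the real system is C/C++ and Java.

```python
import struct

# Hypothetical header: message type (uint8), job id (uint64), payload length (uint32),
# all in network byte order. This is NOT the actual Zalo wire format.
HEADER = struct.Struct("!BQI")

MSG_SUBMIT = 1  # hypothetical message type code


def encode_frame(msg_type, job_id, payload):
    """Prefix the payload with a fixed-size binary header."""
    return HEADER.pack(msg_type, job_id, len(payload)) + payload


def decode_frame(frame):
    """Parse the header, then slice out exactly `length` payload bytes."""
    msg_type, job_id, length = HEADER.unpack_from(frame)
    payload = frame[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated frame")
    return msg_type, job_id, payload
```

A fixed-size header with an explicit length field lets the receiver know how many bytes to read from a TCP stream before a message is complete.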

Page 10: Stream upload and asynchronous job processing  in large scale systems

Job State

[State diagram]

• Started → Queuing
• Queuing → Processing (Deliver to Worker)
• Processing → Finished (Worker ACK Finished)
• Processing → Failed (Worker ACK Failed)
• Processing → Time Out (No ACK)
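
The states and transition labels on this slide can be read as a small state machine. The sketch below (Python, for brevity) encodes one plausible reading: a job starts in Queuing, moves to Processing when it is delivered to a worker, and ends in Finished, Failed or Time Out depending on the worker's acknowledgement. The slide does not show what happens to failed or timed-out jobs afterwards, so the sketch simply stops there; the class and event names are hypothetical.

```python
from enum import Enum


class State(Enum):
    QUEUING = "Queuing"
    PROCESSING = "Processing"
    FINISHED = "Finished"
    FAILED = "Failed"
    TIME_OUT = "Time Out"


# (current state, event) -> next state; event names follow the slide labels.
TRANSITIONS = {
    (State.QUEUING, "deliver_to_worker"): State.PROCESSING,
    (State.PROCESSING, "worker_ack_finished"): State.FINISHED,
    (State.PROCESSING, "worker_ack_failed"): State.FAILED,
    (State.PROCESSING, "no_ack"): State.TIME_OUT,
}


class Job:
    def __init__(self, job_id):
        self.job_id = job_id
        self.state = State.QUEUING  # "Started": a new job enters the queue

    def on_event(self, event):
        """Apply an event, rejecting any transition not listed in the table."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"{event!r} is not valid in state {self.state.value}")
        self.state = TRANSITIONS[key]
        return self.state
```

Keeping the transitions in one table makes illegal moves, such as acknowledging a job that was never delivered, fail loudly instead of silently corrupting the job's state.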

Page 11: Stream upload and asynchronous job processing  in large scale systems

Job Type

• Single Job
  • Simple task
  • Delivered immediately
• Batch Job
  • Multiple tasks
  • Delivered once all tasks have been received
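
A minimal sketch of the difference between the two job types: a single job is handed over for delivery as soon as it arrives, while a batch job is held back until every task has been received. The class name `JobBuffer`, the `expected_tasks` count and the `deliver` callback are hypothetical; the talk does not say how the Job Server learns that a batch is complete.

```python
class JobBuffer:
    """Holds batch tasks until the batch is complete (illustrative only)."""

    def __init__(self, deliver):
        self.deliver = deliver  # callback that hands a list of tasks to a worker
        self.pending = {}       # batch_id -> {task_index: task}

    def submit_single(self, task):
        self.deliver([task])    # single job: deliver immediately

    def submit_batch_task(self, batch_id, index, expected_tasks, task):
        tasks = self.pending.setdefault(batch_id, {})
        tasks[index] = task     # a re-sent task overwrites instead of duplicating
        if len(tasks) == expected_tasks:
            del self.pending[batch_id]
            # Deliver the whole batch at once, ordered by task index.
            self.deliver([tasks[i] for i in sorted(tasks)])
```

For stream upload, a batch could be all the chunks of one voice message, so that the worker only starts once the last chunk has arrived.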

Page 12: Stream upload and asynchronous job processing  in large scale systems

Deployment

[Deployment diagram: Job Server 1 and Job Server 2 are kept synchronized; a Business Server and Workers 1 to 3 connect to them.]
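
With two synchronized job servers, a client can fall back to the second one when the first is unreachable. The sketch below shows that idea only; the `servers` list of `(name, send)` pairs and the use of `ConnectionError` to signal an unreachable server are assumptions for illustration, and the talk does not describe how Zalo's clients actually choose a server.

```python
class AllServersDown(Exception):
    """Raised when no job server accepted the job."""


def submit_with_failover(servers, job):
    """Try each job server in order; return the name of the one that accepted."""
    for name, send in servers:
        try:
            send(job)
            return name
        except ConnectionError:
            continue  # this server is unreachable; try the next one
    raise AllServersDown("no job server accepted the job")
```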

Page 13: Stream upload and asynchronous job processing  in large scale systems

Applications

• Used for all asynchronous job processing in Zalo: voice upload, image upload, feed processing…
• Benchmark (single server)
  • 50K images/second (640x480)
  • 50K voices/second (30s)
• Advantages
  • Batch jobs
  • Jobs are never lost
  • Workers can restart or stop at any time
  • Failover, load balancing, quick recovery after failure
• Issue
  • Job duplication (handled by the worker)
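
The slide notes that duplicate deliveries are handled by the worker, but not how. One common approach is to make the worker idempotent by remembering the ids of jobs it has already completed. The sketch below assumes each job carries a unique `job_id` and keeps the ids in an in-memory set; both are assumptions for illustration, not the documented Zalo mechanism.

```python
class IdempotentWorker:
    def __init__(self, handler):
        self.handler = handler
        self.done = set()      # ids of jobs already processed (in memory for the sketch)

    def process(self, job_id, payload):
        """Return True if the job was processed, False if it was a duplicate."""
        if job_id in self.done:
            return False       # duplicate delivery: skip the work
        self.handler(payload)
        self.done.add(job_id)  # record the id only after the handler succeeded
        return True
```

A real worker would need a persistent, bounded store for the ids so that deduplication survives a restart.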

Page 14: Stream upload and asynchronous job processing  in large scale systems

Q&A
