Easy and efficient processing of massive data sets

Page 1: Easy and efficient processing of massive data sets

Easy and efficient processing of massive data sets

Page 2: Easy and efficient processing of massive data sets
Page 3: Easy and efficient processing of massive data sets

All non-geeks, please get a coffee; this is going to be painful!

Page 4: Easy and efficient processing of massive data sets

SMS Campaigns logging system

[Diagram: SMS, PROCESSING, LOGS]

Page 5: Easy and efficient processing of massive data sets

SMS Campaigns logging system

[Diagram: SMS, PROCESSING, LOGS]

Huge SMS queue: too much processing and logging time

Page 6: Easy and efficient processing of massive data sets

Use Timers for Asynchronous Processing

[Diagram: SMS, RAW, PROCESSING, LOGS]

Page 7: Easy and efficient processing of massive data sets

Use Timers for Asynchronous Processing

[Diagram: SMS, RAW, PROCESSING, LOGS]

Timer times out!

Page 8: Easy and efficient processing of massive data sets

Best practices for building timers

• Never times out

• Doesn’t repeat work

• Ensures completeness

• Ensures data integrity
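
One way to read the four practices above is as a timer that works against a time budget and marks each record as it goes. The following is a minimal Python sketch under that reading, not the presenter's implementation. The names `run_timer`, `handle`, `budget_seconds` and the `processed` flag are hypothetical, and the in-memory list stands in for a database table.

```python
import time


def run_timer(pending, handle, budget_seconds, clock=time.monotonic):
    """Process pending records until the time budget is used up.

    Returns True when the budget ran out before the end of the list,
    meaning the timer should be woken again to finish the job.
    """
    deadline = clock() + budget_seconds
    for record in pending:
        if record["processed"]:
            continue  # doesn't repeat work: already-handled records are skipped
        if clock() >= deadline:
            return True  # never times out: stop early and ask for another run
        handle(record)
        # Data integrity: in a real system this flag would be committed in the
        # same transaction as the work done by handle() (assumption).
        record["processed"] = True
    return False  # completeness: every record has been handled
```

A caller would keep waking the timer while it returns `True`. Because progress is stored on each record, a run that is cut short loses nothing and the next run resumes where the previous one stopped.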

Page 9: Easy and efficient processing of massive data sets

Best practices for building timers

Page 10: Easy and efficient processing of massive data sets

Best practices for building timers

[Diagram: SMS, RAW, PROCESSING, LOGS]

Page 11: Easy and efficient processing of massive data sets

Add Email campaigns logging

[Diagram: SMS and EMAIL pipelines, each with RAW and PROCESSING stages, writing to LOGS]

Page 12: Easy and efficient processing of massive data sets

Add Email campaigns logging

[Diagram: SMS and EMAIL pipelines, each with RAW and PROCESSING stages, writing to LOGS]

Database performance degradation and timeouts due to locks

Page 13: Easy and efficient processing of massive data sets

Sequence Timers

[Diagram: SMS and EMAIL pipelines, each with RAW and PROCESSING stages, writing to LOGS; the timers are numbered 1 and 2]

Page 14: Easy and efficient processing of massive data sets

Add Website logging

[Diagram: SMS and EMAIL pipelines plus a third RAW and PROCESSING pipeline, writing to LOGS; the timers are numbered 1, 2 and 3]

Page 15: Easy and efficient processing of massive data sets

Add Website logging

[Diagram: SMS and EMAIL pipelines plus a third RAW and PROCESSING pipeline, writing to LOGS; the timers are numbered 1, 2 and 3]

Page 16: Easy and efficient processing of massive data sets

Buffer

[Diagram: SMS and EMAIL pipelines plus a third RAW and PROCESSING pipeline with a BUFFER MOVE step, writing to LOGS; the timers are numbered 1, 2 and 3]

Page 17: Easy and efficient processing of massive data sets

Buffer

[Diagram: SMS and EMAIL pipelines plus a third RAW and PROCESSING pipeline with a BUFFER MOVE step, writing to LOGS; the timers are numbered 1, 2 and 3]

Long peaks in SMS campaigns caused starvation for Email and Website log processing

Page 18: Easy and efficient processing of massive data sets

Round Robin

[Diagram: SMS and EMAIL pipelines plus a third RAW and PROCESSING pipeline with a BUFFER MOVE step, writing to LOGS; timers 1, 2 and 3 take turns in a cycle]
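
The round robin on this slide can be illustrated with a small scheduler that gives every source a bounded slot per turn, so a long peak in one source cannot starve the others. This is a hedged Python sketch, not the presenter's code. `round_robin`, `sources` and `slot_size` are hypothetical names, and the slot is counted in records here, whereas the slides suggest a time slot.

```python
from collections import deque


def round_robin(sources, slot_size):
    """Yield (name, record) pairs, at most slot_size records per source per turn."""
    queues = {name: deque(items) for name, items in sources.items()}
    order = deque(queues)  # source names, in the order they were given
    while order:
        name = order.popleft()
        queue = queues[name]
        for _ in range(min(slot_size, len(queue))):
            yield name, queue.popleft()
        if queue:
            order.append(name)  # still has work: back of the line for another turn
```

With a large SMS backlog and a slot of two records, the Email and Website records are handled right after the first two SMS records instead of waiting for the whole SMS backlog to drain.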

Page 19: Easy and efficient processing of massive data sets

Collect data to other systems

[Diagram: the three pipelines writing to LOGS, plus a COLLECT RELEVANT DATA step feeding an EXTERNAL SYSTEM as timer 4; timers 1, 2, 3 and 4 take turns in a cycle]

Page 20: Easy and efficient processing of massive data sets

Collect data to other systems

[Diagram: the three pipelines writing to LOGS, plus a COLLECT RELEVANT DATA step feeding an EXTERNAL SYSTEM as timer 4; timers 1, 2, 3 and 4 take turns in a cycle]

Fetching the records to collect is too heavy. The available time slot is not enough.

Page 21: Easy and efficient processing of massive data sets

Queues

[Diagram: the three pipelines writing to LOGS, with a QUEUE placed before the COLLECT RELEVANT DATA step and the EXTERNAL SYSTEM; timers 1, 2, 3 and 4]

Queue entity to store the IDs to be processed.

Page 22: Easy and efficient processing of massive data sets

Queues

[Diagram: the three pipelines writing to LOGS, with a QUEUE placed before the COLLECT RELEVANT DATA step and the EXTERNAL SYSTEM; timers 1, 2, 3 and 4]

Enqueue is done in the processing timers through a light check.

Page 23: Easy and efficient processing of massive data sets

Queues

[Diagram: the three pipelines writing to LOGS, with a QUEUE placed before the COLLECT RELEVANT DATA step and the EXTERNAL SYSTEM; timers 1, 2, 3 and 4]

Dequeue next.

Page 24: Easy and efficient processing of massive data sets

Queues

[Diagram: the three pipelines writing to LOGS, with a QUEUE placed before the COLLECT RELEVANT DATA step and the EXTERNAL SYSTEM; timers 1, 2, 3 and 4]

Gets one record by ID instead of doing a heavy query to get a list of records.

Page 25: Easy and efficient processing of massive data sets

Queues

[Diagram: the three pipelines writing to LOGS, with timers 1, 2 and 3 taking turns in a cycle; the QUEUE feeds COLLECT RELEVANT DATA and the EXTERNAL SYSTEM outside that cycle]

The queue timer does not need to be included in the round robin with the processing timers.

Page 26: Easy and efficient processing of massive data sets

Queues – timer to process logs

Page 27: Easy and efficient processing of massive data sets

Queues – Enqueue inside DoSomething

Page 28: Easy and efficient processing of massive data sets

Queues – Timer to process queue
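
The three queue steps named on these slides (process logs, enqueue inside the processing step, process the queue) might look roughly like the sketch below. It is an illustration only, written in Python with the standard `sqlite3` module. The table layout, the `relevant` flag and the function names `process_log` and `dequeue_next` are assumptions and do not come from the slides.

```python
import sqlite3

SCHEMA = """
CREATE TABLE log (id INTEGER PRIMARY KEY, channel TEXT, relevant INTEGER);
CREATE TABLE queue (seq INTEGER PRIMARY KEY AUTOINCREMENT, log_id INTEGER);
"""


def process_log(conn, channel, relevant):
    """Processing-timer step: store the log, then enqueue its ID after a light check."""
    cur = conn.execute(
        "INSERT INTO log (channel, relevant) VALUES (?, ?)", (channel, relevant)
    )
    if relevant:  # light check on data already in hand, no extra query
        conn.execute("INSERT INTO queue (log_id) VALUES (?)", (cur.lastrowid,))
    conn.commit()
    return cur.lastrowid


def dequeue_next(conn):
    """Queue-timer step: take the oldest queued ID and fetch that one record by key."""
    row = conn.execute("SELECT seq, log_id FROM queue ORDER BY seq LIMIT 1").fetchone()
    if row is None:
        return None  # queue is empty
    seq, log_id = row
    record = conn.execute(
        "SELECT id, channel FROM log WHERE id = ?", (log_id,)
    ).fetchone()
    conn.execute("DELETE FROM queue WHERE seq = ?", (seq,))
    conn.commit()
    return record
```

The collecting side never scans the log table for candidates. It reads one small queue row and then does a primary-key lookup, which is the cheap access the slides contrast with a heavy list query.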

Page 29: Easy and efficient processing of massive data sets

Queues

[Diagram: the three pipelines writing to LOGS, with timers 1, 2 and 3 taking turns in a cycle; the QUEUE feeds COLLECT RELEVANT DATA and the EXTERNAL SYSTEM]

Page 30: Easy and efficient processing of massive data sets

Queues

[Diagram: the three pipelines writing to LOGS, with timers 1, 2 and 3 taking turns in a cycle; the QUEUE feeds COLLECT RELEVANT DATA and the EXTERNAL SYSTEM]

Progressively slower!

Page 31: Easy and efficient processing of massive data sets

Archive

[Diagram: the three pipelines writing to LOGS, with timers 1, 2 and 3 taking turns in a cycle; the QUEUE feeds COLLECT RELEVANT DATA and the EXTERNAL SYSTEM; an ARCHIVE MOVE step is added]
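
An archive move of this kind is often done in small batches so that each run stays short. The sketch below shows one possible shape in Python with the standard `sqlite3` module. It is not taken from the slides: the `log` and `log_archive` tables, the `created` column and the `archive_batch` function are all hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE log (id INTEGER PRIMARY KEY, created INTEGER, body TEXT);
CREATE TABLE log_archive (id INTEGER PRIMARY KEY, created INTEGER, body TEXT);
"""


def archive_batch(conn, cutoff, batch_size):
    """Move up to batch_size log rows older than cutoff into log_archive.

    Returns the number of rows moved; 0 means nothing is left to archive.
    """
    ids = [
        row[0]
        for row in conn.execute(
            "SELECT id FROM log WHERE created < ? ORDER BY id LIMIT ?",
            (cutoff, batch_size),
        )
    ]
    if not ids:
        return 0
    marks = ",".join("?" * len(ids))
    with conn:  # one transaction: the copy and the delete succeed or fail together
        conn.execute(
            f"INSERT INTO log_archive SELECT * FROM log WHERE id IN ({marks})", ids
        )
        conn.execute(f"DELETE FROM log WHERE id IN ({marks})", ids)
    return len(ids)
```

Calling `archive_batch` repeatedly until it returns 0 keeps the live table small, which addresses the progressive slowdown without one long-running move.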

Page 32: Easy and efficient processing of massive data sets

Multiple database catalogs

[Diagram: the three pipelines writing to LOGS, with timers 1, 2 and 3 taking turns in a cycle; the QUEUE feeds COLLECT RELEVANT DATA and the EXTERNAL SYSTEM; the ARCHIVE MOVE step targets an OTHER CATALOG]

Page 33: Easy and efficient processing of massive data sets

A system that deals with massive amounts of data is never finished…

Page 34: Easy and efficient processing of massive data sets

Thank You!

[email protected]