Big Data Processing with Spark and AWS EMR @ glomex · files.meetup.com/14546712/AWSMeetup-EMR_final_compressed.pdf · Estimated cost for this cluster is L 76 USD per hour (avg last 90 days

Transcript

Big Data Processing with Spark and AWS EMR @ glomex
17.10.2016
Michael Ludwig

Our Architecture

Our Use Cases

Billing Pre-Aggregations

Interactive Big Data

Spark components

Spark 1.6, PySpark, spark-submit, DataFrames, SparkSQL, UDFs, Accumulators
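
As a rough illustration of how a PySpark job in this stack is typically handed to the cluster, the sketch below assembles a `spark-submit` command line. The script name, its arguments, and the memory setting are hypothetical placeholders, not values from the talk; `--master`, `--deploy-mode`, and `--conf` are standard `spark-submit` options.

```python
import shlex


def build_spark_submit(script, script_args=(), deploy_mode="cluster", conf=None):
    """Assemble a spark-submit argument list for a PySpark script on YARN."""
    cmd = ["spark-submit", "--master", "yarn", "--deploy-mode", deploy_mode]
    # Each Spark property is passed as its own --conf key=value pair.
    for key, value in sorted((conf or {}).items()):
        cmd += ["--conf", "{}={}".format(key, value)]
    cmd.append(script)
    cmd.extend(script_args)
    return cmd


# Hypothetical job: script name, arguments, and memory value are placeholders.
command = build_spark_submit(
    "billing_preaggregation.py",
    script_args=["--date", "2016-10-17"],
    conf={"spark.executor.memory": "4g"},
)
print(" ".join(shlex.quote(part) for part in command))
```

Keeping the command as a list rather than one string avoids quoting problems when it is later passed to `subprocess` or to an EMR step definition.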

Example: SparkSQL
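
The original example on this slide did not survive extraction. The following is a minimal stand-in sketch, not the speaker's code: it registers a Python function as a UDF and runs a SparkSQL aggregation using the Spark 1.6 `SQLContext` API. The table name, column names, and sample rows are assumptions made up for illustration.

```python
def normalize_country(code):
    """Plain Python function that is later registered as a SparkSQL UDF."""
    return code.strip().upper() if code else "UNKNOWN"


# Hypothetical aggregation; table and column names are illustrative only.
QUERY = """
    SELECT norm_country(country) AS country, SUM(views) AS total_views
    FROM events
    GROUP BY norm_country(country)
"""


def run_example():
    # Imported lazily so the helper above can be used without Spark installed.
    from pyspark import SparkContext
    from pyspark.sql import SQLContext
    from pyspark.sql.types import StringType

    sc = SparkContext(appName="sparksql-sketch")
    sql_context = SQLContext(sc)  # Spark 1.6 entry point for DataFrames and SQL

    # Small in-memory DataFrame standing in for real event data.
    events = sql_context.createDataFrame(
        [(" de", 3), ("DE", 2), ("us", 5), (None, 1)], ["country", "views"]
    )
    events.registerTempTable("events")  # make the DataFrame visible to SQL
    sql_context.registerFunction("norm_country", normalize_country, StringType())

    sql_context.sql(QUERY).show()
    sc.stop()
```

Calling `run_example()` requires a working PySpark installation, for example on the EMR master node or in local mode during development.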

EMR Cluster Startup

AWS Web Console

AWS CLI

AWS SDKs (Python, Java, JS, etc.)
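
To make the SDK route concrete, here is a hedged sketch of starting a cluster from Python with boto3's `run_job_flow`. The release label, instance types, instance counts, bid price, and region are assumptions chosen for illustration (the slides only state Spark 1.6); the role names are the EMR defaults.

```python
def build_cluster_request(name, core_count, core_bid_usd=None):
    """Build keyword arguments for boto3's EMR run_job_flow call."""
    core_group = {
        "Name": "core",
        "InstanceRole": "CORE",
        "InstanceType": "m3.xlarge",  # assumed instance type
        "InstanceCount": core_count,
        "Market": "ON_DEMAND",
    }
    if core_bid_usd is not None:
        # Spot instances need a maximum bid, passed as a string in USD per hour.
        core_group["Market"] = "SPOT"
        core_group["BidPrice"] = "{:.2f}".format(core_bid_usd)
    return {
        "Name": name,
        "ReleaseLabel": "emr-4.7.2",  # assumed; an EMR 4.x release with Spark 1.6
        "Applications": [{"Name": "Spark"}, {"Name": "Ganglia"}],
        "Instances": {
            "InstanceGroups": [
                {"Name": "master", "InstanceRole": "MASTER",
                 "InstanceType": "m3.xlarge", "InstanceCount": 1,
                 "Market": "ON_DEMAND"},
                core_group,
            ],
            # False: the cluster shuts down once all submitted steps have finished.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


def launch(request, region="eu-west-1"):
    """Send the request to EMR; needs boto3 and AWS credentials (region assumed)."""
    import boto3  # imported lazily so the builder works without boto3 installed

    emr = boto3.client("emr", region_name=region)
    return emr.run_job_flow(**request)["JobFlowId"]


# Hypothetical cluster: four core nodes on the spot market with a 0.10 USD bid.
request = build_cluster_request("spark-sketch", core_count=4, core_bid_usd=0.10)
```

Separating the request builder from `launch` keeps the configuration inspectable and testable without touching AWS. A real job would also pass a `Steps` list, since a cluster with no steps and `KeepJobFlowAliveWhenNoSteps` set to `False` terminates right after startup.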

Startup parameters

Spot prices

Cluster Interaction

YARN Manager

Monitoring: Spark UI

Monitoring: Ganglia on EMR

Error Troubleshooting

Summary
§ EMR
  § Easy cluster startup and configuration
  § Throw-away, isolated clusters
  § No big upfront investments needed

§ Spark
  § Best framework to get started with Big data
  § Big community & fast development
  § Local development easy

Backup
§ TODO

EMR Access Urls

RDD, DataFrame and DataSet

Spark Cluster

In-Memory Computation

Operations
§ placeholder

Sample Transformations

RDD Lineage

RDD DAG
