15-319 / 15-619 Cloud Computing, Recitation 8, October 16, 2018

15-319 / 15-619 Cloud Computingmsakr/15619-f18/recitations/F18_Recitation08.pdf · Intro. to Scala Primer and Intro. to Apache Spark Primer Team Project, Phase 1 Checkpoint 1 report


Overview
● Last week’s reflection
  ○ Project 3.2
  ○ OLI Unit 3 - Module 13
  ○ Quiz 6
● This week’s schedule
  ○ Project 3.3
  ○ OLI Unit 4 - Module 14 (Storage)
  ○ Quiz 7 - Thursday, Oct 18
  ○ Intro. to Scala Primer and Intro. to Apache Spark Primer
● Team Project, Phase 1
  ○ Checkpoint 1 report is due on Sunday!
  ○ Q1 early bird bonus is due on Sunday


Last Week
● OLI: Module 13 - Storage and network virtualization
  ○ Quiz 6
● Project 3.2
  ○ Social Networking Timeline with Heterogeneous Backends
    ■ MySQL
    ■ Neo4j
    ■ MongoDB
    ■ Choosing Databases
● Consistency Programming Exercise on Cloud9


This Week
● OLI: Module 14 - Cloud Storage
● Quiz 7 - Thursday, Oct 18 (Not Friday!)
● Project 3.3 - Sunday, October 21
  ○ Task 1: Implement a Strong Consistency Model for distributed data stores
  ○ Task 2: Implement a Strong Consistency Model for cross-region data stores
  ○ Bonus task: Implement an Eventual Consistency Model
● Primers released this week
  ○ Introduction to Scala Primer
  ○ Introduction to Apache Spark Primer


Conceptual Topics - OLI Content

● OLI Unit 4 - Module 14: Cloud Storage
  ○ File Systems and Databases
  ○ Scalability and Consistency
  ○ NoSQL, NewSQL and Object Storage
● Quiz 7
  ○ Due on Thursday, October 18
    ■ Remember to hit submit before the deadline!


Individual Projects

● Done
  ○ P3.1: Files vs. Databases - comparison and usage of flat files, MySQL, Redis, and HBase
  ○ NoSQL Primer, HBase Basics Primer
● Done
  ○ P3.2: Social networking with heterogeneous backends
  ○ MongoDB Primer
● Now
  ○ P3.3: Replication and Consistency models
  ○ Intro. to Java Multithreading Primer
  ○ Thread-safe Programming Primer
  ○ Intro. to Consistency Models Primer


Scale of Data is Growing

International Data Corporation's (IDC) Digital Universe Study predicts that the amount of data created globally will increase from 16 zettabytes in 2016 to 160 zettabytes in 2025.

Guo H. Big Earth data: A new frontier in Earth and information sciences[J]. Big Earth Data, 2017, 1(1-2): 4-20.


Users are Global

● Speed of light (≈3.00×10⁸ m/s)
● Inherent latencies
[Figure: map marking San Francisco, Pittsburgh, and Moscow, with latency labels of ~26 ms and ~14 ms]
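As a rough check on the figure's labels, the sketch below converts a latency into the distance light covers in that time, using the speed of light quoted on the slide. It assumes the ~26 ms and ~14 ms labels are pure one-way propagation delay in a vacuum; that reading is an assumption, and real links through fiber and routers are slower.

```python
SPEED_OF_LIGHT_M_PER_S = 3.00e8  # value quoted on the slide


def distance_km(latency_ms: float) -> float:
    """Distance light travels in a vacuum during latency_ms milliseconds."""
    seconds = latency_ms / 1000.0
    return SPEED_OF_LIGHT_M_PER_S * seconds / 1000.0


for label_ms in (26, 14):
    print(f"~{label_ms} ms corresponds to about {distance_km(label_ms):.0f} km")
```

No amount of server tuning removes this component of latency; only moving the data closer to the user does.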


Typical End-To-End Latency
● The client sends the request to the server
  ○ Network latency
● The backend processes the request and sends the response
  ○ Overhead of fetching and processing data from backend
  ○ Network latency
● The client receives the response


Latency with a Single Backend

[Figure: Client 1 (San Francisco), Client 2 (Pittsburgh), and Client 3 (Moscow) all connect to a single Backend Storage; the links are labeled ~20 ms, ~40 ms, and ~320 ms]
Client Statistics:
● Min Latency: 20 ms
● Max Latency: 320 ms
● Average Latency: 126 ms


Replicate the Data Globally

[Figure: Client 1 (San Francisco), Client 2 (Pittsburgh), and Client 3 (Moscow) connect to Backend Storage 1: USA West and Backend Storage 2: Europe Central; the links are labeled ~20 ms, ~40 ms, and ~20 ms]
Client Statistics:
● Min Latency: 20 ms
● Max Latency: 40 ms
● Average Latency: 26.6 ms


Replicate the Data Close to Users

[Figure: Client 1 (San Francisco), Client 2 (Pittsburgh), and Client 3 (Moscow) connect to Backend Storage 1: USA West, Backend Storage 2: Europe Central, and Backend Storage 3: USA East; every link is labeled ~20 ms]
Client Statistics:
● Min Latency: 20 ms
● Max Latency: 20 ms
● Average Latency: 20 ms
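The three "Client Statistics" boxes can be reproduced with a few lines of arithmetic. The sketch below uses only the per-client latencies shown in the figures; the scenario names are labels chosen here for illustration. The slides report the averages truncated (126 ms and 26.6 ms), while the code prints them rounded to one decimal place.

```python
from statistics import mean

# Per-client latencies in ms, copied from the three figures.
scenarios = {
    "single backend": [20, 40, 320],
    "two replicas": [20, 40, 20],
    "three replicas": [20, 20, 20],
}


def summarize(latencies_ms):
    """Return (min, max, average) for a list of latencies in ms."""
    return min(latencies_ms), max(latencies_ms), mean(latencies_ms)


for name, latencies in scenarios.items():
    lowest, highest, average = summarize(latencies)
    print(f"{name}: min={lowest} ms, max={highest} ms, avg={average:.1f} ms")
```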


Demo

Run:
• ping www.cmu.edu
• ping www.google.com
• ping www.berkeley.edu
• ping www.nus.edu.sg
Compare the latencies of these global webpages!


Replication
● As you can see, by adding replicas to strategic locations in the world, we can significantly reduce the latency seen by our global clients
● Each added datacenter decreases the average latency
● But how about the cost?


What If We Continue to Replicate?

Client Statistics:
● Min Latency: ??
● Max Latency: ??
● Average Latency: ??
Cost: ?????
We have to consider cost as well as data consistency across replicas; keeping the replicas consistent increases the latency for writes.


Replication READ

[Figure: Client 1 (San Francisco), Client 2 (Pittsburgh), and Client 3 (Moscow); Backend Storage 1: USA West, Backend Storage 2: USA East, Backend Storage 3: Europe Central; each client's link is labeled ~20 ms]
Read Operation:
● Min Latency: 20 ms
● Max Latency: 20 ms
● Average Latency: 20 ms


Replication WRITE

[Figure: Client 1 (San Francisco), Client 2 (Pittsburgh), and Client 3 (Moscow); Backend Storage 1: USA West, Backend Storage 2: USA East, Backend Storage 3: Europe Central; the links are labeled ~20 ms, ~40 ms, and ~240 ms]
Write Operation:
Latency for Client 2 = 20 ms + MAX(40 ms, 240 ms) = 260 ms
All the clients suffer from long latency
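The write-latency formula generalizes to any number of replicas. A minimal sketch, assuming the client first reaches its nearest replica and that replica then updates the others in parallel, so only the slowest remote link counts (the function name is hypothetical):

```python
def strong_write_latency_ms(client_to_local_ms, remote_links_ms):
    """Latency of a write that completes only after every replica is updated.

    Remote updates are assumed to be sent in parallel, so the slowest
    link determines how long the client waits.
    """
    return client_to_local_ms + max(remote_links_ms)


# Client 2 in the figure: 20 ms to its local replica, then 40 ms and 240 ms links.
print(strong_write_latency_ms(20, [40, 240]))  # 260
```

Adding a replica can only keep this value the same or raise it, which is why writes get slower as reads get faster.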


Replication Reads and Writes
● Read operations are very fast!
  ○ All clients have a replica close to them to access
● Write requests are quite slow
  ○ Write requests must update all the replicas
  ○ If there are multiple write requests for a certain key, they may have to wait for each other to complete


Pros and Cons of Replication
● Duplicate the data across multiple instances
● Advantages
  ○ Low latency for reads
  ○ Reduce the workload of a single backend server (Load balance for hot keys)
  ○ Handle failures of nodes (High availability)
● Disadvantages
  ○ Requires more storage capacity and cost
  ○ Updates are slower
  ○ Changes must reflect on all datastores either instantly or eventually (Data Consistency)


Data Consistency Becomes Necessary

● Data consistency across replicas is important
  ○ Five consistency levels: Strict, Strong (Linearizability), Sequential, Causal and Eventual Consistency
● This week’s task: Implement Strong Consistency
  ○ All datastores must return the same value for a key at all times
  ○ The order in which the values are updated must be preserved at all replicas
● Bonus: Implement Eventual Consistency


Choosing a Consistency Level: Bad Example
[Figure sequence, three frames]
● Account xxxxx-4437 has a balance of $100
● Two "Withdraw $100" requests are issued against the account
● Both requests pay out $100 and the balance becomes $0: the bank lost $100


Choosing a Consistency Level: Good Example
[Figure sequence, three frames]
● Account xxxxx-4437 has a balance of $100
● Two "Withdraw $100" requests are issued against the account
● One request pays out $100, the other pays out $0, and the balance becomes $0
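The two outcomes can be reproduced in a few lines. This is an illustrative sketch, not project code: `bad_example` replays the interleaving in which both withdrawals read the balance before either writes it back, and `good_example` makes the check-and-update atomic with a lock. The `Account` class and both function names are hypothetical.

```python
import threading


class Account:
    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def withdraw(self, amount):
        """Check and update the balance atomically; return the cash paid out."""
        with self._lock:
            if self.balance >= amount:
                self.balance -= amount
                return amount
            return 0


def bad_example():
    """Both withdrawals act on a balance read before either one wrote back."""
    balance = 100
    seen_by_first = balance    # first request reads $100
    seen_by_second = balance   # second request also reads $100 (stale view)
    paid_out = 0
    for seen in (seen_by_first, seen_by_second):
        if seen >= 100:
            paid_out += 100
            balance = seen - 100
    return balance, paid_out   # balance is 0, but 200 was paid out


def good_example():
    """Two concurrent withdrawals against a lock-protected account."""
    account = Account(100)
    results = []
    workers = [
        threading.Thread(target=lambda: results.append(account.withdraw(100)))
        for _ in range(2)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return account.balance, sorted(results)


print(bad_example())   # (0, 200)
print(good_example())  # (0, [0, 100])
```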


P3.3: Consistency Models

Tradeoff: Consistency vs. Latency
● Strict
● Strong
● Sequential
● Causal
● Eventual


P3.3 Task 1: Strong Consistency

Coordinator:
● A request router that routes the web requests from the clients to the datacenters
● Preserves the order of both READ and WRITE requests
Datastore:
● The actual backend storage that persists collections of data


P3.3 Task 1: Strong Consistency

Single PUT request for key ‘X’
● Block all GETs for key ‘X’ until all datastores are updated
● GET requests for a different key ‘Y’ should not be blocked
Multiple PUT requests for ‘X’
● Resolved in order of their timestamp when received by the Coordinator
● Any GET request in between 2 PUTs must return the first PUT value
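The first rule, blocking GETs for ‘X’ while leaving ‘Y’ untouched, is a per-key locking problem. The sketch below is a single-process illustration under that reading; the real project runs coordinators and datastores on separate machines, and `KeyLocks` and `Store` are hypothetical names rather than classes from the project handout.

```python
import threading
from collections import defaultdict


class KeyLocks:
    """One lock per key, created on first use."""

    def __init__(self):
        self._guard = threading.Lock()            # protects the dictionary itself
        self._locks = defaultdict(threading.Lock)

    def lock_for(self, key):
        with self._guard:
            return self._locks[key]


class Store:
    def __init__(self, replica_count=3):
        self.replicas = [dict() for _ in range(replica_count)]
        self.locks = KeyLocks()

    def put(self, key, value):
        with self.locks.lock_for(key):            # GETs for this key wait here
            for replica in self.replicas:         # update every replica before releasing
                replica[key] = value

    def get(self, key, replica_index=0):
        with self.locks.lock_for(key):            # other keys use other locks
            return self.replicas[replica_index].get(key)


store = Store()
store.put("X", "1")
print([store.get("X", i) for i in range(3)])
```

A single global lock would also be correct, but it would block GETs for ‘Y’ during a PUT on ‘X’, which the second bullet rules out.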


P3.3 Task 2: Architecture
Global Coordinators and Data Stores
[Figure: three regions (us-west, us-east, and Singapore), each with a coordinator and a datacenter (labeled DCI)]


P3.3 Tasks 1 & 2: Strong Consistency

● Note: Every request has a global timestamp order
  ○ In task 1, the timestamp is issued by the coordinator
  ○ In task 2, the timestamp is issued by the TrueTime Server
● Operations must be ordered by the timestamps
Requirement: At any given point of time, all clients should read the same data from any datacenter replica
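One way to picture timestamp ordering is a per-key priority queue: operations may arrive in any order but are applied smallest timestamp first. The sketch below is a simplified, single-threaded illustration with hypothetical names; it applies everything in one batch, whereas a real coordinator must decide when it is safe to proceed with the operation at the head of the queue.

```python
import heapq
import itertools


class TimestampOrderedKey:
    """Applies PUT and GET operations on one key in timestamp order."""

    def __init__(self):
        self._pending = []                 # min-heap keyed by timestamp
        self._arrival = itertools.count()  # tie-breaker so tuples always compare
        self.value = None

    def submit(self, timestamp, op, value=None):
        heapq.heappush(self._pending, (timestamp, next(self._arrival), op, value))

    def drain(self):
        """Apply all pending operations; return (timestamp, value) for each GET."""
        reads = []
        while self._pending:
            timestamp, _, op, value = heapq.heappop(self._pending)
            if op == "PUT":
                self.value = value
            else:
                reads.append((timestamp, self.value))
        return reads


key_x = TimestampOrderedKey()
key_x.submit(3, "PUT", "second")   # arrives first, but has the latest timestamp
key_x.submit(2, "GET")
key_x.submit(1, "PUT", "first")
print(key_x.drain())               # [(2, 'first')]
```

The GET stamped between the two PUTs sees the value of the earlier PUT, regardless of the order in which the requests arrived.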


P3.3 Task 2: Architecture


Task 2 Workflow and Example

• Launch a total of 8 machines (3 data centers, 3 coordinators, 1 truetime server and 1 client).
• All machines should be launched in the US East region. We will simulate global latencies for you.
• The “US East” here has nothing to do with the simulated location of datacenters and coordinators in the project.
• Implement the code for the Coordinators and Datastores


PRECOMMIT

● This API method contacts the datastore of a given region and notifies it that a PUT request is being serviced for the specified key, starting at the specified timestamp.


P3.3 Task 2: Complete KeyValueStore.java (in DCs) and Coordinator.java (in Coordinators)
[Figure sequence, six frames: a Client, a TrueTime Server, three coordinators (US-EAST COORDINATOR, US-WEST COORDINATOR, SINGAPORE COORDINATOR), and three datacenters (US-EAST DC, US-WEST DC, SINGAPORE DC). After the first frame, each frame adds one message label, in this order:]
● put?key=X&value=1
● KeyValueLib.getTime()
● precommit?key=X&timestamp=1
● PUT(REGIONAL-DNS, "X", "1", 1, "strong")
● Response back
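Read in order, the message labels suggest this PUT path: a coordinator receives the request, obtains a timestamp, sends a precommit to each datastore, sends the PUT to each datastore, then responds. The sketch below traces that sequence with in-process stand-ins. Every class and function here is hypothetical (the project's KeyValueLib is Java and is not reproduced), and the assumption that all precommits go out before any PUT is an interpretation of the frames, not something the slides state.

```python
import itertools


class FakeTrueTime:
    """Hypothetical stand-in for the TrueTime server: increasing timestamps."""

    def __init__(self):
        self._counter = itertools.count(1)

    def get_time(self):
        return next(self._counter)


class FakeDatastore:
    """Hypothetical stand-in for one region's datastore; records every call."""

    def __init__(self, region, trace):
        self.region = region
        self.data = {}
        self.trace = trace

    def precommit(self, key, timestamp):
        self.trace.append(("precommit", self.region, key, timestamp))

    def put(self, key, value, timestamp):
        self.data[key] = value
        self.trace.append(("put", self.region, key, value, timestamp))


def handle_put(key, value, truetime, datastores):
    """Coordinator-side handling of put?key=...&value=... (assumed ordering)."""
    timestamp = truetime.get_time()      # KeyValueLib.getTime() in the frames
    for datastore in datastores:         # announce the pending PUT everywhere
        datastore.precommit(key, timestamp)
    for datastore in datastores:         # then write the value everywhere
        datastore.put(key, value, timestamp)
    return "stored"                      # "Response back" to the client


trace = []
stores = [FakeDatastore(r, trace) for r in ("us-east", "us-west", "singapore")]
print(handle_put("X", "1", FakeTrueTime(), stores))
for entry in trace:
    print(entry)
```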


P3.3: Eventual Consistency (Bonus)

● Write requests are performed in the order received by the local coordinator
  ○ Operations may not be blocked for replica consensus (no communication between servers across regions)
● Clients that request data may receive multiple versions of the data, or stale data
  ○ Problems left for the application owner to resolve
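A stale read under eventual consistency is easy to reproduce. In the sketch below, each region applies a write locally without waiting for anyone and hands it to the other regions later. The `Region` class, the outbox queue, and the `replicate` step are assumptions made for illustration; the project's actual propagation mechanism is defined in its writeup.

```python
from collections import deque


class Region:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.outbox = deque()   # local writes not yet delivered elsewhere

    def put(self, key, value):
        self.data[key] = value              # applied at once, no cross-region wait
        self.outbox.append((key, value))

    def get(self, key):
        return self.data.get(key)           # may be stale or missing


def replicate(source, targets):
    """Deliver the source's queued writes to the targets in local order."""
    while source.outbox:
        key, value = source.outbox.popleft()
        for target in targets:
            target.data[key] = value


east, west = Region("us-east"), Region("us-west")
east.put("X", "1")
print(west.get("X"))        # None: the write has not arrived yet
replicate(east, [west])
print(west.get("X"))        # 1
```

Writes return quickly because nothing blocks, and the cost is that a reader in another region can observe old data until replication catches up.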


More Hints
● In strong consistency, “PRECOMMIT” should be useful to help you lock requests, because PRECOMMIT requests are able to communicate with datastores
● Don’t wait for the PRECOMMIT messages that might be sent from other coordinators halfway through, or you will not pass all the test cases
● Lock by the key across all the datacenters in strong consistency
● Remember to update both KeyValueStore.java and Coordinator.java in Eventual Consistency


Suggestions
● Read all three primers (PLEASE!)
● Consider the differences between the 2 consistency models before writing code
● Think about possible race conditions
● Read the hints in the writeup carefully
● Don’t modify any class except Coordinator.java and KeyValueStore.java


How to Run Your Program

● Run “./copy_code_to_instances” in the client instance to copy your code to each of the datacenter and coordinator instances.
● Run “./start_servers” in the client instance to start the servers on each of the datacenter instances, coordinator instances and the truetime server instance.
● Use “./consistency_checker strong” or “./consistency_checker eventual” to test your implementation of each consistency model. (Our grader uses the same checker)
● If you want to test one simple PUT/GET request, you could directly send the request to datacenters or coordinators.


Start early! Trickiest Individual Project!


TWITTER DATA ANALYTICS: TEAM PROJECT


Team Project

Twitter Analytics Web Service
• Given ~1TB of Twitter data
• Build a performant web service to analyze tweets
• Explore web frameworks
• Explore and optimize storage systems


Team Project
● Phase 1
  ○ Q1
  ○ Q2 (MySQL AND HBase)
● Phase 2
  ○ Q1
  ○ Q2 & Q3 (MySQL AND HBase)
● Phase 3
  ○ Q1
  ○ Q2 & Q3 (MySQL OR HBase OR ???)
Input your team account ID and GitHub username on TPZ


Team Project Deadlines
● Phase 1 milestones:
  ○ Checkpoint 1:
    ■ Report, due on Sunday, 10/14
  ○ Checkpoint 2:
    ■ Q1 on scoreboard, due on Sunday, 10/21
  ○ Phase 1 Deadline:
    ■ Q2 on scoreboard, due on Sunday, 10/28
  ○ Phase 1, code and report:
    ■ due on Tuesday, 10/30


Web Frameworks
● Java: Vert.x, Undertow, Rapidoid, Spring Boot
● Python: Flask, Django, Tornado
● JavaScript: Node.js
● Ruby: Ruby on Rails


Choosing a Web Framework
● Web Framework
  ○ Which one should I choose?
    ■ Consider:
      ● Performance is the top priority
      ● Performance is not the only criterion for choosing the web framework


Q1 FAQ, QR Decoding
● QR Code Q & A
  ○ Why is the string in the decoding example different from the encoding example?
    ■ Read the write-up carefully. We are asking you to do more than decode a simple QR Code: we want you to identify the QR Code in a 32*32 matrix.
  ○ What is the order of decoding a QR Code?
    ■ Apply the logistic map to the whole matrix from left to right, top to bottom
    ■ Locate the QR Code and recognize the rotation degree
    ■ Extract the payload and translate it.


Q1 FAQ, QR Encoding
● QR Code Q & A
  ○ In the left part of the QR Code, why do the top and bottom position detection patterns have 8*9 cells (Figure 7)?
    ■ We made the 9th column blank to simplify the format information. What you need to do is follow the zigzag as shown in Figure 7 in the writeup.


Q1 FAQ, Misc
● QR Code Q & A
  ○ Do we need to consider the case when the payload happens to have a position detection pattern at the bottom right?
    ■ This will not happen since every 8 bits follow 1 correction bit. If the QR Code is valid, it won’t form a position detection pattern on the bottom right.
  ○ What type of instance should we use for submission?
    ■ M family but not larger than the large series (e.g., m4.large, m5.large)


Profiling
● Benchmarking and Logging Tools
  ○ CloudWatch
  ○ Stopwatch (Java) & Log
  ○ New Relic


Team Project, Q2
● Query 2 is coming next week
  ○ ETL is the most costly part. Please review your ETL code carefully before running it.
  ○ Think about the schema before running ETL. Otherwise, you might have to rerun your ETL job.
● Read this good question on Piazza: https://piazza.com/class/jkvtywetsu35vh?cid=1914


Team Project Time Table
● Phase 1 (Q1, Q2)
  ○ Start: Monday 10/08/2018 00:00:00 ET
  ○ Deadline: Q1: Sunday 10/21/2018 23:59:59 ET; Q2: Sunday 10/28/2018 23:59:59 ET
  ○ Code and Report Due: Tuesday 10/30/2018 23:59:59 ET
● Phase 2 (Q1, Q2, Q3)
  ○ Start: Monday 10/29/2018 00:00:00 ET
  ○ Deadline: Sunday 11/11/2018 15:59:59 ET
● Phase 2 Live Test (HBase AND MySQL; Q1, Q2, Q3)
  ○ Start: Sunday 11/11/2018 18:00:00 ET
  ○ Deadline: Sunday 11/11/2018 23:59:59 ET
  ○ Code and Report Due: Tuesday 11/13/2018 23:59:59 ET
● Phase 3 (Q1, Q2, Q3)
  ○ Start: Monday 11/12/2018 00:00:00 ET
  ○ Deadline: Sunday 12/02/2018 15:59:59 ET
● Phase 3 Live Test (HBase OR MySQL; Q1, Q2, Q3)
  ○ Start: Sunday 12/02/2018 18:00:00 ET
  ○ Deadline: Sunday 12/02/2018 23:59:59 ET
  ○ Code and Report Due: Tuesday 12/04/2018 23:59:59 ET


Upcoming Deadlines

● Conceptual Topics: OLI (Module 14)
  ○ Quiz 7 due: Thursday, 10/18/2018 11:59 PM Pittsburgh
● P3.3: Consistency Models
  ○ Due: Sunday, 10/21/2018 11:59 PM Pittsburgh
● Team Project: Phase 1
  ○ Query 1
    ■ Due: 10/21/2018 11:59 PM Pittsburgh
  ○ Query 2 (Next Sunday, Oct 28)
    ■ Due: 10/28/2018 11:59 PM Pittsburgh