40
Introduction First Steps Concurrency Replication Systems Future of Overlog Conclusion Experiences with Overlog Robert Soul´ e New York University Robert Soul´ e Depth Qualifying Exam Part 2

Experiences with Overlog - NYU Computer Sciencesoule/DQE_pt2.pdf · Experiences with Overlog Robert Soul e New York University Robert Soul e Depth Qualifying Exam Part 2. Introduction

Embed Size (px)

Citation preview

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Experiences with Overlog

Robert Soule

New York University

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Introduction

Declarative languages are effective for building systems.

Provide higher level abstractions.

Free the programmer from implementation details.

Compiler can offer correctness guarantees, or increasedreliability.

Compiler can implement performance optimizations.

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Overview

Experiences with the declarative language Overlog.

Developed an improved parser.

Implemented static analysis concurrent execution.

Adapted Overlog for replication systems.

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

P2 Review

Every node in the network has a runtime:

In memory database stores tuples.

Messages are sent asynchronously as tuples.

Programs are specified in a logic-based language.

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Overlog Review

Logic based language similar to Datalog.

A program is composed of:

materialized statementslogical rules

Computation proceeds by introducing events, and performingdeductions.

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

System Overview

Persistent Storage

External Queue

TransmitQueueOverLog

Program

Persistent Storage

External Queue

TransmitQueueOverLog

Program

Persistent Storage

External Queue

TransmitQueueOverLog

Program

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Developed an Improved Parser

Cleanly specify the grammar in Rats!.

Implement type inference.

Add error checking.

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Parser

Lexical analysis and parsing integrated in PEGs

olg lexer.lex and olg parser.y combined

Grammar is modularized

Identifier - variable, function, aggregate identifiers...Symbol - modifies xtc.symbol with :=, :-, @Constant - floating point, integer, boolean...Reserved - materizialized, keys, infinity...Core - rule, tuple, postfix expression...

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Parser

rule: OLG_NAME functor OLG_IF termlist OLG_DOT{$$ = new compile::parse::Rule($2, $4, false, $1); }

| OLG_NAME OLG_DEL functor OLG_IF termlist OLG_DOT{$$ = new compile::parse::Rule($3, $5, true, $1); }

| functor OLG_IF termlist OLG_DOT{$$ = new compile::parse::Rule($1, $3, false); }

| OLG_DEL functor OLG_IF termlist OLG_DOT{$$ = new compile::parse::Rule($2, $4, true); }

| OLG_NAME functor OLG_IF aggview OLG_DOT{$$ = new compile::parse::Rule($2, $4, false, $1); }

| functor OLG_IF aggview OLG_DOT{$$ = new compile::parse::Rule($1, $3, false); }

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Parser

generic Rule =RuleIdentifier? ("delete":Word)? Tuplevoid:":-":Symbol TupleOrExpressionList ;

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Type Inference

Single Java file, extends xtc’s Visitor

Uses a Hindley-Milner-like algorithm

Constraints and unification

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Error Checking

Adds error checking for Overlog

Non-unique rule names

Redefined tuples (different arity, different types)

Expressions with incompatible types(boolean + int, string < boolean)

Functions redefined(different types, different number of arguments)

Variables redefined (different type)

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Concurrency

Auto-Parallelization for Declarative Network MonitoringRobert Soule, Robert Grimm, and Petros Maniatis Poster at SOSP2007

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Motivation

Building networked monitoring applications is too hard:

A variety of tasks in the same setting.

Different operational patterns for each task.

Deployed across different environments.

Computations may be interdependent.

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Approach

Declarative programming can help

Higher level language abstracts away many details

Not quite a complete solution.

Need to focus on scalability.

Static analysis identifies parallelism

Scheduling decisions more accessible to the compiler

Additional declarations inform the scheduler

Augmented code allows concurrent execution

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Fixpoint

A single event fixpoint is the program state such that nofurther deductions can be made before changing system statewithout the introduction of a new event.

A fixpoint is the local unit of atomicity.

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Types of Parallelism

Inter-fixpoint

Multiple fixpoint computations may proceed at the same time

Compute the transitive closure of all rules, partition tuplesinto read and write sets

Intra-fixpoint

Rules within a fixpoint can execute in parallel because thestate is static and language is single assignment

Need to be careful with side effects from function calls

Data Parallelism

Single instruction stream operating on multiple data sets(applying the same operation on every item in a list)

Requires runtime analysis (an estimate of data set size andlatency of state transfer)

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Dependencies

Compute the dependencies between tuples on a per-rule basis.

r1 synIn(@LclAddr, SrcAddr, f_now()) :-pktIn(@LclAddr, SrcAddr, D), f_tcpSyn(D).

r5 alarm(@LclAddr, SrcAddr, "CONNRST", f_now()) :-rstIn(@LclAddr, SrcAddr, TRST),synIn(@LclAddr, SrcAddr, TSYN),synackOut(@LclAddr, SrcAddr, DTSYNACK),not ackIn(@LclAddr, SrcAddr, TACK),TACK > TSYNACK > TSYN,TRST > TSYNACK > TSYN.

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Dependencies

synInr1pktIn

alarmr5

synIn

rstIn

ackInglobal-Detector

synackOut

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Analysis

Perform DFS on the dependency graph.

Start at the non-materialized tuple on the right hand side of arule.

Stop recursion at the external or materialized tuple on the leftof a rule.

Mark tuples as read or write.

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Analysis

alarmr5

synIn

rstIn

ackInglobal-Detector

synackOut

r1pktIn

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Example

Encode the results as Overlog:

concurrent("maxSuccDist", "node", "R").concurrent("maxSuccDist", "succ", "R").concurrent("maxSuccDist", "succ", "W").concurrent("bestLookupDist", "node", "R").concurrent("bestLookupDist", "finger", "R").concurrent("bestLookupDist", "lookup", "W").

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Conflict

Remove the next event from the external event queue.

Consult the concurrency table to compare read and write setsfor current and running fixpoints.

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Conflict

cpuSpike

alarmr5

synIn

rstIn

ackInglobal-Detector

synackOut

r6

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Scheduling

Support multiple internal event queues. Check if:

Writenew ∩ (Readcurrent ∪Writecurrent) = ∅Writecurrent ∩ (Readnew ∪Writenew ) = ∅

If no conflicts, enqueue event on its own internal event queue, andstart execution.

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Speculative Scheduling

An alternative is to speculatively execute, and check forconflict afterwards.

Best strategy depends on expected frequency of conflict.

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Replication Systems

PADRE: A Policy Architecture for building Data REplicationsystems Nalini Belaramani, Jiandan Zheng, Amol Nayate, RobertSoule, Mike Dahlin, Robert Grimm In review, 14 pages, OSDI 2008

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Replication Systems Background

Replication is used for data availability, fault tolerance,performance, etc.

There are trade-offs for replication strategies: performance vs.consistency vs. availability vs. partition resilience.

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Replication Systems Background

Can you make is easier to build new replication systems and extendexisting ones? Yes!

Separate policy from mechanism (PRACTI).

Policy is liveness strategy and safety constraints (PADRE).

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Policies

Safety

Block requests until they do not violate certain invariants.E.g. block until a write reaches 3 nodes, block until a serveracknowledges a write.

Liveness

View as a routing decision.Where to go to satisfy a read miss?Where to send updates?

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Overlog/P2

Convenient language/runtime layer specifying liveness policies.

Example:

/* send ACK of the last received invalto predecessor */

tRdy04 sendAck(@Pred, Clk, NodeId) :-readyToServer(@X, I), I == 1,predecessor(@X, Pred), Pred != -1,latestInval(@X, Clk, NodeId).

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Overlog/P2 Problem

Problems:

Poor performance.

Poor interface between PRACTI and P2.

How do we modify Overlog for liveness in replication?

Compile from Overlog to Java directly.

Simple code generation. We don’t compile to P2DL.

Provide a clean interface from the new runtime to PRACTI.

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Introducing R/Overlog

Simplified Java implementation of the P2 runtime andOverlog compiler.

3,743 lines of code runtime (UT Austin).

5,652 lines of code in code generation (NYU).

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

R/Overlog Types Declarations

R/Overlog supports the following types:

int, float, location, string, boolean

Usually, the code generation can infer the types of all tuples.

Explicit type declarations when inference doesn’t work.

tuple apple (location, int).fun int f_now().

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

R/Overlog Location Type

Location is a first class type.

Syntactically, that means

neighbor("localhost:5000", "localhost:5001").

becomes

neighbor(localhost:5000, locahost:5001).

and

neighbor(@X, Y), X := "-".

is not valid.Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

R/Overlog Limitations

R/Overlog does not support joins across nodes.

Not necessary for needs of PADRE.

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Performance Improvements

P2 Runtime R/Overlog runtime

Local Ping Latency 3.8 ms 0.322 ms

Local Ping Throughput 232 req/s 9,390 req/s

Remote Ping Latency 4.8 ms 1.616 ms

Remote Ping Through-put

32 req/s 2,079 req/s

Performance numbers for processing NULL trigger to produce nullevent.

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Next Steps

What are the global semantics? (with UCB).

Perform query optimization for Pig? (with UCB).

Provide external data interface for Java Objects. (with UCB).

How to support model checking with Overlog.

Overlog to Promela? to Mace?Model check PADRE. (with UT Austin).

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

Next Steps

Explore the advantages/disadvantages of Overlog tocomponent vs. Overlog to Java compilation.

Improved language support for building distributed systems.

Robert Soule Depth Qualifying Exam Part 2

IntroductionFirst Steps

ConcurrencyReplication Systems

Future of OverlogConclusion

The End

Questions?

Robert Soule Depth Qualifying Exam Part 2