Introduction | First Steps | Concurrency | Replication Systems | Future of Overlog | Conclusion
Experiences with Overlog
Robert Soule
New York University
Robert Soule Depth Qualifying Exam Part 2
Introduction
Declarative languages are effective for building systems.
Provide higher level abstractions.
Free the programmer from implementation details.
Compiler can offer correctness guarantees, or increased reliability.
Compiler can implement performance optimizations.
Overview
Experiences with the declarative language Overlog.
Developed an improved parser.
Implemented static analysis for concurrent execution.
Adapted Overlog for replication systems.
P2 Review
Every node in the network has a runtime:
In-memory database stores tuples.
Messages are sent asynchronously as tuples.
Programs are specified in a logic-based language.
Overlog Review
Logic based language similar to Datalog.
A program is composed of:
materialized statements
logical rules
Computation proceeds by introducing events, and performing deductions.
System Overview
[Figure: three nodes, each with persistent storage, an external queue, a transmit queue, and an OverLog program.]
Developed an Improved Parser
Cleanly specify the grammar in Rats!.
Implement type inference.
Add error checking.
Parser
Lexical analysis and parsing integrated in PEGs
olg_lexer.lex and olg_parser.y combined
Grammar is modularized
Identifier - variable, function, aggregate identifiers...
Symbol - modifies xtc.symbol with :=, :-, @
Constant - floating point, integer, boolean...
Reserved - materialized, keys, infinity...
Core - rule, tuple, postfix expression...
Parser
rule: OLG_NAME functor OLG_IF termlist OLG_DOT
        { $$ = new compile::parse::Rule($2, $4, false, $1); }
    | OLG_NAME OLG_DEL functor OLG_IF termlist OLG_DOT
        { $$ = new compile::parse::Rule($3, $5, true, $1); }
    | functor OLG_IF termlist OLG_DOT
        { $$ = new compile::parse::Rule($1, $3, false); }
    | OLG_DEL functor OLG_IF termlist OLG_DOT
        { $$ = new compile::parse::Rule($2, $4, true); }
    | OLG_NAME functor OLG_IF aggview OLG_DOT
        { $$ = new compile::parse::Rule($2, $4, false, $1); }
    | functor OLG_IF aggview OLG_DOT
        { $$ = new compile::parse::Rule($1, $3, false); }
Parser
generic Rule =
    RuleIdentifier? ("delete":Word)? Tuple
    void:":-":Symbol TupleOrExpressionList ;
Type Inference
Single Java file, extends xtc’s Visitor
Uses a Hindley-Milner-like algorithm
Constraints and unification
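The constraint-and-unification idea can be illustrated with a short Python sketch. This is not the xtc-based Java implementation from the talk: the names `TypeVar`, `resolve`, and `unify` are hypothetical, and types are assumed to be flat strings such as "int", so no occurs check is needed.

```python
class TypeVar:
    """A placeholder for a not-yet-known type (hypothetical helper)."""

    def __init__(self, name):
        self.name = name


def resolve(t, subst):
    """Follow substitution links until a concrete type or a free variable."""
    while isinstance(t, TypeVar) and t in subst:
        t = subst[t]
    return t


def unify(a, b, subst):
    """Record the constraint a == b in subst, or raise TypeError on a clash."""
    a, b = resolve(a, subst), resolve(b, subst)
    if a is b or a == b:
        return subst
    if isinstance(a, TypeVar):
        subst[a] = b
    elif isinstance(b, TypeVar):
        subst[b] = a
    else:
        raise TypeError(f"cannot unify {a} with {b}")
    return subst


# Assumed example: the second field of a tuple is bound to variable C,
# and C later appears in an integer expression.
subst = {}
field = TypeVar("link.2")
c = TypeVar("C")
unify(field, c, subst)   # the field has the same type as C
unify(c, "int", subst)   # C is used as an int
field_type = resolve(field, subst)
```

After both constraints are recorded, `field_type` resolves to "int"; unifying two different concrete types, such as "boolean" and "int", raises a `TypeError`, which is the kind of condition the error checks report.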
Error Checking
Adds error checking for Overlog
Non-unique rule names
Redefined tuples (different arity, different types)
Expressions with incompatible types (boolean + int, string < boolean)
Functions redefined (different types, different number of arguments)
Variables redefined (different type)
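As an illustration of one of these checks, the sketch below flags tuples that are used with different arities. The function name `check_tuple_arity` and the input format (a list of name and arity pairs) are assumptions for the example, not the parser's actual interface.

```python
def check_tuple_arity(uses):
    """Report tuple names that appear with more than one arity.

    `uses` is a list of (tuple_name, arity) pairs in program order.
    """
    first_seen = {}
    errors = []
    for name, arity in uses:
        expected = first_seen.setdefault(name, arity)
        if expected != arity:
            errors.append(
                f"tuple '{name}' redefined: arity {arity}, expected {expected}"
            )
    return errors


# Hypothetical program: `link` is used once with 2 fields and once with 3.
errors = check_tuple_arity([("link", 2), ("path", 3), ("link", 3)])
```

The same pattern (remember the first definition, compare every later use) extends to field types and function signatures.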
Concurrency
Auto-Parallelization for Declarative Network Monitoring
Robert Soule, Robert Grimm, and Petros Maniatis
Poster at SOSP 2007
Motivation
Building networked monitoring applications is too hard:
A variety of tasks in the same setting.
Different operational patterns for each task.
Deployed across different environments.
Computations may be interdependent.
Approach
Declarative programming can help
Higher level language abstracts away many details
Not quite a complete solution.
Need to focus on scalability.
Static analysis identifies parallelism
Scheduling decisions more accessible to the compiler
Additional declarations inform the scheduler
Augmented code allows concurrent execution
Fixpoint
A single-event fixpoint is the program state such that no further deductions can be made before changing system state without the introduction of a new event.
A fixpoint is the local unit of atomicity.
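A fixpoint computation can be sketched as a loop that keeps applying rules until nothing new is derived. The Python below is a naive illustration only, not the P2 runtime's evaluation strategy: `run_fixpoint`, the rule functions `r1` and `r2`, and the tuple layouts are all assumptions for the example.

```python
def run_fixpoint(event, database, rules):
    """Apply rules until no new tuples can be derived from event + database."""
    facts = set(database) | {event}
    changed = True
    while changed:
        changed = False
        for rule in rules:
            new = rule(facts) - facts
            if new:
                facts |= new
                changed = True
    return facts


def r1(facts):
    # Hypothetical rule: a SYN packet yields a synIn tuple.
    return {("synIn", f[1], f[2]) for f in facts
            if f[0] == "pktIn" and f[3] == "SYN"}


def r2(facts):
    # Hypothetical rule: every synIn source address is recorded as seen.
    return {("seen", f[2]) for f in facts if f[0] == "synIn"}


# One external event triggers a chain of deductions; the loop stops when
# a full pass over the rules derives nothing new.
state = run_fixpoint(("pktIn", "n1", "10.0.0.2", "SYN"), set(), [r2, r1])
```

The returned set is the state at the fixpoint: no rule can add to it until another event arrives.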
Types of Parallelism
Inter-fixpoint
Multiple fixpoint computations may proceed at the same time
Compute the transitive closure of all rules, partition tuples into read and write sets
Intra-fixpoint
Rules within a fixpoint can execute in parallel because the state is static and language is single assignment
Need to be careful with side effects from function calls
Data Parallelism
Single instruction stream operating on multiple data sets (applying the same operation on every item in a list)
Requires runtime analysis (an estimate of data set size and latency of state transfer)
Dependencies
Compute the dependencies between tuples on a per-rule basis.
r1 synIn(@LclAddr, SrcAddr, f_now()) :-
    pktIn(@LclAddr, SrcAddr, D), f_tcpSyn(D).

r5 alarm(@LclAddr, SrcAddr, "CONNRST", f_now()) :-
    rstIn(@LclAddr, SrcAddr, TRST),
    synIn(@LclAddr, SrcAddr, TSYN),
    synackOut(@LclAddr, SrcAddr, DTSYNACK),
    not ackIn(@LclAddr, SrcAddr, TACK),
    TACK > TSYNACK > TSYN,
    TRST > TSYNACK > TSYN.
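The per-rule dependencies of r1 and r5 can be written down as a small table. The Python sketch below assumes a simplified representation (rule name mapped to a head tuple and a list of body tuples) and a hypothetical helper `dependencies`; function calls such as f_tcpSyn and the comparison terms are left out because they are not tuples.

```python
# Head and body tuple names transcribed from rules r1 and r5.
RULES = {
    "r1": ("synIn", ["pktIn"]),
    "r5": ("alarm", ["rstIn", "synIn", "synackOut", "ackIn"]),
}


def dependencies(rules):
    """Map each derived tuple name to the tuple names its rules read."""
    deps = {}
    for head, body in rules.values():
        deps.setdefault(head, set()).update(body)
    return deps


deps = dependencies(RULES)
```

Here `deps["alarm"]` contains synIn, and `deps["synIn"]` contains pktIn, so an incoming pktIn can reach alarm through two rules. The negated ackIn term still counts as a dependency because the rule reads that tuple.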
Dependencies
[Figure: dependency graph with nodes pktIn, r1, synIn, rstIn, ackIn, synackOut, r5, and alarm; labeled "global-Detector".]
Analysis
Perform DFS on the dependency graph.
Start at the non-materialized tuple on the right hand side of a rule.
Stop recursion at the external or materialized tuple on the left of a rule.
Mark tuples as read or write.
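One way to read these steps is as a depth-first traversal from an event tuple that collects the materialized tuples read and written along the way. The Python below is a sketch under that reading, not the analysis as implemented: `read_write_sets`, the rule representation, and the two example rules (including the `evict` event) are hypothetical.

```python
def read_write_sets(event, rules, materialized, external=frozenset()):
    """DFS from an event tuple; return (reads, writes) over materialized tuples.

    `rules` is a list of (head, body) pairs, where body lists tuple names.
    """
    reads, writes, visited = set(), set(), set()
    stack = [event]
    while stack:
        trigger = stack.pop()
        if trigger in visited:
            continue
        visited.add(trigger)
        for head, body in rules:
            if trigger not in body:
                continue
            # Materialized tuples joined in the body are read.
            reads.update(t for t in body if t in materialized)
            if head in materialized:
                writes.add(head)      # stop: the rule updates stored state
            elif head not in external:
                stack.append(head)    # derived event: keep following rules
    return reads, writes


# Hypothetical rules: maxSuccDist joins with node and succ to derive an
# evict event, and evict in turn updates the succ table.
example_rules = [
    ("evict", ["maxSuccDist", "node", "succ"]),
    ("succ", ["evict"]),
]
reads, writes = read_write_sets("maxSuccDist", example_rules, {"node", "succ"})
```

For this assumed rule set the traversal marks node and succ as read and succ as written.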
Analysis
[Figure: dependency graph with nodes alarm, r5, synIn, rstIn, ackIn, synackOut, r1, and pktIn; labeled "global-Detector".]
Example
Encode the results as Overlog:
concurrent("maxSuccDist", "node", "R").
concurrent("maxSuccDist", "succ", "R").
concurrent("maxSuccDist", "succ", "W").
concurrent("bestLookupDist", "node", "R").
concurrent("bestLookupDist", "finger", "R").
concurrent("bestLookupDist", "lookup", "W").
Conflict
Remove the next event from the external event queue.
Consult the concurrency table to compare read and write sets for current and running fixpoints.
Conflict
[Figure: dependency graph with nodes cpuSpike, r6, alarm, r5, synIn, rstIn, ackIn, and synackOut; labeled "global-Detector".]
Scheduling
Support multiple internal event queues. Check if:
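$$\mathit{Write}_{\mathit{new}} \cap (\mathit{Read}_{\mathit{current}} \cup \mathit{Write}_{\mathit{current}}) = \emptyset$$
$$\mathit{Write}_{\mathit{current}} \cap (\mathit{Read}_{\mathit{new}} \cup \mathit{Write}_{\mathit{new}}) = \emptyset$$
If no conflicts, enqueue event on its own internal event queue, and start execution.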
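The two emptiness checks translate directly into set operations. In the Python sketch below, the function names `conflicts` and `can_start` are hypothetical, and the read and write sets are the ones encoded in the `concurrent(...)` facts from the Example slide.

```python
def conflicts(new, current):
    """True if the new fixpoint may not run alongside the current one.

    Each argument is a (reads, writes) pair of sets of tuple names.
    """
    new_reads, new_writes = new
    cur_reads, cur_writes = current
    return bool(new_writes & (cur_reads | cur_writes)) or bool(
        cur_writes & (new_reads | new_writes)
    )


def can_start(new, running):
    """True if the new fixpoint conflicts with none of the running ones."""
    return not any(conflicts(new, cur) for cur in running)


# Read and write sets from the concurrent(...) facts.
max_succ_dist = ({"node", "succ"}, {"succ"})
best_lookup_dist = ({"node", "finger"}, {"lookup"})

ok_together = can_start(best_lookup_dist, [max_succ_dist])
ok_twice = can_start(max_succ_dist, [max_succ_dist])
```

With these sets, a bestLookupDist fixpoint can start while a maxSuccDist fixpoint is running, since they share only the read-only node table. Two maxSuccDist fixpoints cannot overlap because both write succ.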
Speculative Scheduling
An alternative is to speculatively execute, and check for conflict afterwards.
Best strategy depends on expected frequency of conflict.
Replication Systems
PADRE: A Policy Architecture for building Data REplication systems
Nalini Belaramani, Jiandan Zheng, Amol Nayate, Robert Soule, Mike Dahlin, Robert Grimm
In review, 14 pages, OSDI 2008
Replication Systems Background
Replication is used for data availability, fault tolerance, performance, etc.
There are trade-offs for replication strategies: performance vs. consistency vs. availability vs. partition resilience.
Replication Systems Background
Can you make it easier to build new replication systems and extend existing ones? Yes!
Separate policy from mechanism (PRACTI).
Policy is liveness strategy and safety constraints (PADRE).
Policies
Safety
Block requests until they do not violate certain invariants.
E.g. block until a write reaches 3 nodes, block until a server acknowledges a write.
Liveness
View as a routing decision.
Where to go to satisfy a read miss?
Where to send updates?
Overlog/P2
Convenient language/runtime layer for specifying liveness policies.
Example:
/* send ACK of the last received inval to predecessor */
tRdy04 sendAck(@Pred, Clk, NodeId) :-
    readyToServer(@X, I), I == 1,
    predecessor(@X, Pred), Pred != -1,
    latestInval(@X, Clk, NodeId).
Overlog/P2 Problem
Problems:
Poor performance.
Poor interface between PRACTI and P2.
How do we modify Overlog for liveness in replication?
Compile from Overlog to Java directly.
Simple code generation. We don’t compile to P2DL.
Provide a clean interface from the new runtime to PRACTI.
Introducing R/Overlog
Simplified Java implementation of the P2 runtime and Overlog compiler.
3,743 lines of code in the runtime (UT Austin).
5,652 lines of code in code generation (NYU).
R/Overlog Type Declarations
R/Overlog supports the following types:
int, float, location, string, boolean
Usually, the code generation can infer the types of all tuples.
Explicit type declarations when inference doesn’t work.
tuple apple (location, int).
fun int f_now().
R/Overlog Location Type
Location is a first class type.
Syntactically, that means
neighbor("localhost:5000", "localhost:5001").
becomes
neighbor(localhost:5000, localhost:5001).
and
neighbor(@X, Y), X := "-".
is not valid.
R/Overlog Limitations
R/Overlog does not support joins across nodes.
Not necessary for needs of PADRE.
Performance Improvements
                          P2 runtime    R/Overlog runtime
Local ping latency        3.8 ms        0.322 ms
Local ping throughput     232 req/s     9,390 req/s
Remote ping latency       4.8 ms        1.616 ms
Remote ping throughput    32 req/s      2,079 req/s
Performance numbers for processing NULL trigger to produce null event.
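The improvement factors implied by these measurements follow from simple ratios: latency improves by P2 divided by R/Overlog, throughput by R/Overlog divided by P2. The short Python sketch below computes them; the dictionary layout and the name `speedups` are assumptions for the example, and the numbers are copied from the table.

```python
# (P2 runtime, R/Overlog runtime) pairs copied from the table.
LATENCY_MS = {"local": (3.8, 0.322), "remote": (4.8, 1.616)}
THROUGHPUT_REQ_S = {"local": (232, 9390), "remote": (32, 2079)}


def speedups():
    """Return improvement factors of R/Overlog over P2 for each measurement."""
    result = {}
    for where, (p2, ro) in LATENCY_MS.items():
        result[f"{where} latency"] = p2 / ro        # lower latency is better
    for where, (p2, ro) in THROUGHPUT_REQ_S.items():
        result[f"{where} throughput"] = ro / p2     # higher throughput is better
    return result


factors = speedups()
```

The ratios come to roughly 12x lower local latency, 40x higher local throughput, 3x lower remote latency, and 65x higher remote throughput.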
Next Steps
What are the global semantics? (with UCB).
Perform query optimization for Pig? (with UCB).
Provide external data interface for Java Objects. (with UCB).
How to support model checking with Overlog.
Overlog to Promela? to Mace?
Model check PADRE. (with UT Austin).
Next Steps
Explore the advantages/disadvantages of Overlog to component vs. Overlog to Java compilation.
Improved language support for building distributed systems.