What Is RPC

  • 7/24/2019 What Is RPC

    What Is RPC?

    In this section

    Terms and Definitions

    RPC Dependencies and Interactions

    Microsoft Remote Procedure Call (RPC) is a powerful technology for creating distributed
    client/server programs. RPC is an interprocess communication technique that allows client
    and server software to communicate. The Microsoft RPC facility is compatible with the Open
    Group's Distributed Computing Environment (DCE) specification for remote procedure calls
    and is interoperable with other DCE-based RPC systems, such as those for HP-UX and IBM
    AIX UNIX-based operating systems.

    Computer operating systems and programs have steadily gotten more complex over the years.
    With each release, there are more features. The growing intricacy of systems makes it more
    difficult for developers to avoid errors during the development process. Often, developers
    create a solution for their system or application when a nearly identical solution has already
    been devised. This duplication of effort consumes time and money and adds complexity to
    already complex systems.

    RPC is designed to mitigate these issues by providing a common interface between
    applications. RPC serves as a go-between for client/server communications. It makes
    client/server interaction easier and safer by factoring out common tasks, such as
    security, synchronization, and data flow handling, into a common library so that developers
    do not have to dedicate time and effort to developing their own solutions.

    Terms and Definitions

    The following terms are associated with RPC.

    Client

    A process, such as a program or task, that requests a service provided by another program.
    The client process uses the requested service without having to "deal" with many working
    details about the other program or the service.

    Server

    A process, such as a program or task, that responds to requests from a client.

    Endpoint

    The name, port, or group of ports on a host system that is monitored by a server program for
    incoming client requests. The endpoint is a network-specific address of a server process for
    remote procedure calls. The name of the endpoint depends on the protocol sequence being
    used.

    Endpoint Mapper (EPM)

    Part of the RPC subsystem that resolves dynamic endpoints in response to client requests and,
    in some configurations, dynamically assigns endpoints to servers.

    Client Stub

    Module within a client application containing all of the functions necessary for the client to
    make remote procedure calls using the model of a traditional function call in a standalone
    application. The client stub is responsible for invoking the marshalling engine and some of
    the RPC application programming interfaces (APIs).

    Server Stub

    Module within a server application or service that contains all of the functions necessary for
    the server to handle remote requests using local procedure calls.

    RPC Dependencies and Interactions

    RPC is a client/server technology in the most generic sense. There is a sender and a receiver;
    data is transferred between them. This can be classic client/server (for example, Microsoft
    Outlook communicating with a server running Microsoft Exchange Server) or system
    services within the computer communicating with each other. The latter is especially
    common. Much of the Windows architecture is composed of services that communicate with
    each other to accomplish a task. Most services built into the Windows architecture use RPC
    to communicate with each other.

    The following table briefly describes the services in Windows Server 2003 that depend on the
    RPC system service (RPCSS).

    Services That Depend on RPCSS

    Service: Description

    Background Intelligent Transfer Service: Transfers data between clients and servers in the
      background.

    COM+ Event System: Supports System Event Notification Service (SENS), which provides
      automatic distribution of events to subscribing Component Ob[...]

    [...]ey Service, which helps enroll this computer for certificates.

    DHCP Server: Performs TCP/IP configuration for DHCP clients, including dynamic
      assignments of IP addresses, specification of the WINS and DNS servers, and
      connection-specific Domain Name System (DNS) names.

    Distributed Link Tracking Client: Enables client programs to track linked files that are
      moved within an NTFS volume to another NTFS volume on the same computer or to an
      NTFS volume on another computer.

    Distributed Link Tracking Server: Enables the Distributed Link Tracking Client service within
      the same domain to provide more reliable and efficient maintenance of links within the
      domain.

    Distributed Transaction Coordinator: Coordinates transactions that span multiple resource
      managers, such as databases, message queues, and file systems.

    DNS Server: Enables DNS clients to resolve DNS names by answering DNS queries and
      dynamic update requests.

    Error Reporting Service: Collects, stores, and reports unexpected application failures to
      Microsoft.

    File Replication Service: Allows files to be automatically copied and maintained
      simultaneously on multiple servers.

    Help and Support: Enables Help and Support Center to run on the computer.

    Human Interface Device Access: Enables generic input access to Human Interface Devices
      (HID), which activates and maintains the use of predefined hot buttons on keyboards,
      remote controls, and other multimedia devices.

    Indexing Service: Indexes contents and properties of files on local and remote computers;
      provides rapid access to files through flexible querying language.

    IPSec Services: Provides end-to-end security between clients and servers on TCP/IP
      networks.

    Kerberos Key Distribution Center: On domain controllers, enables users to log on to the
      network using the Kerberos authentication protocol.

    Logical Disk Manager: Detects and monitors new hard disk drives and sends disk volume
      information to Logical Disk Manager Administrative Service for configuration.

    Logical Disk Manager Administrative Service: Configures hard disk drives and volumes.

    Messenger: Transmits net send and Alerter service messages between clients and servers.
      This service is not related to Windows Messenger.

    Microsoft Software Shadow Copy Provider: Manages software-based volume shadow copies
      taken by the Volume Shadow Copy service.

    Network Connections: Manages ob[...]

    Terminal Services: Allows users to connect interactively to a remote computer. Remote
      Desktop, Fast User Switching, Remote Assistance, and Terminal Server depend on this
      service.

    Terminal Services Session Directory: Enables a user connection request to be routed to the
      appropriate terminal server in a cluster.

    Upload Manager: Manages the synchronous and asynchronous file transfers between clients
      and servers on the network.

    Virtual Disk Service: Provides software volume and hardware volume management service.

    Volume Shadow Copy: Manages and implements Volume Shadow Copies used for backup and
      other purposes.

    Windows Audio: Manages audio devices for Windows-based programs.

    Windows Image Acquisition (WIA): Provides image acquisition services for scanners and
      cameras.

    Windows Installer: Installs, repairs, and removes software according to instructions
      contained in .MSI files.

    Windows Internet Name Service (WINS): Resolves NetBIOS names for TCP/IP clients by
      locating network services that use NetBIOS names.

    Windows Management Instrumentation: Provides a common interface and ob[...]

    Microsoft Remote Procedure Call (RPC) is an interprocess communication (IPC) mechanism
    that enables data exchange and invocation of functionality residing in a different process.
    That process can be on the same computer, on the local area network (LAN), or across the
    Internet. The Microsoft RPC mechanism uses other IPC mechanisms, such as named pipes,
    NetBIOS, or Winsock, to establish communications between the client and the server. With
    RPC, essential program logic and related procedure code can exist on different computers,
    which is important for distributed applications.

    This section covers the architecture of RPC and the way in which RPC communication takes
    place.

    RPC Architecture

    By replacing dedicated protocols and communication methods with a robust and standardized
    interface, RPC is designed to facilitate communication between client and server processes.
    The functions contained within RPC are accessible by any program that must communicate
    using a client/server methodology. The following figure shows the RPC architecture.

    RPC Architecture

    The following table lists and describes the components and functions of the RPC architecture.

    RPC Components

    Component: Description

    Client or server process: Program or service that initiates or responds to an RPC request.

    RPC stubs: Program subsystems used by a client or server to initiate an RPC request.

    Marshalling engine (NDR20 or NDR64): Provides a common RPC interface between RPC clients
      and servers. NDR20 is used in a 32-bit architecture and NDR64 is optimized for a
      64-bit architecture. The client and the server negotiate which marshalling engine is
      used for the communication.

    Runtime application programming interface (API): Provides a direct interface for RPC to
      clients or servers. RPC clients and servers typically call the runtime API to initialize
      RPC and prepare the data structure that is used to make RPC calls. This runtime API layer
      also determines if an RPC request coming from a marshalling engine or directly from a
      client or server is going to a local server or a remote server. The runtime API layer
      then routes the RPC to the Connection RPC, Datagram RPC, or Local RPC layers.

    Connection RPC protocol engine: Used when the RPC requires a connection-oriented protocol.
      This layer designates the connection-oriented protocol to use if the RPC is outgoing or
      receives an incoming connection-oriented RPC.

    Datagram RPC protocol engine: Used when the RPC requires a connectionless protocol. This
      layer designates the connectionless protocol to use if the RPC is outgoing or receives
      an incoming connectionless RPC.

    Local RPC protocol engine: Used when the server and client are located on the same host.

    Registry: Accessed when the RPC service first loads. Registry keys may specify IP port
      ranges and the device names of network cards that RPC should bind to. Unless APIs force
      its use, the registry is not used in normal RPC operations.

    Win32 APIs (kernel32.dll, advapi32.dll, ntdll.dll): Kernel32.dll is a Windows base API
      client dynamic-link library (DLL) file that provides system services for managing
      threads, memory, and resources. Advapi32.dll is an advanced Windows 32 base API DLL
      file; it is an API services library that supports security and registry calls.
      Ntdll.dll is an NT layer DLL file that controls Windows system functions.

    SSPI (secur32.dll): Provides a security interface for RPC. Negotiates the use of Kerberos,
      NTLM, or Secure Sockets Layer (SSL) for authentication and encryption.

    Endpoint Mapper (EPM) (rpcss.dll): Rpcss.dll primarily provides the infrastructure for COM,
      but a portion of rpcss.dll is used for the EPM. An RPC server contacts the EPM to
      receive dynamic endpoints and register those endpoints in the EPM database. RPC clients
      contact the EPM from the protocol-engine level to resolve endpoints when establishing a
      connection with an unknown RPC server endpoint.

    Active Directory: Used in the RPC client process only when the security interface specifies
      Kerberos or Negotiate as the security provider or when the server uses NTLM as the
      security provider.

    Network stack: Used to pass RPC requests and replies between a client and a remote server.

    Kernel: Used to pass RPC requests and replies between a client and a local server.

    RPC Processes and Interactions

    The RPC components make it easy for clients to call a procedure located in a remote server
    program. The client and server each have their own address spaces; that is, each has its own
    memory resource allocated to data used by the procedure. The following figure shows the
    RPC process.

    RPC Process

    The RPC process starts on the client side. The client application calls a local stub procedure
    instead of code implementing the procedure. Stubs are compiled and linked with the client
    application during development. Instead of containing code that implements the remote
    procedure, the client stub code retrieves the required parameters from the client address space
    and delivers them to the client runtime library. The client runtime library then translates the
    parameters as needed into a standard Network Data Representation (NDR) format for
    transmission to the server.

    Note

    There are two NDR marshalling engines within the RPC runtime library: NDR20 and
    NDR64. A 32-bit client initiating the communication uses the NDR20 marshalling
    engine; a 64-bit client can use either the NDR20 or the NDR64 marshalling engine.
    The same marshalling engine is used on both the client and the server side, regardless
    of program architecture. There is a slight decline in performance when either the
    client or server uses an architecture different from the other because the marshalling
    engine must do additional translation during the communication.

    The client stub then calls functions in the RPC client runtime library (rpcrt4.dll) to send the
    request and its parameters to the server. If the server is located on the same host as the client,
    the runtime library can use the Local RPC (LRPC) function and pass the RPC request to the
    Windows kernel for transport to the server. If the server is located on a remote host, the
    runtime library specifies an appropriate transport protocol engine and passes the RPC to the
    network stack for transport to the server. RPC can use other IPC mechanisms, such as named
    pipes and Winsock, to accomplish the transport. The other IPC mechanisms allow RPC more
    flexibility in the way in which it completes its communications tasks. For more information
    about IPC mechanisms, see "Interprocess Communications" on MSDN.

    The following table lists the network protocols supported by RPC and the type of RPC
    connection for which the protocol is used.

    RPC-Supported Network Protocols

    Protocol                               RPC Type
    Transmission Control Protocol (TCP)    Connection-oriented
    Sequenced Packet Exchange (SPX)        Connection-oriented
    Named Pipe                             Connection-oriented
    HTTP                                   Connection-oriented
    User Datagram Protocol (UDP)           Connectionless
    Cluster Datagram Protocol (CDP)        Connectionless
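
    As a rough illustration of how the runtime could route a call, the following Python sketch
    encodes the table above and picks a protocol engine. The names RpcType, PROTOCOL_TYPES,
    and select_protocol_engine are hypothetical and are not part of the RPC runtime API. The
    sketch also assumes that a call to a server on the same host always goes through Local
    RPC, whereas the text only says that the runtime can do so.

```python
from enum import Enum


class RpcType(Enum):
    CONNECTION_ORIENTED = "connection-oriented"
    CONNECTIONLESS = "connectionless"


# Protocol -> RPC type, copied from the table of RPC-supported network protocols.
PROTOCOL_TYPES = {
    "TCP": RpcType.CONNECTION_ORIENTED,
    "SPX": RpcType.CONNECTION_ORIENTED,
    "Named Pipe": RpcType.CONNECTION_ORIENTED,
    "HTTP": RpcType.CONNECTION_ORIENTED,
    "UDP": RpcType.CONNECTIONLESS,
    "CDP": RpcType.CONNECTIONLESS,
}


def select_protocol_engine(protocol: str, same_host: bool) -> str:
    """Return the name of the protocol engine that would carry the call.

    Assumption: a same-host call always uses Local RPC in this sketch.
    """
    if same_host:
        return "Local RPC"
    rpc_type = PROTOCOL_TYPES[protocol]  # raises KeyError for unlisted protocols
    if rpc_type is RpcType.CONNECTION_ORIENTED:
        return "Connection RPC"
    return "Datagram RPC"


if __name__ == "__main__":
    for name in PROTOCOL_TYPES:
        print(name, "->", select_protocol_engine(name, same_host=False))
```

    Keeping the protocol table as data rather than as a chain of conditionals means that the
    routing rule stays a two-way decision even if protocols are added or removed.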

    When the server receives the RPC, either locally or from a remote client, the server RPC
    runtime library functions accept the request and call the server stub procedure. The server
    stub retrieves the parameters from the network buffer and, using one of the NDR marshalling
    engines, converts them from the network transmission format to the format required by the
    server. The server stub calls the actual procedure on the server. The remote procedure then
    runs, possibly generating output parameters and a return value. When the remote procedure is
    complete, a similar sequence of steps returns the data to the client.

    The remote procedure returns its data to the server stub which, using one of the NDR
    marshalling engines, converts output parameters to the format required for transmission back
    to the client and returns them to the RPC runtime library functions. The server RPC runtime
    library functions transmit the data to the client computer using either LRPC or the network.

    The client completes the process by accepting the data over the network and returning it to
    the calling function. The client RPC runtime library receives the remote-procedure return
    values, converts the data from its NDR to the format used by the client computer, and returns
    them to the client stub.

    For Microsoft Windows, the runtime libraries are provided in two parts: an import library,
    which is linked to the application, and the RPC runtime library, which is implemented as a
    DLL.

    The server application contains calls to the server runtime library functions, which register
    the server's interface with the RPC runtime and, optionally, the EPM, and allow the server to
    accept remote procedure calls. The server application also contains the application-specific
    remote procedures that are called by the client applications.
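
    The round trip described in this section can be sketched in a few lines of Python. This is
    a minimal, single-process illustration under stated assumptions, not the Microsoft
    implementation: JSON stands in for NDR marshalling, a direct function call stands in for
    LRPC or the network stack, and every name used here (add, SERVER_PROCEDURES, server_stub,
    transport, client_stub_add) is hypothetical.

```python
import json


def add(a, b):
    """The 'remote' procedure; only the server side implements it."""
    return a + b


# Procedures the server exposes, keyed by name (hypothetical registry).
SERVER_PROCEDURES = {"add": add}


def server_stub(request_bytes: bytes) -> bytes:
    """Unmarshal the request, call the real procedure, marshal the reply."""
    request = json.loads(request_bytes.decode("utf-8"))
    procedure = SERVER_PROCEDURES[request["proc"]]
    result = procedure(*request["args"])
    return json.dumps({"result": result}).encode("utf-8")


def transport(request_bytes: bytes) -> bytes:
    """Stand-in for LRPC or the network stack: hands the bytes to the server."""
    return server_stub(request_bytes)


def client_stub_add(a, b):
    """Looks like a local function to the caller, but forwards the call."""
    # Marshal the parameters into a transmission format (JSON here, NDR in RPC).
    request_bytes = json.dumps({"proc": "add", "args": [a, b]}).encode("utf-8")
    reply_bytes = transport(request_bytes)
    # Unmarshal the return value and hand it back to the calling function.
    return json.loads(reply_bytes.decode("utf-8"))["result"]


if __name__ == "__main__":
    print(client_stub_add(2, 3))
```

    The caller of client_stub_add never sees the byte buffers, which mirrors the point made
    above: the stubs and runtime hide marshalling and transport behind an ordinary function
    call.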

    RPC Security Context Multiplexing

    Windows Server 2003 Service Pack 1 (SP1) provides RPC security context multiplexing for
    connection-oriented connections, such as those that use Transmission Control Protocol
    (TCP). This allows the RPC server to negotiate multiple security contexts over a single
    connection. For example, when multiple RPC clients establish a connection to an RPC server
    and a middle-tier RPC server resides between the clients and the destination server, the
    middle-tier server multiplexes the security context of the additional clients over an already
    established connection to the destination server. This eliminates the need for the middle-tier
    RPC server to use one of the exhaustible TCP ports to establish a new connection for each
    RPC client connecting to the RPC server.

    Network Ports Used by RPC

    RPC server programs typically use dynamic port mappings to avoid conflicts with programs
    and protocols registered in the range of well-known TCP ports. RPC server programs
    associate their universally unique identifier (UUID) with a dynamic port and register the
    combination with the RPC EPM. The EPM provides a single point of contact for RPC clients.
    The RPC clients contact the EPM and use the server program's UUID to determine the port
    being used by the server program. The following table indicates the network ports normally
    used by RPC.

    Port Assignments for RPC

    Service Name           UDP                       TCP
    HTTP                   80, 443, 593              80, 443, 593
    Named Pipes            445                       445
    RPC Endpoint Mapper    135                       135
    RPC Server Programs    <Dynamically assigned>    <Dynamically assigned>
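
    The register-then-resolve exchange with the EPM described above can be pictured with a toy
    in-memory mapper. This is only a sketch: the EndpointMapper class, its methods, and the
    port number 50123 are hypothetical and do not correspond to the real RPC API or to any
    documented port range. The one value taken from this document is the EPM's own well-known
    port.

```python
import uuid

EPM_PORT = 135  # well-known RPC Endpoint Mapper port, as listed in the port table


class EndpointMapper:
    """Toy endpoint database keyed by the server program's UUID."""

    def __init__(self):
        self._endpoints = {}

    def register(self, interface_id: uuid.UUID, port: int) -> None:
        """Called by a server: record which dynamic port serves this UUID."""
        self._endpoints[interface_id] = port

    def resolve(self, interface_id: uuid.UUID):
        """Called by a client: return the registered port, or None if unknown."""
        return self._endpoints.get(interface_id)


if __name__ == "__main__":
    epm = EndpointMapper()
    server_id = uuid.uuid4()          # stands in for the server program's UUID
    epm.register(server_id, 50123)    # 50123 is an arbitrary example port
    print("client asks EPM on port", EPM_PORT, "and gets", epm.resolve(server_id))
```

    Because clients only need to know the mapper's fixed port and the server's UUID, server
    programs are free to pick a different dynamic port on every start.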

    Distributed enterprise applications, such as web applications, are often built from more basic
    services, such as storage services, database management systems, authentication and
    configuration services, and services for interfacing with external components (e.g., credit
    card processing, banking, vendors, etc).

    As systems become larger, more complex, and more ubiquitous, there is a corresponding
    increase in the number, diversity, and geographical dispersion of the remote services that they
    use. For instance, Hotmail and Live Messenger share an address book service and an
    authentication service; there are also services specialized for each application, say, for email
    storage or virus scanning. These services are heterogeneous; they are often developed by
    different teams and are geo-distributed, running in different parts of the world.

    Geo-distribution provides many benefits: high availability, disaster tolerance, locality, and
    ability to scale beyond one data center or site. However, the thin and slow links connecting
    different sites pose challenges, especially in an enterprise setting, where applications have
    strict performance requirements. For instance, web applications should ideally respond within
    one second [13].

    Figure 1: (Left) Standard RPCs. (Right) RPC chain.

    The most common primitives for inter-service communication are remote procedure calls
    (RPCs) or RPC-like mechanisms. RPCs can impose undesirable communication patterns and
    overheads when a client needs to make multiple calls to servers. This is because RPCs
    impose communication of the form A→B→A (A calls B, which returns to A) even though this
    pattern may not be optimal. For example, in Figure 1 left, a client A in site 1 uses RPCs to
    consecutively call servers B, C, and D in site 2. Server B, in turn, calls servers E and F in site
    3. The use of RPCs forces the execution to return to A and B multiple times, causing 10
    crossings of inter-site links.

    We propose a simple but more general communication primitive called a Chain of Remote
    Procedure Calls, or simply RPC chain, which allows a client to call multiple servers in
    succession (A→B1→B2→...→A), where the request flows from server to server without involving
    the client every time. The result is a much improved communication pattern, with fewer
    communication hops, lower end-to-end latency, and often lower bandwidth consumption. In
    Figure 1 right, we see how an RPC chain reduces the number of inter-site crossings to 4. The
    example in this figure is representative of a web mail application, where host A is a web
    server that retrieves a message from an email server B, then retrieves an associated calendar
    entry from a calendar service C, and finally retrieves relevant ads from an ad server D.
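
    The crossing counts for the example can be checked with a short Python sketch. The site
    placement (A in site 1; B, C, and D in site 2; E and F in site 3) comes from the text. The
    exact hop sequences are assumptions, because the figure itself is not reproduced here: the
    standard case assumes each call returns to its caller, and the chain is assumed to visit
    A, B, E, F, C, D and then return to A.

```python
# Site of each host, as described for Figure 1.
SITE = {"A": 1, "B": 2, "C": 2, "D": 2, "E": 3, "F": 3}

# Standard RPCs: every call is followed by a return to the caller (assumed order).
STANDARD_RPC_HOPS = [
    ("A", "B"), ("B", "E"), ("E", "B"), ("B", "F"), ("F", "B"), ("B", "A"),
    ("A", "C"), ("C", "A"),
    ("A", "D"), ("D", "A"),
]

# RPC chain: the request flows from server to server (assumed order).
RPC_CHAIN_HOPS = [
    ("A", "B"), ("B", "E"), ("E", "F"), ("F", "C"), ("C", "D"), ("D", "A"),
]


def inter_site_crossings(hops):
    """Count the hops whose source and destination are in different sites."""
    return sum(1 for src, dst in hops if SITE[src] != SITE[dst])


if __name__ == "__main__":
    print("standard RPCs:", inter_site_crossings(STANDARD_RPC_HOPS))  # 10
    print("RPC chain:    ", inter_site_crossings(RPC_CHAIN_HOPS))     # 4
```

    Under these assumptions the chain saves crossings in two ways: it never bounces back to A
    between servers, and consecutive servers in the same site (E and F, C and D) are visited
    back to back.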

    The key idea of RPC chains is to embed the chaining logic as part of the RPC call. This logic
    can be a generic function, constrained by some simple isolation mechanisms. RPC chains
    have three important features:

    (1) Server modularity. What made RPCs so successful is the clean decoupling
    of server code, which allows servers to be developed independently of
    each other and the client. RPC chains preserve this attribute, even
    allowing existing legacy RPCs to be part of a chain through simple wrappers.

    (2) Chain composability. If a server in the chain itself wishes to call another
    server, this nested call can be simply added to the chain in flux. In
    Figure 1, when client A starts the chain, it intends to call only servers B, C,
    and D. But server B wants to call servers E and F, and so it adds them to
    the chain.

    (3) Chain dynamicity. The services that a host calls need not be defined a
    priori; they can vary dynamically during execution. In the left figure, the
    fact that client A calls servers C and D need not be known before A calls
    server B; it can depend on the result returned by B. For example, an error
    condition may cause a chain to end immediately instead of continuing on
    to the next server.

    We demonstrate RPC chains through a storage and a web application. For the storage
    application, we show how a storage server can be enabled to use RPC chains, and we give a
    simple use in which a client can copy data between servers without having to handle the data
    itself. This speeds up the copying and saves significant bandwidth. For the web application,
    we implement a simple web mail service that uses chains to reduce the overheads of an ad
    server.

    The paper is organized as follows. We explain the setting for RPC chains in Section 2.
    Section 3 covers the design of RPC chains and Section 4 covers applications. We evaluate
    RPC chains in Section 5, and we explain their limitations in Section 6. A discussion follows
    in Section 7. We discuss related work in Section 8 and we conclude the paper in Section 9.

    2 Setting

    We consider enterprise systems that span geographically-diverse sites, where each site is a
    local area network. Sites are connected to each other through thinner and slower wide area
    links. Wide-area links can be made faster by improving the underlying network, and lots of
    progress has been made here, but this progress is hindered by economic barriers (e.g., legacy
    infrastructure), technological obstacles (e.g., switching speeds), and fundamental physical
    limitations (e.g., speed of light). Thus, the large discrepancy between the performance of
    local and wide-area links will continue.

    Unlike the Internet as a whole, enterprise systems operate in a trusted environment with a
    single administrative domain and experience little churn. These systems may contain a wide
    range of services, often developed by many different teams, including general services for
    storage, database management, authentication, and directories, as well as application-specific
    services, such as email spam detection, address book management, and advertising. These
    services are often accessed using RPCs, which we broadly define as a mechanism in which a
    client sends a request to a server and the server sends back a reply. This definition includes
    many types of client-server interactions, such as the interactions in CORBA, COM, REST,
    SOAP, etc.

    In enterprise environments, application developers are not malicious, though some level of
    isolation is desirable so that a problem in one application or service does not affect others.

    3 Design

    We now explain the design of RPC chains, starting with the basic mechanism for chaining
    RPCs in Section 3.1. The code that chains successive RPCs is stored in a repository,
    explained in Section 3.2. In Section 3.3, we cover the state that is needed during the chain
    execution. We then discuss composition of chains in Section 3.4, legacy servers in
    Section 3.5, isolation in Section 3.6, debugging in Section 3.7, exceptions in Section 3.8,
    failures in Section 3.9, and chain splitting in Section 3.10.

    3.1 Main mechanism

    Servers provide services in the form of service functions, which is the general term we use for
    remote procedures, remote methods, or any other processing units at servers. An RPC chain
    calls a sequence of service functions, possibly at different servers. Service functions are
    connected together via chaining functions, which specify the next service function to execute
    in a chain (see Figure 2 top). Chaining functions are provided by the client and executed at
    the server. They can be arbitrary C# methods with the restriction that they be stand-alone
    code, that is, code which does not refer to non-local variables and functions, so that they can
    be compiled by themselves.

    We chose this general form of chaining for two reasons. First, we want to allow the chain to
    unfold dynamically, so that the choice of next hop depends on what happens earlier in the
    chain. For example, an error at a service function could shorten a chain. Second, we wanted
    to support server modularity, so that services and client applications can be developed
    independently. Thus, a server may not produce output that is immediately ready for another
    server, in the way intended by the client's application. One may need to convert formats,
    reorder parameters, combine them, or even combine the outputs from several servers in the
    chain. For example, an NFS server does not output data in the format expected by a SQL
    server: one needs glue that will convert the output, choose the tables, and add the appropriate
    SQL wrapper, according to application needs. Chaining functions provide this glue. We
    initially considered a simpler alternative to chaining functions, in which a client [...]
    even write a programmer tool that automatically does that), so our design includes static lists
    as a special case.

    // service function
    object sf(object parmlist)
      // parmlist: parameter list

    // chaining function
    nexthop cf(object state, object result)
      // state: from client or earlier parts of chain
      // result: from last preceding service function
      // returns next chain hop:
      //   (server, sf_name, parmlist, cf_name, state)

    --------------------

    chain_id start_chain(machine_t server,
                         string sf_name, object parmlist,
                         string cf_name, object state)

    Figure 2: (Top) Signature of a service function (sf) and chaining function (cf).
    (Bottom) Signature of function that launches an RPC chain.


    Figure 3: Execution of an RPC chain (see explanatory text in Section 3.1). RPCC stands for RPC chain.

    Figure 3 shows how an RPC chain executes. (1) A client calls our RPCC (RPC chain) library, specifying a server, a reference to a service function sf at that server, its parameters, and a chaining function cf. (2) This information is then sent to the chosen server. (3) The server executes service function sf, which (4) returns a result. (5) This result is passed to the chaining function cf, which then (6) returns the next server, service function, and chaining function, and (7) the chain continues.

    For example, suppose client A wants to call service functions sfB, sfC, and sfD at servers B, C, and D, in this order. To do so, the client specifies a reference to sfB and a chaining function cf1. cf1 causes a call to sfC at server C with a chaining function cf2, which in turn causes a call to sfD at server D with a chaining function cf3, which causes the final result to be returned to the client A.
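The execution steps above can be sketched as a small in-process simulation. This is only an illustration under stated assumptions: Python stands in for the system's own language, servers are plain dictionaries of service functions, chaining functions are passed as callables rather than by name, and NextHop, run_chain, and the toy read/write services are hypothetical names that do not come from the RPCC library.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class NextHop:
    server: Optional[str]            # None means the chain has ended
    sf_name: str = ""                # service function to call at that server
    parmlist: Any = None             # parameters for the service function
    cf: Optional[Callable] = None    # chaining function to run afterwards
    state: Any = None                # state carried along the chain


def run_chain(servers, hop):
    """Follow hops until a chaining function returns server=None."""
    path = []
    while hop.server is not None:
        path.append(hop.server)
        sf = servers[hop.server][hop.sf_name]   # look up the service function
        result = sf(hop.parmlist)               # execute it and get a result
        hop = hop.cf(hop.state, result)         # chaining function picks the next hop
    return hop.state, path


# Toy storage services (hypothetical) at two servers.
files_b = {"a.txt": "hello"}
files_c = {}


def read_file(parmlist):
    return files_b[parmlist["name"]]


def write_file(parmlist):
    files_c[parmlist["name"]] = parmlist["data"]
    return "ok"


def cf1(state, result):
    # Glue: the output of the read becomes the input of the write.
    return NextHop("C", "write", {"name": state["dst"], "data": result}, cf2, state)


def cf2(state, result):
    # End of chain: report the status back to the originator.
    return NextHop(None, state={"status": result})


servers = {"B": {"read": read_file}, "C": {"write": write_file}}
first = NextHop("B", "read", {"name": "a.txt"}, cf1, {"dst": "copy.txt"})
final, path = run_chain(servers, first)
```

In this sketch the data moves from B to C without passing through the caller, which is the communication pattern the chain is meant to produce.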

    3.2 Chaining function repository

    Chaining functions are provided by clients but executed at servers. To save bandwidth, in our implementation the client does not send the actual code to the server. Rather, the client uploads the code to a repository, and sends a reference to the server; the server downloads the code from the repository and caches it for subsequent use. The repository stores chaining functions in source code format, and servers compile the code at runtime using the reflection capabilities of .NET/C# (Java has similar capabilities).

    We store source code because it introduces fewer dependencies, is more robust (binary formats change more frequently), and simplifies debugging. Because the cost of runtime compilation can be significant (50 ms, see Section 5.2.1), servers cache the compiled code, not the source code, to avoid repeated compilations.

    When the chaining function is very small, it could be transmitted by the client with the RPC chain, so that the server does not have to contact the repository. Our implementation presently does not support this option.
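A minimal sketch of the repository scheme, assuming Python's built-in compile and exec in place of the runtime compilation facilities of the original system. The Repository and Server classes, the reference string, and the function name cf are hypothetical; the point is only that a server downloads source once per reference and caches the compiled function rather than the source.

```python
class Repository:
    """Stands in for the chaining-function repository (source keyed by reference)."""

    def __init__(self):
        self._sources = {}
        self.downloads = 0

    def upload(self, ref, source):
        self._sources[ref] = source

    def download(self, ref):
        self.downloads += 1
        return self._sources[ref]


class Server:
    def __init__(self, repo):
        self.repo = repo
        self._compiled = {}   # ref -> callable; caches compiled code, not source

    def get_cf(self, ref, func_name="cf"):
        if ref not in self._compiled:
            source = self.repo.download(ref)
            namespace = {}
            # Compile the source at run time and pull out the named function.
            exec(compile(source, ref, "exec"), namespace)
            self._compiled[ref] = namespace[func_name]
        return self._compiled[ref]


repo = Repository()
repo.upload("copy-v1", "def cf(state, result):\n    return (state, len(result))\n")
server = Server(repo)
first = server.get_cf("copy-v1")    # downloads and compiles
second = server.get_cf("copy-v1")   # served from the compiled-code cache
```

Note that this sketch does no sandboxing at all; running client code with exec is acceptable only in an illustration.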

    3.3 Parameters and state

    A chaining function is client logic that may depend on run-time variables, tables, or other state from the client or earlier parts of the chain. This state needs to be passed along the chain, and ideally it should be small, otherwise its transmission cost can outweigh the benefits of an RPC chain (see Section 5.2.2). We represent the state as a set of name-value pairs, which is passed as a parameter to the chaining function (see Figure 2).

    The output of each service function is also passed as a parameter to the subsequent chaining function. For example, in our storage copy application (Section 4.1), the first service function reads a file, and the chaining function uses the result as input to the next service function, which writes to a file on a different server. In our email application, a service function reads an email message, and the chaining function adds the message to the state of the next chaining function, so that the message is passed along the chain back to the chain originator (a mail web server).

    3.4 Nesting and composition

    Figure 4: Composition of nested chains. (Left) The main chain 1 and a sub-chain 2. (Right) Result and manner of composing chains. (I) B starts a sub-chain, causing the RPCC library to push the B→C chaining function and its state parameter into a stack. (II) Chaining function at F returns an indication that the chain ended and the result that B is supposed to produce. This causes the RPCC library to pop from the stack, obtaining the B→C chaining function and its state parameter. It then calls this chaining function with the result and state. The chain now continues at C.

    RPC chains can be nested: a service function in a chain may itself start a sub-chain. For example, the main chain could call a storage service, which then needs to call a replica. We implement nesting so that a nested chain can be ad


    procedure at the service, while a chaining function at C represents logic coming from A. At E, the chaining function that calls F represents logic coming from B.

    To compose a chain with its sub-chain, the chaining function of the parent chain needs to be invoked when a sub-chain ends (to continue the parent chain). Accordingly, when a host starts a sub-chain, the RPCC library saves the chaining function and its state parameter, and passes them along the sub-chain. The sub-chain ends when its chaining function returns null in nexthop.server, and a result in nexthop.state (this is the result that the host originating the sub-chain must produce for the parent chain). When that happens, the RPCC library calls the saved chaining function with the saved state and nexthop.state. Note that a chain and a sub-chain need not be aware of each other for composition.

    To allow multiple levels of nesting, we use a chain stack that stores the saved chaining function and its state for each level of composition. The stack is popped as each sub-chain ends.
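The chain stack can be sketched as follows. This is a hedged illustration, not the library's code: Python is used for brevity, hops carry callables directly, and the SubChain marker (returned by a service function that wants to start a sub-chain) is a hypothetical device for signalling nesting inside a single process. Host names follow the A, B, C, E, F example discussed above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class NextHop:
    server: Optional[str]            # None signals that the (sub-)chain ended
    sf: Optional[Callable] = None
    parmlist: Any = None
    cf: Optional[Callable] = None
    state: Any = None                # when server is None, holds the result


@dataclass
class SubChain:
    first_hop: NextHop               # returned by a service function that nests


def run(hop):
    stack = []   # chain stack: one (cf, state) entry per nesting level
    trace = []
    while True:
        if hop.server is None:
            if not stack:
                return hop.state, trace              # top-level chain ended
            parent_cf, parent_state = stack.pop()    # a sub-chain ended
            hop = parent_cf(parent_state, hop.state)
            continue
        trace.append(hop.server)
        result = hop.sf(hop.parmlist)
        if isinstance(result, SubChain):
            stack.append((hop.cf, hop.state))        # save the parent's cf and state
            hop = result.first_hop
        else:
            hop = hop.cf(hop.state, result)


def store(parm):                 # service at B: starts a sub-chain via E and F
    return SubChain(NextHop("E", lambda p: p + "+E", parm, e_to_f, None))


def e_to_f(state, result):       # logic from B, runs at E
    return NextHop("F", lambda p: p + "+F", result, end_sub, None)


def end_sub(state, result):      # runs at F: ends the sub-chain with B's result
    return NextHop(None, state=result)


def b_to_c(state, result):       # logic from client A, unaware of the sub-chain
    return NextHop("C", lambda p: p + "+C", result, finish, state)


def finish(state, result):
    return NextHop(None, state=result)


final, trace = run(NextHop("B", store, "x", b_to_c, None))
```

The parent chaining function b_to_c never refers to E or F, which mirrors the point that a chain and its sub-chain need not be aware of each other.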

    3.5 Handling legacy RPC services

    RPC chains support legacy services that have standard RPC interfaces. For that, we use a simple wrapper module, installed at the legacy RPC server, which includes the RPCC library and exposes the legacy remote procedures as service functions.

    Each service function passes requests and responses to and from the corresponding legacy remote procedure. Because the service function calls the legacy remote procedure locally through the RPC's standard network interface (e.g., TCP), the legacy server will see all requests as coming from the local machine, and this can affect network address-based server access control policies. (This is not a problem if access control is based on internal RPC authenticators, such as signatures or tokens, which can be passed on by the wrapper.)

    One solution is to re-implement the access control mechanism at the wrapper, but this is application-specific. A better solution is for the wrapper to fake the network address of its requests and capture the remote procedure's output before it is placed on the network.

    3.6 Isolation

    Chaining functions are pieces of client code running at servers. Even though clients are trustworthy in the environment we consider, they are still prone to buffer overruns, crashes, and other problems. Thus, chaining functions are sandboxed to provide isolation, so that client code cannot crash or otherwise adversely affect the server on which it runs.

    We need two types of isolation: (1) restricting access to sensitive functions, such as file and network I/O and privileged operating system calls, and (2) restricting excessive consumption of resources (CPU and memory).

    We achieve (1) through direct support by .NET/C# of access restrictions to file I/O, system and environment variables, registry, clipboard, sockets, and other sensitive functions (Java has similar capabilities). This is accomplished by placing descriptive annotations, called attributes, in the source code of chaining functions when they are compiled at run-time.

    We achieve (2) by monitoring CPU and memory utilization and checking that they are within preset values. The appropriate values are a matter of policy at the server, but for the short-lived type of executions that we target with RPC chains, chaining functions should consume at most a few CPU seconds and hundreds of megabytes of memory, even in the most extreme cases.

    If a chaining function violates restrictions on access or resource consumption, an RPC chain exception is thrown according to the mechanism in Section 3.8.

    Another way to isolate chaining functions is to use a chaining proxy (Section 7.3).

    3.7 Debugging and profiling

    A very useful debugging tool for traditional applications is "printf", which allows an application to display messages on the console. We provide an analogous facility for RPC chain applications: a virtual console, where nodes in the chain can log debugging information. The contents of the virtual console are sent with the chain, and eventually reach the client, which can then dump the contents to a real console or file. The virtual console can also be used to gather profiling information for each step in the chain, which is then aggregated at the client.
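A virtual console of this kind might look like the sketch below. It assumes the console is simply an object that travels with the chain; the VirtualConsole class and the timed_step helper are hypothetical names used for illustration, and Python is used only for brevity.

```python
import time


class VirtualConsole:
    """Log that travels with the chain and is dumped at the client."""

    def __init__(self):
        self.entries = []   # (host, message) pairs in chain order

    def log(self, host, message):
        self.entries.append((host, message))

    def dump(self):
        return "\n".join(f"[{host}] {msg}" for host, msg in self.entries)


def timed_step(console, host, func, arg):
    """Run one step and record its duration as profiling information."""
    start = time.perf_counter()
    result = func(arg)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    console.log(host, f"{func.__name__} took {elapsed_ms:.3f} ms")
    return result


console = VirtualConsole()
console.log("B", "read 5 bytes")                      # debugging message at host B
data = timed_step(console, "C", str.upper, "hello")   # profiled step at host C
report = console.dump()                               # what the client would print
```

Because every entry is tagged with its host, the client can group or sum the timing lines per step once the chain returns.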

    Even with "printf", debugging RPC chains can be hard, because it involves distributed execution over multiple machines. We can reduce this problem to the simpler problem of debugging RPC-based code by running RPC chains in a special interactive mode. The key observation is that chaining functions are portable code that can be executed at any machine. In interactive mode, chaining functions always execute at the client instead of the servers. To accomplish this, after each service function returns, the RPCC library sends its result back to the client, which then applies the chaining function to continue the chain from there. A chain executed in interactive mode looks like a series of RPC calls. By running the client in an interactive debugger, the developer can control the execution of the chain and inspect the outputs of service and chaining functions at each step.

    3.8 Exceptions

    An RPC chain may encounter exceptional conditions while it is executing: (1) the next server in the chain can be down, (2) the chaining function repository can be down, or (3) the state passed to the chaining function can be missing vital information due to a bug. All of these will result in an exception, either at the RPCC library in cases (1) and (2), or at a chaining function in case (3). (Service functions do not throw exceptions; they simply return an error to the caller.)

    Who should handle such exceptions? One possibility is to handle them locally, by having the client send exception handling code as part of the chain. Doing this requires sending all the state that the handling code needs, which complicates the application design. Instead, we choose a less efficient but simpler alternative (since exceptions are the rare case). We simply propagate exceptions back to the client that started the chain. The client receives the exception name, its parameters, and the path of hosts that the chain has traversed thus far. (If the client crashes, the exception becomes moot and is ignored.)

    In the case of nested chains, the exception propagates first to the host that started the current sub-chain. If that host does not catch the exception, it continues propagating to the host that started the parent chain, until it gets to the client. For example, in Figure 4, right, if E throws an exception (say, because it could not contact F), the exception goes to B, the node that created the sub-chain. This is a natural choice because B understands the logic of the sub-chain that it created, and so it may know how to recover from the exception. If B does not catch the exception, it is propagated to A.

    3.9 Broken chains

    The crash of a host while it executes an RPC chain results in a broken chain. In this section, we describe the broken chain detection and recovery mechanisms.

    Detection. We detect a broken chain using a simple end-to-end timeout mechanism at the client called chain heartbeats: a chain periodically sends an alive message to the client that created it, say every 3 seconds, and the client uses a conservative timeout of 6 seconds. If there are sub-chains, only the top-level creator gets the heartbeats. Heartbeats carry a unique chain identifier, a pair consisting of the client name and a timestamp, so that the client knows to which chain each heartbeat refers.

    We achieve the periodic sending through a time-to-heartbeat timer, which is sent with the chain and decremented by each node according to its processing time, until it reaches 0, the time to send a heartbeat. Synchronized clocks are not needed to decrement the timer; we only need clocks that run at approximately the same speed as real time. Since we do not know link delays, we assume a conservative value of 200 ms and decrement the time-to-heartbeat timer by this amount for every network hop. This assumption may be violated when there is congestion and dropped packets, resulting in a premature timeout (false positive). However, the impact of false positives is small because of our recovery mechanism, explained next.
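The timer bookkeeping can be illustrated with a short sketch. The constants below are illustrative values, and the reset rule (adding one full period once the timer reaches 0) is an assumption of this sketch rather than something the text specifies; the advance function is a hypothetical name.

```python
HEARTBEAT_PERIOD_MS = 3000   # illustrative heartbeat period
LINK_DELAY_MS = 200          # illustrative conservative per-hop link delay


def advance(timer_ms, processing_ms, heartbeats):
    """Charge one network hop plus local processing time against the timer.

    Returns the timer value to forward with the chain. Whenever the timer
    reaches 0 a heartbeat is recorded and one period is added back
    (the reset rule is an assumption of this sketch).
    """
    timer_ms -= LINK_DELAY_MS + processing_ms
    while timer_ms <= 0:
        heartbeats.append("alive")
        timer_ms += HEARTBEAT_PERIOD_MS
    return timer_ms


timer = HEARTBEAT_PERIOD_MS
sent = []
for processing in [800, 1500, 400, 900]:   # per-node processing times in ms
    timer = advance(timer, processing, sent)
```

Only elapsed durations are subtracted, so each node needs a clock that ticks at roughly real-time speed but no agreement on the absolute time.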

    Recovery. To recover from a broken chain, the client simply retransmits the request. Like standard remote procedures, we make chains idempotent by including a chain-id with each chain, and briefly caching the results of service functions and chaining functions at each server. If a server sees the same chain-id, it uses the cached results for the service and chaining functions. The chain can continue in this fashion up to the host where the chain previously broke. At that host, if the "next" host is still down, an exception is thrown. Alternatively, a fail-over mechanism that calls a backup server can be implemented by using logical server names which are mapped to a backup when the primary fails. This is similar to the mechanisms used to fail over standard RPCs.

    Upon a second timeout, a client executes the RPC chain in interactive mode (as in Section 3.7), to determine exactly at which node the chain stopped, and returns an error to the application.
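The idempotency mechanism can be sketched with a per-server result cache keyed by chain-id. This is a simplified illustration: the ChainServer class is hypothetical, only the service function result is cached (the text also caches chaining function results), and cache expiry is omitted even though the text says results are cached only briefly.

```python
class ChainServer:
    def __init__(self):
        self._cache = {}      # chain_id -> cached service function result
        self.executions = 0

    def handle(self, chain_id, sf, parmlist):
        if chain_id in self._cache:
            # Retransmitted chain: replay the cached result, do not re-execute.
            return self._cache[chain_id]
        self.executions += 1
        result = sf(parmlist)
        self._cache[chain_id] = result
        return result


log = []


def append_record(parm):
    """A service function with a side effect, so re-execution would be visible."""
    log.append(parm)
    return len(log)


server = ChainServer()
cid = ("clientA", 1700000000)                  # (client name, timestamp) pair
r1 = server.handle(cid, append_record, "x")
r2 = server.handle(cid, append_record, "x")    # retransmission after a timeout
```

With the cache in place, a retransmitted chain passes through servers that already did their work without repeating side effects.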

    3.10 Splitting chains

    For performance reasons, it may be desirable to split a chain to allow parallel execution. The decision to split a chain should be made with consideration of the added complexity, as concurrent computations are always harder to understand, design, debug, and maintain compared to sequential computations. Although our applications do not use splitting chains, we now explain how such chains can be implemented.

    Split. We modify chaining functions so that they can return more than one nexthop parameter. The RPCC library calls each nexthop concurrently, resulting in several split-chains. Each chain has an id composed of the id of the parent plus a counter. For example, if there is a 3-way split of chain 74, the split-chains will have ids 74.1, 74.2, and 74.3. Each of these split-chains can in turn be split again, resulting in split-chains with increasingly long ids. For example, if split-chain 74.1 splits into two, the resultant split-chains will have ids 74.1.1 and 74.1.2. We note for future reference that each split-chain knows how many siblings it has (this information is passed on to the split-chains when the chain splits).
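The id scheme is simple enough to state as code. The sketch below assumes ids are dotted strings and that each split-chain is told the total number of split-chains created by the split; split_ids is a hypothetical helper name.

```python
def split_ids(parent_id, ways):
    """Return (child_id, split_count) for each split-chain of parent_id.

    split_count is the total number of split-chains created by this split,
    which is what lets each one know how many siblings to account for.
    """
    return [(f"{parent_id}.{i}", ways) for i in range(1, ways + 1)]


top = split_ids("74", 3)        # three-way split of chain 74
nested = split_ids("74.1", 2)   # one of the split-chains splits again
```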

    Broken split chains. Recall that we use an end-to-end mechanism to handle broken chains (Section 3.9) via a chain heartbeat. When a chain splits, we also split the heartbeats: each split-chain sends its own heartbeat (with the split-chain id) and the client will be content only if it periodically sees the heartbeat from all the split-chains. The heartbeat messages indicate the number of sibling split-chains, so that the client knows how many to expect. If a split-chain is missing, the client starts the chain again (even if other split-chains are still running, this does not cause a problem because of idempotency).

    Merge. To merge split-chains, a merge host collects the results of each split-chain and invokes a merge function to continue the chain. The merge host and function are chosen when the chain splits (they are returned by the chaining function causing the split). The merge host can be any host; a good choice is the next host in the chain. The merge host awaits outcomes from all split-chains before calling the merge function, which takes the vector of results and returns nexthop, specifying the next service function and chaining function to call.
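A merge host can be sketched as a small collector that waits for every split-chain before calling the merge function. The MergeHost class is a hypothetical name, the merge function here returns a plain tuple as a stand-in for nexthop, and results are ordered by sorting the id strings, which is adequate only for this small example.

```python
class MergeHost:
    def __init__(self, expected, merge_fn):
        self.expected = expected    # number of split-chains to wait for
        self.merge_fn = merge_fn    # takes the list of results, returns the next hop
        self.results = {}           # split-chain id -> result

    def deliver(self, split_id, result):
        """Record one outcome; return merge_fn's value once all have arrived, else None."""
        # A duplicate delivery for the same id overwrites the earlier entry.
        self.results[split_id] = result
        if len(self.results) < self.expected:
            return None
        ordered = [self.results[key] for key in sorted(self.results)]
        return self.merge_fn(ordered)


host = MergeHost(3, lambda results: ("next_sf", sum(results)))
a = host.deliver("74.2", 20)   # split-chains may finish in any order
b = host.deliver("74.1", 10)
c = host.deliver("74.3", 12)   # last arrival triggers the merge function
```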

    After split-chains complete (i.e., reach the merge host), the parent chain will continue and resume its heartbeats. However, split-chains do not necessarily complete at the same time, so there may be a period from when the first split-chain completes until the parent chain resumes. During this period the merge host sends heartbeats on behalf of the completed split-chains, so that the client does not time out.

    Crash garbage. When there are crashes in the system, the merge host may end up with the outcome of stale split-chains. This garbage can be discarded after a timeout: as we mentioned, RPC chains are intended for short-lived computations, so we propose a timeout of a minute. Note that if a slow system causes a running chain to be garbage collected, the client will recover after it times out.

    4 Applications

    To demonstrate RPC chains, we apply and evaluate them in two important enterprise applications: a storage application (Section 4.1) and a web application (Section 4.2).

    4.1 Storage applications

    Storage services generally provide two basic functions, read and write, based on keys, file names, ob


    One solution is to modify the storage service on a case-by-case basis for different operations and different settings. For example, the Amazon S3 storage service recently added a new copy operation to its interface [#], so that an end user can efficiently copy her data between data centers in the US and Europe, without having to transfer data through her machine. Although such application-specific interfaces can be beneficial, they are specific to particular operations and do not mitigate adverse communication patterns in other settings.

    RPC chains provide a more general solution: they not only enable the direct copying of data from one server to another (through a simple chain that reads and then writes), but also enable broader uses. To demonstrate this idea, we layered RPC chains over a legacy NFS v3 storage server, as explained in Section 3.5. (We could have used other types of storage, such as an ob


    Figure 6: A simplified web mail server that uses RPC chains. The solid line shows the login sequence followed by retrieval of email and ads. The dashed line shows how a system based on standard RPCs would differ. The chain is not used for the web client, since it is outside the system. It is used in the communication between mail, storage, and ad servers.

    We consider a typical web mail application. There are web servers that handle HTTP requests, authentication servers and address-book servers that are shared with other applications, email storage servers that store the users' mail, and ad servers that are responsible for displaying relevant ads. These services can be located in multiple data centers, for several reasons: (1) no single data center can host them all; (2) a service may have been developed in a particular location and so it is hosted close by; (3) for performance reasons, it may be desirable for some services to be located close to their users (e.g., users created in Asia may have their mailbox stored in Asia), though this is not always achievable (e.g., an Asian user travels to the U.S. and his mailbox is still in Asia); and (4) a service may need high availability or the ability to withstand disasters.

    We implemented a simple web mail service as shown in Figure 6, to study the benefits of RPC chains in such a setting. Our web mail system consists of a front-end server that authenticates users by verifying their logins and passwords. Upon successful authentication, the front-end server returns a cookie to the client along with the name of an email server. The client then uses the cookie to communicate with the email server to send and receive email messages. Upon receiving a client request, the email server first verifies the cookie, then calls the back-end storage server to fetch the appropriate emails for the user. Finally, the mail server sends the messages to an ad server so that relevant ads can be added to them before they are returned to the client.

    Note that the adding of ads to emails imposes a significant overhead on performance. This is of particular concern because one of the primary performance goals of a web mail service is to minimize the response time observed by clients. In addition, emails and ads cannot be fetched in parallel, since relevant ads cannot be selected without knowing the contents of the emails. It is also difficult to pre-compute the relevant ads because the relevance of ads may change over time.

    Using RPC chains, we can mitigate some of the ad-related overheads. Even though we can only fetch ads after fetching the emails, we can eliminate one latency hop from the communication path of the web mail application, by creating a chain that causes emails to be sent directly from the storage server to the ad server, without having to go through the email server (as shown in step 7 of Figure 6). Once the ad server has appended the appropriate ads to the emails, the emails can be sent to the email server, which then returns them to the client. In Section 5.4, we evaluate the benefit of using RPC chains to improve the communication pattern in this fashion.

    5 Evaluation

    We now evaluate RPC chains. We start with some microbenchmarks, in which we measure the overhead of chaining functions and we compare RPC chains versus standard RPCs. We then evaluate the storage and web applications to demonstrate the performance improvements provided by RPC chains. The general question we address is when RPC chains are advantageous and what the exact benefits are.


    5.1 Setup

    In this section, we present the evaluation of our storage and multi-tier web application. Our experimental setup consists of ten machines in four geodistributed sites in a corporate network that spans the globe. We had machines in 4 sites: (1) Mountain View, California, USA, (2) Redmond, Washington, USA, (3) Cambridge, United Kingdom, and (4) Beijing, China. The measured latency and throughput of the links between these sites are shown in Figure 7.

    (a)
                 Redmond    Beijing    Cambridge
    Mt. View     32 ms      180 ms     240 ms
    Redmond                 146 ms     210 ms
    Beijing                            354 ms

    (b)
                 Redmond    Beijing    Cambridge
    Mt. View     6.3 MB/s   2.1 MB/s   1.4 MB/s
    Redmond                 8.5 MB/s   8.6 MB/s
    Beijing                            2.4 MB/s

    Figure 7: (a) Ping round-trip times and (b) bandwidth of TCP connections between pairs of sites.

    5.2 Microbenchmarks

    5.2.1 Overhead of chaining functions

    In our first experiment, we evaluate the overhead imposed by chaining functions (pieces of client code) at servers. We considered chaining functions of three sizes, 612 bytes, 5 KB, and 50 KB, corresponding to small, medium, and large functions.

    We first measured the time it takes to compile a function at run-time. The results are shown in the first two columns of Figure 8, averaged over multiple runs (± refers to standard error). We used a 3 GHz Intel Core 2 Duo processor running Windows Vista Enterprise SP1. The functions were written in C# and compiled using Microsoft Visual Studio 2008.


    [Figure 8: table of source size (KB), compile time (ms), and compiled size (KB) for the three chaining functions]

    250 KB, and 500 KB) between two servers, and we measure the time it takes. We vary the location of the client (Mt. View, Redmond, Bei


    Figure 12: Throughput-latency of RPC copy and Chain copy. Latency is the time to copy a 128 KB file, and throughput is the rate at which files are copied.

    In our next experiment, we vary the number of clients simultaneously copying files from one server to another, and measure the resultant throughput and latency of the system. This allows us to observe the behavior of the system under varying load as well as measure the peak throughput of the system. As before, the client machine was located in Redmond and the servers were located in Mt. View. We ran multiple client instances in parallel on the client machine, each client copying 1000 files in succession, each file measuring 256 KB in size. We measure the time that each client takes to complete copying 1000 files, and compute conservative throughput and latency numbers based on the slowest client.

    Figure 12 shows the results of the experiment. For both RPC copy and Chain copy, the average latency increases as the amount of workload placed on the system increases. Initially, the increase in workload also results in an increase in the aggregate throughput of the system, but once the system becomes saturated, any increase in workload only increases latency without any gain in throughput. Our results show that RPC copy is able to sustain a peak throughput of 4.5 MB/s. This peak throughput occurs when the network link between the client and the servers, which had a bandwidth of 6.3 MB/s, becomes saturated. Since Chain copy does not require that the data blocks of the files being copied actually flow through the client, it was not sub


    5.3.2 Benefit of chain composition

    In this experiment, we measure the benefit of composing RPC chains. We use two chains: one for copying from one server to another (as above) and the other for primary-backup replication of the second server (as in Figure 5). We compare two systems that use RPC chains; one system uses chain composition to combine the two chains, while the other has composition disabled. In the experiment, one client copies one file of variable size from the non-replicated server to the replicated server. The client is in Cambridge, the source server is in Mt. View, the primary of the destination server is in Mt. View, and the backup of the destination server is in Redmond.

    Figure 13: Benefit of chain composition

    Figure 13 shows the result. As we can see, composing the chain reduces the duration of the copy by up to 20%, with larger files having a greater reduction. Without composition, the destination server has to handle both the requests from the source server and the replies from the backup server. Composition reduces the load on the destination server by allowing the backup server to send replies directly to the client. In addition, composition eliminates the unnecessary messages from the backup server to the destination server, reducing bandwidth consumption. The combination of these factors allows composition to improve the overall performance of the system. As file size increases, the setup cost becomes relatively small compared to the actual cost of executing the chains. This makes the impact of the more efficient chain that resulted from composition more apparent.

    5.4 Web mail application

    Figure 14: RPC chain in web mail application

    We now describe the evaluation of the web mail application presented in Section 4.2. In our experimental setup, we placed the client in Mountain View, the mail server and the authentication server in Redmond, and all other servers in Bei

    server is located close to the client but far away from the storage server and ad server,

    traversing the long links between Redmond and Bei

    chain A→B→(C or D), because the choice of going to C versus D must be made at A, where

    the sensor is.

    Programming with continuations. To use RPC chains, developers need to make use of

    continuation-style programming. This can be much harder than programming using

    sequential code, because continuations must explicitly keep track of all their state.

    Continuations are notoriously hard to debug, because there is no simple way to track the

    execution that led to a given state.

    We note, however, that programming with continuations is already tolerated in code that uses

    asynchronous RPCs and callbacks. Moreover, one could perhaps write a tool that

    automatically produces continuations from sequential code, using techniques from the compiler literature (see, e.g., J3K).

    Terminating chains. When an application terminates, it is usually desirable to release its

    resources and halt all its activities. However, if the application has outstanding RPC chains, it

    is not easy to terminate them. This problem exists with traditional RPCs as well (there is no

    easy way to terminate a remote procedure), but it is worse with RPC chains because the

    remote servers involved may not be known.

    RPC chains are designed for relatively short-lived executions, and for these uses, this

    problem is less of a concern, because a chain soon terminates anyway. The only exception is

    a buggy chain that runs forever. For such chains, the RPCC library can impose a maximum

    chain length, say 8999 hops, and throw an exception after that.
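
    A maximum chain length can be sketched as a hop counter that the library checks before
    each hop. The code below is only an illustration, not the RPCC library's implementation:
    run_chain, ChainTooLongError, and the hop functions are hypothetical names, the chain runs
    in a single process instead of across servers, and the limit used in the example is arbitrary.

```python
class ChainTooLongError(Exception):
    """Raised when a chain exceeds the configured hop limit."""


def run_chain(first_hop, state, max_hops):
    """Follow a chain of hops until one returns None or the limit is hit.

    Each hop is a callable that takes the state and returns
    (next_hop_or_None, new_state). This stands in for a library that
    forwards the chain from server to server (an assumption of this sketch).
    """
    hop, hops_taken = first_hop, 0
    while hop is not None:
        if hops_taken >= max_hops:
            raise ChainTooLongError(f"chain exceeded {max_hops} hops")
        hop, state = hop(state)
        hops_taken += 1
    return state, hops_taken


def buggy_hop(state):
    # A faulty chaining function that always names itself as the next hop.
    return buggy_hop, state + 1


def last_hop(state):
    return None, state + 1


def first_hop(state):
    return last_hop, state + 1


# A well-behaved two-hop chain finishes normally.
ok_result = run_chain(first_hop, 0, max_hops=10)

# The buggy chain is cut off once the limit is reached.
try:
    run_chain(buggy_hop, 0, max_hops=5)
    buggy_stopped = False
except ChainTooLongError:
    buggy_stopped = True
```

    In a distributed setting the counter would have to travel with the chain, for example as
    part of the chaining state, so that every server can enforce the same limit.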

    7 Extensions

    We now discuss some extensions of RPC chains.

    7.1 Intermediate chain results

    If a client wants to receive some results from intermediate servers of the chain, these results

    need to be relayed through the chain. If the amount of data is large, it can impose a significant

    overhead. We can extend RPC chains to address this issue, by allowing each server in the

    chain to directly return some data to the client. This data is application-specific and is

    returned by the chaining function. Thus, we add a new field, client-response, to the

    nexthop result of a chaining function. The RPCC library sends client-response to the

    client concurrently with continuing the chain.
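
    The extra field can be pictured as follows. This is a minimal single-process sketch under
    stated assumptions: NextHopResult, run_chain, and deliver_to_client are hypothetical names,
    the field is spelled client_response to be a valid Python identifier, and the response is
    delivered in sequence here, whereas the text describes sending it concurrently with
    continuing the chain.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class NextHopResult:
    """Hypothetical shape of what a chaining function returns."""
    next_service: Optional[Callable[[Any], Any]]  # None ends the chain
    next_params: Any = None
    client_response: Any = None  # optional data sent straight to the client


def run_chain(service, params, chaining_fn, deliver_to_client):
    """Run services in sequence and hand any client_response to the client."""
    result = None
    while service is not None:
        result = service(params)
        hop = chaining_fn(service, result)
        if hop.client_response is not None:
            # A real library would send this over the network without
            # delaying the next hop; here it is a plain function call.
            deliver_to_client(hop.client_response)
        service, params = hop.next_service, hop.next_params
    return result


def read(params):
    return b"x" * params  # pretend to read `params` bytes


def store(params):
    return len(params)  # pretend to store the bytes, return the count


def chaining_fn(service, result):
    if service is read:
        # Report progress to the client, then continue to the store service.
        return NextHopResult(store, result,
                             client_response=f"read {len(result)} bytes")
    return NextHopResult(None)


inbox = []
final = run_chain(read, 4, chaining_fn, inbox.append)
```

    Here the client receives only the short progress message from the intermediate hop, while
    the bulk data moves from one service to the next without passing through the client.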

    What happens under chain composition? In this case, the "client" that gets client-response

    is the server that created the sub-chain. The names of these creators, at each level of composed

    chain, are kept in the chain stack (the chain stack is explained in Section :.E).

    7.2 Dealing with large chaining states

    The chaining state is the state that the client sends along the chain to execute the chaining

    functions. If this state is large, sending it can incur a significant overhead. Two optimizations

    are possible to mitigate this cost.

    Fall-back to standard RPC. As explained in Section :.N, we can execute a chain in

    interactive mode, which causes the chain to go back to the client at every step. This is

    effectively a fall-back to standard RPC, causing all chaining functions to execute at the client,

    which eliminates the overhead of sending the chaining state, at the cost of extra network

    delays. We explored this trade-off in Section F.8.8. It is possible to have the RPCC library

    gauge the size of the chaining state before starting the chain, and if the state is larger than

    some threshold, execute the chain in interactive mode. The threshold can be chosen

    dynamically based on previous executions of the same chain, in an adaptive manner. By

    doing so, an RPC chain will always perform at least as well as standard RPCs, modulo the

    small computational overhead of executing chaining functions and the time it takes to adapt.

    However, in the applications we examined in this paper, we did not need this technique

    because the chaining state was always small.
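
    One way to picture the size check and the adaptive threshold is sketched below. The text
    does not specify an adaptation rule, so the rule here is an assumption made for
    illustration: ModeChooser and its methods are hypothetical names, and the sketch presumes
    that timings for both modes at a given state size are available from earlier executions.

```python
class ModeChooser:
    """Pick chained or interactive execution from the chaining-state size."""

    def __init__(self, threshold_bytes):
        self.threshold_bytes = threshold_bytes

    def choose(self, state_size):
        """Return the mode to use for a chain whose state has this size."""
        if state_size > self.threshold_bytes:
            return "interactive"  # fall back to standard RPC
        return "chained"

    def record(self, state_size, chained_secs, interactive_secs):
        """Adapt the threshold from one measured comparison (assumed rule)."""
        if interactive_secs < chained_secs:
            # Interactive mode won at this size: stop chaining states this large.
            self.threshold_bytes = min(self.threshold_bytes, state_size - 1)
        else:
            # Chaining won at this size: allow chaining at least up to it.
            self.threshold_bytes = max(self.threshold_bytes, state_size)


chooser = ModeChooser(threshold_bytes=1000)
before = (chooser.choose(500), chooser.choose(2000))

# An earlier run showed interactive mode was faster for an 800-byte state.
chooser.record(800, chained_secs=0.5, interactive_secs=0.3)
after_lowering = chooser.choose(800)

# A later run showed chaining was faster even for a 5000-byte state.
chooser.record(5000, chained_secs=0.2, interactive_secs=0.4)
after_raising = chooser.choose(2000)
```

    A production version would likely smooth over several measurements instead of reacting to
    a single run, since network delays vary from one execution to the next.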

    Hiding latency. In our implementation, servers wait to receive the chaining state before

    executing the next service function in the chain. This waiting is not necessary, because the

    service function depends only on its parameters, not on the chaining state (the chaining state

    is only needed for the chaining function, which executes later). Therefore, a natural

    optimization is to start the service function even as the chaining state is being received. If the

    service function takes significant time to complete (e.g., it involves disk I/O or some lengthy

    computation), this will mask part or all of the latency of transmitting the chaining state.
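
    The overlap described above can be sketched with two concurrent tasks, one running the
    service function and one receiving the chaining state. This is an illustration only:
    handle_hop and the helper functions are hypothetical, threads stand in for whatever
    asynchronous I/O a real server would use, and the sleeps simulate disk I/O and a slow
    state transfer.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_hop(service_fn, params, receive_state, chaining_fn):
    """Run the service function while the chaining state is still arriving."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        result_future = pool.submit(service_fn, params)  # needs only params
        state_future = pool.submit(receive_state)        # may be slow
        result = result_future.result()
        state = state_future.result()
    # The chaining function is the first step that needs both values.
    return chaining_fn(state, result)


def slow_service(params):
    time.sleep(0.3)  # stands in for disk I/O or a lengthy computation
    return params * 2


def slow_receive():
    time.sleep(0.3)  # stands in for transmitting a large chaining state
    return {"step": 1}


def chaining_fn(state, result):
    return {"step": state["step"] + 1, "last_result": result}


start = time.monotonic()
outcome = handle_hop(slow_service, 21, slow_receive, chaining_fn)
elapsed = time.monotonic() - start
```

    Because the two waits overlap, the hop takes roughly as long as the slower of the two
    tasks instead of their sum.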

    7.3 Chaining proxy

    As we said, chaining functions are portable code that does not have to execute at the server.

    They can execute at a designated chaining proxy machine, to avoid any overhead at the

    server. Doing so incurs extra communication, but if the chaining proxy is geographically

    close to the server, this cost is small relative to that of a wide-area hop. To choose the

    chaining proxy, we can use a simple mapping from servers to nearby proxies configured by

    an administrator.
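
    Such a mapping could be as small as a lookup table. The sketch below is hypothetical: the
    host names are placeholders, and falling back to the server itself when no proxy is
    configured is an assumption of this example, not something the text specifies.

```python
# Administrator-configured table from server to nearby chaining proxy.
# All host names are made-up placeholders.
PROXY_FOR_SERVER = {
    "storage.example": "proxy-a.example",
    "mail.example": "proxy-b.example",
}


def chaining_host(server, proxy_map=PROXY_FOR_SERVER):
    """Return the host that should run the chaining function for `server`.

    Assumption: with no proxy configured, the chaining function runs at
    the server itself.
    """
    return proxy_map.get(server, server)


mapped = chaining_host("storage.example")
unmapped = chaining_host("auth.example")
```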

    8 Related work

    RPC chains utilize two well-understood ideas in the context of remote execution: function

    shipping and continuations.

    Function shipping is the general technique of sending computation to the data rather than

    bringing the data to the computation. It is used in some systems where the cost of moving

    data is large compared to the cost of moving computation. For example, Diamond J1'K is a

    storage architecture in which applications download searchlet code to disk to perform

    efficient filtering of large data sets locally, thereby improving efficiency. RPC chains use

    function shipping to send chaining logic.

    A continuation J1(K refers to the shifting of program control and transfer of current state from

    one part of a program to another. Extending this to distributed continuations is a natural step,

    allowing a continuation to shift program control from one processor to another. Several works

    in the parallel programming community give high-level programming continuation constructs

    and specify their behavior formally, e.g., J1#) 11K. Distributed continuations were exploited to

    enhance the functionality of web servers and overcome the stateless nature of HTTP

    interaction. By comparison, the RPC chain is a generic mechanism that is independent of the

    service provided by servers. RPC chains support complex chaining structures, and can be

    used with a diverse set of servers.

    The above-mentioned ideas for code mobility, and others, are leveraged in a variety of

    high-level programming paradigms for distributed execution. Distributed workflows, e.g., J*)##K,

    can use distributed continuations to distribute a workflow description in a decentralized

    fashion. MapReduce J+K and Dryad J#3K are programming models for data-parallel

    process's current state to the new host and resuming execution. The motivations for mobile

    agents include (a) bringing processes closer to the resources they need in a given stage of the

    computation, and (b) allowing clients to disconnect from the network while an agent executes

    on their behalf. An RPC chain can be considered as a mobile agent whose purpose is to

    execute a series of RPC calls. However, mobile agents are much more general and ambitious

    than RPC chains (which possibly contributed to their eventual demise): they have social

    abilities, being able to ad

    Finally, SOAP J#1K is a protocol that supports RPCs using XML over HTTP. It has the notion

    of intermediaries that can process a SOAP message (RPC) before it reaches the final

    recipient. However, there is no client logic that routes and transforms messages, and the notion

    of a pre-specified distinguished final recipient is inherent to SOAP. Typical uses for

    intermediary nodes include blocking messages (firewall), buffering and batching of

    messages, tracing, and encrypting/decrypting messages as they pass through an untrusted

    domain.

    9 Conclusion

    We proposed the RPC chain, a simple but powerful primitive that combines multiple RPC

    invocations into a chain, in order to optimize the communication pattern of applications that

    use many composite services, possibly developed independently of each other. With RPC

    chains, clients can save network hops, resulting in considerably smaller end-to-end latencies in

    a geodistributed setting. Clients can also save bandwidth because they are not forced to

    receive data they do not need. We demonstrated the use of RPC chains for a storage and a

    web application, and we think RPC chains could have many more applications beyond those.
