Scalable, Robust Wide-area Control Architecture for Integrated Communications. Helen J. Wang. Qualifying Examination, March 8, 2000.
1
Scalable, Robust Wide-area Control Architecture for Integrated
Communications
Helen J. Wang
Qualifying Examination
March 8, 2000
2
Motivation
Lack support for:
• Integrated use of heterogeneous devices (old & new)
• Rapid arbitrary communication service customization
[Figure labels: PSTN, Cellular, Pager, Internet]
3
Limitations of Existing Systems
• Telecommunications network:
– engineered with one app and device in mind
• Existing Internet Telephony systems:
– ease of service creation, but limited
– scalability, availability and fault tolerance not fully addressed
4
How good is a communication system? (Dissertation Goals)
• Functionality: communication services it can support, and the ease of creating them
• Viability: scalability, robustness
• Focus on the control aspect:
– control architecture = system components + signaling protocol (session setup, tear-down, and control)
5
Problem Statement
• Given heterogeneity, how to design a scalable, robust wide-area control architecture that supports easy creation of a wide range of communication services? And how should these services be created?
6
Outline
• Related Work and Research Contribution
• Control Architecture
• Signaling Protocol
• Service Creation Model
• Summary, Methodology, Research Agenda
7
Related Work
8
Overview of Research Contributions
• A scalable control architecture
• A robust signaling protocol
• A user-level, easy service creation model
• Publications:
– “A Signaling System Using Light Weight Sessions,” accepted to Infocom 2000.
– Helen J. Wang, et al., “ICEBERG, An Internet-Core Network Architecture for Integrated Communications,” accepted to IEEE Personal Communications, April 2000.
9
Outline
• Related Work and Research Contribution
• Control Architecture
• Signaling Protocol
• Service Creation Model
• Summary, Methodology, Research Agenda
10
Control Architecture: Goals
• Any-to-any communication
– inter-working, composition of data transformation
• Personal mobility
– unique ID, name mapping
• Personalized communication services
– preference storage and management
• Enable user-activity driven services
– activity tracking
11
Control Architecture: Components and Their Operations
[Figure: a call between Alice@domain1 and Bob@domain2. Each side has an iPOP with a Call Agent, PR, PAC, and NMS, plus an IAP. Labeled steps: Alice dialed 333-2222, Bob picks up, and APCs set up the data path.]
12
Leverage Cluster Computing Platforms
• iPOP must be scalable and robust: leverage cluster computing platforms such as Ninja, AS1
• Our requirements:
– highly available service invocation: Ninja Base
– fault tolerant service session: AS1
• session state maintained on client (IAP)
• iPOP on Ninja Base augmented with client heartbeat support from AS1
13
Control Architecture: Facts
[Figure: two iPOPs, each with a Call Agent, PR, and PAC, and an IAP on the access net; links are labeled local-area communication and wide-area communication.]
• One Call Agent per caller per device
• One type of IAP per access network
14
Outline
• Related Work and Research Contribution
• Control Architecture
• Signaling Protocol
• Service Creation Model
• Summary, Research Methodology, Agenda
15
Signaling Protocol
• Basic call service: building blocks for supplementary services
– Conventional: two party, homogeneous devices
– ICEBERG communication model:
  • multi-device communication
  • invitation-based participation
  • large number of dynamic small-group communications
  • richer primitives: add/remove an endpoint during a session
  • conference call and service handoff are first-class services; services that require endpoint changes are trivial to implement
16
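Challenges in Signaling: Problems with SIP
[Figure: Alice, Bob, Carol, and Dale are served by call agents CA1 to CA4 and exchange Invite messages ("Invite (also Bob)", "Invite (also Alice)", "Invite Alice", "Invite Bob"). The membership lists shown at the call agents differ: two list all four parties, one lists Alice, Bob, Dale, and one lists Alice, Bob, Carol. A fifth call agent, CA5, is also shown.]
• no consideration of session dynamics: membership, component failure
• bridged conference: centralized component to maintain states -- single point of failure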
17
Problems with H.323
• Centralized approach for conferencing
• Limited fault tolerance measure:
– process-pair style
– cannot capture new state during fault recovery
• Complex
18
Lessons Learned
• Correctness and robustness:
– need to maintain up-to-date membership and session state (call parties, device status, data path info) in the face of transient component failures, network partitions, and any exceptional conditions
– distributed approach rather than centralized
19
Our Approach
• Maintain membership and session state as soft state in a distributed fashion.
– Soft state: expires unless refreshed; protocol action upon new state or timeout; error recovery is the same as normal operation
• Question: call setup latency requirement? bandwidth scalability problems?
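The soft-state rule on this slide (state expires unless refreshed, and the protocol acts on new state or on timeout) can be sketched as a small table. This is an illustrative sketch only, not the ICEBERG implementation: the class name `SoftStateTable`, its methods, and the injectable `clock` parameter are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, Hashable, List, Tuple


class SoftStateTable:
    """Hypothetical soft-state store: entries vanish unless refreshed in time."""

    def __init__(self, lifetime: float,
                 clock: Callable[[], float] = time.monotonic):
        self.lifetime = lifetime          # seconds an entry survives without refresh
        self.clock = clock                # injectable so tests can control time
        self._entries: Dict[Hashable, Tuple[object, float]] = {}

    def refresh(self, key: Hashable, value: object) -> bool:
        """Install or refresh an entry. Returns True when the state is new or
        changed, which is the point where a protocol action would be taken."""
        old = self._entries.get(key)
        self._entries[key] = (value, self.clock() + self.lifetime)
        return old is None or old[0] != value

    def expire(self) -> List[Hashable]:
        """Drop entries whose deadline has passed and return their keys.
        Recovery needs no special path: a later refresh simply reinstalls them."""
        now = self.clock()
        dead = [k for k, (_, deadline) in self._entries.items() if deadline <= now]
        for k in dead:
            del self._entries[k]
        return dead

    def get(self, key: Hashable):
        entry = self._entries.get(key)
        return None if entry is None else entry[0]
```

A receiver would call `refresh` for every periodic announcement it hears and `expire` on a timer; a lost announcement costs at most one lifetime before the entry is dropped or repaired.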
20
Signaling Protocol: Session Membership
• Session membership
– membership: CAs
– IP multicast’s group service is overkill for small group communication
  • per group state in routers, IP addr scarcity, deployment issues: access control, accountability
– Solution: run an application-level group membership protocol among participating IAPs
21
Signaling Protocol: Capture the Complete Session State
[Figure: a communication session spanning three iPOPs. Each iPOP hosts a Call Agent holding session state and is paired with an IAP; the diagram also labels APCs, Listen and Announce flows, and HB and iPOP HB heartbeats.]
22
Signaling Protocol: Fault Tolerance
[Figure: first animation step on the three-iPOP session diagram, with labels Call Agent, Session state, Comm Session, APC, IAP, Listen, Announce, HB, and iPOP HB.]
23
Signaling Protocol: Fault Tolerance
[Figure: second animation step on the same diagram; one Call Agent and its session state appear in a different position.]
24
Signaling Protocol: Fault Tolerance
[Figure: final animation step; the diagram shows two Call Agents, two copies of session state, and two each of the Listen, Announce, HB, and iPOP HB labels across the three iPOPs and IAPs.]
25
Invitation Protocol
• Invite a Call Agent to participate in a session
• Also a soft state protocol for robustness:
– IAP maintains the call state machine, sends stateful, keep-alive heartbeat to the iPOP
– Call Agents advance call state machines on IAPs through periodic install-state messages until receiving a new heartbeat with the new state
– Soft state inter-iPOP communication
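The install-state loop described above can be sketched as follows: the Call Agent keeps resending the target state every period and stops only when a heartbeat from the IAP carries that state. This is a toy model under stated assumptions; the names `IAP`, `advance_state`, and the per-period `losses` list are hypothetical, and message loss is scripted rather than random.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class IAP:
    """Hypothetical access point holding the call state machine's current state."""
    state: str = "idle"

    def on_install_state(self, new_state: str) -> None:
        self.state = new_state

    def heartbeat(self) -> str:
        # The keep-alive heartbeat is stateful: it carries the current call state.
        return self.state


def advance_state(iap: IAP, target: str,
                  losses: List[Tuple[bool, bool]],
                  max_periods: int = 10) -> Optional[int]:
    """Resend install-state once per period until a heartbeat confirms `target`.

    losses[i] = (install_lost, heartbeat_lost) for period i + 1; periods beyond
    the list are loss-free. Returns the confirming period, or None on give-up.
    """
    for period in range(1, max_periods + 1):
        install_lost, heartbeat_lost = (
            losses[period - 1] if period - 1 < len(losses) else (False, False)
        )
        if not install_lost:
            iap.on_install_state(target)
        if not heartbeat_lost and iap.heartbeat() == target:
            return period
    return None
```

For example, with `losses = [(True, False), (False, True)]` the first install-state is lost, the second is delivered but its confirming heartbeat is lost, and the loop finishes in the third period without any dedicated error-recovery path.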
26
Bandwidth Scalability
• Soft state period selection: call setup latency, fault recovery time vs. bandwidth overhead
– An optimization problem: minimize bandwidth overhead, subject to the following constraints:
  • expected call setup latency (1.5 seconds)
  • standard deviation (0.5 second)
  • fault recovery time (1 and 4 seconds for local and wide area)
– parameters: 2% wide-area loss rate, 0.2% local-area loss rate, 2 ms local-area propagation delay, 100 ms wide-area delay
– local: 1 sec, 800 bps; wide: 3 sec, 233 bps; for a 64 kbps data stream, local-area control traffic is about 1%
27
Processing Scalability
• Compare our single cluster system against a class 4 switch which is a local (end) office: 250 calls/second
• Our current prototype yields 10 calls/second on one PC because of an inefficient RMI implementation (tens of ms); 25+ PCs would match a class 4 switch
28
Outline
• Related Work and Research Contribution
• Control Architecture
• Signaling Protocol
• Service Creation Model
• Research Agenda
29
Service Creation Model
• Focus: control, redirection services
• Goal: end users can easily customize the control services in any arbitrary way
• Issues:
– service creation/customization
– service invocation
– service portability
– system support
30
Intelligent Network
• Separate service logic from basic call processing
[Figure labels: Switch, Trigger, Service Logic]
• Service portability: standardizing the basic call state machine proved too strict a standard and failed
• Limitation: no user-level customization
31
Proposed Approach
• Call processing implementation independent customization: use high-level events, e.g., call request received, callee device busy, callee device not answering
• Service creation: condition-action pairs
– Condition: conjunction of high-level events, conditions of interest to the user, and boolean expressions
– Action: composition of system primitives
• Hypothesis: condition-action pairs are sufficient
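One way to picture a condition-action pair is as a rule whose condition is a conjunction of observed events plus a boolean test, and whose action is a list of primitive names. The sketch below is an assumption-laden illustration: `Rule`, `fire`, and every event and primitive name in the usage text are hypothetical, since the slides do not define the actual event set or system primitives.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Set


@dataclass
class Rule:
    """Hypothetical condition-action pair."""
    events: FrozenSet[str]                          # all must have occurred (conjunction)
    predicate: Callable[[Dict[str, object]], bool]  # boolean expression over user context
    actions: List[str]                              # sequence of system primitive names


def fire(rules: List[Rule], seen_events: Set[str],
         context: Dict[str, object]) -> List[str]:
    """Return the actions of the first rule whose condition holds, else []."""
    for rule in rules:
        if rule.events <= seen_events and rule.predicate(context):
            return rule.actions
    return []
```

A rule such as `Rule(frozenset({"callee_busy", "caller_hangup"}), lambda ctx: True, ["register_with_callee_pac"])` fires only after both events have been seen. Because the rule refers to high-level events rather than call-state-machine transitions, it stays independent of how call processing is implemented.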
32
Proposed Approach: Service Invocation & Portability
[Figure labels: Preference Registry (holding Condition/Action pairs), Call Agent, PAC, event, check, update, Activity]
• Service Portability: standardize the events and system primitives, which is much easier than standardizing the call state machine
33
An Example: Completion of Calls to Busy Subscriber
Condition → Action
callee busy && caller hang up → register with callee PAC
callee PAC reject → exit
callee PAC notify → invite caller; invite callee
caller busy → wait 5 minutes; re-register with the callee PAC
hangup time > 1 hour → de-register with callee PAC; exit
34
An Example, Cont.
• System support issues:
– extended Call Agent life time
– queue management on the PAC
– track event sequence: stack of timed events, stack depth depending on user preferences
35
How good is a communication system?
• Functionality: services
– component identification
– powerful signaling protocol primitives
– easy, user-centric service creation model
• Viability: scalability, robustness
– first application of soft state to signaling protocol; bandwidth overhead not an issue; can fulfill latency requirements
– processing scalability, local area robustness by leveraging cluster computing platforms
36
Outline
• Related Work and Research Contribution
• Control Architecture
• Signaling Protocol
• Service Platform
• Methodology and Research Agenda
37
Methodology: 1st Iteration (Completed)
[Figure: cycle of Design, Prototype, Evaluation, Analysis, annotated with the items below]
• Control architecture
• Signaling protocol
– session maintenance protocol
• Control architecture
• Session maintenance protocol
• Measured the current prototype
• Simple soft state period analysis
38
Methodology: 2nd Iteration Overview
[Figure: cycle of Design, Prototype, Evaluation, Analysis, annotated with the items below]
• Service creation model
– Possibly revise the design of the control architecture and the signaling protocol
• Completed work:
– invitation protocol
– membership protocol
• Wide-area testbed
• Group membership protocol
• Invitation protocol
• Service creation model
• Evaluation: scalability, robustness, service creation, hard/soft state comparison
• Analysis: group membership protocol, service creation
39
Research Agenda
• Phase 1: complete and fine-tune service creation model design (1 month)
– define events and system primitives
– preference conflict resolution
– identify service creation interaction with the control architecture and signaling
Planned paper submission on service creation model design to SmartNet 3/31
40
Research Agenda
• Phase 2: 2nd iteration Prototyping (3 - 6 months)
– invitation protocol, membership protocol
– employ Ninja vSpace
– release ICEBERG to Ericsson, TU Berlin, NTT and construct a wide-area test-bed
– service creation model
Planned paper submission to ICNP (May) or INFOCOM (July) on protocols and analysis
41
Research Agenda, Cont.
• Phase 3: Evaluation (6 months)
– processing scalability: measure call processing time, # of simultaneous sessions, compare against class 4 switch
– bandwidth scalability: group membership protocol analysis; dynamic soft state period selection
– robustness: emulate failure conditions (losses, long delays, component failures), run system over time
– hard/soft state comparison: bandwidth usage, latency, fault recovery time
42
Research Agenda, Cont.
– Service creation evaluation:
  • comparable functionality: implement representative IN services such as “call completion upon busy”
  • new services such as policy-based call waiting
  • system extensibility: # of lines of code and amount of time to develop new primitives for new services
Planned paper submission on wide-area testbed experience and evaluation to SIGMETRICS 3/2001
43
Research Agenda, Cont.
• Phase 4: Write thesis (6 months)
– compile the publications
44
Acronyms Lookup
• APC: Automatic Path Creation
• CA: Call Agent
• IAP: ICEBERG Access Point
• iPOP: ICEBERG Point of Presence
• NMS: Name Mapping Service
• PAC: Personal Activity Coordinator
• PR: Preference Registry
45
Soft and Hard State
• Soft State
– expires unless refreshed, protocol action upon new state and timeout
– loss of state will not stop the system -- robust
– eventual consistency
– error recovery built into normal operation -- simple, but longer latency, and no diagnosis
• Hard State
– explicit state setup once only (bandwidth and processing efficiency)
– explicit error detection and recovery synchronously at involved components -- complex but immediate
– better consistency guarantees
46
Signaling Protocol: Group Membership Protocol
• Periodic membership exchange among members
– no bootstrapping needed: every member knows at least one other member (invitation-based)
– receive superset or disjoint set: immediate synchronization with the rest of the session
– run among the IAPs for Call Agent fault recovery
– time stamped <IAP, CA> list
• Convergence efficiency rather than bandwidth efficiency
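The periodic exchange of a time-stamped <IAP, CA> list can be illustrated with a merge that keeps the newest entry per IAP, plus a check for the "superset or disjoint set" case that triggers immediate synchronization. This is a sketch under assumptions: the dictionary layout, the function names `merge` and `needs_immediate_sync`, and the newest-timestamp-wins rule are illustrative choices, not details given in the slides.

```python
from typing import Dict, Tuple

# Hypothetical layout: IAP id -> (Call Agent id, timestamp of that binding)
Membership = Dict[str, Tuple[str, float]]


def merge(local: Membership, received: Membership) -> Tuple[Membership, bool]:
    """Combine two membership lists, keeping the newest entry for each IAP.

    Returns the merged list and whether it differs from `local`.
    """
    merged = dict(local)
    for iap, (ca, ts) in received.items():
        if iap not in merged or ts > merged[iap][1]:
            merged[iap] = (ca, ts)
    return merged, merged != local


def needs_immediate_sync(local: Membership, received: Membership) -> bool:
    """True when the received member set is a strict superset of ours or is
    disjoint from it, the two cases the slide singles out."""
    mine, theirs = set(local), set(received)
    return theirs > mine or mine.isdisjoint(theirs)
```

Keying the list by IAP means that when a Call Agent is replaced after a failure, the newer <IAP, CA> binding overwrites the stale one on the next exchange.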
47
Period Selection
• Soft State Period: dominates fault recovery time, affects bandwidth overhead
– cannot trade latency for bandwidth scalability
• Problem: what period values to select to fulfill the call setup latency, fault recovery latency requirements and minimize the bandwidth overhead? -- an optimization problem
48
Select Period: Problem Formulation
• Call setup latency = receiving 8 local-area and 4 wide-area msgs in sequence + msg processing time
• Receive a local-area msg = f (local-area period, local-area loss-rate, local-area propagation delay)
• The optimization problem:
– find local-area and wide-area periods that minimize bandwidth overhead, subject to the following constraints
  • E(call setup latency) < 1.5 seconds
  • Standard deviation (call setup latency) < 0.5 second
  • local-area fault recovery time < 1 s; wide-area < 4 s
– with parameters: 2% wide-area loss rate, 0.2% local-area loss rate, 2 ms local-area propagation delay, 100 ms wide-area delay
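The slides do not give the function f, so the following is only one plausible toy model of the expected call setup latency. It assumes that a message is first sent immediately, is resent once per soft-state period until a copy arrives, and that losses are independent; the function names are hypothetical, and only the loss rates, delays, and the 8 local plus 4 wide message counts come from the slide.

```python
def expected_receive_time(period: float, loss_rate: float, delay: float) -> float:
    """Expected time to receive one message under the assumed model.

    The number of lost copies before the first success is geometric with
    mean loss_rate / (1 - loss_rate); each loss costs one period.
    """
    expected_losses = loss_rate / (1.0 - loss_rate)
    return delay + period * expected_losses


def expected_setup_latency(local_period: float, wide_period: float,
                           processing: float = 0.0) -> float:
    """8 local-area and 4 wide-area messages in sequence, plus processing time."""
    local = expected_receive_time(local_period, loss_rate=0.002, delay=0.002)
    wide = expected_receive_time(wide_period, loss_rate=0.02, delay=0.100)
    return 8 * local + 4 * wide + processing
```

Under this simplified model the expectation grows linearly with each period, which is why a period can be bounded from above by the latency constraint. The model ignores any wait for the next refresh tick and says nothing about the standard-deviation constraint, so its numbers should not be read as the analysis behind the slides.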
49
Results: Period = f (processing)
• fault recovery time constraints dominate the effects on period
• local-area period = 1s
– 800 bps overhead
• wide-area period = 3s
– 233 bps overhead
• for 64kbps data stream, 1% * # of members
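The overhead figures above can be checked with simple arithmetic. The helper names below are hypothetical, and reading "bps × period" as the number of control bits sent per period is an assumption about how the overhead was counted.

```python
def overhead_percent(control_bps: float, stream_bps: float = 64_000) -> float:
    """Control traffic as a percentage of the data stream's bit rate."""
    return 100.0 * control_bps / stream_bps


def bits_per_period(control_bps: float, period_s: float) -> float:
    """Control bits sent in one soft-state period (assumed interpretation)."""
    return control_bps * period_s


local_pct = overhead_percent(800)      # local area: 800 bps against 64 kbps
wide_pct = overhead_percent(233)       # wide area: 233 bps against 64 kbps
local_bits = bits_per_period(800, 1)   # 1 s local-area period
wide_bits = bits_per_period(233, 3)    # 3 s wide-area period
```

Here `local_pct` evaluates to 1.25, consistent with the slide's rounded figure of about 1% per member, and `wide_pct` is well under 1%.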
50
Proposed Approach: Service Creation
[Figure labels: User GUI, Condition/Action, Preference Registry, Call Agent]
• Condition: conjunction of high level events, user interested conditions, and boolean expressions;
• Action: sequence of system primitives
• Advantage: call processing impl. independent
• Hypothesis: condition-action pair sufficient
51
An Example