The Next Generation Internet: Unsafe at any Speed?
Ken Birman
Dept. of Computer Science, Cornell University
Slide 2
Convergent Trends
Existing Internet exhibiting brownouts, security and
quality-of-service problems. Talk of a next generation Internet
offering 10 to 100-fold performance improvements. A new generation
of networked applications includes large numbers of critical ones.
Slide 3
Typical Critical Applications
Medical monitoring and clinical databases. Community health
information networks. Remote home care and remote telesurgery.
Integrated modular avionics systems. Air traffic control. Free
flight, 4-D navigation.
Slide 4
Medical Networks
Contacted a number of technical and business people in this field
(HP Careview, EMTEK, Hospital for Sick Children). Asked: What are
the trends? How are networks changing healthcare? How are these
systems made secure & reliable? Got any good stories for me?
Slide 5
An ICU Computer System
[Diagram components: bedside; clinical data server; digital library
and online PDR; laboratories; pharmacy; doctor's office]
Slide 6
A field in transition
During the 1980s, hospitals used largely dedicated systems.
Client-server architectures are now becoming dominant, but the
trend is a recent one. The older systems ran in physical isolation
and had limited, mission-specific functionality.
Slide 7
Important distinction
Medical monitoring equipment, computer controlled devices: These
practice medicine. FDA regulated, like a drug. Software subjected
to extreme verification methods; safety certification is costly and
hard. Extends to the IEEE medical information bus for connecting
bedside devices.
Slide 8
Important distinction
Medical monitoring equipment, computer controlled devices.
Clinical data systems: By definition, not considered safety
critical. Maintain the legally binding patient record. Think of a
database system. Human checks all entries, even data obtained
directly from devices or lab reports.
Slide 9
Traditional Approach
Each runs as a separate network. Developed completely
independently. No interconnection of any kind.
Slide 10
Networking technology?
Monitoring network is increasingly a dedicated real-time LAN; this
permits configuration flexibility, remote telemetry, even
adjustment of monitoring devices. Clinical database system
increasingly connected to laboratories, community health
information networks (CHINs), physicians' offices, insurance and
HMOs.
Slide 11
Platform choices?
Overwhelming trend is to introduce standard PCs and workstations,
standard Internet technologies. Forced migration from dedicated
platforms to shared, standard network platforms. Web access now
common from PCs that run clinical database software.
Slide 12
Blurring the distinction
Increasingly, see monitoring network cross-connected to the
clinical data network. Some physical isolation: not yet common to
see an IV perfusion drip controllable over an internet within a
hospital. Perimeter security using passwords, firewalls. But
medical security needs are unusual; mismatched to standard
solutions.
Slide 13
Creep of critical role
Technically, clinical data system is non-critical. But
increasingly, the system actually is critical: doctors and nurses
depend upon the accuracy and timeliness of reporting, and on
correct data for lab results, vitals, medications. FDA is simply
late to catch up with trends. Moreover, already seeing Windows 95
and MS Access as basis for such systems.
Slide 14
Consumer / society pull
Intensive and growing cost pressures. Desire for freedom from
medical system, home care. Consolidation of hospitals. HMOs want
to control care plan. Together these create a trend towards remote
telemedicine, even robot telesurgery, CHINs.
Slide 15
Vision: A Virtual Private Network
Application shares the network with untrusted agents but is
isolated from them.
Slide 16
Reality?
Current VPN support approximates this, but configuration is
potentially awkward and slow. Many CHINs won't use VPNs. By running
over the Internet, CHINs are exposed to bandwidth fluctuations and
denial of service from many causes.
Slide 17
Good stories?
Many cases of security or privacy violations (EMTEK has a good
one). HP told me that some hackers accidentally disrupted a cardiac
monitor in the Boston area a few years ago (trying to track this
down). Nutty nursing aide in Britain changed orders, discharged
patients, scheduled tests. HP Careview, starved for bandwidth,
flickers on and offline in some critical care units...
Slide 18
Broad picture?
Application trends outstripping technology. Decision making is by
societal consensus, cost pressures, reflects HMO needs. Hospital
executives insisting on standards. Hospital network of future: PCs,
off-the-shelf Internet software, standard Web stuff. Critical or
not, like it or not, it's happening!
Slide 19
What about aviation?
Much use of computer technologies. Flight management system
(controls airplane). Flaps, engine controls (critical subsystems).
Navigational systems. TCAS (collision avoidance system). Air
traffic control system on ground: in-flight, approach,
international hand-off. Airport ground system (runways, gates,
etc).
Slide 21
ATC system components
[Diagram components: controllers; air traffic database (flight
plans, etc); X.500 directory; radar; onboard]
Slide 22
Similar turmoil
On-board systems moving to COTS, integrated modular avionics.
Boeing 777 SafeBus a success story. Unlikely it could be replicated
with standard O/S and standard ATM or LAN hardware. Emergence of
4-D navigation (free flight) systems: ground network penetrates
level-A critical cockpit components.
Slide 23
Free flight
[Diagram components: ground system; on-board conflict alert and
resolution system; transponder and GPS]
Slide 24
Future avionics systems
Ground systems rely increasingly on automation, have form of a
highly available, highly critical network. Built using standard
PCs, software tools. Ground network becomes critical to flight
safety. On-board avionics are basically a dedicated real-time LAN
built with standard PCs but perhaps non-standard O/S. One platform,
many apps. Safety validation of components replaces current
validation of system. Think plug-and-play.
Slide 25
The list goes on
Disaster warning and response coordination. Power management (grid
control). Banking, stock markets, trading systems.
Computer-controlled vehicles. Military intelligence, command and
control. Critical business applications.
Slide 26
Commercial Off The Shelf
Build using COTS. Standard components. Buy off the shelf, then
harden them. Intended to be cheaper, easier to maintain. As a
practical matter, there is nothing else on the shelf!
Roll-your-own solutions abandon powerful tools that make modern
computing great!
Slide 27
COTS Technology Mountain
Slide 28
COTS Reliable Technology Mountain
Slide 29
Next Generation Internet
Current Internet looks frail. Only government investment can
address security, reliability, scalability and performance problems
of the Internet. Expectation is that we'll build it quickly, hence
that we basically know how today.
Slide 30
Next Generation Internet
Concrete details? Seeks 10 to 100-fold performance improvement.
Originally expected to provide IPv6 interface. Originally expected
to implement: long IP addresses; IPsec, DNSSEC; Quality of Service
options over some form of Diffserv (or RSVP) mechanism.
Slide 31
Reality check
Both IPv6 and RSVP now uncertain due to resistance from mainstream
IPv4 crowd. RSVP resource use on routers grows as O(n^2). IPv6
would outmode a huge existing investment. How likely is it that the
NGI will solve the practical problems identified earlier? How does
one build a secure, reliable, scalable, high performance network
application, anyhow? Do we in fact know how to do this?
Slide 32
Glimpse of the IPv4 crowd
They gave us TCP/IP, core Internet services, stuff on which we run
email, web. They elevate the end-to-end argument to a religion
(basically: packets, not circuits). Little experience with critical
applications.
Slide 33
What about QoS?
Best scheme: Diffserv. Uses an edge-classification of packets;
routers look at just a byte or two. But routers distort flow
dynamics: you send 50 packets per second but within the network, a
router might see a burst of 100, then a second of silence.
Consequence is that Diffserv will be at best stochastic (and it
also can't handle routing changes).
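A minimal Python sketch of the burst effect described on this slide. The 2-second hold at an upstream hop (HOLD_PERIOD_MS) is a made-up assumption chosen only to reproduce the slide's numbers; real queueing is far less regular.

```python
from collections import Counter

SEND_INTERVAL_MS = 20    # sender emits 50 packets per second
HOLD_PERIOD_MS = 2000    # hypothetical: upstream hop drains its queue every 2 s


def send_times(duration_ms):
    """Evenly spaced send timestamps, in milliseconds."""
    return list(range(0, duration_ms, SEND_INTERVAL_MS))


def release_time(sent_ms):
    """Time at which the upstream hop forwards a packet sent at sent_ms."""
    return (sent_ms // HOLD_PERIOD_MS + 1) * HOLD_PERIOD_MS


def arrivals_per_second(duration_ms):
    """Packets seen by the downstream router in each one-second bin."""
    counts = Counter(release_time(t) // 1000 for t in send_times(duration_ms))
    last_second = duration_ms // 1000
    return [counts.get(s, 0) for s in range(last_second + 1)]


print(arrivals_per_second(4000))  # prints [0, 0, 100, 0, 100]
```

The average rate is still 50 packets per second, but a per-second policer at the downstream router would see either twice the contracted rate or nothing at all.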
Slide 34
A troubling implication
It seems unlikely that the NGI will easily support isolation of
critical subsystems with the range of properties required. More
likely: a tool for building virtual circuits (one-one connections)
that run at very high speeds. Missing connection is the step from
the network to the robust application.
Slide 35
What do we need?
Isolation of functions. Critical functionality compartmentalized.
Components only interact through well-defined interfaces with
well-defined semantics. Developer proves that implementation
respects interface definition and semantics. On the other hand,
adequate performance is fundamental to providing robustness.
Slide 36
Evidence for these claims?
This is how modern avionics modules are built (wing flap and engine
control, flight management system, inertial navigation). Process is
extremely costly and works only for very small pieces of software.
SafeBus on Boeing 777 allows such software to share platform by
creating very strong firewalls between components.
Slide 37
Agenda emerges
Find ways to divide and conquer. Transform big nasty system into
smaller independent modules. Run them in an environment that has
strong properties, which the modules exploit. Resulting system has
strong properties too. Can we apply this to familiar distributed
computing problems?
Slide 38
Philosophy
Imagine a network as an abstract data type: an Overlay Network or
ON. We can instantiate it multiple times, condition each copy with
desired quality properties: a Virtual Overlay Network or VON. How
to introduce properties? Mixture of resource reservation at
routers, on a per-ON basis, and management actions at edges.
Slide 39
A VON
Looks like a dedicated Internet, although hosted on a shared
infrastructure. Supports guarantees of properties such as
bandwidth, noise level, security and freedom from denial of
service. Treated as an aggregate, not a set of pt-to-pt
connections!
Slide 40
Making Vision a Reality
1) NGI needs to give us the ON mechanism
2) We need to implement VONs using fairly standard protocols over
   the base ONs
3) Must be able to produce specialized solutions for
   reliability/security needs
4) Solutions amenable to selective use of formal tools
Slide 41
NGI hooks?
Diffserv and RSVP won't do it: they create an O(n^2) resource
reservation problem. Problem is that both schemes are fundamentally
connection oriented, and VON concept is fundamentally multipoint in
nature. Hence these point-to-point QoS mechanisms are not suitable
for supporting VONs. Any other options?
Slide 42
Switches supporting flows already exist
MCI, Sprint, AT&T already sell each other dedicated bandwidth with
isolation. This is on a scale of perhaps 10s of flows and hence
classification is easy. VONs might mean that a switch would see
thousands, but such scaling seems well within technical
feasibility.
Slide 43
Router understands flows Looks like this
Slide 44
Router understands flows
Looks like this. Acts like this:
[Diagram: Flow 1, Flow 2, Flow 3, Everything else]
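A rough Python sketch of the "acts like this" picture: the router sorts packets into one queue per reserved flow and a shared queue for everything else. The packet field `on_id`, the class name FlowRouter and the queue label are illustrative assumptions, not part of any real router interface.

```python
from collections import defaultdict

DEFAULT_QUEUE = "everything-else"


class FlowRouter:
    """Toy router that keeps one queue per reserved flow plus a default queue."""

    def __init__(self, reserved_flow_ids):
        self.reserved = set(reserved_flow_ids)
        self.queues = defaultdict(list)

    def classify(self, packet):
        # Only the flow identifier is inspected, never the endpoints.
        flow_id = packet.get("on_id")
        return flow_id if flow_id in self.reserved else DEFAULT_QUEUE

    def enqueue(self, packet):
        self.queues[self.classify(packet)].append(packet)


router = FlowRouter({"flow1", "flow2", "flow3"})
router.enqueue({"on_id": "flow2", "src": "a", "dst": "b"})
router.enqueue({"src": "c", "dst": "d"})            # untagged traffic
router.enqueue({"on_id": "flow9", "src": "e", "dst": "f"})  # no reservation
```

Untagged traffic and traffic tagged with an unknown identifier both land in the default queue, so the reserved flows are unaffected by it in this model.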
Slide 45
Things to notice
A flow in this sense aggregates all the traffic for one ON; the
identifier is for the ON, not the endpoints. Classification task is
thus much smaller, and resources needed to support this are linear
in number of ONs that pass through the switch, not the number of
potential connections. Each ON is like a dedicated network.
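A small worked calculation of the scaling argument, in Python. It assumes the worst case in which every pair of endpoints in an ON needs its own reservation and every such connection crosses the same switch; the ON sizes are invented for illustration.

```python
def pairwise_reservations(endpoints):
    """Point-to-point reservations needed to connect every pair of endpoints."""
    return endpoints * (endpoints - 1) // 2


def switch_state(on_sizes):
    """Return (per-connection entries, per-ON entries) for a list of ON sizes."""
    point_to_point = sum(pairwise_reservations(n) for n in on_sizes)
    per_on = len(on_sizes)
    return point_to_point, per_on


# Three hypothetical ONs with 100 endpoints each.
print(switch_state([100, 100, 100]))  # prints (14850, 3)
```

Doubling the endpoints in each ON roughly quadruples the first number and leaves the second unchanged.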
Slide 46
An ON has
A bandwidth guarantee (router sets resources aside on its behalf).
Perhaps latency guarantees. Can offer isolation between flows. But
not much else.
Slide 47
NGI part of the picture
NGI needs to give us raw ONs but also: robust routing
infrastructure; naming; ability to build an ON tolerant of one link
or router failure. Many building blocks are already in place. But
the core Internet community is balking on all forms of QoS:
isolation or other guarantees seen as inconsistent with end-to-end
philosophy.
Slide 48
But suppose we get our wish
Next President declares moral equivalent of war after continuing
Internet siege shuts down his web site during election: "Let there
be Overlay Networks!" Then what?
Slide 49
Our new goal?
Create VONs by adding properties to ONs. User sees VON as a set of
end-points with minimum guarantees, like isolation, between them.
We need a way to strengthen these properties, e.g. manage security
keys, manage RSVP parameters, routing, network name space. We may
also need ways to reliably communicate (1-1, 1-n patterns).
Slide 50
VONs as abstract data types
Slide 51
Focus on the processes and network
Slide 52
VONs as abstract data types
Think of the ON interface as an abstract type.
[Diagram labels: ON, ON, ON]
Slide 53
VONs as abstract data types
Add encryption by substituting a module that looks the same but
encrypts messages.
[Diagram labels: encrypt, encrypt, encrypt]
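One way to picture the substitution, as a hedged Python sketch. LoopbackON and EncryptedON are hypothetical names, the register/send interface is invented for illustration, and the XOR "cipher" is a placeholder that provides no real security.

```python
class LoopbackON:
    """Toy overlay network: hands each payload to the destination's handler."""

    def __init__(self):
        self.handlers = {}
        self.wire_log = []  # records what actually crossed the "wire"

    def register(self, endpoint, handler):
        self.handlers[endpoint] = handler

    def send(self, dest, payload):
        self.wire_log.append(payload)
        self.handlers[dest](payload)


class EncryptedON:
    """Same interface as the wrapped ON, but payloads are scrambled in transit."""

    def __init__(self, inner, key):
        self.inner = inner
        self.key = key

    def _xor(self, data):
        # Placeholder transform only; XOR with a repeating key is not secure.
        return bytes(b ^ self.key[i % len(self.key)] for i, b in enumerate(data))

    def register(self, endpoint, handler):
        self.inner.register(endpoint, lambda wire: handler(self._xor(wire)))

    def send(self, dest, payload):
        self.inner.send(dest, self._xor(payload))


base = LoopbackON()
von = EncryptedON(base, key=b"k3y")
received = []
von.register("q", received.append)
von.send("q", b"vitals ok")
```

The application code that calls register and send is identical for either class, which is the sense in which the module "looks the same".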
Slide 54
Layered Microprotocols
Interface to Horus is extremely flexible. Horus manages group
abstraction. Group semantics (membership, actions, events) defined
by stack of modules. Horus stacks plug-and-play modules to give
design flexibility to developer.
[Diagram modules: encrypt, filter, sign, ftol, vsync]
Slide 55
Layered Microprotocols in Horus
Interface to Horus is extremely flexible. Horus manages group
abstraction. Group semantics (membership, actions, events) defined
by stack of modules. Ensemble stacks plug-and-play modules to give
design flexibility to developer.
[Diagram modules: encrypt, filter, sign, ftol, vsync]
Slide 57
Same stack under each endpoint
[Diagram: three endpoints, each over the stack encrypt, vsync, ftol]
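A minimal Python sketch of plug-and-play layering. This is not the Horus or Ensemble API: the Stack, Sign and Encrypt classes and their down/up methods are invented here, the "signature" is only a CRC32 checksum and the "encryption" a one-byte XOR, both standing in for real modules.

```python
import zlib


class Encrypt:
    """Toy stand-in for an encryption layer (single-byte XOR, not secure)."""

    def __init__(self, key=0x5A):
        self.key = key

    def down(self, msg):
        return bytes(b ^ self.key for b in msg)

    def up(self, msg):
        return bytes(b ^ self.key for b in msg)


class Sign:
    """Toy stand-in for a signing layer: appends a CRC32 tag and checks it."""

    def down(self, msg):
        return msg + zlib.crc32(msg).to_bytes(4, "big")

    def up(self, msg):
        body, tag = msg[:-4], msg[-4:]
        if zlib.crc32(body).to_bytes(4, "big") != tag:
            raise ValueError("integrity check failed")
        return body


class Stack:
    """Messages go down through the layers on send and back up on receive."""

    def __init__(self, layers):
        self.layers = list(layers)

    def send(self, msg):
        for layer in self.layers:
            msg = layer.down(msg)
        return msg

    def receive(self, wire):
        for layer in reversed(self.layers):
            wire = layer.up(wire)
        return wire


stack = Stack([Sign(), Encrypt()])   # every endpoint builds the same stack
wire = stack.send(b"hello")
assert stack.receive(wire) == b"hello"
```

Changing the properties of the group means changing the list passed to Stack; the code above and below the stack does not change.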
Slide 58
Multiple VONs in single application
Yellow group for video communication. Green for control and
coordination.
[Diagram: six encrypt, vsync, ftol stacks]
Slide 59
Examples of reliability models
Virtual synchrony model: emerged from our work on Isis, now widely
accepted. Bimodal multicast model: probabilistic and has neat
performance properties but weaker logical consistency guarantees.
Secure group communication. Multimedia channels.
Slide 60
Virtual Synchrony Model
[Diagram: timelines for processes p, q, r, s, t with group views
G0 = {p,q}; r, s request to join; G1 = {p,q,r,s} (r, s added; state
xfer); p fails (crash); G2 = {q,r,s}; t requests to join;
G3 = {q,r,s,t} (t added, state xfer)]
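The view sequence in the diagram can be replayed with a short Python sketch. It models only the membership bookkeeping; failure detection, message flushing and state transfer are left out, and the Group class and its method names are assumptions made for illustration.

```python
class Group:
    """Tracks the sequence of membership views G0, G1, ... of one group."""

    def __init__(self, members):
        self.views = [frozenset(members)]  # views[i] corresponds to G_i

    @property
    def view(self):
        return self.views[-1]

    def join(self, *procs):
        self.views.append(self.view | frozenset(procs))

    def fail(self, proc):
        self.views.append(self.view - {proc})

    def multicast(self, msg):
        """Every member of the current view gets msg tagged with that view."""
        view_id = len(self.views) - 1
        return {member: (view_id, msg) for member in sorted(self.view)}


def replay_slide():
    group = Group({"p", "q"})   # G0
    group.join("r", "s")        # G1
    group.fail("p")             # G2
    group.join("t")             # G3
    return group


for i, v in enumerate(replay_slide().views):
    print(f"G{i} = {sorted(v)}")
```

The point of tagging each delivery with a view number is that all members that deliver a message agree on the membership at the time of delivery.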
Slide 61
Virtual Synchrony Tools
Various forms of replication: replicated data, replicate an object,
state transfer for starting new replicas... 1-many event streams
(network news). Load-balanced and fault-tolerant request execution.
Management of groups of nodes or machines in a network setting.
Slide 62
Stock Exchange Problem: Vsync. multicast is too fragile
Most members are healthy... but one is slow.
Slide 63
Measured Impact of Perturbation
[Figure: throughput (msgs/sec) versus amount perturbed]
Slide 64
The problem gets worse as the system scales up
[Figure: virtually synchronous Ensemble multicast protocols;
average throughput on nonperturbed members (0 to 250) versus
perturb rate (0 to 0.9), for group sizes 32, 64 and 96]
Slide 65
Why does stability matter?
Swiss Stock Exchange: exchange is fully electronic [FTCS-27
paper]. Uses Isis SDK to distribute all bids/offers and all trades.
Every node has the picture. But this means that entire trading
history is available to 50 member banks & firms and hundreds or
thousands of traders! Unstable node could bring exchange to its
knees. Similar issues seen in many other settings.
Slide 66
Pbcast has a probabilistic reliability model
Either almost all destinations receive the message or almost none
do so. This is strong enough to use in applications with critical
reliability needs (but not necessary for all their communication
purposes -- put side by side with virtual synchrony).
Slide 68
Figure 5: Graphs of analytical results
Slide 69
Pbcast has stable throughput
Gets this from a mixture of gossip-style local repair with several
innovations to avoid overload when some process fails. We
implemented the protocol and experimentally confirmed this.
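A toy Python simulation of gossip-style repair, to give a feel for the idea. This is not the pbcast protocol: it models one message, an unreliable initial multicast, and rounds in which each process pulls from one random peer. The overload-avoidance innovations mentioned above are not modeled, and all names and parameters are illustrative.

```python
import random


def gossip_round(has_msg, rng):
    """Each process contacts one random peer and copies the message if the peer has it."""
    n = len(has_msg)
    snapshot = list(has_msg)   # state at the start of the round
    result = list(has_msg)
    for i in range(n):
        peer = rng.randrange(n)
        if snapshot[peer]:
            result[i] = True
    return result


def simulate(n, initial_delivery_prob, rounds, seed):
    """Return the number of processes holding the message after each round."""
    rng = random.Random(seed)
    has_msg = [rng.random() < initial_delivery_prob for _ in range(n)]
    history = [sum(has_msg)]
    for _ in range(rounds):
        has_msg = gossip_round(has_msg, rng)
        history.append(sum(has_msg))
    return history


print(simulate(n=50, initial_delivery_prob=0.3, rounds=10, seed=7))
```

In this toy model a message that reached nobody stays lost, while one that reached even a few processes tends to spread quickly, which hints at why the outcome distribution has two modes.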
Slide 73
Now we have several styles...
Each style or model yields a VON with different properties.
Application might not see the multicast stack. Instead, the
environment in which the application runs could see the stack and
use it on behalf of the application. For example, a library could
use stack to maintain the keys with which it authenticates actions.
Slide 74
Formal methods
With so much riding on VON, we need strong guarantees that the
stack works! If protocols can be formally proved correct,
confidence will be far stronger. Can we use formal tools on network
protocols built in this compositional manner?
Slide 75
Exploiting formal methods
Van Renesse and Hayden: code stack with language having strong
semantics. They used OCaml dialect of ML. Now we can bring formal
tools to bear on issues of correctness: using Nuprl system for
this. Basically, it automates proofs and program transformations.
Slide 76
Initial Progress?
Presented in 1999 ACM SOSP paper. Have formalized the
transformations used to optimize stacks for high performance. We
show that from one initial stack, we can produce multiple optimized
stacks for common cases. Yields big speedups!
Slide 77
Steps
Transform Ensemble stack into a single function in a functional
style. Use partial evaluation to produce optimized version for
common cases. Use theorem proving to establish that stacks provide
desired properties. Transform back to imperative style. Resulting
code is optimized yet retains properties of original stack.
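The flavor of the common-case optimization can be sketched in Python. Here the specialization is written by hand and checked by testing, whereas the slides describe it as automated and formally proved; the two layers, the MTU value and the headers are all invented for illustration.

```python
MTU = 8  # hypothetical maximum chunk size


def frag_layer(chunks):
    """Pass small chunks through with header 0; split large ones with header 1."""
    out = []
    for chunk in chunks:
        if len(chunk) <= MTU:
            out.append(b"\x00" + chunk)
        else:
            pieces = [chunk[i:i + MTU] for i in range(0, len(chunk), MTU)]
            out.extend(b"\x01" + piece for piece in pieces)
    return out


def tag_layer(chunks):
    """Prefix every chunk with a fixed protocol tag."""
    return [b"V1" + chunk for chunk in chunks]


LAYERS = [frag_layer, tag_layer]


def general_stack(msg):
    """The whole stack as a single function: run every layer in order."""
    chunks = [msg]
    for layer in LAYERS:
        chunks = layer(chunks)
    return chunks


COMMON_PREFIX = b"V1" + b"\x00"  # both headers folded into one constant


def optimized_stack(msg):
    """Specialized for the common case of a message that needs no fragmentation."""
    if len(msg) <= MTU:                  # common-case predicate
        return [COMMON_PREFIX + msg]     # no loops, no per-layer dispatch
    return general_stack(msg)            # uncommon case: fall back to the full stack


for sample in (b"", b"ping", b"12345678", b"a message longer than one MTU"):
    assert optimized_stack(sample) == general_stack(sample)
```

Testing a handful of inputs only suggests equivalence; the approach on the slides aims to establish it by proof for all inputs.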
Slide 78
Optimization Example
Original code is simple but inefficient. Optimized code for common
case is provably equivalent yet inefficiencies are eliminated.
[Diagram labels: encrypt, vsync, ftol stack; "Common case?" test]
Slide 79
Optimization Example
We do nearly as well as hand-optimization and can automatically
handle much bigger stacks!
[Diagram labels: encrypt, vsync, ftol stack; "Common case?" test]
Slide 80
Wrapping things up
By building better networks, isolating protocol components and
system components, adopting a modular architecture, and selectively
using formal methods, we make it more and more practical to gain
both high performance and other desired properties, such as
reliability, security, stability, etc.
But will it happen?
Current political agenda focuses on speed and e-commerce
transactions. End-to-end community resists giving any guarantees no
matter how simple. And NGI focus is exclusively on point-to-point
QoS, which seems unscalable, denying us the one primitive building
block on which the whole concept depends!
Slide 83
Conclusions?
The world needs better networks! Improve them by improved
opportunity for modularity, isolation, guarantees of security and
quality of service: VONs and layers built over them. Lacking this,
we face very serious problems simply going forward in directions to
which society is already committed.