
27/1/2010 1

Lecture 3: State of the Art

D.Sc. Arto Karila

Helsinki Institute for Information Technology (HIIT)

[email protected]

T-110.6120 – Special Course on Data Communications Software: Publish/Subscribe Internetworking

www.psirp.org

Contents

1. Introduction
2. Guiding Principles
3. Future Internet Architecture
   1. Protocols
   2. Mechanisms
   3. Publish/Subscribe paradigm
4. Design Considerations
   1. Economics
   2. Security
   3. Trust
   4. Privacy

Introduction

• The PSIRP project aims to solve some major issues of the current Internet by applying the information-centric publish/subscribe paradigm throughout the layers
• In fact, many current applications are inherently pub/sub in nature:
  • Distribution of software and anti-virus updates
  • IPTV
  • BitTorrent
  • RSS feeds
  • and more!
• A clean-slate pub/sub architecture could serve such applications very well

Introduction

• To succeed, we must know the current state of the art, make use of it, and extend it in many areas of communication
• In early 2008 a rather thorough state-of-the-art study was conducted and compiled into a report (D2.1)
• Development has not stopped there and the wiki used has lived on, but D2.1 presents a snapshot of the situation two years ago
• Because of the breadth of the area, we had to focus on promising sub-areas


Guiding Principles

• Our vision is based on these concepts:
  • Everything is information, which can be organized hierarchically to build complicated structures from simple elements
  • There are different forms of information reachability on all levels of the design and they can change in real-time
  • Control is given to the recipient of information, fixing the imbalance of powers inherent in TCP/IP
• The state-of-the-art study was focused on issues that appear to serve these ideas


Scope

• The goals were mapped into areas of investigation that seemed to be relevant
• Future Internet Architecture
  • Protocols
    • Naming
    • Addressing
    • Routing
    • Multicast
  • Mechanisms
    • Compensation
    • Caching
    • Security
    • Network Coding

Scope (cont’d)

• Publish/Subscribe
• Design considerations
  • Economics
  • Socio-economic aspects
  • Security must be designed into the architecture
  • Trust is an important aspect of networking
  • Privacy is of increasing importance

Methodology

• The methodology of the SoA study was dictated by the envisioned scope
• The SoA was simply the first step towards understanding the relevant prior work
• “A system as complex as the Internet can only be designed effectively if it is based on a core set of design principles, or tenets, that identify points in the architecture where there must be common understanding and agreement” [Cla2003]

Methodology

• The original Internet was created by people who shared the common goal of interconnecting their computing equipment
• Computers were physically large, with extremely limited resources
  • You kept your data with you and not on the system
• Communication was modeled to share resources point-to-point, NOT for many-to-many content sharing and retrieval
• As the Internet has grown well beyond its envisioned scope, several of its limitations have become apparent
• From the socio-economic point of view, solving tussles (conflicts of interest) is one of the key problems facing the future Internet
• This leads to design for change [Cla2003] and the requirement of evolvability [Rat05]
• The importance of trust (E2E => T2T)

Naming

• Currently naming usually happens at the service level: domain names, e-mail addresses, URIs etc.
• The Domain Name System (DNS) defines a static, hierarchical namespace organized into a tree, where ICANN manages the top-level domains
• The DNS namespace is decoupled from the (also hierarchical) IP address space

Quick Discussion

• What is good about DNS?
• What is bad about DNS?
• Why is DNS insufficient to support host mobility?

Naming

• DONA replaces domain names with self-certifying, two-part, hash-based names, naming data (not hosts or interfaces)
• [Ram2004a] proposes a new design for name resolution
• [Ram2004b] proposes a prefix-matching DHT
• In [Cal2007] channels are named with unique identifiers without hierarchy or centralized control
• [Cro2003] introduces contexts: collections of homogeneous network elements
• There are lots of different proposals
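The idea of a self-certifying, two-part, hash-based name can be sketched as follows. This is only an illustration under stated assumptions: SHA-256 as the hash, a placeholder byte string standing in for a real public key, and the helper names `make_name` and `name_matches_key` are all hypothetical and not taken from DONA itself. A real system would additionally verify a signature over the data, which is not shown.

```python
import hashlib


def make_name(public_key: bytes, label: str) -> str:
    """Build a two-part name P:L, where P is a hash of the principal's public key."""
    principal = hashlib.sha256(public_key).hexdigest()
    return f"{principal}:{label}"


def name_matches_key(name: str, public_key: bytes) -> bool:
    """Check that the P part of the name was derived from the given public key."""
    principal, _, _label = name.partition(":")
    return principal == hashlib.sha256(public_key).hexdigest()


# Placeholder bytes standing in for a real public key (assumption).
key = b"example-public-key-bytes"
name = make_name(key, "lecture3.pdf")

print(name_matches_key(name, key))           # True
print(name_matches_key(name, b"other-key"))  # False
```

Because the name itself commits to the key, anybody holding the data and the key can check the binding without consulting a naming authority.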

Addressing

• Traditionally, IP addresses were divided into classes A, B, and C
• In 1993 Classless Inter-Domain Routing (CIDR) was introduced, with variable-length prefixes and aggregation of blocks
• [And2007] proposes an address structure where the subnet prefix is replaced with a self-certifying Autonomous Domain identifier (AD) and the suffix with a self-certifying Host Identifier (EID), addresses now being of the form AD:EID
• ROFL proposes routing on flat labels, in a totally topology-independent way (this does not scale)
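CIDR's variable-length prefixes and block aggregation can be tried out with Python's standard `ipaddress` module. The address blocks below are arbitrary examples chosen for illustration.

```python
import ipaddress

# Two adjacent /25 blocks can be advertised as a single /24 aggregate.
blocks = [
    ipaddress.ip_network("192.0.2.0/25"),
    ipaddress.ip_network("192.0.2.128/25"),
]
aggregated = list(ipaddress.collapse_addresses(blocks))
print(aggregated)  # [IPv4Network('192.0.2.0/24')]

# A prefix length is no longer tied to class boundaries: a /22 holds 1024 addresses.
net = ipaddress.ip_network("10.1.0.0/22")
print(net.num_addresses)  # 1024
```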

Addressing

• In [Cal2007] nodes are anonymous and addressed through their incoming channels
• In [Cro2003] specific addresses are bound to different addresses in different contexts
• [Han2004] proposes seven steps towards an Internet resistant against DoS attacks, the first two calling for separation of client and server addresses and removal of globally reachable client addresses

Inter-Domain Routing

• Border Gateway Protocol (BGP) is suffering from serious scaling problems
  • Default-free zone
  • In June 2007, an APNIC router in Tokyo had ~225,000 routes!
• Any change in a globally visible prefix causes Internet-wide route updates
• The number of globally visible prefixes is growing for a number of reasons, such as:
  • Provider-independent addressing
  • Multi-homing of sites
  • Protecting against prefix hijacking

Domain-Level Routing

• To tackle BGP’s scaling issues, [And2007] proposes to route at the domain level
• Removal of path selection from packet-forwarding-level routing has been proposed
• Explicit domain-level path construction fits with name-based routing (e.g. TRIAD)
• [Lak2006] proposes providing the path selection function as a separate routing service
• [Key2006] lets the sending host optimize path selection based on congestion information
• NIRA [Yan2007] proposes a separate path discovery protocol for the up-graph, a Name-to-Route Lookup Service (NRLS) for the downhill route, and allowing the endpoints to further negotiate end-to-end path selection

Domain-Level Routing

• Some of these functionalities are needed by multi-path capable transport protocols, such as the Stream Control Transmission Protocol (SCTP) [Ste2000]
• [Fea2004] proposes removing the routing function from routers to allow for better domain-level control of routing policies and a more direct domain-level mechanism for inter-domain routing
• ROFL uses domain-level source routes as the means to route packets between endpoints: the first packet of a session uses hierarchical DHT routing, but after that the endpoints can use NIRA-like [Yan2007] end-to-end domain-level path control

Compact Routing

• Routing table sizes and communication cost of BGP are increasing exponentially with the number of global prefixes [Kri2007]
• Routing on AS numbers doesn’t offer a real solution to the growing complexity
• Compact Routing aims to decrease the size of routing tables while allowing non-shortest paths to be used
• Traditional shortest-path algorithms yield routing tables of size O(n log n) [Gav1996]

Compact Routing

• A routing scheme is said to be compact if it produces:
  • Logarithmic address and header sizes
  • Sub-linear routing table sizes
  • Stretch bounded by a constant
• A compact routing scheme can be:
  • Specialized or universal (works on all graphs)
  • Name-dependent or name-independent
• Two compact routing schemes with small stretch (3) are the non-hierarchical Cowen [Cow1999] and the Thorup-Zwick (TZ) [Tho2001] schemes
• [Kri2004] focuses on the TZ scheme with Internet-like graphs

Overlay Routing

• In overlay routing the topology is formed over an underlying (usually IP) network
• DHTs are examples of overlay routing
• DHT techniques can be utilized e.g. in implementing non-hierarchical rendezvous
• An example of DHT-based solutions is the Content Addressable Network (CAN)
• CAN is based on a d-dimensional Cartesian space, each node having a coordinate zone that it is responsible for

CAN

• A two-dimensional example
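A two-dimensional CAN can be sketched in a few lines: a key is hashed to a point in the unit square, and the node whose zone contains that point is responsible for it. The four fixed zones, the node names and the use of SHA-256 are assumptions made for illustration; a real CAN splits and merges zones dynamically and routes between neighbouring zones, which is not shown.

```python
import hashlib

# Hypothetical static zones: node -> (x_min, x_max, y_min, y_max) in the unit square.
ZONES = {
    "A": (0.0, 0.5, 0.0, 0.5),
    "B": (0.5, 1.0, 0.0, 0.5),
    "C": (0.0, 0.5, 0.5, 1.0),
    "D": (0.5, 1.0, 0.5, 1.0),
}


def key_to_point(key: str) -> tuple[float, float]:
    """Hash a key to a point (x, y) with 0 <= x, y < 1."""
    digest = hashlib.sha256(key.encode()).digest()
    x = int.from_bytes(digest[:4], "big") / 2**32
    y = int.from_bytes(digest[4:8], "big") / 2**32
    return x, y


def owner(point: tuple[float, float]) -> str:
    """Return the node whose coordinate zone contains the point."""
    x, y = point
    for node, (x0, x1, y0, y1) in ZONES.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return node
    raise ValueError("point outside the coordinate space")


print(owner((0.25, 0.75)))  # C
print(owner(key_to_point("lecture3.pdf")) in ZONES)  # True
```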

Chord Ring

• Greedy forwarding (compare with ROFL)
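Greedy forwarding on a Chord-style ring can be sketched as below. The 6-bit identifier space and the set of node identifiers are example values, and the global `nodes` list is a simplification: each real node only knows its own finger table, whereas this sketch computes fingers from global knowledge. Function names are hypothetical.

```python
import bisect

M = 6  # identifier bits, so the ring has 2**6 = 64 positions (example value)
nodes = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])


def successor(ident: int) -> int:
    """First node at or after ident, wrapping around the ring."""
    i = bisect.bisect_left(nodes, ident % 2**M)
    return nodes[i % len(nodes)]


def fingers(n: int) -> list[int]:
    """Finger k of node n points at the successor of n + 2**k."""
    return [successor(n + 2**k) for k in range(M)]


def in_open(x: int, a: int, b: int) -> bool:
    """True if x lies in the ring interval (a, b)."""
    return a < x < b if a < b else (x > a or x < b)


def in_half_open(x: int, a: int, b: int) -> bool:
    """True if x lies in the ring interval (a, b]."""
    return a < x <= b if a < b else (x > a or x <= b)


def lookup(start: int, key: int) -> list[int]:
    """Return the nodes visited while greedily forwarding towards the key."""
    path, n = [start], start
    while True:
        succ = successor(n + 1)
        if in_half_open(key, n, succ):
            path.append(succ)  # succ is responsible for the key
            return path
        # Forward to the farthest finger that still precedes the key.
        nxt = next((f for f in reversed(fingers(n)) if in_open(f, n, key)), succ)
        n = nxt
        path.append(n)


print(lookup(8, 54))  # [8, 42, 51, 56]
```

Each hop at least halves the remaining ring distance to the key, which is why lookups take a logarithmic number of hops in the number of nodes.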

Pastry DHT

• An example with hexadecimal identifiers
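Prefix-based forwarding with hexadecimal identifiers can be illustrated as follows. This is a simplified sketch, not Pastry's actual routing-table and leaf-set logic: it assumes a node simply picks, among the nodes it knows, the one sharing the longest identifier prefix with the key, breaking ties by numerical closeness. The identifiers and function names are made up for the example.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading hexadecimal digits that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def next_hop(current: str, key: str, known: list[str]) -> str:
    """Choose the known node (or stay at current) that best matches the key."""
    def rank(node: str) -> tuple[int, int]:
        # Longer shared prefix first, then smaller numerical distance to the key.
        return (shared_prefix_len(node, key), -abs(int(node, 16) - int(key, 16)))

    return max(known + [current], key=rank)


known_nodes = ["d13da3", "d4213f", "d462ba", "9a3c11"]
print(next_hop("65a1fc", "d46a1c", known_nodes))  # d462ba
```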

Content-Based Pub/Sub Routing

• Hosts subscribe to content by specifying filters on the events
• The content of the message defines its ultimate destination
• Subscribers use an interest registration facility, which sets up data delivery paths
• Pub/sub has been proposed as a replacement for TCP/IP
• This would change the economic model too
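Subscribing with filters on event content can be sketched with a toy in-memory broker. The `Broker` class, the event fields and the subscriber names are hypothetical and only illustrate the principle that the content of a message, not a destination address, selects the recipients.

```python
from typing import Callable

Event = dict
Filter = Callable[[Event], bool]


class Broker:
    """Toy content-based broker: subscribers register filters over event content."""

    def __init__(self) -> None:
        self._subscriptions: list[tuple[str, Filter]] = []

    def subscribe(self, subscriber: str, event_filter: Filter) -> None:
        self._subscriptions.append((subscriber, event_filter))

    def publish(self, event: Event) -> list[str]:
        # Deliver to every subscriber whose filter matches the event content.
        return [name for name, flt in self._subscriptions if flt(event)]


broker = Broker()
broker.subscribe("alice", lambda e: e.get("type") == "update" and e.get("product") == "antivirus")
broker.subscribe("bob", lambda e: e.get("type") == "video")

recipients = broker.publish({"type": "update", "product": "antivirus", "version": 7})
print(recipients)  # ['alice']
```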

Content-Based Pub/Sub Routing

• Filter-based event routing: pub/sub servers are organized into an acyclic tree
• Multicast-based event routing: a multicast tree is built for every interest group
• Kyra [Cao2004] combines the approaches using a two-level hierarchy
  • Within a clique (based on proximity) all nodes know each other
  • On a higher level, minimum spanning trees to the cliques are built for various events

Content-Based Pub/Sub Routing

• Siena is a classic example of distributed content-based routing implemented in the application layer, coexisting with TCP/IP [Car2001]
• Overlay networks allow more complex functionality to be implemented on top of IP
• Good overlay routing configuration follows the placement of network-level routers

Multicast

• Multicast is vital for the efficient distribution of media (such as video)
• IPv4 has class D addresses for multicast
• DVMRP is an early multicast routing protocol
• The topological map of OSPF allows MOSPF to operate with little overhead
• Protocol Independent Multicast (PIM) works with any routing protocol in two modes: sparse (PIM-SM) and dense (PIM-DM)
• In the local network, IGMP is used
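The class D range used for IPv4 multicast is 224.0.0.0/4, which can be checked with the standard `ipaddress` module. The sample addresses are arbitrary.

```python
import ipaddress

class_d = ipaddress.ip_network("224.0.0.0/4")

for text in ["224.0.0.1", "239.255.255.250", "192.0.2.1"]:
    addr = ipaddress.ip_address(text)
    # Membership in the class D block agrees with the library's own multicast flag.
    print(text, addr in class_d, addr.is_multicast)
```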

Multicast

• Multicast is considered valuable but it is not supported in the Internet
• The main reasons for this are its security and scalability issues
  • DVMRP and PIM-DM initially flood the network
  • Each multicast router requires a lot of state
  • The sender runs the risk of getting traffic back from a large group of recipients
• [PAS1998] provides a summary of approaches emphasizing different goals

Recent Trends in Multicast

• There are many proposals for more scalable or more easily deployable multicast
• These can be roughly divided into three groups:
  • Router-based
  • Host-based
  • Overlay (DHT)-based

Mechanisms

• Compensation
• Caching
• Security
• Network Coding

Compensation

• Compensation facilitates efficient use of resources by providing the “owner” with some assurance that he will eventually benefit from the use of his resource
• Different forms of compensation:
  • Authorization
  • Community membership
  • Resource exchange
  • Sacrifice or evidence of deliberate waste of the user’s own resources
  • Payment or promise of future reimbursement

Compensation

• Types of transaction-related costs:
  • Immediate technical costs
  • Information search costs
  • Collateral costs associated with the use
• Compare with Transaction Cost Economics:
  • Researching potential suppliers
  • Collecting information on prices
  • Negotiating contracts
  • Monitoring the supplier’s output
  • Legal costs incurred (contract breaches)

Compensation

• Weber, Biggart and Delbridge: exchange = voluntary agreement involving the offer of any sort of present, continuing, or future utility in exchange for utilities of any sort offered in return
• Four categories of exchange systems:
  • Price System
  • Associative System
  • Moral System
  • Communal System

Caching

• [PIT2008] studies caching performance in nodes of a Delay Tolerant Network (DTN), providing ad-hoc communication services within (sparse) mobile user communities when end-to-end IP service is unavailable
• The network acts as a distributed cache
• Caching is needed to handle heavy traffic
• The price of storage is dropping faster than the price of communication => caching is getting more tempting

Storage vs. Transit Price

[Figure: disk space price on a logarithmic scale, from $100/MB down to $0.1/GB, over the years 1985 to 2010; curves for Raw Disk Space and Tier-1 Internet Transit, with 2009 marked]

Scope Security

• In pub/sub architectures scopes control the spreading of information
• [Fie2004] proposes an extension to a large pub/sub system, Rebeca, to support scopes
• In [Far2002] access control is implemented with attribute certificates (ACs) used to identify nodes and their privileges

Packet Layer Authentication

• Each packet (or PDU at any layer) can be signed and the public key included
• The authenticity of the packet can now be determined by any node on its route
• This prevents the attacker from consuming a lot of resources with falsified packets
• This area will be covered in more detail in the PSIRP Security Architecture lecture

Transparency and Information Accountability

• Social rules tend to cause compliance more often than abuse
• This is due to the fact that the consequences of compliance usually are more pleasant than those of violation
• If we can build this into the architecture, a large-scale system can be made reliable, robust, secure, trusted, and efficient
• [Wei2007] introduces transparency and accountability as the attributes of information systems that could result in compliance and collaboration

Network Coding

• Communication through an unreliable and unpredictable channel is difficult
• Transmission errors can lead to long delays and a large number of retransmissions
• Network coding includes Forward Error Correction (FEC) as well as more modern rateless codes, i.e. digital fountain codes

Reed-Solomon Codes

• Among the most significant traditional codes is the Reed-Solomon code (N, K), with q^m symbols in its alphabet, which can be decoded after receiving K out of the N symbols sent
• The message consists of K original symbols and N-K parity symbols

Fountain Codes

• Fountain techniques randomly send all the parts of a message with added redundancy
• They are rateless since there is no limit on the number of encoded packets generated from the source message, and it can change on the fly
• The source can send as many encoded packets as necessary for the destination to decode the data
• Among fountain codes are: Random Linear Fountain Code, Tornado Codes and LT Fountain Code

XOR Coding

• Intelligent mixing of packets can be used to increase network throughput
• An example is the situation where two users of a Wi-Fi base station (router) exchange two messages
• Without network coding we need four transmissions
• With simple XOR coding we can do with only three transmissions

XOR Coding

• Message exchange without network coding: four transmissions

XOR Coding

• Message exchange with network coding: three transmissions
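The three-transmission exchange can be written out in a few lines. The two messages are arbitrary examples and are assumed to have equal length; in practice the shorter one would be padded.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


msg_a = b"hello from A"  # transmission 1: user A -> base station
msg_b = b"reply from B"  # transmission 2: user B -> base station

# Transmission 3: the base station broadcasts one coded packet to both users.
coded = xor_bytes(msg_a, msg_b)

# Each user XORs the broadcast with the message it already knows (its own).
print(xor_bytes(coded, msg_a))  # b'reply from B'  (recovered by A)
print(xor_bytes(coded, msg_b))  # b'hello from A'  (recovered by B)
```

Without coding the base station would have to forward each message separately, giving four transmissions in total.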

Linear Network Coding

• Linear network coding is rather like XOR coding, except that the XOR operation is replaced with a linear combination of data
• The recipient can decode the information having received m out of the n messages
• Linear coding appears to work well with multicast, which makes it interesting for PSIRP
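Decoding from m out of n linearly coded messages can be sketched over a small prime field. Several choices here are assumptions made for illustration: the field GF(257), integer symbols, and fixed Vandermonde coefficients (powers of 1, 2, ..., n), which guarantee that any m coded packets are decodable. Practical schemes typically use random coefficients, which are decodable only with high probability.

```python
P = 257  # small prime field (assumption); symbols are integers in 0..256


def encode(source: list[int], n: int) -> list[tuple[list[int], int]]:
    """Return n packets (coefficients, value); each value is a linear combination of the source symbols."""
    m = len(source)
    packets = []
    for x in range(1, n + 1):
        coeffs = [pow(x, j, P) for j in range(m)]
        value = sum(c * s for c, s in zip(coeffs, source)) % P
        packets.append((coeffs, value))
    return packets


def decode(packets: list[tuple[list[int], int]], m: int) -> list[int]:
    """Recover the m source symbols from m packets by Gauss-Jordan elimination mod P."""
    rows = [list(coeffs) + [value] for coeffs, value in packets[:m]]
    for col in range(m):
        pivot = next(r for r in range(col, m) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], -1, P)  # modular inverse (Python 3.8+)
        rows[col] = [(a * inv) % P for a in rows[col]]
        for r in range(m):
            if r != col and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [(a - factor * b) % P for a, b in zip(rows[r], rows[col])]
    return [rows[i][m] for i in range(m)]


source = [72, 105, 33]          # m = 3 source symbols
packets = encode(source, n=5)   # n = 5 coded messages
received = [packets[4], packets[1], packets[2]]  # any 3 of the 5 arrive
print(decode(received, m=3))    # [72, 105, 33]
```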

Publish/Subscribe Paradigm

• The starting point of our work is that event-based computing and the pub/sub paradigm are crucial for future services
• RSS feeds can be seen as pub/sub
• SIP is an example of event-based computing
• Formal modeling of pub/sub systems and correctness of content-based routing protocols are examined in [Müh2002b]
• A routing protocol is correct if it satisfies the safety and liveness requirements


Design Considerations

• Economics
• Security
• Formal Modeling

Economics

• Some key economic issues are:
  • Which aspects of network usage are charged for?
  • Related to the above, which are the entities involved?
  • How is charging accomplished?
  • What happens at domain boundaries?
  • What are the objectives of charging?
  • Which economic "fundamentals" limit the architectural choices?

Socio-Economics

• The socio-economic aspects include:
  • Value-chain dynamics
  • Bullwhip Effect
  • Overlay Economics
  • Design for Tussle
  • Reductionism vs. Evolution

Security

• Designing and building security into the architecture is central to PSIRP
• The SoA study was concerned with:
  • Network attacks
  • Threat analysis
  • Solution methodologies
  • Formal methods for modeling security protocols
  • Requirements
  • Operating tactics

DDoS Attacks

• Distributed Denial of Service (DDoS) attacks are difficult to protect against
• Among them, the most difficult are bandwidth-consuming attacks
• [And2003] and [Par2007] use a data channel and a small control channel, over which anybody can send packets to a destination asking for permission to send data => proactive filtering
• The capability is added to every packet sent, and the data channel only needs to handle packets with valid capabilities
• Obviously, the control channel now becomes a target
• Various computational, memory, and bandwidth puzzles have been proposed to increase real customers’ chances
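The capability idea can be illustrated with a keyed hash. This is a sketch under assumptions, not the mechanism of [And2003] or [Par2007]: it assumes the destination derives a capability as an HMAC over the source and destination identifiers using a secret of its own, and that the data channel can recompute and compare it. Names and the secret are hypothetical.

```python
import hashlib
import hmac

SECRET = b"destination-secret"  # hypothetical key held by the destination side


def issue_capability(src: str, dst: str) -> bytes:
    """Control channel: grant src permission to send to dst."""
    return hmac.new(SECRET, f"{src}->{dst}".encode(), hashlib.sha256).digest()


def accept_on_data_channel(src: str, dst: str, capability: bytes) -> bool:
    """Data channel: only packets carrying a valid capability are handled."""
    expected = issue_capability(src, dst)
    return hmac.compare_digest(expected, capability)


cap = issue_capability("client-1", "server")
print(accept_on_data_channel("client-1", "server", cap))   # True
print(accept_on_data_channel("attacker", "server", cap))   # False
```

A practical design would also bind the capability to a validity period and a traffic budget, which this sketch leaves out.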

DDoS Attacks

• Filtering really should be done already in the network (compare with PLA)
• [Bal2005] proposes proactive filtering based on Bloom filters and source routes
• Diffusion, replication and hiding focus on making it harder to concentrate the attack
• Pub/sub systems make routing decisions on flexible messages: routing-scope flooding with complex messages consumes lots of resources
• Pub/sub routing nodes maintain a lot of state information: false publications can cause DoS
• [Wun2007] states that DoS attacks on pub/sub systems might have unpredictable effects
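For readers unfamiliar with Bloom filters, the data structure itself can be sketched as below. This shows only the generic filter, not how [Bal2005] applies it; the bit-array size, the number of hash functions and the use of SHA-256 are arbitrary choices for the example.

```python
import hashlib


class BloomFilter:
    """Compact set membership: no false negatives, occasional false positives."""

    def __init__(self, size_bits: int = 256, num_hashes: int = 3) -> None:
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit array stored in one integer

    def _positions(self, item: bytes):
        # Derive num_hashes bit positions by salting the hash with an index byte.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item: bytes) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


allowed = BloomFilter()
allowed.add(b"flow:client-1->server")
allowed.add(b"flow:client-2->server")

print(b"flow:client-1->server" in allowed)  # True
```

A lookup for an item that was never added usually returns False, but may return True with a small probability that depends on the filter size and load.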

Threat Analysis and Research

To survey existing attacks, we divide them into three domains of functionality:
• End-user domain, where publishers and subscribers may not trust each other, the pub/sub service or the underlying infrastructure
• Pub/sub service provision domain, where the provider may not trust publishers and subscribers or vice versa
• Infrastructure domain, whose components (cache elements, label switching routers, forwarding nodes, multicast points, network coders) may not trust each other

Threat Analysis and Research

• In the pub/sub service provision domain, providers and end-users should have a symbiotic relation
  • Replay attack: intercepting and copying packets containing credentials and using them to masquerade
  • Sybil attack: the attacker presents itself with multiple identities, undermining the redundancy of a distributed system
• Integrity of service means avoiding service misuse and isolating incidents (e.g. a rogue service broker generating spam)

Threat Analysis and Research

• Infrastructure integrity means that the elements performing networking functions are uncorrupted and trustworthy
• Possible threats include:
  • Cache Poisoning (bogus caches)
  • Routing Service Attacks (discovery & maintenance)
  • Forwarding Phase Attacks (fast data path)
  • Eclipse Attack (malicious nodes colluding)
  • Amplification (e.g. dormant subscriptions)
  • Resource Consumption Attacks (aka sleep deprivation)
  • Message State Effect (stateful routing nodes)
• Service-layer confidentiality: Man-in-the-Middle

Existing Solutions

Existing security solutions include:
• Access Control
  • [Bel2003] proposes role-based access control
• EventGuard
  • Provides security for content-based pub/sub: authentication, confidentiality and integrity of publications using six guards (subscribe, advertise, publish, unsubscribe, unadvertise, routing)
• QUIP
  • Protocol for securing content distribution in pub/sub networks [Cor2007]

Formal Modeling and Analysis of Security Protocols

• The analysis of pub/sub crypto protocols is much like that of traditional send/receive
• Pub/sub versions of existing protocols rely on explicit channels or pre-agreed names instead of expecting the network to deliver
• Unlike in traditional crypto protocols, the sender need not know the identity of the recipient
• It may, for example, be enough to know that there is just one peer

Formal Modeling and Analysis of Security Protocols

• The Dolev-Yao intruder model, where the intruder can hear, intercept and synthesize any message, largely still pertains but it will need to be extended and enriched
• Focus is moving from authenticating principals to various security properties related to the data itself
• Group communication changes the nature of many problems
• New pub/sub protocols need to be designed: resource control, including issues of fairness, compensation, and authentication

Security Goals

A preliminary set of threats and security goals:
• Secrecy of security-related entity identities and identity protection.
• Secrecy of keys and other related information, typically needed for confidentiality and data integrity of the transmitted information.
• DoS, including unsolicited bulk traffic (spam).
• Threats to fairness, including mechanisms such as compensation and authorization.
• Authenticity and accountability of the information, including its integrity and trustworthiness, reputation of the origin, and evidence of past behavior, if available.
• Privacy and integrity of subscriptions to information.
• Privacy and integrity of the forwarding state (as a result of subscriptions).

Formal Methods in Security

• Burrows, Abadi, Needham: BAN logic assumes only passive eavesdropping
• The Casper/FDR combination provides a dynamic perspective
• Casper translates a security protocol description into CSP that can be fed into the FDR model checker

Trust

• Trust deals with the intentions and knowledge of parties
• A lot of work has been done on analyzing this in the protocol context
• In the real world, trust is about our ability to rely on the benevolence and good intentions of people and organizations
• Checks and balances have been developed to institutionalize trust

Privacy

• Privacy issues are central to any new technology (e.g. RFID)
• Privacy can be divided into (overlapping) domains:
  • Physical privacy
  • Information privacy
  • Contextual privacy

Anonymity

• Anonymity should be the norm, not the exception
• Matt Blaze has done a lot of good work in this area but generally it is neglected
• There are several anonymity architectures for preserving different kinds of anonymity

Thank you for your attention!

Questions? Comments?