Understanding VoIP Dr. Jonathan Rosenberg Chief Technology Strategist Skype


Page 1: Understanding VoIP

Understanding VoIP

Dr. Jonathan Rosenberg

Chief Technology Strategist

Skype

Page 2: Understanding VoIP

What is this course about?

Getting “under the hood” and understanding how VoIP works

An exploration of the protocols and technologies behind VoIP

Conveying an understanding of the various problems that need to be solved for VoIP to work

Page 3: Understanding VoIP

What this course is not about

A general introduction to telephony
A detailed cookbook or deployment guide to VoIP
A product survey of VoIP and IP telephony products
  In particular, Cisco or Skype products are not discussed except in passing

Page 4: Understanding VoIP

Ground Rules

Ask questions ANY TIME!
  I will be bored if this is a one-way conversation
No question is too stupid
Laughing at or mocking anyone’s questions is unacceptable
Please ask off-the-wall or exploratory questions – there is a lot that is not in here!

Page 5: Understanding VoIP

Agenda

Breaking up the problem
Voice and Video Coding
Voice and Video Transport
Quality of Service
Signaling
Security
NAT Traversal

Page 6: Understanding VoIP

Non-Agenda

Programming APIs
Emergency Services, Lawful Intercept
Numbering, Routing, Naming (ENUM, TRIP)
PSTN Interworking
Billing, Provisioning, OAM
Conferencing, IVR, Applications

Page 7: Understanding VoIP

Breaking Up the Problem

[Figure: two endpoints exchange media (RTP) across the IP network. Supporting elements shown: signaling servers, presence servers, application server, media servers, directories/databases, accounting/billing, OAM. Protocols labeled in the figure: SIP, H.323, MGCP, H.248, SIMPLE, XMPP, LDAP, ENUM, RADIUS, DIAMETER.]

Page 8: Understanding VoIP

Voice Coding

Page 9: Understanding VoIP

Voice Endpoint Model

[Figure: block diagram of a voice endpoint with a 2-wire interface. Blocks shown: DTMF/tone generation, DTMF/tone detection, hybrid echo canceller, loss admin, nonlinear processing, silence detection (speech / no speech), speech encoding, packetizer, unpacker, comfort noise generation, speech decoding.]

Page 10: Understanding VoIP

Codecs

Waveform codecs:
  Directly encode speech in an efficient way by exploiting temporal and/or spectral characteristics
  Attempt to reproduce the input signal’s waveform by minimizing error between input and coded signals
Source codecs / vocoders:
  Estimate and efficiently encode a parametric representation of speech

Page 11: Understanding VoIP

CELP

Minimizes perceptually weighted error, similar to waveform coders
Short-term predictor is the LP (vocal tract) filter
Excitation is obtained from a codebook and a long-term pitch predictor
Closed-loop search is MIPS intensive

Page 12: Understanding VoIP

Codec Comparison

Codec     Sampling                  Bitrate          Latency   Comments
G.711     8 kHz                     64 kbps          125 µs    PSTN codec
G.729     8 kHz                     8 kbps           10 ms     CS-ACELP
G.723.1   8 kHz                     5.3/6.3 kbps     37.5 ms
AMR       8 kHz                     4.75-12 kbps     25 ms     GSM codec
G.722.1   16 kHz                    24/32 kbps       40 ms     Polycom SIREN
AMR-WB    16 kHz                    6.6-23.85 kbps   25 ms     GSM wideband, encumbered
SILK      8, 12, 16, 24 kHz (SWB)   6-40 kbps        25 ms     Skype codec

Listen at: http://www.voiceage.com/listeningroom.php

Page 13: Understanding VoIP

Echo Cancellation

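[Figure: an echo canceller placed between the packet network and the 2-wire/4-wire hybrid. The hybrid reflection at the analog/digital boundary forms the echo path; the canceller estimates the echo path, subtracts the estimate, and passes the result through a non-linear processor. ERL and ERLE are marked on the diagram.]

This echo canceller cancels ‘local’ echoes from the hybrid reflection
ERL: Echo Return Loss (dB)
ERLE: Echo Return Loss Enhancement
Double-talk
Convergence time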

Page 14: Understanding VoIP

Echo Canceller Specifics

The voice echo path is like an electrical circuit

If a ‘break’ (cancellation) is made anywhere in the ‘circuit’, you will eliminate the echo

The easiest place to make the break is with a canceller ‘looking into’ the local analog/digital telephony network, NOT the packet network (which has much longer and variable delays)

The echo canceller at the other end of the call eliminates the echoes that YOU hear, and vice versa

Echo canceller coverage (e.g. 32 ms) is the maximum length of echo impulse response that can be cancelled from the local analog/digital network (the packet network delay does not matter)

The non-linear processor is used to ‘clean-up’ any residual echo left over from the canceller

Page 15: Understanding VoIP

Voice Activity Detection

[Figure: speech magnitude (dB) versus time for two sentences. Speech is detected when the level exceeds a signal-to-noise threshold above the noise floor; each detection is followed by a hang-over period, typically fixed at 200 ms. Front-end speech clipping is marked at the start of each sentence.]
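The threshold-plus-hang-over behaviour in the figure can be sketched roughly as follows. This is a minimal illustration, not a production VAD: the function name, the 6 dB threshold, the frame size and the sample levels are all assumptions; only the 200 ms hang-over comes from the slide.

```python
def vad_with_hangover(frame_levels_db, noise_floor_db, snr_threshold_db=6.0,
                      frame_ms=10, hangover_ms=200):
    """Return one boolean per frame: True if the frame should be transmitted."""
    hangover_frames = hangover_ms // frame_ms
    remaining = 0
    decisions = []
    for level in frame_levels_db:
        if level >= noise_floor_db + snr_threshold_db:
            remaining = hangover_frames  # speech detected: re-arm the hang-over
            decisions.append(True)
        elif remaining > 0:
            remaining -= 1               # below threshold but still in hang-over
            decisions.append(True)
        else:
            decisions.append(False)      # silence: nothing is sent
    return decisions


# Hypothetical levels, using 100 ms frames so the 200 ms hang-over is 2 frames.
levels = [-60, -30, -30, -60, -60, -60, -60]
decisions = vad_with_hangover(levels, noise_floor_db=-60,
                              frame_ms=100, hangover_ms=200)
```

The hang-over keeps transmitting briefly after the level drops, which avoids chopping the tail of a word at the cost of a little extra bandwidth.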

Page 16: Understanding VoIP

Comfort Noise Generation

Silence isn’t golden… it’s annoying
When speech stops… what do you play to the listener?
Simple techniques:
  Play white/pink noise
  Replay the last received packet over and over
Fancier technique:
  Transmitter measures the local “noise environment”
  Transmitter sends a special “comfort noise” (CN) packet as the last packet before silence
  Receiver generates noise based on the CN packet

Page 17: Understanding VoIP

Voice Quality: Mean Opinion Scores

MOS of 4.0 = Toll Quality

[Figure: a source sentence (“Nowadays, a chicken leg is a rare dish”) passes through an impairment (codec ‘X’, channel simulation) and is rated by listeners on a scale of 1 to 5.]

Rating   Speech Quality   Distortion
5        Excellent        Imperceptible
4        Good             Just perceptible but not annoying
3        Fair             Perceptible and slightly annoying
2        Poor             Annoying but not objectionable
1        Unsatisfactory   Very annoying and objectionable

Page 18: Understanding VoIP

Clear Channel MOS’s

[Figure: bar chart of mean opinion score (scale 1 to 5) for G.711 (64 kbit/s PCM), G.726 (32 kbit/s ADPCM), G.723.1 (6.4 kbit/s MP-MLQ), G.729 (8 kbit/s CS-ACELP) and IS-54 (8 kbit/s NA digital cellular). Bar labels as extracted: 4.1, 3.8, 3.9, 3.9, 3.44.]

Page 19: Understanding VoIP

MOS Under Varying Conditions (G.729)

Condition                       MOS
Avg speech level (-20 dBm0)     3.85
Low input level (-30 dBm0)      3.54
2 tandem codings                3.46
3 tandem codings                2.68
1% frame erasure rate (FER)
5% bit error rate               3.24
5% FER                          3.02
10% FER
20% FER

Page 20: Understanding VoIP

Video Coding

Page 21: Understanding VoIP

Key Terms

Term          Description
Frame         An individual picture in the sequence that makes up the video
Frame Rate    The number of frames per second in the video. 30 is excellent (TV quality)
Resolution    The number of horizontal and vertical pixels. VGA = 640x480
Interlacing   A mechanism for transmitting video by splitting a frame into two fields, one representing the odd lines and one the even lines. This is the “i” in 1080i
Progressive   As opposed to interlaced, a method for transmitting video by sending each frame as a whole
HD            High-definition resolutions: 720p is 1280x720 at 60 fps; 1080i is 1920x1080 at 30 fps

Page 22: Understanding VoIP

Key Concept: Macroblocks

Rectangular block in an image which is a basic unit of compression. Typically 16x16 pixels.

Page 23: Understanding VoIP

Key Concept: Inter-Frame Prediction

Encode

Predict information in the current frame by looking at previous frames, possibly taking into account motion.

Page 24: Understanding VoIP

Key Concept: Discrete Cosine Transform (DCT)

A technique for representing a macroblock by its component frequencies. Discarding the higher frequencies throws away the finer details without losing the core image.

[Figure: grid of DCT basis patterns, with horizontal frequency increasing along one axis and vertical frequency increasing along the other.]

Page 25: Understanding VoIP

Video Encoder Block Diagram

Page 26: Understanding VoIP

Key Codec Comparisons

Codec       Timeline   Applications
H.261       1990       ISDN at multiples of 64 kbps
H.263       1996       Early Flash using the Sorenson Spark implementation. Original RealVideo codec. Required in IMS.
H.264 AVC   2003       YouTube, iTunes, Blu-ray; most modern video conferencing. The current primary video codec for real-time. Typical VGA 15 fps bitrate = 500 kbps
H.264 SVC   2007       “Layered” video that provides improved quality and resilience; ideal for multiparty video conferencing.
VP7         2005       On2 Technologies codec; Skype; successor to H.263 in Flash

Page 27: Understanding VoIP

Voice and Video Transport: RTP

Page 28: Understanding VoIP

RTP: What is it?

Real Time Transport Protocol, RFC 3550
  Product of the AVT working group
  1996: proposed standard (RFC 1889)
  2004: full standard
What does it do?
  End-to-end transport of real-time media
  Optimized for multicast
  Provides sequencing, timing, framing, loss detection
  Provides feedback on reception quality
  Provides information on group members
  Provides data to correlate audio, video and other media
Works with any codec
  Needs a payload format for each codec
Flexible

Page 29: Understanding VoIP

RTP: What isn’t it?

Doesn’t guarantee quality of service
  Doesn’t reserve network resources
  Doesn’t guarantee no loss or bounded delay
  Can work with QoS protocols (RSVP)
Doesn’t provide signaling
  Other protocols must be used to set up RTP (like SIP or H.323)
Not a specific protocol type
  Does not run directly on top of IP
  Runs on top of UDP
  No fixed port number

Page 30: Understanding VoIP

RTP Stack

IP

UDP

RTP RTCP

Page 31: Understanding VoIP

Big Picture: RTP, SDP and SIP

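[Figure: two end users, each with a proxy, connected across the IP network. SIP with SDP travels between the users through the proxies; RTP travels between the end users. Example SDP shown in the figure:]

c=IN IP4 123.1.2.3
m=audio 1122 RTP/AVP 0 1
m=video 1130 RTP/AVP 98
a=rtpmap:98 h263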

Page 32: Understanding VoIP

RTP Components: Data + Control

Data, aka RTP (very confusing)
  Usually on an even UDP port (NATs change this – later)
  Provides: sequencing, timing, framing, content labeling, user identification
Control = Real Time Control Protocol (RTCP)
  Same address as data, but usually one port higher
  Provides: reception quality, sender statistics, participant information (multicast), synchronization information

Page 33: Understanding VoIP

Real Time Data Transport

Originator breaks stream into packets (segmentation)
  Application layer framing (ALF)!!!
Packets sent; network may lose, delay, reorder packets
Must, at receiver:
  Reorder
  Recover
  Resegment
  Resynchronize (clock synchronization!)

[Figure: an RTP source sending RTP packets to an RTP sink.]

Page 34: Understanding VoIP

Transport System

Source
  Digitize audio from microphone
  Silence suppression
  Echo cancellation
  Compress audio
    G.711: 64 kbps
    G.729: 8 kbps
    G.723.1: 5.3/6.3 kbps
  Packetize audio in RTP
  Send
Sink
  Receive packets
  Un-packetize
  Decompress
  Comfort noise generation
  Reorder
  Recover loss
  Jitter buffer
  D/A conversion to speakers

Page 35: Understanding VoIP

Jitter Buffer

Packets are delayed differently
Must play them out periodically
Packets may arrive after their designated playout time -> loss
Insert extra delay to compensate
May need to adapt this amount

[Figure: packets versus time.]
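A fixed (non-adaptive) jitter buffer can be sketched as below. The function name, the arrival times and the buffer sizes are illustrative assumptions; the point is only that a packet arriving after its scheduled playout time counts as lost, and that a larger buffer trades delay for fewer late packets.

```python
def playout_schedule(arrivals_ms, frame_ms=20, buffer_delay_ms=60):
    """arrivals_ms[i] is the arrival time of packet i.

    The first packet is held for buffer_delay_ms; every later packet is due
    exactly frame_ms after the previous one. Returns (playout_time, played)
    for each packet, where played is False if the packet arrived too late.
    """
    first_playout = arrivals_ms[0] + buffer_delay_ms
    schedule = []
    for seq, arrival in enumerate(arrivals_ms):
        playout = first_playout + seq * frame_ms
        schedule.append((playout, arrival <= playout))
    return schedule


# Hypothetical arrival times (ms) for five 20 ms packets with varying delay.
arrivals = [100, 125, 190, 165, 260]
schedule = playout_schedule(arrivals, frame_ms=20, buffer_delay_ms=40)
late = [seq for seq, (_, played) in enumerate(schedule) if not played]
```

With a 40 ms buffer, packets 2 and 4 miss their playout time; raising the buffer to 80 ms lets every packet play, at the cost of 40 ms more end-to-end delay.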

Page 36: Understanding VoIP

RTP Packet Header

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|X|  CC   |M|     PT      |       sequence number         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           timestamp                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           synchronization source (SSRC) identifier            |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|            contributing source (CSRC) identifiers             |
|                             ....                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Page 37: Understanding VoIP

RTP Header Fields

Version: 2
P: indicates padding (for encryption)
X: extension bit
CSRC count: for mixers (later)
M: marker bit, indicates framing
  Audio codecs: first packet in talkspurt
  Video: last packet in frame
Payload Type: indicates encoding in RTP packet
  Allows changes per-packet
  Useful for: adaptation, DTMF codec, silence codecs
SN: defines ordering of packets
Timestamp: when packet was generated
SSRC: identifier
CSRC: list of mixed users
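As an illustration of the field layout, the fixed 12-byte header and the optional CSRC list can be unpacked with Python's standard `struct` module. The function name and the sample packet values are assumptions made for this sketch; header extensions and padding are not handled.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Decode the fixed RTP header and any CSRC identifiers."""
    if len(packet) < 12:
        raise ValueError("RTP fixed header is 12 bytes")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    csrc_count = b0 & 0x0F
    end = 12 + 4 * csrc_count
    if len(packet) < end:
        raise ValueError("truncated CSRC list")
    csrcs = list(struct.unpack(f"!{csrc_count}I", packet[12:end]))
    return {
        "version": b0 >> 6,             # V: top two bits
        "padding": bool(b0 & 0x20),     # P
        "extension": bool(b0 & 0x10),   # X
        "csrc_count": csrc_count,       # CC
        "marker": bool(b1 & 0x80),      # M
        "payload_type": b1 & 0x7F,      # PT
        "sequence_number": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "csrcs": csrcs,
    }


# Hypothetical packet: version 2, marker set, payload type 0, no CSRCs.
sample = struct.pack("!BBHII", 0x80, 0x80 | 0, 4711, 160, 0x12345678)
header = parse_rtp_header(sample)
```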

Page 38: Understanding VoIP

RTP Timestamp

Tick units are dependent on codec
  For speech: 125 microseconds (standard 8 kHz sampling rate)
  For video: 90 kHz
  For audio: 44.1 kHz (CD rate)
Gaps in TS, but not in SN, mean silence
Initial value random for security
Video
  Timestamp represents time at beginning of frame
  Many packets may have same timestamp
Speech
  Time per packet may vary
  Depends on packetization: 20-100 ms typical
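The relationship between clock rate, packetization time and timestamp step is simple arithmetic, sketched here. The helper names and the 30 fps video frame rate are assumptions for the example; the 8 kHz and 90 kHz clock rates are the ones listed above.

```python
def timestamp_increment(clock_rate_hz: int, packet_ms: int) -> int:
    """Number of timestamp ticks covered by one packet."""
    return clock_rate_hz * packet_ms // 1000


def next_timestamp(timestamp: int, clock_rate_hz: int, packet_ms: int) -> int:
    """Advance a 32-bit RTP timestamp, wrapping around at 2**32."""
    return (timestamp + timestamp_increment(clock_rate_hz, packet_ms)) & 0xFFFFFFFF


# 8 kHz speech: one tick is 125 microseconds, so a 20 ms packet is 160 ticks.
speech_step = timestamp_increment(8000, 20)

# 90 kHz video at an assumed 30 frames per second: 3000 ticks per frame.
video_ticks_per_frame = 90000 // 30

# The timestamp is a 32-bit field, so it wraps.
wrapped = next_timestamp(0xFFFFFFFF - 59, 8000, 20)
```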

Page 39: Understanding VoIP

Payload Formats

Each codec needs a way to be encapsulated in RTP
The RTP audio/video profile (RFC 3551) defines mechanisms for many common codecs
  G.711, G.729, G.723.1, G.722, etc.
  Some simple video
More complex codecs have their own payload format documents
  MPEG
  H.263 and H.261
Payload format defines
  How to break a frame into packets
  Extra fields needed below the main RTP header

Page 40: Understanding VoIP

Advanced Topics

DTMF and Tones
  RFC 2833
  Special codecs for encoding touch tones (DTMF) and other signals
  Can send either the waveform (frequency, amplitude) or the actual signal (#, 8, 0)
Compressed RTP
  RFC 2508
  For dialup links
  Don’t send the header, just send an index
  Far side uses the index to retrieve the header, and then increments certain fields

Page 41: Understanding VoIP

Quality of Service

Page 42: Understanding VoIP

Quality of Service

The problem we are trying to solve is to give “better” service to some at the expense of giving worse service to others; QoS fantasies to the contrary, it’s a zero sum game

- Van Jacobson

Page 43: Understanding VoIP

Quality of Service So, what’s the problem?

[Figure: “Usability of Voice Circuit as a Function of End-to-End Delay”. Utility (0.0 to 1.0) is plotted against time (0 to 800 msec). Regions marked: Toll Quality, Satellite Zone, CB Zone, Fax Relay/Broadcast. Also labeled: early I-Phone technology; private network VoFR & VoIP technology.]

Improving I-Phone means:
• Lower PC Delay
• Lower Network Latency
• Tighten Network Jitter

Page 44: Understanding VoIP

Delay Budget

Device sample capture
Encode delay (algorithmic delay + processing delay)
Packetization/framing
Move to output queue/queueing delay
Access (up) link transmission
Backbone network transmission
Access (down) link transmission
Input queue to application
Jitter buffer
Decode processing delay
Device playout delay

“The Network”

Page 45: Understanding VoIP

Some Techniques to Improve “Network QoS”

RED: Random Early Drop (or “Detect”)
WFQ: Weighted Fair Queuing
Intserv/RSVP: ReSerVation Protocol
IP Precedence
DiffServ
CRTP: Compressed Realtime Transport Protocol
MCML: Multi-Class Multi-Link PPP

Page 46: Understanding VoIP

Random Early Detect (RED): this is Basic Hygiene!

Objectives
  Keep average queue size low – good for voice
  Fairness – bigger streams punished more
  Avoid synchronization
Only works with loss-responsive transport protocols
Algorithm – probabilistic dropping of packets

[Figure: drop probability (up to 1) versus queue size, with Min and Max thresholds marked on the queue-size axis.]
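The drop curve can be sketched as a piecewise function of the average queue size. This follows the commonly described RED shape (no drops below Min, a linear ramp up to a ceiling `max_p`, and certain drop at or above Max); the function name, the thresholds and the 0.1 ceiling are assumptions for illustration, not values from the slide.

```python
def red_drop_probability(avg_queue, min_th, max_th, max_p=0.1):
    """Probability of dropping an arriving packet, given the average queue size."""
    if avg_queue < min_th:
        return 0.0                      # short queue: never drop
    if avg_queue >= max_th:
        return 1.0                      # queue too long: always drop
    # Between the thresholds the probability rises linearly from 0 to max_p.
    return max_p * (avg_queue - min_th) / (max_th - min_th)


# Hypothetical thresholds of 10 and 30 packets.
p_low = red_drop_probability(5, min_th=10, max_th=30)
p_mid = red_drop_probability(20, min_th=10, max_th=30)
p_high = red_drop_probability(30, min_th=10, max_th=30)
```

Because the drop decision is probabilistic, flows sending more packets see more drops, which is the fairness property listed in the objectives.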

Page 47: Understanding VoIP

Poll: Will RED Help Voice?

Yes No

• Voice not loss responsive
• Mixing voice and data in same queue bad
• Voice queues usually not congested

Page 48: Understanding VoIP

Weighted Fair Queueing

Each flow “sees” a dedicated amount of bandwidth Bj

A packet arriving at time t is transmitted at time t+size/Bj

[Figure: a link of bandwidth B shared by three flows with bandwidths B1, B2 and B3.]

B = B1 + B2 + B3

Page 49: Understanding VoIP

What’s the Problem??

WFQ is unrealizable because of
  Variable packet sizes
  Causality
Example:
  Link speed 100 kbps
  Flow 1: 10 kbps
  Flow 2: 90 kbps

[Figure: a 1500-byte packet and a 100-byte packet on the link. Delay of the small packet: 8.8 ms in theory, 128 ms actual.]
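The two figures in the example can be reproduced with a short calculation. This assumes one reading of the slide: the 100-byte packet belongs to flow 2 and arrives just after a 1500-byte packet from flow 1 has started transmission on a non-preemptive link. The function and variable names are illustrative.

```python
def tx_time_ms(size_bytes: int, rate_bps: int) -> float:
    """Time to serialize a packet of size_bytes at rate_bps, in milliseconds."""
    return size_bytes * 8 / rate_bps * 1000


LINK_BPS = 100_000
FLOW2_BPS = 90_000

# Ideal WFQ: flow 2 "sees" its own 90 kbps link, so 100 bytes take about 8.9 ms.
theory_ms = tx_time_ms(100, FLOW2_BPS)

# Real link: the 1500-byte packet already in progress must finish at the full
# link rate (120 ms) before the 100-byte packet can be sent (8 ms more).
actual_ms = tx_time_ms(1500, LINK_BPS) + tx_time_ms(100, LINK_BPS)
```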

Page 50: Understanding VoIP

Approximations of WFQ

Many PhDs written with approximate and implementable algorithms
Algorithms differ in their delay bound
  How much worse than perfect WFQ is this?
  Delay bounds are a function of bandwidth, number of queues, other params
Algorithms
  SCFQ: Self-Clocked Fair Queueing
  WF2Q: Worst-Case Fair Weighted Fair Queueing
  FBFQ: Frame-Based Fair Queueing
  PGPS:
  DRR:

Page 51: Understanding VoIP

WFQ Voice Configuration

How to pick allocated bandwidth?
Consider G.711, 30 ms framing (74.6 kbps)
  If Bi = 74.6 kbps, delay is at least 30 ms
  If Bi = 149.2 kbps, delay is at least 15 ms
Must set voice queue bandwidth to at least 2x actual voice usage to keep delays down!
  Unused bandwidth will go to data
Need an accurate WFQ implementation
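The numbers above follow from the packet size and the rule that a packet is transmitted size/Bi after it arrives. The sketch below assumes 40 bytes of IP/UDP/RTP headers per packet and no link-layer overhead; the helper names are illustrative.

```python
def packet_bits(codec_bps: int, frame_ms: int, header_bytes: int = 40) -> int:
    """Bits in one voice packet: codec payload plus IP/UDP/RTP headers."""
    return codec_bps * frame_ms // 1000 + header_bytes * 8


def wfq_delay_ms(bits: int, allocated_bps: int) -> float:
    """Scheduling delay of one packet in a queue allocated allocated_bps."""
    return bits / allocated_bps * 1000


bits = packet_bits(64_000, 30)        # 1920 payload bits + 320 header bits
stream_bps = bits * 1000 / 30         # about 74.7 kbps on the wire

delay_1x = wfq_delay_ms(bits, 74_600)    # about 30 ms
delay_2x = wfq_delay_ms(bits, 149_200)   # about 15 ms
```

Doubling the allocation halves the per-packet delay, which is why the voice queue is sized well above the actual voice load.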

Page 52: Understanding VoIP

Priority Queueing

Emulates the familiar “elite airport line” experience

Voice and data packets in separate queues

If there are any packets in the voice queue, they are serviced first

Voice Data

Server

Page 53: Understanding VoIP

Priority Queueing Considerations

Easy to configure – no bandwidth values required
Main problem – data starvation
Need to police voice queue
Doesn’t work as well when there is other non-voice high priority traffic (video)
Head-of-line blocking from data queue

Page 54: Understanding VoIP

Intserv: Integrated Services

Guaranteed Service (RFC 2212)
  Mathematically provable bounds on end-to-end datagram queuing delay/bandwidth
Controlled Load Service (RFC 2211)
  Approximate QoS from an unloaded network for delay/bandwidth
Describe traffic with a “TSPEC”
  r = token bucket rate
  b = token bucket depth
  p = peak transmission rate
  m = minimum (policed) packet size
  M = maximum packet size
Describe endpoints with a “FlowSpec”
  Source/destination IP addresses, ports, protocol
RSPEC/FSPEC provides the policy to the queuing/scheduling algorithms
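The r and b parameters of the TSPEC describe a token bucket, which can be sketched as follows. This is a simplified conformance check only: it ignores the peak rate p and the m/M size limits, and the class name and sample numbers are assumptions for illustration.

```python
class TokenBucket:
    """Token bucket with fill rate r (bytes/s) and depth b (bytes)."""

    def __init__(self, rate_bytes_per_s: float, depth_bytes: float):
        self.rate = rate_bytes_per_s
        self.depth = depth_bytes
        self.tokens = depth_bytes   # bucket starts full
        self.last_s = 0.0

    def conforms(self, now_s: float, size_bytes: int) -> bool:
        """Return True and consume tokens if the packet is within profile."""
        elapsed = now_s - self.last_s
        self.last_s = now_s
        self.tokens = min(self.depth, self.tokens + elapsed * self.rate)
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True
        return False


# Hypothetical profile: r = 10,000 bytes/s, b = 500 bytes, 280-byte packets.
bucket = TokenBucket(rate_bytes_per_s=10_000, depth_bytes=500)
arrival_times_s = [0.0, 0.001, 0.002, 0.03]
results = [bucket.conforms(t, 280) for t in arrival_times_s]
```

The first packet fits in the initial burst allowance; the next two arrive too quickly and are out of profile; by 30 ms the bucket has refilled.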

Page 55: Understanding VoIP

RSVP Design

Signaling distinct from routing (modularity, deployability, evolvability)
Soft state (robustness, simplicity)
Transparent operation across non-RSVP routers (deployability)
Support shared and distinct reservations
Applies to unicast & multicast applications
Simplex & receiver-oriented

Page 56: Understanding VoIP

RSVP protocol

PATH: source to destination
  Traffic parameters of source
  Collects info on network capabilities
  Detects current route
RESV: destination to source
  Receiver-selected Int-Serv service
  Traffic parameters of receiver-selected reservation
  Follows route detected by PATH
  Reservation actually nailed in network
RSVP messages carried over IP
  Can also be carried over UDP, but few people do that

[Figure: PATH travels from Src to Dest.; RESV travels back from Dest. to Src.]

Page 57: Understanding VoIP

RSVP: Admission Control

[Figure: router block diagram. Routing: routing protocol, routing database, route selection. Reservation: flow request, reservation protocol, admission control, resource utilization database, queuing policy database. Forwarding: packets in, switching across interfaces 1 to N, a packet scheduler per output, packets out.]

Page 58: Understanding VoIP

Intserv/RSVP Acceptance

[Figure: curve of enthusiasm versus time for Intserv/RSVP. Labels along the curve: “Intserv/RSVP will solve the world’s QoS”; cool thing to say: “RSVP does not scale”; vBNS; RSVP over ATM; transparently transport RSVP; RSVP for VoIP in Enterprise; real value. Markers: “Today ISP” and “Today Enterprise”.]

Page 59: Understanding VoIP

IP Precedence & Diffserv

“Poor man’s” approach to QoS
Set IP Precedence/DSCP higher on voice packets
  This puts them in a different queue, resulting in isolation from best effort traffic
  Can be done by endpoint, proxy, or in routers through heuristics
Scales better than RSVP
  Keeps QoS control “local”
  Pushes work to the edges and boundaries
  Can provide bulk QoS by customer or network
No admission control
  Too much high-precedence traffic can still swamp the network

Page 60: Understanding VoIP

Diffserv Architectural Model

Clouds: regions of relative homogeneity
  Administrative control
  Technology
  Bandwidth
Within a cloud, QoS managed by local rules
Hard work confined to boundaries of clouds
  Classification
  Conditioning/policing
QoS information exchange limited to boundaries
  Bi-lateral, not multi-lateral
  Not necessarily symmetric

[Figure: neighboring clouds labeled “Me”, “Not Me”, “Also Not Me” and “Far Away”.]

Page 61: Understanding VoIP

Diffserv Scalability

Fundamental assumptions:
  Relatively small number of feasible queuing/scheduling algorithms for high link speeds
  Number of individual flows is large
  Many different rules, often policy driven
Group packets explicitly by the “Per-hop behavior (PHB)” they are to get
  Queue service
  Shaping/policing
Nodes in the middle of a cloud only have to deal with traffic aggregates

Page 62: Understanding VoIP

Diffserv Forwarding via PHBs

PHBs map to DSCPs (Diffserv Code Points)
  Values chosen for backward-compatibility with IPv4 TOS byte including IP Precedence (RFC 2474)
Packets with different DSCPs may be re-ordered
Forwarding resources partitioned by PHB/DSCP

Page 63: Understanding VoIP

Assured Forwarding PHB (AF*)

Four independent classes
Within each class, three levels of drop precedence
A congested AF node discards packets with higher drop precedence first
Packets with lowest drop precedence must be within the subscribed profile

*RFC 2597

Page 64: Understanding VoIP

Expedited Forwarding PHB (EF*)

Targeted at VoIP and “virtual leased lines”
Roughly equivalent to priority queuing, with a safety measure to prevent starvation
Implications:
  No more than 50% of a link can be EF
  See RFC 3247 and RFC 3248 for interesting mathematical analyses
  Worst case jitter at each hop is max of: number of EF microflows in the aggregate, or a single MTU packet of some other aggregate

*RFC 3246
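To make the backward compatibility with the TOS byte concrete, the sketch below computes the recommended AF codepoints (class in the top three bits, drop precedence in the next two, as in RFC 2597) and shows how the recommended EF codepoint (46, RFC 3246) appears in the old TOS byte and IP Precedence fields. The function names are illustrative.

```python
def af_dscp(af_class: int, drop_precedence: int) -> int:
    """Recommended DSCP for AF<class><drop>, e.g. af_dscp(1, 1) is AF11."""
    if not 1 <= af_class <= 4:
        raise ValueError("AF has four classes")
    if not 1 <= drop_precedence <= 3:
        raise ValueError("AF has three drop precedence levels")
    return (af_class << 3) | (drop_precedence << 1)


EF_DSCP = 46  # recommended EF codepoint, binary 101110


def tos_byte(dscp: int) -> int:
    """The DSCP occupies the upper six bits of the former TOS byte."""
    return dscp << 2


def ip_precedence(dscp: int) -> int:
    """The top three bits of the DSCP line up with the old IP Precedence."""
    return dscp >> 3


af11 = af_dscp(1, 1)
af43 = af_dscp(4, 3)
ef_tos = tos_byte(EF_DSCP)
ef_precedence = ip_precedence(EF_DSCP)
```

A router that only understands IP Precedence therefore sees EF traffic as precedence 5 and each AF class as the precedence equal to its class number.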

Page 65: Understanding VoIP

Diffserv Traffic Conditioner

Classifier: selects a packet in a traffic stream based on the content of some portion of the packet header
Meter: checks compliance to traffic parameters (e.g. token bucket) and passes the result to the marker and shaper/dropper to trigger a particular action for in/out-of-profile packets
Marker: writes/rewrites the DSCP
Shaper: delays some packets for them to be compliant with the profile

[Figure: packets pass through the classifier, the marker and the shaper/dropper, leaving shaped or dropped; the meter feeds the marker and the shaper/dropper.]

Page 66: Understanding VoIP

Diffserv Acceptance

[Figure: curve of enthusiasm versus time for Diffserv. Labels along the curve: “Diffserv will solve the world’s QoS”; “Diffserv engineering? Diffserv SLA? Internet e2e SLA?”; Diffserv design & deployment intra-domain; real value; today.]

Inter-SP Diffserv and end-to-end Internet QoS need further standardisation and commercial arrangements

Page 67: Understanding VoIP

Mixing Intserv & Diffserv: Aggregation

Host signals with RSVP
Edge or transit domains
  Aggregate reservations
  Mark packets using DSCP
In transit domains
  Blindly transfer end-to-end reservations using another IP Protocol Number, changed at the edge
Routers detect egress of reservation (deaggregation) on transfer from an interior or aggregator interface to an exterior (deaggregating) interface
Aggregate reservation size varies with load

[Figure: two edge domains connected through a backbone.]

Page 68: Understanding VoIP

RTP Compression

20 ms @ 8 kbit/s yields 20 byte payload
IP header 20; UDP header 8; RTP header 12
  Twice the size of the payload!
Header compression: 40 bytes to 2-4 most of the time
Hop-by-hop: use only on the slow links
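The overhead arithmetic can be checked with a few lines. The helper names are illustrative, and the compressed case assumes the 2-byte end of the 2-4 byte range quoted above; link-layer framing is ignored.

```python
IP_HEADER, UDP_HEADER, RTP_HEADER = 20, 8, 12


def payload_bytes(codec_bps: int, frame_ms: int) -> int:
    """Codec payload carried in one packet."""
    return codec_bps * frame_ms // 8000


def stream_bps(payload: int, header: int, frame_ms: int) -> int:
    """Bandwidth of the packet stream including the given header size."""
    return (payload + header) * 8 * 1000 // frame_ms


payload = payload_bytes(8000, 20)                  # 20 bytes
headers = IP_HEADER + UDP_HEADER + RTP_HEADER      # 40 bytes

uncompressed_bps = stream_bps(payload, headers, 20)   # 8 kbps codec needs 24 kbps
compressed_bps = stream_bps(payload, 2, 20)           # 2-byte compressed header
```

On a slow access link this is the difference between a codec that nominally needs 8 kbps consuming 24 kbps or under 9 kbps.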

Page 69: Understanding VoIP

Sample Delay Budget (G.711 - 64kbps)

Delay Source (G.711)                                  Budget (ms)
Device sample capture                                 0.1
Encode delay (algorithmic delay + processing delay)   2.5
Packetization/framing                                 10
Move to output queue/queue delay                      0.5
Access (up) link transmission                         30
Backbone network transmission                         5
Access (down) link transmission                       10
Input queue to application                            0.5
Jitter buffer                                         35
Decode processing delay                               0.5
Device playout delay                                  0.5
Total                                                 94.6

Page 70: Understanding VoIP

Sample Delay Budget (G.729 - 8kbps)

Delay Source (G.729)                                  Budget (ms)
Device sample capture                                 0.1
Encode delay (algorithmic delay + processing delay)   17.5
Packetization/framing                                 20
Move to output queue/queue delay                      0.5
Access (up) link transmission                         30
Backbone network transmission                         5
Access (down) link transmission                       10
Input queue to application                            0.5
Jitter buffer                                         35
Decode processing delay                               5
Device playout delay                                  0.5
Total                                                 119.1

Page 71: Understanding VoIP

Signaling: SIP

Page 72: Understanding VoIP

SIP is one of Many

ITU H.323
  Originally for video conferencing
  The first standard protocol for VoIP
  Still in wide usage, but negative growth
MGCP
  Dumb phones controlled by smart server
  “Softswitch” – PSTN emulation view
Megaco/H.248
  Standard version of MGCP

Page 73: Understanding VoIP

Core SIP Functions

Establishment of peer to peer sessions
Management of peer to peer sessions
  Keepalives
  Graceful and non-graceful termination
Rendezvous
  Forking
  Search
Policy Based Routing
  Loose Routing
Mobility
  Limited terminal mobility
  Device mobility

Page 74: Understanding VoIP

Core SIP Functions

Secure user identification
Exchange and management of media session data
User registration
Capability declaration
Capability query
Reliability

Page 75: Understanding VoIP

SIP Technology Community

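[Figure: technologies surrounding SIP. Core specifications: SIP (RFC 3261), Rel (RFC 3262), DNS (RFC 3263), O/A (RFC 3264), Events (RFC 3265). Related: RTP, SDP, SIMPLE, SigComp, SIP extensions, ENUM, MIDCOM, STUN, ROHC.]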

Page 76: Understanding VoIP

SIP Design Philosophy

Patterned after other successful Internet standards
  HTTP
Don’t reinvent the PSTN
  General purpose functionality
  Do not dictate architectures or services
It needs to work on any IP network
Leverage the best of existing standards
  URLs
  MIME
  RFC822
Scalability
  Push state to the edge

Page 77: Understanding VoIP

Basic Design

Request/response protocol
SIP is a peer protocol – all entities send requests and receive requests
Modelled after HTTP
Each request invokes a method
  Main purpose of request
Messages contain bodies

[Figure: two agents; one sends a request and the other returns a response.]

Page 78: Understanding VoIP

Transactions

Fundamental unit of messaging exchange
  Request
  Zero or more provisional responses
  Usually one final response
  Maybe ACK
All signaling composed of independent transactions
Identified by CSeq
  Sequence number
  Method tag

[Figure: first transaction (CSeq: 1): INVITE, 100, 200, ACK. Second transaction (CSeq: 2): BYE, 200.]

Page 79: Understanding VoIP

Session Independence

Body of SIP message used to establish call describes the session
Session could be
  Audio
  Video
  Game
SIP operation is independent of type of session
SIP bodies are MIME objects
  MIME = Multipurpose Internet Mail Extensions
  Mechanisms for describing and carrying opaque content
  Used with HTTP and email

Page 80: Understanding VoIP

Protocol Components

User Agent
  End systems
  Hard and soft phones
  PSTN gateways
  Phone adaptors
  Media servers
  Anything that originates or terminates SIP calls
Proxy
  SIP server responsible for relaying and processing requests between user agents
  Main job: where to send request next?
Back-to-Back User Agent (B2BUA)
  SIP server that terminates and re-originates SIP
  SBCs, Call Agents, etc.

Page 81: Understanding VoIP

SIP Addressing

SIP addresses are URLs
URL contains several components
  Scheme (sip)
  Username
  Hostname
  Optional port
  Parameters
  Headers and body
SIP allows any URI type
  tel URIs
  http URLs for redirects
  mailto URLs
  Leverage vast URI infrastructure

sip:[email protected]:5061; user=host?Subject=foo

Page 82: Understanding VoIP

The SIP Trapezoid

a.com b.com

SIP

RTP

Page 83: Understanding VoIP

SIP Methods

INVITE
  Invites a participant to a session
  Idempotent: re-INVITEs for session modification
BYE
  Ends a client’s participation in a session
CANCEL
  Terminates a search
OPTIONS
  Queries a participant about their media capabilities, and finds them, but doesn’t invite
ACK
  For reliability and call acceptance
REGISTER
  Informs a SIP server about the location of a user

Page 84: Understanding VoIP

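SIP Architecture

[Figure: numbered message flow (steps 1 to 14) across the sp.com and b.com domains, including a lookup in a corporate database (Corp DB). The legend distinguishes requests, responses and media. Three user addresses appear in the figure, each extracted as [email protected].]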

Page 85: Understanding VoIP

SIP Message Syntax

Many header fields from HTTP
Payload contains a media description
  SDP - Session Description Protocol

INVITE sip:[email protected] SIP/2.0
From: J. Rosenberg <sip:[email protected]> ;tag=76ah
Subject: Conference Call
To: John Smith <sip:[email protected]>
Via: SIP/2.0/UDP 1.2.3.4;branch=z9hG4bK74bf9
Call-ID: [email protected]
Content-Type: application/sdp
CSeq: 4711 INVITE
Content-Length: 187

v=0
o=user1 53655765 2353687637 IN IP4 1.2.3.4
s=Sales
c=IN IP4 1.2.3.4
t=0 0
m=audio 3456 RTP/AVP 0

Page 86: Understanding VoIP

SIP Address Fields

Request-URI
  Contains address of next hop server
  Rewritten by proxies based on result of Location Service
To
  Address of original called party
  Contains optional display name
From
  Address of calling party
  Optional display name

INVITE sip:[email protected] SIP/2.0
From: J. Rosenberg <sip:[email protected]> ;tag=76ah
Subject: Conference Call
To: John Smith <sip:[email protected]>
Via: SIP/2.0/UDP 1.2.3.4;branch=z9hG4bK74bf9
Call-ID: [email protected]
Content-Type: application/sdp
CSeq: 4711 INVITE
Content-Length: 187

v=0
o=user1 53655765 2353687637 IN IP4 1.2.3.4
s=Sales
c=IN IP4 1.2.3.4
t=0 0
m=audio 3456 RTP/AVP 0

Page 87: Understanding VoIP

SIP Responses

Look much like requests
  Headers, bodies
Differ in top line
Status Code
  Numeric, 100 - 699
  Meant for computer processing
  Protocol behavior based on 100s digit
  Other digits give extra info
Reason Phrase
  Text phrase for humans
  Can be anything
Status Code Classes
  100 - 199 (1XX): Informational
  200 - 299 (2XX): Success
  300 - 399 (3XX): Redirection
  400 - 499 (4XX): Client Error
  500 - 599 (5XX): Server Error
  600 - 699 (6XX): Global Failure
Two groups
  100 - 199: Provisional (not reliable)
  200 - 699: Final, definitive
Example
  200 OK
  180 Ringing
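Since protocol behavior depends only on the hundreds digit, classifying a response is a one-line lookup. The sketch below uses the class names from the list above; the function names are illustrative.

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
    6: "Global Failure",
}


def status_class(code: int) -> str:
    """Map a SIP status code to its class using the hundreds digit."""
    if not 100 <= code <= 699:
        raise ValueError("SIP status codes range from 100 to 699")
    return STATUS_CLASSES[code // 100]


def is_final(code: int) -> bool:
    """1XX responses are provisional; everything from 200 up is final."""
    return code >= 200


ringing = (status_class(180), is_final(180))
ok = (status_class(200), is_final(200))
busy = (status_class(486), is_final(486))
```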

Page 88: Understanding VoIP

Example SIP Response

Note how the only difference is the top line
Rules for generating responses
  Call-ID, To, From, CSeq are mirrored in response
  Branch parameter used as transaction ID
  Tag added to To field to identify dialog

SIP/2.0 200 OK
From: J. Rosenberg <sip:[email protected]> ;tag=76ah
To: John Smith <sip:[email protected]> ;tag=112
Via: SIP/2.0/UDP 1.2.3.4;branch=z9hG4bK74bf9
Call-ID: [email protected]
Content-Type: application/sdp
CSeq: 4711 INVITE

Page 89: Understanding VoIP

SIP Transport

SIP messages over UDP or TCP/TLS or SCTP
Reliability mechanisms defined for UDP
UDP more widely used
  Faster
  No connection state
TCP preferred these days
  NAT
  Larger SIP messages
Reliability mechanisms depend on SIP request method
  INVITE
  Anything except INVITE
Reason: optimized for phone calls

Page 90: Understanding VoIP

Registrations

REGISTER creates mapping in server from one URI to another
REGISTER properties
  UA location in Contact
  Registrar identified in Request-URI
  Identifies registered user in To and From field
  Expires header indicates desired lifetime
    Can be different for each Contact
Registrations are soft-state

REGISTER sip:example.com SIP/2.0
To: sip:[email protected];user=phone
From: sip:[email protected];user=phone
Call-ID: [email protected]
CSeq: 123 REGISTER
Contact: sip:[email protected]
Expires: 3600

sip:[email protected]

sip:[email protected]

Page 91: Understanding VoIP

Registration Handling

Registrar is logical function handling REGISTER
Registrar steps:
  Authenticate
  Authorize
  Add binding
  Lower expiration
  Return all currently registered UAs (can be more than one)

SIP/2.0 200 OK
To: sip:[email protected];user=phone
From: sip:[email protected];user=phone
Call-ID: [email protected]
CSeq: 123 REGISTER
Contact: sip:[email protected];expires=3600
Contact: sip:[email protected];expires=524

Page 92: Understanding VoIP

Forking

A proxy may have more than one address for a user
  Happens when more than one SIP URL is registered for a user
  Can happen based on static routing configuration
In this case, proxy may fork
Forking is when proxy sends request to more than one proxy at once
First 200 OK that is received is forwarded upstream
All other unanswered requests cancelled

[Figure: a request for [email protected] is forked into INVITE 89023077@1.2.3.4 and INVITE [email protected].]

Page 93: Understanding VoIP

Routing of Subsequent Requests

Initial SIP request sent through many proxies
No need per se for subsequent requests to go through proxies
Each proxy can decide whether it wants to receive subsequent requests
  Inserts Record-Route header containing its address
For subsequent requests, users insert Route header
  Contains sequence of proxies (and final user) that should receive request

[Figure: UA1, UA2 and three proxies, showing the paths taken by the INVITE and the BYE.]

Page 94: Understanding VoIP

Setting up the Session

INVITE contains the Session Description Protocol (SDP) in the body
SDP conveys the desired session from the caller’s perspective
  Session consists of a number of media streams
  Each stream can be audio, video, text, application, etc.
  Also contains information needed about the session: codecs, addresses and ports
SDP also conveys other information about session
  Time it will take place
  Who originated the session
  Subject of the session
  URL for more information
SDP origins are multicast sessions on the mbone
  Originator of INVITE is not originator of session

Page 95: Understanding VoIP

Anatomy of SDP

SDP contains informational headers
  version (v)
  origin (o) - unique ID
  information (i)

Time of the session

Followed by a sequence of media streams

Each media stream contains an m line defining
  port
  transport
  codecs

Media stream also contains c line
  Address information

v=0
o=user1 53655765 2353687637 IN IP4 128.3.4.5
s=Mbone Audio
i=Discussion of Mbone Engineering [email protected]
t=0 0
m=audio 3456 RTP/AVP 0 78
c=IN IP4 1.2.3.4
a=rtpmap:78 G723
m=video 4444 RTP/AVP 86
c=IN IP4 1.2.3.4
a=rtpmap:86 H263
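The structure described here (session-level lines, then one block per m line) can be illustrated with a small parser. It is a sketch that only splits lines on the first "=" and groups them; it does not validate SDP, and the dictionary keys are names chosen for this example.

```python
def parse_sdp(text):
    """Split an SDP body into session-level fields and a list of media streams."""
    session, media = {}, []
    for raw in text.strip().splitlines():
        key, _, value = raw.strip().partition("=")
        if key == "m":
            # m=<media> <port> <transport> <format list>
            kind, port, transport, *formats = value.split()
            media.append({
                "kind": kind,
                "port": int(port),
                "transport": transport,
                "formats": formats,
                "attributes": [],
            })
        elif media:
            # Lines after an m line belong to that media stream.
            if key == "a":
                media[-1]["attributes"].append(value)
            else:
                media[-1][key] = value
        else:
            session[key] = value
    return session, media
```

Applied to the example on this slide, the audio stream comes out with port 3456 and payload formats 0 and 78, and the video stream with port 4444 and format 86.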

Page 96: Understanding VoIP

Negotiating the Session

Called party receives SDP offered by caller

Each stream can be
  accepted
  rejected

Accepting involves generating an SDP listing
  same stream
  port number and address of called party
  subset of codecs from SDP in request

Rejecting indicated by setting port to zero

Resulting SDP returned in 200 OK

Media can now be exchanged

v=0
o=user2 16255765 8267374637 IN IP4 4.3.2.1
t=0 0
m=audio 3456 RTP/AVP 0
c=IN IP4 4.3.2.1
m=video 0 RTP/AVP 86
c=IN IP4 4.3.2.1

Audio stream accepted, PCMU only. Video stream rejected.
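The accept/reject rule can be sketched as a function that walks the offered streams in order. The function name, the `supported` table and the `allocate_port` callback are assumptions for illustration; a real endpoint would also emit the v, o and t lines.

```python
def build_answer(offer_media, supported, local_ip, allocate_port):
    """offer_media: list of dicts with kind, transport, formats (offer order).

    supported: dict mapping media kind -> set of payload formats we can handle.
    allocate_port: callable returning a local port for an accepted stream.
    """
    lines = []
    for stream in offer_media:
        common = [f for f in stream["formats"] if f in supported.get(stream["kind"], ())]
        if common:
            port, formats = allocate_port(stream["kind"]), common  # accept with a subset
        else:
            port, formats = 0, stream["formats"]  # reject: port set to zero
        lines.append(f"m={stream['kind']} {port} {stream['transport']} {' '.join(formats)}")
        lines.append(f"c=IN IP4 {local_ip}")
    return lines
```

An answerer that supports only PCMU (payload type 0) for audio and no video produces the same m lines as the answer shown on this slide.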

Page 97: Understanding VoIP

Changing Session Parameters

Once call is started, session can be modified

Possible changes
  Add a stream
  Remove a stream
  Change codecs
  Change address information

Call hold is basically a session change

Accomplished through a re-INVITE
  Same session negotiation as INVITE, except in middle of call
  Rejected re-INVITE - call still active!

[Figure: call flow. INVITE, 200, ACK to set up the call; then a second INVITE, 200, ACK in mid-call, labeled re-INVITE]

Page 98: Understanding VoIP

Hanging Up

How to hang up depends on when and who

After call is set up
  either party sends BYE request

From caller, before call is accepted
  send CANCEL
  BYE is bad since it may not reach the same set of users that got INVITE
  If call is accepted after CANCEL, then send BYE

From callee, before accepted
  Reject with 486 Busy Here

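[Figure: client (C) and server (S) call flow in which the caller hangs up just as the callee accepts. Message labels: INVITE, 100, CANCEL, 200 OK, 200 OK, ACK, BYE, 200 OK]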

Page 99: Understanding VoIP

Call Flow for basic call: UA to proxy to UA

Call setup
  100 Trying hop by hop
  180 Ringing
  200 OK acceptance

Call parameter modification
  re-INVITE
  Same as initial INVITE, updated session description

Termination
  BYE method

[Figure: UA, proxy, UA call flow. Message labels: INVITE and 100 Trying on each hop; 180 Ringing and 200 OK relayed back; ACK; RTP media; BYE; 200 OK]

Page 100: Understanding VoIP

Privacy and Identity

RFC 3325: A Private Extension for Asserted Identity in Trusted Networks

RFC 3323: A Privacy Mechanism for SIP

RFC 4474: SIP Identity

Page 101: Understanding VoIP

RFC3325 Asserted Identity

Trust Domain

Authenticates caller and verifies identity. Adds PAID.

INVITE
P-Asserted-Identity: sip:[email protected]

Page 102: Understanding VoIP

RFC3323 – SIP Privacy

Trust Domain

INVITE
P-Asserted-Identity: sip:[email protected]
From: anonymous

INVITE
Privacy: id
From: anonymous

Anonymous caller

INVITE
From: anonymous

Page 103: Understanding VoIP

RFC 4474: SIP Identity

Authenticates caller and verifies identity. Signs request.

INVITE
From: sip:[email protected]
Identity: asd87f7as66sda8z

INVITE
From: sip:[email protected]

Verifies signature

Only useful for user@domain addresses!

Page 104: Understanding VoIP

Transfers and Dialog Movement: REFER (RFC 3515)

[Figure: transfer among Joe, Alice, and Bob, with steps numbered 1 to 4. Message labels: INVITE; INVITE; REFER with Refer-To: Bob; INVITE Bob with Referred-By: Joe]

Page 105: Understanding VoIP

Third Party Call Control (3pcc): RFC 3725

[Figure: third party call control flow, steps numbered 1 to 6. Message labels: INVITE (no SDP); 200 (SDP A); INVITE (SDP A); 200 (SDP B); ACK (SDP B); RTP]

Page 106: Understanding VoIP

SIP and Quality of Service

RFC 3312: Integration of Resource Management with SIP

Problem
  How to make sure phone doesn’t ring unless resources are reserved

Solution
  SIP does not do resource reservation!
  SIP INVITE tells far side not to ring
  Both sides do regular QoS reservations
    RSVP
    PDP context activation
  UPDATE to change state

INVITE w. Preconditions

183 Progress

QoS Reservations

UPDATE w. Preconditions

180 Ringing

200 OK

ACK

Page 107: Understanding VoIP

Security

Page 108: Understanding VoIP

VoIP Security

The only totally secure system I know of is a rock

- Tony Lauck, circa 1985

Page 109: Understanding VoIP

But Even Rocks can be Insecure..

Page 110: Understanding VoIP

It Had a Great User Interface

Page 111: Understanding VoIP

But it had a serious security vulnerability…

Page 112: Understanding VoIP

VoIP Attacks

Attack                                                        Solution
Free Calls aka Toll Fraud                                     User Authentication
Impersonation                                                 User Authentication, Secure Caller ID
Learning Private Information (calling patterns, PIN codes)    SIP Encryption, Media Encryption
Steal Calls                                                   SIP Encryption, Media Encryption
DoS                                                           ICE, Others

Page 113: Understanding VoIP

SIP User Authentication

RTP

We want this SIP server to authenticate this user

and this SIP server to authenticate this user

Page 114: Understanding VoIP

SIP Digest Authentication

Hi, I’d like to SIP REGISTER

401 – OK, try again. Nonce=a7szh1

REGISTER
Nonce=a7szh1
Username=joe
Digest=z0v88a6

Digest = Hash(joe, a7szh1, myPassword)

OK, done!

Digest = Hash(joe, a7szh1, myPassword) = z0v88a6
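The exchange above simplifies the hash to three inputs. As a sketch of what the computation can look like, the block below follows the basic HTTP Digest form (MD5, no qop) that SIP digest authentication borrows. The realm, method and URI values used in any call are assumptions for the example, since the slide does not show them.

```python
import hashlib
import hmac


def md5_hex(text):
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, nonce, method, uri):
    """Basic digest response: MD5(HA1:nonce:HA2), without qop."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # secret part
    ha2 = md5_hex(f"{method}:{uri}")                 # what is being requested
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


def server_verifies(stored_password, username, realm, nonce, method, uri, received):
    """The server recomputes the digest from its own copy of the password."""
    expected = digest_response(username, realm, stored_password, nonce, method, uri)
    return hmac.compare_digest(expected, received)
```

The password never crosses the wire; only the nonce and the resulting digest do, which is why the server must issue a fresh nonce in the 401.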

Page 115: Understanding VoIP

Offline Dictionary Attack

REGISTER
Nonce=a7szh1
Username=joe
Digest=z0v88a6

Digest = Hash(joe, a7szh1, alligator)

OK, done!

Digest = Hash(joe, a7szh1, alligator) =

Word         Hash(joe, a7szh1, word)
Aardvark     9z8v77a
Abacus       lkf88z7
Abate        8z77x
…….
Alligator    z0v88a6
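The table lookup shown here amounts to a loop over a word list. The sketch below uses a deliberately simplified stand-in hash (SHA-256 over "user:nonce:password") rather than the real digest algorithm, and the word list is invented; the point is only that an eavesdropper who sees the nonce and digest can test guesses offline.

```python
import hashlib


def simple_digest(username, nonce, password):
    """Stand-in for Hash(user, nonce, password); not the real SIP digest."""
    return hashlib.sha256(f"{username}:{nonce}:{password}".encode("utf-8")).hexdigest()


def dictionary_attack(username, nonce, observed_digest, wordlist):
    """Try each candidate password until one reproduces the observed digest."""
    for word in wordlist:
        if simple_digest(username, nonce, word) == observed_digest:
            return word
    return None
```

No interaction with the server is needed after the capture, so rate limiting on the server does not help. That is the motivation for running digest inside TLS.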

Page 116: Understanding VoIP

Solution: Digest over TLS

Digest= Hash(joe, a7szh1,alligator)

Digest= Hash(joe, a7szh1,alligator) =

TLS Armor

This is how Web security works!

Page 117: Understanding VoIP

Even Stronger: Mutual TLS for Devices

TLSArmor

MAC8x7a6

a.com

Phone has a certificate which identifies it

Page 118: Understanding VoIP

SIP Encryption

RTP

We want each SIP hop to be encrypted so only the SIP servers and endpoints see the signaling.

Page 119: Understanding VoIP

SIP Encryption: TLS

RTP

Mutual TLS Authentication

a.com

b.com

Page 120: Understanding VoIP

Media Encryption

Countermeasure against:
  Eavesdropping
  Barge-in
  Modification

Two useful techniques
  IPsec
  SRTP

Complications
  Key management
  Legal intercept (who has the keys)
  Firewall and NAT issues (covered later)

Page 121: Understanding VoIP

Alternative: Secure RTP

Authentication and encryption of RTP and RTCP packets

SRTP packet layout:
  V | P | X | CC | M | PT | sequence number
  timestamp
  synchronization source (SSRC) identifier
  contributing source (CSRC) identifiers…
  RTP extension (optional)
  RTP payload
  SRTP MKI -- 0 bytes for voice
  Authentication tag -- 4 bytes for voice

Authenticated portion: RTP header, extension, and payload
Encrypted portion: RTP payload

Page 122: Understanding VoIP

SRTP

Advantages
  Provides both privacy via encryption and authentication via message integrity check
  Very little bandwidth overhead
  Does not break header compression schemes like cRTP
  For very low-rate channels (e.g. cellular) can sacrifice authentication and have no packet expansion
  Uses modern strong crypto suites: AES counter mode for encryption and HMAC for message integrity

Disadvantages
  Needs key management
  End-to-end versus hop-by-hop trust tradeoffs in protecting keys
  Yet another security mechanism to ensure is implemented and deployed correctly

Page 123: Understanding VoIP

NAT Traversal

Page 124: Understanding VoIP

What is NAT?

Network Address Translation (NAT)
  Creates address binding between internal private and external public address
  Modifies IP addresses/ports in packets

Benefits
  Avoids network renumbering on change of provider
  Allows multiplexing of multiple private addresses into a single public address ($$ savings)
  Maintains privacy of internal addresses

[Figure: client behind a NAT sending an IP packet to 67.22.3.1:80]

IP packet, client to NAT:      S: 10.0.1.1:6554    D: 67.22.3.1:80
IP packet, NAT to Internet:    S: 1.2.3.4:8877     D: 67.22.3.1:80

Binding Table
Internal            External
10.0.1.1:6554  ->   1.2.3.4:8877
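The binding table can be sketched as two dictionaries, one per direction. This is a toy model assuming a single public address and sequential port allocation starting at 8877 (chosen only to match the figure); real NATs differ in how they allocate and filter.

```python
class Nat:
    """Toy NAT: one public IP, ports handed out sequentially."""

    def __init__(self, public_ip, first_port=8877):
        self.public_ip = public_ip
        self.next_port = first_port
        self.internal_to_external = {}
        self.external_to_internal = {}

    def outbound(self, src, dst):
        """Rewrite the source of a packet leaving the private network."""
        if src not in self.internal_to_external:
            external = (self.public_ip, self.next_port)
            self.next_port += 1
            self.internal_to_external[src] = external  # create the binding
            self.external_to_internal[external] = src
        return self.internal_to_external[src], dst

    def inbound(self, src, dst):
        """Rewrite the destination of a returning packet, or drop it (None)."""
        internal = self.external_to_internal.get(dst)
        if internal is None:
            return None  # no binding: packet cannot be delivered
        return src, internal
```

The `inbound` drop case is the root of the SIP problem: an address written inside the SDP body has no binding unless a packet was first sent out from it.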

Page 125: Understanding VoIP

Problem: Getting SIP Through NATs

NAT

INVITE sip:[email protected]

m=audio 3456 RTP/AVP 0
c=IN IP4 10.0.1.1

RTP to 10.0.1.1

Page 126: Understanding VoIP

Solution Space

Application Layer Gateways (ALGs)
Session Border Controllers (SBC)
Simple Traversal of UDP Through NAT (STUN)
Traversal Using Relay NAT (TURN)
Interactive Connectivity Establishment (ICE)

Page 127: Understanding VoIP

Application Layer Gateway

NAT

INVITE sip:[email protected]

m=audio 3456 RTP/AVP 0
c=IN IP4 10.0.1.1

RTP to 10.0.1.1

INVITE sip:[email protected]

m=audio 1234 RTP/AVP 0
c=IN IP4 19.1.3.2

ALG

NAT also modifies SIP messages to fix them up!
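The "fix up" an ALG performs on the SDP body can be sketched as a line-by-line rewrite. The function name and the `port_map` argument are assumptions for this example; a real ALG must also adjust Content-Length and SIP headers such as Via and Contact, which is part of why implementations are error-prone.

```python
def rewrite_sdp(sdp, public_ip, port_map):
    """Replace private addresses and ports in an SDP body with public ones.

    port_map: dict of private media port -> public port allocated on the NAT.
    """
    rewritten = []
    for line in sdp.splitlines():
        if line.startswith("c=IN IP4 "):
            line = f"c=IN IP4 {public_ip}"
        elif line.startswith("m="):
            fields = line[2:].split()
            private_port = int(fields[1])
            fields[1] = str(port_map.get(private_port, private_port))
            line = "m=" + " ".join(fields)
        rewritten.append(line)
    return "\n".join(rewritten)
```

If the signaling is encrypted, the ALG cannot see these lines at all, which corresponds to the "doesn't work when security turned on" drawback.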

Page 128: Understanding VoIP

ALG Benefits and Drawbacks

Drawbacks
  Doesn’t work when security turned on
  Hard to diagnose problems
  Requires network upgrade to support new app
  Frequent implementation problems (lack of expertise)
  Incentives mismatched

Benefits
  No change to clients or servers

Page 129: Understanding VoIP

Session Border Controller

NAT

INVITE sip:[email protected]

m=audio 3456 RTP/AVP 0
c=IN IP4 10.0.1.1

SBC (9.8.7.6)

INVITE sip:[email protected]

m=audio 3225 RTP/AVP 0
c=IN IP4 9.8.7.6

RTP to 9.8.7.6

SBC relays RTP back to source

Page 130: Understanding VoIP

SBC Benefits and Drawbacks

Drawbacks
  Expensive media relaying
  Interferes with some SIP extensions
  Breaks more advanced SIP security

Benefits
  No change to clients or NATs
  Works with basic SIP security mechanisms
  Easier to diagnose

Page 131: Understanding VoIP

Simple Traversal of UDP Through NAT (STUN)

NAT

What is my IP address and port please?

STUN Server

9.8.7.6

INVITE sip:[email protected]

m=audio 3472 RTP/AVP 0
c=IN IP4 1.2.3.4

RTP to 1.2.3.4

1.2.3.4

It’s 1.2.3.4:3472
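The question and answer in this figure map onto a STUN Binding request and response. The sketch below assumes the later STUN message layout (20-byte header with magic cookie, XOR-MAPPED-ADDRESS attribute) rather than the original "Simple Traversal" encoding the slide title refers to, and it builds the server response locally instead of using the network.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """20-byte header: type, length, magic cookie, 12-byte transaction ID."""
    if transaction_id is None:
        transaction_id = os.urandom(12)
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + transaction_id


def build_binding_response(transaction_id, ip, port):
    """What a server would send back: the address it saw, XOR-ed with the cookie."""
    xport = port ^ (MAGIC_COOKIE >> 16)
    xaddr = struct.unpack("!I", socket.inet_aton(ip))[0] ^ MAGIC_COOKIE
    value = struct.pack("!BBHI", 0, 0x01, xport, xaddr)  # 0x01 = IPv4 family
    attr = struct.pack("!HH", XOR_MAPPED_ADDRESS, len(value)) + value
    header = struct.pack("!HHI", BINDING_SUCCESS, len(attr), MAGIC_COOKIE)
    return header + transaction_id + attr


def parse_mapped_address(message):
    """Extract (ip, port) from a Binding success response."""
    msg_type, length, cookie = struct.unpack("!HHI", message[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        raise ValueError("not a STUN binding success response")
    offset, end = 20, 20 + length
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", message[offset:offset + 4])
        value = message[offset + 4:offset + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS:
            _, _family, xport, xaddr = struct.unpack("!BBHI", value[:8])
            ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return ip, xport ^ (MAGIC_COOKIE >> 16)
        offset += 4 + attr_len + (-attr_len % 4)  # attributes are 32-bit aligned
    raise ValueError("no XOR-MAPPED-ADDRESS attribute")
```

The client then writes the returned address and port into its SDP, as in the m and c lines above.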

Page 132: Understanding VoIP

STUN Benefits and Drawbacks

Drawbacks
  Doesn’t always work

Benefits
  No change to servers or NATs
  Works with all SIP security mechanisms
  Can support non-VoIP apps (e.g., games)

Page 133: Understanding VoIP

Traversal Using Relay NAT (TURN)

NAT

Give me an IP address and port please?

TURN Server

9.8.7.6

INVITE sip:[email protected]

m=audio 2376 RTP/AVP 0
c=IN IP4 9.8.7.6

RTP to 1.2.3.4

1.2.3.4

9.8.7.6:2376

Page 134: Understanding VoIP

TURN Benefits and Drawbacks

Drawbacks
  Expensive Media Relaying

Benefits
  No change to servers or NATs
  Works with all SIP security mechanisms
  Can support non-VoIP apps (e.g., games)

Page 135: Understanding VoIP

Interactive Connectivity Establishment (ICE)

Hybrid of STUN and TURN
P2P NAT Traversal
Widely Deployed on Internet
Popular with Application Providers

Page 136: Understanding VoIP

ICE Step 1: Allocation

Before making a call, the client gathers candidates

Each candidate is a potential address for receiving media

Three different types of candidates
  Host Candidates
  Server Reflexive Candidates (STUN)
  Relayed Candidates (TURN)

[Figure: agent, NATs, STUN server, and TURN server. Labels:]
  Host candidates reside on the agent itself
  STUN candidates are addresses residing on a NAT
  TURN candidates reside on a TURN server

Page 137: Understanding VoIP

ICE Step 2: Create Offer

Each candidate is placed into an a=candidate attribute of the offer

Each candidate line has IP address and port plus other info needed for ICE

c=IN IP4 192.0.2.3
t=0 0
m=audio 45664 RTP/AVP 0
a=rtpmap:0 PCMU/8000
a=candidate:1 1 UDP 2130706178 10.0.1.1 8998 typ host
a=candidate:2 1 UDP 1694498562 192.0.2.3 45664 typ srflx raddr 10.0.1.1 rport 8998
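To make the layout of the a=candidate lines concrete, here is a small parser sketch. It assumes the field order visible in the example (foundation, component, transport, priority, address, port, then keyword/value pairs such as typ, raddr, rport); the dictionary keys are names chosen for this illustration.

```python
def parse_candidate(line):
    """Parse one a=candidate line into a dictionary."""
    prefix = "a=candidate:"
    if not line.startswith(prefix):
        raise ValueError("not a candidate attribute")
    fields = line[len(prefix):].split()
    foundation, component, transport, priority, address, port = fields[:6]
    candidate = {
        "foundation": foundation,
        "component": int(component),
        "transport": transport,
        "priority": int(priority),
        "address": address,
        "port": int(port),
    }
    # Remaining fields come in keyword/value pairs: typ host, raddr ..., rport ...
    extras = fields[6:]
    for key, value in zip(extras[0::2], extras[1::2]):
        candidate[key] = value
    return candidate
```

For the server reflexive candidate, raddr and rport carry the host address it was derived from, which helps with diagnostics.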

Page 138: Understanding VoIP

ICE Step 3: Send INVITE

Caller sends a SIP INVITE as normal

No ICE processing by SIP servers

SIP Server

INVITE

Page 139: Understanding VoIP

ICE Step 4: Allocation

Called party does exactly same processing as caller and obtains its candidates

Recommended to not yet ring the phone!

TURN

NAT

NAT

STUN

Page 140: Understanding VoIP

ICE Step 5: Provisional Response

Callee sends a provisional response containing its SDP with candidates

As with INVITE, no processing by proxies

Phone has still not rung yet

SIP Proxy

1xx

Page 141: Understanding VoIP

ICE Step 6: Verification

Each agent pairs up its candidates (local) with its peer’s (remote) to form candidate pairs

Each agent sends a STUN-based ping on each pair, starting at highest priority

If a response is received the check has succeeded and we know media can flow on that pair!

[Figure: two agents, each with NATs and a TURN server; connectivity checks numbered 1 to 5]
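Forming candidate pairs and ordering the checks can be sketched as below. The pair priority formula is the one from the published ICE specification (2^32 * min(G, D) + 2 * max(G, D) + 1 if G > D), where G and D are the priorities of the controlling and controlled agents' candidates. The remote candidates in any example call are hypothetical.

```python
from itertools import product


def pair_priority(controlling, controlled):
    """Combine two candidate priorities so both agents sort pairs identically."""
    g, d = controlling, controlled
    return (2 ** 32) * min(g, d) + 2 * max(g, d) + (1 if g > d else 0)


def ordered_pairs(local, remote, local_is_controlling=True):
    """Pair every local candidate with every remote one, highest priority first."""
    pairs = []
    for loc, rem in product(local, remote):
        if local_is_controlling:
            prio = pair_priority(loc["priority"], rem["priority"])
        else:
            prio = pair_priority(rem["priority"], loc["priority"])
        pairs.append((prio, loc["address"], rem["address"]))
    return sorted(pairs, reverse=True)
```

Because the minimum of the two priorities dominates, a pair is only as preferred as its worse end, so relayed paths are tried last and media relaying is kept to a minimum.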

Page 142: Understanding VoIP

ICE Benefits and Drawbacks

Drawbacks
  Requires client changes
  Requires other side to support it

Benefits
  Always Works
  No change to servers or NATs
  Works with all SIP security mechanisms
  Minimum Media Relaying
  Can support non-VoIP apps (e.g., games)
  Built-In Anti-DoS
  Eliminates Ghost Rings

Page 143: Understanding VoIP

That’s it!

Questions?

Page 144: Understanding VoIP

Glossary

AIN      Advanced Intelligent Network
ADPCM    Adaptive Differential PCM
BGP      Border Gateway Protocol
CALEA    Communications Assistance for Law Enforcement Act
CBR      Constant Bit Rate
CELP     Code Excited Linear Prediction
CODEC    Coder/Decoder
COPS     Common Open Policy Service
CRTP     Compressed RTP
CSRC     Contributing Source
CTI      Computer-Telephony Integration
DSCP     Diffserv Code Point
DSL      Digital Subscriber Line
DSP      Digital Signal Processor
DTMF     Dual Tone Multi-Frequency
ERL      Echo Return Loss
ERLE     ERL Enhancement
HFC      Hybrid Fiber/Coax
IN       Intelligent Network
ISDN     Integrated Services Digital Network
ISUP     ISDN User Part
JTAPI    Java Telephony API
LDAP     Lightweight Directory Access Protocol
MCML     Multi-class Multi-link PPP
MGCP     Media Gateway Control Protocol
MOS      Mean Opinion Score
MPLS     Multi-protocol Label Switching
NLP      Non-linear Processing
NTP      Network Time Protocol
PCM      Pulse Code Modulation
PPP      Point-to-point Protocol
PHB      Per-hop Behavior
PQ       Priority Queueing
PSTN     Public Switched Telephone Network

Page 145: Understanding VoIP

Glossary (2)

QoS      Quality of Service
RED      Random Early Detection (or Drop)
RTCP     Real-time Transport Control Protocol
RTP      Real-time Transport Protocol
SCP      Service Control Point
SIP      Session Initiation Protocol
SS7      Signaling System Number 7
SSRC     Synchronization Source
TAPI     Telephony API
TDM      Time Division Multiplexed
TRIP     Telephony Routing over IP
TSPEC    Traffic Specification
WFQ      Weighted Fair Queueing

Page 146: Understanding VoIP

Thanks

Enjoy Interop!

to contact me: [email protected]