CENG4430 (Spring 2011) 3-1
Lecture 3: Application Layer
What to learn?
• HTTP Proxy
• FTP – file transfer protocol
• DNS – domain name system
• Packet capture tools
References: Ch. 2.2, 2.3, 2.5
Recall: HTTP overview
HTTP: hypertext transfer protocol
• Web’s application layer protocol
• client/server model
  • client: browser that requests, receives, and “displays” Web objects
  • server: Web server that sends objects in response to requests
[Figure: a PC running Explorer and a Mac running Navigator each send HTTP requests to, and receive HTTP responses from, a server running the Apache Web server.]
Web caches (proxy server)
Goal: satisfy client request without involving origin server
• user sets browser: Web accesses via cache
• browser sends all HTTP requests to cache
  • if object is in cache: cache returns object
  • else: cache requests object from origin server, then returns object to client
[Figure: two clients exchange HTTP requests and responses with a proxy server, which in turn exchanges HTTP requests and responses with two origin servers.]
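The hit/miss behaviour of a Web cache can be sketched in a few lines. This is only an illustration: the class name `ProxyCache` and the `fetch_from_origin` callback are hypothetical, and a real proxy would also handle expiry, headers, and errors.

```python
class ProxyCache:
    """Toy Web cache: serve from the local store, else ask the origin server."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # hypothetical callback: url -> object bytes
        self._store = {}                 # url -> cached object
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self._store:           # object in cache: return it directly
            self.hits += 1
            return self._store[url]
        self.misses += 1                 # else: request it from the origin server,
        obj = self._fetch(url)           # keep a copy,
        self._store[url] = obj
        return obj                       # and return it to the client


origin_requests = []

def fake_origin(url):
    """Stand-in for an origin server; records every request that reaches it."""
    origin_requests.append(url)
    return b"object for " + url.encode()

cache = ProxyCache(fake_origin)
first = cache.get("http://example.com/a")
second = cache.get("http://example.com/a")
```

Only the first request reaches the stand-in origin server; the second is answered from the cache.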
Why Web Caching?
Assume: cache is “close” to client (e.g., in same network)
• smaller response time: cache is “closer” to client
• decreased traffic to distant servers: the link out of the institutional/local ISP network is often the bottleneck
[Figure: an institutional network (10 Mbps LAN) with an institutional cache, connected over a 1.5 Mbps access link to origin servers on the public Internet.]
Caching example
Assumptions
• average object size = 1,000,000 bits
• avg. request rate from institution’s browsers to origin servers = 15/sec
• delay from institutional router to any origin server and back to router = 2 sec
Consequences
• utilization on LAN = 15%
• utilization on access link = 100%
• total delay = Internet delay + access delay + LAN delay = 2 sec + minutes + milliseconds
[Figure: an institutional network (100 Mbps LAN) with an institutional cache, connected over a 15 Mbps access link to origin servers on the public Internet.]
Caching example (cont)
Possible solution
• increase bandwidth of access link to, say, 100 Mbps
Consequences
• utilization on LAN = 15%
• utilization on access link = 15%
• total delay = Internet delay + access delay + LAN delay = 2 sec + msecs + msecs
• often a costly upgrade
[Figure: an institutional network (100 Mbps LAN) with an institutional cache, connected over a 100 Mbps access link to origin servers on the public Internet.]
Caching example (cont)
Possible solution: install cache
• suppose hit rate is 0.4
Consequences
• 40% of requests will be satisfied almost immediately
• 60% of requests will be satisfied by origin server
• utilization of access link reduced to 60%, resulting in negligible delays (say 10 msec)
• total avg delay = Internet delay + access delay + LAN delay = 0.6*(2.01) secs + 0.4*milliseconds < 1.4 secs
[Figure: an institutional network (100 Mbps LAN) with an institutional cache, connected over a 15 Mbps access link to origin servers on the public Internet.]
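The numbers in the caching example can be checked with a short calculation. The sketch below uses the values stated on the slides; the one assumption is `CACHE_HIT_DELAY_S`, set to 10 msec because the slides only say "milliseconds" for a cache hit.

```python
OBJECT_BITS = 1_000_000            # average object size
REQUESTS_PER_SEC = 15              # request rate to origin servers
INTERNET_DELAY_S = 2.0             # router -> origin server -> router
LAN_BPS = 100_000_000
ACCESS_BPS = 15_000_000
UPGRADED_ACCESS_BPS = 100_000_000
HIT_RATE = 0.4
ACCESS_DELAY_WITH_CACHE_S = 0.01   # "say 10 msec"
CACHE_HIT_DELAY_S = 0.01           # assumption: slides only say "milliseconds"


def utilization(requests_per_sec, link_bps):
    """Fraction of the link capacity consumed by the offered traffic."""
    return requests_per_sec * OBJECT_BITS / link_bps


lan_utilization = utilization(REQUESTS_PER_SEC, LAN_BPS)
access_utilization = utilization(REQUESTS_PER_SEC, ACCESS_BPS)
upgraded_access_utilization = utilization(REQUESTS_PER_SEC, UPGRADED_ACCESS_BPS)

# With a cache, only the misses cross the access link.
miss_rate = 1 - HIT_RATE
cached_access_utilization = utilization(miss_rate * REQUESTS_PER_SEC, ACCESS_BPS)
avg_delay_s = (miss_rate * (INTERNET_DELAY_S + ACCESS_DELAY_WITH_CACHE_S)
               + HIT_RATE * CACHE_HIT_DELAY_S)
```

Under these assumptions the average delay comes out at about 1.21 sec, consistent with the "< 1.4 secs" bound.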
User-server interaction: conditional GET
Goal: don’t send object if client has up-to-date stored (cached) version
• client: specifies date of cached copy in HTTP request
  If-modified-since: <date>
• server: response contains no object if cached copy is up-to-date:
  HTTP/1.0 304 Not Modified
[Figure: two client/server exchanges. Object not modified: the HTTP request carries If-modified-since: <date> and the response is HTTP/1.0 304 Not Modified. Object modified: the HTTP request carries If-modified-since: <date> and the response is HTTP/1.1 200 OK followed by <data>.]
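The server-side decision for a conditional GET can be sketched as follows. The function name `conditional_get` and its arguments are hypothetical; the sketch only compares dates and ignores everything else a real server would check.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get(request_headers, last_modified, body):
    """Return (status_line, body) for a GET that may carry If-Modified-Since.

    last_modified must be a timezone-aware datetime.
    """
    since = request_headers.get("If-Modified-Since")
    if since is not None and last_modified <= parsedate_to_datetime(since):
        # Cached copy is up to date: send no object.
        return "HTTP/1.0 304 Not Modified", b""
    return "HTTP/1.0 200 OK", body


# Date of the client's cached copy, formatted as an HTTP date.
cached_at = datetime(2011, 2, 7, 12, 0, 0, tzinfo=timezone.utc)
headers = {"If-Modified-Since": format_datetime(cached_at, usegmt=True)}

unchanged = conditional_get(headers, cached_at, b"<data>")
changed = conditional_get(
    headers, datetime(2011, 2, 8, 9, 0, 0, tzinfo=timezone.utc), b"<data>")
```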
FTP: the file transfer protocol
• transfer file to/from remote host
• client/server model
  • client: side that initiates transfer (either to/from remote)
  • server: remote host
• ftp: RFC 959
• ftp server: port 21
[Figure: a user at a host works through an FTP user interface and FTP client, backed by the local file system; files are transferred between the FTP client and an FTP server backed by the remote file system.]
FTP: separate control, data connections
• FTP client contacts FTP server at port 21; TCP is the transport protocol
• client is authorized over control connection
• client browses remote directory by sending commands over control connection
• when server receives file transfer command, server opens 2nd TCP connection (for file) to client
• after transferring one file, server closes data connection
• server opens another TCP data connection to transfer another file
• control connection: “out of band”
• FTP server maintains “state”: current directory, earlier authentication
[Figure: FTP client and FTP server joined by a TCP control connection (port 21) and a TCP data connection (port 20).]
FTP commands, responses
Sample commands:
• sent as ASCII text over control channel
• USER username
• PASS password
• LIST: returns list of files in current directory
• RETR filename: retrieves (gets) file
• STOR filename: stores (puts) file onto remote host
Sample return codes:
• status code and phrase (as in HTTP)
• 331 Username OK, password required
• 125 data connection already open; transfer starting
• 425 Can’t open data connection
• 452 Error writing file
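Because replies are plain ASCII lines of the form "code phrase", a client can split them with ordinary string handling. The sketch below is illustrative only: `parse_reply` is a hypothetical helper, it handles single-line replies only, and the class names follow the first-digit grouping of reply codes in RFC 959.

```python
REPLY_CLASSES = {
    "1": "positive preliminary",
    "2": "positive completion",
    "3": "positive intermediate",
    "4": "transient negative",
    "5": "permanent negative",
}


def parse_reply(line):
    """Split one single-line FTP reply into (code, phrase, reply class)."""
    code, _, phrase = line.strip().partition(" ")
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"not an FTP reply line: {line!r}")
    return int(code), phrase, REPLY_CLASSES.get(code[0], "unknown")


replies = [
    parse_reply("331 Username OK, password required"),
    parse_reply("125 data connection already open; transfer starting"),
    parse_reply("425 Can't open data connection"),
]
```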
FTP: Summary
• Key idea of FTP: FTP is composed of two connections: data and control.
• The concept of separating data and control connections is used in many other applications, e.g., streaming applications:
  • Control connection in TCP for high reliability
  • Data connection in UDP for good performance
  • Differences between TCP and UDP will be explained in Chapter 3.
DNS: Domain Name System
People: many identifiers:
• HKID, name, passport #
Internet hosts, routers:
• IP address (32 bit): used for addressing datagrams
• “name”, e.g., www.yahoo.com: used by humans
Name and address are both important:
• name: enables humans/applications to identify a service
• address: identifies network routes
Q: how to map between address and name?
Analogy: mapping a person’s name to a telephone number. We need Yellow Pages.
DNS: Domain Name System
Domain Name System (DNS)
• A name-address mapping service
• application-layer protocol: hosts, routers, and name servers communicate to resolve names (address/name translation)
• What is the DNS server (or name server) that your machine is using?
  • In Windows, run in command prompt: ipconfig /all
  • In Linux/Unix, check /etc/resolv.conf
• How to check the mapping on the client side?
  • Use nslookup in Linux/Unix
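On Linux/Unix the name servers can also be read programmatically. The sketch below parses the text of a resolv.conf-style file; the helper name `nameservers` and the sample addresses (from the 192.0.2.0/24 documentation range) are made up for illustration.

```python
def nameservers(resolv_conf_text):
    """Return the addresses listed on 'nameserver' lines, in file order."""
    servers = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith(("#", ";")):
            continue                      # skip blank lines and comments
        fields = stripped.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


# Hypothetical file contents; real addresses depend on your network.
SAMPLE = """\
# generated by a DHCP client
search example.edu
nameserver 192.0.2.53
nameserver 192.0.2.54
"""

configured = nameservers(SAMPLE)
```

To inspect the local machine, pass the contents of /etc/resolv.conf to the same function.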
Properties of DNS
• distributed database implemented in hierarchy of many name servers
• Why not centralized DNS?
  • single point of failure
  • traffic volume can be huge
  • distant centralized database
  • maintenance is difficult
• In short, centralized DNS doesn’t scale
Properties of DNS
host aliasing
• Each host is associated with an official hostname (or canonical hostname), which might be complicated
• A host can have one or more alias names, which are typically more mnemonic than the canonical hostname
• Host aliasing maps an alias name to the right canonical hostname.
• E.g., in Linux/Unix, run nslookup www.cuhk.edu.hk
  Non-authoritative answer:
  Name: www.bj2.cuhk.edu.hk
  Address: 137.189.11.73
  Aliases: www.cuhk.edu.hk
• A similar idea: mail server aliasing
Properties of DNS
load distribution
• A set of IP addresses is associated with one canonical hostname
• When clients make a DNS query for the name, the DNS server responds with the entire set of IP addresses, but rotates the ordering of the addresses within each reply.
• E.g., try nslookup www.google.com and you’ll see different IP addresses at different times.
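The rotation described above can be sketched with a double-ended queue. The class name `RotatingRecordSet` and the addresses (documentation range 192.0.2.0/24) are hypothetical; a real DNS server may order replies differently.

```python
from collections import deque


class RotatingRecordSet:
    """Holds the address set for one hostname and rotates it per reply."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def answer(self):
        reply = list(self._addresses)   # the entire set, in the current order
        self._addresses.rotate(-1)      # next reply starts with the next address
        return reply


records = RotatingRecordSet(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
first_reply = records.answer()
second_reply = records.answer()
```

Clients that simply pick the first address in each reply are thereby spread across the servers.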
Distributed, Hierarchical Database
[Figure: tree of DNS servers. Root DNS servers at the top; com, org, and edu DNS servers below them; then yahoo.com and amazon.com DNS servers under com, pbs.org DNS servers under org, and poly.edu and umass.edu DNS servers under edu.]
Client wants IP for www.amazon.com; 1st approx:
• client queries a root server to find com DNS server
• client queries com DNS server to get amazon.com DNS server
• client queries amazon.com DNS server to get IP address for www.amazon.com
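The three steps above can be simulated with a toy, in-memory hierarchy. Everything in the sketch is hypothetical: the server labels, the `SERVERS` table, and the address 192.0.2.80 are placeholders, and no real DNS traffic is involved.

```python
# Hypothetical zone data: each server either knows the address or refers onward.
SERVERS = {
    "root": {"referrals": {"com": "com-tld"}, "addresses": {}},
    "com-tld": {"referrals": {"amazon.com": "amazon-auth"}, "addresses": {}},
    "amazon-auth": {"referrals": {},
                    "addresses": {"www.amazon.com": "192.0.2.80"}},
}


def query(server, name):
    """One query to one server: ("answer", ip) or ("referral", next_server)."""
    zone = SERVERS[server]
    if name in zone["addresses"]:
        return "answer", zone["addresses"][name]
    for suffix, next_server in zone["referrals"].items():
        if name == suffix or name.endswith("." + suffix):
            return "referral", next_server
    raise LookupError(f"{server} cannot help with {name}")


def resolve_iteratively(name, start="root"):
    """The client itself follows each referral until it gets an answer."""
    server, path = start, []
    while True:
        path.append(server)
        kind, value = query(server, name)
        if kind == "answer":
            return value, path
        server = value


address, servers_contacted = resolve_iteratively("www.amazon.com")
```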
DNS: Root name servers
• contacted by local name server that cannot resolve name
• root name server:
  • contacts authoritative name server if name mapping not known
  • gets mapping
  • returns mapping to local name server
13 root name servers worldwide:
• a Verisign, Dulles, VA
• b USC-ISI Marina del Rey, CA
• c Cogent, Herndon, VA (also LA)
• d U Maryland College Park, MD
• e NASA Mt View, CA
• f Internet Software C. Palo Alto, CA (and 36 other locations)
• g US DoD Vienna, VA
• h ARL Aberdeen, MD
• i Autonomica, Stockholm (plus 28 other locations)
• j Verisign (21 locations)
• k RIPE London (also 16 other locations)
• l ICANN Los Angeles, CA
• m WIDE Tokyo (also Seoul, Paris, SF)
TLD and Authoritative Servers
Top-level domain (TLD) servers:
• responsible for com, org, net, edu, etc., and all top-level country domains, e.g., uk, fr, ca, jp
• Network Solutions maintains servers for com TLD
• Educause for edu TLD
Authoritative DNS servers:
• organization’s DNS servers, providing authoritative hostname to IP mappings for organization’s servers (e.g., Web, mail)
• can be maintained by organization or service provider
Local Name Server
• does not strictly belong to hierarchy
• each ISP (residential ISP, company, university) has one
  • also called “default name server”
• when host makes DNS query, query is sent to its local DNS server
  • acts as proxy, forwards query into hierarchy
DNS name resolution example
• Host at cis.poly.edu wants IP address for www.umass.edu
iterated query:
• contacted server replies with name of server to contact
• “I don’t know this name, but ask this server”
[Figure: iterated query, messages numbered 1 to 8, among the requesting host cis.poly.edu, the local DNS server dns.poly.edu, the root DNS server, the .edu TLD DNS server, and the authoritative DNS server dns.umass.edu for www.umass.edu.]
DNS name resolution example
recursive query:
• puts burden of name resolution on contacted name server
• heavy load?
[Figure: recursive query, messages numbered 1 to 8, among the requesting host cis.poly.edu, the local DNS server dns.poly.edu, the root DNS server, the .edu TLD DNS server, and the authoritative DNS server dns.umass.edu for www.umass.edu.]
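A recursive query can be contrasted with the iterated one in a small simulation: each contacted server resolves the name on the requester's behalf by asking the next server itself. The sketch assumes the chain local server, root, TLD, authoritative; the label "edu-tld" and the address 192.0.2.25 are hypothetical.

```python
# Hypothetical forwarding chain and record data.
NEXT_HOP = {
    "dns.poly.edu": "root",
    "root": "edu-tld",
    "edu-tld": "dns.umass.edu",
}
RECORDS = {"dns.umass.edu": {"www.umass.edu": "192.0.2.25"}}


def resolve_recursively(server, name, trace):
    """The contacted server does the work and hands back only the final answer."""
    trace.append(server)
    known = RECORDS.get(server, {})
    if name in known:
        return known[name]
    if server not in NEXT_HOP:
        raise LookupError(f"{server} cannot resolve {name}")
    # The burden moves to the next server; the reply returns along the same chain.
    return resolve_recursively(NEXT_HOP[server], name, trace)


trace = []
address = resolve_recursively("dns.poly.edu", "www.umass.edu", trace)
```

The requesting host sends a single query; every server on the chain carries part of the load, which is the concern raised above.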
DNS: caching and updating records
• once (any) name server learns mapping, it caches mapping
  • cache entries time out (disappear) after some time
  • TLD servers typically cached in local name servers
    • Thus root name servers not often visited
• update/notify mechanisms under design by IETF
  • RFC 2136
  • http://www.ietf.org/html.charters/dnsind-charter.html
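The time-out behaviour of cached mappings can be sketched as a small cache keyed by name. The class `TtlCache` is hypothetical, and the clock is injectable so the expiry can be demonstrated without waiting; the address is a placeholder.

```python
import time


class TtlCache:
    """Caches name -> value mappings until their time-to-live runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}          # name -> (value, expiry time)

    def put(self, name, value, ttl_seconds):
        self._entries[name] = (value, self._clock() + ttl_seconds)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # entry timed out: forget it
            return None
        return value


now = [0.0]                          # fake clock for the demonstration
cache = TtlCache(clock=lambda: now[0])
cache.put("www.umass.edu", "192.0.2.25", ttl_seconds=60)
fresh = cache.get("www.umass.edu")
now[0] = 61.0
stale = cache.get("www.umass.edu")
```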
DNS records
DNS: distributed db storing resource records (RR)
RR format: (name, value, type, ttl)
Type=A
• name is hostname
• value is IP address
Type=NS
• name is domain (e.g. foo.com)
• value is hostname of authoritative name server for this domain
Type=CNAME
• name is alias name for some “canonical” (the real) name
  • www.ibm.com is really servereast.backup2.ibm.com
• value is canonical name
Type=MX
• value is name of mailserver associated with name
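The RR format and the way a CNAME leads to an A record can be sketched with an in-memory table. The alias pair comes from the slide; the IP address, the MX entry, and the TTL values are hypothetical placeholders.

```python
from typing import NamedTuple


class RR(NamedTuple):
    """One resource record in the (name, value, type, ttl) format."""
    name: str
    value: str
    type: str
    ttl: int


RECORDS = [
    RR("www.ibm.com", "servereast.backup2.ibm.com", "CNAME", 3600),
    RR("servereast.backup2.ibm.com", "192.0.2.10", "A", 300),   # hypothetical IP
    RR("ibm.com", "mail.ibm.com", "MX", 3600),                  # hypothetical
]


def lookup(name, rtype):
    """All values stored for this name and record type."""
    return [rr.value for rr in RECORDS if rr.name == name and rr.type == rtype]


def resolve_address(name, max_aliases=8):
    """Follow CNAME records until an A record is found.

    Returns (canonical name, list of addresses).
    """
    for _ in range(max_aliases):
        addresses = lookup(name, "A")
        if addresses:
            return name, addresses
        aliases = lookup(name, "CNAME")
        if not aliases:
            return name, []
        name = aliases[0]
    raise RuntimeError("too many CNAME hops")


canonical, addresses = resolve_address("www.ibm.com")
```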
DNS protocol, messages
DNS protocol: query and reply messages, both with same message format
msg header
• identification: 16 bit # for query, reply to query uses same #
• flags:
  • query or reply
  • recursion desired
  • recursion available
  • reply is authoritative
DNS protocol, messages
[Figure: DNS message format. Labels on the message sections after the header:]
• name, type fields for a query
• RRs in response to query
• records for authoritative servers
• additional “helpful” info that may be used
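The header fields listed above can be packed and decoded with `struct`. This is a sketch of the 12-byte header only, using the flag bit positions from RFC 1035; the helper names are hypothetical, and the four counts give the number of entries in each of the four sections that follow the header.

```python
import struct

# Flag bit masks within the 16-bit flags field (RFC 1035).
QR = 0x8000   # 0 = query, 1 = reply
AA = 0x0400   # reply is authoritative
RD = 0x0100   # recursion desired
RA = 0x0080   # recursion available


def build_query_header(ident, recursion_desired=True):
    """12-byte header for a query carrying one question."""
    flags = RD if recursion_desired else 0
    # identification, flags, then four 16-bit section counts
    return struct.pack("!HHHHHH", ident, flags, 1, 0, 0, 0)


def parse_header(data):
    ident, flags, questions, answers, authority, additional = struct.unpack(
        "!HHHHHH", data[:12])
    return {
        "id": ident,
        "is_reply": bool(flags & QR),
        "authoritative": bool(flags & AA),
        "recursion_desired": bool(flags & RD),
        "recursion_available": bool(flags & RA),
        "counts": (questions, answers, authority, additional),
    }


query = parse_header(build_query_header(0x1234))
# A matching reply reuses the same identification number.
reply = parse_header(struct.pack("!HHHHHH", 0x1234, QR | AA | RD | RA, 1, 1, 0, 0))
```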
Performance notes of DNS
• Sluggish DNS access can slow all communications
  • How do you assure fast performance?
• Mis-configured DNS servers can cause failures
  • How do you ascertain the DNS database is consistent with reality?
• DNS server can become a communication bottleneck
  • Example: DNS server resides behind a congested router
Security notes of DNS
• How do you find who is accessing your system?
• DNS helps authenticate IP addresses
  • E.g., rlogin servers recognize trusted hosts through names in the .rhosts file
  • The server can use a PTR query to validate an IP address (match with .rhosts)
• Security considerations
  • Queries can be used to compromise info about domain resources
  • Responses can be used to corrupt cache (e.g., falsify identity)
  • An intruder can pretend to be a primary and update secondary databases
  • Brute force: denial of service attacks on the root servers
• Extensions for dynamic changes increase risks
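A PTR query asks for the name stored under a reversed form of the IP address. The sketch below only builds that query name and compares an answer against a list of trusted hosts; both helper names are hypothetical, the PTR answer shown is made up, and no DNS lookup is performed.

```python
import ipaddress


def ptr_query_name(ip_text):
    """Name to ask for in a PTR query, e.g. under in-addr.arpa for IPv4."""
    return ipaddress.ip_address(ip_text).reverse_pointer


def is_trusted(ptr_answer, trusted_hosts):
    """Compare a PTR answer with trusted host names, ignoring case and a trailing dot."""
    wanted = {host.rstrip(".").lower() for host in trusted_hosts}
    return ptr_answer.rstrip(".").lower() in wanted


query_name = ptr_query_name("137.189.11.73")
# Hypothetical PTR answer and trusted-host list, for illustration only.
trusted = is_trusted("www.bj2.cuhk.edu.hk.", ["www.bj2.cuhk.edu.hk"])
```

As the security considerations note, responses can be falsified, so a name check of this kind is weak authentication on its own.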
Additional notes of DNS
• Mobile/dynamic computing environment:
  • Dynamic address changes
  • Dynamic (vs. persistent) resource availability
• DNS management
  • Using DNS for load balancing
  • Expanding the name space
• Generalizing resource discovery:
  • Search: “find any server that provides xxx” (e.g., available print server within 1000ft)
  • P2P networks; search engines….
DNS Summary
• Name/directory services are of central importance
• DNS provides a name service that
  • Is scalable & distributed
  • Supports multi-domain federated management of name space
  • Permits performance tuning and control
  • Is somewhat extensible
• The challenges:
  • Secure sharing of resource data
  • Handling dynamic environments
  • Generalizing to broad resource discovery
  • Decentralization
Packet Capture
Why do we learn packet capture?
• Have a better understanding of how different protocol layers work
• Useful to debug your Programming Assignments
Idea:
• Capture raw traffic, including Ethernet header, IP header, TCP/UDP header, application payload
Packet Capture Tools:
• tcpdump is a command-line packet capture tool on Linux/Unix
• Wireshark is a graphical counterpart of tcpdump
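To see how the layers named above nest inside one captured frame, the headers can be peeled off in order. The sketch handles only Ethernet carrying IPv4 with TCP or UDP; `parse_frame` is a hypothetical helper, and the sample frame is hand-built with placeholder addresses rather than taken from a real capture.

```python
import socket
import struct


def parse_frame(frame):
    """Split one Ethernet frame into IP addresses, ports, and application payload."""
    dst_mac, src_mac, ethertype = struct.unpack("!6s6sH", frame[:14])
    if ethertype != 0x0800:
        raise ValueError("only IPv4 is handled in this sketch")
    total_len = struct.unpack("!H", frame[16:18])[0]
    ip = frame[14:14 + total_len]            # drop any Ethernet padding
    ip_header_len = (ip[0] & 0x0F) * 4       # IHL is counted in 32-bit words
    protocol = ip[9]
    segment = ip[ip_header_len:]
    if protocol == 6:                        # TCP: data offset gives header length
        transport_header_len = (segment[12] >> 4) * 4
    elif protocol == 17:                     # UDP: fixed 8-byte header
        transport_header_len = 8
    else:
        raise ValueError("only TCP and UDP are handled in this sketch")
    src_port, dst_port = struct.unpack("!HH", segment[:4])
    return {
        "src_ip": socket.inet_ntoa(ip[12:16]),
        "dst_ip": socket.inet_ntoa(ip[16:20]),
        "protocol": protocol,
        "src_port": src_port,
        "dst_port": dst_port,
        "payload": segment[transport_header_len:],
    }


# Hand-built sample: a UDP datagram to port 53 with a placeholder payload.
payload = b"query"
udp = struct.pack("!HHHH", 40000, 53, 8 + len(payload), 0) + payload
ip_header = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + len(udp), 0, 0, 64, 17, 0,
                        socket.inet_aton("192.0.2.1"),
                        socket.inet_aton("192.0.2.53"))
frame = bytes(12) + b"\x08\x00" + ip_header + udp
info = parse_frame(frame)
```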
Using tcpdump
• To download and install tcpdump: you may get the source from http://www.tcpdump.org, or run
  # apt-get install tcpdump
• Here, we only focus on capturing traffic on the local machine that runs tcpdump
  • Capturing traffic on a network won’t be discussed.
• If you capture traffic on a network interface (e.g., eth0), you must have root access.
  • Opening a file with captured traffic doesn’t require root access
Using tcpdump
• Capture traffic on eth0:
  # tcpdump -i eth0 -s 0
  “-s 0” means to capture the whole packet (i.e., don’t truncate)
• Capture traffic on eth0, and save traffic to a file out.pcap:
  # tcpdump -i eth0 -w out.pcap -s 0
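A saved capture can also be read without any tool, which helps when debugging. The sketch below assumes the classic pcap layout (24-byte file header, then a 16-byte header before each packet) with microsecond timestamps; it does not handle the pcapng format or nanosecond-resolution files, and `read_pcap` is a hypothetical helper.

```python
import struct


def read_pcap(data):
    """Return (link type, [(timestamp, packet bytes), ...]) from classic pcap bytes."""
    magic = data[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"                  # file written on a little-endian machine
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"
    else:
        raise ValueError("not a classic microsecond pcap file")
    snaplen, linktype = struct.unpack(endian + "II", data[16:24])
    packets, offset = [], 24
    while offset + 16 <= len(data):
        ts_sec, ts_usec, incl_len, orig_len = struct.unpack(
            endian + "IIII", data[offset:offset + 16])
        offset += 16
        packets.append((ts_sec + ts_usec / 1e6, data[offset:offset + incl_len]))
        offset += incl_len
    return linktype, packets


# Tiny in-memory capture with one 3-byte "packet", built for demonstration.
file_header = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
record = struct.pack("<IIII", 10, 500000, 3, 3) + b"abc"
linktype, packets = read_pcap(file_header + record)
```

For a real file, pass the bytes read from out.pcap opened in binary mode; link type 1 denotes Ethernet.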
Using tcpdump
• You may apply a filtering rule to let you focus on a subset of packets
• Capture traffic on HTTP
  # tcpdump -i eth0 -w http.pcap -s 0 'port 80'
• Capture traffic on DNS
  # tcpdump -i eth0 -w dns.pcap -s 0 'port 53'
• Capture traffic on FTP
  # tcpdump -i eth0 -w ftp.pcap -s 0 'port 20 or port 21'
Using wireshark
• Once you save the traffic to a file using tcpdump, you can open the file in Wireshark.
• Wireshark provides a very nice breakdown of every captured packet
• Download Wireshark: http://www.wireshark.org
  • Windows / Linux versions available
• See demo.