Proxies. Herng-Yow Chen. Outline. Explain HTTP proxies, contrasting them to web gateways and illustrating how proxies are deployed. Show some of the ways proxies are helpful. How proxies are deployed in real networks and how traffic is directed to proxy servers.
1
Proxies
Herng-Yow Chen
2
Outline
Explain HTTP proxies, contrasting them with web gateways and illustrating how proxies are deployed.
Show some of the ways proxies are helpful.
Describe how proxies are deployed in real networks and how traffic is directed to proxy servers.
Show how to configure your browser to use a proxy.
Demonstrate HTTP proxy requests, how they differ from server requests, and how proxies can subtly change the behavior of browsers.
3
Outline (cont.)
Explain how you can record the path of your messages through chains of proxy servers, using Via headers and the TRACE method.
Describe proxy-based HTTP access control.
Explain how proxies can interoperate between clients and servers, each of which may support different features and versions.
4
Web intermediaries
Web proxy servers are middlemen that fulfill transactions on the client’s behalf.
Without a web proxy, HTTP clients (e.g., a browser) talk directly to HTTP servers.
HTTP proxy servers are both web servers and web clients.
5
A proxy must be both a server and a client
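[Figure: client, proxy, and server. The client's request goes to the proxy, which sends a request to the server; the server's response returns to the proxy, which sends a response to the client.]
Proxies act like servers to web clients.
Proxies act like clients to web servers.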
6
Private and Shared Proxies
Public proxies (shared proxies): a proxy server can be shared among numerous clients. E.g., caching servers.
Private proxies: a proxy server can be dedicated to a single client. E.g., some browser assistant products, as well as some ISP services, run small proxies directly on the user’s PC in order to extend browser features, improve performance, or host advertising for free ISP services.
7
Proxies Versus Gateways
Proxies connect two or more applications that speak the same protocol.
A gateway acts as a “protocol converter,” allowing a client to complete a transaction with a server, even when the client and server speak different protocols.
8
Proxies Versus Gateways
[Figure: (a) HTTP/HTTP proxy: the browser speaks HTTP to a web proxy, which speaks HTTP to the web server. (b) HTTP/POP gateway: the browser speaks HTTP to a web/email gateway, which speaks POP to the email server.]
9
Why Use Proxies?
Child filter
Document access controller
Security firewall
Web cache
Surrogate
Content router
Transcoder
Anonymizer
10
Child-safe Internet filter
[Figure: two child users reach the Internet through the school’s filtering proxy. A request to one server is allowed (ok); a request to a server whose site contains adult content is denied (DENY).]
11
Document access controller
Proxy servers can be used to implement a uniform access-control strategy across a large set of web servers and web resources and to create an audit trail.
All the access controls can be configured on the centralized proxy server, without requiring the access controls to be updated frequently on numerous web servers.
A proxy can also maintain “blacklists” in order to identify and restrict access to objectionable content.
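As a rough illustration of the blacklist and audit-trail idea, here is a minimal Python sketch. The host names, the `is_allowed` and `handle` helpers, and the use of status 403 for a denial are all assumptions made for the example; the slides do not prescribe any of them.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist; a real deployment would load this from configuration.
BLOCKED_HOSTS = {"objectionable.example", "ads.example"}

def is_allowed(url, blocked_hosts=BLOCKED_HOSTS):
    """Return False if the URL's host, or any parent domain of it, is blocked."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    # Check the host itself and each parent domain (a.b.c, b.c, c).
    candidates = {".".join(labels[i:]) for i in range(len(labels))}
    return not (candidates & set(blocked_hosts))

def handle(url, audit_log):
    """Decide a request and append the decision to an audit trail.

    Returns 403 for a denied request and 200 as a stand-in for
    "forward the request to the origin server".
    """
    allowed = is_allowed(url)
    audit_log.append((url, "ALLOW" if allowed else "DENY"))
    return 200 if allowed else 403
```

Because every decision passes through one function, the policy and the audit log live in a single place, which is the point the slide makes about centralizing access control.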
12
Centralized document access control
[Figure: Clients 1, 2, and 3 on a local area network reach Server A, Server B, and the Internet through an access control proxy. Requests for general news are served. A request for secret financial data prompts the question “What is the password for the financial data?”, and the intended request to Server B is blocked.]
13
Security firewall
Network security engineers often use proxy servers to enhance security.
Proxy servers restrict which application-level protocols flow in and out of an organization, at a single secure point in the network.
They also can provide hooks to examine the traffic, as used by virus-eliminating web and email proxies.
14
Security firewall
[Figure: clients inside the organization reach servers on the Internet through a firewall made of two filtering routers with a firewall proxy between them; the firewall stops a virus.]
15
Web cache
Proxy caches maintain local copies of popular documents and serve them on demand, reducing slow and costly Internet communication.
16
Web cache
17
Surrogate
Proxies can masquerade as web servers.
These so-called surrogates or reverse proxies receive real web server requests, but, unlike web servers, they may initiate communication with other servers to locate the requested content on demand.
A surrogate (server accelerator) may be used to improve the performance of slow web servers for common content.
Surrogates also can be used in conjunction with content-routing functionality to create distributed networks of on-demand replicated content.
18
Surrogate
[Figure: the client reaches the server across the Internet through a surrogate (also known as a reverse proxy or a server accelerator) placed in front of the server.]
19
Content router
Proxy servers can act as “content routers,” directing requests to particular web servers based on Internet traffic conditions and type of content.
Content routers also can be used to implement various service-level offerings.
For example, content routers forward requests to nearby replica caches (if the user has paid for higher performance), or route HTTP requests through filtering proxies (if the user has signed up for a filtering service).
20
Content routing
21
Transcoder
Proxy servers can modify the body format of content before delivering it to clients. This transparent translation between data representations is called transcoding.
For example, a transcoder can convert GIF images into JPEG images, compress files, summarize web content into a compact form, or translate text into another language.
22
Content transcoder
[Figure: a transcoding proxy sits between the origin server and two clients.]
Origin server content: “Summer Beach Shirts. You’ll get lots of smiles and winks when you wear our summer beach shirt. White, Black, Sunrise orange.”
Delivered to the Spanish-speaking client: “Playeras de Verano. Obtendrá muchas sonrisas y guiños cuando use nuestras playeras de verano. Blanco, Negro, Naranja amanecer.”
Delivered to the web-enabled mobile phone: “Summer Beach Shirts. You’ll get lots of smiles and winks when you wear our summer beach shirt. 1) White 2) Black 3) Sunrise orange.”
23
Anonymizer
Anonymizer proxies provide heightened privacy and anonymity by actively removing identifying information from HTTP messages.
Removed information includes, e.g., the client IP address, the From header, the Referer header, cookies, and URI session IDs.
However, because identifying information is removed, the quality of the user’s browsing experience may be diminished, and some web sites may not function properly.
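The header-stripping part of an anonymizer can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `anonymize` function and the `IDENTIFYING_HEADERS` set are hypothetical names, the `Client-IP` header is included as an assumed way a client address might travel in a message, and URI session IDs are not handled at all.

```python
# Header names treated as identifying; compared case-insensitively.
# "client-ip" is an assumption; From, Referer, and Cookie come from the slide.
IDENTIFYING_HEADERS = {"from", "referer", "cookie", "client-ip"}

def anonymize(headers):
    """Return a copy of (name, value) header pairs with identifying ones dropped."""
    cleaned = []
    for name, value in headers:
        lname = name.lower()
        if lname in IDENTIFYING_HEADERS:
            continue
        if lname == "user-agent":
            # Keep only the first product token, e.g. "Mozilla/4.0".
            value = value.split(" ", 1)[0]
        cleaned.append((name, value))
    return cleaned
```

Headers are kept as an ordered list of pairs rather than a dict so that repeated fields, such as several Cookie lines, are each seen and removed.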
24
Anonymizer
[Figure: client, anonymizing proxy, and server.]
Request sent by the client:
  GET /something/file.html HTTP/1.0
  Date: Thu, 25 Sep 2003 12:55:23 GMT
  User-Agent: Mozilla/4.0 (Windows NT 5.0)
  From: [email protected]
  Referer: http://www.csie.ncnu.edu.tw/tax-audits.html
  Cookie: profile="fotbal,litte beer"
  Cookie: income-braket="30k-45k"
Request forwarded by the anonymizing proxy:
  GET /something/file.html HTTP/1.0
  Date: Thu, 25 Sep 2003 12:55:23 GMT
  User-Agent: Mozilla/4.0
The anonymized message doesn't contain the common identifying information headers.
25
Proxy server deployment
Egress proxy: located at the exit points of local networks to control the traffic flow between the LAN and the greater Internet. E.g., to provide firewall protection, reduce bandwidth charges, and improve the performance of Internet traffic.
Access (ingress) proxy: placed at ISP access points, processing the aggregate requests from customers. E.g., ISPs use caching proxies to improve access performance.
Surrogates: located at the edge of the network, in front of web servers, where they can field all of the requests directed at the web server and ask the web server for resources only when necessary. They can add security features to web servers and improve the performance of slower web servers.
Network exchange proxy: placed at the Internet peering exchange points between networks, to alleviate congestion at Internet junctions through caching and to monitor traffic flows (e.g., for national security concerns).
26
(a) Private LAN egress proxy
[Figure: clients on the local network reach a server on the Internet through a proxy at the edge of the local network.]
27
(b) ISP access proxy
[Figure: clients reach a server on the Internet through a proxy at the ISP access point.]
28
(c) Surrogate
[Figure: clients on the Internet reach a server on a local network through a proxy placed in front of the server.]
29
(d) Network exchange proxy
[Figure: a client on Network 1 reaches a server on Network 2; the routers of the two networks are joined through a proxy at the exchange point.]
30
Proxy Hierarchies (e.g. 3-level)
Proxies can be cascaded in chains called proxy hierarchies. This hierarchy is static.
[Figure: client, Proxy 1, Proxy 2, Proxy 3, server. Proxy 1 is the child of Proxy 2; Proxy 2 is the child of Proxy 3 and the parent of Proxy 1; Proxy 3 is the parent of Proxy 2.]
31
Dynamic hierarchy, changing for each request
[Figure: the client connects to an access proxy, which can route each request to a caching proxy, a compressor proxy, a dedicated cache server for specially-subscribed objects, or across the Internet to web servers around the globe.]
32
Examples of dynamic parent selection
Load balancing
Geographic proximity routing
Protocol/type routing
Subscription-based routing
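To make dynamic parent selection a little more concrete, the sketch below picks a parent proxy per request in Python. Everything named here is hypothetical: the proxy addresses, the `choose_parent` function, and the specific rules (subscribers first, then images to a compressor, then hash-based load balancing). Geographic proximity routing is left out because it needs location data.

```python
import hashlib

# Hypothetical parent proxies; the names are placeholders.
CACHE_PARENTS = ["cache1.example.net:8080", "cache2.example.net:8080"]
COMPRESSOR = "compressor.example.net:8080"
SUBSCRIBER_CACHE = "premium-cache.example.net:8080"

def choose_parent(url, content_type="", subscriber=False):
    """Pick a parent proxy for one request."""
    if subscriber:
        return SUBSCRIBER_CACHE          # subscription-based routing
    if content_type.startswith("image/"):
        return COMPRESSOR                # protocol/type routing
    # Load balancing: hash the URL so the same URL always maps to the same cache.
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    return CACHE_PARENTS[digest[0] % len(CACHE_PARENTS)]
```

Hashing the URL, rather than rotating round-robin, is one possible design choice: it keeps each object on a single cache so the caches do not duplicate each other's content.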
33
How Proxies Get Traffic
(a) Client configured to use proxy
(b) Network intercepts and redirects traffic to proxy (via a router)
(c) Surrogate stands in for web server (assuming the web server’s name)
(d) Server redirects HTTP requests to proxy
34
Client Proxy Settings
Manual configuration: explicitly set a proxy to use.
Browser preconfiguration: the browser vendor manually preconfigures the proxy setting of the browser before delivering it to customers.
Proxy auto-configuration (PAC): provide a URI to a JavaScript proxy auto-configuration (PAC) file. The browser fetches the JavaScript file and runs it to decide which proxy to use.
WPAD proxy discovery: some browsers (e.g., Internet Explorer) support the Web Proxy Autodiscovery Protocol (WPAD), which automatically detects a “configuration server” from which the browser can download an auto-configuration file.
35
PAC files
PAC files typically have a .pac suffix (e.g., http://proxy.ncnu.edu.tw/ncnu.pac) and the MIME type “application/x-ns-proxy-autoconfig.”
Each PAC file must define a function called FindProxyForURL(url, host) that computes the proper proxy server to use for accessing the URI.
The function returns a string such as:
DIRECT // connections should be made directly
PROXY host:port // the specified proxy should be used
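PAC files themselves are JavaScript, so the following Python sketch does not show a PAC file. It only illustrates how a client might interpret the string that FindProxyForURL returns. The `parse_pac_result` name and the port 8080 in the usage note are assumptions, and the handling of several semicolon-separated choices reflects a common PAC convention that is not stated on the slide.

```python
def parse_pac_result(result):
    """Parse a FindProxyForURL return string into a list of choices.

    Each choice is ("DIRECT", None) or ("PROXY", "host:port").
    """
    choices = []
    for part in result.split(";"):
        tokens = part.split()
        if not tokens:
            continue  # skip empty segments such as a trailing semicolon
        kind = tokens[0].upper()
        if kind == "DIRECT":
            choices.append(("DIRECT", None))
        elif kind == "PROXY" and len(tokens) == 2:
            choices.append(("PROXY", tokens[1]))
        else:
            raise ValueError(f"unsupported PAC directive: {part.strip()!r}")
    return choices
```

For example, `parse_pac_result("PROXY proxy.ncnu.edu.tw:8080; DIRECT")` yields the proxy choice first and the direct connection as a fallback.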
36
Web Proxy Autodiscovery Protocol (WPAD)
A client that implements WPAD will:
1. Use WPAD to find the PAC URI.
2. Fetch the PAC file given in the URI.
3. Execute the PAC file to determine the proxy server.
4. Use the proxy server for requests.
37
WPAD (cont.)
WPAD uses a series of resource-discovery techniques, one by one until it succeeds, to determine the proper PAC file.
Multiple discovery techniques are used, because not all organizations can use all techniques:
Dynamic Host Configuration Protocol (DHCP)
Service Location Protocol (SLP)
DNS well-known hostnames
DNS SRV records
DNS service URIs in TXT records
38
Proxy URLs Differ from Server URLs
(a) Server request (client to origin server):
  GET /index.html HTTP/1.0
  User-agent: SuperBrowser v1.3
(b) Explicit proxy request (client to proxy server, proxy explicitly configured):
  GET http://www.ncnu.edu.tw/index.html HTTP/1.0
  User-agent: SuperBrowser v1.3
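The difference between the two request lines can be shown with a short Python sketch. The `request_line` helper is a hypothetical name introduced for this illustration: a direct request carries only the path (and query), while a request to an explicitly configured proxy carries the full URI.

```python
from urllib.parse import urlsplit

def request_line(url, via_explicit_proxy, version="HTTP/1.0"):
    """Build the GET request line for a direct or explicit-proxy request."""
    parts = urlsplit(url)
    if via_explicit_proxy:
        target = url                      # full URI, scheme and host included
    else:
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query   # origin servers get path and query only
    return f"GET {target} {version}"
```

Calling `request_line("http://www.ncnu.edu.tw/index.html", False)` produces the server-style line, and passing `True` produces the proxy-style line from the slide.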
39
Proxy URLs Differ from Server URLs
(c) Surrogate (reverse proxy) request (the server hostname points to the surrogate proxy):
  GET /index.html HTTP/1.0
  User-agent: SuperBrowser v1.3
(d) Intercepting proxy request (client to origin server, captured by the intercepting proxy):
  GET /index.html HTTP/1.0
  User-agent: SuperBrowser v1.3
40
URL Resolution Without a Proxy
[Figure: browser, DNS server, and www.ncnu.edu.tw.]
(1) User types “ncnu” into the browser’s URI location window.
(2a) Browser looks up host “ncnu” via DNS.
(2b) Failed; host unknown.
(3a) The browser does auto-expansion, converting “ncnu” into “www.ncnu.edu.tw”.
(3b) Browser looks up host “www.ncnu.edu.tw” via DNS.
(3c) Success! IP addresses are returned.
(4a) Browser tries to connect to the IP addresses, one by one, until a connection succeeds.
(4b) Success; connection established.
(5a) Browser sends HTTP request.
(5b) Browser gets HTTP response.
41
URL Resolution with an Explicit Proxy
[Figure: browser, DNS server, proxy, and www.ncnu.edu.tw.]
(1) User types “ncnu” into the browser’s URI location window.
(2a) Proxy is explicitly configured, so the browser looks up the address of the proxy server using DNS.
(2b) Success! Proxy server IP addresses are returned.
(3a) Browser tries to connect to the proxy.
(3b) Success; connection established.
(4a) Browser sends HTTP request:
  GET http://ncnu/ HTTP/1.0
  Proxy-connection: keep-Alive
  User-Agent: Mozilla/4.0
  Host: ncnu
  Accept: */*
  Accept-encoding: gzip
  Accept-language: en
  Accept-charset: iso-8859-1,*,utf-8
(4b) Proxy gets a partial hostname in the request, because the client did not auto-expand it.
42
URL Resolution with an Intercepting Proxy
[Figure: client, DNS server, interceptor, proxy, and www.ncnu.edu.tw, with the steps labeled (1), (2a), (2b), (3a), (3b), (3c), (4a), (4b), and (5a).]
Tracing Messages
[Figure: client, ISP proxy, Internet, surrogate cache bank, web server.]
Today, it’s not uncommon for web requests to go through a chain of two or more proxies on their way from the client to the server.
It’s important to trace the flow of messages across proxies and to detect any problems.
44
The Via Header
The Via header is used to track the forwarding of messages, diagnose message routing loops, and identify the protocol capabilities of all senders along the request/response chain.
It lists information about each intermediate node (proxy or gateway) through which a message passes.
Each time a message goes through another node, that intermediate node must be added to the end of the Via list.
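A minimal Python sketch of these two uses of Via, appending a waypoint and spotting a loop, might look like the following. The function names are hypothetical, headers are modeled as a plain dict keyed by the exact name "Via" for simplicity, and Via comments in parentheses are not handled.

```python
def add_via(headers, protocol_version, node_name):
    """Return a copy of headers with this node appended to the end of Via."""
    entry = f"{protocol_version} {node_name}"
    updated = dict(headers)
    existing = headers.get("Via", "").strip()
    updated["Via"] = f"{existing}, {entry}" if existing else entry
    return updated

def seen_in_via(headers, node_name):
    """True if node_name already appears in Via, which suggests a routing loop."""
    for waypoint in headers.get("Via", "").split(","):
        tokens = waypoint.split()
        if len(tokens) >= 2 and tokens[1].lower() == node_name.lower():
            return True
    return False
```

A proxy would typically call `seen_in_via` with its own name before forwarding, and `add_via` just before sending the message on.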
45
The Via Header
[Figure: client -> proxy1.ncnu.edu.tw (HTTP/1.1) -> proxy2.ncnu.edu.tw (HTTP/1.0) -> server.]
Request message (as received by server):
  GET /index.html HTTP/1.0
  Accept: text/html
  Host: www.csie.ncnu.edu.tw
  Via: 1.1 proxy1.ncnu.edu.tw, 1.0 proxy2.ncnu.edu.tw
46
The response Via is usually the reverse of the request Via
[Figure: client -> A -> B -> C -> server.]
Request Via header:
  Via: 1.1 A, 1.1 B, 1.1 C
Response Via header:
  Via: 1.1 C, 1.1 B, 1.1 A
47
Via and gateways
Some proxies provide gateway functionality to servers that speak non-HTTP protocols.
The Via header records these protocol conversions, so HTTP applications can be aware of protocol capabilities and conversions along the proxy chain.
48
Via and gateways
[Figure: client -> proxy1.ncnu.edu.tw (HTTP/1.1) -> www.ncnu.edu.tw. The proxy sends an FTP request to the server and receives an FTP response.]
HTTP request message sent to proxy:
  GET ftp://www.ncnu.edu.tw/pub/welcome.txt HTTP/1.0
HTTP response message:
  HTTP/1.0 200 OK
  Date: Sun, 12 Dec 2003 21:01:59 GMT
  Via: FTP/1.0 proxy.ncnu.edu.tw (Traffic-Server/5.0.1-17882[cMsf])
  Last-modified: Sun, 12 Dec 2003 21:05:24 GMT
  Content-type: text/plain

  Hi there. This is an FTP server.
49
The Server and Via headers
The Server response header field describes the software used by the origin server, e.g.:
  Server: Apache/1.3.14 (UNIX) PHP/4.0.4
  Server: Netscape-Enterprise/4.1
  Server: Microsoft-IIS/5.0
If a response message is being forwarded through a proxy, make sure the proxy does not modify the Server header.
The Server header is meant for the origin server. Instead, the proxy should add a Via entry.
50
Privacy and security implications of Via
There are some cases when we don’t want exact hostnames in the Via string.
For example, when a proxy server is part of a network firewall it should not forward the names and ports of hosts behind the firewall, because knowledge of network architecture behind a firewall might be of use to a malicious party.
A proxy can disable Via node-name forwarding, replacing the hostname with an appropriate pseudonym.
For strong privacy requirements, a proxy may combine an ordered sequence of Via waypoint entries (with the same protocol version) into a single, joined entry. For example,
  Via: 1.0 foo, 1.1 devirus.com, 1.1 access-logger.com
can be collapsed to
  Via: 1.0 foo, 1.1 concealed-stuff
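The joining of Via entries can be sketched in Python as follows. This is an illustration only: `conceal_via`, the `internal_hosts` parameter, and the default pseudonym are hypothetical, and the simple comma split would mishandle Via comments that contain commas.

```python
def conceal_via(via, internal_hosts, pseudonym="concealed-stuff"):
    """Replace runs of internal waypoints that share a protocol version
    with a single pseudonymous entry."""
    result = []
    for waypoint in via.split(","):
        tokens = waypoint.split()
        if len(tokens) < 2:
            continue  # skip empty or malformed entries
        version, host = tokens[0], tokens[1]
        if host in internal_hosts:
            entry = f"{version} {pseudonym}"
            # Merge only when the previous entry is the same pseudonym and version.
            if result and result[-1] == entry:
                continue
            result.append(entry)
        else:
            result.append(" ".join(tokens))
    return ", ".join(result)
```

Entries are merged only when they are adjacent and share a protocol version, so the order of waypoints and the version information that downstream nodes rely on are preserved.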
51
The TRACE method
Proxy servers can change messages as the messages are forwarded. Headers are added, modified, and removed, and bodies can be converted to different formats.
As proxies become more sophisticated, and more vendors deploy proxy products, interoperability problems increase.
We need a way to watch how messages are changed, hop by hop.
HTTP/1.1’s TRACE method is for this purpose. It is very useful for debugging proxy flows.
It can trace a request message through a chain of proxies, observing what proxies the message passes through and how each proxy modifies the request message.
52
The TRACE method
When the TRACE request reaches the destination server, the entire request message is reflected back to the sender, bundled up in the body of an HTTP response.
When the TRACE response arrives, the client can examine the exact message the server received and the list of proxies through which it passed (in the Via header).
The TRACE response has Content-Type: message/http and a 200 OK status.
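The reflection step can be sketched in Python as below. `trace_response` is a hypothetical helper, not part of any library; it assumes the received request is available as a request line plus an ordered list of header pairs, and that all of it is ASCII.

```python
def trace_response(request_line, headers):
    """Build the response a server might send for a received TRACE request."""
    # The body is the request as it arrived, with headers in received order.
    lines = [request_line] + [f"{name}: {value}" for name, value in headers]
    body = "\r\n".join(lines) + "\r\n\r\n"
    return (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: message/http\r\n"
        f"Content-Length: {len(body.encode('ascii'))}\r\n"
        "\r\n"
        + body
    )
```

Because the body echoes the headers the server actually saw, any Via entries or other fields added by proxies along the way show up in the reflected message.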
53
The TRACE Method
[Figure: client -> Proxy1 (proxy.ncnu.edu.tw) -> Proxy2 (proxy2.ncnu.edu.tw) -> Proxy3 (proxy3.ncnu.edu.tw) -> server www.ncnu.edu.tw.]
TRACE request:
  TRACE /index.html HTTP/1.1
  Host: www.ncnu.edu.tw
  Accept: text/html
TRACE response (the body is the received request):
  HTTP/1.1 200 OK
  Content-Type: message/http
  Content-Length: 269
  Via: 1.1 proxy3.ncnu.edu.tw, 1.1 proxy2.ncnu.edu.tw, 1.1 proxy1.ncnu.edu.tw

  TRACE /index.html HTTP/1.1
  Host: www.ncnu.edu.tw
  Accept: text/html
  Via: 1.1 proxy.ncnu.edu.tw, 1.1 proxy2.ncnu.edu.tw, 1.1 proxy3.ncnu.edu.tw
  X-Magic-CDN-Thingy: 134-AF-003
  Cookie: accept-isp="hinet's ISP, Puli"
  Client-ip: 163.22.3.4
54
Max-Forwards
Normally, TRACE messages travel all the way to the destination server, regardless of the number of intervening proxies.
We can use the Max-Forwards header to limit the number of proxy hops for TRACE and OPTIONS requests, which is useful for:
Testing whether a chain of proxies is forwarding messages in an infinite loop.
Checking the effects of a particular proxy in the middle of a chain.
If the Max-Forwards value is zero, the receiver must reflect the TRACE message back toward the client without forwarding it further, even if the receiver is not the origin server (similar to the TTL field in an IP datagram). Otherwise, the receiver decrements the Max-Forwards value by one before forwarding the message.
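The per-hop rule can be written as a small Python function. This is a sketch under assumptions: `next_hop_action` is a hypothetical name, headers are modeled as a dict with the exact key "Max-Forwards", and the string results "respond" and "forward" are just labels for the two outcomes.

```python
def next_hop_action(method, headers):
    """Apply the Max-Forwards rule at one intermediary.

    Returns ("respond", headers) when this node must answer itself,
    or ("forward", updated_headers) when the message should travel on.
    """
    if method not in ("TRACE", "OPTIONS") or "Max-Forwards" not in headers:
        return "forward", dict(headers)
    remaining = int(headers["Max-Forwards"])
    if remaining == 0:
        # Do not forward; this node answers as if it were the final recipient.
        return "respond", dict(headers)
    updated = dict(headers)
    updated["Max-Forwards"] = str(remaining - 1)
    return "forward", updated
```

With an initial value of 2, the first two proxies forward the request (with values 1 and then 0) and the third one responds, so the message never reaches the origin server.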
55
Max-Forwards
[Figure: client -> Proxy1 (proxy.ncnu.edu.tw) -> Proxy2 (proxy2.ncnu.edu.tw) -> Proxy3 (proxy3.ncnu.edu.tw) -> server www.ncnu.edu.tw. The client sends Max-Forwards: 2, Proxy1 forwards the request with Max-Forwards: 1, Proxy2 forwards it with Max-Forwards: 0, and Proxy3 returns the TRACE response instead of forwarding to the server.]
TRACE request:
  TRACE /index.html HTTP/1.1
  Host: www.ncnu.edu.tw
  Max-Forwards: 2
  Accept: text/html
TRACE response (the body is the received request):
  HTTP/1.1 200 OK
  Content-Type: message/http
  Content-Length: 269
  Via: 1.1 proxy2.ncnu.edu.tw, 1.1 proxy1.ncnu.edu.tw

  TRACE /index.html HTTP/1.1
  Host: www.ncnu.edu.tw
  Accept: text/html
  Via: 1.1 proxy.ncnu.edu.tw, 1.1 proxy2.ncnu.edu.tw
  X-Magic-CDN-Thingy: 134-AF-003
  Cookie: accept-isp="hinet's ISP, Puli"
  Client-ip: 163.22.3.4
56
Proxy Authentication
Proxies can serve as access-control devices.
HTTP defines a mechanism called proxy authentication that blocks requests for content until the user provides valid access-permission credentials to the proxy.
We will talk more about HTTP authentication in later lectures (chap12).
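The challenge and retry exchange for Basic proxy authentication can be sketched in Python. The helper names, the dict-based header model, and the realm string used here are assumptions for illustration; only the header names, the Basic scheme, and the 407 status belong to HTTP itself.

```python
import base64

def proxy_authorization(username, password):
    """Build a Basic Proxy-Authorization header value for the given credentials."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")

def proxy_challenge(headers, expected):
    """Return None if the request carries the expected credentials,
    otherwise a (status, headers) pair for a 407 challenge."""
    if headers.get("Proxy-Authorization") == expected:
        return None  # authorized; the proxy may forward the request
    return 407, {"Proxy-Authenticate": 'Basic realm="Secure Stuff"'}
```

A client that receives the 407 challenge would resend the same request with a Proxy-Authorization header built by `proxy_authorization`. Basic credentials are only base64-encoded, not encrypted, so this scheme offers no confidentiality on its own.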
57
Proxy Authentication
[Figure: client, access control proxy, and server.]
(a) The client sends a request through the access control proxy:
  GET http://www.ncnu.edu.tw/secret.jpg HTTP/1.0
(b) The proxy returns a challenge:
  HTTP/1.0 407 Proxy Authorization Required
  Proxy-Authenticate: Basic realm="Secure Stuff"
(c) The client resends the request with credentials:
  GET http://server.com/secret.jpg HTTP/1.0
  Proxy-Authorization: Basic YadNfddZws==
58
Proxy Authentication
[Figure: client, access control proxy, and server.]
(d) The credentials are accepted, and the super secret image is returned to the client:
  HTTP/1.0 200 OK
  Content-type: image/jpeg
  …<image data included>…
59
Proxy Interoperation
Clients, servers, and proxies are built by multiple vendors, to different versions of the HTTP specification.
Proxy servers need to intermediate between client-side and server-side devices, which may implement different protocols and have different bugs (quirks).
Handling unsupported headers and methods: a proxy must forward unrecognized header fields and must maintain the relative order of header fields with the same name.
The OPTIONS method is used to discover optional feature support.
60
OPTIONS:Discovering Optional Feature Support
[Figure: client, proxy, and server.]
Request:
  OPTIONS * HTTP/1.1
Response:
  HTTP/1.1 200 OK
  Allow: GET,PUT,POST,HEAD,TRACE,OPTIONS
61
OPTIONS
If the URI is a real resource, the OPTIONS request inquires about the features available to that particular resource.
  OPTIONS http://www.joes-heardware.com/index.html HTTP/1.1
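On the client side, using an OPTIONS response comes down to reading the Allow header. The Python sketch below shows one way to do that; the function names and the dict-based header model are assumptions made for the example.

```python
def options_request_line(target="*", version="HTTP/1.1"):
    """Build an OPTIONS request line; "*" asks about the server as a whole."""
    return f"OPTIONS {target} {version}"

def allowed_methods(response_headers):
    """Return the set of methods listed in an Allow response header."""
    allow = response_headers.get("Allow", "")
    return {m.strip().upper() for m in allow.split(",") if m.strip()}

def supports(response_headers, method):
    """True if the given method appears in the Allow header."""
    return method.upper() in allowed_methods(response_headers)
```

A proxy or client could call `supports(headers, "TRACE")` before relying on TRACE for debugging, falling back to something else when the method is not advertised.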
62
For More Information
“Hypertext Transfer Protocol,” by R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, and T. Berners-Lee. http://www.w3.org/Protocols/rfc2616/rfc2616.txt
“Internet Web Replication and Caching Taxonomy.” ftp://ftp.rfc-editor.org/in-notes/rfc3040.txt
“Known HTTP Proxy/Caching Problems.” ftp://ftp.rfc-editor.org/in-notes/rfc3143.txt
Web Proxy Servers, Ari Luotonen, Prentice Hall Computer Books.
Web Caching, Duane Wessels, O’Reilly & Associates, Inc.