Page 1: Proxies

1

Proxies

Herng-Yow Chen

Page 2: Proxies

2

Outline

Explain HTTP proxies, contrasting them to web gateways and illustrating how proxies are deployed.

Show some of the ways proxies are helpful.

How proxies are deployed in real networks and how traffic is directed to proxy servers.

How to configure your browser to use a proxy.

Demonstrate HTTP proxy requests, how they differ from server requests, and how proxies can subtly change the behavior of browsers.

Page 3: Proxies

3

Outline (cont.)

Explain how you can record the path of your messages through chains of proxy servers, using Via headers and the TRACE method.

Describe proxy-based HTTP access control.

Explain how proxies can interoperate between clients and servers, each of which may support different features and versions.

Page 4: Proxies

4

Web intermediaries

Web proxy servers are middlemen that fulfill transactions on the client’s behalf.

Without a web proxy, HTTP clients (e.g., a browser) talk directly to HTTP servers.

HTTP proxy servers are both web servers and web clients.

Page 5: Proxies

5

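A proxy must be both a server and a client

[Figure: client, proxy, and server in a row. The client's request goes to the proxy, which sends a request to the server; the server's response returns to the proxy, which sends a response to the client.]

Proxies act like servers to web clients.

Proxies act like clients to web servers.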

Page 6: Proxies

6

Private and Shared Proxies

Public proxies (shared proxies): a proxy server can be shared among numerous clients, e.g., caching servers.

Private proxies: a proxy server can be dedicated to a single client. E.g., some browser assistant products, as well as some ISP services, run small proxies directly on the user’s PC in order to extend browser features, improve performance, or host advertising for free ISP services.

Page 7: Proxies

7

Proxies Versus Gateways

Proxies connect two or more applications that speak the same protocol.

A gateway acts as a “protocol converter,” allowing a client to complete a transaction with a server, even when the client and server speak different protocols.

Page 8: Proxies

8

Proxies Versus Gateways

[Figure: (a) HTTP/HTTP proxy: a browser speaks HTTP to a web proxy, which speaks HTTP to a web server. (b) HTTP/POP gateway: a browser speaks HTTP to a web/email gateway, which speaks POP to an email server.]

Page 9: Proxies

9

Why Use Proxies?

Child filter

Document access controller

Security firewall

Web cache

Surrogate

Content router

Transcoder

Anonymizer

Page 10: Proxies

10

Child-safe Internet filter

[Figure: child users reach servers on the Internet through the school’s filtering proxy. One request is allowed (“ok”); a request to a site that contains adult content is refused (“DENY”).]

Page 11: Proxies

11

Document access controller

Proxy servers can be used to implement a uniform access-control strategy across a large set of web servers and web resources and to create an audit trail.

All the access controls can be configured on the centralized proxy server, without requiring the access controls to be updated frequently on numerous web servers.

Proxies can also maintain “blacklists” in order to identify and restrict access to objectionable content.
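A minimal sketch of how a centralized proxy might apply a blacklist and keep an audit trail. The blacklist entries and the function names are hypothetical, chosen only for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical blacklist of hostnames; the entries are placeholders.
BLACKLIST = {"objectionable.example", "adult.example"}


def is_allowed(url: str, blacklist: set[str] = BLACKLIST) -> bool:
    """Return False when the URL's host, or any parent domain of it, is blacklisted."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    # Check "a.b.c", then "b.c", then "c", so that subdomains are covered too.
    candidates = {".".join(labels[i:]) for i in range(len(labels))}
    return not (candidates & blacklist)


def audit(url: str, log: list[str]) -> bool:
    """Decide on a request and append the decision to an audit trail."""
    allowed = is_allowed(url)
    log.append(f"{'ALLOW' if allowed else 'DENY'} {url}")
    return allowed
```

Because the decision is made in one place, changing the policy means editing one blacklist rather than the configuration of every web server.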

Page 12: Proxies

12

Centralized document access control

[Figure: Clients 1, 2, and 3 on a local area network, an access control proxy on the path to the Internet, and Servers A and B. Labels: general news; secret financial data; “What is the password for the financial data?”; intended request to server B blocked.]

Page 13: Proxies

13

Security firewall

Network security engineers often use proxy servers to enhance security.

Proxy servers restrict which application-level protocols flow in and out of an organization, at a single secure point in the network.

They also can provide hooks to examine the traffic, as used by virus-eliminating web and email proxies.

Page 14: Proxies

14

Security firewall

[Figure: a firewall made of two filtering routers with a firewall proxy between them, placed between clients, servers, and the Internet; a virus is stopped at the firewall.]

Page 15: Proxies

15

Web cache

Proxy caches maintain local copies of popular documents and serve them on demand, reducing slow and costly Internet communication.
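A toy sketch of the idea: serve a stored copy when one exists, otherwise fetch from the origin and keep the result. The class name and the `fetch_from_origin` callable are assumptions for illustration, and the sketch ignores freshness and expiry, which a real cache must handle.

```python
from typing import Callable


class CachingProxy:
    """Toy proxy cache: serve a stored copy when there is one, else fetch and store."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        # fetch_from_origin is a hypothetical callable standing in for the
        # slow, costly request to the origin server.
        self._fetch = fetch_from_origin
        self._store: dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url in self._store:
            self.hits += 1
            return self._store[url]
        self.misses += 1
        body = self._fetch(url)
        self._store[url] = body
        return body
```

Only the first request for a URL reaches the origin server; later requests for the same URL are answered locally.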

Page 16: Proxies

16

Web cache

Page 17: Proxies

17

Surrogate

Proxies can masquerade as web servers.

These so-called surrogates or reverse proxies receive real web server requests, but, unlike web servers, they may initiate communication with other servers to locate the requested content on demand.

A surrogate (server accelerator) may be used to improve the performance of slow web servers for common content.

Surrogates also can be used in conjunction with content-routing functionality to create distributed networks of on-demand replicated content.

Page 18: Proxies

18

Surrogate

[Figure: a client reaches the server across the Internet through a surrogate (also known as a reverse proxy or a server accelerator) placed in front of the server.]

Page 19: Proxies

19

Content router

Proxy servers can act as “content routers,” directing requests to particular web servers based on Internet traffic conditions and type of content.

Content routers also can be used to implement various service-level offerings.

For example, content routers forward requests to nearby replica caches (if the user has paid for higher performance), or route HTTP requests through filtering proxies (if the user has signed up for a filtering service).

Page 20: Proxies

20

Content routing

Page 21: Proxies

21

Transcoder

Proxy servers can modify the body format of content before delivering it to clients.

This transparent translation between data representations is called transcoding.

For example, a transcoder can convert GIF images into JPEG images, compress files, summarize web content in a compact form, or translate text into another language.

Page 22: Proxies

22

Content transcoder

[Figure: a transcoding proxy sits between the origin server and two clients. The origin server’s English page (“Summer Beach Shirts: You’ll get lots of smiles and winks when you wear our summer beach shirt”, in White, Black, and Sunrise orange) is translated into Spanish for a Spanish-speaking client and reduced to a compact numbered text list (1) White, 2) Black, 3) Sunrise orange) for a web-enabled mobile phone.]

Page 23: Proxies

23

Anonymizer

Anonymizer proxies provide heightened privacy and anonymity by actively removing identifying information from HTTP messages.

Removed information includes, e.g., the client IP address, From header, Referer header, cookies, and URI session IDs.

However, because identifying information is removed, the quality of the user’s browsing experience may be diminished, and some web sites may not function properly.
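A minimal sketch of the header-stripping step, assuming the request headers are held as a list of (name, value) pairs. The set of header names and the trimming of User-Agent are illustrative choices; Client-IP is a nonstandard header and including it is an assumption.

```python
# Header names treated as identifying. Client-IP is a nonstandard header
# that some proxies add; including it here is an assumption.
IDENTIFYING_HEADERS = {"from", "referer", "cookie", "client-ip"}


def anonymize(headers: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Return a copy of the header list without identifying headers.

    The User-Agent value is trimmed to its first product token, so that
    "Mozilla/4.0 (Windows NT 5.0)" becomes "Mozilla/4.0".
    """
    cleaned = []
    for name, value in headers:
        lowered = name.lower()
        if lowered in IDENTIFYING_HEADERS:
            continue
        if lowered == "user-agent":
            value = value.split(" ", 1)[0]
        cleaned.append((name, value))
    return cleaned
```

This sketch only touches headers; session IDs embedded in the URI and the client's IP address at the network level would need separate handling.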

Page 24: Proxies

24

Anonymizer

[Figure: client, anonymizing proxy, and server.]

Original request (client to anonymizing proxy):

GET /something/file.html HTTP/1.0
Date: Thu, 25 Sep 2003 12:55:23 GMT
User-Agent: Mozilla/4.0 (Windows NT 5.0)
From: [email protected]
Referer: http://www.csie.ncnu.edu.tw/tax-audits.html
Cookie: profile="football,lite beer"
Cookie: income-bracket="30k-45k"

Anonymized request (anonymizing proxy to server):

GET /something/file.html HTTP/1.0
Date: Thu, 25 Sep 2003 12:55:23 GMT
User-Agent: Mozilla/4.0

The anonymized message doesn't contain the common identifying information headers.

Page 25: Proxies

25

Proxy server deployment

Egress proxy: located at the exit points of local networks to control the traffic flow between the LAN and the greater Internet. E.g., to provide firewall protection, reduce bandwidth charges, and improve the performance of Internet traffic.

Access (ingress) proxy: placed at ISP access points, processing the aggregate requests from the customers. E.g., ISPs use caching proxies to improve access performance.

Surrogates: located at the edge of the network, in front of web servers, where they can field all of the requests directed at the web server and ask the web server for resources only when necessary. They can add security features to web servers and improve the performance of slower web servers.

Network exchange proxy: placed at the Internet peering exchange points between networks, to alleviate congestion at Internet junctions through caching and to monitor traffic flows (e.g., for national security concerns).

Page 26: Proxies

26

Private LAN egress proxy

[Figure (a): clients on a local network reach a server on the Internet through a proxy at the edge of the local network.]

Page 27: Proxies

27

ISP access proxy

[Figure (b): clients reach a server on the Internet through a proxy at the ISP access point.]

Page 28: Proxies

28

Surrogate

[Figure (c): clients on the Internet reach a server on a local network through a proxy placed in front of the server.]

Page 29: Proxies

29

Network exchange proxy

[Figure (d): a client and a server on two networks (Network 1 and Network 2) joined by routers, with a proxy at the exchange point between them.]

Page 30: Proxies

30

Proxy Hierarchies (e.g. 3-level)

[Figure: client, Proxy 1, Proxy 2, Proxy 3, and server in a chain. Proxy 1 is the child of Proxy 2; Proxy 2 is the child of Proxy 3 and the parent of Proxy 1; Proxy 3 is the parent of Proxy 2.]

Proxies can be cascaded in chains called proxy hierarchies.

This hierarchy is static.

Page 31: Proxies

31

Dynamic hierarchy, changing for each request

[Figure: a client’s access proxy chooses a different next hop for each request: web servers around the globe across the Internet, a caching proxy, a compressor proxy, or a dedicated cache server for specially-subscribed objects.]

Page 32: Proxies

32

Examples of dynamic parent selection

Load balancing

Geographic proximity routing

Protocol/type routing

Subscription-based routing
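A small sketch combining two of these strategies: type routing picks a group of parents from the URL, and load balancing picks the least-loaded parent within that group. The `Parent` class, its `load` metric, and the grouping by file extension are all assumptions for illustration.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass
class Parent:
    name: str
    load: int  # current number of active requests (hypothetical metric)


def choose_parent(url: str, image_parents: list[Parent],
                  default_parents: list[Parent]) -> Parent:
    """Pick a parent by URL type first, then by lowest load within that group."""
    path = urlsplit(url).path.lower()
    # Protocol/type routing: image requests go to a dedicated group.
    is_image = path.endswith((".gif", ".jpg", ".jpeg", ".png"))
    group = image_parents if is_image else default_parents
    # Load balancing: the least-loaded parent in the chosen group wins.
    return min(group, key=lambda p: p.load)
```

Because the choice is recomputed for every request, two consecutive requests from the same client can take different paths.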

Page 33: Proxies

33

How Proxies Get Traffic

(a) Client configured to use proxy

(b) Network intercepts and redirects traffic to proxy (through a router)

(c) Surrogate stands in for web server (assuming the web server’s name)

(d) Server redirects HTTP requests to proxy

[Figure: four client/proxy/server diagrams, one for each case above.]

Page 34: Proxies

34

Client Proxy Settings

Manual configuration: explicitly set a proxy to use.

Browser preconfiguration: the browser vendor manually preconfigures the proxy setting of the browser before delivering it to customers.

Proxy auto-configuration (PAC): provide a URI to a JavaScript proxy auto-configuration (PAC) file. The browser fetches the JavaScript file and runs it to decide which proxy to use.

WPAD proxy discovery: some browsers support the Web Proxy Autodiscovery Protocol (WPAD), which automatically detects a “configuration server” from which the browser can download an auto-configuration file (e.g., in Internet Explorer).

Page 35: Proxies

35

PAC files

PAC files have a .pac suffix and the MIME type “application/x-ns-proxy-autoconfig” (example: http://proxy.ncnu.edu.tw/ncnu.pac).

Each PAC file must define a function called FindProxyForURL(url, host) that computes the proper proxy server to use for accessing the URI.

Return values include:

DIRECT // connections should be made directly

PROXY host:port // the specified proxy should be used
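Real PAC files are written in JavaScript. The sketch below is only a Python analogue of the decision logic such a file might contain. The routing rule, the proxy host, and port 8080 are assumptions, not taken from any actual ncnu.pac file.

```python
from fnmatch import fnmatch

# Hypothetical proxy address; the port is an assumption.
CAMPUS_PROXY = "PROXY proxy.ncnu.edu.tw:8080"


def find_proxy_for_url(url: str, host: str) -> str:
    """Python analogue of a PAC file's FindProxyForURL(url, host)."""
    # Plain hostnames (no dots) and campus hosts are reached directly.
    if "." not in host or fnmatch(host.lower(), "*.ncnu.edu.tw"):
        return "DIRECT"
    # Everything else goes through the proxy.
    return CAMPUS_PROXY
```

The return value uses the same strings a PAC function returns, so the two outcomes map directly onto the DIRECT and PROXY host:port forms above.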

Page 36: Proxies

36

Web Proxy Autodiscovery Protocol (WPAD)

A client that implements WPAD will:

Use WPAD to find the PAC URI.

Fetch the PAC file given in the URI.

Execute the PAC file to determine the proxy server.

Use the proxy server for requests.

Page 37: Proxies

37

WPAD (cont.)

WPAD uses a series of resource-discovery techniques, one by one until it succeeds, to determine the proper PAC file.

Multiple discovery techniques are used, because not all organizations can use all techniques:

Dynamic Host Configuration Protocol (DHCP)

Service Location Protocol (SLP)

DNS well-known hostnames

DNS SRV records

DNS service URIs in TXT records
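The "one by one until it succeeds" behavior can be sketched as a loop over discovery functions. The discoverer callables are hypothetical stand-ins; this sketch does not implement DHCP, SLP, or DNS lookups.

```python
from typing import Callable, Iterable, Optional

# Each discoverer returns a PAC URI, or None when its technique finds nothing.
Discoverer = Callable[[], Optional[str]]


def discover_pac_uri(
    discoverers: Iterable[tuple[str, Discoverer]],
) -> Optional[tuple[str, str]]:
    """Try each technique in order; return (technique, PAC URI) for the first success."""
    for technique, discover in discoverers:
        try:
            uri = discover()
        except OSError:
            # A technique that is unavailable on this network counts as a miss.
            uri = None
        if uri:
            return technique, uri
    return None
```

Techniques later in the list are never consulted once an earlier one succeeds, so the order of the list sets the priority.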

Page 38: Proxies

38

Proxy URLs Differ from Server URLs

(a) Server request: the client sends the request directly to the origin server, using a partial URI.

GET /index.html HTTP/1.0
User-agent: SuperBrowser v1.3

(b) Explicit proxy request: the client is explicitly configured to use a proxy server, so it sends the full URI to the proxy.

GET http://www.ncnu.edu.tw/index.html HTTP/1.0
User-agent: SuperBrowser v1.3
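The difference between the two request lines can be sketched as a single function. The function name and the `via_explicit_proxy` flag are illustrative.

```python
from urllib.parse import urlsplit


def request_line(url: str, via_explicit_proxy: bool, version: str = "HTTP/1.0") -> str:
    """Build the GET request line for a direct request or an explicit proxy request."""
    if via_explicit_proxy:
        # A proxy needs the full URI to learn which server to contact.
        target = url
    else:
        # A server already knows its own name, so only the path (and query) is sent.
        parts = urlsplit(url)
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
    return f"GET {target} {version}"
```

For example, `request_line("http://www.ncnu.edu.tw/index.html", True)` keeps the scheme and host, while the same call with `False` keeps only `/index.html`.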

Page 39: Proxies

39

Proxy URLs Differ from Server URLs

(c) Surrogate (reverse proxy) request: the server hostname points to the surrogate proxy, so the client sends an ordinary server request.

GET /index.html HTTP/1.0
User-agent: SuperBrowser v1.3

(d) Intercepting proxy request: the client sends an ordinary server request, which an intercepting proxy picks up on its way to the origin server.

GET /index.html HTTP/1.0
User-agent: SuperBrowser v1.3

Page 40: Proxies

40

URL Resolution Without a Proxy

[Figure: browser, DNS server, and the web server www.ncnu.edu.tw.]

(1) User types “ncnu” into the browser’s URI location window.

(2a) Browser looks up host “ncnu” via DNS.

(2b) Failed, host unknown.

(3a) The browser does auto-expansion, converting “ncnu” into “www.ncnu.edu.tw”.

(3b) Browser looks up host “www.ncnu.edu.tw” via DNS.

(3c) Success! Get IP addresses back.

(4a) Browser tries to connect to the IP addresses, one by one, until a connection succeeds.

(4b) Success; connection established.

(5a) Browser sends HTTP request.

(5b) Browser gets HTTP response.

Page 41: Proxies

41

URL Resolution with an Explicit Proxy

[Figure: browser, DNS server, proxy, and the web server www.ncnu.edu.tw.]

(1) User types “ncnu” into the browser’s URI location window.

(2a) Proxy is explicitly configured, so the browser looks up the address of the proxy server using DNS.

(2b) Success! Get proxy server IP addresses.

(3a) Browser tries to connect to proxy.

(3b) Success; connection established.

(4a) Browser sends HTTP request:

GET http://ncnu/ HTTP/1.0
Proxy-connection: keep-Alive
User-Agent: Mozilla/4.0
Host: ncnu
Accept: */*
Accept-encoding: gzip
Accept-language: en
Accept-charset: iso-8859-1,*,utf-8

(4b) Proxy gets a partial hostname in the request, because the client did not auto-expand it.

Page 42: Proxies

42

URL Resolution with an Intercepting Proxy

[Figure: client, DNS server, interceptor, proxy, and the web server www.ncnu.edu.tw, with steps labeled (1), (2a), (2b), (3a), (3b), (3c), (4a), (4b), and (5a).]

Page 43: Proxies

43

Tracing Messages

[Figure: a request travels from the client through an ISP proxy, across the Internet, and through a surrogate cache bank to the web server.]

Today, it’s not uncommon for web requests to go through a chain of two or more proxies on their way from the client to the server.

It’s important to trace the flow of messages across proxies and to detect any problems.

Page 44: Proxies

44

The Via Header

The Via header is used to track the forwarding of messages, diagnose message routing loops, and identify the protocol capabilities of all senders along the request/response chain.

It lists information about each intermediate node (proxy or gateway) through which a message passes.

Each time a message goes through another node, the intermediate node must be added to the end of the Via list.

Page 45: Proxies

45

The Via Header

[Figure: the client’s request passes through proxy1.ncnu.edu.tw (HTTP/1.1) and then proxy2.ncnu.edu.tw (HTTP/1.0) before reaching the server.]

Request message (as received by server):

GET /index.html HTTP/1.0
Accept: text/html
Host: www.csie.ncnu.edu.tw
Via: 1.1 proxy1.ncnu.edu.tw, 1.0 proxy2.ncnu.edu.tw
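A minimal sketch of how each node might append itself to the end of the Via list, assuming headers are held as (name, value) pairs. The function name and parameters are illustrative.

```python
def add_via(headers: list[tuple[str, str]], protocol_version: str,
            node: str) -> list[tuple[str, str]]:
    """Return a copy of headers with this node appended to the end of the Via list."""
    result = list(headers)
    entry = f"{protocol_version} {node}"
    via_positions = [i for i, (name, _) in enumerate(result) if name.lower() == "via"]
    if via_positions:
        # Extend the last Via header so the overall order of waypoints is kept.
        last = via_positions[-1]
        name, value = result[last]
        result[last] = (name, f"{value}, {entry}")
    else:
        result.append(("Via", entry))
    return result
```

Calling it once for proxy1.ncnu.edu.tw with version 1.1 and once for proxy2.ncnu.edu.tw with version 1.0 yields the Via value shown in the request above.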

Page 46: Proxies

46

The response Via is usually the reverse of the request Via

[Figure: client, proxies A, B, and C, and server in a chain.]

Request Via header:

Via: 1.1 A, 1.1 B, 1.1 C

Response Via header:

Via: 1.1 C, 1.1 B, 1.1 A

Page 47: Proxies

47

Via and gateways

Some proxies provide gateway functionality to servers that speak non-HTTP protocols.

The Via header records these protocol conversions, so HTTP applications can be aware of protocol capabilities and conversions along the proxy chain.

Page 48: Proxies

48

Via and gateways

[Figure: the client sends an HTTP request to proxy1.ncnu.edu.tw (HTTP/1.1), which exchanges an FTP request and FTP response with www.ncnu.edu.tw and returns an HTTP response to the client.]

HTTP request message sent to proxy:

GET ftp://www.ncnu.edu.tw/pub/welcome.txt HTTP/1.0

HTTP response message:

HTTP/1.0 200 OK
Date: Sun, 12 Dec 2003 21:01:59 GMT
Via: FTP/1.0 proxy.ncnu.edu.tw (Traffic-Server/5.0.1-17882[cMsf])
Last-modified: Sun, 12 Dec 2003 21:05:24 GMT
Content-type: text/plain

Hi there. This is an FTP server.

Page 49: Proxies

49

The Server and Via headers

The Server response header field describes the software used by the origin server.

Server: Apache/1.3.14 (UNIX) PHP/4.0.4
Server: Netscape-Enterprise/4.1
Server: Microsoft-IIS/5.0

If a response message is being forwarded through a proxy, make sure the proxy does not modify the Server header.

The Server header is meant for the origin server. Instead, the proxy should add a Via entry.

Page 50: Proxies

50

Privacy and security implications of Via

There are some cases when we don’t want exact hostnames in the Via string.

For example, when a proxy server is part of a network firewall it should not forward the names and ports of hosts behind the firewall, because knowledge of network architecture behind a firewall might be of use to a malicious party.

A proxy can disable the Via node-name forwarding, replacing the hostname with an appropriate pseudonym.

For strong privacy requirements, a proxy may combine an ordered sequence of Via waypoint entries (with the same protocol version) into a single, joined entry. For example,

Via: 1.0 foo, 1.1 devirus.com, 1.1 access-logger.com

could be collapsed to

Via: 1.0 foo, 1.1 concealed-stuff
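A sketch of the joining rule, assuming the proxy knows which hostnames it wants to hide. The function name and parameters are hypothetical, and the parsing is simplified: it ignores the optional comments that Via entries may carry.

```python
def collapse_via(via_value: str, hidden_hosts: set[str], pseudonym: str) -> str:
    """Replace hidden hosts with a pseudonym and join adjacent same-version runs."""
    collapsed: list[tuple[str, str]] = []
    for entry in via_value.split(","):
        version, host = entry.split()[:2]
        if host in hidden_hosts:
            host = pseudonym
            # Merge with the previous entry only if it is the same pseudonym
            # at the same protocol version.
            if collapsed and collapsed[-1] == (version, host):
                continue
        collapsed.append((version, host))
    return ", ".join(f"{version} {host}" for version, host in collapsed)
```

Entries with different protocol versions stay separate even when both are hidden, because joining them would lose the version information that Via is meant to carry.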

Page 51: Proxies

51

The TRACE method

Proxy servers can change messages as the messages are forwarded. Headers are added, modified, and removed, and bodies can be converted to different formats.

As proxies become more sophisticated, and more vendors deploy proxy products, interoperability problems increase.

We need a way to watch how messages are changed, hop by hop.

HTTP/1.1’s TRACE method is for this purpose. It is very useful for debugging proxy flows.

It can trace a request message through a chain of proxies, observing what proxies the message passes through and how each proxy modifies the request message.

Page 52: Proxies

52

The TRACE method

When the TRACE request reaches the destination server, the entire request message is reflected back to the sender, bundled up in the body of an HTTP response.

When the TRACE response arrives, the client can examine the exact message the server received and the list of proxies through which it passed (in the Via header).

The TRACE response has Content-Type: message/http and a 200 OK status.
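The reflection step at the destination can be sketched as follows. The function name is illustrative, and the sketch builds only the minimal headers named above.

```python
def reflect_trace(received_request: bytes) -> bytes:
    """Build the TRACE response: 200 OK with the received request as the body."""
    head = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: message/http\r\n"
        f"Content-Length: {len(received_request)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + received_request
```

Whatever the proxies added or changed on the way in, including the Via header, appears unmodified in the response body, which is what makes TRACE useful for debugging.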

Page 53: Proxies

53

The TRACE Method

[Figure: the client’s TRACE request passes through Proxy1 (proxy.ncnu.edu.tw), Proxy2 (proxy2.ncnu.edu.tw), and Proxy3 (proxy3.ncnu.edu.tw) to the server www.ncnu.edu.tw, and the TRACE response returns along the same path.]

TRACE request:

TRACE /index.html HTTP/1.1
Host: www.ncnu.edu.tw
Accept: text/html

TRACE response (the body is the received request):

HTTP/1.1 200 OK
Content-Type: message/http
Content-Length: 269
Via: 1.1 proxy3.ncnu.edu.tw, 1.1 proxy2.ncnu.edu.tw, 1.1 proxy1.ncnu.edu.tw

TRACE /index.html HTTP/1.1
Host: www.ncnu.edu.tw
Accept: text/html
Via: 1.1 proxy.ncnu.edu.tw, 1.1 proxy2.ncnu.edu.tw, 1.1 proxy3.ncnu.edu.tw
X-Magic-CDN-Thingy: 134-AF-003
Cookie: accept-isp="hinet's ISP, Puli"
Client-ip: 163.22.3.4

Page 54: Proxies

54

Max-Forwards

Normally, TRACE messages travel all the way to the destination server, regardless of the number of intervening proxies.

We can use the Max-Forwards header to limit the number of proxy hops for TRACE and OPTIONS requests, which is useful for:

Testing whether a chain of proxies is forwarding messages in an infinite loop.

Checking the effects of a particular proxy in the middle of a chain.

If the Max-Forwards value is zero, the receiver must reflect the TRACE message back toward the client (a mechanism similar to the TTL in IP datagrams). Otherwise, the receiver decrements the Max-Forwards value by one before forwarding the message.
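The rule above can be sketched as a small decision function that a proxy might apply to each incoming request. The function name and the "respond"/"forward" labels are illustrative.

```python
from typing import Optional


def next_max_forwards(method: str, max_forwards: Optional[str]) -> tuple[str, Optional[str]]:
    """Decide what a proxy does with a request, based on Max-Forwards.

    Returns ("respond", None) when this proxy must answer itself, or
    ("forward", new_value) with the header value to send onward.
    """
    # Max-Forwards applies to TRACE and OPTIONS; pass anything else along untouched.
    if method.upper() not in {"TRACE", "OPTIONS"} or max_forwards is None:
        return "forward", max_forwards
    remaining = int(max_forwards)
    if remaining == 0:
        return "respond", None
    return "forward", str(remaining - 1)
```

With an initial value of 2, the first proxy forwards with 1, the second forwards with 0, and the third responds itself, so the request never reaches the destination server.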

Page 55: Proxies

55

Max-Forwards

[Figure: the client sends a TRACE request with Max-Forwards: 2 toward the server www.ncnu.edu.tw. Proxy1 (proxy.ncnu.edu.tw) forwards it with Max-Forwards: 1, Proxy2 (proxy2.ncnu.edu.tw) forwards it with Max-Forwards: 0, and Proxy3 (proxy3.ncnu.edu.tw) returns the TRACE response instead of forwarding the request to the server.]

TRACE request:

TRACE /index.html HTTP/1.1
Host: www.ncnu.edu.tw
Max-Forwards: 2
Accept: text/html

TRACE response (the body is the received request):

HTTP/1.1 200 OK
Content-Type: message/http
Content-Length: 269
Via: 1.1 proxy2.ncnu.edu.tw, 1.1 proxy1.ncnu.edu.tw

TRACE /index.html HTTP/1.1
Host: www.ncnu.edu.tw
Accept: text/html
Via: 1.1 proxy.ncnu.edu.tw, 1.1 proxy2.ncnu.edu.tw
X-Magic-CDN-Thingy: 134-AF-003
Cookie: accept-isp="hinet's ISP, Puli"
Client-ip: 163.22.3.4

Page 56: Proxies

56

Proxy Authentication

Proxies can serve as access-control devices.

HTTP defines a mechanism called proxy authentication that blocks requests for content until the user provides valid access permission credentials to the proxy.

We will talk more about HTTP authentication in later lectures (chap12).

Page 57: Proxies

57

Proxy Authentication

[Figure: client, access control proxy, and server.]

(a) The client requests a restricted resource through the proxy:

GET http://www.ncnu.edu.tw/secret.jpg HTTP/1.0

(b) The access control proxy rejects the request and asks for credentials:

HTTP/1.0 407 Proxy Authentication Required
Proxy-Authenticate: Basic realm="Secure Stuff"

(c) The client sends the request again, this time with credentials:

GET http://www.ncnu.edu.tw/secret.jpg HTTP/1.0
Proxy-Authorization: Basic YadNfddZws==

Page 58: Proxies

58

Proxy Authentication

(d) With valid credentials, the access control proxy forwards the request and relays the server’s response (the super secret image) to the client:

HTTP/1.0 200 OK
Content-type: image/jpeg

…<image data included>…
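A sketch of both sides of the Basic scheme used in this exchange: the client encodes "username:password" in base64, and the proxy decodes and checks it. The function names and the in-memory `accounts` table are assumptions for illustration. Basic credentials are only encoded, not encrypted.

```python
import base64


def proxy_authorization(username: str, password: str) -> str:
    """Build the value of a Proxy-Authorization header for the Basic scheme."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def credentials_valid(header_value: str, accounts: dict[str, str]) -> bool:
    """Return True when the header carries a known username with its password."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        return False
    try:
        username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    except ValueError:
        # Covers malformed base64 and undecodable bytes.
        return False
    return accounts.get(username) == password
```

A proxy built on this check would answer 407 with a Proxy-Authenticate header whenever `credentials_valid` returns False, and forward the request otherwise.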

Page 59: Proxies

59

Proxy Interoperation

Clients, servers, and proxies are built by multiple vendors, to different versions of the HTTP specification.

Proxy servers need to intermediate between client-side and server-side devices, which may implement different protocols and have different bugs (quirks).

Handling unsupported headers and methods: a proxy must forward unrecognized header fields and must maintain the relative order of header fields with the same name.

The OPTIONS method is used to discover optional feature support.

Page 60: Proxies

60

OPTIONS: Discovering Optional Feature Support

[Figure: client, proxy, and server.]

Request:

OPTIONS * HTTP/1.1

Response:

HTTP/1.1 200 OK
Allow: GET,PUT,POST,HEAD,TRACE,OPTIONS

Page 61: Proxies

61

OPTIONS

If the URI is a real resource, the OPTIONS request inquires about the features available to that particular resource.

OPTIONS http://www.joes-heardware.com/index.html HTTP/1.1

Page 62: Proxies

62

For More Information

http://www.w3.org/Protocols/rfc2616/rfc2616.txt
“Hypertext Transfer Protocol,” by R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, and T. Berners-Lee.

ftp://ftp.rfc-editor.org/in-notes/rfc3040.txt
“Internet Web Replication and Caching Taxonomy.”

ftp://ftp.rfc-editor.org/in-notes/rfc3143.txt
“Known HTTP Proxy/Caching Problems.”

Web Proxy Servers, by Ari Luotonen, Prentice Hall Computer Books.

Web Caching, by Duane Wessels, O’Reilly & Associates, Inc.