HTTP
IT Engineering
Instructor: Rezvan Shiravi


Questions
- Q.1) How do the web server and the client browser talk to each other?
- Q.2) What is the common protocol?
- Q.3) How are resources identified?
- Q.4) What are requests & responses?
- Q.5) Can/Should the server know its clients?
- Q.6) Who can influence the communication between server & client?
- Q.7) Is everything public?

Common Protocols
- For two remote machines to "understand" each other, they should
  - speak the same language
  - coordinate their talk
- The solution is to use protocols
- Examples:
  - FTP: File Transfer Protocol
  - SMTP: Simple Mail Transfer Protocol
  - NNTP: Network News Transfer Protocol
  - HTTP: HyperText Transfer Protocol

Internet Protocol Layers
- The Internet protocols are generally divided into four layers:
  - Application, e.g., HTTP
  - Transport, e.g., TCP
  - Network, e.g., IP
  - Link, e.g., Ethernet
Note: What are the 4 layers?

HTTP
- HyperText Transfer Protocol
- An application protocol for sending hypertext (e.g. HTML)
- Can also transport non-hypertext data
  - Have you downloaded files via HTTP?
- A web browser is an HTTP client
  - There are HTTP clients other than browsers; the technical term is user agent
- A web server is an HTTP server
Note: Other HTTP clients?

Key HTTP Terms
- User agent: browsers, spiders
- Client and server systems
- Connection: TCP
- Message
- Resource: an object or service, identified by URI/URN/URL
- Entity: representation of a resource, with a header and a body


Note: URI/URN/URL?

An HTTP Session
- A basic HTTP session has 4 phases:
  1. Client opens the connection (a TCP connection)
  2. Client makes a request
  3. Server sends a response
  4. Server closes the connection
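The four phases can be walked through in a few lines of Python. This is only a sketch under stated assumptions: `socket.socketpair()` stands in for a real TCP connection so that nothing touches the network, and the names `serve_once` and `run_session`, the host `www.example.com`, and the `hello` body are illustrative, not part of the slides.

```python
import socket
import threading


def serve_once(conn):
    """Toy server: read one request, send one response, close the connection."""
    request = b""
    while b"\r\n\r\n" not in request:  # headers end at the first blank line
        chunk = conn.recv(1024)
        if not chunk:
            break
        request += chunk
    body = b"hello"
    # Phase 3: server sends a response
    conn.sendall(
        b"HTTP/1.0 200 OK\r\n"
        b"Content-Type: text/plain\r\n"
        b"Content-Length: " + str(len(body)).encode() + b"\r\n\r\n" + body
    )
    # Phase 4: server closes the connection
    conn.close()


def run_session():
    # Phase 1: "open the connection" (assumption: a socket pair replaces TCP)
    client, server = socket.socketpair()
    worker = threading.Thread(target=serve_once, args=(server,))
    worker.start()
    # Phase 2: client makes a request
    client.sendall(b"GET /index.html HTTP/1.0\r\nHost: www.example.com\r\n\r\n")
    # Read until the server closes its end
    response = b""
    while True:
        chunk = client.recv(1024)
        if not chunk:
            break
        response += chunk
    worker.join()
    client.close()
    return response
```

Calling `run_session()` returns the raw response bytes, status line first and body last.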

HTTP Transaction
[Figure: HTTP transaction]

Stateless Protocol
- HTTP is a stateless protocol
- Once a server has delivered the requested data to a client, the server retains no memory of what has just taken place (even if the connection is keep-alive)
- High performance & low complexity
- What are the difficulties in working with a stateless protocol?
  - How would you implement a site for buying some items?
- So why don't we have state in HTTP?
Note: Stateless protocol?

TCP Connection for an HTTP Connection
- Before systems can exchange HTTP messages, they must establish a TCP connection (steps 1, 2, and 3).
- Once the TCP connection is available, the client sends the server an HTTP request.
- The final two steps, 6 and 7, show the closing of the TCP connection.


Persistent Connections
- HTML pages often contain a number of elements (e.g. images) that all have to be fetched from the server
  - Each one requires a separate HTTP request
- Opening a TCP connection for each request is slow and taxes both the client and the server
  - TCP uses a three-way handshake when establishing a connection, so there is significant latency in establishing a connection
  - Client sends SYN, server replies SYN/ACK, client responds with ACK
- Persistent connections: multiple requests on one TCP connection

Persistent Connections (cont.)
- The client uses the Connection header to request that the connection be kept open
    Connection: Keep-Alive
- After the last request, the client requests that the connection be closed
    Connection: Close
- Otherwise, the server will keep it open "forever"

Persistent Connections (cont.)
- A client can issue many HTTP requests over a single TCP connection.
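How a server might read the Connection header can be sketched as a small decision function. The function name and the plain-dict header representation are assumptions for illustration. The HTTP/1.1 branch reflects the fact that HTTP/1.1 connections are persistent by default, whereas an HTTP/1.0 client has to ask with Keep-Alive.

```python
def wants_keep_alive(version, headers):
    """Decide whether the connection should stay open after this request.

    `headers` is assumed to be a dict keyed by the exact name "Connection".
    """
    raw = headers.get("Connection", "")
    tokens = {t.strip().lower() for t in raw.split(",") if t.strip()}
    if "close" in tokens:
        return False  # the client asked for the connection to be closed
    if version == "HTTP/1.1":
        return True  # persistent by default in HTTP/1.1
    return "keep-alive" in tokens  # HTTP/1.0 needs an explicit request
```

For example, `wants_keep_alive("HTTP/1.0", {"Connection": "Keep-Alive"})` is true, while the same call without the header is false.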


Pipelining
- HTTP pipelining allows the user agent to issue requests for multiple items without waiting for responses to arrive
- Overlaps requests and responses
- Supported in Firefox & IE ver. 7.0

[Figure: an HTTP request for http://www.therationaledge.com passes through a proxy server to the web server (www.therationaledge.com:80) and its file system; the HTTP response returns along the same path]

[Figure: a chain of proxies in front of the web server www.therationaledge.com:80: department proxy server, university proxy server, Iran proxy server]

Intermediate HTTP Systems
- Proxy: for performance enhancement; caches pages
- Gateway: for security; used as a firewall boundary
- Tunnel: simple relay
- Origin server: holds the original content

Intermediate HTTP Systems (cont.)
[Figure: intermediate HTTP systems]

Note: Proxy, gateway?

Resources
- A resource is a chunk of information
- A resource can be
  - a file
  - a dynamically created page
- What we see in the browser can be a combination of several resources
- Each resource must be identified uniquely: URI (Uniform Resource Identifier)
- The common practical URI is the URL (Uniform Resource Locator)

Uniform Resource Identifiers
- Also called URL, where L == "Locator"
- Three parts:
  - protocol: http, ftp, telnet, gopher, file
  - host: a DNS host name or IP address
  - file: Unix syntax file name
- Protocol and host are optional
Note: File? Optional?

URL
- General form: scheme://user:password@host:port/path?query#frag
- Examples:
  - http://www.aut.ac.ir
  - ftp://kernel.org/pub
  - http://www.bing.com/search?q=web&go=&qs=n&form=QBLH&pq=web&sc=0-0&sp=-1
  - file://c:\windows\
  - file:///home/bahador/work
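To see the URL components concretely, Python's standard `urllib.parse.urlsplit` can take a URL apart. This is a small sketch; the helper name `url_parts` and the sample URL with user `ali` and password `secret` are made up for illustration.

```python
from urllib.parse import urlsplit


def url_parts(url):
    """Return the URL components as a plain dict."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,  # int, or None when the URL has no explicit port
        "path": parts.path,
        "query": parts.query,
        "frag": parts.fragment,
    }


# Hypothetical URL that exercises every component
example = url_parts(
    "http://ali:secret@www.example.com:8080/docs/paper.html?name=ali#results"
)
```

Components that are absent come back as `None` or as an empty string, so `url_parts("ftp://kernel.org/pub")["port"]` is `None`.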

URL
- Scheme: the application-layer protocol
  - HTTP: the web protocol
  - HTTPS: secure HTTP
  - FTP: File Transfer Protocol
  - file: access to a local file
  - mailto: send email to the given address

URL
- Path: the path of the object on the specified host, relative to the web server's (document) root directory
- E.g. with web server root directory /var/www/:
  - http://www.example.com/1.html -> /var/www/1.html
  - http://www.example.com/1/2/3.jpg -> /var/www/1/2/3.jpg
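The mapping from URL path to file can be sketched as follows. The function name is hypothetical, and the normalization step (dropping `..` segments that would climb above the root) is an assumption about what a careful server would do, not something the slides state.

```python
import posixpath
from urllib.parse import unquote, urlsplit

DOCUMENT_ROOT = "/var/www"  # root directory taken from the slide's example


def url_to_file_path(url, root=DOCUMENT_ROOT):
    """Map the path component of a URL to a file under the document root."""
    path = unquote(urlsplit(url).path)
    # Normalize against "/" so that ".." segments cannot escape the root
    clean = posixpath.normpath("/" + path.lstrip("/"))
    return posixpath.join(root, clean.lstrip("/"))
```

With the default root, `url_to_file_path("http://www.example.com/1/2/3.jpg")` gives `/var/www/1/2/3.jpg`.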

URL
- Query: a mechanism for passing information from the client to active pages or forms
  - Filling in information in a university registration form
  - Asking Google to search for a phrase
- Starts with ?
- & separates multiple parameters
- Example: http://www.example.com/submit.php?name=ali&famility=karimi
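A short sketch of reading and building query strings with the standard library. The helper names and the search URL passed to `add_query` are illustrative assumptions; `parse_qs` and `urlencode` are the real `urllib.parse` functions.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def query_parameters(url):
    """Return the query parameters as a dict mapping each name to a list of values."""
    return parse_qs(urlsplit(url).query)


def add_query(base_url, params):
    """Append URL-encoded parameters to a URL that has no query yet."""
    return base_url + "?" + urlencode(params)


params = query_parameters("http://www.example.com/submit.php?name=ali&famility=karimi")
search_url = add_query("http://www.example.com/search", {"q": "web server", "lang": "en"})
```

Each value is a list because a parameter name may appear more than once in a query.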

URL
- Frag: a name for a part of a resource
  - A section in a document
  - http://www.example.com/paper.html#results
- Handled by the browser
  - The browser gets the whole resource (document) from the server
  - At display time, it jumps to the specified part
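Because the fragment is handled by the browser, it never appears in the request sent to the server. A minimal sketch of that split, with a hypothetical helper name:

```python
from urllib.parse import urlsplit


def request_target(url):
    """Return (what goes in the request line, what the browser keeps for itself)."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target, parts.fragment


target, fragment = request_target("http://www.example.com/paper.html#results")
```

Here `target` is what the server sees, and `fragment` is used only at display time.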

HTTP Protocol
- A complete request contains one or more lines of text
- The first line (the request-line) contains the request:
    Method Path Protocol
    GET /index.html HTTP/1.1
- Additional lines are headers (later)
- The response consists of a status line, headers, and the content
- The first line (the status-line) looks like:
    Protocol status-code message
    HTTP/1.1 200 OK

HTTP Request Methods
- GET: gets a document
- HEAD: like GET, but returns just the headers
- POST: transfers a block of data with the request
- DELETE: removes the resource
- PUT: stores the message body on the server as the specified resource
- TRACE: the server echoes back the received message
  - For troubleshooting & debugging

HTTP Request Methods (cont.)
- OPTIONS: requests the list of methods the server supports on the resource
- CONNECT: creates an HTTP tunnel
  - The client asks the server (which is a proxy/gateway) to create a TCP connection to the specified destination
  - After the TCP connection is established, all data sent on the TCP connection between client & server is copied to the newly established TCP connection
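The request-line and status-line formats, together with the methods listed above, can be checked with a pair of small parsers. This is a sketch with hypothetical function names; it assumes single spaces between fields and does no further validation.

```python
METHODS = {"GET", "HEAD", "POST", "PUT", "DELETE", "TRACE", "OPTIONS", "CONNECT"}


def parse_request_line(line):
    """Split 'Method Path Protocol' and reject methods not listed in the slides."""
    method, path, protocol = line.strip().split(" ")
    if method not in METHODS:
        raise ValueError("unknown method: " + method)
    return method, path, protocol


def parse_status_line(line):
    """Split 'Protocol status-code message'; the message may contain spaces."""
    protocol, code, message = line.strip().split(" ", 2)
    return protocol, int(code), message
```

`parse_request_line("GET /index.html HTTP/1.1")` yields the three fields, and `parse_status_line("HTTP/1.1 200 OK")` returns the code as an integer.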

GET Request
- A request to get a resource from the Web
- The most frequently used method
- The request has no message body
- Parameters can be sent in the request URL
  - http://www.google.com/search?sourceid=navclient-ff&ie=UTF-8&rls=GGGL,GGGL:2006-30,GGGL:en&q=iran

HEAD Request
- A HEAD request asks the server to return the response headers only, not the actual resource (no message body)
- Useful for checking characteristics of a resource without actually downloading it, thus saving bandwidth
  - File size
- Testing hypertext links for
  - Validity
  - Accessibility
  - Recent modifications

POST Request
- A POST request can send data to the server
- POST is mostly used in form filling
  - The data filled into the form is translated by the browser into a special format and sent to the program on the server using the POST command

POST Request (cont.)
- Sends a block of data with the request, in the message body
- Usually there are extra headers to describe this message body, like Content-Type and Content-Length
- The URL is the URL of a program that handles the sent data, not a simple HTML file
- The HTTP response is normally the output of a program, not a static file

POST Example
Here's a typical form submission, using POST:

    POST /path/register.cgi HTTP/1.0
    From: [email protected]
    User-Agent: MyOwnHTTPTool-MyBrowser/1.0.0.8.9
    Content-Type: application/x-www-form-urlencoded
    Content-Length: 39

    home=Tehran+No.13&favorite+flavor=flies

Note: Content-Type?

Common HTTP Statuses
- Successful 2xx
  - 200 - OK
  - 201 - Created
  - 202 - Accepted
- Redirection 3xx
  - 302 - Temporarily moved (redirect)
- Client Error 4xx
  - 404 - Not Found
  - 401 - Unauthorized

Common HTTP Statuses (cont.)
- Server Error 5xx
  - 500 - Internal Server Error
    - Common with broken CGI programs
  - 502 - Bad Gateway
    - The server, while acting as a gateway or proxy, received an invalid response from the upstream server it accessed in attempting to fulfill the request.
  - 503 - Service Unavailable
    - The server is currently unable to handle the request due to temporary overloading or maintenance of the server.
Note: Bad Gateway?

Sample Request

    GET /tbassemi/ HTTP/1.1
    Host: pages.google.com
    User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.7) Gecko/20060909 Firefox/1.5.0.7
    Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
    Accept-Language: en-us,en;q=0.5
    Accept-Encoding: gzip,deflate
    Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
    Keep-Alive: 300
    Proxy-Connection: keep-alive
    Cookie: __utmz=76351879.1158993258.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none); __utma=76351879.1503535206.1158993258.1159624300.1159625890.4; PHPSESSID=g7h5h7jfav0gvjcq9mai5sbao3
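The `q=` values in the Accept headers of the sample request are quality values: a weight between 0 and 1 that states how much the client prefers each alternative, with 1 assumed when no `q` is given. A minimal sketch of reading them follows. The function name is hypothetical, and real headers can carry other parameters that this sketch ignores.

```python
def parse_accept(header):
    """Return (media_type, q) pairs, most preferred first."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        q = 1.0  # default quality when no q parameter is present
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        prefs.append((parts[0], q))
    # sorted() is stable, so entries with equal q keep their header order
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


prefs = parse_accept(
    "text/xml,application/xml,application/xhtml+xml,"
    "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5"
)
```

In this header `image/png` outranks `text/html` even though it appears later, because it carries the default weight of 1.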

Note: HTTP request? q? deflate?

Sample Response

    HTTP/1.1 200 OK
    Proxy-Connection: Keep-Alive
    Connection: Keep-Alive
    Date: Tue, 10 Oct 2006 11:17:09 GMT
    Server: Apache/2.0
    X-Powered-By: PHP/5.1.2
    Content-Length: 3489
    Content-Type: text/html; charset=UTF-8

    Personal Page
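A response like the one above can be split into status line, headers, and body, and its status code mapped to one of the classes from the status slides. This is a sketch with hypothetical names; it assumes CRLF line endings and header names that appear only once.

```python
# Only the status classes listed in the slides
STATUS_CLASSES = {2: "Successful", 3: "Redirection", 4: "Client Error", 5: "Server Error"}


def parse_response(raw):
    """Split a raw response string into (protocol, code, message, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")  # the blank line ends the headers
    lines = head.split("\r\n")
    protocol, code, message = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return protocol, int(code), message, headers, body


def status_class(code):
    return STATUS_CLASSES.get(code // 100, "Unknown")


raw = (
    "HTTP/1.1 200 OK\r\n"
    "Server: Apache/2.0\r\n"
    "Content-Type: text/html; charset=UTF-8\r\n"
    "\r\n"
    "Personal Page"
)
protocol, code, message, headers, body = parse_response(raw)
```

Header names are lowercased on the way in because they are case-insensitive in HTTP.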

HTTP Request
- Three components:
  - Request line: method path version
  - Headers (optional)
    - General headers
    - Request headers: info about the client
    - Entity headers: info about the document being sent
  - Blank line (CRLF)
  - Entity body (optional)

Format of a Request

    method SP URL SP version CRLF
    header: value CRLF          (header lines)
    header: value CRLF
    CRLF
    entity body

Request Example

    GET /index.html HTTP/1.1
    Accept: image/gif, image/jpeg
    User-Agent: Mozilla/4.0
    Host: ce.aut.ac.ir:80
    Connection: Keep-Alive
    [blank line here]

The first line holds the method, the request URL, and the version; the remaining lines are headers.

HTTP Response
- Three components:
  - Status line: protocol status-code status-message
  - Headers (optional)
    - General headers
    - Response headers: info about the server
    - Entity headers: info about the document being sent
  - Blank line (CRLF)
  - Entity body (optional)
Note: Headers?

Redirection
- Sometimes the web server redirects the client to visit another page instead
  - Different from an HTML redirect with a META tag
- This is just a normal HTTP response
  - Status code 3xx
  - The Location header tells us the new location
- The client requests the new location

Sample Redirection Response

    HTTP/1.1 302 Moved Temporarily
    Server: Netscape-Enterprise/6.0
    Date: Tue, 05 Apr 2005 20:02:38 GMT
    Location: http://www.aut.ac.ir/
    Content-length: 0
    Content-type: text/html
    Connection: close

Cookies
- HTTP is a stateless protocol
  - The server does not remember its clients
- How to personalize pages (personal portal)?
  - Use HTTP headers: Client-ip, From, ...
    - Not usually sent by browsers
  - Find the client IP address from the TCP connection
    - NAT: Network Address Translation (NAT) is the process of modifying IP address information in IPv4 headers while in transit across a traffic routing device

- Identifier assigned by the server to each user/session
- Cookies are simply chunks of data sent from a web server to a client, with the expectation that the client will resubmit the data in any subsequent request to the server
- The client browser isn't intended to read or understand the data; it just passes it back

Cookies
- Types
  - Session cookies: to identify a session
  - Persistent cookies: to identify a client
- How it works
  - The server asks the client to remember the ID
    - Set-Cookie header in the response message
  - The client gives the ID back to the server in each request
    - Cookie header in request messages
  - The server customizes responses according to the cookie
- Limitation: JavaScript code may attempt to read the contents of cookies, which is a security concern

Cookies (cont.)
- If the client "supports cookies", it sends the Cookie header in subsequent requests
  - Specifies name-value pairs
    Cookie: name="B Assemi"; phone="21-6454"; PHPSESSID=g7h5h7jfav0gvjcq9mai5sbao3
- A server can set any number of cookies in a response
- The client sends them all in subsequent requests
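The round trip described above (the server sets cookies, the client stores them and sends them all back) can be sketched with plain string handling. The function names, the dict used as a cookie jar, and the `sid`/`theme` cookies are assumptions for illustration. A real client would also honor attributes such as Path and Expires, which this sketch parses but does not enforce.

```python
def parse_set_cookie(header_value):
    """Split a Set-Cookie value into (name, value, attributes)."""
    first, *rest = [part.strip() for part in header_value.split(";")]
    name, _, value = first.partition("=")
    attributes = {}
    for part in rest:
        if not part:
            continue
        key, _, val = part.partition("=")
        attributes[key.strip().lower()] = val.strip()
    return name.strip(), value.strip(), attributes


def build_cookie_header(jar):
    """Join every stored name=value pair into one Cookie header value."""
    return "; ".join(f"{name}={value}" for name, value in jar.items())


# Hypothetical Set-Cookie headers received in one response
jar = {}
for set_cookie in ["sid=abc123; Path=/", "theme=dark; Path=/; Max-Age=3600"]:
    name, value, _attributes = parse_set_cookie(set_cookie)
    jar[name] = value  # the client stores the pair without interpreting it

cookie_header = build_cookie_header(jar)
```

The client treats each value as opaque: it stores the pair and replays it without interpreting it.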

Cookie Example
Server response:

    HTTP/1.0 200 OK
    Set-Cookie: acct=04382374; domain=.amazon.com; Expires=Sun, 16-Feb-2006 04:38:14 GMT; Path=/

Client request:

    GET /order.pl HTTP/1.0
    Cookie: acct=04382374

Caching

Proxy server
[Figure: proxy server]

Multiple proxies
[Figure: multiple proxies]

Cache servers
- Cache servers are proxy servers that
  - relay requests and responses
  - keep a local copy of any responses they receive

Caching
- Benefits
  - Reduces redundant data transfer
  - Reduces network bottlenecks
  - Reduces load on the server
  - Reduces delay
- Some things shouldn't be cached
- Client and server cooperate on caching

Caching (cont.)
- Some things may be cached for a limited time
  - An object's lifetime is specified by the server
    - Expires header: absolute expiration time
    - Cache-Control: max-age: relative expiration time
- If the requested object has not expired
  - The cache server gives it to the client
- If the requested object has expired
  - Its freshness must be checked
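The expiration rule can be sketched as a small freshness check. The function names and the plain-dict headers are assumptions. The sketch lets max-age take precedence over Expires when both are present, and it ignores other caching inputs such as the Age header.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def expiry_time(headers, stored_at):
    """Return when a cached response expires, or None if no lifetime was given."""
    for directive in headers.get("Cache-Control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            # Relative lifetime, counted from the time the copy was stored
            return stored_at + timedelta(seconds=int(value))
    if "Expires" in headers:
        # Absolute expiration time in HTTP date format
        return parsedate_to_datetime(headers["Expires"])
    return None


def is_fresh(headers, stored_at, now):
    """True if the cache may hand its copy to the client without rechecking."""
    expires_at = expiry_time(headers, stored_at)
    return expires_at is not None and now < expires_at


stored = datetime(2006, 10, 10, 11, 17, 9, tzinfo=timezone.utc)
```

When `is_fresh` returns false, the cache server has to check the object's freshness with the origin server before reusing its copy.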
