51
Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html 1 of 101 4/10/2007 9:53 AM Table of Contents | All Slides | Link List | CSCI E-12 Hypertext Transfer Protocol April 10, 2007 Harvard University Division of Continuing Education Extension School Course Web Site: http://cscie12.dce.harvard.edu/ Copyright 1998-2007 David P. Heitmeyer Instructor email: [email protected] Course staff email: [email protected] Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html 2 of 101 4/10/2007 9:53 AM HyperText Transfer Protocol GET / view plain print ? minerva% telnet www.npr.org 80 1. Trying 216.35.221.77... 2. Connected to www.npr.org. 3. Escape character is '^]'. 4. GET / HTTP/1.1 5. Host: www.npr.org 6. 7. HTTP/1.1 200 OK 8. Date: Tue, 10 Apr 2007 20:07:33 GMT 9. Server: Apache 10. Set-Cookie: Apache=140.247.197.240.289451144786054516; path=/ 11. Cache-Control : max-age=0 12. Expires: Tue, 10 Apr 2007 20:07:33 GMT 13. Transfer-Encoding : chunked 14. Content-Type : text/html 15. 16. 76c 17. <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 18. "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> 19. <html xmlns="http://www.w3.org/1999/xhtml"> 20. <head> 21. <title>NPR - National Public Radio - News, Arts, World, US.</title> 22. <!-- content removed --> 23. </html> 24. 25. 26.

Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

1 of 101 4/10/2007 9:53 AM

Table of Contents | All Slides | Link List | CSCI E-12

Hypertext Transfer ProtocolApril 10, 2007

Harvard University Division of Continuing Education

Extension School

Course Web Site: http://cscie12.dce.harvard.edu/

Copyright 1998-2007 David P. Heitmeyer

Instructor email: [email protected] Course staff email: [email protected]

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

2 of 101 4/10/2007 9:53 AM

HyperText Transfer Protocol

GET /

view plain print ?

minerva% telnet www.npr.org 80 1.Trying 216.35.221.77... 2.Connected to www.npr.org. 3.Escape character is '^]'. 4.GET / HTTP/1.1 5.Host: www.npr.org 6. 7.HTTP/1.1 200 OK 8.Date: Tue, 10 Apr 2007 20:07:33 GMT 9.Server: Apache 10.Set-Cookie: Apache=140.247.197.240.289451144786054516; path=/ 11.Cache-Control: max-age=0 12.Expires: Tue, 10 Apr 2007 20:07:33 GMT 13.Transfer-Encoding: chunked 14.Content-Type: text/html 15. 16.76c 17.<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 18. "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> 19.<html xmlns="http://www.w3.org/1999/xhtml"> 20.<head> 21.<title>NPR - National Public Radio - News, Arts, World, US.</title> 22.<!-- content removed --> 23.</html> 24. 25. 26.

Page 2: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

3 of 101 4/10/2007 9:53 AM

The Internet

(adapted from The Internet Book, 2nd edition by Douglas E. Comer)

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

4 of 101 4/10/2007 9:53 AM

TCP/IP: Transmission Control Protocol/Internet Protocol

The TCP/IP Internet Protocol Suite

IP (Internet Protocol): provides basic communication rules

TCP (Transmission Control Protocol): provides additional facilities

IP Addresses

The name www.fas.harvard.edu resolves to the IP address of 140.247.34.66. You can use the tool host or dig to lookup the IP to name or name to IP number.

view plain print ?

[dheitmey@minerva dheitmey]$ host www.fas.harvard.edu 1.www.fas.harvard.edu has address 140.247.34.66 2.[dheitmey@minerva dheitmey]$ host minerva.dce.harvard.edu 3.minerva.dce.harvard.edu has address 140.247.197.240 4.[dheitmey@minerva dheitmey]$ host www.npr.org 5.www.npr.org has address 216.35.221.77 6. 7.

Page 3: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

5 of 101 4/10/2007 9:53 AM

Hostnames

cscie12.dce.harvard.edu

.eduharvard.edudce.harvard.educscie12.dce.harvard.edu

Numbers

Internet Domain Survey reports 433,193,199 hosts in the Domain Name Service (Source: InternetSoftware Consortium ( http://www.isc.org/).

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

6 of 101 4/10/2007 9:53 AM

Clients and Servers

client-server computing The interaction between two programs when they communicate across a network. A program at onesite sends a request to a program at another site and awaits a response. The requesting program iscalled a client; the program satisfying the request is called the server. (definition from The Internet Book,2nd edition by Douglas E. Comer)

Page 4: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

7 of 101 4/10/2007 9:53 AM

Application Layer

HTTP (default port 80)FTP (port 21)SMTP (port 25)telnet (port 23)ssh (port 22)

(adapted from The Internet Book, 2nd edition by Douglas E. Comer)

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

8 of 101 4/10/2007 9:53 AM

HyperText Transfer Protocol

Specifies the grammar of a conversation between an HTTP-client (Web Browser) and an HTTP-server(Web Server) is to take place.

GET /

view plain print ?

minerva% telnet www.npr.org 80 1.Trying 216.35.221.77... 2.Connected to www.npr.org. 3.Escape character is '^]'. 4.GET / HTTP/1.1 5.Host: www.npr.org 6. 7.HTTP/1.1 200 OK 8.Date: Tue, 10 Apr 2007 20:07:33 GMT 9.Server: Apache 10.Set-Cookie: Apache=140.247.197.240.289451144786054516; path=/ 11.Cache-Control: max-age=0 12.Expires: Tue, 10 Apr 2007 20:07:33 GMT 13.Transfer-Encoding: chunked 14.Content-Type: text/html 15. 16.76c 17.<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 18. "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> 19.<html xmlns="http://www.w3.org/1999/xhtml"> 20.<head> 21.<title>NPR - National Public Radio - News, Arts, World, US.</title> 22.<!-- content removed --> 23.</html> 24.

Page 5: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

9 of 101 4/10/2007 9:53 AM

HTTP

Example Request and Loading of a Web Page

As an example, an XHTML page with 7 images and an external CSS file and an external Javascriptfile, the client would make 10 separate requests (1 request for the XHTML resource, 1 request foreach of the seven images, 1 request for the CSS, and 1 request for the JS).

Client Server

HTTP Request for "example.html"

HTTP Response with "example.html" content

Parses through XHTML and determines what other requests it needs to make.

HTTP Request for CSS document

HTTP Response for CSS document

Parses through CSS and determines what other requests it needs to make.

HTTP Request for Javascript document

HTTP Response for Javascript document

HTTP Request for image 1

HTTP Response for image 1

HTTP Request for image 2

HTTP Response for image 2

HTTP Request for image 3

HTTP Response for image 3

HTTP Request for image 4

HTTP Response for image 4

HTTP Request for image 5

HTTP Response for image 5

HTTP Request for image 6

HTTP Response for image 6

HTTP Request for image 7

HTTP Response for image 7

Render page

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

10 of 101 4/10/2007 9:53 AM

HTTP is Stateless

Each requested resource is a separate, independent, request to the server -- it is a stateless protocol.

Page 6: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

11 of 101 4/10/2007 9:53 AM

HTTP Versions

W3C and Internet Engineering Task Force (IETF) oversees the Hypertext Transfer Protocol.

HTTP 1.0 (1996)HTTP 1.1 (1999)Extensions to HTTP

WebDAV

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

12 of 101 4/10/2007 9:53 AM

An HTTP Conversation

Client RequestMETHOD Resource HTTP VersionClient Generated HeadersRequest Body

Server ResponseStatus LineServer Generated HeadersData

Page 7: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

13 of 101 4/10/2007 9:53 AM

HTTP 1.1 Methods

GETPOSTHEADPUTDELETETRACEOPTIONS

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

14 of 101 4/10/2007 9:53 AM

HTTP Response Codes

Code Range Meaning

100's Informational

200's Success

300's Redirected

400's Request Incomplete

500's Server Error

HTTP 1.1 status codes

Common ones:

200 OK301 Moved permanently302 Moved temporarily304 Not modified403 Forbidden404 Not found500 Internal server error

The complete list:

100 Continue101 Switching protocols200 OK201 Created202 Accepted203 Non-authoritative information204 No content205 Reset content206 Partial content300 Multiple choices301 Moved permanently302 Moved temporarily303 See other304 Not modified305 Use proxy400 Bad request401 Unauthorized402 Payment required403 Forbidden404 Not found405 Method not allowed406 Not acceptable407 Proxy authentication required408 Request timeout409 Conflict410 Gone411 Length required412 Precondition failed413 Request entity too large414 Request-URI too long415 Unsupported media type500 Internal server error501 Not implemented502 Bad gateway503 Service unavailable504 Gateway timeout505 HTTP version not supported

Page 8: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

15 of 101 4/10/2007 9:53 AM

Common Request Headers

AcceptAccept-LanguageAuthorizationCookieHostIf-Modified-SinceRefererUser-Agent

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

16 of 101 4/10/2007 9:53 AM

Sample Request Header Values

View some of the Headers your browser is sending to the server

Mozilla Firefox

HTTP_ACCEPT text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=

HTTP_ACCEPT_CHARSET ISO-8859-1,utf-8;q=0.7,*;q=0.7

HTTP_ACCEPT_ENCODING gzip,deflate

HTTP_ACCEPT_LANGUAGE en-us,en;q=0.5

HTTP_CONNECTION keep-alive

HTTP_COOKIE __utma=76898816.1019426635.1173814303.1173814303.1173814303.1__utmz=76898816.1173814303.1.1.utmccn=(referral)|utmcsr=localhost:8nde-textsize=16px

HTTP_HOST cscie12.dce.harvard.edu

HTTP_KEEP_ALIVE 300

HTTP_REFERER http://localhost:8080/cocoon/projects/cscie12/slides/20070410/slide16.htm

HTTP_USER_AGENT Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.3) Gecko/2007

Opera 9

HTTP_ACCEPT text/html, application/xml;q=0.9, application/xhtml+xml, image/png,image/jpeg, image/gif, image/x-xbitmap, */*;q=0.1

HTTP_ACCEPT_CHARSET iso-8859-1, utf-8, utf-16, *;q=0.1

HTTP_ACCEPT_ENCODING deflate, gzip, x-gzip, identity, *;q=0

HTTP_ACCEPT_LANGUAGE en-US,en;q=0.9

HTTP_CONNECTION Keep-Alive

HTTP_HOST cscie12.dce.harvard.edu

HTTP_USER_AGENT Opera/9.01 (Windows NT 5.1; U; en)

Internet Explorer 7

HTTP_ACCEPT image/gif, image/x-xbitmap, image/jpeg, image/pjpeg,application/x-shockwave-flash, application/vnd.ms-excel,application/vnd.ms-powerpoint, application/msword, application/xaml+xml, application/vnd.ms-xpsdocument,application/x-ms-xbap, application/x-ms-application, */*

HTTP_ACCEPT_ENCODING gzip, deflate

HTTP_ACCEPT_LANGUAGE en-us

HTTP_CONNECTION Keep-Alive

HTTP_HOST cscie12.dce.harvard.edu

HTTP_UA_CPU x86

HTTP_USER_AGENT Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; InfoPath.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30)

Amaya

Page 9: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

17 of 101 4/10/2007 9:53 AM

HTTP_ACCEPT */*;q=0.1,image/svg+xml,application/mathml+xml,application/xhtml+xml

HTTP_ACCEPT_ENCODING *,gzip

HTTP_CONNECTION TE,Keep-Alive

HTTP_HOST cscie12.dce.harvard.edu

HTTP_TE trailers,deflate

HTTP_USER_AGENT amaya/9.51 libwww/5.4.0

Lynx

HTTP_ACCEPT text/html, text/plain, audio/mod, image/*, application/msword,application/pdf, application/postscript, application/x-java-jnlp-file,text/sgml, video/mpeg, */*;q=0.01

HTTP_ACCEPT_LANGUAGE en

HTTP_HOST cscie12.dce.harvard.edu

HTTP_USER_AGENT Lynx/2.8.5dev.16 libwww-FM/2.14 SSL-MM/1.4.1 OpenSSL/0.9.7a

wget

HTTP_ACCEPT */*

HTTP_CONNECTION Keep-Alive

HTTP_HOST cscie12.dce.harvard.edu

HTTP_USER_AGENT Wget/1.9+cvs-stable (Red Hat modified)

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

18 of 101 4/10/2007 9:53 AM

Experimenting with HTTP

Viewing HTTP Request and Response Headers

Firefox Extension - Live HTTP Headers

Command Line

telnetlwp-request Documentation:

minerva% man lwp-request

minerva% lwp-request -h

An example:

minerva% lwp-request -USed http://www.harvard.edu/

Page 10: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

19 of 101 4/10/2007 9:53 AM

HTTP Method: HEAD

http://cscie12.dce.harvard.edu/http/raspberry.gifcscie12.dce.harvard.edu is 140.247.197.240we must explicitly use port 80Note the two "returns" after we are done with the HTTP request

view plain print ?

minerva% telnet 140.247.197.240 80 1.Trying 140.247.197.240... 2.Connected to 140.247.197.240. 3.Escape character is '^]'. 4.HEAD /http/raspberry.gif HTTP/1.1 5.Host: cscie12.dce.harvard.edu <return> 6.<return> 7.HTTP/1.1 200 OK 8.Date: Tue, 10 Apr 2007 20:23:14 GMT 9.Server: Apache/2.0.51 (Fedora) 10.Last-Modified: Wed, 06 Apr 2005 19:30:42 GMT 11.ETag: "461fb8-348c-a0f67c80" 12.Accept-Ranges: bytes 13.Content-Length: 13452 14.Connection: close 15.Content-Type: image/gif 16. 17.Connection closed by foreign host. 18. 19.

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

20 of 101 4/10/2007 9:53 AM

Response 404

Sometimes a file is not there...

view plain print ?

minerva% telnet 140.247.197.240 80 1.Trying 140.247.197.240... 2.Connected to 140.247.197.240. 3.Escape character is '^]'. 4.HEAD /http/blueberry.gif HTTP/1.1 5.Host: cscie12.dce.harvard.edu 6. 7.HTTP/1.1 404 Not Found 8.Date: Tue, 10 Apr 2007 20:24:27 GMT 9.Server: Apache/2.0.51 (Fedora) 10.Connection: close 11.Content-Type: text/html; charset=iso-8859-1 12. 13.Connection closed by foreign host. 14. 15.

Page 11: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

21 of 101 4/10/2007 9:53 AM

HTTP Header: Host

Problem: "Infinite" domain names; finite IP addresses.

Solution: "Virtual Hosts"

Example: cscie12.dce.harvard.edu, cscie153.dce.harvard.edu, cscisl.dce.harvard.edu andminerva.dce.harvard.edu resolve to the same IP address (140.247.197.240).

Host Header

This is required for HTTP 1.1 requests.

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

22 of 101 4/10/2007 9:53 AM

HTTP: Authenticate

HTTP Basic Authentication requires a username and password.

http://cscie12.dce.harvard.edu/http/private/

view plain print ?

minerva% telnet 140.247.197.240 80 1.Trying 140.247.197.240... 2.Connected to 140.247.197.240. 3.Escape character is '^]'. 4.HEAD /http/private/ HTTP/1.1 5.Host: cscie12.dce.harvard.edu 6. 7.HTTP/1.1 401 Authorization Required 8.Date: Tue, 10 Apr 2007 20:33:55 GMT 9.Server: Apache/2.0.51 (Fedora) 10.WWW-Authenticate: Basic realm="Private Area" 11.Connection: close 12.Content-Type: text/html; charset=iso-8859-1 13.

Page 12: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

23 of 101 4/10/2007 9:53 AM

HTTP: Authentication/Authorization

The username:password is sent MIME BASE 64 encoded (not encrypted). The username:passwordis guest:knockknock.

view plain print ?

minerva% telnet 140.247.197.240 80 1.Trying 140.247.197.240... 2.Connected to 140.247.197.240. 3.Escape character is '^]'. 4.HEAD /http/private/ HTTP/1.1 5.Host: cscie12.dce.harvard.edu 6.Authorization: BASIC Z3Vlc3Q6a25vY2trbm9jaw== 7. 8.HTTP/1.1 200 OK 9.Date: Tue, 10 Apr 2007 20:35:56 GMT 10.Server: Apache/2.0.51 (Fedora) 11.Accept-Ranges: bytes 12.Connection: close 13.Content-Type: text/html; charset=utf-8 14. 15.Connection closed by foreign host. 16. 17.

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

24 of 101 4/10/2007 9:53 AM

HTTP Redirect

http://www.fas.harvard.edu/http://www.fas.harvard.edu/home/

view plain print ?

minerva% telnet www.fas.harvard.edu 80 1.Trying 140.247.34.66... 2.Connected to www.fas.harvard.edu. 3.Escape character is '^]'. 4.HEAD / HTTP/1.1 5.Host: www.fas.harvard.edu 6. 7.HTTP/1.1 301 Moved Permanently 8.Date: Wed, 06 Apr 2005 20:11:43 GMT 9.Server: Apache/1.3.26 (Unix) mod_ssl/2.8.10 OpenSSL/0.9.6g mod_perl/1.24 10.Location: http://www.fas.harvard.edu/home/ 11.Content-Type: text/html; charset=iso-8859-1 12. 13. 14.

Page 13: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

25 of 101 4/10/2007 9:53 AM

Media Types (MIME Types)

How a Browser Knows What Kind of File it is Getting

Multipurpose Internet Mail Extensions (media types). Server will return a media type to client.Client will handle the media appropriately. Some common media types are:

text/htmltext/cssimage/jpegimage/pngimage/gifapplication/pdfapplication/mswordapplication/vnd.ms-excelAll media types listed in /etc/mime.types on minerva

More information about MIME Types is available.

Questions:

How does the server know the media type?How does the client know the media type?How does the client know "what to do with" the file?

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

26 of 101 4/10/2007 9:53 AM

Content Negotiation

Resource can be in multiple formats and languages. Client preferences can determine which resourceis returned.

Content Negiation Resources

ApacheWeek: Content Negotiation Explained http://www.apacheweek.com/features/negotiationApache: Content Negotiation http://httpd.apache.org/docs/content-negotiation.html

Page 14: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

27 of 101 4/10/2007 9:53 AM

Content Negotiation: MIME Types

A file listing :

raspberry.gifraspberry.jpgraspberry.pngraspberry

The HTTP Transaction:

view plain print ?

minerva% ls -l raspberry* 1.16 -rw-r--r-- 1 e12 e12 13452 Apr 6 15:30 raspberry.gif 2.16 -rw-r--r-- 1 e12 e12 16255 Apr 6 15:30 raspberry.jpg 3.12 -rw-r--r-- 1 e12 e12 8899 Apr 6 15:30 raspberry.png 4.

view plain print ?

minerva% telnet 140.247.197.240 80 1.Trying 140.247.197.240... 2.Connected to 140.247.197.240. 3.Escape character is '^]'. 4.HEAD /http/raspberry HTTP/1.1 5.Host: cscie12.dce.harvard.edu 6. 7.HTTP/1.1 200 OK 8.Date: Tue, 10 Apr 2007 20:39:20 GMT 9.Server: Apache/2.0.51 (Fedora) 10.Content-Location: raspberry.png 11.Vary: negotiate,accept 12.TCN: choice 13.Last-Modified: Wed, 06 Apr 2005 19:30:42 GMT 14.ETag: "461fba-22c3-a0f67c80;4edcb400" 15.Accept-Ranges: bytes 16.Content-Length: 8899 17.Connection: close 18.Content-Type: image/png 19. 20.Connection closed by foreign host. 21.

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

28 of 101 4/10/2007 9:53 AM

Content Negotiation: MIME Types

Client specifies MIME Types it accepts through HTTP "Accept" header.

view plain print ?

minerva% telnet 140.247.197.240 80 1.Trying 140.247.197.240... 2.Connected to 140.247.197.240. 3.Escape character is '^]'. 4.HEAD /http/raspberry HTTP/1.1 5.Host: cscie12.dce.harvard.edu 6.Connection: close 7.Accept: image/jpeg 8. 9.HTTP/1.1 200 OK 10.Date: Tue, 10 Apr 2007 20:40:17 GMT 11.Server: Apache/2.0.51 (Fedora) 12.Content-Location: raspberry.jpg 13.Vary: negotiate,accept 14.TCN: choice 15.Last-Modified: Wed, 06 Apr 2005 19:30:42 GMT 16.ETag: "461fb9-3f7f-a0f67c80;4edcb400" 17.Accept-Ranges: bytes 18.Content-Length: 16255 19.Connection: close 20.Content-Type: image/jpeg 21.

Page 15: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

29 of 101 4/10/2007 9:53 AM

Content Negotiation: Language

lang.htmlEnglish VersionFrench VersionGerman Version

A file listing:

HTTP Transaction:

view plain print ?

minerva% ls -l lang* 1.-rw-r--r-- 1 e12 e12 191 Apr 6 15:30 lang.html.de 2.-rw-r--r-- 1 e12 e12 193 Apr 6 15:30 lang.html.en 3.-rw-r--r-- 1 e12 e12 191 Apr 6 15:30 lang.html.fr 4.

view plain print ?

minerva% telnet 140.247.197.240 80 1.Trying 140.247.197.240... 2.Connected to 140.247.197.240. 3.Escape character is '^]'. 4.HEAD /http/lang HTTP/1.1 5.Host: cscie12.dce.harvard.edu 6.Connection: close 7. 8.HTTP/1.1 200 OK 9.Date: Tue, 10 Apr 2007 20:41:02 GMT 10.Server: Apache/2.0.51 (Fedora) 11.Content-Location: lang.html.en 12.Vary: negotiate,accept-language 13.TCN: choice 14.Accept-Ranges: bytes 15.Content-Length: 193 16.Connection: close 17.Content-Type: text/html; charset=UTF-8 18.Content-Language: en 19. 20.Connection closed by foreign host. 21.

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

30 of 101 4/10/2007 9:53 AM

Content Negotiation: Language

view plain print ?

minerva% telnet 140.247.197.240 80 1.Trying 140.247.197.240... 2.Connected to 140.247.197.240. 3.Escape character is '^]'. 4.HEAD /http/lang HTTP/1.1 5.Host: cscie12.dce.harvard.edu 6.Connection: close 7.Accept-Language: de 8. 9.HTTP/1.1 200 OK 10.Date: Tue, 10 Apr 2007 20:44:16 GMT 11.Server: Apache/2.0.51 (Fedora) 12.Content-Location: lang.html.de 13.Vary: negotiate,accept-language 14.TCN: choice 15.Accept-Ranges: bytes 16.Content-Length: 191 17.Connection: close 18.Content-Type: text/html; charset=UTF-8 19.Content-Language: de 20. 21.Connection closed by foreign 22.

Page 16: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

31 of 101 4/10/2007 9:53 AM

Content Negotiation Gotcha

Permissions must be set to rwxr-xr-x on directories.

Why?

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

32 of 101 4/10/2007 9:53 AM

HTTP: Moved

Attempt to access: http://www.fas.harvard.edu/~cscie12/

view plain print ?

minerva% telnet 140.247.197.240 80 1.Trying 140.247.197.240... 2.Connected to 140.247.197.240. 3.Escape character is '^]'. 4.>HEAD /~cscie12/ HTTP/1.1 5.Host: cscie12.dce.harvard.edu 6. 7.HTTP/1.1 301 Moved Permanently 8.Date: Tue, 10 Apr 2007 20:47:19 GMT 9.Server: Apache/1.3.26 (Unix) mod_ssl/2.8.10 OpenSSL/0.9.6g mod_perl/1.24 10.Location: http://www.courses.fas.harvard.edu/~cscie12/ 11.Content-Type: text/html; charset=iso-8859-1 12.

Page 17: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

33 of 101 4/10/2007 9:53 AM

HTTP: trailing 'slash' for directories

view plain print ?

minerva% telnet 140.247.197.240 1.Trying 140.247.197.240... 2.Connected to 140.247.197.240. 3.Escape character is '^]'. 4.HEAD /images HTTP/1.1 5.Host: cscie12.dce.harvard.edu 6. 7.HTTP/1.1 301 Moved Permanently 8.Date: Tue, 10 Apr 2007 20:48:42 GMT 9.Server: Apache/2.0.51 (Fedora) 10.Location: http://cscie12.dce.harvard.edu/images/ 11.Content-Type: text/html 12. 13.HEAD /images/ HTTP/1.1 14.Host: cscie12.dce.harvard.edu 15. 16.HTTP/1.1 200 OK 17.Date: Tue, 10 Apr 2007 20:48:42 GMT 18.Server: Apache/2.0.51 (Fedora) 19.Connection: close 20.Content-Type: text/html; charset=UTF-8 21.

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

34 of 101 4/10/2007 9:53 AM

HTTP 1.1 Improvements over HTTP 1.0

New request methodsPersistent ConnectionsChunked EncodingByte Range OperationsContent NegotiationDigest AuthenticationCaching

Page 18: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

35 of 101 4/10/2007 9:53 AM

Connection: keep-alive

Allows multiple HTTP requests to be made over the same TCP/IP connection.

view plain print ?

[dheitmey@minerva dheitmey]$ telnet www.fas.harvard.edu 80 1.Trying 140.247.197.240... 2.Connected to www.fas.harvard.edu. 3.Escape character is '^]'. 4.HEAD /home/ HTTP/1.1 5.Host: www.fas.harvard.edu 6.Connection: keep-alive 7. 8.HTTP/1.1 200 OK 9.Date: Tue, 13 Apr 2004 19:24:16 GMT 10.Server: Apache/1.3.26 (Unix) mod_ssl/2.8.10 OpenSSL/0.9.6g mod_perl/1.24 11.Keep-Alive: timeout=15, max=100 12.Connection: Keep-Alive 13.Content-Type: text/html 14. 15.HEAD /home/images/ HTTP/1.1 16.Host: www.fas.harvard.edu 17.Connection: close 18. 19.HTTP/1.1 200 OK 20.Date: Tue, 13 Apr 2004 19:24:34 GMT 21.Server: Apache/1.3.26 (Unix) mod_ssl/2.8.10 OpenSSL/0.9.6g mod_perl/1.24 22.Connection: close 23.Content-Type: text/html 24. 25.Connection closed by foreign host. 26.

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

36 of 101 4/10/2007 9:53 AM

Caching Related Headers

Local cache and Proxy-server cache

If-Modified-SinceAgeExpiresLast-ModifiedCache-ControlETag

Page 19: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

37 of 101 4/10/2007 9:53 AM

HTTP Example

view plain print ?

minerva% telnet www.apache.org 80 1.Trying 192.87.106.226... 2.Connected to www.APACHE.org. 3.Escape character is '^]'. 4.HEAD /images/asf_logo.gif HTTP/1.1 5.Host: www.apache.org 6.Connection: close 7. 8.HTTP/1.1 200 OK 9.Date: Tue, 10 Apr 2007 20:52:43 GMT 10.Server: Apache/2.2.0 (Unix) 11.Last-Modified: Sun, 20 May 2001 02:19:30 GMT 12.ETag: "c558a6-1c6f-bf85080" 13.Accept-Ranges: bytes 14.Content-Length: 7279 15.Cache-Control: max-age=86400 16.Expires: Wed, 12 Apr 2006 20:52:43 GMT 17.Connection: close 18.Content-Type: image/gif 19.

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

38 of 101 4/10/2007 9:53 AM

If-Modified Since

view plain print ?

minerva% telnet www.harvard.edu 80 1.Trying 128.103.60.55... 2.Connected to zooey.harvard.edu. 3.Escape character is '^]'. 4.GET /images/global/banner.gif HTTP/1.1 5.Host: www.harvard.edu 6.Keep-Alive: 300 7.Connection: keep-alive 8.If-Modified-Since: Thu, 12 Sep 2002 21:51:13 GMT 9. 10.HTTP/1.x 304 Not Modified 11.Date: Tue, 10 Apr 2007 20:57:00 GMT 12.Server: Apache/1.3.28 (Unix) PHP/4.3.2 mod_perl/1.28 13.Connection: Keep-Alive 14.Keep-Alive: timeout=15, max=100 15.Etag: "ff-2f98-3d810c51" 16.

Page 20: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

39 of 101 4/10/2007 9:53 AM

Proxy Servers

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

40 of 101 4/10/2007 9:53 AM

Persistent Connection Advantages

For HTTP/1.1, "Connection: keep-alive" is now the default. Multiple HTTP requests can be sentthrough the same TCP/IP connection.

Empirical Measurements:

10 documents; 1.7 kB each5.3 seconds; each request a new connection1.2 seconds; one connection

http://metalab.unc.edu/mdma-release/http-prob.html

Page 21: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

41 of 101 4/10/2007 9:53 AM

HTTP Cookies

HTTP is a stateless protocol. Cookies provide a mechanism to "maintain state".

Cookie Central: The Unofficial Cookie FAQ http://www.cookiecentral.com/faq/ http://www.cookiecentral.com/

Maintaining State with Cookies

HTTP State Management Mechanism http://www.ics.uci.edu/pub/ietf/http/rfc2109.txtCookie Central: The Unofficial Cookie FAQ http://www.cookiecentral.com/faq/ http://www.cookiecentral.com/Persistent Client State HTTP Cookies http://www.netscape.com/newsref/std/cookie_spec.html

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

42 of 101 4/10/2007 9:53 AM

Cookie Example

Server returns cookie to HTTP client ("Set-Cookie" response header)HTTP client returns cookie to server ("Cookie" request header)

ESPN Cookies

Set-Cookie:SWID=C8F9AF31-F170-42BF-9471-50A95DA24C17;path=/;expires=Tue, 10-Apr-2027 03:20:59 GMT;domain=.go.com;

Set-Cookie:DE2=dXNhO21hO2NhbWJyaWRnZTt0MTs1OzQ7NDs1MDY7MDQyLjM4MDstMDcxLjEzNTspath=/; expires=Tue, 17 Apr 2007 03:00:00 GMT; domain=.go.com

view plain print ?

minerva% lwp-request -USed http://www.espn.com/ 1.GET http://espn.go.com/ 2.User-Agent: lwp-request/2.07 3. 4.GET http://www.espn.com/ --> 301 Moved Permanently 5.GET http://espn.go.com/ --> 200 OK 6.Cache-Control: no-cache 7.Date: Tue, 10 Apr 2007 03:20:58 GMT 8.Pragma: no-cache 9.From: SPORTBARWEB08 10.Accept-Ranges: bytes 11.ETag: "802e571f7bc71:1762" 12.Server: Microsoft-IIS/5.0 13.Vary: Accept-Encoding 14.Content-Length: 122217 15.Content-Type: text/html; charset=iso-8859-1 16.Content-Type: text/html; charset=windows-1252 17.Last-Modified: Tue, 10 Apr 2007 03:19:21 GMT 18.Cache-Expires: Tue, 10 Apr 2007 03:24:22 GMT 19.Client-Date: Tue, 10 Apr 2007 03:21:02 GMT 20.Client-Peer: 198.105.193.43:80 21.Client-Response-Num : 1 22.P3P: CP="CAO DSP COR CURa ADMa DEVa TAIa PSAa PSDa IVAi IVDi CONi OUR SAMo OTRo BUS PHY ONL 23.Refresh: 3600 24.Set-Cookie: SWID=C8F9AF31-F170-42BF-9471-50A95DA24C17; path=/; expires=Tue, 10-Apr-2027 03:225.Set-Cookie: DE2=dXNhO21hO2NhbWJyaWRnZTt0MTs1OzQ7NDs1MDY7MDQyLjM4MDstMDcxLjEzNTs4NDA7MjI7ODg526. 27.

Page 22: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

43 of 101 4/10/2007 9:53 AM

Cookie Properties/Attributes

nameexpiresdomainpathsecure

HTTP State Management Mechanism, RFC 2965

RFC 2109, February 1997RFC 2965, October 2000

namecommentcomment URLdiscarddomainmax-agepathportsecureversion

Additional Cookie Notes

Client: 300 total cookies4 kb per cookie20 cookies per server or domain

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

44 of 101 4/10/2007 9:53 AM

Cookie Example: Server Sets a Cookie

Form that will set a Cookie: http://cscie12.dce.harvard.edu/http/cookie.cgi

Set-Cookie HTTP Response Header:

Set-Cookie: YourName=David%20P.%20Heitmeyer; domain=cscie12.dce.harvard.edu; path=/http/;expires=Fri, 13-May-2005 18:05:04 GMT

view plain print ?

minerva% telnet 140.247.197.240 80 1. Trying 140.247.197.240... 2.Connected to 140.247.197.240. 3.Escape character is '^]'. 4.GET /http/cookie.cgi?name=David%20P.%20Heitmeyer HTTP/1.1 5.Host: cscie12.dce.harvard.edu 6.Connection: close 7. 8.HTTP/1.1 200 OK 9.Connection: close 10.Date: Wed, 13 Apr 2005 18:05:04 GMT 11.Server: Apache/2.0.49 (Fedora) 12.Content-Type: text/html; charset=ISO-8859-1 13.Client-Date: Wed, 13 Apr 2005 18:05:04 GMT 14.Client-Peer: 140.247.197.240:80 15.Client-Response-Num : 1 16.Client-Transfer-Encoding : chunked 17.Set-Cookie: YourName=David%20P.%20Heitmeyer; \ 18. domain=cscie12.dce.harvard.edu; \ 19. path=/http/; \ 20. expires=Fri, 13-May-2005 18:05:04 GMT 21. 22.<?xml version="1.0" encoding="iso-8859-1"?> 23.<!DOCTYPE html 24. PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 25. "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> 26.<html xmlns="http://www.w3.org/1999/xhtml" lang="en-US" xml:lang="en-US"><head><title>Form</27.</head><body> 28.<h1>Hello, David P. Heitmeyer</h1> 29.</body></html> 30.Connection closed by foreign host. 31. 32.

Page 23: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

45 of 101 4/10/2007 9:53 AM

Cookie Example: Returning a Cookie

Form that will set a Cookie: http://cscie12.dce.harvard.edu/http/cookie.cgi

view plain print ?

minerva% telnet 140.247.197.240 80 1.Trying 140.247.197.240... 2.Connected to 140.247.197.240. 3.Escape character is '^]'. 4.GET /http/cookie.cgi HTTP/1.1 5.Cookie: YourName=David%20P.%20Heitmeyer 6.Host: cscie12.dce.harvard.edu 7.Connection: close 8. 9.HTTP/1.1 200 OK 10.Connection: close 11.Date: Wed, 13 Apr 2005 18:11:40 GMT 12.Server: Apache/2.0.49 (Fedora) 13.Content-Type: text/html; charset=ISO-8859-1 14.Client-Date: Wed, 13 Apr 2005 18:11:40 GMT 15.Client-Peer: 140.247.197.240:80 16.Client-Response-Num : 1 17.Client-Transfer-Encoding : chunked 18. 19.<?xml version="1.0" encoding="iso-8859-1"?> 20.<!DOCTYPE html 21. PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 22. "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> 23.<html xmlns="http://www.w3.org/1999/xhtml" lang="en-US" xml:lang="en-US"> 24.<head><title>Form</title></head><body> 25.<h1>Hello, David P. Heitmeyer</h1> 26. 27.

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

46 of 101 4/10/2007 9:53 AM

Your Cookies

Firefox Webdeveloper Toolbar has a "Cookies" section.

Mozilla Cookie Manager

Page 24: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

47 of 101 4/10/2007 9:53 AM

Cookies and Session IDs

A UserID or SessionID (a long character/number string that is uniquely assigned) is often stored incookie. The SessionID is used as the key or identifier when storing information about the user orsession.

For example, a user logs in to a site. If the username and password match, the server sets a cookie("Set-Cookie") in the browser that contains a session id; the server also makes an entry in websitedatabase that maps the session id to the username. When the cookie is returned, the session id isread and the username is looked up in the database.

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

48 of 101 4/10/2007 9:53 AM

Google Cookie Example

Using Google's "Preference" page and setting:

Search Language preference to: English, French, GermanSafeSearch Filtering: Strict FilteringNumber of Results: 50

The Cookie name is: PREF The Value is:ID=bb504f37cd318aa9:FF=1:LR=lang_en|lang_fr|lang_de:LD=en:NR=50:TM=1113416195:LM=11134

This cookie contains a session id as well as the values of certain preferences in a colon-separateddata structure.

Page 25: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

49 of 101 4/10/2007 9:53 AM

Cookies and Ad Tracking

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

50 of 101 4/10/2007 9:53 AM

Method: POST

Form that will set a Cookie: http://cscie12.dce.harvard.edu/http/cookie.cgi

view plain print ?

minerva% telnet 140.247.197.240 80 1.Trying 140.247.197.240... 2.Connected to 140.247.197.240. 3.Escape character is '^]'. 4.POST /http/cookie.cgi HTTP/1.1 5.Host: cscie12.dce.harvard.edu 6.Content-Length: 10 7.Content-Type: application/x-www-form-urlencoded 8. 9.name=David 10.HTTP/1.1 200 OK 11.Date: Wed, 13 Apr 2005 19:31:11 GMT 12.Server: Apache/2.0.49 (Fedora) 13.Set-Cookie: YourName=David; domain=cscie12.dce.harvard.edu; path=/http/; expires=Fri, 13-May14.Content-Length: 319 15.Connection: close 16.Content-Type: text/html; charset=ISO-8859-1 17. 18.<?xml version="1.0" encoding="iso-8859-1"?> 19.<!DOCTYPE html 20. PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 21. "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> 22.<html xmlns="http://www.w3.org/1999/xhtml" lang="en-US" xml:lang="en-US"> 23.<head><title>Form</title> 24.</head><body> 25.<h1>Hello, David</h1> 26.

Page 26: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

51 of 101 4/10/2007 9:53 AM

WebDAV: an extension of HTTP

Web-based Distributed Authoring and Versioning

WebDAV Resources http://www.webdav.org/From the WebDAV Resources :

WebDAV stands for "Web-based Distributed Authoring and Versioning". It is a set ofextensions to the HTTP protocol which allows users to collaboratively edit andmanage files on remote web servers.

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

52 of 101 4/10/2007 9:53 AM

HTTP Resources

W3C HTTP http://www.w3.org/Protocols/HTTP Pocket Reference http://www.oreilly.com/catalog/httppr/ by Clinton Wong (O'Reilly).Illustrated Guide to HTTP http://www.manning.com/hethmon/ by Paul Hethmon (Manning Publications; ISBN 0138582262) see sample chapters and resources online.

Other Readings:

W3C Recommendations Reduce 'World Wide Wait' http://www.w3.org/Protocols/NL-PerfNote.htmlApache Week: HTTP version 1.1 http://www.apacheweek.com/features/http11WebTechniques: HTTP 1.1: What's in it for Me? http://www.webtechniques.com/archives/1997/08/webm/Cookie Central: The Unofficial Cookie FAQ http://www.cookiecentral.com/faq/ http://www.cookiecentral.com/

Page 27: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

53 of 101 4/10/2007 9:53 AM

Apache HTTP Server

Apache Software FoundationApache HTTP Server Project

Apache 1.3Apache 2.x

Apache ModulesPHPPerlPythonmany, many others

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

54 of 101 4/10/2007 9:53 AM

Apache: The Most Popular Web Server on the Internet

Netcraft Web Server Survey

Page 28: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

55 of 101 4/10/2007 9:53 AM

Tonight: Configuring Apache with .htaccess files

Custom Error DocumentsRedirectRewriteDirectory IndexSetting HTTP Headers

ExpiresHeaders

Access ControlRequiring a Secure Connection (SSL)

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

56 of 101 4/10/2007 9:53 AM

Apache Configuration Overview

Server Configuration (httpd.conf) Unless you are the server administrator, you generally will not have access to this account. Onthe FAS systems, you do not have read or write access to this file. Server configuration is readat server start or restart.Per Directory (.htaccess) Certain configuration directives for Apache can be placed within per-directory .htaccess files. .htaccess file is read on a per request basis.

Page 29: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

57 of 101 4/10/2007 9:53 AM

.htaccess File Example

document root: /home/e12/htdocs filename: .htaccess location: /home/e12/htdocs/apache/.htaccess contents:

filename: status404.html location: /home/e12/htdocs/apache/status404.html

http://cscie12.dce.harvard.edu/apache/ZZZ.html

ErrorDocument 404 status404.html 1. 2. 3.

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

58 of 101 4/10/2007 9:53 AM

Scope of .htaccess files

Directives within .htaccess files apply to the directory that contains the .htaccess file and all its descendants.

Directives within the file, /home/e12/htdocs/.htaccess would apply to all files within and "under" the public_html directory for the user cscie12.

Directives within the file, /home/e12/htdocs/assignments/.htaccess would apply to all files within and "under" the public_html/assignments directory for the user cscie12.

Page 30: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

59 of 101 4/10/2007 9:53 AM

Problems You Will Have with .htaccess files

Internal Server ErrorCan't "see" the fileIncorrect Permissions

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

60 of 101 4/10/2007 9:53 AM

Problems You will encounter when using .htaccess files

500 Internal Server Error If you see begin seeing 500 Internal Server Error responses from the server after you have createdor edited an .htaccessfile, the most likely cause of the problem is incorrect permissions and/or an error in the directivesyntax.

Permissions on the .htaccessfile are not set correctly. Just like HTML and image files, the server must be able to read the.htaccess file. The simplest way to allow that is to make your .htaccess file readable by "other".

Syntax Error. An error in the syntax of a directive the .htaccess file will result in a 500 Internal Server Error. In addition, correct usage of a directive that is not allowed in the .htaccess file will result in a 500status code. Whether or not a directive is allowed depends upon the server configuration file(httpd.conf; AllowOverride) and the directive itself.

view plain print ?

minerva% pwd 1./home/courses/j/h/jharvard/public_html 2.minerva% ls -l .htaccess 3.-rw------- 1 jharvard founder 349 Nov 27 00:03 .htaccess 4.minerva% chmod o+r .htaccess 5.minerva% ls -l ~/public_html/.htaccess 6.-rw----r-- 1 jharvard founder 349 Nov 27 00:03 .htaccess 7.

Page 31: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

61 of 101 4/10/2007 9:53 AM

Problems You will encounter when using .htaccess files

You can't "see" your .htaccess file.

HTTP The web server is typically configured to deny requests for .htaccess files. For example, the file corresponding to the URL, http://cscie12.dce.harvard.edu/.htaccess exists and is readable by the Web server, but if we try to follow the link, we get a 403 Forbidden response.UNIX The ls command will not list files or directories that begin with a '.' (dot). In order to see the .htaccess file when you do a directory listing, use the -a (all) option:SFTP Sometimes your SFTP program will hide the "dot" files unless explicitly told to show them.

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

62 of 101 4/10/2007 9:53 AM

Apache Configuration Sections

Configuration directives can be limited by using "sections", such as

DirectoryLocationFilesVirtualHostDirectoryMatchLocationMatchFilesMatch

Within .htaccess

Note that only Files and FilesMatch can be used within .htaccess files.

Examples:

Examples:

<Files .htaccess> 1. Order allow,deny 2. Deny from all 3.</Files> 4. 5.

# deny access to any tilde backup files 1.<Files *~> 2. Order allow,deny 3. Deny from all 4.</Files> 5.

Page 32: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

63 of 101 4/10/2007 9:53 AM

Configuring Apache with .htaccess files

Custom Error DocumentsRedirectRewriteDirectory IndexSetting HTTP Headers

ExpiresHeaders

Access Control

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

64 of 101 4/10/2007 9:53 AM

Custom Error Documents

.htaccess

ErrorDocument directive http://www.apache.org/docs/2.0/mod/core.html#errordocumentCustom Error Responses http://www.apache.org/docs/2.0/custom-error.html

ErrorDocument 401 /apache/status401.html 1.ErrorDocument 403 /apache/status403.html 2.ErrorDocument 404 /apache/status404.html 3. 4.

Page 33: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

65 of 101 4/10/2007 9:53 AM

HTTP Redirect

Fight Linkrot!

Do your part to Fight Linkrot! (Jakob Nielson's Alertbox, http://www.useit.com/alertbox/980614.html )

RedirectRewriteMeta http-equiv refresh

Redirecting Requests

HTTP Status Codes: 301 Moved permanently 302 Moved temporarily

Redirecting client requests can be very useful:

URL moves to a new location Do your part to Fight Linkrot! (Jakob Nielson's Alertbox,http://www.useit.com/alertbox/980614.html )

resource removedsite structure is reorganized

Provide "friendly" or additional URLs to access a resource

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

66 of 101 4/10/2007 9:53 AM

Redirect

Redirect directive http://www.apache.org/docs/2.0/mod/mod_alias.html#redirect

.htaccess

Try it:

http://cscie12.dce.harvard.edu/apache/dce.htmlhttp://cscie12.dce.harvard.edu/apache/church_st

Redirect 302 /apache/dce.html http://www.dce.harvard.edu/ 1.Redirect 301 /apache/church_st http://map.harvard.edu/level3.cfm?mapname=camb_allston&am2. 3. 4.

view plain print ?

minerva% telnet cscie12.dce.harvard.edu 80 1.Trying 140.247.197.240... 2.Connected to cscie12.dce.harvard.edu. 3.Escape character is '^]'. 4.GET /apache/dce.html HTTP/1.1 5.Host: cscie12.dce.harvard.edu 6.Connection: close 7. 8.HTTP/1.1 302 Found 9.Date: Wed, 13 Apr 2005 20:03:10 GMT 10.Server: Apache/2.0.49 (Fedora) 11.Location: http://www.dce.harvard.edu/ 12.Content-Length: 302 13.Connection: close 14.Content-Type: text/html; charset=iso-8859-1 15. 16. 17.

view plain print ?

minerva% lwp-request -USed http://cscie12.dce.harvard.edu/apache/dce.html 1.GET http://www.dce.harvard.edu/ 2.User-Agent: lwp-request/2.06 3. 4.GET http://cscie12.dce.harvard.edu/apache/dce.html --> 302 Found 5.GET http://www.dce.harvard.edu/ --> 200 OK 6.Connection: Close 7.Date: Wed, 13 Apr 2005 20:01:26 GMT 8.Accept-Ranges: bytes 9.Server: Orion/2.0.6 10.Content-Length: 3619 11.Content-Type: text/html 12.Content-Type: text/html; charset=iso-8859-1 13.Last-Modified: Wed, 27 Oct 2004 18:45:00 GMT 14.Client-Date: Wed, 13 Apr 2005 20:01:49 GMT 15.Client-Peer: 140.247.198.100:80 16.Client-Response-Num : 1 17. 18.

Page 34: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

67 of 101 4/10/2007 9:53 AM

Rewrite

mod_rewrite http://www.apache.org/docs/2.0/mod/mod_rewrite.htmlA Users Guide to URL Rewriting with the Apache Webserver http://www.engelschall.com/pw/apache/rewriteguide/

Rewrite uses regular expressions to match on a pattern and rewrite to a new location. For example,the Derek Bok Center site used to be a "user" account and had the "~bok_cen/" base. When moved toits own virtual host, all of the "~bok_cen" requests could be rewritten to the new site with a singlerewrite rule.

# rewrite for Bok Center 1.RewriteRule ^/~?bok_cen(.*) http://bokcenter.fas.harvard.edu$1 [R=301] 2. 3.

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

68 of 101 4/10/2007 9:53 AM

Examples of Rewrite Uses

Provide a standard mechanism to access course Web sites within HarvardCollege.

http://www.courses.harvard.edu/<4 digit catalog number>

For example, Chemistry 7 has a catalog number of 5118, so the URL for the course Web site can be reached through:

http://www.courses.harvard.edu/5118

The "real" location of the site is:

http://my.harvard.edu/icb/icb.do?course=fas-chem7

HASCS Site Restructure

Dozens of rewrite directives were put in place when the HASCS site was restructured so that links todocuments within the previous site would get redirected to the appropriate page in the new site.

Page 35: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

69 of 101 4/10/2007 9:53 AM

Rewrite: Can be conditional

Rewrite rules can conditional (match against almost any environment variable).

Here we match on host and user-agent to deliver an error page explaining that their browser is not supported.

RewriteEngine On 1.RewriteCond %{HTTP_USER_AGENT} ^Lynx 2.RewriteRule ^(index.html)?$ text/ [R=302] 3.

# rewrite rule to catch IE Mac browsers since 1.# the PIN Service does not support them as of 10/16/2006 2.RewriteCond %{HTTP_HOST} "^login.icommons.harvard.edu$" 3.RewriteCond %{HTTP_USER_AGENT} "MSIE 5.*\; Mac_PowerPC" 4.RewriteRule ^/pinproxy.* /pin_error_ie_mac.html [R,L] 5.

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

70 of 101 4/10/2007 9:53 AM

An aside: Text-only sites and "link"

Meta-information can be used to describe alternate content.

W3C Web Content Accessibility Guidelines: alternate pages http://www.w3.org/TR/WAI-WEBCONTENT-TECHS/#alt-pages

In ~cscie12/public_html/index2.html

Lynx view of index2.html provides the text-only version as a

link:

<link title="Text-only version" 1. rel="alternate" 2. href="http://cscie12.dce.harvard.edu/text/index.html" 3. media="aural, braille, tty"> 4.

Page 36: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

71 of 101 4/10/2007 9:53 AM

Meta Refresh

Note: redirection may also be achieved on some browsers by using the http-equiv attribute of the <meta> element. More information and examples are provided athttp://www.fas.harvard.edu/~web/tutorial/meta/refresh/ . The recommended method is to do it at the server level.

<!-- in head --> 1.<!-- will redirect in 10 seconds --> 2.<meta http-equiv="Refresh" content="10; URL=http://www.harvard.edu/"> 3.

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

72 of 101 4/10/2007 9:53 AM

Directory Index and Listings

Note: Remember the difference between a directory having rwx-----x and rwx---r-x permissions?

DirectoryIndex http://www.apache.org/docs/2.0/mod/mod_dir.html Would you prefer main.html or overview.html to be the default files returned when a directory isrequested?mod_autoindex http://www.apache.org/docs/2.0/mod/mod_autoindex.html Provides for automatic indexing of a directory.

DirectoryIndex

DirectoryIndex index.html main.html overview.html slide1.html 1.

Page 37: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

73 of 101 4/10/2007 9:53 AM

More Control over Directory Listings

mod_autoindex

Basic

Custom

The details:

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

74 of 101 4/10/2007 9:53 AM

view plain print ?

minerva% pwd /home10/c/s/cscie12/public_html/autoindex/ex2 1.minerva% ls -la 2.total 28 3.drwxr-xr-x 2 cscie12 courses 8192 Nov 27 13:28 . 4.drwxr-xr-x 6 cscie12 courses 8192 Nov 27 13:11 .. 5.-rw-r--r-- 1 cscie12 courses 207 Nov 27 13:12 .htaccess 6.-rw-r--r-- 1 cscie12 courses 147 Nov 27 13:09 HEADER.html 7.-rw-r--r-- 1 cscie12 courses 66 Nov 27 13:09 README.html 8.-rw-r--r-- 1 cscie12 courses 4168 Nov 27 12:58 client-server.gif 9.-rw-r--r-- 1 cscie12 courses 906 Nov 27 12:58 slide1.html 10.-rw-r--r-- 1 cscie12 courses 743 Nov 27 12:58 slide2.html 11.-rw-r--r-- 1 cscie12 courses 1208 Nov 27 12:58 slide3.html 12.minerva% cat .htaccess 13.IndexOptions FancyIndexing 14.IndexOptions IconsAreLinks IconHeight=22 IconWidth=20 \ 15. NameWidth=* ScanHTMLTitles SuppressLastModified \ 16. SuppressSize SuppressColumnSorting \ 17. SuppressHTMLPreamble 18.IndexIgnore *.gif .. 19.minerva% 20.

Page 38: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

75 of 101 4/10/2007 9:53 AM

Setting HTTP Headers

ExpiresHeaders

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

76 of 101 4/10/2007 9:53 AM

Expires

Module mod_expires http://www.apache.org/docs/2.0/mod/mod_expires.html

.htaccess

Or, expire based upon modification time of document:

From the Apache mod_expires documentation:

This module controls the setting of the Expires HTTP header in server responses. The expirationdate can set to be relative to either the time the source file was last modified, or to the time of theclient access.

The Expires HTTP header is an instruction to the client about the document's validity andpersistence. If cached, the document may be fetched from the cache rather than from the sourceuntil this time has passed. After that, the cache copy is considered "expired" and invalid, and a newcopy must be obtained from the source.

ExpiresActive On 1. 2.ExpiresByType text/html A3600 3.# HTML expires in 1 hour 4. 5.ExpiresByType image/gif A2592000 6.# GIF expires in 30 days 7. 8.ExpiresByType image/jpeg A2592000 9.# JPEG expires in 30 days 10. 11.ExpiresByType image/png A2592000 12.# PNG expires in 30 days 13. 14.# types not specified 15.ExpiresDefault "now plus 1 day" 16.# expires in 1 day 17.

ExpiresActive On 1.ExpiresByType text/html M86400 2.# HTML expires 1 day after it was last modified 3.ExpiresDefault M86400 4.

Page 39: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

77 of 101 4/10/2007 9:53 AM

Do not cache

If you do not want your page cached, set these HTTP response headers:

In .htaccess in Apache, this would translate to:

view plain print ?

Cache-control: no-cache 1.Pragma: no-cache 2.Expires: <set to now> 3.

view plain print ?

ExpiresDefault "now" 1.Header set Pragma "no-cache" 2.

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

78 of 101 4/10/2007 9:53 AM

Headers

mod_headers http://www.apache.org/docs/2.0/mod/mod_headers.html

The optional headers module allows for the customization of HTTP response headers. Headers canbe merged, replaced or removed. The server will always add a "Server" and "Date" header to the HTTP response.

view plain print ?

Header set Author "David P. Heitmeyer" 1.

Page 40: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

79 of 101 4/10/2007 9:53 AM

Usertrack with Cookies

mod_usertrack

.htaccess file:

view plain print ?

minerva% lwp-request -USed http://cscie12.dce.harvard.edu/apache/usertrack/ 1.GET http://cscie12.dce.harvard.edu/apache/usertrack/ 2.User-Agent: lwp-request/1.38 3. 4.GET http://cscie12.dce.harvard.edu/apache/usertrack/ --> 200 OK 5.Connection: close 6.Date: Wed, 13 Apr 2005 20:32:54 GMT 7.Accept-Ranges: bytes 8.Server: Apache/2.0.49 (Fedora) 9.Content-Length: 59 10.Content-Type: text/html; charset=UTF-8 11.Client-Date: Wed, 13 Apr 2005 20:32:54 GMT 12.Client-Peer: 140.247.197.240:80 13.Client-Response-Num : 1 14.Set-Cookie2: MyCookie=140.247.197.240.1113424374035983; \ 15. path=/; max-age=2858400; \ 16. domain=.dce.harvard.edu; version=1 17. 18. 19.

CookieTracking on 1.CookieStyle RFC2965 2.CookieName MyCookie 3.CookieExpires "1 month 3 days 2 hours" 4.CookieDomain .dce.harvard.edu 5. 6.

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

80 of 101 4/10/2007 9:53 AM

WWW Access Control

You can implement access control on all or part of your Web site so that:

users must provide a username and password (Basic Authentication);users' computers must be within a particular domain

Page 41: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

81 of 101 4/10/2007 9:53 AM

Basic Authentication: Warning

Basic Authentication alone does not provide the security and privacy to adequately protecttruly confidential or personal information.

Basic Authentication is analogous to simply "closing a door" to parts of your Web site. It will preventthe casual or polite users from "opening the door", but will not prevent someone mildly determined towalking in.

Two issues that contribute to the lack of security and privacy are:

the content is transmitted over the network in plaintextthe usernames and passwords (submitted with each HTTP request) is transmitted over thenetwork in plaintext

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

82 of 101 4/10/2007 9:53 AM

HTTP: Authenticate

view plain print ?

minerva% telnet 140.247.197.240 80 1.Trying 140.247.197.240... 2.Connected to 140.247.197.240. 3.Escape character is '^]'. 4.HEAD /apache/access/example1/ HTTP/1.1 5.Host: cscie12.dce.harvard.edu 6. 7.HTTP/1.1 401 Authorization Required 8.Connection: close 9.Date: Wed, 13 Apr 2005 20:44:39 GMT 10.Server: Apache/2.0.49 (Fedora) 11.WWW-Authenticate: Basic realm="Basic Authentication Tutorial 1" 12.Content-Length: 492 13.Content-Type: text/html; charset=iso-8859-1 14.Client-Date: Wed, 13 Apr 2005 20:44:39 GMT 15.Client-Peer: 140.247.197.240:80 16.Client-Response-Num : 1 17.

Page 42: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

83 of 101 4/10/2007 9:53 AM

HTTP: Authentication/Authorization

The username:password is sent MIME BASE 64 encoded (not encrypted).

view plain print ?

minerva% telnet 140.247.197.240 80 1.Trying 140.247.197.240... 2.Connected to 140.247.197.240. 3.Escape character is '^]'. 4.HEAD /apache/access/example1/ HTTP/1.1 5.Host: cscie12.dce.harvard.edu 6.Authorization: Basic Z3Vlc3Q6Z3Vlc3Q= 7. 8.HTTP/1.1 200 OK 9.Connection: close 10.Date: Wed, 13 Apr 2005 20:47:53 GMT 11.Accept-Ranges: bytes 12.Server: Apache/2.0.49 (Fedora) 13.Content-Length: 124 14.Content-Type: text/html; charset=UTF-8 15.Client-Date: Wed, 13 Apr 2005 20:47:53 GMT 16.Client-Peer: 140.247.197.240:80 17.Client-Response-Num : 1 18. 19. 20.

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

84 of 101 4/10/2007 9:53 AM

Access Control Documentation

Apache

Apache FAQ has a section on user authentication.Using User Authentication from Apache WeekRelevant Apache Module and Directive Documentation

mod_access modulemod_auth modulerequire directivesatisfy directive

Page 43: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

85 of 101 4/10/2007 9:53 AM

Implementing Access Control

To implement access control, you must create a file name '.htaccess' that contains with the properconfiguration instructions. You may also need to create a ".htpasswd" file using the utility "htpasswd"and a ".htgroup" file.

htpasswd program.htaccess filehtpasswd filehtgroup file

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

86 of 101 4/10/2007 9:53 AM

htpasswd file

.htpasswd This file contains usernames and encrypted passwords (username:enc_passwd). It is created andmanaged with the utility, "htpasswd", which can be run from the command line.

This file should notlie within your public_html. It should reside at the root level of your home directory (for example,/home/courses/j/h/jharvard/.htpasswd

This file needs to be readable by the Web Server.

Sample content:

view plain print ?

minerva% which htpasswd 1./usr/bin/htpasswd 2.minerva% htpasswd 3.Usage: htpasswd [-c] passwordfile username 4.The -c flag creates a new file. 5. 6.

view plain print ?

minerva% more ~e12/.htpasswd.demo 1.guest:79WeSn3vYGsKQ 2.guest2:wGcgIYLtHNIpM 3.guest3:j9VzpSX/C8Kr2 4.guest4:CjHmW1PWNFwXM 5.

Page 44: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

87 of 101 4/10/2007 9:53 AM

htgroup file

.htgroup This file contains group definitions (group_name:member1 member2 ...).

This file should notlie within your public_html. It should reside at the root level of your home directory (for example,/home/courses/j/h/jharvard/.htgroup

This file needs to be readable by the Web Server.

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

88 of 101 4/10/2007 9:53 AM

Access Control Examples

For the examples given, the user "cscie12" is used. You should substitute your username andhome directory appropriately.

The following .htpasswd.demo and .htgroup.demo files are used:

/home/e12/.htpasswd.demo The .htpasswd.demo was generated by using the utility "htpasswd"

Password for "guest" (and all other entries) is "guest". Entries for guest2, guest3, and guest4 arecreated without the "-c" flag, since the .htpasswd.demo file already exists.

Contents of file:

.htgroup.demo Contents of file:

view plain print ?

minerva% htpasswd 1.Usage: htpasswd [-c] passwordfile username 2.The -c flag creates a new file. 3.minerva% htpasswd -c /home/e12/.htpasswd.demo guest 4.Adding password for guest 5.New password: ***** 6.Re-type password: ***** 7.

guest:79WeSn3vYGsKQ 1.guest2:PR4APgA.4CKO. 2.guest3:5DbCMPbSDstj2 3.guest4:htPnr8jT4bI5E 4.

view plain print ?

VIP: guest guest4 1. 2.

Page 45: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

89 of 101 4/10/2007 9:53 AM

Access Control Example 1

Any valid user in .htpasswd.demo is allowed access

The"AuthName" is the description that is displayed by the browser in the Basic Authentication dialogbox.

Contents of sample .htaccess file:

Demonstration of Example 1 You may login as any of the following users (username:password): guest:guest guest2:guest guest3:guest guest4:guest

AuthName "Basic Authentication Tutorial 1" 1.AuthType Basic 2.AuthUserFile /home/e12/.htpasswd.demo 3.require valid-user 4.

view plain print ?

minerva% lwp-request -USed -C"guest:iforgot" http://cscie12.dce.harvard.edu/apache/access/ex1.GET http://cscie12.dce.harvard.edu/apache/access/example1/ 2.Authorization: Basic Z3Vlc3Q6aWZvcmdvdA== 3.User-Agent: lwp-request/2.06 4. 5.GET http://cscie12.dce.harvard.edu/apache/access/example1/ --> 401 Authorization Required 6.GET http://cscie12.dce.harvard.edu/apache/access/example1/ --> 401 Authorization Required 7.Connection: close 8.Date: Wed, 13 Apr 2005 20:53:42 GMT 9.Server: Apache/2.0.49 (Fedora) 10.WWW-Authenticate: Basic realm="Basic Authentication Tutorial 1" 11.Content-Length: 492 12.Content-Type: text/html; charset=iso-8859-1 13.Client-Date: Wed, 13 Apr 2005 20:53:42 GMT 14.Client-Peer: 140.247.197.240:80 15.Client-Response-Num : 1 16.Client-Warning: Credentials for 'guest' failed before 17.Title: 401 Authorization Required 18.X-Pad: avoid browser bug 19. 20.minerva% lwp-request -USed -C"guest2:guest2" http://cscie12.dce.harvard.edu/apache/access/ex21.GET http://cscie12.dce.harvard.edu/apache/access/example1/ 22.Authorization: Basic Z3Vlc3QyOmd1ZXN0Mg== 23.User-Agent: lwp-request/2.06 24. 25.GET http://cscie12.dce.harvard.edu/apache/access/example1/ --> 401 Authorization Required 26.GET http://cscie12.dce.harvard.edu/apache/access/example1/ --> 200 OK 27.Connection: close 28.Date: Wed, 13 Apr 2005 20:59:05 GMT 29.Accept-Ranges: bytes 30.Server: Apache/2.0.49 (Fedora) 31.Content-Length: 124 32.Content-Type: text/html; charset=UTF-8 33.Client-Date: Wed, 13 Apr 2005 20:59:05 GMT 34.Client-Peer: 140.247.197.240:80 35.Client-Response-Num : 1 36.

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

90 of 101 4/10/2007 9:53 AM

Access Control Example 2

Only certain users in .htpasswd.demo are allowed access

Contents of sample .htaccess file:

Demonstration of Example 2 Only guest2 and guest3 are authorized: guest2:guest guest3:guest

Unauthorized: guest:guest guest4:guest

AuthName "Basic Authentication Tutorial 2" 1.AuthType Basic 2.AuthUserFile /home/e12/.htpasswd.demo 3.require user guest2 guest3 4.

Page 46: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

91 of 101 4/10/2007 9:53 AM

Access Control Example 3

Only members of a particular group are allowed access

Contents of .htaccess file:

Contents of .htgroup.demo file:

Demonstration of Example 3 Only members of the group "VIP" (as defined by /home/e12/.htgroup.demo) are authorized (guest andguest4): guest:guest guest4:guest

Unauthorized: guest2:guest guest3:guest

AuthName "Basic Authentication Tutorial 3" 1.AuthType Basic 2.AuthUserFile /home/e12/.htpasswd.demo 3.AuthGroupFile /home/e12/.htgroup.demo 4.require group VIP 5.

view plain print ?

VIP: guest guest4 1. 2.

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

92 of 101 4/10/2007 9:53 AM

Access Control Example 4

Only certain computers are allowed access

Contents of sample .htaccess file:

Demonstration of Example 4 Computers that are on the Harvard network (computers with hostnames ending in .harvard.edu orwith IP addreses beginning with 128.103 or 140.247) will have access, others will be denied.

order deny,allow 1.deny from all 2.allow from 140.247 3.allow from 128.103 4.allow from .harvard.edu 5.

Page 47: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

93 of 101 4/10/2007 9:53 AM

Access Control Example 5

Only certain computers are denied access

Contents of sample .htaccess file:

Demonstration of Example 5 Connections from within the domain 'fas.harvard.edu' will be denied.

order allow,deny 1.allow from all 2.deny from .fas.harvard.edu 3.

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

94 of 101 4/10/2007 9:53 AM

Access Control Example 6

Certain computers are allowed in; others must provide a username and password

Contents of sample .htaccess file:

Demonstration of Example 6 Connection from within ".yale.edu" will be allowed; others must provide a valid username andpassword.

order deny,allow 1.deny from all 2.allow from .yale.edu 3.AuthType Basic 4.AuthUserFile /home/e12/.htpasswd.demo 5.AuthName "Basic Authentication Tutorial 6" 6.require valid-user 7.satisfy any 8.

Page 48: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

95 of 101 4/10/2007 9:53 AM

Access Control Example 7

Only certain computers are allowed in and users must provide a valid username andpassword.

Contents of sample .htaccess file:

Demonstration of Example 7 and satisfy all

order deny,allow 1.deny from all 2.allow from .harvard.edu 3.AuthType Basic 4.AuthUserFile /home/e12/.htpasswd.demo 5.AuthName "Basic Authentication Tutorial 7" 6.require valid-user 7.satisfy all 8.

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

96 of 101 4/10/2007 9:53 AM

Requiring SSL (https://)

SSL (Secure Socket Layer) is a protocol that encrypts data between the client and the server. https is HTTP over SSL. More details in our last lecture on Security and Privacy.

Contents of sample .htaccess file:

Allowed: https://www.people.fas.harvard.edu/~heitmey/secure/index.htmlForbidden: http://www.people.fas.harvard.edu/~heitmey/secure/index.html

SSLRequireSSL 1.

Page 49: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

97 of 101 4/10/2007 9:53 AM

Details about enabling .htaccess and allowed directives

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

98 of 101 4/10/2007 9:53 AM

Legal Directives I: Context

Certain Apache directives are legal within .htaccess files. Some are not. See the Apache Documentation for details. Specifically, look at the Context line that is given for thedirective in question.

Apache Core Features http://www.apache.org/docs/2.0/mod/core.htmlApache Module List http://www.apache.org/docs/2.0/mod/standard Apache Directives http://www.apache.org/docs/2.0/mod/directives.html

The following is an excerpt from the Apache HTTP Server Version 1.3 documentation

ErrorDocument directive

Syntax: ErrorDocument error-code document Context: server config, virtual host, directory, .htaccess Status: core Override: FileInfo Compatibility: The directory and .htaccess contexts are only available in Apache 1.1 and later.

Also, the "a" indicator on the Apache Quick Reference Card indicates that the directive is valid within an .htaccess file.

Page 50: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

99 of 101 4/10/2007 9:53 AM

Legal Directives II: AllowOverride

Users are allowed to override certain aspects of the main server configuration. The main server configuration file (httpd.conf) contains an AllowOverride directive that determines which directives within .htaccess files Apache will process. The Override line that is given for eachdirective in the Apache documentationindicates which configuration directive must be active in order to use that directive with an .htaccessfile.

For the FAS system, the main server configuration file has the following directive in place for users'public_html directories:

The following is an excerpt from the Apache HTTP Server Version 1.3 documentation

ErrorDocument directive

Syntax: ErrorDocument error-code document Context: server config, virtual host, directory, .htaccess Status: core Override: FileInfo Compatibility: The directory and .htaccess contexts are only available in Apache 1.1 and later.

AllowOverride FileInfo AuthConfig Limit Indexes Options 1. 2.

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

100 of 101 4/10/2007 9:53 AM

Legal Directives III: Apache Modules

Apache is distributed with several modules. These modules may or may not be active within the Apache server with which you are working. The Core features will always be available.

For example, if the Rewrite Module (mod_rewrite) has not been activated, none of the Rewritedirectives will be available to use.

Refer to the Status and Modulelines in the documentation for each directive and to the documentation for the specific Apacheinstallation you are using.

Page 51: Hypertext Transfer Protocol - Harvard Universitycscie12.dce.harvard.edu/lecture_notes/2006-07/20070410.pdfApr 10, 2007  · The interaction between two programs when they communicate

Hypertext Transfer Protocol http://localhost:8080/cocoon/projects/cscie12/slides/20070410/handout.html

101 of 101 4/10/2007 9:53 AM

Apache Modules

On the Apache (Apache/2.0.51) minerva.dce.harvard.edu web server, the following Apache modulesare active:

LoadModule access_module modules/mod_access.soLoadModule auth_module modules/mod_auth.soLoadModule auth_anon_module modules/mod_auth_anon.soLoadModule auth_dbm_module modules/mod_auth_dbm.soLoadModule auth_digest_module modules/mod_auth_digest.soLoadModule ldap_module modules/mod_ldap.soLoadModule auth_ldap_module modules/mod_auth_ldap.soLoadModule include_module modules/mod_include.soLoadModule log_config_module modules/mod_log_config.soLoadModule env_module modules/mod_env.soLoadModule mime_magic_module modules/mod_mime_magic.soLoadModule cern_meta_module modules/mod_cern_meta.soLoadModule expires_module modules/mod_expires.soLoadModule deflate_module modules/mod_deflate.soLoadModule headers_module modules/mod_headers.soLoadModule usertrack_module modules/mod_usertrack.soLoadModule setenvif_module modules/mod_setenvif.soLoadModule mime_module modules/mod_mime.soLoadModule dav_module modules/mod_dav.soLoadModule status_module modules/mod_status.soLoadModule autoindex_module modules/mod_autoindex.soLoadModule asis_module modules/mod_asis.soLoadModule info_module modules/mod_info.soLoadModule dav_fs_module modules/mod_dav_fs.soLoadModule vhost_alias_module modules/mod_vhost_alias.soLoadModule negotiation_module modules/mod_negotiation.soLoadModule dir_module modules/mod_dir.soLoadModule imap_module modules/mod_imap.soLoadModule actions_module modules/mod_actions.soLoadModule speling_module modules/mod_speling.soLoadModule userdir_module modules/mod_userdir.soLoadModule alias_module modules/mod_alias.soLoadModule rewrite_module modules/mod_rewrite.soLoadModule proxy_module modules/mod_proxy.soLoadModule proxy_ftp_module modules/mod_proxy_ftp.soLoadModule proxy_http_module modules/mod_proxy_http.soLoadModule proxy_connect_module modules/mod_proxy_connect.soLoadModule cache_module modules/mod_cache.soLoadModule suexec_module modules/mod_suexec.soLoadModule disk_cache_module modules/mod_disk_cache.soLoadModule file_cache_module modules/mod_file_cache.soLoadModule mem_cache_module modules/mod_mem_cache.soLoadModule cgi_module modules/mod_cgi.soLoadModule dav_svn_module /usr/lib/httpd/modules/mod_dav_svn.soLoadModule authz_svn_module /usr/lib/httpd/modules/mod_authz_svn.so

Table of Contents | All Slides | Link List | CSCI E-12