06/06/22 HTTP, CGI and Cookies 1 Foundations of the Web: HTTP, CGI and Cookies Ethan Cerami New York University


Foundations of the Web: HTTP, CGI and Cookies

Ethan Cerami, New York University


Road Map
- HTTP Overview
- Example HTTP Session
- HTTP 1.0 v. 1.1
- Structure of Client Requests/Server Responses
- CGI Overview
- Cookies Overview


HTTP Overview


HTTP Overview
- HTTP: HyperText Transfer Protocol
- Developed by Tim Berners-Lee, 1990
- Enables web clients to request documents from web servers
- Stateless protocol
  - Each HTTP request is completely independent.
  - Web servers do not retain any memory of related requests.
  - (Cookies are actually used to maintain state, but more on this later…)


HTTP Client/Server
- Client/server architecture
- Client: web browser that requests a document.
  - Examples: Microsoft Internet Explorer, Netscape Navigator
- Server: web server that returns a document.
  - Examples: Apache Web Server, Microsoft IIS, etc.


HTTP Client/Server
- Client (web browser) to web server: "Give me /index.html"
- Web server to client: "Here you go..."


HTTP via Telnet
- You can run HTTP via the UNIX telnet command.
- Instructions:
  - Log into your UNIX account
  - telnet www.yahoo.com 80
  - GET /
- Good method to learn the details of HTTP


Sample Telnet Session
    bash-2.04$ telnet www.yahoo.com 80
    Trying 216.32.74.50...
    Connected to www.yahoo.akadns.net.
    Escape character is '^]'.

    GET /
    HTTP/1.0 200 OK
    Content-Length: 15582
    Content-Type: text/html

    <html><head><title>Yahoo!</title><base href=http://www.yahoo.com/><meta http-equiv="PICS-Label" content='(PICS-1.1 "http://www.rsac.org/ratingsv01.html" l gen true for "http://www.yahoo.com" r (n 0 s 0 v 0 l))'></head><body><center><form action=http://search.yahoo.com/bin/search><map name=m><area coords="0,0,52,52" href=r/a1><area coords="53,0,121,52" href=r/p1><area coords="122,0,191,52" href=r/m1><area
    ...
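The telnet session can also be reproduced from a program. The following is a minimal Python sketch, not part of the original slides: the function names `build_request` and `fetch` are hypothetical, and the request is sent as HTTP/1.0 so that the server closes the connection when it is done.

```python
import socket


def build_request(path="/", host=None):
    """Build the bytes you would otherwise type into telnet."""
    lines = [f"GET {path} HTTP/1.0"]
    if host:
        lines.append(f"Host: {host}")
    # A blank line (CRLF CRLF) tells the server the request is complete.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def fetch(host, port=80, path="/"):
    """Open a TCP connection, send the request, read until the server closes."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_request(path, host))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # empty read: the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling `fetch("www.yahoo.com")` returns the raw status line, headers, and document as bytes, much like the telnet output.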


Example HTTP Session


Example HTTP Session
- Client requests the following URL: http://hypothetical.ora.com:80/
- Anatomy of the request:
  - http:// HyperText Transfer Protocol; other options: ftp, mailto, etc.
  - hypothetical.ora.com: host name
  - :80: Port number. 80 is reserved for HTTP. Ports can range from 1 to 65,535.
  - /: Root document
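The same breakdown can be done in code. This is a small sketch using Python's standard `urllib.parse` module; the helper name `describe_url` is an illustrative assumption.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into the parts discussed above."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,   # protocol, e.g. http
        "host": parts.hostname,   # host name
        "port": parts.port,       # port number (None if the URL omits it)
        "path": parts.path,       # requested document
    }


print(describe_url("http://hypothetical.ora.com:80/"))
```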


The Client Request
Actual browser request:
    GET / HTTP/1.1
    Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
    Accept-Language: en-us
    Accept-Encoding: gzip, deflate
    User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT)
    Host: hypothetical.ora.com
    Connection: Keep-Alive


Anatomy of the Client Request
- GET / HTTP/1.1
  - Requests the root / document.
  - Specifies HTTP version 1.1.
  - HTTP versions: 1.0 and 1.1 (more on this later…)
- Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
  - Indicates what types of media the browser will accept.


Anatomy of the Client Request
- Accept-Language: en-us
  - Browser’s preferred language
- Accept-Encoding: gzip, deflate
  - Accepts compressed data (speeds download times).
- User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT)
  - Indicates the browser type.


Anatomy of the Client Request
- Host: hypothetical.ora.com
  - Required for HTTP 1.1
  - Optional for HTTP 1.0
  - A server may host multiple hostnames. Hence, the browser indicates the host name here.
- Connection: Keep-Alive
  - Enables “persistent connections”.
  - Faster performance (more later…)
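To make the structure of the request concrete, here is a rough Python sketch that splits the browser request shown earlier into its request line and headers. The function name `parse_request` is hypothetical, and the parser is deliberately simple: it assumes CRLF line endings and one header per line.

```python
def parse_request(raw):
    """Split a raw HTTP request into (method, target, version, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")      # blank line ends the headers
    request_line, *header_lines = head.split("\r\n")
    method, target, version = request_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return method, target, version, headers, body


RAW_REQUEST = (
    "GET / HTTP/1.1\r\n"
    "Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*\r\n"
    "Accept-Language: en-us\r\n"
    "Accept-Encoding: gzip, deflate\r\n"
    "User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT)\r\n"
    "Host: hypothetical.ora.com\r\n"
    "Connection: Keep-Alive\r\n"
    "\r\n"
)

method, target, version, headers, body = parse_request(RAW_REQUEST)
print(method, target, version)
print(headers["host"], headers["connection"])
```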


Server Response
    HTTP/1.1 200 OK
    Date: Mon, 24 Sept 2003 20:54:26 GMT
    Server: Apache/1.3.6 (Unix)
    Last-Modified: Mon, 24 Sept 2003 14:06:11 GMT
    Content-length: 327
    Connection: close
    Content-type: text/html

    <title>Sample Homepage</title>
    <img src="/images/oreilly_mast.gif">
    <h1>Welcome</h1>
    Hi there, this is a simple web page. Granted, it may not be as elegant as some other web pages you've seen on the net, but there are some common qualities...


Anatomy of Server Response
- HTTP/1.1 200 OK
  - Server status code
  - Code 200: Document was found
  - We will examine other status codes shortly.
- Date: Mon, 24 Sept 2003 20:54:26 GMT
  - Date on the server.
  - GMT (Greenwich Mean Time)


Anatomy of Server Response
- Last-Modified: Mon, 24 Sept 2003 14:06:11 GMT
  - Indicates the time when the document was last modified.
  - Very useful for browser caching.
  - If a browser already has the page in its cache, it may not need to request the whole document again (more later…)


Anatomy of Server Response
- Content-length: 327
  - Number of bytes in the document response.
- Connection: close
  - Indicates that the server will close the connection.
  - If the client wants to send another request, it will need to open another connection to the server.


Anatomy of Server Response
- Content-type: text/html
  - Indicates the MIME type of the returned document.
  - MIME: Multipurpose Internet Mail Extensions
  - Enables web servers to return binary or text files.
  - Other MIME categories: audio, video, images, xml
  - Full list of MIME types available online at: http://www.iana.org/assignments/media-types/
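As an illustration of how a server might choose the Content-type value, here is a short sketch using Python's standard `mimetypes` module. The helper name `content_type_for` and the fallback to `application/octet-stream` are assumptions for this example, not something the slides prescribe.

```python
import mimetypes


def content_type_for(filename):
    """Guess a Content-type value from a file name, with a generic binary fallback."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or "application/octet-stream"


for name in ("index.html", "logo.gif", "notes.unknownext"):
    print(name, "->", content_type_for(name))
```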


Anatomy of Server Response

The actual HTML document:
    <title>Sample Homepage</title>
    <img src="/images/oreilly_mast.gif">
    <h1>Welcome</h1>
    Hi there, this is a simple web page. Granted, it may not be as elegant as some other web pages you've seen on the net, but there are some common qualities…


HTTP 1.0 v. 1.1


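Getting Images
- Once a browser receives an HTML page, it makes separate connections to retrieve the images.
- Client (web browser) to web server: "Give me /index.html"
- Web server to client: "Here you go..."
- Client (web browser) to web server: "Now, give me logo.gif"
- Web server to client: "Here you go..."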


HTTP 1.0 v. 1.1
- HTTP 1.0:
  - For each request, you must open a new connection with the server.
- HTTP 1.1:
  - For each request, the default action is to maintain an open connection with the server.
  - Faster, persistent connections
  - Supported by most browsers and servers.


Example: HTTP 1.0 v. 1.1
- HTTP 1.0: Get HTML page plus images
  - Open connection: GET /index.html
  - Open connection: GET /logo.gif
  - Open connection: GET /button.gif
- HTTP 1.1: Get HTML page plus images
  - Open persistent connection:
    - GET /index.html
    - GET /logo.gif
    - GET /button.gif
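The persistent-connection case can be tried locally. The sketch below is an illustrative assumption, not part of the lecture: it starts a throwaway HTTP/1.1 server on the loopback interface using Python's standard library, then sends three GET requests over a single `http.client.HTTPConnection`. The server records the client's TCP port for each request; if the connection really is reused, all three ports are identical.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

seen_client_ports = []  # one entry per request: the client's TCP port


class DemoHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the stdlib server keep connections open

    def do_GET(self):
        seen_client_ports.append(self.client_address[1])
        body = f"contents of {self.path}".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))  # needed for keep-alive
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_over_one_connection(paths):
    """Request every path over one persistent connection; return the status codes."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
        statuses = []
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            response.read()  # drain the body before reusing the connection
            statuses.append(response.status)
        conn.close()
        return statuses
    finally:
        server.shutdown()
        server.server_close()
```

Calling `fetch_over_one_connection(["/index.html", "/logo.gif", "/button.gif"])` mirrors the HTTP 1.1 column above: one connection, three requests.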


Structure of Client Requests


Client Requests
- Every client request includes three parts:
  - Method: Used to indicate the type of request, the name of the requested document, and the HTTP version.
  - Header information: Used to specify browser version, language, etc.
  - Entity body: Used to specify form data for POST requests.


Client Methods
- GET:
  - This is the same GET that we discussed for HTML forms.
- POST:
  - This is the same POST method that we discussed for HTML forms.
  - Data is sent in the entity portion of the HTTP request.


One More Client Method
- HEAD:
  - Similar to GET, except that the method requests only the header information.
  - Server will return the headers (including Last-Modified), but will not return the data portion of the requested document.
  - Useful for browser caching. For example:
    - If the browser contains a cached version of a page, it issues a HEAD request.
    - If the document has not been modified since it was cached, the browser uses the cached version.
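The caching decision described above comes down to comparing two dates. Here is a minimal sketch, assuming the browser stored the Last-Modified value when it cached the page; the function name `cached_copy_is_current` and the sample dates are hypothetical.

```python
from email.utils import parsedate_to_datetime


def cached_copy_is_current(cached_last_modified, head_last_modified):
    """Return True if the server's copy is no newer than the cached copy.

    Both arguments are Last-Modified header values, e.g.
    "Wed, 24 Sep 2003 14:06:11 GMT".
    """
    cached = parsedate_to_datetime(cached_last_modified)
    current = parsedate_to_datetime(head_last_modified)
    return current <= cached


# Unchanged document: reuse the cache. Modified document: issue a full GET.
print(cached_copy_is_current("Wed, 24 Sep 2003 14:06:11 GMT",
                             "Wed, 24 Sep 2003 14:06:11 GMT"))
print(cached_copy_is_current("Wed, 24 Sep 2003 14:06:11 GMT",
                             "Thu, 25 Sep 2003 09:00:00 GMT"))
```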


Structure of Server Responses


Server Responses
- Every server response includes three parts:
  - Response line: HTTP version number, three-digit status code, and status message.
  - Header: Information about the server
  - Entity body: The actual data.


Server Status Codes
- 100-199: Informational
- 200-299: Client request successful
- 300-399: Client request redirected
- 400-499: Client request incomplete (client errors)
- 500-599: Server errors
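Because the first digit determines the category, a client can classify any status code with integer division. A small sketch (the function name `status_class` is an assumption):

```python
def status_class(code):
    """Map a three-digit HTTP status code to its category."""
    if not 100 <= code <= 599:
        raise ValueError(f"not an HTTP status code: {code}")
    categories = {
        1: "Informational",
        2: "Client request successful",
        3: "Client request redirected",
        4: "Client request incomplete",
        5: "Server error",
    }
    return categories[code // 100]  # the first digit selects the category


for code in (200, 301, 404, 500):
    print(code, status_class(code))
```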


Some Important Status Codes
- 200 OK: Request was successful.
- 301 Moved Permanently: Server redirects client to a new URL.
- 404 Not Found: Document does not exist.
- 500 Internal Server Error: Error within the web server.
- All other status codes are available online at: http://www.w3.org/Protocols/HTTP/HTRESP.html


Common Gateway Interface: CGI Overview


Common Gateway Interface
- What is CGI?
  - A general framework for creating server-side web applications.
  - Instead of returning a static web document, the web server returns the results of a program.
  - For example:
    - Browser sends the parameter: name=Ethan.
    - Web server passes the request to a Perl program.
    - Perl program returns HTML that says, Hello, Ethan!


CGI Overview
- Web browser to web server: Name=Ethan
- Web server to C/Perl program: Name=Ethan
- C/Perl program to web server: Hello, Ethan!
- Web server to web browser: Hello, Ethan!


Notes on CGI
- The first mechanism for creating dynamic web sites.
- What languages can you create CGI programs in?
  - Just about any language: C/C++, Perl, Java, etc.


CGI Environment Variables
- CGI includes a number of environment variables.
  - REMOTE_ADDR: Address of client browser
  - SERVER_NAME: The server host name or IP address
  - SERVER_SOFTWARE: Name and version of the server software.
  - QUERY_STRING: A string of GET form variables (the part of the URL after the ?). POST form data arrives on standard input instead.
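To show how a CGI program reads these variables, here is a hedged sketch in Python rather than the Perl used in the lecture. The function name `render_greeting` is hypothetical; it takes the environment as a dictionary so it can be tried without a web server, and it reproduces the name=Ethan example from the CGI overview.

```python
import html
import os
from urllib.parse import parse_qs


def render_greeting(environ):
    """Build a complete CGI response (header block plus HTML body)."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    name = params.get("name", ["World"])[0]
    remote = environ.get("REMOTE_ADDR", "unknown")
    body = (
        f"<p>Hello, {html.escape(name)}!</p>\n"       # escape user-supplied text
        f"<p>Your address: {html.escape(remote)}</p>\n"
    )
    # The header block must end with a blank line before the document starts.
    return "Content-type: text/html\n\n" + body


# Under a real web server the variables come from the process environment:
print(render_greeting(os.environ), end="")
```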


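Hello, World CGI
    #!/usr/bin/perl
    # Every CGI response starts with a header block, ended by a blank line.
    print "Content-type: text/html\n\n";
    # Everything after the blank line is the document body.
    print "Hello, World!\n";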


From CGI to Servlets…
- That’s all you are going to cover on CGI?
- Yes, CGI still represents a good way to create dynamic web applications.
- Nonetheless, Servlets represent a more powerful architecture…
- If you want to get more information on CGI, check out: CGI Programming with Perl (O’Reilly Press).


Cookies Overview


What is a Cookie?
- Small piece of data generated by a web server, stored on the client’s hard drive.
- Serves as an add-on to the HTTP specification (remember, HTTP by itself is stateless).
- Still somewhat controversial, as it enables web sites to track web users and their habits…


Example Cookie Use
- Web site Acme.com wants to track the number of unique visitors who access its site.
- If Acme.com checks the HTTP server logs, it can determine the number of “hits”, but cannot determine the number of unique visitors.*
- That’s because HTTP is stateless. It retains no memory regarding individual users.
- Cookies provide a mechanism to solve this problem.

* Actually, you could check the log files for IP addresses, but you would still have the problem of Internet proxies.


Tracking Unique Visitors
- Step 1: Person A requests the home page for acme.com.
- Step 2: Acme.com web server generates a new unique ID.
- Step 3: Server returns the home page plus a cookie set to the unique ID.
- Step 4: Each time Person A returns to acme.com, the browser automatically sends the cookie along with the GET request.
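The four steps above can be sketched on the server side as follows. This is an illustrative assumption rather than how any particular server does it: the cookie name `visitor_id`, the function `handle_request`, and the use of a random UUID as the unique ID are all made up for the example. Cookie parsing uses Python's standard `http.cookies` module.

```python
import uuid
from http.cookies import SimpleCookie


def handle_request(cookie_header, known_visitors):
    """Return (visitor_id, set_cookie_value); set_cookie_value is None for returning visitors."""
    cookie = SimpleCookie()
    if cookie_header:
        cookie.load(cookie_header)          # parse the browser's Cookie header
    if "visitor_id" in cookie:
        # Step 4: the browser sent the cookie back, so this is a known visitor.
        visitor_id = cookie["visitor_id"].value
        known_visitors.add(visitor_id)
        return visitor_id, None
    # Steps 2 and 3: generate a new unique ID and return it as a cookie.
    visitor_id = uuid.uuid4().hex
    known_visitors.add(visitor_id)
    out = SimpleCookie()
    out["visitor_id"] = visitor_id
    out["visitor_id"]["path"] = "/"
    return visitor_id, out["visitor_id"].OutputString()


visitors = set()
first_id, set_cookie = handle_request(None, visitors)                     # first visit
second_id, nothing = handle_request(f"visitor_id={first_id}", visitors)   # return visit
print(set_cookie)
print(len(visitors), "unique visitor(s)")
```

The size of `visitors` counts unique visitors rather than hits, because a returning browser presents the same ID each time.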


Cookie Conversation
- Browser to server: "Give me the home page!"
- Server to browser: "Here’s the home page plus a cookie."
- Browser to server: "Now, give me the news page" (cookie is sent automatically)
- Server to browser: "I’ve seen you before… Here’s the news page."


Cookie Notes
- Created in 1994 for Netscape 1.1
- Cookies cannot be larger than 4K
- No domain (e.g. netscape.com, microsoft.com) can have more than 20 cookies.
- Cookies stay on your machine until:
  - they automatically expire
  - they are explicitly deleted
- Cookies work the same on all browsers. No cross-browser problems here!


Magical Cookies
- The term cookie comes from an old programming hack, called Magical Cookies.
- If a programmer couldn’t make two parts of a program communicate, he would create a “magical cookie”, a small text file containing data to transfer between program parts.


Cookie Standards
- Version 0 (Netscape):
  - The original cookie specification
  - Implemented by all browsers and servers
  - We will focus on this version
- Version 1:
  - A proposed standard of the Internet Engineering Task Force (IETF)
  - Request for Comments 2109
  - Unfortunately, not very widely used (hence, we will stick to Version 0).


Why use Cookies?
- Tracking unique visitors
- Creating personalized web sites
- Shopping carts
- Tracking users across your site:
  - e.g. do users that visit your sports news page also visit your sports store?


Cookie Anatomy


Cookie Anatomy
- Version 0 specifies six cookie parts:
  - Name
  - Value
  - Domain
  - Path
  - Expires
  - Secure


Cookie Parts: Name/Value
- Name
  - Name of your cookie (required)
  - Cannot contain white space, semicolons or commas.
- Value
  - Value of your cookie (required)
  - Cannot contain white space, semicolons or commas.


Cookie Parts: Domain
- Only pages from the domain which created a cookie are allowed to read the cookie.
  - For example, amazon.com cannot read yahoo.com’s cookies (imagine the security flaws if this were otherwise!)
- By default, the domain is set to the full domain of the web server that served the web page.
  - For example, myserver.mydomain.com would automatically set the domain to .myserver.mydomain.com


Cookie Parts: Domain
- Note that domains are always prepended with a dot.
  - This is a security precaution: all domains must have at least two periods.
- You can, however, set a higher-level domain.
  - For example, myserver.mydomain.com can set the domain to .mydomain.com. This way hisserver.mydomain.com and herserver.mydomain.com can all access the same cookies.
- No matter what, you cannot set a domain other than your own.
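The domain rules can be expressed as two small checks. This is a simplified sketch of the rules as stated on the slides, not a full implementation of any cookie specification; the function names `domain_matches` and `may_set_domain` are hypothetical.

```python
def _normalize(cookie_domain):
    """Lower-case the domain and make sure it starts with a dot."""
    domain = cookie_domain.lower()
    return domain if domain.startswith(".") else "." + domain


def domain_matches(host, cookie_domain):
    """Should a cookie with this domain be sent to this host?"""
    suffix = _normalize(cookie_domain)
    host = host.lower()
    return host == suffix[1:] or host.endswith(suffix)


def may_set_domain(server_host, cookie_domain):
    """May this server set a cookie for this domain?

    The domain needs at least two periods, and the server must belong to it.
    """
    suffix = _normalize(cookie_domain)
    return suffix.count(".") >= 2 and domain_matches(server_host, suffix)


print(may_set_domain("myserver.mydomain.com", ".mydomain.com"))  # higher-level domain
print(may_set_domain("myserver.mydomain.com", ".yahoo.com"))     # someone else's domain
print(may_set_domain("myserver.mydomain.com", ".com"))           # too few periods
```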


Cookie Parts: Path
- Restricts cookie usage within the site.
- By default, the path is set to the path of the page that created the cookie.
- Example: user requests a page from mymall.com/storea. By default, the cookie will only be returned to pages for or under /storea.
- If you set the path to /, the cookie will be returned to all pages (a common practice).
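A path check along these lines decides whether a cookie accompanies a request. The sketch below is an assumption about one reasonable way to match: it compares on path-segment boundaries, so /storea matches /storea/shoes.html but not /storeab, whereas some implementations use a plain prefix comparison. The function name `path_matches` is hypothetical.

```python
def path_matches(request_path, cookie_path):
    """Should a cookie with this path be sent for this request path?"""
    if cookie_path == "/":
        return True  # a path of / applies to every page on the site
    cookie_path = cookie_path.rstrip("/")
    return request_path == cookie_path or request_path.startswith(cookie_path + "/")


print(path_matches("/storea/shoes.html", "/storea"))
print(path_matches("/storeb/index.html", "/storea"))
print(path_matches("/storeb/index.html", "/"))
```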


Cookie Parts: Expires
- Specifies when the cookie will expire.
- Specified in Greenwich Mean Time (GMT): Wdy, DD-Mon-YYYY HH:MM:SS GMT
- If you leave this value blank, the browser will delete the cookie when the user exits the browser.
  - This is known as a session cookie, as opposed to a persistent cookie.
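Producing a date in this format is mostly a matter of converting to GMT and formatting. A minimal sketch, assuming an English (C) locale so that `%a` and `%b` yield names such as Sat and Jan; the function name `cookie_expires` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def cookie_expires(now, days):
    """Format an expiry date `days` from `now` as Wdy, DD-Mon-YYYY HH:MM:SS GMT."""
    expires = now.astimezone(timezone.utc) + timedelta(days=days)
    return expires.strftime("%a, %d-%b-%Y %H:%M:%S GMT")


# One week after 24 Jan 2004 08:00 GMT:
print(cookie_expires(datetime(2004, 1, 24, 8, 0, 0, tzinfo=timezone.utc), days=7))
```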


Cookie Parts: Secure
- The secure flag is designed to ensure that cookies are encrypted while in transit. The flag itself does not encrypt anything; it restricts where the cookie is sent.
- A secure cookie will only be sent over a secure connection (such as SSL).
- In other words, if a cookie is set to secure, and you only connect via a non-secure connection, the cookie will not be sent.


Example Cookie from Google
    HTTP/1.1 200 OK
    Cache-control: private
    Content-Type: text/html
    Set-Cookie: PREF=ID=11cebd117082ef7a:TM=1074966051:LM=1074966051:S=CgHQLEJ57-U9oRXn; expires=Sun, 17-Jan-2038 19:14:07 GMT; path=/; domain=.google.com
    Content-Encoding: gzip
    Server: GWS/2.1
    Content-length: 1216
    Date: Sat, 24 Jan 2004 17:40:51 GMT
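The Set-Cookie value in the Google response can be pulled apart into the cookie parts discussed earlier. The sketch below uses Python's standard `http.cookies.SimpleCookie`; the variable names are illustrative, and the header value is copied from the response with the stray spaces removed.

```python
from http.cookies import SimpleCookie

SET_COOKIE = (
    "PREF=ID=11cebd117082ef7a:TM=1074966051:LM=1074966051:S=CgHQLEJ57-U9oRXn; "
    "expires=Sun, 17-Jan-2038 19:14:07 GMT; path=/; domain=.google.com"
)

cookie = SimpleCookie()
cookie.load(SET_COOKIE)
pref = cookie["PREF"]  # the Morsel object for the cookie named PREF

print("name:   ", pref.key)
print("value:  ", pref.value)
print("domain: ", pref["domain"])
print("path:   ", pref["path"])
print("expires:", pref["expires"])
```

No secure attribute appears in the header, so this cookie would be sent over ordinary non-secure connections as well.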


Example from Amazon.com
    HTTP/1.1 302
    Date: Sat, 24 Jan 2004 17:58:29 GMT
    Server: Stronghold/2.4.2 Apache/1.3.6 C2NetEU/2412 (Unix) amarewrite/0.1 mod_fastcgi/2.2.12
    Set-Cookie: session-id-time=1075536000; path=/; domain=.amazon.com; expires=Saturday, 31-Jan-2004 08:00:00 GMT
    Set-Cookie: session-id=103-0070896-9210277; path=/; domain=.amazon.com; expires=Saturday, 31-Jan-2004 08:00:00 GMT
    Transfer-Encoding: chunked
    Content-Type: text/html