Lecture 5 Dynamic Web Servers CPE 401 / 601 Computer Network Systems slides are modified from Dave Hollinger

Web Server

• Talks HTTP

• Looks at METHOD, URI to determine what the client wants.

• For GET, the URI is often just the path of a file
  – relative to some directory on the web server

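GET /foo/blah

[Figure: directory tree on the web server. The root / contains usr, bin, www, and etc; the next level shows foo, fun, and gif; blah is the file named by the request.]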
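The mapping from a GET URI to a file can be sketched as below. This is only an illustration: the function name `uri_to_path` and the document root `/www` are assumptions, and a real server would apply more careful checks than rejecting "..".

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: join a document root and a request URI.
 * Returns 0 on success, -1 if the URI is rejected or does not fit. */
int uri_to_path(const char *docroot, const char *uri, char *out, size_t outsz)
{
    /* The URI must be absolute and must not try to climb out of docroot. */
    if (uri[0] != '/' || strstr(uri, "..") != NULL)
        return -1;

    int n = snprintf(out, outsz, "%s%s", docroot, uri);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

With a document root of /www, the request GET /foo/blah would be served from /www/foo/blah.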

Dynamic Documents

• Dynamic Documents can provide:
  – automation of web site maintenance
  – customized advertising
  – database access
  – shopping carts
  – date and time service
  – …

Web Programming

• Writing programs that create dynamic documents has become very important

• There are a number of general approaches:
  – Create a custom server for each service desired
    • Each is available on a different port
  – Develop a really smart web server
    • Server Side Includes, scripting, server APIs
  – Have the web server run external programs

Custom Server

• Write a TCP server that watches a “well known” port for requests

• Develop a mapping from http requests to service requests

• Send back HTML (or whatever) that is created/selected by the server process

• Have to handle http errors, headers, etc

Drawbacks to Custom Server Approach

• We might have lots of ideas for custom services
  – Each requires a dedicated address (port)
  – Each needs to include:
    • basic TCP server code
    • parsing HTTP requests
    • error handling
    • headers
    • access control

Smart Web Server

• Take a general purpose Web server (that can handle static documents) and
  – have it process requested documents as it sends them to the client

• The documents could contain commands that the server understands
  – the server includes some kind of interpreter

Example Smart Server

• Have the server read each HTML file as it sends it to the client

• The server could look for this:
    <SERVERCODE> some command </SERVERCODE>

• The server doesn’t send this part to the client; instead, it interprets the command and sends the result to the client

• Everything else is sent normally

Server Side Includes

• Server Side Includes (SSI) provides a set of commands that a server will interpret

• Typically the server is configured to look for commands only in specially marked documents
  – so normal documents aren’t slowed down

• SSI commands are called directives
  – Directives are embedded in HTML comments

SSI Directives

• A comment looks like this:
    <!-- this is an HTML comment -->

• A directive looks like this:
    <!--#command parameter="arg"-->

• SSI servers keep a number of useful things in environment variables:
    DOCUMENT_NAME, DOCUMENT_URL

• echo: inserts the value of an environment variable into the page
    This page is located at <!--#echo var="DOCUMENT_URL"-->
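How a server might expand the echo directive can be sketched as follows. This is a simplified illustration, not how any particular server implements SSI: the function name `ssi_expand_echo` is hypothetical, only the exact form `<!--#echo var="NAME"-->` is recognized, and printing `(none)` for an unset variable is a choice made for this sketch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy `in` to `out`, replacing each
 * <!--#echo var="NAME"--> with the value of environment variable NAME.
 * Returns 0 on success, -1 on a malformed directive or a full buffer. */
int ssi_expand_echo(const char *in, char *out, size_t outsz)
{
    static const char dir_open[] = "<!--#echo var=\"";
    static const char dir_close[] = "\"-->";
    size_t used = 0;

    if (outsz == 0)
        return -1;
    while (*in) {
        /* Copy ordinary text up to the next directive (or to the end). */
        const char *start = strstr(in, dir_open);
        size_t plain = start ? (size_t)(start - in) : strlen(in);
        if (used + plain >= outsz)
            return -1;
        memcpy(out + used, in, plain);
        used += plain;
        if (start == NULL)
            break;

        /* Extract the variable name between the quotes. */
        const char *name = start + strlen(dir_open);
        const char *end = strstr(name, dir_close);
        char var[64];
        if (end == NULL || (size_t)(end - name) >= sizeof var)
            return -1;
        memcpy(var, name, (size_t)(end - name));
        var[end - name] = '\0';

        /* Insert the value instead of the directive itself. */
        const char *val = getenv(var);
        if (val == NULL)
            val = "(none)";   /* placeholder chosen for this sketch */
        size_t vlen = strlen(val);
        if (used + vlen >= outsz)
            return -1;
        memcpy(out + used, val, vlen);
        used += vlen;

        in = end + strlen(dir_close);
    }
    out[used] = '\0';
    return 0;
}
```

Everything outside a directive is copied unchanged, which matches the idea that the rest of the document is sent normally.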

SSI Directives

• include: inserts the contents of a text file.
    <!--#include file="banner.html"-->

• flastmod: inserts the time and date that a file was last modified.
    Last modified: <!--#flastmod file="foo.html"-->

• exec: runs an external program and inserts the output of the program.
    Current users: <!--#exec cmd="/usr/bin/who"-->

Danger! Danger! Danger! (exec runs a program on the server)

More Power

• Some servers support elaborate scripting languages

• Scripts are embedded in HTML documents, and the server interprets the script:
  – Microsoft Active Server Pages (ASP)
    • JScript, VBScript, PerlScript
  – Netscape LiveWire
    • JavaScript, SQL connection library
  – Many others…

Server Mapping and APIs

• Some servers include a programming interface that allows you to extend the capabilities of the server by writing modules

• Specific URLs are mapped to specific modules instead of to files

External Programs

• Another approach is to provide a standard interface between external programs and web servers
  – We can run the same program from any web server
  – The web server handles all the HTTP
    • we focus on the special service only
  – It doesn’t matter what language we use to write the external program

Common Gateway Interface

• CGI is a standard interface to external programs supported by most (if not all) web servers
  – CGI programs are often written in scripting languages (perl, tcl, etc.)

• The interface that is defined by CGI includes:
  – Identification of the service (i.e., external program)
  – Mechanism for passing the request to the external program

Common Gateway Interface

• CGI is a standard mechanism for:

– Associating URLs with programs that can be run by a web server

– A protocol (of sorts) for how the request is passed to the external program

– How the external program sends the response to the client

CGI Programming

[Figure: the CLIENT sends an http request to the HTTP SERVER and receives an http response; the server starts the CGI Program using setenv(), dup(), fork(), exec(), ...]

CGI URLs

• There is a mapping between URLs and CGI programs, provided by the web server
  – The exact mapping is not standardized
    • the web server admin can set it up

• Typically:
  – requests that start with /CGI-BIN/, /cgi-bin/, /cgi/, etc. refer to CGI programs
    • not to static documents
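A prefix check of this kind could look like the sketch below. The prefix table simply repeats the examples from the slide; since the mapping is not standardized, a real server would take it from its configuration. The name `is_cgi_request` is hypothetical.

```c
#include <string.h>

/* Example prefixes from the slide; a real server would read these from
 * its configuration rather than hard-coding them. */
static const char *const cgi_prefixes[] = { "/CGI-BIN/", "/cgi-bin/", "/cgi/" };

/* Returns 1 if the URI should be handled by a CGI program, 0 otherwise. */
int is_cgi_request(const char *uri)
{
    size_t count = sizeof cgi_prefixes / sizeof cgi_prefixes[0];
    for (size_t i = 0; i < count; i++) {
        if (strncmp(uri, cgi_prefixes[i], strlen(cgi_prefixes[i])) == 0)
            return 1;
    }
    return 0;
}
```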

HTTP Server - CGI Interaction

[Figure: the HTTP SERVER communicates with the CGI Program through stdin, stdout, and Environment Variables.]

Environment Variables

• The web server sets some environment variables with information about the request

• The web server fork()s and the child process exec()s the CGI program

• The CGI program gets information about the request from environment variables

STDIN, STDOUT

• Before calling exec(), the child process sets up pipes so that
  – stdin comes from the web server and
  – stdout goes to the web server

• In some cases part of the request is read from stdin

• Anything written to stdout is forwarded by the web server to the client
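The server side of this interaction might be sketched as below, assuming a POSIX system. The helper name `run_cgi` is hypothetical, only stdout is redirected (a fuller version would also connect the child's stdin to the request body), and the captured output is simply returned in a buffer rather than forwarded to a client socket.

```c
#define _POSIX_C_SOURCE 200809L  /* expose setenv() under strict -std modes */
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program argv[0] as a CGI and capture what
 * it writes to stdout. Returns the number of bytes captured, or -1. */
ssize_t run_cgi(char *const argv[], const char *method, const char *query,
                char *out, size_t outsz)
{
    int fds[2];
    if (outsz == 0 || pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: describe the request in environment variables. */
        setenv("REQUEST_METHOD", method, 1);
        setenv("QUERY_STRING", query, 1);
        /* Child: make stdout the write end of the pipe, then exec. */
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        execv(argv[0], argv);
        _exit(127);              /* reached only if exec failed */
    }

    /* Parent (the web server): read whatever the CGI program prints. */
    close(fds[1]);
    size_t total = 0;
    ssize_t n;
    while (total < outsz - 1 &&
           (n = read(fds[0], out + total, outsz - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

Setting the environment in the child, after fork() and before exec(), keeps the server's own environment unchanged between requests.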

Request Method: Get

• GET requests can include a query string as part of the URL:

GET /cgi-bin/login?mgunes HTTP/1.0

  Request Method: GET
  Resource Name: /cgi-bin/login
  Delimiter: ?
  Query String: mgunes
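Splitting a request line into these parts can be sketched as follows. The struct and function names are hypothetical, the fixed buffer sizes are arbitrary, and the HTTP version field is ignored for brevity.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the parts of a request line. */
struct request {
    char method[16];
    char resource[256];
    char query[256];
};

/* Parse e.g. "GET /cgi-bin/login?mgunes HTTP/1.0".
 * Returns 0 on success, -1 if the line is malformed or too long. */
int parse_request_line(const char *line, struct request *req)
{
    char target[512];

    /* First token is the method, second is the request target. */
    if (sscanf(line, "%15s %511s", req->method, target) != 2)
        return -1;

    /* The first '?' separates the resource name from the query string. */
    char *q = strchr(target, '?');
    if (q != NULL)
        *q++ = '\0';
    const char *query = (q != NULL) ? q : "";

    if (strlen(target) >= sizeof req->resource ||
        strlen(query) >= sizeof req->query)
        return -1;
    strcpy(req->resource, target);
    strcpy(req->query, query);
    return 0;
}
```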

Simple GET queries - ISINDEX

• You can put an <ISINDEX> tag inside an HTML document
  – The browser will create a text box that allows the user to enter a single string

• If an ACTION is specified in the ISINDEX tag, when the user presses Enter,
  – a request will be sent to the server specified as ACTION

ISINDEX Example

Enter a string:

<ISINDEX ACTION="http://foo.com/search.cgi">

Press Enter to submit your query.

• If you enter the string “blahblah”,
  – the browser will send a request to the http server at foo.com that looks like this:

GET /search.cgi?blahblah HTTP/1.1

What the CGI sees

• The CGI Program gets REQUEST_METHOD using getenv:

    char *method;
    method = getenv("REQUEST_METHOD");
    if (method==NULL) … /* error! */

Getting the GET

• If the request method is GET:

    if (strcasecmp(method,"get")==0)

• The next step is to get the query string from the environment variable QUERY_STRING

    char *query;
    query = getenv("QUERY_STRING");

Send back http Response and Headers

• CGI program can send back an http status line:

    printf("HTTP/1.1 200 OK\r\n");

• and headers:

    printf("Content-type: text/html\r\n");
    printf("\r\n");

Important!

• CGI program doesn’t have to send a status line
  – HTTP server will do this for you if you don’t

• CGI program must always send back at least one header line indicating the data type of the content
  – usually text/html

• The web server will typically throw in a few header lines of its own
  – Date, Server, Connection
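Putting the previous steps together, the core of a GET-handling CGI program might look like this sketch. The function name `cgi_handle_get` is hypothetical, and it writes to a `FILE *` (pass `stdout` in a real CGI program) so the logic can be exercised separately. It omits the status line and sends the reply as text/plain, a choice made here so that echoing the query string back cannot inject HTML.

```c
#define _POSIX_C_SOURCE 200809L  /* POSIX declarations under strict -std modes */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>

/* Hypothetical handler: answer a GET request by echoing the query string.
 * Returns 0 on success, -1 if the request is not a GET. */
int cgi_handle_get(FILE *out)
{
    const char *method = getenv("REQUEST_METHOD");
    if (method == NULL || strcasecmp(method, "get") != 0)
        return -1;

    const char *query = getenv("QUERY_STRING");
    if (query == NULL)
        query = "";

    /* One content-type header, then a blank line, then the content. */
    fprintf(out, "Content-type: text/plain\r\n");
    fprintf(out, "\r\n");
    fprintf(out, "query=%s\n", query);
    return 0;
}
```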

Security!!!

• It is a very bad idea to build a command line containing user input!

• What if the user submits: “ ; rm -r *;”
  – a command line of the form grep <input> /usr/dict/words then becomes:

    grep ; rm -r *; /usr/dict/words
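One defensive measure is to accept only input that matches a strict whitelist before it gets anywhere near a command. The sketch below, with the hypothetical name `is_safe_word`, accepts only non-empty strings of letters and digits; whether that is the right rule depends on the service. Avoiding the shell entirely, by calling exec() with an argument vector, is generally safer still.

```c
#include <ctype.h>

/* Hypothetical check: 1 if s is a non-empty string of letters and digits,
 * 0 otherwise. Shell metacharacters such as ';', '*', and spaces fail. */
int is_safe_word(const char *s)
{
    if (*s == '\0')
        return 0;
    for (; *s != '\0'; s++) {
        if (!isalnum((unsigned char)*s))
            return 0;
    }
    return 1;
}
```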

Beyond ISINDEX - Forms

• Many Web services require more than a simple ISINDEX

• HTML includes support for forms:
  – lots of field types
  – entire contents of form must be stuck together and put in QUERY_STRING by the Web server

Form Fields

• Each field within a form has a name and a value

• The browser creates a query that
  – includes a sequence of “name=value” substrings and
  – sticks them together separated by the ‘&’ character

• If the user types in “Mehmet H.” as the name and “none” for occupation,
  – the query would look like this:

    “name=Mehmet+H%2E&occupation=none”

HTML Forms

• Each form includes a METHOD that determines what http method is used to submit the request

• Each form includes an ACTION that determines where the request is made

An HTML Form

<FORM METHOD=GET ACTION="http://foo.com/signup.cgi">

Name:

<INPUT TYPE=TEXT NAME=name><BR>

Occupation:

<INPUT TYPE=TEXT NAME=occupation><BR>

<INPUT TYPE=SUBMIT>

</FORM>

What a CGI will get

• query (from the environment variable QUERY_STRING) will be
  – a URL-encoded string containing the name,value pairs of all form fields

• The CGI must decode the query and separate the individual fields
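Decoding and field separation can be sketched as below. The names `url_decode` and `get_field` are hypothetical. The sketch decodes '+' and %XX sequences in values only, compares field names without decoding them, and returns the first match.

```c
#include <ctype.h>
#include <string.h>

/* Value of a hex digit, or -1 if c is not one. */
static int hexval(int c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return -1;
}

/* Decode in place: '+' becomes a space, %XX becomes the byte 0xXX. */
void url_decode(char *s)
{
    char *dst = s;
    while (*s != '\0') {
        if (*s == '+') {
            *dst++ = ' ';
            s++;
        } else if (*s == '%' && hexval((unsigned char)s[1]) >= 0 &&
                   hexval((unsigned char)s[2]) >= 0) {
            *dst++ = (char)(hexval((unsigned char)s[1]) * 16 +
                            hexval((unsigned char)s[2]));
            s += 3;
        } else {
            *dst++ = *s++;
        }
    }
    *dst = '\0';
}

/* Find `name` among the '&'-separated name=value pairs of `query` and
 * store its decoded value in `out`. Returns 0 if found, -1 otherwise. */
int get_field(const char *query, const char *name, char *out, size_t outsz)
{
    size_t nlen = strlen(name);
    const char *p = query;

    while (*p != '\0') {
        const char *end = strchr(p, '&');
        size_t len = end ? (size_t)(end - p) : strlen(p);

        if (len > nlen && strncmp(p, name, nlen) == 0 && p[nlen] == '=') {
            size_t vlen = len - nlen - 1;
            if (vlen >= outsz)
                return -1;
            memcpy(out, p + nlen + 1, vlen);
            out[vlen] = '\0';
            url_decode(out);
            return 0;
        }
        p += len;
        if (*p == '&')
            p++;
    }
    return -1;
}
```

For the query from the Form Fields slide, looking up name yields "Mehmet H." and looking up occupation yields "none".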

HTTP Method: POST

• GET method delivers data as part of the URI

• POST method delivers data as the content of a request

    <FORM METHOD=POST ACTION=…>

• If REQUEST_METHOD is POST,
  – the query is coming in on STDIN

• The environment variable CONTENT_LENGTH tells us how much data to read

Possible Problem

    char buff[100];
    char *clen = getenv("CONTENT_LENGTH");
    if (clen==NULL)
        /* handle error */
    int len = atoi(clen);
    /* len comes from the client and is never checked against sizeof(buff) */
    if (read(0,buff,len)<0)
        … /* handle error */
    pray_for(!hacker);
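A more defensive way to read the POST body is sketched below, assuming a POSIX system. The name `read_post_body` is hypothetical; the key differences from the code above are that the length is parsed with strtol() and rejected unless it fits in the buffer, and that read() is called in a loop because it may return fewer bytes than requested.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: read a POST body of the length given by the
 * CONTENT_LENGTH string `clen` from `fd` into `buf` (NUL-terminated).
 * Returns the number of bytes read, or -1 on error or oversize input. */
ssize_t read_post_body(int fd, const char *clen, char *buf, size_t bufsz)
{
    if (clen == NULL || bufsz == 0)
        return -1;

    /* Parse the length strictly: digits only, no overflow, not negative. */
    char *end;
    errno = 0;
    long len = strtol(clen, &end, 10);
    if (errno != 0 || end == clen || *end != '\0' || len < 0)
        return -1;

    /* Refuse anything that would not fit, leaving room for the NUL. */
    if ((size_t)len >= bufsz)
        return -1;

    size_t total = 0;
    while (total < (size_t)len) {
        ssize_t n = read(fd, buf + total, (size_t)len - total);
        if (n < 0) {
            if (errno == EINTR)
                continue;      /* interrupted: try again */
            return -1;
        }
        if (n == 0)
            break;             /* client sent less than it promised */
        total += (size_t)n;
    }
    buf[total] = '\0';
    return (ssize_t)total;
}
```

In a CGI program this would be called as read_post_body(0, getenv("CONTENT_LENGTH"), buff, sizeof buff).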

GET vs. POST

• When using forms it’s generally better to use POST:
  – there are limits on the maximum size of a GET query string
    • environment variable
  – a POST query string doesn’t show up in the browser as part of the current URL

CGI Sessions

Typical FORM CGI setup

• User fills out a form and presses submit

• CGI program gets a set of name,value pairs
  – one for each form field

• CGI decides what to do based on the name,value pairs
  – sometimes creates a new form based on the submission

Sessions

• Many web sites allow you to establish a session

– you identify yourself to the system

– now you can visit lots of pages, add stuff to shopping cart, establish preferences, etc

State Information

• Each HTTP request is unrelated to any other – as far as the Web server is concerned

• Each new request to a CGI program starts up a brand new copy of the CGI program

• Providing sessions requires keeping state information

Session Conversation

[Figure: a conversation between Client and Server, with the two requests labeled CGI1 and CGI2.]

Client: Hi! I'm Joe.

Server: Hi Joe (it's him again) Welcome Back...

Client: I wanna buy a cookie.

Server: OK Joe, it will be there tomorrow.

Hidden Field Usage

• One way to propagate state information is to use hidden fields

• User identifies themselves to a CGI program
  – fills out a form

• CGI sends back a form that contains hidden fields that identify the user or session

Hidden does not mean secure!

• Anyone can look at the source of an HTML document
  – hidden fields are part of the document!

• If a form uses GET, all the name/value pairs are sent as part of the URI
  – URI shows up in the browser as the location of the current page

Revised Conversation

• Initial form has field for user name

GET /cgi1?name=joe HTTP/1.0

• CGI1 creates order form with hidden field

GET /cgi2?name=joe&order=cookie HTTP/1.0

Session Keys

• Many Web based systems use hidden fields that identify a session

• When the first request arrives, system generates a unique session key and stores it in a database

• Session key can be included in all forms/links generated by the system – as a hidden field or embedded in a link

Session Key Properties

• Must be unique

• Should expire after a while

• Should be difficult to predict
  – typically use a pseudo-random number generator seeded carefully
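Generating a hard-to-predict key might look like the sketch below. Note that it swaps in a different technique from the slide: instead of seeding a pseudo-random number generator by hand, it reads 16 bytes from the operating system's random source at /dev/urandom, which assumes a Unix-like system. The name `make_session_key` is hypothetical, and uniqueness and expiry would still have to be enforced where the keys are stored.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write a 32-character hexadecimal session key
 * (plus terminating NUL) into `out`. Returns 0 on success, -1 on error. */
int make_session_key(char *out, size_t outsz)
{
    unsigned char raw[16];

    if (outsz < 2 * sizeof raw + 1)
        return -1;

    /* Ask the operating system for unpredictable bytes. */
    FILE *f = fopen("/dev/urandom", "rb");
    if (f == NULL)
        return -1;
    size_t n = fread(raw, 1, sizeof raw, f);
    fclose(f);
    if (n != sizeof raw)
        return -1;

    /* Hex-encode so the key is safe to embed in a form field or link. */
    for (size_t i = 0; i < sizeof raw; i++)
        snprintf(out + 2 * i, 3, "%02x", raw[i]);
    return 0;
}
```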

HTTP Cookies

• A "cookie" is a name,value pair that a CGI program can ask the client to remember

• Client sends this name,value pair along with every request to the CGI

• We can also use "cookies" to propagate state information

Set-Cookie Header Options

• Cookies are set using HTTP headers

• The general form of the Set-Cookie header is:

    Set-Cookie: name=value; options

• The options include:
  – expires=...
  – domain=...
  – path=...

Set-Cookie Fields

• Many options can be specified
  – separated by ";"

Set-Cookie: a=blah; path=/; domain=.cse.unr.edu; expires=Mon, 10-May-2010 12:00:00 GMT

All must be on one line!
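Building such a header on one line, with the expiry date formatted from a timestamp, can be sketched as follows. The function name `format_set_cookie` is hypothetical, the domain option is left out for brevity, and the date layout (weekday, DD-Mon-YYYY, GMT) follows the style of the example above. The weekday and month names come from strftime(), so this assumes the default "C" locale.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: format a one-line Set-Cookie header with a path
 * and an expiry time. Returns 0 on success, -1 if it does not fit. */
int format_set_cookie(char *out, size_t outsz, const char *name,
                      const char *value, const char *path, time_t expires)
{
    char when[64];
    struct tm *utc = gmtime(&expires);

    if (utc == NULL ||
        strftime(when, sizeof when, "%a, %d-%b-%Y %H:%M:%S GMT", utc) == 0)
        return -1;

    /* All options go on the same line, separated by "; ". */
    int n = snprintf(out, outsz, "Set-Cookie: %s=%s; path=%s; expires=%s\r\n",
                     name, value, path, when);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```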

CGI cookie creation

• A CGI program can send back any number of HTTP headers
  – can set multiple cookies

• Content-Type is required!

printf("Content-Type: text/html\r\n");

printf("Set-Cookie: prefs=nofrms\r\n");

printf("Set-Cookie: Java=yes\r\n");

printf("\r\n");

• … now sends document content

Getting HTTP Cookies

• Browser sends each cookie as a header:

Cookie: prefs=nofrms

Cookie: Java=OK

• Web server gives cookies to CGI program via an environment variable
  – or STDIN

Multiple Cookies

• There can be more than one cookie

• Web Server puts them all together

    prefs=nofrms; Java=OK

• and puts this string in the environment variable: HTTP_COOKIE

• Each cookie can be up to 4k bytes

• One "site" can store up to 20 cookies on a user's machine
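Picking one cookie out of the combined string can be sketched as below. The name `get_cookie` is hypothetical; in a CGI program the first argument would come from getenv("HTTP_COOKIE"). The sketch assumes cookies are separated by ';' with optional spaces and does no decoding of values.

```c
#include <string.h>

/* Hypothetical helper: find cookie `name` in a string such as
 * "prefs=nofrms; Java=OK" and copy its value into `out`.
 * Returns 0 if found, -1 if missing or too long for `out`. */
int get_cookie(const char *cookies, const char *name, char *out, size_t outsz)
{
    size_t nlen = strlen(name);
    const char *p = cookies;

    while (*p != '\0') {
        /* Skip the separators between cookies. */
        while (*p == ' ' || *p == ';')
            p++;

        const char *end = strchr(p, ';');
        size_t len = end ? (size_t)(end - p) : strlen(p);

        if (len > nlen && strncmp(p, name, nlen) == 0 && p[nlen] == '=') {
            size_t vlen = len - nlen - 1;
            if (vlen >= outsz)
                return -1;
            memcpy(out, p + nlen + 1, vlen);
            out[vlen] = '\0';
            return 0;
        }
        p += len;
    }
    return -1;
}
```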

Cookies and Privacy

• Cookies can't be used to:
  – send personal information to a web server without the user knowing about it
  – send viruses to a browser
  – find out what other web sites a user has visited*
  – access a user's hard disk

* although they can come pretty close to this!

Some Issues

• Persistent cookies take up space on user's hard disk

• Can be used to track your behavior within a web site
  – This information can be sold or shared

• Cookies can be shared by cooperating sites
  – advertising agencies do this
