CIT 383: Administrative Scripting


HTTP and HTML


Topics

1. HTTP

2. URLs

3. Cookies

4. Base64


Web Client/Server Interaction

Browser to Server: HTTP Request (form submission)
Server processing (user waits)
Server to Browser: HTTP Response (new web page)
User interaction
Browser to Server: HTTP Request (form submission)
Server processing (user waits)
Server to Browser: HTTP Response (new web page)


HTTP: HyperText Transfer Protocol

Simple request/response protocol
– Request methods: GET, POST, HEAD, etc.
– Protocol versions: 1.0, 1.1

Stateless
– Each request is independent of previous requests, i.e. request #2 doesn’t know you authenticated in request #1.
– Applications are responsible for handling state.


HTTP Request

GET http://www.google.com/ HTTP/1.1
Host: www.google.com
User-Agent: Mozilla/5.0 (Windows NT 5.1) Gecko/20060909 Firefox/1.5.0.7
Accept: text/html, image/png, */*
Accept-Language: en-us,en;q=0.5
Cookie: rememberme=true; PREF=ID=21039ab4bbc49153:FF=4

First line: Method, URL, Protocol Version
Following lines: Headers
Blank line: marks the end of the headers
Body: no data for GET method
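The layout above (request line, headers, blank line, optional body) can be sketched in Ruby by splitting a raw request string. This is a minimal illustration, not how a real server parses requests; parse_request is a hypothetical helper, and the request text is a shortened copy of the example above.

```ruby
# A shortened copy of the request from the slide; HTTP lines end in CRLF.
RAW_REQUEST = [
  'GET http://www.google.com/ HTTP/1.1',
  'Host: www.google.com',
  'Accept: text/html, image/png, */*',
  'Cookie: rememberme=true',
  '', # blank line ends the headers
  ''  # empty body: GET sends no data
].join("\r\n")

# Hypothetical helper: split a raw request into its parts.
def parse_request(raw)
  head, body = raw.split("\r\n\r\n", 2)       # headers end at the blank line
  lines = head.split("\r\n")
  method, url, version = lines.shift.split(' ', 3)
  headers = lines.to_h { |line| line.split(': ', 2) }
  { method: method, url: url, version: version, headers: headers, body: body.to_s }
end

REQUEST = parse_request(RAW_REQUEST)
puts REQUEST[:method]           # method from the request line
puts REQUEST[:version]          # protocol version
puts REQUEST[:headers]['Host']  # one of the headers
puts REQUEST[:body].empty?      # GET carries no body
```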


HTTP Response

HTTP/1.1 200 OK

Cache-Control: private

Content-Type: text/html

Server: GWS/2.1

Date: Fri, 13 Oct 2006 03:16:30 GMT

<HTML> ... (page data) ... </HTML>

First line: Protocol Version, HTTP Response Code
Following lines: Headers
Blank line: marks the end of the headers
Body: Web Page Data
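A response can be taken apart the same way: status line first, then headers, then the page data after the blank line. The sketch below assumes the response text shown above; parse_response is a hypothetical helper written for illustration, not part of Net::HTTP.

```ruby
# The response from the slide as one string; HTTP lines end in CRLF.
RAW_RESPONSE = "HTTP/1.1 200 OK\r\n" \
               "Cache-Control: private\r\n" \
               "Content-Type: text/html\r\n" \
               "Server: GWS/2.1\r\n" \
               "Date: Fri, 13 Oct 2006 03:16:30 GMT\r\n" \
               "\r\n" \
               '<HTML> ... (page data) ... </HTML>'

# Hypothetical helper: split a raw response into status, headers, and body.
def parse_response(raw)
  head, body = raw.split("\r\n\r\n", 2)
  status_line, *header_lines = head.split("\r\n")
  version, code, reason = status_line.split(' ', 3)
  headers = header_lines.to_h { |line| line.split(': ', 2) }
  { version: version, code: code.to_i, reason: reason, headers: headers, body: body }
end

RESPONSE = parse_response(RAW_RESPONSE)
puts RESPONSE[:code]                     # numeric response code
puts RESPONSE[:headers]['Content-Type']  # type of the page data
puts RESPONSE[:body]                     # the web page itself
```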


HTTP Methods

HEAD
Same as GET, but only asks for headers, not body.

GET
Requests a representation of the resource. Most common method. Should not cause server to modify (write, delete) any resources.

POST
Submits data to be processed to the resource. The data is included in the body of the request. This may result in the creation of a new resource, the update of existing resources, or both.

PUT
Uploads a representation of the specified resource.

DELETE
Deletes the specified resource.

TRACE
Echoes back the received request, so that a client can see what intermediate servers are adding or changing in the request.
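Ruby's net/http library has one request class per method, and each class records whether the request carries a body and whether the response is expected to have one. The sketch below only builds request objects and never contacts a server; the path /index.html and the constant names are illustrative assumptions.

```ruby
require 'net/http'

# One request class per HTTP method described above.
REQUEST_CLASSES = [
  Net::HTTP::Head, Net::HTTP::Get, Net::HTTP::Post,
  Net::HTTP::Put, Net::HTTP::Delete, Net::HTTP::Trace
].freeze

# Build each request (nothing is sent) and record its body rules.
METHOD_SUMMARY = REQUEST_CLASSES.to_h do |klass|
  request = klass.new('/index.html') # hypothetical path
  [request.method,
   { sends_body: request.request_body_permitted?,
     expects_body: request.response_body_permitted? }]
end

METHOD_SUMMARY.each do |name, info|
  puts format('%-6s sends body: %-5s response has body: %s',
              name, info[:sends_body], info[:expects_body])
end
```

HEAD is the only method in this list whose response has no body, which matches its description as GET without the page data.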


HTTP Request Headers

Header             Description                                            Example
Accept             Acceptable content types.                              Accept: text/plain
Authorization      HTTP authentication credentials.                       Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
Cache-Control      Caching directives.                                    Cache-Control: no-cache
Cookie             Cookie data for server.                                Cookie: color=red
Date               Date and time sent.                                    Date: 29 Oct 2008 1:02:03
Host               Name of server.                                        Host: cs.nku.edu
If-Modified-Since  Allows a 304 Not Modified to be returned for caching.  If-Modified-Since: 29 Oct 2008 1:02:03 GMT
User-Agent         Browser description string.                            User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.2) Ubuntu/8.04 Firefox/3.1
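Request headers can be set on a net/http request object before it is sent. The sketch below builds a request without sending it; the path is hypothetical, and the user name and password are simply what the Authorization example in the table decodes to. It also shows that Basic credentials are only Base64-encoded, not encrypted.

```ruby
require 'net/http'

# Build a request object (nothing is sent) and set headers from the table.
HEADER_DEMO = Net::HTTP::Get.new('/index.html') # hypothetical path
HEADER_DEMO['Accept'] = 'text/plain'
HEADER_DEMO['Cache-Control'] = 'no-cache'
HEADER_DEMO['Cookie'] = 'color=red'
HEADER_DEMO.basic_auth('Aladdin', 'open sesame') # sets the Authorization header

puts HEADER_DEMO['Authorization']

# The part after "Basic " is Base64 of "user:password"; anyone can decode it.
CREDENTIALS = HEADER_DEMO['Authorization'].split(' ', 2).last.unpack1('m')
puts CREDENTIALS
```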


HTTP Response Headers

Header            Description                     Example
Cache-Control     Caching directives.             Cache-Control: no-cache
Content-Encoding  Type of encoding used.          Content-Encoding: gzip
Content-Length    Length of data returned.        Content-Length: 1024
Content-Type      Type of data returned.          Content-Type: text/html
Date              Date and time response sent.    Date: 29 Oct 2008 1:02:03
Expires           Date after which data expires.  Expires: 1 Nov 2008 1:02:03
Location          Used in redirection.            Location: http://www.example.com/about/
Server            Server identification string.   Server: Apache/2.0.55
Set-Cookie        Cookie created by server.       Set-Cookie: color=red


HTTP Response Codes

Code  Description            Meaning
200   OK                     Standard success response.
201   Created                New resource created.
301   Moved permanently      Permanent redirect to new URI.
304   Not modified           Safe to use page stored in cache.
307   Temporary redirect     Use new URI now; try old later.
401   Unauthorized           Authentication failed.
403   Forbidden              Disallowed, auth will not help.
404   Not found              Resource was not found.
405   Method not allowed     Used GET when should use POST.
500   Internal server error  Internal server error.
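In Ruby, net/http represents each response code as a class, and the classes are grouped under parent classes for success, redirection, client error, and server error. The sketch below maps the codes from the table to those classes; STATUS_CLASSES and category are names made up for this illustration.

```ruby
require 'net/http'

# Response codes from the table and the net/http class for each.
STATUS_CLASSES = {
  200 => Net::HTTPOK,
  201 => Net::HTTPCreated,
  301 => Net::HTTPMovedPermanently,
  304 => Net::HTTPNotModified,
  307 => Net::HTTPTemporaryRedirect,
  401 => Net::HTTPUnauthorized,
  403 => Net::HTTPForbidden,
  404 => Net::HTTPNotFound,
  405 => Net::HTTPMethodNotAllowed,
  500 => Net::HTTPInternalServerError
}.freeze

CATEGORIES = [
  Net::HTTPSuccess, Net::HTTPRedirection,
  Net::HTTPClientError, Net::HTTPServerError
].freeze

# Hypothetical helper: find the parent class (2xx, 3xx, 4xx, 5xx) of a code.
def category(code)
  klass = STATUS_CLASSES.fetch(code)
  CATEGORIES.find { |parent| klass <= parent }
end

STATUS_CLASSES.each_key { |code| puts "#{code} #{category(code)}" }
```

Because the classes share parents, a case statement can test for Net::HTTPSuccess or Net::HTTPRedirection instead of listing individual codes.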


Net::HTTP Class

Net::HTTP.get(host, path): returns the body of the resource at host, path as a string.

Net::HTTP.get_response(host, path): returns an HTTP response object, which includes body + headers.

Net::HTTP.post_form(uri, {parameters}): sends form parameters (a hash) to uri (a URI object) using POST instead of GET; returns an HTTP response object.
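To see what a form submission looks like without needing a network connection, the sketch below builds the POST request by hand and prints its body. The URL and the form fields are hypothetical; the submit_form method wraps the real network call and is defined but not run here.

```ruby
require 'net/http'
require 'uri'

FORM_URI = URI.parse('http://www.example.com/search') # hypothetical URL

# Build (but do not send) the kind of request a form submission produces.
FORM_REQUEST = Net::HTTP::Post.new(FORM_URI.path)
FORM_REQUEST.set_form_data('q' => 'ruby', 'max' => '50') # hypothetical fields

puts FORM_REQUEST.method           # the HTTP method
puts FORM_REQUEST['Content-Type']  # how the body is encoded
puts FORM_REQUEST.body             # parameters joined with "&"

# Sending needs a reachable server, so it is wrapped in a method.
# post_form takes a URI object and a hash and returns a response object.
def submit_form(uri, params)
  response = Net::HTTP.post_form(uri, params)
  response.body
end
```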


Redirection Example

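require 'net/http'
require 'uri'

# Follow redirects until a success response arrives; give up after limit hops.
def fetch(uri_str, limit = 10)
  raise ArgumentError, 'too many HTTP redirects' if limit == 0

  response = Net::HTTP.get_response(URI.parse(uri_str))
  case response
  when Net::HTTPSuccess     then response
  when Net::HTTPRedirection then fetch(response['location'], limit - 1)
  else response.error!
  end
end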


URI Format

<proto>://<user>@<host>:<port>/<path>?<qstr>
– Whitespace marks end of URL
– “@” separates userinfo from host
– “?” marks beginning of query string
– “&” separates query parameters
– %HH represents the character with hex value HH
– ex: %20 represents a space

http://username:password@www.auth.com:8001/a%20spaced%20path
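Ruby's URI library can pull the example URL above apart into the pieces named in the format. The sketch below also decodes the %20 escapes and splits a query string; the query string q=ruby&lang=en is an assumption added for illustration, since the example URL has none.

```ruby
require 'uri'

EXAMPLE_URI = URI.parse('http://username:password@www.auth.com:8001/a%20spaced%20path')

puts EXAMPLE_URI.scheme    # <proto>
puts EXAMPLE_URI.userinfo  # everything before the "@"
puts EXAMPLE_URI.host      # <host>
puts EXAMPLE_URI.port      # <port>, as an integer
puts EXAMPLE_URI.path      # <path>, still percent-encoded

# %HH escapes are decoded separately.
DECODED_PATH = URI.decode_www_form_component(EXAMPLE_URI.path)
puts DECODED_PATH

# "&" separates parameters in a query string (hypothetical example).
QUERY_PARAMS = URI.decode_www_form('q=ruby&lang=en').to_h
puts QUERY_PARAMS.inspect
```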


URI Class

URI.extract(string): returns array of URI strings extracted from string.

URI.extract("text http://example.com/ and mailto:test@example.com and text here also.")

=> ["http://example.com/", "mailto:test@example.com"]

URI.join(string,string,...): joins two or more strings into a URI object.

URI.parse(string): creates a URI object from string.

URI.split(uri): splits URI string into an array of protocol, host, path, query, etc. components.
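A short sketch of URI.join and URI.split follows. The example.com URLs are made up for illustration. URI.join resolves the later strings relative to the first, the way a browser resolves a relative link on a page.

```ruby
require 'uri'

# Relative references are resolved against the base URL.
JOINED = URI.join('http://example.com/docs/', 'intro.html')
puts JOINED

UP_ONE = URI.join('http://example.com/docs/intro.html', '../images/logo.png')
puts UP_ONE

# URI.split returns an array in this order: scheme, userinfo, host, port,
# registry, path, opaque, query, fragment.
PARTS = URI.split('http://example.com:8080/docs/intro.html?lang=en#top')
puts PARTS[0] # scheme
puts PARTS[2] # host
puts PARTS[5] # path
puts PARTS[7] # query
```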


Cookies

Server to Client
Content-type: text/html
Set-Cookie: foo=bar; path=/; expires=Fri, 20-Feb-2004 23:59:00 GMT

Client to Server
Content-type: text/html
Cookie: foo=bar
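The exchange above can be sketched in Ruby: the client reads the name, value, and attributes from the Set-Cookie value, then sends back only name=value. Both helpers below are hypothetical and hand-written for illustration; they ignore details a real browser handles, such as checking the path and expiry before sending.

```ruby
SET_COOKIE = 'foo=bar; path=/; expires=Fri, 20-Feb-2004 23:59:00 GMT'

# Hypothetical helper: split a Set-Cookie value into the cookie and its attributes.
def parse_set_cookie(value)
  pair, *attrs = value.split(/;\s*/)
  name, val = pair.split('=', 2)
  attributes = attrs.to_h do |attr|
    key, attr_value = attr.split('=', 2)
    [key.downcase, attr_value]
  end
  { name: name, value: val, attributes: attributes }
end

# Hypothetical helper: the header the client sends on later requests.
# Attributes such as path and expires stay on the client.
def cookie_header(cookie)
  "Cookie: #{cookie[:name]}=#{cookie[:value]}"
end

COOKIE = parse_set_cookie(SET_COOKIE)
puts COOKIE[:attributes]['path']
puts COOKIE[:attributes]['expires']
puts cookie_header(COOKIE)
```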


Base64 Encoding

How do you send binary data using text?
– Email attachments (MIME).
– Cookies (HTTP).

Base64: encode 3 bytes as 4 text characters
– Use characters A-Za-z0-9+/ to store 6 bits of data.
– Byte has 8 bits, so 3 bytes = 24 bits
– 4 base64 chars (6 bits each) = 24 bits
– Use = to pad output if input not multiple of 3 bytes.
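The 3-bytes-to-4-characters arithmetic can be worked by hand in Ruby. The sketch below packs three bytes into 24 bits and reads them back six bits at a time; encode_triple is a hypothetical helper that handles exactly three bytes and no padding, and the result is checked against Ruby's built-in pack('m0') encoding.

```ruby
# The 64 characters, in order: index 0 is "A", index 63 is "/".
BASE64_ALPHABET = [*'A'..'Z', *'a'..'z', *'0'..'9', '+', '/'].join

# Hypothetical helper: encode exactly 3 bytes (integers 0..255) as 4 characters.
def encode_triple(bytes)
  bits = (bytes[0] << 16) | (bytes[1] << 8) | bytes[2]  # 3 x 8 = 24 bits
  [18, 12, 6, 0].map do |shift|
    BASE64_ALPHABET[(bits >> shift) & 0x3F]             # 4 x 6 = 24 bits
  end.join
end

BY_HAND = encode_triple('inf'.bytes)
BUILT_IN = ['inf'].pack('m0') # Ruby's own Base64 encoding, no line breaks

puts BY_HAND
puts BY_HAND == BUILT_IN
```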


Base64 Class

require 'base64'

encode = Base64.encode64('informatics')

decode = Base64.decode64('aW5mb3JtYXRpY3M=')
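One detail worth seeing in practice: encode64 appends a newline (and breaks long output into lines), which matters when the result goes into a header or cookie. The sketch below compares it with strict_encode64, which adds no line breaks. The constant names are illustrative.

```ruby
require 'base64'

WITH_NEWLINE = Base64.encode64('informatics')        # ends with "\n"
ONE_LINE     = Base64.strict_encode64('informatics') # no line breaks
ROUND_TRIP   = Base64.decode64(ONE_LINE)

puts WITH_NEWLINE.inspect
puts ONE_LINE.inspect
puts ROUND_TRIP

# 11 input bytes is not a multiple of 3, so the output is padded with "=".
puts ONE_LINE.end_with?('=')
```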


Topics

1. Evolution of HTML

2. HTML Structure

3. Regular Expressions v Parsing

4. Hpricot

5. XPath


Evolution of HTML

1991 HTML created (only 22 tags)

1995 HTML 2.0

1996 Tables added to HTML 2.0

Jan 1997 HTML 3.2 published by W3C

Dec 1997 HTML 4.0

2000 XHTML 1.0

2008 HTML 5.0 working draft published.


HTML Structure

<html>

<title>My title</title>

<body>

<a href="...">My link</a>

<h1>My header</h1>

</body>

</html>


Why Not Regular Expressions?

Angle-bracket tags are difficult to deal with.

Tag regexp: <\w+\s+[^>]*>

Matches <img alt="ruby" src="rb.png">

Doesn’t: <img alt="ruby>" src="rb.png">

Solution: check for > in attributes.

Have to match every form of attribute:

name="value"

name='value'

name=value

name
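The failure described above is easy to reproduce in Ruby. The sketch below runs the simple tag regexp on both examples, then tries a longer regexp that accepts the four attribute forms listed. The longer pattern is one possible attempt written for this illustration, not a complete HTML matcher; it still ignores comments, scripts, and malformed markup, which is the argument for using a parser.

```ruby
# The tag regexp from the slide.
SIMPLE_TAG = /<\w+\s+[^>]*>/

# One attribute in any of the four forms: name="v", name='v', name=v, name.
ATTRIBUTE = /[\w-]+(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+))?/
QUOTE_AWARE_TAG = /<\w+(?:\s+#{ATTRIBUTE})*\s*>/

GOOD_TAG   = '<img alt="ruby" src="rb.png">'
TRICKY_TAG = '<img alt="ruby>" src="rb.png">'

puts GOOD_TAG[SIMPLE_TAG]        # whole tag
puts TRICKY_TAG[SIMPLE_TAG]      # stops at the > inside the attribute value
puts TRICKY_TAG[QUOTE_AWARE_TAG] # whole tag
puts '<input type=text disabled>'[QUOTE_AWARE_TAG]
```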


Hpricot

h = Hpricot(html_string)
Creates a new Hpricot::Doc object.

el = h.at(string)
Finds the first matching element (an Hpricot::Elem object).

el = h.search(string or XPath expression)
Returns array of matching objects (Hpricot::Elements).

el.inner_html
Returns HTML enclosed in element.


XPath Searches

h.search("p")
Find all paragraph tags in document.

h.search("/html/body//p")
Find all paragraph tags within the body tag.

h.search("//a[@src]")
Find all anchor tags with a src attribute.

h.search("//a[@src='google.com']")
Find all a tags with a src attribute of google.com.
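The same style of XPath expression can be tried with REXML, an XML library distributed with Ruby. This is a substitute for Hpricot, not the Hpricot API: REXML needs well-formed markup, and its calls (REXML::XPath.match rather than search) differ. The page content below is made up, and the attribute tested is href, since that is what anchor tags normally carry.

```ruby
require 'rexml/document'

# A small, well-formed page (hypothetical content).
PAGE = <<~HTML
  <html>
    <head><title>My title</title></head>
    <body>
      <h1>My header</h1>
      <p>First <a href="http://example.com/">link</a></p>
      <div><p>Second paragraph</p></div>
      <a name="top">anchor without href</a>
    </body>
  </html>
HTML

DOC = REXML::Document.new(PAGE)

# All paragraph tags anywhere inside the body tag.
BODY_PARAGRAPHS = REXML::XPath.match(DOC, '/html/body//p')

# All anchor tags that have an href attribute.
LINKED_ANCHORS = REXML::XPath.match(DOC, '//a[@href]')

# Anchor tags whose href attribute has one specific value.
EXAMPLE_LINKS = REXML::XPath.match(DOC, "//a[@href='http://example.com/']")

puts BODY_PARAGRAPHS.size
puts LINKED_ANCHORS.first.attributes['href']
puts EXAMPLE_LINKS.first.text
```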

Final Exam

Comprehensive exam like midterm
– 20% concepts (focus on classes + exceptions)
– 80% programs (at least 2 programs like labs)

Study
– Review the midterm practice problems.
– Work out your lab programs again.
– Solve un-assigned lab programs.
– Review concepts, esp. classes + exceptions.


Going Further

Ruby Quiz
– Assignment-scale problems + solutions.
– http://rubyquiz.com/

Practical Ruby for System Administration
– If Admin Scripting II existed, this would be the text.

General Ruby Books
– The Ruby Way, 2nd edition
– The Ruby Programming Language
