The HTTP Protocol

Let us start with this quote from the HTTP specification document [2]:

"The HTTP protocol is based on a request/response paradigm. A client establishes a connection with a server and sends a request to the server in the form of a request method, URI, and protocol version, followed by a MIME-like message containing request modifiers, client information, and possible body content. The server responds with a status line, including the message's protocol version and a success or error code, followed by a MIME-like message containing server information, entity meta-information, and possible body content."

The HTTP Protocol (2)

What this means for libwww-perl is that communication always takes place through these steps:

1) A request object is created and configured.

2) The request object is passed to a server, and we get back a response object that we can examine.

A request is always independent of any previous requests; that is, the service is stateless. The same simple model is used for any kind of service we want to access.

Examples

1) If we want to fetch a document from a remote file server, we send it a request that contains a name for that document, and the response contains the document itself.

2) If we access a search engine, the content of the request contains the query parameters, and the response contains the query result.

3) If we want to send a mail message to somebody, we send a request object containing our message to the mail server, and the response object contains an acknowledgment telling us that the message has been accepted and will be forwarded to the recipient(s).

The Request Object

The libwww-perl request object has the class name HTTP::Request. The fact that the class name uses HTTP:: as a prefix only implies that we use the HTTP model of communication. It does not limit the kinds of services we can pass this request to. For instance, we will send HTTP::Request objects to FTP and Gopher servers, as well as to the local file system.

The main attributes of request objects are:

The method is a short string that tells what kind of request this is. The most common methods are GET, PUT, POST and HEAD.

The uri is a string denoting the protocol, the server and the name of the "document" we want to access. The uri might also encode various other parameters.

The headers contain additional information about the request and can also be used to describe the content. The headers are a set of keyword/value pairs.

The content is an arbitrary amount of data.
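The four attributes above can be read back from an HTTP::Request object. The following is a minimal sketch, not part of the original slides; the URL, the User-Agent value and the content string are placeholders chosen for illustration, and nothing is sent over the network.

```perl
use strict;
use warnings;
use HTTP::Request;

# Build a request by hand; no connection is opened here.
# The URL is a placeholder reusing the deck's hypothetical host name.
my $req = HTTP::Request->new(POST => 'http://hypothetical.ora.com/cgi-bin/query');

# Headers are keyword/value pairs.
$req->header('User-Agent' => 'MyApp/0.1');
$req->header('Content-Type' => 'application/x-www-form-urlencoded');

# The content is an arbitrary string of data.
$req->content('querytype=subject');

# Read the main attributes back.
print "method:  ", $req->method, "\n";
print "uri:     ", $req->uri, "\n";
print "header:  ", scalar $req->header('Content-Type'), "\n";
print "content: ", $req->content, "\n";
```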

The Response Object

The libwww-perl response object has the class name HTTP::Response. The main attributes of objects of this class are:

The code is a numerical value that indicates the overall outcome of the request.

The message is a short, human-readable string that corresponds to the code.

The headers contain additional information about the response and describe the content.

The content is an arbitrary amount of data.

Since we don't want to handle all possible code values directly in our programs, a libwww-perl response object has methods that can be used to query what kind of response this is. The most commonly used response classification methods are:

is_success() The request was successfully received, understood or accepted.

is_error() The request failed. The server or the resource might not be available, access to the resource might be denied, or other things might have failed for some reason.
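As a rough illustration of the classification methods, the sketch below builds two HTTP::Response objects by hand (the codes, messages and content are made up for the example) and asks each one what kind of response it is. In a real program the response would come from a user agent rather than from the constructor.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hand-built responses: code, message, headers, content.
my $ok  = HTTP::Response->new(200, 'OK', ['Content-Type' => 'text/html'], '<title>...</title>');
my $bad = HTTP::Response->new(404, 'Not Found');

# Collect a one-line verdict per response instead of testing raw codes.
my @verdicts;
for my $res ($ok, $bad) {
    if ($res->is_success) {
        push @verdicts, "success: " . $res->status_line;
    }
    elsif ($res->is_error) {
        push @verdicts, "error: " . $res->status_line;
    }
}
print "$_\n" for @verdicts;
```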

The User Agent (UA)

Let us assume that we have created a request object. What do we actually do with it in order to receive a response?

The answer is that we pass it to a user agent object. This object takes care of everything that needs to be done (such as low-level communication and error handling) and returns a response object. The user agent represents your application on the network and provides an interface that accepts requests and returns responses.

The user agent is an interface layer between your application code and the network. Through this interface you can access the various servers on the network.

User Agent

The class name for the user agent is LWP::UserAgent.

Every libwww-perl application that wants to communicate should create at least one object of this class. The main method provided by this object is request(). This method takes an HTTP::Request object as its argument and (eventually) returns an HTTP::Response object.

The user agent has many other attributes that let you configure how it will interact with the network and with your application.

The timeout specifies how much time we give remote servers to respond before the library disconnects and creates an internal timeout response.

The agent specifies the name that your application should use when it presents itself on the network.

The from attribute can be set to the e-mail address of the person responsible for running the application. If this is set, then the address will be sent to the servers with every request.

The parse_head specifies whether we should initialize response headers from the <head> section of HTML documents.

The proxy and no_proxy attributes specify if and when to go through a proxy server. URL:http://www.w3.org/pub/WWW/Proxies/

The credentials provide a way to set up user names and passwords needed to access certain services.

Many applications want even more control over how they interact with the network, and they get it by subclassing LWP::UserAgent. The library includes one such subclass, LWP::RobotUA, for robot applications.

This example shows how the user agent, a request and a response are represented in actual Perl code:

    use strict;
    use warnings;
    use LWP::UserAgent;

    # Create a user agent object
    my $ua = LWP::UserAgent->new;
    $ua->agent("MyApp/0.1 ");

    # Create a request
    my $req = HTTP::Request->new(POST => 'http://search.cpan.org/search');
    $req->content_type('application/x-www-form-urlencoded');
    $req->content('query=libwww-perl&mode=dist');

    # Pass request to the user agent and get a response back
    my $res = $ua->request($req);

    # Check the outcome of the response
    if ($res->is_success) {
        print $res->content;
    }
    else {
        print $res->status_line, "\n";
    }

The $ua is created once when the application starts up. New request objects should normally be created for each request sent.
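The user agent attributes described earlier (timeout, agent, from, parse_head, proxy, no_proxy, credentials) are set through accessor methods of the same names. The sketch below only configures an object and sends nothing; every value in it (addresses, host names, realm, user name, password) is a placeholder invented for the example.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;

$ua->timeout(30);                        # seconds to wait for a server
$ua->agent('MyApp/0.1');                 # name presented on the network
$ua->from('webmaster@example.org');      # responsible person (placeholder)
$ua->parse_head(0);                      # do not read headers from <head>

# Go through a proxy for http, except for these domains (placeholders).
$ua->proxy('http', 'http://proxy.example.org:8080/');
$ua->no_proxy('localhost', 'example.org');

# User name and password for one host:port and realm (placeholders).
$ua->credentials('hypothetical.ora.com:80', 'Restricted Area',
                 'username', 'password');

print "agent: ", $ua->agent, ", timeout: ", $ua->timeout, "\n";
```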

Chapter 1 - Introduction

Web Client

Web client: an application that communicates with a web server using the HTTP protocol.

Web Client (2)

The most common interface to the WWW is the browser.

A web browser lets you download web documents and view them formatted on screen.

URL (Uniform Resource Locator)

A URL is a subset of the URI (Uniform Resource Identifier).

HTTP (Hypertext Transfer Protocol)

Common Gateway Interface (CGI)

Chapter 2 – Demystifying the Browser

HTTP Transaction

web program, web client, web server

The HTTP protocol is text-based, which means we can see the commands being exchanged.

web transaction

The Request Through the Browser

http://hypothetical.ora.com/

http://                 the protocol used
hypothetical.ora.com    the server
/                       the directory on the server
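The same breakdown can be obtained programmatically. This is a small sketch that assumes the URI module (distributed alongside libwww-perl) is installed; it is an illustration added here, not something shown in the original slides.

```perl
use strict;
use warnings;
use URI;

my $uri = URI->new('http://hypothetical.ora.com/');

# Split the URL into the parts named on the slide.
my $scheme = $uri->scheme;   # protocol
my $host   = $uri->host;     # server
my $path   = $uri->path;     # directory on the server

print "protocol: $scheme\n";
print "server:   $host\n";
print "path:     $path\n";
```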

The Client Request

GET / HTTP/1.0
Connection: Keep-Alive
User-Agent: Mozilla/3.0Gold (WinNT; I)
Host: hypothetical.ora.com
Accept: image/gif, image/x-xbitmap, */*

The Server Response

HTTP/1.0 200 OK
Date: Fri, 04 Oct 1996 14:31:51 GMT
Server: Apache/1.1.1
Content-type: text/html
Content-length: 327
Last-modified: Fri, 04 Oct 1996 14:06:11 GMT

<title>...</title>

The lines before the blank line are the response header; what follows is the body, or entity-body.
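To see how the status line, headers and body map onto a response object, the raw text of a server response can be handed to HTTP::Response->parse. The sketch below reuses the sample response; the body is the elided placeholder from the slide, so it does not match the stated Content-length.

```perl
use strict;
use warnings;
use HTTP::Response;

# Raw response text: status line, headers, blank line, body.
my $raw = <<'END';
HTTP/1.0 200 OK
Date: Fri, 04 Oct 1996 14:31:51 GMT
Server: Apache/1.1.1
Content-type: text/html
Content-length: 327
Last-modified: Fri, 04 Oct 1996 14:06:11 GMT

<title>...</title>
END

my $res = HTTP::Response->parse($raw);

print "version: ", $res->protocol, "\n";
print "code:    ", $res->code, "\n";
print "message: ", $res->message, "\n";
print "server:  ", scalar $res->header('Server'), "\n";
print "body:    ", $res->content;
```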

HTML Transaction

Client, Server

HTML (Hypertext Markup Language)

Transactions

POST Method

POST /cgi-bin/query HTTP/1.0
Referer:
Connection:
User-Agent:
Host:
Accept:
Content-type: application/x-www-form-urlencoded
Content-length: 47

querytype=subject&queryconst=numerical+analysis
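One way to produce such a body is to let the URI module encode the form fields and then attach the result to an HTTP::Request. This is a sketch under assumptions: the host name in the URL is a placeholder (the slide elides the Host value), and the headers whose values are elided on the slide are simply left out.

```perl
use strict;
use warnings;
use URI;
use HTTP::Request;

# Encode the form fields; the space becomes "+".
my $form = URI->new('http:');
$form->query_form(querytype => 'subject', queryconst => 'numerical analysis');
my $body = $form->query;

# Placeholder host; only the path comes from the slide.
my $req = HTTP::Request->new(POST => 'http://hypothetical.ora.com/cgi-bin/query');
$req->content_type('application/x-www-form-urlencoded');
$req->content_length(length $body);
$req->content($body);

print $req->as_string;
```

The encoded body is 47 bytes long, which agrees with the Content-length value in the sample request.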

Types of Request Methods

GET, POST

PUT Method

PUT /example.html HTTP/1.0
Connection:
User-Agent:
Pragma:
Host:
Accept:
Content-Length:

<!

</HTML>

Structure of an HTTP Transaction

Client Request

Method URI HTTP-version

General-header

Request-header

Entity-header

Entity-body

Server Response

HTTP-version Status-code Reason-phrase

General-header

Response-header

Entity-header

Entity-body

Structure of a Client Request

Structure of a Server Response

Chapter 3 – Learning HTTP

HTTP is a stateless protocol in which the client makes a request to the server, the server sends a response, and the transaction is then finished.

Client Request Methods

A client request method is a "command" or request that the web client issues to the server.

Methods: GET, POST, HEAD, DELETE, TRACE, PUT

GET: Retrieving a document

HEAD: Retrieving header information

POST: Sending data to the server

PUT: Storing the entity-body at the URL

DELETE: Removing the URL

TRACE: Viewing the client's message through the request chain

HTTP Versions

HTTP 1.0
HTTP 1.1
  better implementation of persistent connections
  multihoming (a single host can answer for several different domains)
  entity tags
  byte ranges (allow only parts of a document to be retrieved)
  digest authentication

Server Response Codes

Range      Meaning of the response
100-199    Informational
200-299    The client's request was successful
300-399    The client's request was redirected; further action is required
400-499    The client's request is incomplete
500-599    Server errors
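The ranges in the table correspond to classification functions in the HTTP::Status module that ships with libwww-perl. The sketch below wraps them in a helper named classify, which is a hypothetical name introduced for this example, and the sample codes are arbitrary.

```perl
use strict;
use warnings;
use HTTP::Status qw(is_info is_success is_redirect
                    is_client_error is_server_error status_message);

# Hypothetical helper: map a status code to its range in the table above.
sub classify {
    my ($code) = @_;
    return 'informational' if is_info($code);
    return 'success'       if is_success($code);
    return 'redirection'   if is_redirect($code);
    return 'client error'  if is_client_error($code);
    return 'server error'  if is_server_error($code);
    return 'unknown';
}

for my $code (100, 200, 301, 404, 500) {
    printf "%d %-22s %s\n", $code, status_message($code), classify($code);
}
```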

HTTP Headers

Different types of headers

General headers
Request headers
Response headers
Entity headers

Persistent Connections

Connection: Keep-Alive

Media Types

Accept header
Content-Type header
Examples:

Accept: */*
Accept: type/*
Accept: type/subtype

Client Caching

Retrieving the Content Length

Content-length header

Byte Ranges

Referring Documents

Referer header

Client and Server Identification

Authorization

An Authorization header is used to request restricted documents:

Authorization: SCHEME REALM

Example:

Authorization: BASIC username:password

where username:password is base64-encoded.
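As a sketch of the encoding step, the core MIME::Base64 module can turn the literal placeholder string username:password into the value that would travel in the header. The credentials are the placeholders from the slide, not real ones.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64);

# Placeholder credentials from the slide.
my $userpass = 'username:password';

# The empty second argument suppresses the trailing newline.
my $token  = encode_base64($userpass, '');
my $header = "Authorization: Basic $token";

print "$header\n";
```

Base64 is an encoding, not encryption: anyone who sees the header can decode it.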

Authentication

The realm of the BASIC authentication scheme indicates the type of authentication requested.

See also Digest authentication (available in HTTP 1.1).

Cookies

Set-Cookie and Cookie headers

Chapter 4 – The Socket Library

The socket library is a low-level programmer's interface that allows clients to set up a TCP/IP connection and communicate directly with servers. Servers use sockets to listen for incoming connections, and clients use sockets to initiate transactions on the port that the server is listening on.

A Typical Conversation Using Sockets

Socket Calls

Server routines: socket(), bind(), listen(), accept(), sysread(), syswrite(), close()

Client routines: socket(), connect(), syswrite(), sysread(), close()

Using Socket Calls

Function    Usage    Purpose

socket()

connect()

sysread()

syswrite()

close()

bind()

listen()

accept()
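The calls listed above can be exercised in a single script over the loopback interface. The sketch below is an illustration written for this purpose, not code from the slides: it plays both roles in one process, lets the operating system pick a free port, and exchanges a tiny hard-coded request and response. A real server and client would run as separate programs and check every read and write.

```perl
use strict;
use warnings;
use Socket qw(:DEFAULT IPPROTO_TCP);

# Server side: socket(), bind(), listen(). Port 0 lets the OS choose.
socket(my $listener, PF_INET, SOCK_STREAM, IPPROTO_TCP) or die "socket: $!";
bind($listener, pack_sockaddr_in(0, INADDR_LOOPBACK))   or die "bind: $!";
listen($listener, 1)                                    or die "listen: $!";
my ($port) = unpack_sockaddr_in(getsockname($listener));

# Client side: socket(), connect(), syswrite().
socket(my $client, PF_INET, SOCK_STREAM, IPPROTO_TCP)   or die "socket: $!";
connect($client, pack_sockaddr_in($port, INADDR_LOOPBACK))
    or die "connect: $!";
syswrite($client, "GET / HTTP/1.0\r\n\r\n");

# Server side: accept(), sysread(), syswrite(), close().
accept(my $conn, $listener) or die "accept: $!";
sysread($conn, my $request, 1024);
syswrite($conn, "HTTP/1.0 200 OK\r\n\r\nhello\n");
close($conn);

# Client side: sysread(), close().
sysread($client, my $response, 1024);
close($client);
close($listener);

print "server saw: $request";
print "client saw: $response";
```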

Chapter 5 – The LWP Library

The Web runs on top of TCP/IP: the client and the server establish a connection and exchange the necessary information over that connection.

Appendix A – HTTP Headers

There are four categories of headers: General, Request, Response, Entity

Summary of Support Across HTTP Versions

HTTP 0.9
HTTP 1.0
HTTP 1.1

Appendix B – Reference Tables

Media Types
Character Encoding
Languages
Character Sets

Media Types

Content-type header
Accept header
Internet Media Types

Text Type/Subtype

text/plain
text/richtext
text/enriched
text/tab-separated-values
text/html
text/sgml

Multipart Type/Subtype

Message Type/Subtype

Application Type/Subtype

Character Encoding

Content-type of application/x-www-form-urlencoded

Special characters are encoded to eliminate ambiguity.

See RFC 1738 (http://www.faqs.org/rfcs/rfc1738.html)
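A brief sketch of this encoding, assuming the URI::Escape module is available; the input string is made up for the example. Note that uri_escape writes a space as %20, whereas form bodies of type application/x-www-form-urlencoded conventionally write it as "+".

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape uri_unescape);

# A value containing characters that are special in a query string.
my $raw     = 'numerical analysis & more';
my $encoded = uri_escape($raw);
my $decoded = uri_unescape($encoded);

print "$encoded\n";
print "$decoded\n";
```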

Languages

A language tag is of the form <primary-tag><-subtag>, where zero or more subtags are allowed.

See RFC 1766 for more information.

Character Sets

Accept-Language
Content-Language

See RFC 1700 (http://www.faqs.org/rfcs/rfc1700.html)

Bibliography

WONG, C. Web Client Programming with Perl. 1st edition, March 1997. O'Reilly.

[2] URL:http://www.w3.org/pub/WWW/Protocols/

Glossary

IANA – Internet Assigned Numbers Authority

CGI – Common Gateway Interface

Backup Slides

HTTP Is Stateless

HTTP is a stateless protocol: there is no permanent connection between the server and the client (browser), so the server does not know whether one connection is related to the previous one.

The HTTP Protocol

HTTP request
HTTP response
Body of an HTTP request

Cookies

Cookies are pieces of information stored on the user's computer that the browser optionally sends with each request; the server processes them and sends them back in the response.
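The round trip described above can be imitated offline with the HTTP::Cookies class from libwww-perl. In this sketch the response is built by hand, and the cookie name, value and URLs are invented for the example; a real application would normally attach the jar to the user agent instead of calling these methods directly.

```perl
use strict;
use warnings;
use HTTP::Cookies;
use HTTP::Request;
use HTTP::Response;

my $jar = HTTP::Cookies->new;

# First exchange: the server sets a cookie in its response.
my $first = HTTP::Request->new(GET => 'http://hypothetical.ora.com/');
my $res   = HTTP::Response->new(200, 'OK',
                ['Set-Cookie' => 'session=abc123; path=/']);
$res->request($first);          # the jar needs to know which URL was asked
$jar->extract_cookies($res);    # store the cookie on the client side

# Next request to the same host: the stored cookie is sent back.
my $next = HTTP::Request->new(GET => 'http://hypothetical.ora.com/page');
$jar->add_cookie_header($next);

print "Cookie: ", scalar $next->header('Cookie'), "\n";
```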

Web Container