Retrieving Web Pages (HTTP), Topic 3, Chapter 6


Network Programming

Kansas State University at Salina

First, some comments

- Switch to application protocols
- Client side focus
- Pre-built modules
  - A natural OO thing, a matter of productivity
  - Argh! Someone else’s code
  - Lots of choices, language independent principles

Web related network programming

- Chapter 6: Retrieving web pages (easy)
- Chapter 7: Parsing HTML (hard)
- Chapter 8: XML and XML-RPC (interesting)

HTTP Basics

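Stateless, connectionless protocol: each request stands alone, and under HTTP/1.0 the server closes the TCP connection after sending its response.

Basic GET …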

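    import socket

    # Open a TCP connection to the web server on port 80
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect(('www.sal.ksu.edu', 80))

    # HTTP header lines end with CRLF; a blank line ends the request
    request = ("GET /faculty/tim/index.html HTTP/1.0\r\n"
               "From: tim@sal.ksu.edu\r\n"
               "User-Agent: Python\r\n\r\n")
    s.sendall(request)

    # Save everything the server sends (status line and headers included)
    fp = open("index.html", "wb")
    while 1:
        data = s.recv(1024)
        if not len(data):
            break
        fp.write(data)
    s.close()
    fp.close()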
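The receive loop in the raw-socket example saves everything the server sends, so index.html begins with the status line and response headers rather than the HTML itself. The following is a minimal Python 3 sketch of the two string-handling steps involved: building the request bytes and separating the headers from the body. The helper names build_get_request and split_response are hypothetical (they are not from the slides), the Host header is an added assumption, and the sample response is canned rather than fetched from a server.

```python
def build_get_request(path, host, user_agent="Python"):
    """Return the bytes of a bare HTTP/1.0 GET request (hypothetical helper)."""
    lines = [
        "GET %s HTTP/1.0" % path,
        "Host: %s" % host,            # assumption: not in the slide's request
        "User-Agent: %s" % user_agent,
        "",                           # the two empty strings produce the
        "",                           # blank line that ends the headers
    ]
    return "\r\n".join(lines).encode("ascii")


def split_response(raw):
    """Split raw response bytes into (status_line, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


# Canned response used only for illustration
sample = b"HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
status, headers, body = split_response(sample)
```

With a live socket, the bytes collected from recv() would be joined and passed to split_response, and only body would be written to the file.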

Now, for the easy way …

    import sys, urllib2

    page = "http://www.sal.ksu.edu/faculty/tim/"
    req = urllib2.Request(page)
    fd = urllib2.urlopen(req)
    while 1:
        data = fd.read(1024)
        if not len(data):
            break
        sys.stdout.write(data)
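The urllib2 module exists only in Python 2; in Python 3 its functions live in urllib.request. As a rough sketch of the same chunked read under Python 3, the function below copies a URL's body to any binary file object. The name fetch and its chunk_size parameter are illustrative assumptions, not part of the slides.

```python
import urllib.request


def fetch(url, out, chunk_size=1024):
    """Copy the body at `url` to the binary file object `out`.

    Returns the number of bytes copied. (Hypothetical helper.)
    """
    total = 0
    with urllib.request.urlopen(url) as fd:
        while True:
            data = fd.read(chunk_size)
            if not data:          # empty bytes means end of the response
                break
            out.write(data)
            total += len(data)
    return total


# Example call (performs a real network request, so it is left commented out):
# import sys
# fetch("http://www.sal.ksu.edu/faculty/tim/", sys.stdout.buffer)
```

The response arrives as bytes in Python 3, which is why the example call writes to sys.stdout.buffer rather than sys.stdout.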

Submitting with GET

    >>> import urllib
    >>> encoding = urllib.urlencode( [('activity', 'water ski'),
    ...                               ('lake', 'Milford'), ('code', 52)] )
    >>> print encoding
    activity=water+ski&lake=Milford&code=52
    >>> url = "http://www.example.com" + '?' + encoding
    >>> print url
    http://www.example.com?activity=water+ski&lake=Milford&code=52
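In Python 3, urlencode moved to urllib.parse. The sketch below repeats the GET encoding with that module and then decodes the query string again, roughly what a server-side script would do on receipt. The variable names fields and decoded are illustrative only.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

fields = [('activity', 'water ski'), ('lake', 'Milford'), ('code', 52)]

# Spaces become '+', and non-string values are converted with str()
encoding = urlencode(fields)
url = "http://www.example.com" + '?' + encoding

# Reverse the process: pull the query string out of the URL and decode it
decoded = parse_qs(urlsplit(url).query)
```

Decoding returns every value as a list of strings, so the integer 52 comes back as '52'; any type information has to be restored by the receiving program.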

Submitting with POST

    >>> encoding = urllib.urlencode( [('activity', 'water ski'),
    ...                               ('lake', 'Milford'), ('code', 52)] )
    >>> print encoding
    activity=water+ski&lake=Milford&code=52
    >>> import urllib2
    >>> req = urllib2.Request("http://www.example.com")
    >>> # Passing data as the second argument makes urlopen send a POST
    >>> fd = urllib2.urlopen(req, encoding)
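The only difference between the GET and POST submissions is where the encoded fields travel: in the URL for GET, in the request body for POST. A small Python 3 sketch, assuming the same example fields, shows how attaching data to a Request switches the method. The names body, get_req and post_req are illustrative, and nothing is sent over the network here.

```python
from urllib.parse import urlencode
from urllib.request import Request

fields = [('activity', 'water ski'), ('lake', 'Milford'), ('code', 52)]

# In Python 3 the POST body must be bytes, not str
body = urlencode(fields).encode('ascii')

get_req = Request("http://www.example.com")
post_req = Request("http://www.example.com", data=body)

# Nothing is transmitted until a request is passed to urllib.request.urlopen()
print(get_req.get_method())    # GET
print(post_req.get_method())   # POST
```

Passing post_req to urllib.request.urlopen() would then submit the form data in the request body.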
