

Web Services

CPTE 433
John Beckett

Players

• Server – provides resources in terms of “pages”

• Client
  – Browser on a PC
  – Browser on a smaller device
  – Current trend: “App”

• HTTP: higher-level protocol, developed with W3C involvement and standardized by the IETF
  – Migrating toward HTTPS

• IP: lower-level protocol defined by the IETF

Why Open Standards

• Seems like a dumb question now
• Formerly, systems could not talk to each other
• Then gateways were used to go between
• Now common standards and protocols are used widely
• We are no longer dependent on a single vendor to make the net work

Building Blocks

• URL specifies where a service is located:
  – Protocol
  – Username/Password
    • http://jbeckett:abcdefg@hw.cs.southern.edu
  – Hostname
  – Directory
  – Parameter(s)
    • Name
    • Value
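The building blocks listed above can be pulled apart with Python's standard `urllib.parse` module. This is a minimal sketch: the username, password, and hostname come from the slide's example, while the `/Rooms/` path is borrowed from a URL elsewhere in these slides and the `name=value&other=42` query string is invented purely for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# The query string is a made-up example; only the user, password and host
# come from the slide.
url = "http://jbeckett:abcdefg@hw.cs.southern.edu/Rooms/?name=value&other=42"

parts = urlsplit(url)

protocol = parts.scheme             # the part before "://"
username = parts.username           # before the ":" in the user-info section
password = parts.password           # after the ":" in the user-info section
hostname = parts.hostname           # between "@" and the first "/"
directory = parts.path              # from the first "/" up to the "?"
parameters = parse_qs(parts.query)  # maps each name to a list of values

print(protocol, username, password, hostname, directory)
for name, values in parameters.items():
    print(name, "=", values[0])
```

Note that `parse_qs` returns a list per name because a parameter may legally appear more than once in a URL.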

Web 1 Server-Side

• Form is used to send information and request a response

• Data is transferred from client to host by either of:
  – POST – data is not visible on URL line
  – GET – data is visible on URL line
• Data can be implied by the URL itself
• Server-side program accepts data, processes it, and returns the result via HTTP (Perl, PHP, ASP)
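To make the difference between the two transfer methods concrete, the sketch below builds (but does not send) one request of each kind with Python's standard library. The endpoint `http://www.example.com/register.php` and the form fields are hypothetical, chosen only for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical form fields and endpoint, invented for this example.
form = {"name": "Ada", "course": "CPTE 433"}
endpoint = "http://www.example.com/register.php"

# Form data is encoded as name=value pairs joined by "&".
encoded = urlencode(form)

# GET: the data rides on the URL line, so it appears in history and logs.
get_request = Request(endpoint + "?" + encoded, method="GET")

# POST: the same data travels in the request body, not on the URL line.
post_request = Request(endpoint, data=encoded.encode("ascii"), method="POST")

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url, post_request.data)
```

"Not visible on the URL line" does not mean hidden: without HTTPS, a POST body still crosses the network in the clear.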

Web 1 Client Side

• Embedded scripting language
  – Originally LiveScript (Netscape)
  – Became JavaScript for marketing reasons
  – Microsoft developed JScript (ignored by market)
  – Microsoft includes JavaScript now
  – Legal name is ECMAScript

Web 2: AJAX

• Asynchronous JavaScript And XML
  – Evens load on the server
• Set of techniques to disconnect from traditional query/response cycle
  – Can make it more difficult for user to determine state (e.g. “Did I click it?”), resulting in duplicate or confused requests
• Best example is Google Earth/Maps
• Now very fashionable in the industry

HTTP Messages to Know

• 200 (OK) – Request completed
• 301 (Moved Permanently) – Need to begin using the new URL
• 302 (Found) – Redirect to specified URL
• 307 (Temporary Redirect) – Redirect to specified URL temporarily
• 401 (Unauthorized) – Try again with authentication
• 403 (Forbidden) – Access refused
• 404 (Not Found) – No such page
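Python's standard library carries the registered reason phrase for each of these codes, which is handy when reading logs. The sketch below looks up the codes from this slide; the helper names `describe` and `is_redirect` are made up for this example, and the official phrases may differ from the informal descriptions used in a lecture.

```python
from http import HTTPStatus

# The status codes listed on the slide.
codes_to_know = [200, 301, 302, 307, 401, 403, 404]


def describe(code: int) -> str:
    """Return the numeric code followed by its standard reason phrase."""
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase}"


def is_redirect(code: int) -> bool:
    """True for the redirect codes covered on this slide."""
    return code in (301, 302, 307)


for code in codes_to_know:
    suffix = " (client should follow the Location header)" if is_redirect(code) else ""
    print(describe(code) + suffix)
```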

Webmaster Role

• Enable people to do their own updates
  – Data
  – Web pages
• Use software to do this
  – Content Management System
  – Dreamweaver?
    • Adobe Contribute allows you to control style while users enter content
  – Microsoft SharePoint

Web SLA

• Lead time for changes
  – Better yet, they do their own changes
• Performance:
  – For the proponent of the site, latency at a given number of queries per second
  – For the site visitor, other traffic is unimportant so your response to them as an individual is key
  – Metaphor: Web store versus brick store
    • In the bricks, people see traffic
    • In the Web store, they think of typing in another URL

Architectures

• Static Web server
• CGI server
  – “CGI” in this chapter is a generic term for all server-side methods.
  – Perl is traditional CGI, starts an additional process for each request
  – Module based: PHP, VBScript, Java
  – Thread creation can be a problem in mod_perl (better to not use it)
• Database-driven site
  – Wide variety of methods used for this
  – Well-developed: Content Management System
  – CPU performance can be an issue
• Multimedia (streaming) server
  – CPU performance can be an issue

LAMP

• Linux, Apache, MySQL, Perl
• Linux, Apache, MySQL, PHP
• Linux, Apache, MySQL, Python

• Best to have a name for your application architecture to save time

• Like anything else, standardize your application architecture

Multiple Servers Per Host

• Apache and IIS can sense the URL the user was going to and automatically serve the appropriate page.
  – Hostname
  – Protocol (http:// versus https:// versus ftp://)
• You could use multiple Ethernet ports for the same purpose
  – Improves performance if you have a very high-bandwidth Internet connection
• You could virtualize the entire server
  – Simplifies https:// configuration
  – Requires separate IP address per site
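The idea behind serving several sites from one host can be sketched as a lookup keyed on protocol and hostname. This illustrates the principle only, not how Apache or IIS implement it; the hostnames, document roots, and the function name `document_root` are all hypothetical.

```python
# Hypothetical site table: hostnames and document roots are invented.
SITES = {
    ("http", "www.example.com"): "/var/www/example",
    ("https", "www.example.com"): "/var/www/example-secure",
    ("http", "intranet.example.com"): "/var/www/intranet",
}
DEFAULT_ROOT = "/var/www/default"


def document_root(protocol: str, host_header: str) -> str:
    """Pick a document root from the protocol and the HTTP Host header."""
    # Drop an optional port ("www.example.com:8080") and normalize case.
    # Simplification: IPv6 literals such as "[::1]:80" are not handled.
    hostname = host_header.split(":", 1)[0].strip().lower()
    return SITES.get((protocol.lower(), hostname), DEFAULT_ROOT)


print(document_root("http", "www.example.com"))
print(document_root("https", "WWW.EXAMPLE.COM:443"))
print(document_root("http", "unknown.example.org"))
```

Falling back to a default site for unrecognized hostnames mirrors the common practice of having a catch-all site, though the exact fallback rules depend on the server software.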

The Scaling Dilemma

• If your site is not useful, it won’t be used much

• If your site is useful, it will be overwhelmed

• Horizontal: Use a cluster
• Vertical: Segregate by function
  – Web application
  – Database server / Web services server

Horizontal Scaling

• Round-robin name server records

C:\>nslookup google.com
Server: cns.s3woodstock.ga.atlanta.comcast.net

Name: google.com
Addresses: 72.14.207.99, 64.233.187.99, 64.233.167.99
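The effect of round-robin records can be simulated without touching the network. The sketch below assumes a name server that rotates its address list by one position for each answer, and clients that simply take the first address; real servers and resolvers vary in how they order records. The class name `RoundRobinRecords` is invented, and the addresses are the ones from the nslookup output above.

```python
from collections import deque


class RoundRobinRecords:
    """Simulate a name server that rotates its address list per answer."""

    def __init__(self, addresses):
        self._records = deque(addresses)

    def answer(self):
        """Return the current ordering, then rotate for the next query."""
        response = list(self._records)
        self._records.rotate(-1)  # move the first address to the end
        return response


server = RoundRobinRecords(["72.14.207.99", "64.233.187.99", "64.233.167.99"])

# Each simulated client uses the first address in the answer it receives.
chosen = [server.answer()[0] for _ in range(6)]
print(chosen)
```

Six successive lookups land on each of the three addresses twice, which is the load-spreading effect the technique relies on.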

DNS Caching

• DNS caching can defeat round-robins:
  1. Browser remembers last lookup
  2. DNS client in the OS the browser is using may remember
     – ipconfig /flushdns
  3. Forwarders may remember

Better answer:
• Hardware load balancer

Vertical Scaling

• Partition your service according to type of use:

• Static Web application
• Dynamic Web application
• Database server
• Media file server

Application State

• The Web is a stateless system
  – It doesn’t “remember” of itself where the conversation was previously
• Applications must add on state control
  – Cookies
  – Server-side information (tokens connect one page to the next)
• The state-management system can present a scalability problem
  – Might be the hardest to solve
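One way to add state without a shared server-side session store is to hand the client a signed token, for example in a cookie. The sketch below is an illustration under assumptions, not a complete session system: the key, the `state|signature` token layout, and the function names are invented here, and a real deployment would also need expiry and secure key storage.

```python
import hashlib
import hmac

# Hypothetical key: in practice this is a long random secret kept off the page.
SECRET_KEY = b"replace-with-a-real-secret"


def _signature(state: str) -> str:
    return hmac.new(SECRET_KEY, state.encode("utf-8"), hashlib.sha256).hexdigest()


def sign_state(state: str) -> str:
    """Return a token carrying the state plus a signature over it."""
    return f"{state}|{_signature(state)}"


def verify_state(token: str):
    """Return the state if the signature checks out, otherwise None."""
    state, separator, digest = token.rpartition("|")
    if not separator:
        return None
    expected = _signature(state)
    # Constant-time comparison avoids leaking how many characters matched.
    if hmac.compare_digest(digest.encode("utf-8"), expected.encode("utf-8")):
        return state
    return None


token = sign_state("cart=3;step=2")
print(token)
print(verify_state(token))
```

Because any server holding the key can verify the token, a cluster does not need to share session memory, which speaks to the scalability concern above. The state is readable by the client, though, so nothing confidential belongs in it.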

Security

• Information is going over the Net
• Cross-scripting

• Partition your data so that “above the fold” items are not kept on the application server

Above the fold: So significant that a newspaper would put it on the top half of page 1

Secure Connections & Certificates

• SA Responsibility: Key management
  – Private part should not be on the same server!
• When is there a person available?
• CPU time can be an issue
  – If you are network bound, it is not an issue

Protecting Content

• Deny automatic directory generation
• Beware of directory traversal
• Use server-side to hide things. In PHP:

<html>
<head><title>Hiding the secret</title></head>
<body>
<P>No, you aren't going to see the secret!</P>
<?php $a = "My Secret Code is 42" ?>
</body>
</html>

http://computing.southern.edu/jbeckett/secret.php
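The warning about directory traversal can be illustrated with a check that refuses any request path escaping the document root. This is a minimal sketch: the root `/var/www/html` and the function name are hypothetical, the path is assumed to be URL-decoded already, and symbolic links are not considered.

```python
import posixpath

# Hypothetical document root, invented for this example.
DOCUMENT_ROOT = "/var/www/html"


def resolve_request_path(url_path: str):
    """Map a URL path to a file path, or return None if it leaves the root."""
    # Join onto the root, then collapse any "." and ".." components.
    joined = posixpath.join(DOCUMENT_ROOT, url_path.lstrip("/"))
    candidate = posixpath.normpath(joined)
    # Compare against the root plus "/" so that a sibling directory such as
    # "/var/www/html-backup" is not mistaken for part of the root.
    if candidate == DOCUMENT_ROOT or candidate.startswith(DOCUMENT_ROOT + "/"):
        return candidate
    return None


print(resolve_request_path("/jbeckett/secret.php"))
print(resolve_request_path("/../../etc/passwd"))
```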

Cross-Scripting

• Visitor looks at your HTML, and creates their own HTML that mimics the parameters yours provides, except that it is hostile.
• Check the referrer information
• Double-validate
  – In the browser
  – In your back-end program
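The back-end half of double validation might look like the sketch below: the server re-checks every parameter as if the browser-side checks never ran, since a hostile page can skip them. The form fields `size` and `quantity`, their allowed values, and the function name are hypothetical.

```python
import re

# Hypothetical rules for an invented order form.
ALLOWED_SIZES = {"small", "medium", "large"}
QUANTITY_PATTERN = re.compile(r"[0-9]{1,3}")


def validate_order(form: dict) -> list:
    """Return a list of error messages; an empty list means the form is valid."""
    errors = []
    # Accept only values the real form could have sent.
    if form.get("size") not in ALLOWED_SIZES:
        errors.append("size must be one of: " + ", ".join(sorted(ALLOWED_SIZES)))
    # Treat the quantity as untrusted text until it matches the pattern.
    quantity = str(form.get("quantity", ""))
    if not QUANTITY_PATTERN.fullmatch(quantity) or int(quantity) == 0:
        errors.append("quantity must be a whole number from 1 to 999")
    return errors


print(validate_order({"size": "medium", "quantity": "2"}))
print(validate_order({"size": "<script>", "quantity": "-5"}))
```

Browser-side validation remains useful for quick feedback to honest users; the server-side check is the one that actually protects the application.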

SQL Injection

• Don’t use what people type in as SQL – it’s a very powerful language

• Choose which SQL elements to include based on the user’s choices

• Are you using SQL that is stored in a database?
  – Can somebody put hostile SQL in?
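Both pieces of advice can be shown with Python's built-in `sqlite3` module: typed-in text is passed as a bound parameter rather than pasted into the SQL, and the user's choice only selects among SQL fragments the program already contains. The `students` table, its columns, and the function name are invented for this sketch.

```python
import sqlite3

# Map each permitted user choice to an SQL element written by the programmer.
SORT_COLUMNS = {"name": "name", "room": "room"}


def find_students(conn, name_fragment: str, sort_choice: str):
    """Search by name, ordering by a column chosen from a fixed list."""
    # Unknown choices fall back to a safe default instead of reaching the SQL.
    order_by = SORT_COLUMNS.get(sort_choice, "name")
    sql = f"SELECT name, room FROM students WHERE name LIKE ? ORDER BY {order_by}"
    # The "?" placeholder makes the driver treat the input purely as data.
    return conn.execute(sql, (f"%{name_fragment}%",)).fetchall()


# Hypothetical in-memory table so the example runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, room TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [("Alice", "B12"), ("Bob", "A03")],
)

print(find_students(conn, "Ali", "name"))
print(find_students(conn, "x'; DROP TABLE students; --", "room; DROP TABLE students"))
```

The hostile input in the second call matches no rows and leaves the table intact, because it never becomes part of the SQL statement.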

Limit Potential Damage

• When possible, your Web server should contain only a copy of the “real” data

• Perhaps the Web server should dish out static pages that are created when the underlying data changes

• Bonus of this technique: Performance improvement

• Use read-only mode when possible
• Use OS permissions, limit to least needed
• Log, log, log
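The static-page idea above can be sketched as a small publishing step that runs whenever the underlying data changes. The page layout, the function names, and the choice to write a temporary file and then rename it are assumptions made for this illustration.

```python
import html
import os
import tempfile
from pathlib import Path


def render_page(title: str, rows: list) -> str:
    """Build a static HTML page, escaping the data so it cannot inject markup."""
    items = "\n".join(f"  <li>{html.escape(row)}</li>" for row in rows)
    return (
        f"<html><head><title>{html.escape(title)}</title></head><body>\n"
        f"<ul>\n{items}\n</ul>\n</body></html>\n"
    )


def publish(title: str, rows: list, output_path: Path) -> None:
    """Write the page to a temporary file, then swap it into place."""
    fd, temp_name = tempfile.mkstemp(dir=output_path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(render_page(title, rows))
    # mkstemp creates owner-only files; make the page readable by the server.
    os.chmod(temp_name, 0o644)
    # Renaming within one directory means visitors never see a half-written page.
    os.replace(temp_name, output_path)
```

A typical call would be `publish("Rooms", rows, Path("/var/www/html/index.html"))`, where the path is again only an example. The Web server then needs read access to the output directory and nothing more.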

Webmaster or SA?

• Webmaster should have the privileges he/she needs, and not more

• Consider establishing a separate host for the Web – perhaps even a separate security zone

• Text: Don’t become the Webmaster, let the company hire one.

Types of Web Changes

• Update – Newer material than what’s there

• Change – Revising structure
• Fix – Correcting improper contents or behavior

Three Web hosts?

• www-draft – where new things are developed

• www-qa – where new things are placed for verification before going “live”

• www – where the public sees things

What Is This Server For?

• Internal, external, or both?
• Specific application?
• Who will be using it?
• Who will be updating it?
• Uptime requirements?
• Account management?
• Storage needs?
• Traffic expectations?

Namespace Principles

• People expect URLs to keep working
• A URL should not have confidential info in it (duh)
  – Student ID numbers
• Use the Include capability of your server software to implement your namespace scheme
  – /etc/dav for instance

External Sites – 4 steps

Task                  Example for sgsdaschool.org
Registering domain    No-ip.com
DNS hosting           204.15.252.7
Web hosting           216.249.119.71
Content               /home/sgsdaschool

If you have content already online when the domain is registered, Web spiders will find you automatically.

Never pay money to be “listed in all the best search engines.” Worthwhile search engines will find you. Only thing better is “buying a word”.

Why Outsource Web Hosting?

Pros
• No local software to install or maintain
  – Already set up with most packages you need
• Dashboard makes it easy
• They specialize in Web apps
• Cost can be extremely low

Cons
• Batch transfers to the server take longer
• Lack of control

Web Application Authentication

• .htaccess/.htpasswd – Cumbersome to maintain, do not scale well

• PAM (Pluggable Authentication Modules)
• SQL lookup on external database
• Active Directory lookup

• Who is updating the password service?

Mashup Apps

• Quickly bring life to an idea
• Grab existing site and re-format data
  – Is it your data to control?
• XML / Web Services are used extensively for such apps
• Inherent inefficiency may produce scaling problems
• Ability to cache may address scaling problems
  – …and create propagation delay problems
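The trade-off between caching and propagation delay shows up in even a tiny time-to-live cache. In the sketch below, the class name `TTLCache`, the `fetch` callback, and the injectable `clock` are all invented for illustration; `fetch` stands in for whatever slow call a mashup makes to its source site.

```python
import time


class TTLCache:
    """Cache fetched values for a fixed number of seconds."""

    def __init__(self, ttl_seconds, fetch, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._fetch = fetch    # called with the key on a miss or after expiry
        self._clock = clock    # injectable so the behavior can be tested
        self._entries = {}     # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            # Fresh enough: serve the cached copy, which may already be stale
            # relative to the source. This is the propagation delay.
            return entry[1]
        value = self._fetch(key)
        self._entries[key] = (now + self._ttl, value)
        return value
```

A longer time-to-live means fewer calls to the source site and better scaling, at the cost of changes taking up to that long to reach visitors.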

http://hw.cs.southern.edu/Rooms/
