45
Lecture 1: Multi-Tier Organization of an Information System Gustavo Alonso Systems Group Computer Science Department - ETH Zurich [email protected] http://www.systems.inf.ethz.ch/

Lecture 1: Multi-Tier Organization of an Information System · Multi-Tier Organization of an Information System ... One Tier can be a physical single tier ... J2EE Most web servers

Embed Size (px)

Citation preview

Lecture 1:

Multi-Tier Organization of an

Information System Gustavo Alonso

Systems Group

Computer Science Department - ETH Zurich

[email protected]

http://www.systems.inf.ethz.ch/

Reading assignment 1

Multi-tier systems and architecture patterns

Goal

Understand how patterns are used in practice and the role of multiple tiers when used in a variety of environments. Become familiar with the industry literature on the topic of patterns.

Assignment:

Read the following two documents:

• http://www.redbooks.ibm.com/redbooks/pdfs/sg246822.pdf (up to but excluding

part 3, run time implementation)

• http://support.rightscale.com/12-

Guides/EC2_Best_Practices/EC2_Site_Architecture_Diagrams

Compare their contents and patterns proposed with the ones presented in the lecture

Write a report (10 pages) analyzing the cloud patterns proposed by Rightscale based on what was discussed in the lecture and the ideas of the IBM Redbook

Deadline: 12th October, 18:00 pm => send per e-mail (pdf only)

©Gustavo Alonso, ETH Zürich. 2

©Gustavo Alonso, ETH Zürich. 3

Contents

Design of an information system

Layers and tiers

Bottom up design

Top down design

Architecture of an information system

One tier

Two tier (client/server)

Three tier (middleware)

N-tier architectures

Clusters and tier distribution

A note on performance

Tiers of an Information Systems

©Gustavo Alonso, ETH Zürich. 5

Why Tiers? Information Systems are divided into

different tiers. The tiers can be conceptual or real The tiers can be implemented using

different technologies or using a single vendor stack (cfr. vendor lock-in, standards)

Distinguishing between tiers is, in practice, not an obvious exercise as the different system modules will also be divided according to business function (the system of the legal department, the system of the accounting department, etc.)

While the different tiers are a software engineering concept, they have almost always appeared as a result of changes in technology (computers, hardware, or networking)

The notion of tiers will allow us to describe the architecture of modern IT infrastructures abstracting from the system details

Tiered architecture

Business function architecture

©Gustavo Alonso, ETH Zürich. 6

Some practical notation Horizontal layer: refers to a layer

that is application independent and intended to work on a wide range of business environments

Vertical architecture/solution: refers to systems tailored to concrete applications and industry branches (e.g., insurance or banking)

Single vendor stack: all the layers are produced by a single vendor (e.g., WebSphere of IBM, WebLogic of BEA)

Host (hosted solution): mainframe, mainframe based applications

TCO (Total Cost of Ownership): the global cost of an IT system, including maintenance, hw-sw-people, licenses, etc.

Plumbing – glue: the infrastructure necessary to allow the different layers to communicate with each other and within themselves

©Gustavo Alonso, ETH Zürich. 7

INTEGRATION TIER

ACCESS TIER

CLIENT

TIER

APP

TIER

wrapper

wrapper

wrapper

db db db

business object

business object

business object

api api api

web client

java client

WS client

Web servers, RMI, RPC

J2EE, CGI, JAVA Servlets API

databases, multi-tier systems

backends, mainframes

system federations, filters

object brokers, message brokers

app. Servers, TP-Monitors,

stored procedures, Java beans

Web browsers

specialized clients (Java, .NET)

Web Services...

HTML, SOAP, XML

MOM, HTML, IIOP, RMI-IIOP, SOAP, XML

MOM, IIOP, RMI-IIOP, XML

ODBC, JDBC, RPC, MOM, IIOP, RMI-IIOP

©Gustavo Alonso, ETH Zürich. 8

A game of boxes and arrows Each box represents a part of the system.

Each arrow represents a connection between two parts of the system.

The more boxes, the more modular the system: more opportunities for distribution and parallelism. This allows encapsulation, component based design, reuse.

The more boxes, the more arrows: more sessions (connections) need to be maintained, more coordination is necessary. The system becomes more complex to monitor and manage.

The more boxes, the greater the number of context switches and intermediate steps to go through before one gets to the data. Performance suffers considerably.

System designers try to balance the flexibility of modular design with the performance demands of real applications. Once a layer is established, it tends to migrate down and merge with lower layers.

There is no problem in system

design that cannot be solved by

adding a level of indirection.

There is no performance

problem that cannot be solved

by removing a level of

indirection.

©Gustavo Alonso, ETH Zürich. 9

Layers and tiers Client is any user or program that

wants to perform an operation over the system. Clients interact with the system through a presentation layer

The application logic determines what the system actually does. It takes care of enforcing the business rules and establish the business processes. The application logic can take many forms: programs, constraints, business processes, etc.

The resource manager deals with the organization (storage, indexing, and retrieval) of the data necessary to support the application logic. This is typically a database but it can also be a text retrieval system or any other data management system providing querying capabilities and persistence.

Client

Application Logic

Resource Manager

Presentation layer

Business rules

Business objects

Client

Server

Database

Client

Business processes

Persistent

storage

©Gustavo Alonso, ETH Zürich. 10

Top down design The functionality of a system is divided

among several modules. Modules cannot act as a separate component, their functionality depends on the functionality of other modules.

Hardware is typically homogeneous and the system is designed to be distributed from the beginning.

top-down design

PL-A PL-B PL-C

AL-A AL-B

RM-1 RM-2

top-down architecture

RM-1 RM-2

AL-A

AL-B

PL-A PL-B

PL-C

deploy

©Gustavo Alonso, ETH Zürich. 11

Top down design process

presentation layer

resource management layer

application logic layer

client

1. define access channels and client platforms

2. define presentation formats and protocols for the selected clients and protocols

3. define the functionality necessary to deliver the contents and formats needed at the presentation layer

4. define the data sources and data organization needed to implement the application logic

top-down design

©Gustavo Alonso, ETH Zürich. 12

Bottom up design In a bottom up design, many of the basic

components already exist. These are stand alone systems which need to be integrated into new systems.

The components do not necessarily cease to work as stand alone components. Often old applications continue running at the same time as new applications.

This approach has a wide application because the underlying systems already exist and cannot be easily replaced.

Much of the work and products in this area are related to middleware, the intermediate layer used to provide a common interface, bridge heterogeneity, and cope with distribution.

Legacy systems

New application

Legacy application

©Gustavo Alonso, ETH Zürich. 13

Bottom up design

bottom-up design

PL-A PL-B PL-C

AL-A AL-B

AL-A

AL-B

PL-A PL-B

PL-C

wrapper wrapper wrapper

wrapper wrapper wrapper

legacy application

legacy application

legacy system

legacy system

legacy system

©Gustavo Alonso, ETH Zürich. 14

Bottom up design process

presentation layer

resource management layer

application logic layer

client

1. define access channels and client platforms

2. examine existing resources and the functionality they offer

3. wrap existing resources and integrate their functionality into a consistent interface

4. adapt the output of the application logic so that it can be used with the required access channels and client protocols

bottom-up design

©Gustavo Alonso, ETH Zürich. 15

System design In most practical settings, system design

happens mostly bottom up

Legacy systems

Existing infrastructures

Reuse of existing services

The main cost of bottom up design lies on the wrappers and interfaces to older applications

Without proper planning, these interfaces and wrappers become not only performance problems, they also turn into major limitations in terms of functionality

Evolving the system may require to redo everything again as the different elements become very dependent on the implementation of others

Top down design is conceptually more attractive since it allows the designer to make clean choices and build elegant systems

In practice, however, a complete top down design can only be done if:

Nothing exists already

or

The systems to use have clear and well defined interfaces that abstract from the details of the system underneath. These interfaces should also be independent of the implementation (this is what Web services are all about)

Programming language interfaces are not independent of the implementation of the interface !!

Information System Architecture

©Gustavo Alonso, ETH Zürich. 17

Layered design One Tier

Everything in one big black box (presentation, logic and resources)

Examples:

Mainframe applications

Virtualization

One Tier can be a physical single tier (older mainframe applications) or virtual (a web server as seen by a web browser)

Two Tier

Presentation layer (client) separated from logic and resource layers (server)

Examples:

Applet

Google Earth

Presentation layer is not just the interface but the processing of the data to be presented to the external world (users or machines)

Three Tier

Each layer separated

Examples:

CORBA

J2EE

Most web servers

Three tier is just a way to architect an application. Typically, three tier systems are implemented with middleware: platforms supporting the development of the middle tier

N Tier

Generalization of the above

Most realistic description of real systems

©Gustavo Alonso, ETH Zürich. 18

Historical perspective One Tier

Mainframe

No API

Dumb terminals

Two Tier:

Client-Server

New technology

• PCs

• Networking

New software concepts

• RPC

• API

Three Tier:

Middleware

New Technology

• Internet

• Server proliferation

Platforms for development

©Gustavo Alonso, ETH Zürich. 19

Functional perspective One Tier

Highly centralized

No client code

Highly optimized

• Maintenance

• Distribution

• Performance

Two Tier

Use performance of client

Client specialization

Integration of Applications

Three Tier

Platform based

Better management of resources (pooling)

Clusters of computers

Dynamic allocation of resources

More distribution

©Gustavo Alonso, ETH Zürich. 20

Working with the perspectives There is no such thing as a wrong architecture:

Architectures are more or less appropriate for a scenario

All architectures have limitations

But often they do not matter since the intended use will never reach the limits

Know the limits of each architecture

Past use and problems as technology evolves do not “kill” an architecture

All forms of architectures are in use today

All of them are useful in given contexts

Key to choosing the right layered architecture is knowing

The requirements (performance, architectural, evolution)

The intrinsic properties of each architecture

The concrete application scenario

One Tier Architectures

©Gustavo Alonso, ETH Zürich. 22

One tier: fully centralized The presentation layer, application logic

and resource manager are built as a monolithic entity.

Users/programs access the system through display terminals but what is displayed and how it appears is controlled by the server. (These are “dumb” terminals).

This was the typical architecture of mainframes, offering several advantages:

no forced context switches in the control flow (everything happens within the system),

all is centralized, managing and controlling resources is easier,

the design can be highly optimized by blurring the separation between layers.

What is a mainframe? http://www.mainframes.com/whatis.htm

1-tier architecture

Server

©Gustavo Alonso, ETH Zürich. 23

Physical one tier architectures One tier systems tend to be monolithic:

No need for interfaces to the outside world (which eventually lead to the problem of screen scraping: Google for “host screen scraping”))

All the functionality sits in a single system and runs best in a single (large) computer

Are very difficult to take apart and divide in different modules (this is why mainframes are still around)

Historically, as long as one tier systems were the norm:

There was no need for standardization of interfaces

Separation between application, infrastructure, and operating system was fuzzy (e.g., discussions on transactional operating systems)

It was realistic to think about using dedicated architectures to run some of these applications

Today, one tier systems still abound:

Legacy systems that are difficult or very costly to replace

Offer a degree of reliability that is not clearly reachable with other solutions

Software optimized and tested over a long time period (unlike new software)

There is a general lack of long term planning and understanding of the real costs of software development and maintenance

©Gustavo Alonso, ETH Zürich. 24

Integration through Screen Scraping Technically, screen scraping is integration through the presentation layer by extracting

information from the screens presented to the user.

Practically, screen scraping involves capturing the information in the screen, parsing its contents, and extracting the relevant data.

There are many tools that help with screen scraping both for host environments and for HTML/web pages.

Screen scraping is expensive, difficult to maintain, as well as a solution with very limited bandwidth and high latency

http://www.cleo.com/TBPWP/Transaction-Based%20Processing.pdf

©Gustavo Alonso, ETH Zürich. 25

One tier is not the same as mainframe Older one tier architectures run on mainframes but not all applications on mainframes are

one tier (modern ones are definitely not)

Mainframe is less a concrete computer architecture than a set of concepts that are used in large IT data centers

Mainframes are used today because of

Reliability, serviceability and fault tolerance (availability) – the hardware in a mainframe has built in self-checking and self-recovering, and can seamlessly recover from errors

Scalability – mainframes have an architecture that is highly optimized with dedicated processors for OS tasks and a very high I/O bandwidth. It is difficult to beat a mainframe in terms of performance for large batch jobs or loads with a very large number of concurrent users.

Mainframes host many legacy applications that would be very costly to port to other language/platforms

Browse the manual of an IBM mainframe line and OS for more information:

http://publib.boulder.ibm.com/infocenter/zoslnctr/v1r7/index.jsp?topic=/com.ibm.zconcepts.doc/zconcepts_8.html

©Gustavo Alonso, ETH Zürich. 26

Why mainframes today? There are many reasons why mainframes are still in use today:

High performance computing: everything else being equal (CPU power, memory, resources), a centralized application will always be faster than a distributed one (because it does not have the networking overhead)

Reliability: mainframes implement reliability in hardware and can mask hardware failures automatically

Disk I/O: mainframes implement I/O through dedicated processors. An standard mainframe has many such processors and can run a large number of I/O channels (in the hundreds of thousands). The CPU does not stop processing to do I/O, reducing context switches and I/O overhead (this is very important in large databases, data warehouses, and large applications in general)

Virtualization: Mainframes were built around the notion of virtualization. Enterprise computing is moving very quickly towards virtualization and, in such scenarios, running very few large scale computers (e.g., a mainframe) is more attractive than running many smaller computers (e.g., a bunch of PC’s and servers)

©Gustavo Alonso, ETH Zürich. 27

Virtual one tier architectures as pattern Consider it an architectural pattern

Advantages:

No software distribution (or limited)

• Example: web browser (in spirit)

Centralized upgrades and maintenance

Software as a service

• Charge per access, work, or data volume

• No need for licensing software at the client

Centralized control

Disadvantages

No use of resources at the client

Client diversity has to be dealt with internally

Integration implies embedding the system to be integrated in the one-tier system (not practical at large scale, example mash-ups)

This pattern is becoming more and more important

Software as a service

Search engines (Google, Yahoo)

Web servers / browser architectures (e.g., SalesForce)

Two Tier Architectures:

Client-Server

©Gustavo Alonso, ETH Zürich. 29

Two tier: client/server As computers became more powerful, it was

possible to move the presentation layer to the client. This has several advantages:

Clients are independent of each other: one could have several presentation layers depending on what each client wants to do.

One can take advantage of the computing power at the client machine to have more sophisticated presentation layers. This also saves computer resources at the server machine.

It introduces the concept of API (Application Program Interface). An interface to invoke the system from the outside. It also allows designers to think about federating the systems into a single system.

The resource manager only sees one client: the application logic. This greatly helps with performance since there are no client connections/sessions to maintain.

2-tier architecture

Server

©Gustavo Alonso, ETH Zürich. 30

API in client/server Client/server systems introduced the notion of service (the client invokes a service

implemented by the server)

Together with the notion of service, client/server introduced the notion of service interface (how the client can invoke a given service)

Taken all together, the interfaces to all the services provided by a server (whether there are application or system specific) define the server’s Application Program Interface (API) that describes how to interact with the server from the outside

Many standardization efforts were triggered by the need to agree to common APIs for each type of server

resource management layer

service interface

service interface

service interface

service interface

server’s API

service service service service

©Gustavo Alonso, ETH Zürich. 31

Technical aspects of client-server There are clear technical advantages when going from one tier to two tier

architectures:

take advantage of client capacity to off-load work to the clients

work within the server takes place within one scope (almost as in 1 tier)

the server design is still tightly coupled and can be optimized by ignoring presentation issues

still relatively easy to manage and control from a software engineering point of view

However, two tier systems have disadvantages:

The server has to deal with all possible client connections. The maximum number of clients is given by the number of connections supported by the server.

Clients are “tied” to the system since there is no standard presentation layer. If one wants to connect to two systems, then the client needs two presentation layers.

There is no failure or load encapsulation. If the server fails, nobody can work. Similarly, the load created by a client will directly affect the work of others since they are all competing for the same resources.

Many of these problems were not very important until clients and networks started to proliferate (and they became critical with the Internet)

©Gustavo Alonso, ETH Zürich. 32

Scalability limitations of client-server On the server side

On the client side

Server

x 10’s

x 100’s

x 1000’s

x 10000’s

… (internet scale)

Clients

x 10’s

x 100’s

x 1000’s

x 10000’s

… (internet scale)

Client

Servers

©Gustavo Alonso, ETH Zürich. 33

Architectural limitations of client-server

the client is the point of integration (increasingly fat clients)

The responsibility of dealing with heterogeneous systems is shifted to the client.

The client becomes responsible for knowing where things are, how to get to them, and how to ensure consistency

This is tremendously inefficient from all points of view (software design, portability, code reuse, performance since the client capacity is limited, etc.).

There is very little that can be done to solve this problems if staying within the 2 tier model.

Server A Server B

If clients want to access two or more servers, a 2-tier architecture causes several problems:

the underlying systems don’t know about each other

there is no common business logic

©Gustavo Alonso, ETH Zürich. 34

Thin vs Fat clients Thin clients

Minimal functionality, typically restricted to the presentation layer

Advantages:

Small code base

Easy to update and upgrade

Easy to port across platforms

Complexity is left in the server

Disadvantages:

Dos not use computing capabilities of client (storage, CPU, memory)

Functionality of the client necessarily limited

Not everything can be done at the server

Fat clients

Extended functionality, including parts of the application layer and integration code

Advantages:

Client is powerful and does not load the server

Client processing can be tailored to different users

Added value at the client side (commercial advantage)

Disadvantages:

Larger code base

Increases maintenance and upgrades cost

Backward compatibility issues

Difficult to port across platforms

©Gustavo Alonso, ETH Zürich. 35

Two tier architectures as pattern Two tier is a common integration pattern

Advantages

Uses resources at the client Customization by modifying the client Client can be a UI or an application Forces the application to have an interface Integration is easier through the client interface Centralized control for server (one –tier)

Disadvantages

Software at the client (remote) is part of the system Maintenance requires coordinating server and client Backward compatibility issues for older clients Stateless vs Stateful clients Connection management at the server becomes bottleneck Client is server dependent (no common client across servers) Performance loss through context switch client-server and networking

As a pattern, two tier or client-server is the most basic architectural pattern

Three Tier Architectures:

Middleware

©Gustavo Alonso, ETH Zürich. 37

Three tier: middleware In a 3 tier system, the three layers

are fully separated.

The layers are also typically distributed taking advantage of the complete modularity of the design (in two tier systems, the server is typically centralized)

A middleware based system is a 3 tier architecture. This is a bit oversimplified but conceptually correct since the underlying systems can be treated as black boxes. In fact, 3 tier makes only sense in the context of middleware systems (otherwise the client has the same problems as in a 2 tier system).

3-tier architecture

©Gustavo Alonso, ETH Zürich. 38

Middleware Middleware is just a level of indirection

between clients and other layers of the system.

It introduces an additional layer of business logic encompassing all underlying systems.

By doing this, a middleware system:

simplifies the design of the clients by reducing the number of interfaces,

provides transparent access to the underlying systems,

acts as the platform for inter-system functionality and high level application logic, and

takes care of locating resources, accessing them, and gathering results.

But a middleware system is just a system like any other! It can also be 1 tier, 2 tier, 3 tier ...

Integration logic

Clients

Resource managers

Application logic

Server A Server B

Middleware

Middleware

©Gustavo Alonso, ETH Zürich. 39

Technical aspects of middleware The introduction of a middleware layer helps in that:

the number of necessary interfaces is greatly reduced:

• clients see only one system (the middleware)

• local applications see only one system (the middleware)

it centralizes control and provides a common integration platform

it makes necessary functionality widely available to all clients

it allows to implement functionality that otherwise would be very difficult to

provide (e.g., transactions)

it is a first step towards dealing with some forms of application

heterogeneity and integration.

The middleware layer does not help in that:

it is another indirection level,

it is complex software,

it is an integration platform, not a complete system

Needs to be standardized

©Gustavo Alonso, ETH Zürich. 40

A three tier middleware based system

External clients

connecting logic

control

user logic

internal clients

2 t

ier

syst

ems

Resource

managers

wrappers

middleware

Resource

manager

2 tier system

mid

dle

war

e sy

stem

External client

©Gustavo Alonso, ETH Zürich. 41

Three tier as pattern

The introduction of a middle tier

is a common mechanism to solve

design problems of all kinds

Advantages

Middle tier allows to implement

additional functionality without

touching the server

Can act as point of integration

Hides servers from clients

Hides clients from servers

Disadvantages

Performance

Additional logic in the system (often difficult to trace if not well designed)

Added complexity

N Tier Architectures

©Gustavo Alonso, ETH Zürich. 43

N-tier: connecting to the Web N-tier architectures result from

connecting several three tier systems to each other and/or by adding an additional layer to allow clients to access the system through a Web server

The Web layer was initially external to the system (a true additional layer); today, it is slowly being incorporated into a presentation layer that resides on the server side (part of the middleware infrastructure in a three tier system, or part of the server directly in a two tier system)

The addition of the Web layer led to the notion of “application servers”, which was used to refer to middleware platforms supporting access through the Web

client

resource management layer

application logic layer middleware

presentation layer

Web server

Web browser

HTML filter

©Gustavo Alonso, ETH Zürich. 44

Each tier adds functionality

Complete systems covering all aspects of security, performance, high availability, etc. tend to have several tiers to address all these concerns

©Gustavo Alonso, ETH Zürich. 45

INTERNET

FIREWALL

LAN

Web server cluster

LAN, gateways

LAN

internal clients

LAN

middleware application

logic

resource management

layer database server

LAN

middleware application

logic

additional resource management layers

LAN

Wrappers and

gateways

file server

application

N-tier systems in reality