Upload
trankhue
View
222
Download
0
Embed Size (px)
Citation preview
Lecture 1:
Multi-Tier Organization of an
Information System Gustavo Alonso
Systems Group
Computer Science Department - ETH Zurich
http://www.systems.inf.ethz.ch/
Reading assignment 1
Multi-tier systems and architecture patterns
Goal
Understand how patterns are used in practice and the role of multiple tiers when used in a variety of environments. Become familiar with the industry literature on the topic of patterns.
Assignment:
Read the following two documents:
• http://www.redbooks.ibm.com/redbooks/pdfs/sg246822.pdf (up to but excluding
part 3, run time implementation)
• http://support.rightscale.com/12-
Guides/EC2_Best_Practices/EC2_Site_Architecture_Diagrams
Compare their contents and patterns proposed with the ones presented in the lecture
Write a report (10 pages) analyzing the cloud patterns proposed by Rightscale based on what was discussed in the lecture and the ideas of the IBM Redbook
Deadline: 12th October, 18:00 pm => send per e-mail (pdf only)
©Gustavo Alonso, ETH Zürich. 2
©Gustavo Alonso, ETH Zürich. 3
Contents
Design of an information system
Layers and tiers
Bottom up design
Top down design
Architecture of an information system
One tier
Two tier (client/server)
Three tier (middleware)
N-tier architectures
Clusters and tier distribution
A note on performance
©Gustavo Alonso, ETH Zürich. 5
Why Tiers? Information Systems are divided into
different tiers. The tiers can be conceptual or real The tiers can be implemented using
different technologies or using a single vendor stack (cfr. vendor lock-in, standards)
Distinguishing between tiers is, in practice, not an obvious exercise as the different system modules will also be divided according to business function (the system of the legal department, the system of the accounting department, etc.)
While the different tiers are a software engineering concept, they have almost always appeared as a result of changes in technology (computers, hardware, or networking)
The notion of tiers will allow us to describe the architecture of modern IT infrastructures abstracting from the system details
Tiered architecture
Business function architecture
©Gustavo Alonso, ETH Zürich. 6
Some practical notation Horizontal layer: refers to a layer
that is application independent and intended to work on a wide range of business environments
Vertical architecture/solution: refers to systems tailored to concrete applications and industry branches (e.g., insurance or banking)
Single vendor stack: all the layers are produced by a single vendor (e.g., WebSphere of IBM, WebLogic of BEA)
Host (hosted solution): mainframe, mainframe based applications
TCO (Total Cost of Ownership): the global cost of an IT system, including maintenance, hw-sw-people, licenses, etc.
Plumbing – glue: the infrastructure necessary to allow the different layers to communicate with each other and within themselves
©Gustavo Alonso, ETH Zürich. 7
INTEGRATION TIER
ACCESS TIER
CLIENT
TIER
APP
TIER
wrapper
wrapper
wrapper
db db db
business object
business object
business object
api api api
web client
java client
WS client
Web servers, RMI, RPC
J2EE, CGI, JAVA Servlets API
databases, multi-tier systems
backends, mainframes
system federations, filters
object brokers, message brokers
app. Servers, TP-Monitors,
stored procedures, Java beans
Web browsers
specialized clients (Java, .NET)
Web Services...
HTML, SOAP, XML
MOM, HTML, IIOP, RMI-IIOP, SOAP, XML
MOM, IIOP, RMI-IIOP, XML
ODBC, JDBC, RPC, MOM, IIOP, RMI-IIOP
©Gustavo Alonso, ETH Zürich. 8
A game of boxes and arrows Each box represents a part of the system.
Each arrow represents a connection between two parts of the system.
The more boxes, the more modular the system: more opportunities for distribution and parallelism. This allows encapsulation, component based design, reuse.
The more boxes, the more arrows: more sessions (connections) need to be maintained, more coordination is necessary. The system becomes more complex to monitor and manage.
The more boxes, the greater the number of context switches and intermediate steps to go through before one gets to the data. Performance suffers considerably.
System designers try to balance the flexibility of modular design with the performance demands of real applications. Once a layer is established, it tends to migrate down and merge with lower layers.
There is no problem in system
design that cannot be solved by
adding a level of indirection.
There is no performance
problem that cannot be solved
by removing a level of
indirection.
©Gustavo Alonso, ETH Zürich. 9
Layers and tiers Client is any user or program that
wants to perform an operation over the system. Clients interact with the system through a presentation layer
The application logic determines what the system actually does. It takes care of enforcing the business rules and establish the business processes. The application logic can take many forms: programs, constraints, business processes, etc.
The resource manager deals with the organization (storage, indexing, and retrieval) of the data necessary to support the application logic. This is typically a database but it can also be a text retrieval system or any other data management system providing querying capabilities and persistence.
Client
Application Logic
Resource Manager
Presentation layer
Business rules
Business objects
Client
Server
Database
Client
Business processes
Persistent
storage
©Gustavo Alonso, ETH Zürich. 10
Top down design The functionality of a system is divided
among several modules. Modules cannot act as a separate component, their functionality depends on the functionality of other modules.
Hardware is typically homogeneous and the system is designed to be distributed from the beginning.
top-down design
PL-A PL-B PL-C
AL-A AL-B
RM-1 RM-2
top-down architecture
RM-1 RM-2
AL-A
AL-B
PL-A PL-B
PL-C
deploy
©Gustavo Alonso, ETH Zürich. 11
Top down design process
presentation layer
resource management layer
application logic layer
client
1. define access channels and client platforms
2. define presentation formats and protocols for the selected clients and protocols
3. define the functionality necessary to deliver the contents and formats needed at the presentation layer
4. define the data sources and data organization needed to implement the application logic
top-down design
©Gustavo Alonso, ETH Zürich. 12
Bottom up design In a bottom up design, many of the basic
components already exist. These are stand alone systems which need to be integrated into new systems.
The components do not necessarily cease to work as stand alone components. Often old applications continue running at the same time as new applications.
This approach has a wide application because the underlying systems already exist and cannot be easily replaced.
Much of the work and products in this area are related to middleware, the intermediate layer used to provide a common interface, bridge heterogeneity, and cope with distribution.
Legacy systems
New application
Legacy application
©Gustavo Alonso, ETH Zürich. 13
Bottom up design
bottom-up design
PL-A PL-B PL-C
AL-A AL-B
AL-A
AL-B
PL-A PL-B
PL-C
wrapper wrapper wrapper
wrapper wrapper wrapper
legacy application
legacy application
legacy system
legacy system
legacy system
©Gustavo Alonso, ETH Zürich. 14
Bottom up design process
presentation layer
resource management layer
application logic layer
client
1. define access channels and client platforms
2. examine existing resources and the functionality they offer
3. wrap existing resources and integrate their functionality into a consistent interface
4. adapt the output of the application logic so that it can be used with the required access channels and client protocols
bottom-up design
©Gustavo Alonso, ETH Zürich. 15
System design In most practical settings, system design
happens mostly bottom up
Legacy systems
Existing infrastructures
Reuse of existing services
The main cost of bottom up design lies on the wrappers and interfaces to older applications
Without proper planning, these interfaces and wrappers become not only performance problems, they also turn into major limitations in terms of functionality
Evolving the system may require to redo everything again as the different elements become very dependent on the implementation of others
Top down design is conceptually more attractive since it allows the designer to make clean choices and build elegant systems
In practice, however, a complete top down design can only be done if:
Nothing exists already
or
The systems to use have clear and well defined interfaces that abstract from the details of the system underneath. These interfaces should also be independent of the implementation (this is what Web services are all about)
Programming language interfaces are not independent of the implementation of the interface !!
©Gustavo Alonso, ETH Zürich. 17
Layered design One Tier
Everything in one big black box (presentation, logic and resources)
Examples:
Mainframe applications
Virtualization
One Tier can be a physical single tier (older mainframe applications) or virtual (a web server as seen by a web browser)
Two Tier
Presentation layer (client) separated from logic and resource layers (server)
Examples:
Applet
Google Earth
Presentation layer is not just the interface but the processing of the data to be presented to the external world (users or machines)
Three Tier
Each layer separated
Examples:
CORBA
J2EE
Most web servers
Three tier is just a way to architect an application. Typically, three tier systems are implemented with middleware: platforms supporting the development of the middle tier
N Tier
Generalization of the above
Most realistic description of real systems
©Gustavo Alonso, ETH Zürich. 18
Historical perspective One Tier
Mainframe
No API
Dumb terminals
Two Tier:
Client-Server
New technology
• PCs
• Networking
New software concepts
• RPC
• API
Three Tier:
Middleware
New Technology
• Internet
• Server proliferation
Platforms for development
©Gustavo Alonso, ETH Zürich. 19
Functional perspective One Tier
Highly centralized
No client code
Highly optimized
• Maintenance
• Distribution
• Performance
Two Tier
Use performance of client
Client specialization
Integration of Applications
Three Tier
Platform based
Better management of resources (pooling)
Clusters of computers
Dynamic allocation of resources
More distribution
©Gustavo Alonso, ETH Zürich. 20
Working with the perspectives There is no such thing as a wrong architecture:
Architectures are more or less appropriate for a scenario
All architectures have limitations
But often they do not matter since the intended use will never reach the limits
Know the limits of each architecture
Past use and problems as technology evolves do not “kill” an architecture
All forms of architectures are in use today
All of them are useful in given contexts
Key to choosing the right layered architecture is knowing
The requirements (performance, architectural, evolution)
The intrinsic properties of each architecture
The concrete application scenario
©Gustavo Alonso, ETH Zürich. 22
One tier: fully centralized The presentation layer, application logic
and resource manager are built as a monolithic entity.
Users/programs access the system through display terminals but what is displayed and how it appears is controlled by the server. (These are “dumb” terminals).
This was the typical architecture of mainframes, offering several advantages:
no forced context switches in the control flow (everything happens within the system),
all is centralized, managing and controlling resources is easier,
the design can be highly optimized by blurring the separation between layers.
What is a mainframe? http://www.mainframes.com/whatis.htm
1-tier architecture
Server
©Gustavo Alonso, ETH Zürich. 23
Physical one tier architectures One tier systems tend to be monolithic:
No need for interfaces to the outside world (which eventually lead to the problem of screen scraping: Google for “host screen scraping”))
All the functionality sits in a single system and runs best in a single (large) computer
Are very difficult to take apart and divide in different modules (this is why mainframes are still around)
Historically, as long as one tier systems were the norm:
There was no need for standardization of interfaces
Separation between application, infrastructure, and operating system was fuzzy (e.g., discussions on transactional operating systems)
It was realistic to think about using dedicated architectures to run some of these applications
Today, one tier systems still abound:
Legacy systems that are difficult or very costly to replace
Offer a degree of reliability that is not clearly reachable with other solutions
Software optimized and tested over a long time period (unlike new software)
There is a general lack of long term planning and understanding of the real costs of software development and maintenance
©Gustavo Alonso, ETH Zürich. 24
Integration through Screen Scraping Technically, screen scraping is integration through the presentation layer by extracting
information from the screens presented to the user.
Practically, screen scraping involves capturing the information in the screen, parsing its contents, and extracting the relevant data.
There are many tools that help with screen scraping both for host environments and for HTML/web pages.
Screen scraping is expensive, difficult to maintain, as well as a solution with very limited bandwidth and high latency
http://www.cleo.com/TBPWP/Transaction-Based%20Processing.pdf
©Gustavo Alonso, ETH Zürich. 25
One tier is not the same as mainframe Older one tier architectures run on mainframes but not all applications on mainframes are
one tier (modern ones are definitely not)
Mainframe is less a concrete computer architecture than a set of concepts that are used in large IT data centers
Mainframes are used today because of
Reliability, serviceability and fault tolerance (availability) – the hardware in a mainframe has built in self-checking and self-recovering, and can seamlessly recover from errors
Scalability – mainframes have an architecture that is highly optimized with dedicated processors for OS tasks and a very high I/O bandwidth. It is difficult to beat a mainframe in terms of performance for large batch jobs or loads with a very large number of concurrent users.
Mainframes host many legacy applications that would be very costly to port to other language/platforms
Browse the manual of an IBM mainframe line and OS for more information:
http://publib.boulder.ibm.com/infocenter/zoslnctr/v1r7/index.jsp?topic=/com.ibm.zconcepts.doc/zconcepts_8.html
©Gustavo Alonso, ETH Zürich. 26
Why mainframes today? There are many reasons why mainframes are still in use today:
High performance computing: everything else being equal (CPU power, memory, resources), a centralized application will always be faster than a distributed one (because it does not have the networking overhead)
Reliability: mainframes implement reliability in hardware and can mask hardware failures automatically
Disk I/O: mainframes implement I/O through dedicated processors. An standard mainframe has many such processors and can run a large number of I/O channels (in the hundreds of thousands). The CPU does not stop processing to do I/O, reducing context switches and I/O overhead (this is very important in large databases, data warehouses, and large applications in general)
Virtualization: Mainframes were built around the notion of virtualization. Enterprise computing is moving very quickly towards virtualization and, in such scenarios, running very few large scale computers (e.g., a mainframe) is more attractive than running many smaller computers (e.g., a bunch of PC’s and servers)
©Gustavo Alonso, ETH Zürich. 27
Virtual one tier architectures as pattern Consider it an architectural pattern
Advantages:
No software distribution (or limited)
• Example: web browser (in spirit)
Centralized upgrades and maintenance
Software as a service
• Charge per access, work, or data volume
• No need for licensing software at the client
Centralized control
Disadvantages
No use of resources at the client
Client diversity has to be dealt with internally
Integration implies embedding the system to be integrated in the one-tier system (not practical at large scale, example mash-ups)
This pattern is becoming more and more important
Software as a service
Search engines (Google, Yahoo)
Web servers / browser architectures (e.g., SalesForce)
©Gustavo Alonso, ETH Zürich. 29
Two tier: client/server As computers became more powerful, it was
possible to move the presentation layer to the client. This has several advantages:
Clients are independent of each other: one could have several presentation layers depending on what each client wants to do.
One can take advantage of the computing power at the client machine to have more sophisticated presentation layers. This also saves computer resources at the server machine.
It introduces the concept of API (Application Program Interface). An interface to invoke the system from the outside. It also allows designers to think about federating the systems into a single system.
The resource manager only sees one client: the application logic. This greatly helps with performance since there are no client connections/sessions to maintain.
2-tier architecture
Server
©Gustavo Alonso, ETH Zürich. 30
API in client/server Client/server systems introduced the notion of service (the client invokes a service
implemented by the server)
Together with the notion of service, client/server introduced the notion of service interface (how the client can invoke a given service)
Taken all together, the interfaces to all the services provided by a server (whether there are application or system specific) define the server’s Application Program Interface (API) that describes how to interact with the server from the outside
Many standardization efforts were triggered by the need to agree to common APIs for each type of server
resource management layer
service interface
service interface
service interface
service interface
server’s API
service service service service
©Gustavo Alonso, ETH Zürich. 31
Technical aspects of client-server There are clear technical advantages when going from one tier to two tier
architectures:
take advantage of client capacity to off-load work to the clients
work within the server takes place within one scope (almost as in 1 tier)
the server design is still tightly coupled and can be optimized by ignoring presentation issues
still relatively easy to manage and control from a software engineering point of view
However, two tier systems have disadvantages:
The server has to deal with all possible client connections. The maximum number of clients is given by the number of connections supported by the server.
Clients are “tied” to the system since there is no standard presentation layer. If one wants to connect to two systems, then the client needs two presentation layers.
There is no failure or load encapsulation. If the server fails, nobody can work. Similarly, the load created by a client will directly affect the work of others since they are all competing for the same resources.
Many of these problems were not very important until clients and networks started to proliferate (and they became critical with the Internet)
©Gustavo Alonso, ETH Zürich. 32
Scalability limitations of client-server On the server side
On the client side
Server
x 10’s
x 100’s
x 1000’s
x 10000’s
… (internet scale)
Clients
x 10’s
x 100’s
x 1000’s
x 10000’s
… (internet scale)
Client
Servers
…
©Gustavo Alonso, ETH Zürich. 33
Architectural limitations of client-server
the client is the point of integration (increasingly fat clients)
The responsibility of dealing with heterogeneous systems is shifted to the client.
The client becomes responsible for knowing where things are, how to get to them, and how to ensure consistency
This is tremendously inefficient from all points of view (software design, portability, code reuse, performance since the client capacity is limited, etc.).
There is very little that can be done to solve this problems if staying within the 2 tier model.
Server A Server B
If clients want to access two or more servers, a 2-tier architecture causes several problems:
the underlying systems don’t know about each other
there is no common business logic
©Gustavo Alonso, ETH Zürich. 34
Thin vs Fat clients Thin clients
Minimal functionality, typically restricted to the presentation layer
Advantages:
Small code base
Easy to update and upgrade
Easy to port across platforms
Complexity is left in the server
Disadvantages:
Dos not use computing capabilities of client (storage, CPU, memory)
Functionality of the client necessarily limited
Not everything can be done at the server
Fat clients
Extended functionality, including parts of the application layer and integration code
Advantages:
Client is powerful and does not load the server
Client processing can be tailored to different users
Added value at the client side (commercial advantage)
Disadvantages:
Larger code base
Increases maintenance and upgrades cost
Backward compatibility issues
Difficult to port across platforms
©Gustavo Alonso, ETH Zürich. 35
Two tier architectures as pattern Two tier is a common integration pattern
Advantages
Uses resources at the client Customization by modifying the client Client can be a UI or an application Forces the application to have an interface Integration is easier through the client interface Centralized control for server (one –tier)
Disadvantages
Software at the client (remote) is part of the system Maintenance requires coordinating server and client Backward compatibility issues for older clients Stateless vs Stateful clients Connection management at the server becomes bottleneck Client is server dependent (no common client across servers) Performance loss through context switch client-server and networking
As a pattern, two tier or client-server is the most basic architectural pattern
©Gustavo Alonso, ETH Zürich. 37
Three tier: middleware In a 3 tier system, the three layers
are fully separated.
The layers are also typically distributed taking advantage of the complete modularity of the design (in two tier systems, the server is typically centralized)
A middleware based system is a 3 tier architecture. This is a bit oversimplified but conceptually correct since the underlying systems can be treated as black boxes. In fact, 3 tier makes only sense in the context of middleware systems (otherwise the client has the same problems as in a 2 tier system).
3-tier architecture
©Gustavo Alonso, ETH Zürich. 38
Middleware Middleware is just a level of indirection
between clients and other layers of the system.
It introduces an additional layer of business logic encompassing all underlying systems.
By doing this, a middleware system:
simplifies the design of the clients by reducing the number of interfaces,
provides transparent access to the underlying systems,
acts as the platform for inter-system functionality and high level application logic, and
takes care of locating resources, accessing them, and gathering results.
But a middleware system is just a system like any other! It can also be 1 tier, 2 tier, 3 tier ...
Integration logic
Clients
Resource managers
Application logic
Server A Server B
Middleware
Middleware
©Gustavo Alonso, ETH Zürich. 39
Technical aspects of middleware The introduction of a middleware layer helps in that:
the number of necessary interfaces is greatly reduced:
• clients see only one system (the middleware)
• local applications see only one system (the middleware)
it centralizes control and provides a common integration platform
it makes necessary functionality widely available to all clients
it allows to implement functionality that otherwise would be very difficult to
provide (e.g., transactions)
it is a first step towards dealing with some forms of application
heterogeneity and integration.
The middleware layer does not help in that:
it is another indirection level,
it is complex software,
it is an integration platform, not a complete system
Needs to be standardized
©Gustavo Alonso, ETH Zürich. 40
A three tier middleware based system
External clients
connecting logic
control
user logic
internal clients
2 t
ier
syst
ems
Resource
managers
wrappers
middleware
Resource
manager
2 tier system
mid
dle
war
e sy
stem
External client
©Gustavo Alonso, ETH Zürich. 41
Three tier as pattern
The introduction of a middle tier
is a common mechanism to solve
design problems of all kinds
Advantages
Middle tier allows to implement
additional functionality without
touching the server
Can act as point of integration
Hides servers from clients
Hides clients from servers
Disadvantages
Performance
Additional logic in the system (often difficult to trace if not well designed)
Added complexity
©Gustavo Alonso, ETH Zürich. 43
N-tier: connecting to the Web N-tier architectures result from
connecting several three tier systems to each other and/or by adding an additional layer to allow clients to access the system through a Web server
The Web layer was initially external to the system (a true additional layer); today, it is slowly being incorporated into a presentation layer that resides on the server side (part of the middleware infrastructure in a three tier system, or part of the server directly in a two tier system)
The addition of the Web layer led to the notion of “application servers”, which was used to refer to middleware platforms supporting access through the Web
client
resource management layer
application logic layer middleware
presentation layer
Web server
Web browser
HTML filter
©Gustavo Alonso, ETH Zürich. 44
Each tier adds functionality
Complete systems covering all aspects of security, performance, high availability, etc. tend to have several tiers to address all these concerns
©Gustavo Alonso, ETH Zürich. 45
INTERNET
FIREWALL
LAN
Web server cluster
LAN, gateways
LAN
internal clients
LAN
middleware application
logic
resource management
layer database server
LAN
middleware application
logic
additional resource management layers
LAN
Wrappers and
gateways
file server
application
N-tier systems in reality