Bologna, 25 - 26 September 2003
Giovanna Ferrari ([email protected])
School of Computing Science
University of Newcastle upon Tyne
Giorgia Lodi ([email protected])
Department of Computer Science
University of Bologna
V. Ghini, F. Panzieri
Summary
• TAPAS Architecture: a quick look through it
• Sensors
• Reporting Service
• Configuration Service
• Controller Service
• Concluding remarks
• References
TAPAS Architecture
Applications
QoS Monitoring and Violation Detection (Carlos’ talk)
Inter-Org. Interaction Regulation (Nick’s talk)
Middleware Services for QoS Management, Monitoring and Adaptation (Gio&Gio’s talk)
QoS Enabled Application Server
Middleware Services
[Figure: middleware services architecture diagram. Components shown: Partner SLA, End user SLA, Configuration, Controller, Reporting, Interpreter, Coordinator, Security, Sensors, Local Resources, Remote Resources, Application (APPL), End User]
Sensors & Reporting Service
[Figure: the middleware services architecture diagram shown on the Middleware Services slide]
Sensors (1/5)
• Which resources to monitor:
  – Physical Resources (e.g. hosts, networks)
  – Logical Resources (e.g. the applications, file system, middleware services)
• Data collected by means of Sensors (i.e. sensors for physical and logical resources)
• Sensors activated by the Reporting Service when needed
Sensors (2/5)
• SLA used to derive basic QoS parameters for each type of the aforementioned resources
SLA item                                                                 Quantification
Percentage of transactions completed within defined Performance Levels   %
Availability of all Components Connected to the Network                  %
Availability of Applications on the Network                              %
Application response Time during peak periods                            ms
Median Application response Time                                         ms
Average one way latency                                                  ms
External memory access time                                              ms
CPU usage                                                                %
Sensors (3/5)
Host Sensor
Raw Metrics:
  – CPU running Queue Length
  – CPU blocked Queue Length
  – CPU Swapped Queue Length
  – System Call Rates
  – CPU User Percentage
  – CPU System Percentage
  – CPU idle Percentage
  – Free Available Memory
  – …
Composite Metrics:
  – CPU usage
  – Availability
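As a rough illustration of how a Host Sensor's raw metrics might be combined into the composite metrics on this slide, the sketch below assumes that CPU usage is the sum of the user and system percentages and that availability is the share of successful host probes. Both formulas, and every name in the code, are assumptions made for illustration; the slides do not define them.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class HostRawMetrics:
    """One sample of raw metrics from a hypothetical host sensor."""
    cpu_user_pct: float
    cpu_system_pct: float
    cpu_idle_pct: float
    free_memory_kb: int


def cpu_usage_pct(sample: HostRawMetrics) -> float:
    """Composite metric (assumed formula): user + system CPU percentage."""
    return sample.cpu_user_pct + sample.cpu_system_pct


def availability_pct(probe_results: Sequence[bool]) -> float:
    """Composite metric (assumed formula): share of successful host probes."""
    if not probe_results:
        raise ValueError("at least one probe result is required")
    return 100.0 * sum(probe_results) / len(probe_results)


sample = HostRawMetrics(cpu_user_pct=30.0, cpu_system_pct=12.5,
                        cpu_idle_pct=57.5, free_memory_kb=204800)
print(cpu_usage_pct(sample))
print(availability_pct([True, True, False, True]))
```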
Sensors (4/5)
Host Sensor (one on each of two hosts)
Raw Metrics (per host):
  – Network RTT
  – …
Composite Metrics:
  – Average one way latency
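One simple way to obtain the composite metric from the raw one is sketched below. It assumes a symmetric network path, so that one way latency is approximated by half of the mean round-trip time; this approximation and the function name are assumptions, not something the slides specify.

```python
from statistics import mean
from typing import Iterable


def average_one_way_latency_ms(rtt_samples_ms: Iterable[float]) -> float:
    """Assumed formula: mean RTT divided by two (symmetric path)."""
    samples = list(rtt_samples_ms)
    if not samples:
        raise ValueError("at least one RTT sample is required")
    return mean(samples) / 2.0


print(average_one_way_latency_ms([10.0, 20.0, 30.0]))
```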
Sensors (5/5)
Applications Sensors
Raw Metrics:
  – Percentage of transactions completed
  – Application Response Time (RT)
  – …
Composite Metrics:
  – Availability
  – Median Application RT
  – Application RT during peak periods
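The application-level metrics can be derived from a list of measured response times. The sketch below is a minimal illustration: the function names and the idea of expressing a "defined Performance Level" as a single response-time limit are assumptions.

```python
from statistics import median
from typing import Sequence


def median_response_time_ms(response_times_ms: Sequence[float]) -> float:
    """Median Application RT over the collected samples."""
    return median(response_times_ms)


def pct_completed_within(response_times_ms: Sequence[float],
                         limit_ms: float) -> float:
    """Percentage of transactions whose RT is within the given limit."""
    if not response_times_ms:
        raise ValueError("at least one response time is required")
    within = sum(1 for rt in response_times_ms if rt <= limit_ms)
    return 100.0 * within / len(response_times_ms)


times_ms = [120.0, 80.0, 200.0, 95.0]
print(median_response_time_ms(times_ms))
print(pct_completed_within(times_ms, limit_ms=150.0))
```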
Reporting Service (RS)
• It periodically collects Raw Metrics by means of the sensors
• It calculates Composite Metrics according to the SLA items
• It records the data into repositories for statistical analysis, as well as for providing trustworthy reports
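The three RS duties above can be sketched as a single collection step. This is a toy model under stated assumptions: sensors are plain callables returning a dictionary of raw metrics, composite rules are functions over those raw metrics, and the repository is an in-memory list. None of these interfaces come from the TAPAS design.

```python
from typing import Callable, Dict, List

RawMetrics = Dict[str, float]
Sensor = Callable[[], RawMetrics]
CompositeRule = Callable[[RawMetrics], float]


class ReportingService:
    """Hypothetical RS: collect raw metrics, derive composites, record them."""

    def __init__(self, sensors: List[Sensor],
                 rules: Dict[str, CompositeRule]) -> None:
        self._sensors = sensors
        self._rules = rules
        self.repository: List[Dict[str, float]] = []  # stand-in for a real store

    def collect_once(self) -> Dict[str, float]:
        # 1. Collect raw metrics from every sensor.
        raw: RawMetrics = {}
        for sensor in self._sensors:
            raw.update(sensor())
        # 2. Compute the composite metrics required by the SLA items.
        composite = {name: rule(raw) for name, rule in self._rules.items()}
        # 3. Record the result for later analysis and reporting.
        self.repository.append(composite)
        return composite


def host_sensor() -> RawMetrics:
    """Fake sensor returning fixed values, for illustration only."""
    return {"cpu_user_pct": 20.0, "cpu_system_pct": 5.0}


rs = ReportingService(
    sensors=[host_sensor],
    rules={"cpu_usage_pct": lambda raw: raw["cpu_user_pct"] + raw["cpu_system_pct"]},
)
report = rs.collect_once()
print(report)
```

In a real deployment `collect_once` would be driven by a timer; the period is left out here because the slides do not give one.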
Configuration Service (1/2)
[Figure: the middleware services architecture diagram shown on the Middleware Services slide]
Configuration Service (2/2)
• Enabled at deployment time
• Responsible for admission control (i.e. discovery, negotiation and reservation of the resources)
• Produces AgreedQoS, i.e. a contract between the middleware platform and the environment which is to be hosted
• AgreedQoS used as input for the Controller Service (CTRL)
Configuration Service: The Protocol (1/2)
• A Configuration Service (CS) for each Application Server (AS)
  – Application Service Provider (ASP): a distributed architecture consisting of a cluster of ASs (i.e. cluster of CSs)
• Two different CS personalities: Leader and Slave
  – Leader gets SLS as input and starts the admission control protocol
  – Slaves contacted by the Leader in order to get their local resource availability and to negotiate QoS parameters
Configuration Service: The Protocol (2/2)
• Leader forms the group of CSs
  – IP addresses of Slaves known e.g. at deployment time, …
• For each member of the group (Slave), Leader kicks off the admission control protocol:
  – Asks local RS for its own resource availability and asks each Slave for its local resource availability
      • Every Slave contacts its own RS to get resource info
  – Every CS (i.e. Leader and Slaves) books its local available resources
  – Leader gets Composite Metrics from each Slave and from itself; starts the QoS negotiation: AgreedQoS contract as final result
  – From this contract, Leader confirms (totally, partially, or not at all) the initial resource booking
      • Resources are allocated
  – Leader instructs Instantiate Manager to instantiate application components into containers according to AgreedQoS
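The booking and confirmation steps of the protocol can be sketched as follows. This is a deliberately simplified, single-process model: resources are reduced to one abstract number of CPU units per CS, negotiation is a greedy fill in group order, and all class and function names are hypothetical. The real protocol runs over group communication and negotiates full QoS parameters.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ConfigurationService:
    name: str
    available_units: float       # as reported by the local Reporting Service
    booked_units: float = 0.0

    def book_all(self) -> float:
        """Tentatively book every locally available resource."""
        self.booked_units = self.available_units
        return self.booked_units

    def confirm(self, units: float) -> None:
        """Keep only the confirmed part of the booking (possibly none)."""
        self.booked_units = min(units, self.booked_units)


def admission_control(leader: ConfigurationService,
                      slaves: List[ConfigurationService],
                      required_units: float) -> Optional[Dict[str, float]]:
    """Return a toy AgreedQoS (units per CS), or None if admission fails."""
    group = [leader, *slaves]
    # Every CS (Leader and Slaves) books its local available resources.
    offers = {cs.name: cs.book_all() for cs in group}
    if sum(offers.values()) < required_units:
        for cs in group:
            cs.confirm(0.0)      # booking confirmed "not at all"
        return None
    agreed: Dict[str, float] = {}
    remaining = required_units
    for cs in group:
        share = min(offers[cs.name], remaining)
        cs.confirm(share)        # total or partial confirmation
        agreed[cs.name] = share
        remaining -= share
    return agreed


leader = ConfigurationService("leader", 50.0)
slaves = [ConfigurationService("slave-1", 40.0),
          ConfigurationService("slave-2", 30.0)]
print(admission_control(leader, slaves, required_units=100.0))
```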
What does the AgreedQoS look like?
• It is an object seen as the run-time version of the SLS
• It contains useful info about the platform which is going to be used for the execution of distributed applications
• It contains info about the resources reserved and allocated after the negotiation
3 database replicas:
colline:130.136.x.x
bess:130.136.x.x
bologna:120.128.x.x
Machines Availability
newton:130.136.x.x CPU: 50%, Memory 50%, RTT=#ms, no database av.
leonardo:130.136.x.x Web page:xxxx available
……………………..
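One possible in-memory shape for such an object is sketched below. The class and field names are assumptions; the values are taken from the example on this slide, with elided addresses kept elided and the unspecified RTT left unset.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class DatabaseReplica:
    host: str
    address: str                      # kept elided, exactly as on the slide


@dataclass(frozen=True)
class MachineAvailability:
    host: str
    address: str
    cpu_pct: float
    memory_pct: float
    rtt_ms: Optional[float] = None    # value not given on the slide
    has_database: bool = False


@dataclass
class AgreedQoS:
    """Hypothetical run-time view of the SLS after negotiation."""
    database_replicas: List[DatabaseReplica] = field(default_factory=list)
    machines: List[MachineAvailability] = field(default_factory=list)


agreed = AgreedQoS(
    database_replicas=[
        DatabaseReplica("colline", "130.136.x.x"),
        DatabaseReplica("bess", "130.136.x.x"),
        DatabaseReplica("bologna", "120.128.x.x"),
    ],
    machines=[
        MachineAvailability("newton", "130.136.x.x",
                            cpu_pct=50.0, memory_pct=50.0),
    ],
)
print(len(agreed.database_replicas), agreed.machines[0].host)
```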
Group Communication Requirements
• Group communication required for carrying out the admission control protocol
• Some primitives necessary in order to manage the group of CSs in a dynamic manner:
  – CreateGroup: creates a group of CSs for a certain application
  – Membership: obtains information about the members of a group
  – Join: allows a new group member to join a group
  – Leave: allows a group member to leave a group
  – Send: enables message exchange among the members of a group (with QoS guarantees as well)
  – Deliver: delivers messages to the invoker in a reliable and QoS-aware way
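To make the six primitives concrete, here is a minimal in-memory sketch. It only models the interface: everything runs in one process, and reliability and QoS guarantees are not modelled at all. The class name and method signatures are assumptions, not an existing group communication API.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Set, Tuple


class InMemoryGroupComm:
    """Toy stand-in for a group communication service (no QoS, no network)."""

    def __init__(self) -> None:
        self._groups: Dict[str, Set[str]] = {}
        self._inboxes: Dict[Tuple[str, str], Deque[str]] = defaultdict(deque)

    def create_group(self, group: str) -> None:
        self._groups.setdefault(group, set())

    def membership(self, group: str) -> List[str]:
        return sorted(self._groups[group])

    def join(self, group: str, member: str) -> None:
        self._groups[group].add(member)

    def leave(self, group: str, member: str) -> None:
        self._groups[group].discard(member)
        self._inboxes.pop((group, member), None)

    def send(self, group: str, sender: str, message: str) -> None:
        # Queue the message for every current member except the sender.
        for member in self._groups[group]:
            if member != sender:
                self._inboxes[(group, member)].append(message)

    def deliver(self, group: str, member: str) -> List[str]:
        # Hand over, in order, all messages queued for the invoker.
        inbox = self._inboxes[(group, member)]
        messages = list(inbox)
        inbox.clear()
        return messages


gc = InMemoryGroupComm()
gc.create_group("app")
gc.join("app", "leader")
gc.join("app", "slave-1")
gc.send("app", "leader", "resource availability?")
print(gc.deliver("app", "slave-1"))
```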
Controller Service (1/3)
[Figure: the middleware services architecture diagram shown on the Middleware Services slide]
Controller Service (2/3)
• At runtime, if CTRL detects any variation in QoS values (e.g. fluctuation of the workload or change in required I/O QoS levels), actions are taken to adapt the platform
• These actions represent the Adaptive strategies that manage and control the resources and the output delivered
• Adaptation starts when the QoS level is close to the Warning Point, even if the SLA is not breached yet
• At the Breaching Point the adaptation has failed and an event notification is sent
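The Warning Point and Breaching Point can be read as two thresholds on a monitored value. The sketch below assumes a metric where higher is worse (for example a response time in ms) and that the warning threshold lies strictly below the breaching one; the names and the threshold semantics are assumptions for illustration.

```python
from enum import Enum


class QoSState(Enum):
    OK = "ok"
    WARNING = "warning"      # adaptation should start
    BREACHED = "breached"    # adaptation has failed, send a notification


def classify(value: float, warning_point: float,
             breaching_point: float) -> QoSState:
    """Classify a 'higher is worse' metric against the two thresholds."""
    if not warning_point < breaching_point:
        raise ValueError("warning point must be below breaching point")
    if value >= breaching_point:
        return QoSState.BREACHED
    if value >= warning_point:
        return QoSState.WARNING
    return QoSState.OK


print(classify(180.0, warning_point=150.0, breaching_point=200.0))
```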
Controller Service (3/3)
• It receives from the CS the AgreedQoS contract
• It periodically retrieves from the RS the Actual QoS Values (QoS achieved according to the measurements of the sensors)
• It compares the two sets of values
  – If there is a violation, it decides the Adaptive Strategy to deploy
  – The adaptation may require a resource re-allocation
      • CS re-negotiates QoS
  – The adaptation may fail: an exception is raised to the higher monitoring service (i.e. Carlos’ monitoring)
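One iteration of this compare-and-adapt cycle might look like the sketch below. It assumes that the AgreedQoS can be reduced to upper bounds per metric, that a missing measurement counts as a violation, and that the adaptive strategy is a callback returning whether it succeeded. These are all simplifications, and the names are hypothetical.

```python
from typing import Callable, Dict, List


class AdaptationFailed(Exception):
    """Raised towards the higher-level monitoring service."""


def find_violations(agreed: Dict[str, float],
                    actual: Dict[str, float]) -> List[str]:
    """Metrics whose measured value exceeds the agreed upper bound.

    A metric with no measurement is treated as violated (an assumption).
    """
    return sorted(name for name, limit in agreed.items()
                  if actual.get(name, float("inf")) > limit)


def control_step(agreed: Dict[str, float], actual: Dict[str, float],
                 adapt: Callable[[List[str]], bool]) -> List[str]:
    """Compare the two value sets and run the adaptive strategy if needed."""
    violations = find_violations(agreed, actual)
    if violations and not adapt(violations):
        raise AdaptationFailed(f"could not adapt for: {violations}")
    return violations


agreed = {"median_rt_ms": 200.0, "cpu_usage_pct": 80.0}
actual = {"median_rt_ms": 250.0, "cpu_usage_pct": 60.0}
print(control_step(agreed, actual, adapt=lambda violated: True))
```

The `adapt` callback is where a resource re-allocation, and a QoS re-negotiation by the CS, would be triggered.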
Concluding Remarks
• Two possible implementations of Configuration and CTRL services:
  1. Both services as extensions of AS
     • assume one AS per machine
       – CTRL responsible for local resources only
       – invokes CS if it detects a deviation from AgreedQoS
       – CS re-negotiates QoS, if necessary (i.e., adaptation)
  2. CTRL as support service for AS
     • can be hosted by TTP
• UNIBO wishes to evaluate alternative 1
  – conceptually simpler
• More about it, tomorrow …
References
• W. Beckman, J. Crowcroft, P. Gevros and M. Oleneva, "TAPAS Deliverable D1", University of Cambridge and Adesso AG, 2002-10-17.
• L. R. Welch, B. A. Shirazi, B. Ravindran and C. Bruggeman, "DeSiDeRaTa: QoS Management Technology for Dynamic Scalable, Dependable, Real-Time System", University of Texas at Arlington, USA.
• M. Debusmann and A. Keller, "SLA-driven Management of Distributed Systems using the Common Information Model", IBM Research Division, NY, August 16 2002.