Decision-Theoretic Planning with (Re)Deployment of Components in Distributed Real-time & Embedded Systems Douglas C. Schmidt [email protected] Nishanth Shankaran, John S. Kinnebrew, Gautam Biswas, Dipa Suri, & Adam S. Howell Research Sponsored by NASA, Lockheed Martin, & Raytheon

Page 1: Douglas C. Schmidt schmidt@dre.vanderbilt

Decision-Theoretic Planning with (Re)Deployment of Components in Distributed Real-time & Embedded Systems

Douglas C. Schmidt [email protected]

Nishanth Shankaran, John S. Kinnebrew, Gautam Biswas, Dipa Suri, & Adam S. Howell

Research Sponsored by NASA, Lockheed Martin, & Raytheon

Page 2: Douglas C. Schmidt schmidt@dre.vanderbilt

Enterprise Distributed Real-time Embedded (DRE) Systems

• Operate under limited resources

• Tight real-time performance QoS constraints

• Dynamic & uncertain environments

• Dynamic & changing goals

• Distribution of computation

• Multiple onboard processors

• Task distribution among satellites/processors

• Integration of information

• Data collection – Gizmo agent

• Process data – Science agent

• Downlink to earth – Comm. agent

• Coordinated operation

[Figure labels: limited CPU, memory & network bandwidth; possible hardware failure]

Page 3: Douglas C. Schmidt schmidt@dre.vanderbilt

Motivating Application: Earth Science Enterprise Mission

[Figure: operational string with components Collect Data, Process Data, & Downlink to Earth; labels: Trigger, Message A, Message B, Final Result]

System Description

• End-to-end system tasks/workflows are represented as operational strings of components

• Operational strings simultaneously share resources

• Strings are dynamically added/removed from the system based on mission & mode

System requirements

1. Automatically & accurately adapt to dynamic changes in requirements & conditions

2. Handle failures arising from system failures

• Classes of operational strings with respect to importance

• Mission Critical, Mission Support, & Best Effort

Integrated Solution

Spreading Activation Partial Order Planner (SA-POP) for decision-theoretic planning under resource constraints, combined with Resource Allocation & Control Engine (RACE), a dynamic resource management framework for DRE systems

Page 4: Douglas C. Schmidt schmidt@dre.vanderbilt

SA-POP Research & Development Challenges

Research Challenges

1. Efficiently handle uncertainty in planning

2. Incorporate resource-aware scheduling with planning

Development Challenges

1. Take advantage of functionally interchangeable components to efficiently meet resource constraints

2. Plan with multiple interacting goals, but produce distinct operational strings

[Figure: SA-POP with inputs Probabilistic Domain Knowledge, System Knowledge, & Mission Goals, connected to Deployment, Configuration, & Control]

SA-POP is available at: www.dre.vanderbilt.edu/~jkinnebrew/SA-POP

Page 5: Douglas C. Schmidt schmidt@dre.vanderbilt

SA-POP: Planning in DRE Systems with Components

Task is an abstraction of functionality

• Multiple (parameterized) components may have the same function but different resource usage

Task Network specifies probabilistic effects & requirements for tasks

• Condition nodes specify data flow & system/environmental conditions

• Task nodes have links to/from condition nodes specifying effects/preconditions

• Links incorporate probabilistic information about domains

[Figure: SA-POP (Task Network, Task Map, Spreading Activation, Planning, Scheduling) takes Probabilistic Domain Knowledge, System Knowledge, & Mission Goals and produces Operational Strings for Deployment, Configuration, & Control]

Task Map allows conversion between tasks & components

• Maps tasks (functionality abstraction) to parameterized components (implementation)

• Associates expected or worst case resource usage with each implementation

Operational String specifies a component-based application to achieve a goal

• Set of tasks along with ordering & timing constraints

• Data connections between tasks

• Implementation (parameterized component) suggested for each task
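
A task map of the kind described above can be sketched as a lookup from tasks to interchangeable implementations, each carrying its resource usage. This is only an illustration: the task names, component names, resource figures, and the `choose_implementation` helper are hypothetical and are not taken from SA-POP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Implementation:
    component: str   # hypothetical parameterized component name
    cpu: int         # expected (or worst-case) CPU usage, in percent
    memory_mb: int   # expected (or worst-case) memory usage, in MB


# Hypothetical task map: each task (functionality abstraction) maps to
# one or more functionally interchangeable implementations.
TASK_MAP = {
    "process_data": [
        Implementation("HighResFilter", cpu=60, memory_mb=512),
        Implementation("LowResFilter", cpu=25, memory_mb=128),
    ],
    "downlink": [
        Implementation("CompressedDownlink", cpu=30, memory_mb=64),
    ],
}


def choose_implementation(task, cpu_free, memory_free_mb):
    """Return the first listed implementation that fits, or None."""
    for impl in TASK_MAP[task]:
        if impl.cpu <= cpu_free and impl.memory_mb <= memory_free_mb:
            return impl
    return None
```

With 50% CPU and 256 MB free, `choose_implementation("process_data", 50, 256)` skips the high-resolution component and returns the low-resolution one, which is how functionally equivalent components help meet resource constraints.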

Page 6: Douglas C. Schmidt schmidt@dre.vanderbilt

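SA-POP: Expected Utility Calculation using Spreading Activation

[Figure: SA-POP (Task Network, Task Map, Spreading Activation, Planning, Scheduling) and a task node $a_j$ with precondition nodes $c_i$ and effect nodes $c_k$]

• Precondition nodes $c_i$: input data or system/environment preconditions

• Effect nodes $c_k$: output data or system/environment effects

• Forward propagation of probabilities

• Backward propagation of utilities

Precondition Link Weights

$$w_{ij} = \frac{P(a_j^s \mid c_i = \text{true}) - P(a_j^s \mid c_i = \text{false})}{P(a_j^s \mid c_i = \text{true}) + P(a_j^s \mid c_i = \text{false})}, \qquad a_j^s = \text{action } a_j \text{ was successful}$$

Effect Link Weights

$$w_{jk} = P(c_k = \text{true} \mid a_j^x) \ \text{or} \ P(c_k = \text{false} \mid a_j^x), \qquad a_j^x = \text{action } a_j \text{ was executed}$$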
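
The two propagation directions can be illustrated with a deliberately simplified sketch for a single task node. It assumes independent preconditions and a plain probability-weighted sum of utilities; these simplifications and all function names are assumptions made for illustration, not SA-POP's actual update rules.

```python
from math import prod


def task_success_probability(precondition_probs):
    """Forward step: probability that all preconditions of a task hold,
    assuming (for this sketch only) that they are independent."""
    return prod(precondition_probs)


def effect_probability(p_task_success, p_effect_given_success):
    """Forward step: probability that an effect condition becomes true."""
    return p_task_success * p_effect_given_success


def expected_utility(p_task_success, effects):
    """Backward step: utility flowing from effect nodes back to the task.

    `effects` is a list of (P(effect | task succeeded), utility) pairs.
    """
    return sum(effect_probability(p_task_success, p) * u for p, u in effects)


# Two preconditions feed one task, which has two effects.
p_success = task_success_probability([0.9, 0.8])
eu = expected_utility(p_success, [(0.5, 10.0), (1.0, 2.0)])
```

A planner can then prefer the task with the highest expected utility among those that achieve the same open condition.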

Page 7: Douglas C. Schmidt schmidt@dre.vanderbilt

SA-POP: Operational String Generation

Four hierarchical decision points in each interleaved planning+scheduling step:

Partial Order Planning:

1. Goal/subgoal choice: choose an open condition, which is a goal or subgoal unsatisfied in the current plan.

2. Task choice: choose a task that can achieve the current open condition.

Resource Constrained Scheduling:

3. Task instantiation: choose an implementation for this task from the Task Map.

4. Scheduling decision(s): adjust task start/end time windows and/or add ordering constraints between tasks to avoid potential resource violations.

Continue recursively

[Figure: SA-POP (Task Network, Task Map, Planning, Scheduling) driven by Mission Goals]
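
The recursive structure of these four decision points can be sketched as a small backtracking search. This is a toy under stated assumptions: step 4 is reduced to a single cumulative CPU budget rather than time windows and ordering constraints, and the `plan` function, its data layout, and the task names are hypothetical.

```python
def plan(open_conditions, achievers, task_map, cpu_budget, chosen=()):
    """Return a tuple of (task, implementation) pairs, or None if none fits.

    achievers: {condition: [(task, [preconditions]), ...]}
    task_map:  {task: [(implementation, cpu_percent), ...]}
    """
    if not open_conditions:
        return chosen
    # 1. Goal/subgoal choice: take the next open condition.
    condition, rest = open_conditions[0], open_conditions[1:]
    # 2. Task choice: try each task that can achieve the condition.
    for task, preconditions in achievers.get(condition, []):
        # 3. Task instantiation: try each implementation from the task map.
        for impl, cpu in task_map[task]:
            # 4. Scheduling decision, simplified to a resource feasibility check.
            if cpu > cpu_budget:
                continue
            result = plan(rest + list(preconditions), achievers, task_map,
                          cpu_budget - cpu, chosen + ((task, impl),))
            if result is not None:
                return result
    return None


achievers = {
    "data_downlinked": [("downlink", ["data_processed"])],
    "data_processed": [("process", ["data_collected"])],
    "data_collected": [("collect", [])],
}
task_map = {
    "downlink": [("Downlink", 20)],
    "process": [("HighRes", 70), ("LowRes", 30)],
    "collect": [("Gizmo", 30)],
}
string = plan(["data_downlinked"], achievers, task_map, cpu_budget=100)
```

Here the search first tries the `HighRes` implementation, finds that no budget remains for data collection, and backtracks to `LowRes`.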

Page 8: Douglas C. Schmidt schmidt@dre.vanderbilt

RACE Research & Development Challenges

Research Challenges

1. Efficiently allocate computing & network resources to application components

2. Avoid over-utilization of system resources – ensure system stability

3. Maintain QoS even in the presence of failure

4. Ensure end-to-end QoS requirements are met – even under high load conditions

Development Challenges

1. Need multiple resource management algorithms depending on application characteristics & current system condition (resource availability)

2. A single resource management mechanism customized for a specific mission goal or set of mission goals might be effective for that specific scenario

3. However, it cannot be reused for other scenarios, which means reinventing the wheel for every scenario

[Figure: Intelligent Mission Planner (SA-POP) passes Operational Strings with Varying Resource & QoS Requirements to the Resource Allocation and Control Engine (RACE), which uses a Uniform Interface to Deploy and Manage Components on a Target Platform with Varying Resource Availabilities and Capabilities]

Page 9: Douglas C. Schmidt schmidt@dre.vanderbilt

RACE Functional Architecture

• Dynamic resource management framework atop CORBA Component Model (CCM) middleware (CIAO/DAnCE)

• Can easily be generalized to other middleware, e.g., DDS

• Allocates components to available resources

• Configures components to satisfy QoS requirements based on dynamic mission goals

• Performs run-time adaptation

• Coarse-grained mechanisms

• React to new missions, drastic changes in mission goals, or unexpected circumstances such as loss of resources

• e.g., component re-allocation or migration

• Fine-grained mechanisms

• Compensate for drift & smaller variations in resource usage

• e.g., adjustment of application parameters, such as QoS settings

[Figure: Intelligent Planner (SA-POP) passes Operational Strings with Varying Resource & QoS Requirements to the Resource Allocation & Control Engine (RACE: Allocation Algorithms, Control Algorithms), which uses a Uniform Interface to Deploy and Manage Components via CIAO/DAnCE Middleware; RACE deploys on the target domain, then monitors and adapts resource allocation to application components on a Target Platform with Varying Resource Availabilities and Capabilities]
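
Allocating components to available resources can be pictured as a bin-packing problem. The sketch below uses first-fit decreasing purely as an illustration; the slides do not say which allocation algorithms RACE ships with, and the `first_fit_decreasing` function and its inputs are hypothetical.

```python
def first_fit_decreasing(components, nodes):
    """Place each component on the first node with enough spare CPU.

    components: {component_name: cpu_demand_percent}
    nodes:      {node_name: cpu_capacity_percent}
    Returns {component_name: node_name}; raises ValueError if one cannot fit.
    """
    free = dict(nodes)  # remaining capacity per node
    placement = {}
    # Largest demands first, so big components are not left stranded.
    for comp, demand in sorted(components.items(), key=lambda kv: -kv[1]):
        for node in free:
            if free[node] >= demand:
                free[node] -= demand
                placement[comp] = node
                break
        else:
            raise ValueError(f"no node can host {comp}")
    return placement


placement = first_fit_decreasing(
    {"collect": 60, "process": 50, "downlink": 40},
    {"node1": 100, "node2": 100},
)
```

In this run `collect` and `downlink` share `node1` while `process` lands on `node2`. A pluggable framework would let an algorithm like this be swapped for another without touching the deployment interface.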

Page 10: Douglas C. Schmidt schmidt@dre.vanderbilt

RACE Software Component Architecture

[Figure: RACE components (Input Adapter, Plan Analyzer, Plan Manager, Allocators, Controllers, Target Manager) between the Application Components and the Middleware Framework; flows include descriptors of assembly/components to be deployed, component information, deployment plan, deploy/modify components, application data, modify application property, domain resources, & system resource utilization]

Input Adapter

• A generic interface that translates application component descriptors to an internal data structure

Plan Analyzer

• Examines input descriptors & selects appropriate allocation algorithms based on application characteristics

• Adds appropriate application QoS & resource monitors

Planner Manager

• Maintains a registry of installed planners (algorithm implementations) along with their overhead & the currently executing sequences of planners that are generated by the Plan Analyzer
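
A registry of planners with their overhead can be sketched as follows. This is a hypothetical illustration of the idea only; the class, method names, and the dictionary-based plan are assumptions and do not reflect RACE's actual interfaces.

```python
class PlannerManager:
    """Hypothetical registry of planners (algorithm implementations)."""

    def __init__(self):
        self._planners = {}  # name -> (callable, overhead in ms)

    def register(self, name, planner, overhead_ms):
        self._planners[name] = (planner, overhead_ms)

    def overhead(self, sequence):
        """Total overhead of running the named planners in sequence."""
        return sum(self._planners[name][1] for name in sequence)

    def execute(self, sequence, plan):
        """Run each planner in order, feeding its output to the next."""
        for name in sequence:
            planner, _ = self._planners[name]
            plan = planner(plan)
        return plan


manager = PlannerManager()
manager.register("allocate", lambda plan: {**plan, "allocated": True}, overhead_ms=5)
manager.register("add_monitors", lambda plan: {**plan, "monitors": ["cpu", "qos"]}, overhead_ms=2)

# A sequence such as a plan analyzer might select for one operational string.
sequence = ["allocate", "add_monitors"]
result = manager.execute(sequence, {"string": "mission-critical"})
```

Keeping the overhead next to each planner lets the caller weigh the cost of a sequence before running it.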

Page 11: Douglas C. Schmidt schmidt@dre.vanderbilt

RACE Software Component Architecture

CIAO/DAnCE Middleware

• A CCM middleware framework atop which components of the operational strings are deployed

Target Manager

• Runtime resource utilization monitor that tracks the utilization of system resources, such as onboard CPU, memory, & network bandwidth

Page 12: Douglas C. Schmidt schmidt@dre.vanderbilt

RACE: Addressing Dynamic Resource Management Challenges

Need for a control framework

• Allocation algorithms allocate resources to components based on current system condition & estimated resource requirements

• No accurate a priori knowledge of input workload & how execution time depends on input workload

• Dynamic changes in system operating modes

Control objectives:

• Ensure end-to-end QoS requirements are met at all times

• Ensure resource utilization is below the set-point to ensure system stability

• RACE Controller: Reallocates resources to meet control objectives

• RACE Control Agents: Map resource reallocation to OS/application-specific parameters

[Figure: RACE Controller with inputs from the Target Manager (CPU Monitors, Resource Utilization) & QoS Manager (Application QoS Monitors, QoS Information), and outputs to the RACE Control Agent, Application Control Agents, & OS Control Agent]

RACE is available with the Component-Integrated ACE ORB (CIAO) (deuce.doc.wustl.edu/Download.html)

Page 13: Douglas C. Schmidt schmidt@dre.vanderbilt

Experimentation Results – Hardware / Software Testbed

• Experiments were performed on the ISISLab testbed at Vanderbilt University (www.dre.vanderbilt.edu/ISISlab)

• Hardware Configuration

• 6 nodes with 2.8 GHz Intel Xeon dual processor, 1 GB physical memory, 1Ghz Ethernet network interface, & 40 GB hard drive

• Software Configuration

• Redhat Fedora Core Release 4 operating system

• ACE+TAO+CIAO Middleware

• Two operational strings (one mission-critical, one best-effort) with 6 components each were deployed on 6 nodes

[Figure: Mission Critical Operational String & Best-Effort Operational String deployed across Node 1 through Node 6]

Page 14: Douglas C. Schmidt schmidt@dre.vanderbilt

Experimentation Results – Performance Analysis

• An end-to-end deadline of 500 ms was specified for the mission-critical operational string

• Mission-critical string was deployed at time T = 0 sec, & best-effort string was deployed at time T = 1800 sec

• Until T = 1800 sec, end-to-end execution time of the mission-critical string is lower than its deadline

• At T = 1800 sec, end-to-end execution time of the mission-critical string rises well above its deadline

• This is due to excessive resource consumption by the best-effort string

• RACE reacts to the increase in execution time by performing adaptive system control: modifying operating system priority, modifying scheduler class, and/or tearing down lower priority operational string(s)

[Figure: plot of Execution Time against the Deadline]

RACE ensures the end-to-end deadline of the mission-critical string is met even under fluctuations in resource availability/demand
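
The reaction described on this slide can be sketched as one step of a control loop. The escalation order (lower priority first, tear down on a repeated violation), the `control_step` function, and the action names are assumptions for illustration; the slides only list the available control actions.

```python
from enum import IntEnum


class Importance(IntEnum):
    MISSION_CRITICAL = 0
    MISSION_SUPPORT = 1
    BEST_EFFORT = 2


def control_step(execution_time_ms, deadline_ms, running, deprioritized):
    """Return an (action, string_name) pair, or None if no action is needed.

    running:       {string_name: Importance}, mutated on tear-down
    deprioritized: set of string names whose priority was already lowered
    """
    if execution_time_ms <= deadline_ms:
        return None
    candidates = [name for name, importance in running.items()
                  if importance != Importance.MISSION_CRITICAL]
    if not candidates:
        return None
    # Pick the least important string still running.
    victim = max(candidates, key=lambda name: running[name])
    if victim not in deprioritized:
        deprioritized.add(victim)
        return ("lower_priority", victim)
    del running[victim]
    deprioritized.discard(victim)
    return ("tear_down", victim)


running = {"mc": Importance.MISSION_CRITICAL, "be": Importance.BEST_EFFORT}
lowered = set()
actions = [control_step(t, 500, running, lowered) for t in (400, 700, 650, 450)]
```

With a 500 ms deadline, the measurements 400, 700, 650, and 450 ms produce no action, a priority reduction, a tear-down, and no action again.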

Page 15: Douglas C. Schmidt schmidt@dre.vanderbilt

Lessons Learned from SA-POP & RACE Integration

[Figure: Mission Scientists supply Mission Goals and Domain Experts supply the Task Network & Task Map to SA-POP (Spreading Activation, Planning + Scheduling); SA-POP sends Operational Strings to RACE (Allocation Algorithms, Control Algorithms) and receives Deployment/Mission Feedback; RACE uses a Uniform Interface to deploy and manage components through the Deployment, Configuration & Control Mechanism and the Component Middleware Infrastructure (CIAO/DAnCE) on the Satellite System; Application Monitors & Resource Monitors return Application Performance Data & Resource Utilization Data]

Flexible

• Pluggable resource allocation & control algorithms in RACE

• SA-POP task network & task map tailored to domain

• Unlimited combinations of goals, goal priorities, & timing requirements in SA-POP

• Shared task map allows substitution of functionally equivalent task implementations by RACE

Scalable

• Separation of concerns between SA-POP & RACE limits search spaces in each

• SA-POP handles cascading planning choices in operational string generation

• SA-POP only considers resource allocation feasibility with coarse-grained (system-wide) resource constraints

• RACE handles resource allocation optimization with fine-grained (individual processing node) resource constraints & dynamic control for fixed operational strings

Page 16: Douglas C. Schmidt schmidt@dre.vanderbilt

Lessons Learned from SA-POP & RACE Integration

Dynamic

• SA-POP task network with spreading activation provides expected utility information for generating robust applications in uncertain environments

• SA-POP replanning achieved efficiently with incrementally updated task network & plan repair as necessary

• RACE control algorithms alleviate need for replanning in many cases

• RACE provides reallocation & redeployment of revised operational strings when replanning is necessary