GaTAC: A Scalable and Realistic Testbed for Multiagent Decision Making Ekhlas Sonu, Prashant Doshi Dept. of Computer Science University of Georgia Athens, GA, 30602, USA [sonu,pdoshi]@cs.uga.edu Demo for AAMAS-2012


GaTAC: A Scalable and Realistic Testbed for Multiagent Decision Making

Ekhlas Sonu, Prashant Doshi

Dept. of Computer Science, University of Georgia

Athens, GA, 30602, USA

[sonu,pdoshi]@cs.uga.edu

Demo for AAMAS-2012

Objective

To design and implement a realistic testbed for evaluating the performance of decision making algorithms in a problem domain that is:

Relevant in cooperative, competitive and mixed settings

i.e. across different frameworks such as Dec-POMDP, I-POMDP, etc.

Scalable in problem size: number of physical states

Flexible in agent capabilities: number and type of actions and observations

Extensible in number of agents and adaptable to agent types

Motivation

Recently, there has been substantial development in multi-agent decision making algorithms, which has driven researchers to go beyond traditional toy problem domains such as the Tiger Problem, the Machine Maintenance Problem, Grid Meeting, etc.

Some larger problem domains include Cooperative Box-Pushing, Mars Rover, etc.:

Applied in cooperative settings

A Desirable Problem Domain

A desirable problem domain for multi-agent decision making must be:

Scalable in physical states

Flexible in agent capabilities: actions & observations

Extensible in number of agents

Relevant to cooperative, competitive and mixed settings

Able to produce solutions rich in structure

Realistic with a popular appeal

Proposed Scenario: Autonomous Unmanned Aerial Vehicles

Applications:

Law enforcement [Murphy, Cycon; 1998]

Fighting forest fires [Casbeer et al.; 2005]

Border surveillance [Haddal, Gertler; 2010]

Wartime reconnaissance

Uncertainty in AUAVs is due to:

Uncertainty about the physical state

Noisy actuators and sensors

Added complexity: presence of other agents, which may be cooperative or competitive

Related research focuses on formulating flight trajectories [R. Bernard et al., 2002, 2003; S.M. Li et al., 2002]

An example decision making scenario with AUAVs

We propose a problem domain involving Autonomous Uninhabited Aerial Vehicles

The operating theatre may be divided into sectors (as is common practice) and represented as a grid of a predetermined size

An example decision making scenario with AUAVs

An example UAV reconnaissance problem may involve a UAV (I) (or a team of UAVs) trying to apprehend a target (T) (or a team of moving targets) while another team of UAVs (J) tries to help the target(s) escape to a safe house

Of course, the exact problem description is flexible

[Figure: the example scenario; S.H. marks the safe house]

GaTAC: Overview

Georgia Testbed for Autonomous Control of vehicles (GaTAC): a computer simulation framework for evaluating solutions to a UAV reconnaissance problem. It provides:

Hyperrealistic 3D rendering of AUAVs acting in a real-world scenario

Scalability in problem size and number of agents

Flexibility in designing the actions and observations of each agent

Input:

Agent control functions (policies) for all agents, generated by any (multi-agent) decision making algorithm

Output:

Simulation of the policies on a flight simulator

Results of simulations may be compared for policies generated by different algorithms using metrics such as number of captures, cumulative reward, etc.

We begin with a formal description for any UAV decision making problem

Formulate the problem as a .dpomdp/.ipomdp file

Configure GaTAC for simulation (i.e., set up the environment)

Solve the .dpomdp/.ipomdp file using the algorithm of choice to obtain policies

Feed the policies for each agent to GaTAC to be simulated and evaluated

Simulate the policies and evaluate the results using metrics such as number of successes, cumulative reward, etc.
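The evaluation step can be illustrated with a minimal Python sketch. It assumes each simulated run is logged as a list of per-step rewards plus a capture flag; the `Episode` record, the `summarize` helper, and the sample logs are hypothetical and are not part of GaTAC.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Episode:
    """One simulated run of a joint policy (hypothetical log format)."""
    rewards: List[float]  # reward received at each time step
    captured: bool        # True if the target was apprehended


def summarize(episodes: List[Episode]) -> Dict[str, float]:
    """Compute number of captures and mean cumulative reward for a batch of runs."""
    captures = sum(1 for ep in episodes if ep.captured)
    total_reward = sum(sum(ep.rewards) for ep in episodes)
    mean_reward = total_reward / len(episodes) if episodes else 0.0
    return {"captures": captures, "mean_cumulative_reward": mean_reward}


# Made-up logs for two algorithms, purely for illustration.
logs = {
    "algorithm_A": [Episode([-1, -1, 10], True), Episode([-1, -1, -1], False)],
    "algorithm_B": [Episode([-1, 10], True), Episode([-1, -1, 10], True)],
}
results = {name: summarize(eps) for name, eps in logs.items()}
```

Because the summary depends only on the logged runs, policies produced under different frameworks can be compared on the same scale.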

GaTAC Components

Each instance of GaTAC has three components:

Flight Simulator: off-the-shelf open-source flight simulator on which policies are simulated; one instance of the flight simulator for each agent

Autonomous Control Module (ACM): controls each aircraft and makes it behave according to the policy on the flight simulator

Communication Module: sends aircraft behavior from the ACM to the flight simulator and communicates with other agents (if required)

GaTAC instances may run on different machines, connected using the communication module

[Figure: a GaTAC instance, showing the Flight Simulator, Communication Module and Autonomous Control Module, with communication between agents]

Flight Simulator

FlightGear:

Open-source (written in C++)

Multi-platform

Hyperrealistic 3D graphics

3D virtual map

Flexible, with choices of:

Multiple aircraft models

Locations to act as the operating environment

Weather conditions, time of day, etc.

6 DoF flight dynamics model: simulates the effects of airflow on different parts of the aircraft

FlightGear in Operating Scenario

FG utilizes realistic 3D scenery available from TerraGear

Provides multiple views of the flying aircraft: cockpit view, tail view, etc.

Multiple instances of FG may be linked together through external servers, which is ideal for multi-agent settings

Autonomous Control Module

Used to algorithmically control the aircraft and make it behave according to the policy, with 3 levels of hierarchy:

Low Level Actions: control rudder, throttle, aileron, roll, pitch, etc. These control the aircraft by adjusting parameters along the 6 DoF

High Level Actions: Takeoff, Fly-Straight, Turn, Change Altitude. These perform simple tasks that represent simple aircraft behaviors

Agent Actions on Grid: actions constructed from high level actions to represent the actions of agents in the problem at hand
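The three-level hierarchy can be sketched in Python as two lookup tables and an expansion function. This is only an illustration under stated assumptions: the action names, the mapping from grid moves to maneuvers (which assumes the aircraft currently heads north), and all control values are hypothetical placeholders, not GaTAC's actual tables.

```python
from typing import Dict, List

ControlSetting = Dict[str, float]

# Level 2 -> Level 1: each high-level action expands to a sequence of
# low-level control settings. Values are placeholders, not tuned parameters.
HIGH_TO_LOW: Dict[str, List[ControlSetting]] = {
    "fly_straight": [{"aileron": 0.0, "rudder": 0.0, "throttle": 0.7}],
    "turn_left": [
        {"aileron": -0.3, "rudder": -0.1, "throttle": 0.7},  # bank left
        {"aileron": 0.0, "rudder": 0.0, "throttle": 0.7},    # level out
    ],
    "turn_right": [
        {"aileron": 0.3, "rudder": 0.1, "throttle": 0.7},    # bank right
        {"aileron": 0.0, "rudder": 0.0, "throttle": 0.7},    # level out
    ],
}

# Level 3 -> Level 2: each agent action on the grid is built from
# high-level actions (assumes the aircraft is currently heading north).
AGENT_TO_HIGH: Dict[str, List[str]] = {
    "move_north": ["fly_straight"],
    "move_west": ["turn_left", "fly_straight"],
    "move_east": ["turn_right", "fly_straight"],
}


def expand(agent_action: str) -> List[ControlSetting]:
    """Break an agent action into high-level, then low-level, actions."""
    settings: List[ControlSetting] = []
    for high_level in AGENT_TO_HIGH[agent_action]:
        settings.extend(HIGH_TO_LOW[high_level])
    return settings


west_controls = expand("move_west")
```

Keeping the two mappings separate means a new problem only needs a new table of agent actions; the high-level and low-level layers can be reused.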

Communication Module

Establishes a communication channel between:

The Autonomous Control Module and FlightGear

Each agent (if required, e.g. in team settings)

Communication channels use UDP, httpd and XML

Communicates low-level flight control data from an instance of the autonomous control module to the respective instance of FlightGear

Communicates aircraft position to all other instances of GaTAC in real time (used to formulate observations)

Communication Module Functions

Send control data from ACM to FG: may adjust flight parameters such as thrust, rudder, aileron, altitude, etc.

Receive the aircraft’s flight dynamics in real time from FG and send them to ACM for path correction: position, aircraft orientation on 6 DoF, flight speed, altitude, etc.

May be used to pass messages between GaTAC instances (when communication between agents is required)
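As a rough illustration of the two directions of traffic, the Python sketch below encodes control values into a UDP datagram and decodes a state report. The wire format (a comma-separated, newline-terminated ASCII line), the field order, the host and the port are all assumptions made for this example; the slides do not specify the actual message layout used by GaTAC.

```python
import socket
from typing import Dict, Tuple

# Hypothetical field orders; the real layout is not given in the slides.
CONTROL_FIELDS: Tuple[str, ...] = ("aileron", "rudder", "throttle")
STATE_FIELDS: Tuple[str, ...] = ("latitude", "longitude", "altitude", "heading", "speed")


def encode_controls(controls: Dict[str, float]) -> bytes:
    """Serialize control values as one comma-separated ASCII line."""
    values = [f"{controls[name]:.4f}" for name in CONTROL_FIELDS]
    return (",".join(values) + "\n").encode("ascii")


def decode_state(packet: bytes) -> Dict[str, float]:
    """Parse one state line received from the simulator."""
    parts = packet.decode("ascii").strip().split(",")
    if len(parts) != len(STATE_FIELDS):
        raise ValueError(f"expected {len(STATE_FIELDS)} fields, got {len(parts)}")
    return dict(zip(STATE_FIELDS, map(float, parts)))


def send_controls(controls: Dict[str, float],
                  host: str = "127.0.0.1", port: int = 5010) -> None:
    """Send one control datagram over UDP (host and port are placeholders)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(encode_controls(controls), (host, port))


packet = encode_controls({"aileron": 0.1, "rudder": -0.05, "throttle": 0.7})
state = decode_state(b"33.95,-83.37,1200.0,90.0,110.0\n")
```

UDP suits this kind of traffic because a lost control update is quickly superseded by the next one, so retransmission is not needed.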

GaTAC Control Algorithm

Read policy from file

Start FlightGear

Fly according to policy: obtain the action to perform from the policy. The agent action is systematically broken down into high-level and then low-level actions to control the aircraft algorithmically

Get observation

Observation = Successful?

Yes: Mission accomplished

No: Get next action. The next action may be obtained from the policy using the observation

Repeat until the termination condition is reached
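The control loop can be sketched in Python as follows. The sketch assumes the policy is stored as a finite-state controller (each node holds an action and a map from observations to the next node) and replaces the flight simulator with a caller-supplied `execute` function; reading the policy from a file and starting FlightGear are omitted. The representation and all names are hypothetical, not GaTAC's own.

```python
from typing import Callable, Dict, List, Tuple

# node id -> (action to perform, {observation: next node id})
Policy = Dict[int, Tuple[str, Dict[str, int]]]


def run_policy(policy: Policy,
               execute: Callable[[str], str],
               start: int = 0,
               success_obs: str = "target_captured",
               max_steps: int = 100) -> List[str]:
    """Follow the policy until the success observation or the step limit."""
    node = start
    performed: List[str] = []
    for _ in range(max_steps):
        action, transitions = policy[node]
        observation = execute(action)      # fly the action, then observe
        performed.append(action)
        if observation == success_obs:     # mission accomplished
            break
        node = transitions[observation]    # observation selects the next action
    return performed


# Tiny two-node policy and a scripted stand-in for the simulator.
policy: Policy = {
    0: ("move_north", {"target_north": 0, "target_east": 1}),
    1: ("move_east", {"target_north": 0, "target_east": 1}),
}
scripted = iter(["target_east", "target_captured"])
trace = run_policy(policy, lambda action: next(scripted))
```

In a full system, `execute` would break the agent action down into high-level and low-level actions, send them to the simulator, and return the resulting observation.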

Conclusions

GaTAC:

Can act as an open-source testbed for decision-theoretic agents

May be used to compare different algorithms irrespective of decision making framework (Dec-POMDP, I-POMDP, MTDP, etc.)

Is extensible: no upper bound on the size of the problem (no. of physical states, no. of agents, no. and types of actions & observations)

Facilitates deployment of decision-theoretic agents in hyper-realistic real-world settings (cooperative, competitive, or mixed)

Is easily configurable for simulating any UAV problem

Provides for communication between agents

May be extended to include a choice of locations and aircraft

Demo