Air Combat Evolution
Dan “Animal” Javorsek, PhD
Program Manager, DARPA/STO
Demonstrate trusted, scalable, human-level autonomy for air combat
A.C.E.
ACE Proposers Day
17 May 2019
Distribution A: Approved for Public Release, Distribution Unlimited
Future U.S. Combat Success Requires AI Capable UAVs
Can we use existing methods designed for humans to mature autonomy?
“In the future, it is desirable to have each operator control multiple unmanned systems, thus shifting the human’s role from operator towards mission manager.”
Unmanned Systems Roadmap, 2018
Build performance and trust the way we do with humans
Striker Escort
Suppression of Enemy Air Defenses
Point Protection
Traffic Avoidance
Autopilot
Terrain Avoidance
Navigation
Mosaic Warfare
Dogfight
Physics-Based Maneuver Systems
Nonlinear Interactive Systems
[Figure axes: Problem Complexity (lower to higher) vs. Cognitive Workload (lower to higher)]
Dogfight is gateway to nonlinear combat autonomy
Combat autonomy is stuck here!
Technical Challenges
• Need performance from automated tactical decision making
• Must build pilot trust in combat automation
• Scale performance and maintain trust up the stack
• Demonstrate performance on increasingly realistic platforms
[Figure axis: local to global]
will push combat autonomy up the stack
Maneuver
Individual Tactical Behaviors (1v1)
Team Tactical Behaviors (2v1, 2v2)
Multi-aircraft Operational Behaviors
Heterogeneous Multi-aircraft Strategic Behaviors
[Figure axis: increasing nonlinearity]
current automation lives here
ACE will build scalable performance and trust in combat autonomy
AlphaDogfight
AlphaMosaic
darpa.mil
ACE Program Structure
ACE Program Schedule
Timeline: FY 2019 through FY 2023, by quarter (Q1 to Q4)
Task
AFWERX AlphaDogfight Trials – AFRL
Technical Area 1: Increase performance for local behaviors
• Develop & evaluate 1v1 & 2vX dogfight algorithms
• Increase performance via complexity build-up
Technical Area 2 (Human Use): Build trust for local behaviors
• Create Human-Machine Interface
• Implementation of Dual Operational Task (DOT)
Trust Assessments for all phases
Technical Area 3: Scale performance & trust to global behaviors
• Learning transference of TA1 algorithms applied to large force exercise data analytics
• DOT Mission Commander scenario development
Technical Area 4: Full-scale experimentation infrastructure
• Full-scale (FS) aircraft purchase, modification, airworthiness, training, and testing
• Implementation and assessment of TA1 algorithms and TA2 HMI
Experimentation Integration Team (EIT)
• Interface control documentation (ICD) and application programming interface (API) development & maintenance
• Lead ICD/API working groups
• M&S and sub-scale (SS) environment development and performance assessment
[Schedule chart. Phases: Phase 1: M&S; Phase 2: Sub-scale (SS); Phase 3: Full-scale (FS).
Milestone labels: 1v1 Training, 1v1 Comp, 2v1 Training, 2v1 Comp, 2v2 Training, 2v2 Comp, Competition; SS Purchase, SS Modification, SS 1v1 Training, SS 2v1 Training, SS 2v2 Training, SS Competition; FS Purchase, FS Modification, Mod Plan Complete, Test Plan Complete, Airworthiness Complete, Ground Demo, Flight Testing, Flight Demo; Adversary A/C Decision.
Performer labels: TA1 Performer(s); Multiple Performers; This BAA: Multiple Performers; 1 Performer (three rows); Single Experimentation Integration Team.]
ACE Program Structure
…to be released at a later date
TA1: Build combat autonomy for local behaviors
• Challenge: Dogfight represents a new class of games
- Continuous, unbounded, incomplete knowledge
- Adversary can actively conceal/deceive
- High-tempo with simultaneous players
• Insights:
- Contrary to popular belief the Dogfight is manifold bounded
- Established tech base actively addressing these challenges
- Hybridized AI approaches blended with rules-based tree search show strong promise
• Technical Area 1 (TA1) Objectives:
- Develop and demonstrate within visual range (WVR) individual and team control algorithms
- Implementation in M&S, sub-scale unmanned aerial vehicles (UAVs), and full-scale combat representative aircraft
- Success metric: Win Probability (PW)
[Figure: Win Probability (Pw) vs. Consequence-Normalized Crosscheck Ratio (RN) for 1v1, 2v1, and 2v2 engagements across Phase 1: M&S (Sim), Phase 2: Sub-scale (Commercial UAVs), and Phase 3: Full-scale (Combat Aircraft); arrows labeled performance and trust; 50/50 marker.]
TA1 increases performance of dogfight automation in increasingly realistic scenarios
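The TA1 success metric, Win Probability (PW), has to be estimated from a finite number of engagements. The following is a minimal sketch, assuming PW is simply the fraction of engagements won. The Wilson score interval is a standard binomial interval chosen here for illustration; the slides do not specify an estimator. The function name and the sample tally are hypothetical.

```python
import math


def win_probability(wins: int, engagements: int, z: float = 1.96):
    """Return (estimate, lower, upper) for the win probability.

    Assumption: PW = wins / engagements. The Wilson score interval is an
    illustrative choice; z = 1.96 corresponds to roughly 95% confidence.
    """
    if engagements <= 0 or not 0 <= wins <= engagements:
        raise ValueError("need engagements > 0 and 0 <= wins <= engagements")
    p = wins / engagements
    z2 = z * z
    denom = 1.0 + z2 / engagements
    center = (p + z2 / (2.0 * engagements)) / denom
    half = z * math.sqrt(
        p * (1.0 - p) / engagements + z2 / (4.0 * engagements ** 2)
    ) / denom
    return p, center - half, center + half


# Hypothetical tally: 80 wins in 100 simulated engagements.
estimate, lower, upper = win_probability(80, 100)
print(f"PW = {estimate:.2f}, interval [{lower:.3f}, {upper:.3f}]")
```

Comparing the lower bound, rather than the point estimate, against a required win probability is the more conservative reading when few engagements have been flown.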
Game Complexity (state-space complexity): Tic-tac-toe 10^3, Checkers 10^20, Chess 10^47, Go 10^170, Starcraft 10^270
Information Observability (perfect vs imperfect info): Board Games, Atari, Starcraft, Poker
Tempo (actions per minute): Sequential Games, Driving or Atari, Starcraft; scale values 0.5, 60, 300
State of the art AI exceeds requirement for ACE in many dimensions
Not solicited in this BAA…planned for Oct 2019
Sim graphic source: Ernest et al., J Def Manag 2016
DARPA-AFWERX-AFRL Collaboration
• Attract non-traditional DARPA performers
- AI video gaming world
- Utilize the AFWERX Other Transactional Consortium and AFRL’s Autonomy Research Collaboration Network (ARCNet)
• AFWERX AlphaDogfight Trials (solicitation June 2019)
- Modeled after StarCraft 2, Defense of the Ancients 2, Quake III bot ladders
- https://www.afwerxchallenge.com/
- Prove algorithms against game adversary and each other
© 2019 BLIZZARD ENTERTAINMENT
© 2019 FlightGear
© 2019 Digital Combat Simulator
© 2019 Falcon 40
ACE Program Structure
TA2: Build trust for local behaviors
• Challenge: Modeling pilot trust
- Trust is a subjective relational experience
- Trust depends on performance, situation, & consequences
• Insights:
- Crosscheck ratio is one reflection of pilot trust if given an appropriate dual operational task paradigm
- Crosscheck ratio can be measured using commercial eyetrackers
• Technical Area 2 (TA2) Objectives:
- Develop experimental methodology for modeling and measuring pilot trust in the dogfight combat autonomy
- Design and develop Human-Machine Interfaces (HMIs)
- Model and measure pilot trust using a Dual Operational Task (DOT) implementation
- Provide plan for Institutional Review Board (IRB) approval for all Human Subjects Research (HSR)
- Success metrics: Crosscheck Ratio (RN), Trust Calibration Error (e)
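The slides describe the crosscheck ratio only as an "Unmonitored/Monitored Timeshare" measurable with commercial eyetrackers. The sketch below assumes one possible reading: the share of total gaze time spent away from the dogfight display. That reading fits metric values between 0 and 1, but it is an assumption and not the program's definition. The region labels, function name, and sample fixations are hypothetical, and the consequence normalization behind RN is not modeled.

```python
from typing import Iterable, Tuple


def unmonitored_timeshare(
    fixations: Iterable[Tuple[str, float]],
    monitored_region: str = "dogfight",
) -> float:
    """Share of gaze time spent off the monitored (dogfight) display.

    `fixations` is a sequence of (region_label, duration_seconds) pairs,
    as might be exported from an eye tracker. Labels are hypothetical.
    """
    total = 0.0
    monitored = 0.0
    for region, duration in fixations:
        if duration < 0:
            raise ValueError("fixation durations must be non-negative")
        total += duration
        if region == monitored_region:
            monitored += duration
    if total == 0:
        raise ValueError("no gaze time recorded")
    return (total - monitored) / total


# Hypothetical 10-second window: 2 s checking the dogfight display,
# 8 s on the mission commander task.
sample = [("mission", 6.0), ("dogfight", 2.0), ("mission", 2.0)]
print(f"crosscheck share = {unmonitored_timeshare(sample):.2f}")
```

Under this reading, a pilot who rarely checks the automation scores close to 1, and one who watches it constantly scores close to 0.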
[Figure: Win Probability (Pw) vs. Consequence-Normalized Crosscheck Ratio (RN) for 1v1, 2v1, and 2v2 engagements across Phase 1: M&S (Sim), Phase 2: Sub-scale (Commercial UAVs), and Phase 3: Full-scale (Combat Aircraft); arrows labeled performance and trust; 50/50 marker.]
[Figure: Calibrate pilot trust. Win Probability (Pw) vs. Crosscheck Ratio (R) (Unmonitored/Monitored Timeshare), showing the Trust Calibration Curve from M&S and the Trust Calibration Error (e); adversary levels: Unaware (Cruise Missile), Limited (Bomber), Unlimited (Fighter).]
[Figure: Measure workload distribution. Divide the pilot’s attention between the Mission Commander Task and the Dogfight Task.]
TA2 increases trust in dogfight automation in increasingly realistic scenarios
Sim graphic source: Ernest et al., J Def Manag 2016
darpa.mil
Source: USAF
Duchowski, A. T. (2018)
©2019 Designtechnica Corporation
Cross TA interactions featuring TA2
ACE Program Structure
TA3 scales performance and trust to global behaviors in simulation
TA3: Scale performance & trust to global behaviors
• Challenge: Extending learning to new scales without developing independent algorithms at each level
- Tailored algorithms for each scale can produce new behaviors
- Aircraft capabilities (weapons, sensors, performance) vary widely and must be incorporated
- Algorithm retraining is necessary when new information is introduced
• Insights:
- STO seedling data suggest that algorithms can be quickly and consistently adapted from one scale to another
- Implementation of machine learning transference neural network
• Technical Area 3 (TA3) Objectives:
- Develop data set and model for large force exercise data analytics
- Develop Dual Operational Task Mission Commander scenarios
- Scale local combat autonomy to, and develop battle management for, large force exercise data analytics
- Quantify relationship between local behavior and global behavior performance metrics
- Success metric: Kill Ratio (RK)
[Figure: Kill Ratio (RK) vs. Consequence-Normalized Crosscheck Ratio (RN) in simulation; arrows labeled performance and trust.]
SIA: Semi-intelligent Autonomy
Sim graphic source: Ernest et al., J Def Manag 2016
Analytical Models Can Aid in Identifying Tactics Otherwise Unthinkable*
This is a crazy idea, right?
29 Aug 2014 – Dodgers (Mattingly) employ four-man shift, Padres hitter (Smith) grounds out
*limited by multiple factors: creativity, complexity, convention, training
Standard Infield Deployment: configuration generally deployed for 100+ yrs
Source: FanGraphs
1223% increase in shifts since 2011 – Source: 538.com
Not completely analytically derived – used against Ted Williams in the ‘40s
David Ortiz, one of the best hitters in baseball, becomes below average against shift
Analytically derived infield shift deployment against David Ortiz
Ortiz BABIP w/out shift (bottom left): 0.341
Ortiz BABIP w/ shift (top right): 0.284
Runs saved w/ shift: 11**
BABIP = Batting Average on Balls in Play; average BABIP ~ 0.300
**Baseball analysts generally equate 10 runs equal to a win
Currently lack real world data set to even analyze, develop mosaic tactics, strategies beyond M&S
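The Ortiz figures on this slide can be checked with simple arithmetic. The sketch below uses only the numbers quoted on the slide (the two BABIP values, the league-average BABIP of about 0.300, 11 runs saved, and the 10-runs-per-win rule of thumb); the variable names are illustrative.

```python
# Values quoted on the slide.
RUNS_PER_WIN = 10.0          # "10 runs equal to a win" rule of thumb
LEAGUE_AVERAGE_BABIP = 0.300
babip_without_shift = 0.341
babip_with_shift = 0.284
runs_saved = 11

# Absolute and relative drop in BABIP when the shift is deployed.
babip_drop = babip_without_shift - babip_with_shift
relative_drop = babip_drop / babip_without_shift

# Runs saved converted to wins under the slide's rule of thumb.
wins_equivalent = runs_saved / RUNS_PER_WIN

print(f"BABIP drop: {babip_drop:.3f} ({relative_drop:.1%} of the unshifted value)")
print(f"Below league average with shift: {babip_with_shift < LEAGUE_AVERAGE_BABIP}")
print(f"Wins equivalent: {wins_equivalent:.1f}")
```

The shift moves Ortiz from above the league-average BABIP to below it, which is the basis for the slide's "becomes below average" claim.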
Explosion of tracking data made it possible to apply machine learning to build increasingly fine-grained models of player and team behavior
Data-driven “Ghosting” allows for scalable quantification, analysis, and comparison of player and team behavior
ACE will apply machine learning to large training exercises to experiment, explore mosaic tactics in the real world
Red Flag: two-week advanced aerial combat training exercise held several times a year by the United States Air Force.
AlphaMosaic
source: Le, Carr, Yue, Lucey; Data-Driven Ghosting using Deep Imitation Learning 2017
source: af.mil
Cross TA interactions featuring TA3
ACE Program Structure: TA4
TA4: Full-scale Air Combat Experimentation Infrastructure
• Aircraft modification background:
- DARPA Controlled Safety Review Process
- Aircraft capable of dogfight maneuvers
o Existing autopilots capable of 3D maneuvers
o Architectures capable of real-time insertion of data streams into the functioning operational system
o Two seats (safety pilot + evaluation pilot)
• Objectives:
- Supply full-scale aircraft and integrate dogfighting algorithms
- Develop and integrate HMIs for full-scale aircraft
- Retain safety pilot override controls and/or autopilot disconnect for trust assessments
- Perform all safety/airworthiness reviews for supervised live dogfight engagements
- Execute full-scale live flight experiments
[Figure: Win Probability (Pw) vs. Consequence-Normalized Crosscheck Ratio (RN) for 1v1, 2v1, and 2v2 engagements across Phase 1: M&S (Sim), Phase 2: Sub-scale (Commercial UAVs), and Phase 3: Full-scale (Combat Aircraft); arrows labeled performance and trust; 50/50 marker.]
Sim graphic source: Ernest et al., J Def Manag 2016
Cross TA interactions featuring TA4
Cross TA interactions, all TAs
Metrics: Build trust in AI the same way we do with pilots
Phase 1: M&S; Phase 2: Subscale; Phase 3: Full-scale
TA1: Increase performance for local behaviors
Phase 1: Win Probability (PW): Limited Th: 50%, Ob: 100%. For 2D: unopposed, unaware, O/D/HA-limited, O/D/HA-unlimited; 3D: repeat all
Phase 2: Win Probability (PW): Limited Th: 75%, Ob: 100%. For 2D: unopposed, unaware, O/D/HA-limited, O/D/HA-unlimited; 3D: repeat all
Phase 3: Win Probability (PW): Limited 1v1 Th: 75%, Limited 2v1 Th: 90%, Limited 2v2 Th: 80%, Ob: 100%
TA2: Build trust for local behaviors
Phase 1: Crosscheck Ratio (R): Th: 0.50, Ob: 0.95. Trust Calibration Error (e): N/A
Phase 2: Crosscheck Ratio (R): Th: 0.75, Ob: 0.95. Trust Calibration Error (e): Th: 0.20, Ob: 0.0
Phase 3: Crosscheck Ratio (R): Th: 0.80, Ob: 0.95. Trust Calibration Error (e): Th: 0.10, Ob: 0.0
TA3: Scale performance & trust to global behaviors (Phases 1 to 3 in M&S)
Phase 1: Mission Commander Scenarios: Th: 3, Ob: 5
Phase 2: Kill Ratio (RK): Th: 10:1, Ob: 50:1
Phase 3: Kill Ratio (RK): Th: 30:1, Ob: 50:1
Th: Threshold
Ob: Objective
O/D/HA: Offensive/Defensive/High Aspect Initial Conditions
Unopposed: Pre-planned maneuvers
Unaware: Station keeping on unaware adversary
Limited: Baseline adversary with standard gameplan, limited maneuver potential, and thrust
Unlimited: Adversary with no gameplan, maneuver potential, or thrust restrictions
Ernest et al., J Def Manag 2016
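The threshold (Th) and objective (Ob) values on this slide can be read as a simple grading rule. The sketch below encodes the TA1 win probability values from the slide and grades a measured value against them. The function name, the three-level grading labels, and the measured value are hypothetical; the slides do not say how borderline results are scored.

```python
def grade(value, threshold, objective, higher_is_better=True):
    """Grade a measured metric against its threshold and objective.

    For metrics where lower is better (such as Trust Calibration Error),
    pass higher_is_better=False.
    """
    if not higher_is_better:
        # Flip signs so the same comparisons work in both directions.
        value, threshold, objective = -value, -threshold, -objective
    if value >= objective:
        return "meets objective"
    if value >= threshold:
        return "meets threshold"
    return "below threshold"


# TA1 Win Probability (PW) values from the slide: (threshold, objective).
TA1_WIN_PROBABILITY = {
    "Phase 1 (Limited)": (0.50, 1.00),
    "Phase 2 (Limited)": (0.75, 1.00),
    "Phase 3 (Limited 1v1)": (0.75, 1.00),
    "Phase 3 (Limited 2v1)": (0.90, 1.00),
    "Phase 3 (Limited 2v2)": (0.80, 1.00),
}

measured_pw = 0.82  # hypothetical measurement
for case, (th, ob) in TA1_WIN_PROBABILITY.items():
    print(f"{case}: {grade(measured_pw, th, ob)}")

# Trust Calibration Error (e), Phase 2: Th 0.20, Ob 0.0, lower is better.
print(grade(0.15, 0.20, 0.0, higher_is_better=False))
```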
Source Selection and Evaluation Criteria
• Overall Scientific and Technical Merit
- Standard DARPA BAA language
- Technical Area 2 proposals should:
• Emphasize Dogfight task HMI which is integral to trust assessment (HMI can affect trust independent of algorithm performance)
• Develop a Mission commander task HMI that is representative enough to perform dual operational task evaluations
- Technical Area 3 proposals should:
• Provide a detailed model architecture, data analytics plan
• Include an implementation plan for AFSIM and NGTS with relative merits for each or reason for proprietary environment
- Technical Area 4 proposals should:
• Consider different cost options (lease vs buy) with price per flight hour (including operations and maintenance) and operational tempo limitations (flights per week per aircraft)
• Consider alternate platforms and human-only adversary options to enable cost and schedule flexibility
• Include recommended partnership information (POC, availability, etc) if considering government furnished operational aircraft
• Potential Contribution and Relevance to the DARPA Mission
- Standard DARPA BAA language
• Cost and Schedule Realism
- Standard DARPA BAA language
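The TA4 guidance above asks proposers to compare lease and buy options by price per flight hour, including operations and maintenance, under an operational tempo limit. The following sketch shows one way to set up that comparison. Every number is a hypothetical placeholder, and the cost model (a fixed cost spread over the achievable flight hours plus an hourly O&M rate) is an assumption, not a DARPA-specified method.

```python
def cost_per_flight_hour(
    fixed_cost: float,
    om_cost_per_hour: float,
    flights_per_week: float,
    hours_per_flight: float,
    weeks: float,
) -> float:
    """Price per flight hour for one aircraft over a test campaign.

    The operational tempo (flights per week) caps the flight hours over
    which the fixed cost (purchase price or total lease fees) is spread.
    """
    flight_hours = flights_per_week * hours_per_flight * weeks
    if flight_hours <= 0:
        raise ValueError("campaign must include some flight hours")
    return fixed_cost / flight_hours + om_cost_per_hour


# Hypothetical campaign: 4 flights/week, 1.0 h each, 50 weeks = 200 h.
buy = cost_per_flight_hour(2_000_000, 3_000, 4, 1.0, 50)
lease = cost_per_flight_hour(500_000, 4_000, 4, 1.0, 50)
print(f"buy:   ${buy:,.0f} per flight hour")
print(f"lease: ${lease:,.0f} per flight hour")
```

Because the fixed cost is divided by achievable flight hours, a tighter tempo limit raises the per-hour price of the buy option faster than that of the lease option in this model.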
Submission Highlights
• Teaming encouraged
• Highlight previous experience
• Schedule
- Proposers Day – 17 May 2019
- BAA released – May 2019
- Optional 1-on-1s – 05 June, 07 June 2019
- Email: [email protected] by 29 May 2019 to request
- FAQ/Questions Due Date – 07 June 2019
- Full Proposals Due – BAA release + 45 days