Edge Cases and Autonomous Vehicle Safety
SSS 2019, Bristol, 7 February 2019
Prof. Philip Koopman (@PhilKoopman)
© 2019 Philip Koopman


Page 1:

Edge Cases and Autonomous Vehicle Safety

SSS 2019, Bristol, 7 February 2019

© 2019 Philip Koopman

Prof. Philip Koopman

@PhilKoopman

Page 2:

Overview
• Making safe robots: Doer/Checker safety
• Edge cases matter: robust perception matters
• The heavy tail distribution: fixing stuff you see in testing isn’t enough
• Perception stress testing: finding the weaknesses in perception
• UL 4600: autonomy safety standard

[General Motors]

Page 3:

98% Solved For 20+ Years
• Washington DC to San Diego: CMU Navlab 5, Dean Pomerleau & Todd Jochem, July 1995
  https://www.cs.cmu.edu/~tjochem/nhaa/nhaa_home_page.html
• AHS San Diego demo, Aug 1997

Page 4:

NREC: 30+ Years Of Cool Robots
• Carnegie Mellon University faculty, staff, and students; off-campus Robotics Institute facility
• [Timeline, 1985–2010: DARPA SC-ALV, ARPA Demo II, NASA Dante II, NASA Lunar Rover, Auto Excavator, Auto Harvesting, Laser Paint Removal, DARPA PerceptOR, DARPA Grand Challenge, DARPA LAGR, DARPA UPI, Mars Rovers, Urban Challenge, Auto Forklift, Auto Haulage, Auto Spraying, Army FCS]
• Software Safety

Page 5:

Before Autonomy Software Safety
• The Big Red Button era

Page 6:

Traditional Validation Meets Machine Learning
• Use traditional software safety where you can ...BUT...
• Machine Learning (inductive training):
  – No requirements: training data is difficult to validate
  – No design insight: generally inscrutable; prone to gaming and brittleness

Page 7:

APD (Autonomous Platform Demonstrator)
• Target GVW: 8,500 kg; target speed: 80 km/hr
• Safety-critical speed limit enforcement
Approved for Public Release. TACOM Case #20247, Date: 07 OCT 2009

Page 8:

Safety Envelope Approach to ML Deployment
• Specify unsafe regions
• Specify safe regions; under-approximate to simplify
• Trigger system safety response upon transition to unsafe region
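A minimal sketch of the run-time check this implies, with vehicle speed and distance to the nearest obstacle as the monitored state; the conservative braking figure and margin are placeholder numbers chosen for illustration, not values from the talk:

```python
def inside_safe_envelope(speed_mps, gap_m, max_brake_mps2=3.0, margin_m=5.0):
    """Conservative (under-approximated) safe region: the vehicle must be able
    to stop within the available gap, assuming weaker-than-actual braking
    plus a fixed margin."""
    stopping_distance = speed_mps ** 2 / (2 * max_brake_mps2)
    return stopping_distance + margin_m <= gap_m

def monitor_step(speed_mps, gap_m, trigger_safety_response):
    """Trigger the system safety response on transition into the unsafe region."""
    if not inside_safe_envelope(speed_mps, gap_m):
        trigger_safety_response()  # e.g., a commanded safe stop
```

Under-approximating the safe region (weaker assumed braking, extra margin) keeps the check simple enough to verify while guaranteeing it never accepts a state the true analysis would reject.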

Page 9:

Architecting A Safety Envelope System: Doer/Checker Pair
• “Doer” subsystem: implements normal, untrusted functionality (ML; low SIL)
• “Checker” subsystem: traditional software; implements failsafes (safety functions) as a simple safety envelope checker (high SIL)
• Checker is entirely responsible for safety
  – Doer can be at a low Safety Integrity Level
  – Checker must be at a higher SIL
• (Also known as a “safety bag” approach)
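A structural sketch of the Doer/Checker pairing described above (the names are placeholders, and a real system would also partition the checker onto independent, higher-integrity hardware and software):

```python
class DoerChecker:
    """Untrusted 'doer' proposes commands; a simple trusted 'checker' gates them."""

    def __init__(self, ml_planner, envelope_check, failsafe_command):
        self.ml_planner = ml_planner          # complex, low-SIL, ML-based doer
        self.envelope_check = envelope_check  # simple, high-SIL safety envelope test
        self.failsafe_command = failsafe_command

    def step(self, state):
        proposed = self.ml_planner(state)
        # Only the checker decides whether the proposal is safe to pass through.
        if self.envelope_check(state, proposed):
            return proposed
        return self.failsafe_command          # e.g., controlled safe stop
```

The point of the split is that only the envelope check and the failsafe path need high-integrity assurance; the ML-based doer can change without reopening the safety argument.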

Page 10:

Validating an Autonomous Vehicle Pipeline
• [Pipeline diagram: the autonomy interface to the vehicle and the control systems are covered by traditional software validation and control software validation with a Doer/Checker architecture; randomized & heuristic algorithms are covered by run-time safety envelopes with a Doer/Checker architecture; machine-learning-based approaches: ???]
• Perception presents a uniquely difficult assurance challenge

Page 11:

Validation Via Brute Force Road Testing?
• If 100M miles per critical mishap…
• Test 3x–10x longer than the mishap rate
• Need ~1 billion miles of testing, with fewer than 10 critical mishaps
• That’s ~25 round trips on every road in the world
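The mileage figure follows from a one-line calculation; a quick sketch using only the numbers stated on the slide:

```python
miles_per_critical_mishap = 100e6   # "100M miles per critical mishap"
test_multiplier = (3, 10)           # "test 3x-10x longer than the mishap rate"

needed = [m * miles_per_critical_mishap for m in test_multiplier]
print(f"{needed[0]:.0e} to {needed[1]:.0e} test miles")   # 3e+08 to 1e+09
```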

Page 12:

Brute Force AV Validation: Public Road Testing
• Good for identifying “easy” cases
• Expensive and potentially dangerous
http://bit.ly/2toadfa

Page 13:

Closed Course Testing
• Safer, but expensive
• Not scalable
• Only tests things you have thought of!
[Volvo / Motor Trend]

Page 14:

Simulation
• Highly scalable; less expensive, but need to manage fidelity vs. cost
• Only tests things you have thought of!
http://bit.ly/2K5pQCN
Udacity: http://bit.ly/2toFdeT
Apollo

Page 15:

What About Edge Cases?
• You should expect the extreme, weird, and unusual: unusual road obstacles, extreme weather, strange behaviors
• Edge cases are surprises; you won’t see these in testing
• Edge cases are the stuff you didn’t think of!
https://www.clarifai.com/demo
http://bit.ly/2In4rzj

Page 16:

Just A Few Edge Cases
• Unusual road obstacles
• Extreme weather
• Strange behaviors
http://bit.ly/2top1KD
http://bit.ly/2tvCCPK
https://dailym.ai/2K7kNS8
https://en.wikipedia.org/wiki/Magic_Roundabout_(Swindon)
https://goo.gl/J3SSyu

Page 17:

Why Edge Cases Matter
• Where will you be after 1 billion miles of validation testing?
• Assume 1 million miles between unsafe “surprises”
• Example #1: 100 “surprises” @ 100M miles / surprise
  – All surprises seen about 10 times during testing
  – With luck, all bugs are fixed
• Example #2: 100,000 “surprises” @ 100B miles / surprise
  – Only 1% of surprises seen during 1B-mile testing
  – Bug fixes give no real improvement (1.01M miles / surprise)
https://goo.gl/3dzguf
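The Example #2 numbers can be reproduced with a short calculation; this is a minimal sketch that assumes independent (Poisson) surprise arrivals, which is my modeling assumption rather than something stated on the slide:

```python
import math

n_types = 100_000            # distinct surprise types (Example #2)
miles_per_type = 100e9       # each type occurs once per 100B miles
test_miles = 1e9             # total validation driving

# Aggregate rate: 100,000 / 100e9 = 1e-6 per mile -> 1M miles between surprises.
miles_between_surprises = 1 / (n_types / miles_per_type)

# Expected fraction of surprise types seen at least once during testing.
p_seen = 1 - math.exp(-test_miles / miles_per_type)   # ~0.01, i.e. about 1%
types_seen = n_types * p_seen

# Fix every type that was seen; the remaining types still cause surprises.
remaining_rate = (n_types - types_seen) / miles_per_type

print(miles_between_surprises)   # 1,000,000 miles between surprises before fixes
print(p_seen)                    # ~1% of surprise types observed in 1B miles
print(1 / remaining_rate)        # ~1.01e6 miles per surprise after the fixes
```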

Page 18:

The Real World: Heavy Tail Distribution(?)
[Figure: a heavy-tail distribution of scenarios, with common things seen in testing at the head and edge cases not seen in testing in the tail]

Page 19:

The Heavy Tail Testing Ceiling
• Need to find “Triggering Events” to inject into sims/testing
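To make the “testing ceiling” concrete, here is an illustrative sketch; the Zipf-like scenario distribution and the sample counts are assumptions chosen only to show the effect, not data from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative heavy-tailed "scenario" distribution: Zipf-like frequencies
# over 1,000,000 scenario types (assumption for illustration only).
n_types = 1_000_000
weights = 1.0 / np.arange(1, n_types + 1)
probs = weights / weights.sum()

# Simulate a large test campaign and record which scenario types were observed.
samples = rng.choice(n_types, size=5_000_000, p=probs)
seen = np.unique(samples)

unseen_mass = 1.0 - probs[seen].sum()
print(f"distinct scenario types seen: {len(seen)} of {n_types}")
print(f"probability mass still in the unseen tail: {unseen_mass:.2%}")
```

Even after millions of simulated tests, a meaningful share of the probability mass sits in scenario types that were never observed, which is why injected triggering events are needed rather than more of the same driving.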

Page 20:

Edge Cases Part 1: Triggering Event Zoo
• Need to collect surprises: novel objects, novel operational conditions
• Corner cases vs. edge cases
  – Corner cases: infrequent combinations; not all corner cases are edge cases
  – Edge cases: combinations that behave unexpectedly
• Issue: novel for a person ≠ novel for machine learning
  – ML can have “edges” in unexpected places
  – ML might train on features that seem irrelevant to people
https://goo.gl/Ni9HhU (“Not A Pedestrian”)

Page 21:

What We’re Learning With Hologram
• A scalable way to test & train on edge cases
• Workflow: your fleet and your data lake → Hologram cluster tests your CNN → Hologram cluster identifies weaknesses & helps retrain your CNN → your CNN becomes more robust
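The slide gives this workflow only at block-diagram level. As a purely schematic sketch of that kind of test / cluster / retrain loop (this is not Hologram’s implementation; every function name here is a hypothetical placeholder):

```python
def robustness_loop(model, fleet_data, train_fn, find_failures, cluster):
    """Generic test -> cluster weaknesses -> retrain loop (illustrative only)."""
    failures = find_failures(model, fleet_data)   # perception outputs that look wrong
    weakness_clusters = cluster(failures)         # group failures into recurring themes
    hard_examples = [ex for group in weakness_clusters for ex in group]
    return train_fn(model, hard_examples)         # retrain with the hard cases added
```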

Page 22:

Edge Cases Part 2: Brittleness
• Malicious image attacks reveal brittleness
• [Figure: adversarial examples, with QuocNet flipping “Car” to “Not a Car” and AlexNet flipping “Bus” to “Not a Bus”; the magnified pixel difference is shown between each pair] https://goo.gl/5sKnZV
  Szegedy, Christian, et al., “Intriguing properties of neural networks,” arXiv preprint arXiv:1312.6199 (2013)
• NYU back-door training: https://goo.gl/ZB5s4Q
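To illustrate the kind of attack behind these figures, here is a minimal adversarial-perturbation sketch in the fast-gradient-sign style (a simpler successor to the optimization-based attack in the cited Szegedy et al. paper); PyTorch and the `eps` value are my choices, and `model` is any image classifier you supply:

```python
import torch

def fgsm_perturb(model, x, label, eps=0.01):
    """Fast-gradient-sign-style perturbation (illustrative sketch).

    x: image batch (N, C, H, W); label: class indices (N,).
    A tiny, visually imperceptible change to the input can flip the model's
    prediction, which is the brittleness the slide illustrates.
    """
    x = x.clone().detach().requires_grad_(True)
    loss = torch.nn.functional.cross_entropy(model(x), label)
    loss.backward()
    # Step each pixel slightly in the direction that increases the loss.
    return (x + eps * x.grad.sign()).detach()
```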

Page 23:

ML Is Brittle To Environment Changes
• Sensor data corruption experiments; synthetic equipment faults (e.g., Gaussian blur)
• Defocus & haze are a significant issue
• Gaussian blur & Gaussian noise cause similar failures
Exploring the response of a DNN to environmental perturbations, from “Robustness Testing for Perception Systems,” RIOT Project, NREC, DIST-A.
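A minimal sketch of the kind of synthetic-fault injection described above (Gaussian blur for defocus, additive Gaussian noise for sensor noise), re-running a detector across corruption levels; NumPy/SciPy are my choice of tooling and `detector` is a placeholder, not the RIOT project’s code:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_blur(image, sigma=2.0):
    """Simulate defocus by low-pass filtering each color channel of an HxWx3 image."""
    return gaussian_filter(image.astype(np.float32), sigma=(sigma, sigma, 0))

def gaussian_noise(image, std=10.0, seed=0):
    """Simulate sensor noise by adding zero-mean Gaussian noise."""
    rng = np.random.default_rng(seed)
    noisy = image.astype(np.float32) + rng.normal(0.0, std, image.shape)
    return np.clip(noisy, 0, 255)

def degradation_sweep(detector, image, sigmas=(0.5, 1, 2, 4, 8)):
    """Re-run a detector at increasing corruption levels and collect its outputs."""
    return {s: detector(gaussian_blur(image, s)) for s in sigmas}
```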

Page 24:

Context-Dependent Perception Failures
• Perception failures are often context-dependent; false positives and false negatives are both a problem
  – False positive on lane marking; false negative on a real bicyclist
  – False negative when in front of a dark vehicle
  – False negative when a person is next to a light pole
• Will this pass a “vision test” for bicyclists?
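One concrete way to check for this kind of context dependence is to stratify detection metrics by context instead of reporting a single aggregate number; a minimal sketch, where the context labels and record format are assumptions for illustration:

```python
from collections import defaultdict

def false_negative_rate_by_context(records):
    """records: iterable of (context, ground_truth_present, detected) tuples.

    A single aggregate score can hide failures that only occur in specific
    contexts (e.g., a pedestrian next to a light pole), so compute the
    false-negative rate per context bucket.
    """
    misses, totals = defaultdict(int), defaultdict(int)
    for context, present, detected in records:
        if present:
            totals[context] += 1
            if not detected:
                misses[context] += 1
    return {c: misses[c] / totals[c] for c in totals}

# Example with made-up numbers:
rates = false_negative_rate_by_context([
    ("open road", True, True),
    ("open road", True, True),
    ("next to light pole", True, False),
    ("next to light pole", True, True),
])
print(rates)   # {'open road': 0.0, 'next to light pole': 0.5}
```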

Page 25:

Example Triggering Events via Hologram
• Mask R-CNN: examples of systemic problems we found
  – “Red objects”, “Columns”, “Camouflage”, “Sun glare”, “Bare legs”, “Children”, “Single Lane Control”
• Note: these are baseline, un-augmented images. (Your mileage may vary on your own trained neural network.)

Page 26: [image-only slide]

Page 27: [image-only slide]

Page 28: [image-only slide]

Page 29:

UL 4600 – An Autonomy Safety Standard
• Centered on a safety case
  – Credit for existing safety standards; mix & match cross-standards techniques
  – Discourages questionable practices
• “Unknowns” are first-class citizens
  – Balance between analysis & field experience
  – Field monitoring required to feed back to argumentation
  – Assessment findings & field data used to update the standard
• Plan: public standard by end of 2019
  – Standards Technical Panel forming Spring 2019 to review draft

Page 30:

“Credible Autonomy Safety Argumentation”
• SSS 2019, Phil Koopman, Aaron Kane, Jen Black
• A list of pitfalls we’ve seen in autonomy safety arguments: conformance to an existing standard, proven in use, field testing, vehicle simulation, formal proof of correctness, and other issues (e.g., unwarranted independence assumptions)
• Also, a SafeAI paper on a taxonomy of ODD/OEDR: Koopman, P. & Fratrik, F., “How many operational design domains, objects, and events?” SafeAI 2019, AAAI, Jan 27, 2019.

Page 31:

Ways To Improve AV Safety
• More safety transparency: independent safety assessments
• Industry collaboration on safety: minimum performance standards, shared data on scenarios and obstacles, safety for on-road testing (driver & vehicle)
• Autonomy software safety standards: traditional software safety … PLUS … dealing with surprises and brittleness, data collection and feedback on field failures
http://bit.ly/2MTbT8F (sign modified)
[Mars]
Thanks!