
Robust Monocular Flight in Cluttered Outdoor Environments

Shreyansh Daftry, Sam Zeng, Arbaaz Khan, Debadeepta Dey, Narek Melik-Barkhudarov, J. Andrew Bagnell and Martial Hebert

    Robotics Institute, Carnegie Mellon University, USA

Abstract—Recently, there have been numerous advances in the development of biologically inspired lightweight Micro Aerial Vehicles (MAVs). While autonomous navigation is fairly straightforward for large UAVs, as expensive sensors and monitoring devices can be employed, robust obstacle avoidance remains a challenging task for MAVs, which operate at low altitude in cluttered, unstructured environments. Due to payload and power constraints, it is necessary for such systems to have autonomous navigation and flight capabilities using mostly passive sensors such as cameras. In this paper, we describe a robust system that enables autonomous navigation of small agile quadrotors at low altitude through natural forest environments. We present a direct depth estimation approach that is capable of producing accurate, semi-dense depth maps in real time. Furthermore, a novel wind-resistant control scheme is presented that enables stable waypoint tracking even in the presence of strong winds. We demonstrate the performance of our system through extensive experiments on real images and field tests in a cluttered outdoor environment.


Flying robots that can quickly navigate through cluttered environments, such as urban canyons and dense forests, have long been an objective of robotics research [28, 37, 48, 25, 39]. Even though nature is home to several species of birds that have mastered the art of maneuverability during high-speed flight, very little has been achieved so far in realizing the same for Micro Aerial Vehicles (MAVs) [22]. Inspired by this, we take a small step in the direction of building a robust system that allows a MAV to autonomously fly at high speeds of up to 1.5 m/s through a cluttered forest environment, as shown in Figure 1. Our work is primarily concerned with navigating MAVs that have very low payload capabilities and operate close to the ground, where they cannot avoid dense obstacle fields.

In recent years, MAVs have built a formidable resume by making themselves useful in a number of important applications, from disaster scene surveillance and package delivery to aerial imaging, architecture and construction. The most important benefit of such lightweight MAVs is the capability to fly at high speeds in space-constrained environments. However, in order to function in such unstructured environments with complete autonomy, it is essential that they are able to avoid obstacles and navigate robustly. While autonomous operation and obstacle avoidance for MAVs have been well studied in general, most of these approaches use laser range finders (lidar) [2, 36] or Kinect

Fig. 1. We present a novel method for high-speed, autonomous MAV flight through dense forest areas, using only a passive monocular camera.

cameras (RGB-D sensors) [3]. For agile MAVs with very limited payload and power, it is not feasible to carry such active sensors. Therefore, in this work we are interested in methods for range mapping and obstacle avoidance for agile MAVs in unstructured environments that have access only to a passive monocular camera as the primary sensing modality.

Existing approaches for monocular cluttered flight range from purely reactive approaches [33], which use an imitation-learning-based framework for motion prediction, to full planning-based approaches [12, 4], which utilize receding horizon control to plan trajectories. While promising results have been shown, several challenges remain that impede truly autonomous flight in general outdoor environments without reliance on unrealistic assumptions. The reactive layer is capable of good obstacle avoidance but is inherently myopic in nature, which can lead to it getting easily stuck in cul-de-sacs. In contrast, receding horizon based approaches plan over longer horizons and hence minimize the chances of getting stuck [24]. These approaches require a perception method for accurately constructing a local 3D map of the environment in real time. However, current perception methods for monocular depth estimation either lack the robustness and flexibility to deal with unconstrained environments or are too computationally expensive for real-time robotics applications.
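Receding-horizon approaches of this kind repeatedly score a library of candidate trajectories against the latest depth map, execute the best one briefly, and replan. The following is an illustrative sketch of that selection step only; the inverse-depth cost, the pixel-trajectory representation, and all names are our own assumptions, not the paper's planner.

```python
import numpy as np

def pick_trajectory(depth_map, trajectories, horizon=10):
    """Return the candidate trajectory with the lowest obstacle-proximity cost.

    depth_map:    HxW array of metric depths from the perception module.
    trajectories: list of (N, 2) integer arrays of (x, y) pixel coordinates
                  swept by each candidate.
    """
    best_traj, best_cost = None, float("inf")
    for traj in trajectories:
        pts = traj[:horizon]
        depths = depth_map[pts[:, 1], pts[:, 0]]
        # Inverse-depth cost: points that project onto nearby scene
        # structure (small depth) are penalized heavily.
        cost = float(np.sum(1.0 / np.maximum(depths, 1e-3)))
        if cost < best_cost:
            best_traj, best_cost = traj, cost
    return best_traj, best_cost
```

Only the first portion of the chosen trajectory would be executed before the cycle repeats with a fresh depth map, which is what keeps the planner from being myopic without requiring a globally consistent map.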

Furthermore, the difficulty of stabilizing and controlling such agile MAVs in the presence of strong winds is often overlooked. An impressive body of research on control and navigation of MAVs has been published recently, including aggressive maneuvers [30], which in principle should be ideal for flying through forests. However, these control schemes have been designed to work under several strong assumptions and controlled laboratory conditions, which are easily violated during outdoor cluttered flights. The effect of wind on small MAV flight control can be quite significant and can lead to fatal conditions when flying in close proximity to clutter or other aerial vehicles [44]. It is the above considerations that motivate our contribution.

Our technical contributions are three-fold: First, we present an end-to-end system consisting of a semi-dense, direct visual-odometry based depth estimation approach that is capable of producing accurate depth maps at 15 Hz, even while the MAV is flying forward; it is able to track frames even with strong inter-frame rotations using a novel method for visual-inertial fusion on Lie manifolds, and can robustly deal with uncertainty by making multiple, relevant yet diverse predictions. Second, we introduce a novel wind-resistant LQR control scheme based on a learned system model for stable flight in varying wind conditions. Finally, we evaluate our system through extensive experiments and flight tests of more than 2 km in dense forest environments, and show the efficacy of our proposed perception and control modules as compared to state-of-the-art methods.
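The LQR machinery underlying such a controller is standard even though the paper's learned system model is not given here: the feedback gain is obtained by iterating the discrete-time Riccati recursion to a fixed point. A minimal sketch, assuming known (or learned) linear dynamics matrices A and B:

```python
import numpy as np

def dlqr(A, B, Q, R, iters=500):
    """Discrete-time LQR gain via fixed-point iteration of the Riccati
    recursion. Returns K such that u = -K x stabilizes x_{k+1} = A x + B u."""
    P = Q.copy()
    for _ in range(iters):
        # K = (R + B'PB)^{-1} B'PA, then propagate the cost-to-go matrix.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K
```

In a wind-resistant setting, an estimated disturbance would typically enter as an additional feedforward term on top of the state feedback u = -Kx; the exact form used by the system described here is not specified in this section.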


Monocular Depth Estimation: Structure from Motion (SfM) approaches [47, 20, 9] can be used to reconstruct 3D scene geometry from a moving monocular camera. While such SfM approaches are already reasonably fast, they produce sparse maps that are not suited for collision-free navigation: typically, the resulting point cloud of 3D features is highly noisy and, more crucially, the absence of visual features in regions with low texture does not necessarily imply free space. In contrast, methods for obtaining dense depth maps [31, 46] are either still too computationally expensive for high-speed flight in a forest or implicitly assume that the camera moves slowly and with sufficient translational motion. On a high-speed flight this is difficult to guarantee, as all flying maneuvers induce a change in attitude, which would often lead to a loss of tracking. In recent years, learning-based approaches [12, 13] that aim to predict depth directly from visual features have shown promising results. However, these methods are highly sensitive to their training data and hence do not generalize to unseen data, making them unsuitable for exploration-based navigation tasks.

Wind Modeling, Detection and Correction: A significant literature exists on understanding wind characteristics for MAV guidance and navigation. More specifically, work has been done on large-scale phenomena such as thermal soaring [1], ridge soaring [26], and exploitation of wind gusts [5, 32, 27]. However, for all of these, the environments in which these conditions are studied are relatively benign and free of

    Fig. 2. Schematic diagram of software and hardware modules.

clutter. Thus, they rely strongly on assumptions such as constant wind fields and long time scales. The reason for such assumptions is that short-term, turbulent wind effects are difficult to model even with detailed Computational Fluid Dynamics (CFD) analysis, making them impractical from a control standpoint, as studied by Waslander and Wang [44]. In contrast, our work explores the use of simple estimation algorithms to detect current wind disturbances and designs a controller that reduces their impact on waypoint tracking accuracy.
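One simple estimator consistent with this idea is a first-order low-pass filter on the residual between measured and model-predicted acceleration, treating the slowly varying part of the residual as the wind disturbance. This is an illustrative sketch under that assumption, not the paper's exact scheme; the filter form and gain are our own.

```python
import numpy as np

def estimate_wind(accel_meas, accel_pred, alpha=0.1):
    """Low-pass the acceleration residual (measured minus model-predicted)
    to track a slowly varying wind-induced disturbance.

    accel_meas, accel_pred: (T, 3) arrays of accelerations in m/s^2.
    Returns a (T, 3) array with the running disturbance estimate.
    """
    est = np.zeros(3)
    history = []
    for am, ap in zip(accel_meas, accel_pred):
        # Exponential smoothing rejects short-term turbulence while
        # converging to any persistent offset.
        est = (1.0 - alpha) * est + alpha * (am - ap)
        history.append(est.copy())
    return np.array(history)
```

An estimate of this kind could then be fed to the controller as a disturbance term to compensate, rather than attempting to model the turbulent wind field itself.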


We have used a modified version of the 3DR ArduCopter with an onboard quad-core ARM processor and a Microstrain 3DM-GX3-25 IMU. Onboard, there are two monocular cameras: one PlayStation Eye camera facing downward for real-time pose estimation and one high-dynamic-range Point Grey Chameleon camera for monocular navigation. We have a distributed processing framework, as shown in Figure 2, where the image stream from the front-facing camera is streamed to the base station, where the perception module produces a local 3D scene map; the planning module uses these maps to find the best trajectory to follow and transmits it back to the onboard computer, where the control module performs trajectory tracking.
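Flattened into a single process, one perception-planning-control cycle of this distributed loop might look like the following sketch, where the three stage functions are placeholders for the paper's modules (in the real system the first two run on the base station and the third onboard):

```python
def run_cycles(frames, estimate_depth, plan, track):
    """One perception -> planning -> control pass per camera frame.
    The stage functions are injected placeholders, not the paper's code."""
    results = []
    for frame in frames:
        depth_map = estimate_depth(frame)    # base station: semi-dense depth map
        trajectory = plan(depth_map)         # base station: best trajectory
        results.append(track(trajectory))    # onboard: trajectory tracking
    return results
```

Decoupling the stages this way is what allows the heavy perception and planning computation to run off-board while the MAV only has to execute lightweight trajectory tracking.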


The perception system runs on the base station and is responsible for providing the 3D scene structure used for motion planning and control. In the last few years, direct approaches [41, 46, 42, 17] for scene geometry reconstruction have become increasingly popular. Instead of operating solely on visual features, these methods work directly on the image intensities for both mapping and tracking: the world is modeled as a dense surface, while new frames are tracked using whole-image alignment. In this section, we build upon recently published work [15, 8] to present

Fig. 3. Semi-dense Depth Map Estimation. (Top left) Example image frame. (Bottom left) Depth map; colormap: inverted jet, with black marking invalid pixels. (Right) Corresponding 3D point cloud
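As a toy illustration of the whole-image alignment used by such direct methods, the following estimates a 1-D sub-pixel shift by Gauss-Newton minimization of the photometric error. The real system aligns full images over an SE(3) pose rather than a scalar shift, so this is only an analogue of the update structure, with all names our own.

```python
import numpy as np

def photometric_align_1d(ref, cur, n_iters=20):
    """Gauss-Newton estimate of the shift t minimizing the photometric
    error sum_x (ref[x] - cur[x + t])^2."""
    idx = np.arange(len(ref), dtype=float)
    t = 0.0
    for _ in range(n_iters):
        warped = np.interp(idx + t, idx, cur)  # cur resampled at shifted grid
        grad = np.gradient(warped)             # d(warped)/dt = spatial gradient
        r = ref - warped                       # photometric residual
        # Gauss-Newton step on 0.5 * sum(r^2), with Jacobian J = -grad
        t += np.sum(grad * r) / (np.sum(grad * grad) + 1e-12)
    return t
```

The same pattern scales up to full direct tracking: residuals are intensity differences, the Jacobian involves image gradients, and each iteration solves a small normal-equations system for the pose update.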