
Mesh-Based Methods for Robust and Agile Dynamic Gaits

Katie Byl1, Tom Strizic1, and Jason Pusey2

Abstract: In this paper, we propose and analyze mesh-based tools to control bounding motions of an 8 degree-of-freedom planar quadruped model with limited footholds on terrain. There are two complementary goals in our presentation. First, we aim to clarify potential advantages and disadvantages of our mesh-based approach in planning agile motions for a legged system. A key advantage is the ability to map our reachable states and their feasible transitions, given a relatively high-dimensional nonlinear dynamic system for which traditional meshing techniques would be impractical. A suspected disadvantage is that meshing has finite resolution, and robustness of mesh-based results should correspondingly be considered. Our second goal is to discuss proposed definitions and metrics for agility. In particular, we present initial guidelines for quantifying and thereby improving specific metrics that correlate to good performance in a long jump task. We find that our mesh-based policies predict future dynamics for plans up to about a 5-step horizon. To quantify agility, we focus on two intuitive aspects: control of forward motion (speed) and control of foothold placement. Our results provide evidence that the ability to decouple these goals to perform a long jump follows a complex and non-intuitive structure, and that our mesh-based approach is well suited for such agile motion planning tasks.

I. INTRODUCTION

One of the central aims of a legged robot is to be able to control foothold placement, for example to cope with intermittent terrain where wheels cannot be used. However, it is generally much more challenging to design a control framework where step length can be adjusted, with the dynamics correspondingly varying at each step, than it is to design a single inter-step controller aimed at driving the system to a stable limit cycle. This problem gets particularly challenging as the desired step lengths increase, requiring the system to be more dynamic, with significant periods of ballistic flight between intermittent contacts at the ground.

A variety of successful approaches have been developed for controlling bounding quadruped models [1]–[6], and even for real-world robots designed for bounding with a flexible spine [7]–[9]. However, the methods suggested are not clearly applicable to the dual goals of planning for agile foothold placement and of quantifying and optimizing such planning goals.

This work was supported in part by the U.S. Army’s Robotics CTA through grant No. W911NF-08-R-0012, by an NSF CAREER award (CMMI 1255018), and by the Summer Student Research Participation Program at the U.S. Army Research Laboratory administered by the Oak Ridge Institute for Science and Education.

1 Katie Byl and Tom Strizic are with the Department of Electrical and Computer Engineering, University of California Santa Barbara, Santa Barbara, CA 93106-9560, USA [email protected], [email protected]

2 Jason Pusey is with the U.S. Army Research Laboratory, Aberdeen Proving Ground, MD, USA [email protected]

The most relevant work we have found for quantifying agility is [8]. However, the authors focus on the ability of a robot to change its energy (e.g., accelerate and decelerate), rather than the ability to accurately control its end state and/or to exploit limited footholds.

A promising recent approach for deriving control of a dynamic legged system with constraints on available foothold placement is presented in [10]. This work is part of a growing research effort in trajectory optimization for legged systems, in a framework of Sequential Quadratic Programming (SQP), yielding impressive results in recent years across implementations designed to optimize a variety of desired objective functions [10]–[14]. However, such methods are designed to solve for particular trajectories for robot motions and the correspondingly required control inputs over time, without analyzing robustness of the results to noise and other sources of variability. Also, these methods are not appropriate for online planning, for example, to respond to new information as a robot is in locomotion and only senses upcoming terrain with a limited lookahead horizon. In a realistic legged robot scenario, both the upcoming terrain and current state of the robot are varying over time, and control methods must either be “fast enough to compute” or “compact enough to pre-compute”.

In this work, we explore the latter option, using a mesh-based approach to pre-compute reachable states and controller-dependent transitions among them. The approach builds from recent results using the same planar model, earlier presented in [15], but the focus of this paper is on analysis of meshing resolution, and on quantification of robustness and agility.

Our approach and analysis are largely inspired by a growing body of work within our lab on mapping the reachable state space of a running or hopping system [16], [17], and on our previous work on meshing [18] as a method of minimizing fall rates (i.e., optimizing mean first-passage time), as originally proposed in [19]. The basic framework and theory are described in [20]. In its original formulation, our mesh-based approach aimed at minimizing fall rates for a walking model on rough terrain. In a general sense, it aimed at optimizing the robustness of the system to perturbations caused by variations in terrain height.

In this work, we instead focus on the goal of agility. Abstractly, a robust system aims “not to fail” when subject to various sources of variability. By contrast, being agile implies the ability to drive the system to any of a set of particular states or conditions, and to do so rapidly. We argue an intuitive notion of agility implies that a legged system (a) can move to any of a set of physical locations on terrain rapidly, and (b) can also reach a large range of state space. In other words, we wish not only to arrive at a particular physical location but also to do so with appropriate dynamics (e.g., capable of leaping a tall building, or of crossing a ditch, or simply of stopping in place, as required by higher-level planning objectives). In this work, we use a long jump as a particular task requiring agility.

Fig. 1: (top) Geometry of the 8 DOF 2D model, with $d_1 = 0.1$ m, $d_2 = -0.048$ m, $d_3 = -0.049$ m, $d_4 = 0.09$ m, $r_2 = r_3 = 0.24$ m. (bottom) Inertial and passive elements with $m_0 = m_5 = 0.043$ kg, $m_1 = m_4 = 0.57$ kg, $m_2 = 4.4$ kg, $m_3 = 4.5$ kg, $J_2 = 0.0145$ kg-m², $J_3 = 0.015$ kg-m², $k_1 = 3300$ N/m, $k_3 = 11$ N-m/rad, $k_4 = 3460$ N/m, $c_1 = c_4 = 50$ N-s/m, $c_3 = 0.6$ N-m-s/rad.

The rest of this paper is organized as follows. Section II describes our planar bounding model and gives an overview of the finite set of seven low-level controllers we use for high-level planning. Section III describes our algorithms to mesh the reachable state space and to derive approximate optimal control policies using this mesh. Section IV presents results for a long jump task, focusing on the scalability of the mesh, the accuracy of the resulting predictions, and potential agility metrics for predicting performance. Section V summarizes our conclusions and discusses directions for future work.

II. MODELING AND LOW-LEVEL CONTROL

We study an 8 degree of freedom (DOF) bounding model, shown in Fig. 1. All simulations for dynamics are done within Matlab, using equations of motion derived using the Lagrangian approach. The system includes masses at each foot as well as along the body, and has springs both within the legs and within a flexible spine. The model and low-level controls we use are outlined in more detail in [15]. We provide a general overview and highlight key aspects below.

We designed seven low-level controllers and allow for switching between controllers once per step. The switching occurs when the rear hip of the model is at its apex and essentially consists of picking new set points for each internal joint in the model.

More specifically, the 4-link model has three active torque inputs, so that the rotational internal joints are actuated while any pivot joint at either foot during ground contact is unactuated. Our control approach at each step drives each of the three actuated, internal joints to a desired angle. To do so, we use Proportional plus Derivative (PD) control to set torque, based on a simplified feedback linearization approximation. Specifically, we use torque to cancel the spring and damper in the spine, and to account for the configuration-dependent effect of inertia. For control of the rear and front leg joints, our feedback linearization accounts only for the pendulum dynamics of the leg. Our goal is to provide a method that does not depend on exact sensory knowledge of all states during control but still attempts to cancel out a large portion of the nonlinearities of the dynamics.

The control law for the spine is:

$$\tau_{sp} = k_3\theta_{sp} + c_3\dot{\theta}_{sp} + J_{eff}\left(-K_{p,sp}\theta_{sp} - K_{d,sp}\dot{\theta}_{sp}\right) \tag{1}$$

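We select PD gains $K_{p,sp}$ and $K_{d,sp}$ that aim for the following desired second-order dynamics:

$$\ddot{\theta}_{des} = -\omega_n^2\theta - 2\zeta\omega_n\dot{\theta} = -K_p\theta - K_d\dot{\theta} \tag{2}$$

with natural frequency $\omega_n = 25$ rad/s and damping ratio $\zeta = 1$.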

For each leg, the PD gains $K_{p,l}$ and $K_{d,l}$ are selected as above with $\omega_n = 30\pi$ rad/s and $\zeta = 1$. As mentioned, these controllers aim to cancel the free pendulum dynamics of each leg. Torques for the rear leg, $\tau_{rl}$, and for the front leg, $\tau_{fl}$, are:

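$$\begin{aligned}
\tau_{rl} &= (m_1 d_1^2 + m_0 r_1^2)\left(-K_{p,l}(\theta_1 - \theta_{1,des}) - K_{d,l}\dot{\theta}_1\right) \\
\tau_{fl} &= (m_4 d_4^2 + m_5 r_4^2)\left(-K_{p,l}(\theta_4 - \theta_{4,des}) - K_{d,l}\dot{\theta}_4\right)
\end{aligned} \tag{3}$$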
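As a rough illustration of the control laws (1)–(3), the sketch below computes the PD gains implied by a chosen natural frequency and damping ratio and evaluates the spine and leg torques. It is a minimal sketch, not the simulation code used in the paper (which runs in Matlab): the function names are hypothetical, the effective inertia `j_eff` is left as a caller-supplied value because it is not given in the text, and the velocity terms assume the damper and derivative gains act on joint rates.

```python
import math


def pd_gains(omega_n, zeta):
    """Gains matching theta_ddot = -omega_n^2 * theta - 2 * zeta * omega_n * theta_dot."""
    return omega_n ** 2, 2.0 * zeta * omega_n


def spine_torque(theta, theta_dot, j_eff, k3=11.0, c3=0.6, omega_n=25.0, zeta=1.0):
    """Eq. (1): cancel the spine spring and damper, then add inertia-scaled PD.

    j_eff is a hypothetical effective inertia; its value is not stated in the text.
    """
    kp, kd = pd_gains(omega_n, zeta)
    return k3 * theta + c3 * theta_dot + j_eff * (-kp * theta - kd * theta_dot)


def leg_torque(theta, theta_dot, theta_des, m_leg, d, m_foot, r,
               omega_n=30.0 * math.pi, zeta=1.0):
    """Eq. (3): PD control scaled by the pendulum inertia m_leg*d^2 + m_foot*r^2."""
    kp, kd = pd_gains(omega_n, zeta)
    inertia = m_leg * d ** 2 + m_foot * r ** 2
    return inertia * (-kp * (theta - theta_des) - kd * theta_dot)


kp_sp, kd_sp = pd_gains(25.0, 1.0)          # spine gains
kp_l, kd_l = pd_gains(30.0 * math.pi, 1.0)  # leg gains
```

With $\zeta = 1$ both loops are critically damped, so the only design freedom in this sketch is the natural frequency, which is set higher for the legs than for the spine.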

The set points for the seven controllers do not vary dramatically and are hand-tuned to provide a range of behaviors for multi-step trials on flat terrain. Further details are not presented here, due to space limitations.

III. MESHING AND HIGH-LEVEL CONTROL

This section describes our algorithm for generating a non-uniform meshing of reachable states, and for using this meshing for several high-level goals. The high-level goals we investigate are the determination of optimal policies to return to a goal state, derived using value iteration, and brute-force searches over feasible low-level control sequences, either to vary forward speed or to jump over a particular upcoming gap in terrain.

A. Meshing the Reachable State Space

Our meshing algorithm, Alg. 1 (see [18], [20] for details), generates a discrete approximation of the full reachable set of states the system can visit, along with all possible transitions between states. The states in the mesh are Poincaré snapshots of the continuous-time dynamics, taken at the apex of each bounding motion, which is also the moment at which switching between controllers occurs. The mesh is generated by deterministically simulating each of the seven possible one-step controllers from each state in the mesh, only adding new points if they exceed a given distance threshold from all other existing points in the mesh.

Algorithm 1 Meshing Algorithm

Input: Initial set of states $P$, set of controllers $U$, threshold distance $d_{thr}$
Output: State transition map $M$, final set of states $P$

$Q_{next} \leftarrow P$
while $Q_{next}$ not empty do
    $Q_{current} \leftarrow Q_{next}$
    empty $Q_{next}$
    for each $p_j \in Q_{current}$ do
        for each $u_k \in U$ do
            Simulate one step from $p_j$ under $u_k$, and save the generated state vector in $P$ and $Q_{next}$ if it is far enough ($d_{thr}$) from all $p \in P$. The corresponding information about the step length is then stored in $M$.
        end for
    end for
end while
return $M$, $P$

The algorithm begins with just two seed states: the fixed point of the limit cycle associated with Controller 1, and an absorbing failure state. A candidate state is added only if its distance from every other point $p \in \mathbb{R}^n$ in the mesh $P$ exceeds the threshold distance $d_{thr}$, where the distance is:

$$d(x, P) = \min_{p \in P} \sqrt{\sum_{i=1}^{n} \left(\frac{x_i - p_i}{\sigma_i}\right)^2} \tag{4}$$

where $x \in \mathbb{R}^n$ is the system state, and $\sigma_i$ is the standard deviation for state component $i$ over all current points in the mesh.
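The meshing loop of Algorithm 1 and the distance in (4) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions rather than the implementation used for the results: `step` stands in for one simulated bound, the normalization `sigma` is held fixed instead of being recomputed as the mesh grows, the failure state is represented by a `FAIL` marker rather than a mesh point, and the one-dimensional `toy_step` dynamics are invented purely so the example runs.

```python
import math

FAIL = None  # hypothetical marker standing in for the absorbing failure state


def nearest(x, points, sigma):
    """Return (index, distance) of the closest mesh point; the distance is Eq. (4)."""
    best_i, best_d = -1, math.inf
    for i, p in enumerate(points):
        d = math.sqrt(sum(((a - b) / s) ** 2 for a, b, s in zip(x, p, sigma)))
        if d < best_d:
            best_i, best_d = i, d
    return best_i, best_d


def build_mesh(seeds, controllers, step, d_thr, sigma):
    """Grow the mesh from the seed states, as in Algorithm 1."""
    points = list(seeds)
    transitions = {}  # (state index, controller index) -> next state index or FAIL
    frontier = list(range(len(points)))
    while frontier:
        current, frontier = frontier, []
        for j in current:
            for k, u in enumerate(controllers):
                x_next = step(points[j], u)
                if x_next is FAIL:
                    transitions[(j, k)] = FAIL
                    continue
                i, d = nearest(x_next, points, sigma)
                if d > d_thr:  # far from every mesh point: add a new node
                    points.append(x_next)
                    i = len(points) - 1
                    frontier.append(i)
                transitions[(j, k)] = i  # otherwise snap to the nearest neighbor
    return points, transitions


def toy_step(x, u):
    """Invented contracting 1D dynamics, used only to exercise the loop."""
    nxt = 0.5 * x[0] + u
    return FAIL if abs(nxt) > 1.5 else (nxt,)


points, transitions = build_mesh([(0.0,)], [-0.5, 0.0, 0.5], toy_step, 0.2, (1.0,))
```

Because a new point is stored only when it lies farther than `d_thr` from every existing point, the loop terminates once the reachable set is covered at that resolution; transitions that land near an existing node are recorded as going to that node, which is the nearest-neighbor approximation used later for planning.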

B. High-level control

Toward achieving agile motions, we explore high-level control to drive the system from the limit cycle behavior of Controller 1 (C1) to reach a wide variety of end states on terrain and in state space. With only seven controllers, it is possible to exhaustively explore all n-step combinations of low-level control up to some computationally practical limit in n. In practice, this limit is effectively n = 10 steps, corresponding to roughly 4 to 6 meters ahead on terrain (depending on control actions chosen). Most of these sequences result in failures, where the robot crashes into the ground. We then search among the successful plans to examine two goals: (1) varying the location of the end state and (2) crossing a gap of particular size and location, ahead on terrain. Note that compared to our previous work in [15], we no longer use a tree search and instead simply compute all possible transitions.
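A minimal sketch of this exhaustive search is given below, assuming a hypothetical transition table that maps a (state, action) pair to the next mesh state and a step length, with `None` marking a crash. Reducing each step to a single foothold position is a simplification made for illustration; the function names and the two-state toy table are not from the paper. With seven controllers and n = 10 there are at most 7^10 = 282,475,249 sequences, which is why pruning failed branches early matters.

```python
def enumerate_plans(transitions, start, n_steps, n_actions):
    """Return every n-step action sequence that avoids failure, with its footholds.

    transitions[(state, action)] is (next_state, step_length), or None on a crash.
    """
    plans = []

    def expand(state, x, actions, footholds):
        if len(actions) == n_steps:
            plans.append((tuple(actions), tuple(footholds)))
            return
        for a in range(n_actions):
            result = transitions.get((state, a))
            if result is None:
                continue  # crash: prune this branch
            nxt, step_len = result
            expand(nxt, x + step_len, actions + [a], footholds + [x + step_len])

    expand(start, 0.0, [], [])
    return plans


def crosses_gap(footholds, gap_start, gap_length):
    """True if no foothold lies inside the gap and the final foothold is past it."""
    gap_end = gap_start + gap_length
    clear = all(not (gap_start < f < gap_end) for f in footholds)
    return clear and footholds[-1] >= gap_end


# Invented two-state, two-action table used only to exercise the search.
toy = {(0, 0): (0, 0.4), (0, 1): (1, 0.6), (1, 0): (0, 0.5), (1, 1): None}
plans = enumerate_plans(toy, start=0, n_steps=3, n_actions=2)
crossing = [acts for acts, feet in plans if crosses_gap(feet, 0.65, 0.4)]
```

The set of final footholds across `plans` corresponds to goal (1), and filtering with `crosses_gap` corresponds to goal (2).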

To investigate robust performance, we also solve for control policies to drive the system from an arbitrary state in our discrete mesh back to the limit cycle, or to any other point of interest in the mesh. We use the well-known value iteration algorithm to do so. Our implementation is summarized briefly below.

C. Value Iteration

We use a deterministic version of the Value Iteration algorithm to determine optimal control sequences between nodes in the mesh. Starting with a set of states $p \in P$, actions $a \in A$, and a state transition map $M(p, a)$, we select a goal state $p_{goal}$, and designate a failure state $p_{fail}$. The iteration begins by creating a vector $U$ of utilities associated with each state, with $u_{goal} = 0$, $u_{fail} = 10^3$, and all others initialized to zero. On each iteration we update the utility for each state as $U(p) = C_{step} + \gamma \min_{a \in A} U'(M(p, a))$, where $C_{step} = 1$ is the step cost, $\gamma = 1$ is a discount factor, and $U'$ is the utility vector from the previous iteration. By recording the control action that yielded the minimum utility for each state, we also update the optimal policy $\pi^*$ on every iteration. This continues until the maximum change in utility for any state, $\delta$, is less than a threshold $\varepsilon = 10^{-4}$. For the mesh used in this study, $\delta$ converged to zero after 14 iterations, taking 0.7 s to compute.

Algorithm 2 Value Iteration

Input: Set of states $P$, actions $A$, state transition map $M(p, a)$, goal state $p_{goal}$, convergence threshold $\varepsilon$
Output: Vector of utilities for each state $U$, optimal policy from each state $\pi^*$
Local Variables: Utilities for last iteration $U'$, max change in utilities from last iteration $\delta$

$U(p_{goal}) = u_{goal}$, $U(p_{fail}) = u_{fail}$
while $\delta > \varepsilon$ do
    $U' \leftarrow U$
    for all states $p$ in $P$ do
        $U(p) \leftarrow C_{step} + \gamma \min_{a \in A} U'(M(p, a))$
        $\pi^*(p) \leftarrow \arg\min_{a \in A} U'(M(p, a))$
    end for
    $U(p_{goal}) = u_{goal}$, $U(p_{fail}) = u_{fail}$
    $\delta \leftarrow \max_{p \in P} |U(p) - U'(p)|$
end while
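For concreteness, Algorithm 2 can be written as the short Python sketch below. It assumes the transition map is stored as a table `transition[p][a]` of next-state indices, which is a representation chosen here for illustration, and it adds a `max_iter` guard that is not part of the algorithm as stated. The five-state `toy` table is invented test data.

```python
def value_iteration(transition, goal, fail, u_goal=0.0, u_fail=1e3,
                    c_step=1.0, gamma=1.0, eps=1e-4, max_iter=1000):
    """Deterministic value iteration over a transition table.

    transition[p][a] is the index of the state reached from state p under action a.
    Returns (utilities, policy); policy is None at the goal and failure states.
    """
    n_states = len(transition)
    utilities = [0.0] * n_states
    utilities[goal], utilities[fail] = u_goal, u_fail
    policy = [None] * n_states
    for _ in range(max_iter):  # guard (an addition) against non-converging cycles
        previous = list(utilities)
        for p in range(n_states):
            if p == goal or p == fail:
                continue  # these two utilities stay pinned
            costs = [previous[q] for q in transition[p]]
            best = min(range(len(costs)), key=costs.__getitem__)
            utilities[p] = c_step + gamma * costs[best]
            policy[p] = best
        delta = max(abs(u - v) for u, v in zip(utilities, previous))
        if delta <= eps:
            break
    return utilities, policy


# Invented table: state 0 is the goal, state 4 is the failure state.
toy = [[0, 0], [0, 4], [1, 4], [4, 2], [4, 4]]
utilities, policy = value_iteration(toy, goal=0, fail=4)
```

With $C_{step} = 1$ and $\gamma = 1$, the converged utility of a non-doomed state is simply its number of steps to the goal, which is the quantity shown by color in the mesh plots. One caveat of the undiscounted form is that a state able to cycle forever without reaching either the goal or the failure state would have an unbounded utility, hence the iteration cap in this sketch.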

IV. RESULTS AND DISCUSSION

This section presents our results for high-level motion planning, using our meshed approximation of the dynamic system. We focus on providing intuition for the feasibility of meshing and on providing a greater foundation for future directions within the legged robotics community, toward quantifying and improving agility.

A. Mesh Dimensionality

Our entire motivation for using a mesh-based approach is based on the observation that enforcing a low-level controller on the internal joints of a legged system significantly reduces the dimension of the reachable state space. For example, in our work with a 5-link point-footed walker, using a single sliding-mode, low-level controller during rough terrain walking resulted in an approximately 2D manifold of reachable Poincaré states [21]. Using a set of n low-level controllers, the reachable space is nearly (but somewhat larger than) a set of n 2D surfaces.

In this paper, we study a model of higher dimensionality, with 8 DOF instead of 5, i.e., with 16 total position and velocity states. Meshing the full 16-dimensional space is not feasible, due to the curse of dimensionality, and so we have investigated similar deterministic meshing of the reachable state space. However, the planar bounding model includes springs in the legs, meaning the degree of underactuation is greater, and the PD control used here does not attempt to drive the internal states to their set points as aggressively as the sliding-mode control used for the walker.

Thus, the first aspect of interest in our results is to determine the effective dimensionality of the reachable state space, when switching control is implemented using our seven controllers. Depending on the structure of points in space, this can potentially be a fractional dimension [22]. Ultimately, what we care about is the rate of growth of the mesh size as the desired resolution between nearby mesh points becomes more refined. This is exactly what fractional dimensionality (e.g., fractal dimension) quantifies.

To estimate the dimensionality of our reachable state space, we first generated a “fine resolution” mesh, using a distance threshold of $d_{thr} = 0.25$ when using Equation 4 to test the distances to all other points in the mesh, during mesh generation. This resulted in a mesh of 52,826 points. Then, we randomized the order of points in the mesh and “repopulated” a new mesh, using a scaled version of the normalization set by the full mesh. ($d_{thr}$ varies as the mesh is grown.)

Figure 2 shows our results, giving a log-log plot of mesh size ($N$) as a function of the allowed spacing between points ($d$). As expected, the data fall on a line as the mesh becomes coarse (to the right in the plot), since we have hypothesized $N \propto (1/d)^n$, where $n$ is the “dimension” of the reachable state space. Toward the left, the mesh can never exceed the original size of 52,826 points, and the full data are bounded by these two extremes, depicted with the red and blue dashed lines.

[Figure 2 plot: log(N) [number of mesh points] versus log(d) [resolution of mesh]; annotations: Slope = −3.1, Total Mesh-Size Limit.]

Fig. 2: Dimensionality of the mesh is approximately 3. Plot shows how the number of mesh points varies as the resolution of the mesh is reduced. Doubling the resolution increases the mesh size by a factor of about $2^{3.1}$, i.e., the mesh is approximately 3D.

Since the slope in Figure 2 is approximately −3.1, the mesh grows as $N \propto (1/d)^{3.1}$. With $n \approx 3.1$, we have an approximately “3D” set of points. With such a low-dimensional mesh, we still have hope of achieving sufficient resolution to predict step-to-step dynamics. However, if the dynamics are highly sensitive to initial conditions (e.g., chaotic), this hope is lost. Determining sensitivity to initial conditions is an open topic of interest for this work, although simulations of true dynamics (presented in Section IV-D) are promising for time horizons up to about 5 steps.
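The dimension estimate amounts to fitting a line to log(N) against log(d) and reading off the negative slope. The sketch below shows that fit with a plain least-squares formula; the spacings and counts are synthetic values generated from an exact power law with exponent 3.1, used only to check the arithmetic, and are not the data behind Figure 2.

```python
import math


def fit_loglog_slope(spacings, counts):
    """Least-squares slope of log(N) versus log(d)."""
    xs = [math.log(d) for d in spacings]
    ys = [math.log(n) for n in counts]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den


# Synthetic counts following N = C * (1/d)^3.1 (not the paper's data).
spacings = [0.5, 0.7, 1.0, 1.4]
counts = [200.0 * (1.0 / d) ** 3.1 for d in spacings]

slope = fit_loglog_slope(spacings, counts)
dimension = -slope                      # estimated dimension of the point set
growth_per_doubling = 2.0 ** dimension  # mesh growth when the spacing is halved
```

For a dimension of about 3.1, halving the spacing multiplies the mesh size by roughly 8.6, compared with a factor of 2^16 = 65,536 if the full 16-dimensional state space had to be covered.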

B. Value Iteration Results

Figure 3 shows one particular 3D projection of the “non-doomed” states within our mesh points, with the limit cycle of Controller 1 shown as a magenta star. Here, color indicates the number of steps required to arrive at this limit cycle state, as calculated using our value iteration algorithm, which uses a nearest neighbor approximation in predicting state transitions. The plot shows some of the structure of the dynamic system. There are several distinct shapes, each corresponding to a particular controller used on the previous step.

The lower portion of Figure 3 shows the full mesh, including states which are doomed to fail, regardless of future control actions. These points account for approximately one third of the total mesh.

The value iteration results are useful for two purposes. First, they enable us to more efficiently evaluate the long-term stability of solutions found for our long jump task. It is quite possible that optimal solutions to cross a large gap would require extreme motions that essentially “crash” onto the other side, and in fact, we predicted this would likely be the case. Interestingly, the optimal gap-crossing performance does not differ noticeably when we require that the final state has a feasible policy to return to the stable C1 limit cycle. Explaining this is a topic of interest for future work.

Figure 4 shows another 3D perspective of our mesh points, again depicting the “value” (defined as steps-to-goal) through color. Subplots here attempt to show more detail for each of the two otherwise relatively flat-looking surfaces (horizontally and vertically).

C. Control of Distance Traveled on Terrain

Definitions of agility are difficult to decouple from particular high-level application goals. However, one general goal is the ability to reach a large, open set of locations on terrain, which should grow as a function of the number of steps taken. Here, we investigate how this growth occurs and also whether the full set of seven controllers is required for optimal (or near-optimal) performance.

Figure 5 shows the set of rear hip locations in x as a function of the number of steps taken, starting with the C1 limit cycle state. Removing any one of the seven control options reduces the reachable set somewhat, although perhaps not dramatically. Removing Controller 7 (C7) has the most significant effect and is depicted to give the reader some idea of the effect of its removal. As n gets larger, the set grows roughly linearly, as expected.

We were curious whether or not the ability to vary end location on terrain might correlate well with long jump performance. It would be useful to minimize the number of intermediate metrics required to gauge agility across a variety of somewhat different tasks, for instance, planning for a single large gap, or for multiple smaller gaps, or for vertical obstacles, or for catching a moving target, etc. To investigate this, we explored performance of various subsets of the full set of 7 low-level controllers, always including C1 as one of these.

[Figure 3 plots: two 3D scatter plots over states X1, X2, X4; legend: 1 step through 10 steps, Goal State.]

Fig. 3: 3D projection of mesh. In the upper plot, color indicates the number of steps required to return to the limit cycle motion. The lower plot highlights points in the mesh that are doomed to fail in either 1 step (red +) or two steps (green *). States (X1, X2, X4) represent various joint angles here.

The best performance achievable by a set of only three controllers came by using C1, C3, and C7. Using only these options results in a set of achievable end points spanning approximately 3/4 of the full range, as illustrated. At the other extreme, removing C6 seems to have almost no effect, and the worst set of four low-level controllers (C1, C2, C4 and C5) reduces performance dramatically. Of note, although C2 and C5 do not seem to add much (beyond using C1 alone) in this task, they are still essential for good performance across a range of potential gap locations for the long jump, as described in the next section.

[Figure 4 plots: three 3D scatter plots over states X9, X10, X12; legend: 1 step through 10 steps, Goal State.]

Fig. 4: Another 3D projection of the mesh states, using a different subset of states (X9, X10, X12), which now represent various joint velocities.

Fig. 5: Limiting policies affects n-step reachable distance in x.

D. Planning for a Long Jump

Our final results focus on the long jump task. Here, both the distance to the near edge of the gap and the length of this gap are variables, and we determine the full range of conditions for which our mesh predicts a feasible control sequence exists. In simulations, we find that these predicted solutions are usually but not always successful.

More specifically, the median absolute error between predicted and actual foothold locations across a 6-step planning horizon is less than one centimeter at each foothold. However, the distribution is bimodal, with approximately 10% of simulations showing large deviations from planned dynamics.

An important topic for future work is to determine if this is primarily due to limitations in meshing such dynamic systems, or simply indicates the low-level controllers can be better designed. We anticipate that using robustness of results as a benchmark during low-level control optimization may reduce sensitivity significantly, mirroring efforts in design of low-level control for walkers which aimed directly at improving robustness of mesh-based policy performance [18].

Figure 6 shows performance for the long jump task. Here, there is a general trend in which the achievable gap length asymptotically approaches a maximum of 0.82 meters as the lookahead distance to prepare for the gap increases, along the x axis. The red dashed line is a first-order exponential decay, approximating this general trend. We also trace a lower bound on worst-case performance, given that the x axis represents a constant lookahead distance that a robot model has on terrain. In other words, with a lookahead of around 0.7 meters, it is possible there will actually be a gap just short of one meter ahead, where there is a sharp, spiky drop in performance. The lower plot shows the same data on a semi-log plot, to emphasize the exponential approach to some maximal jump capability, as lookahead increases.
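One way to write such a first-order exponential trend is sketched below. The saturating form, the amplitude $A$, and the length scale $\lambda$ are assumptions made for illustration, since the fitted parameters are not stated in the text; only the asymptote $L_{max} = 0.82$ m is taken from the results.

```latex
% L(x): achievable gap length at lookahead distance x (assumed model form)
L(x) \approx L_{max} - A\,e^{-x/\lambda}, \qquad L_{max} = 0.82~\text{m}

% The shortfall from the maximum is then linear on a semi-log plot:
\log\bigl(L_{max} - L(x)\bigr) \approx \log A - \frac{x}{\lambda}
```

Under this form, the shortfall $L_{max} - L(x)$ plotted on a logarithmic axis falls on a straight line with slope $-1/\lambda$, which is what a semi-log view of the data is intended to make visible.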

[Fig. 6 plots. Upper: length of gap (m) versus distance to gap (m). Legend: Max Jump Possible; Best Jump Possible; Conservative Best Jump Limit; Worst Case For Lookahead; Exponential Fit; 10-Step Lookahead Horizon. Lower: the same data on a logarithmic y axis (m) versus distance to gap (m).]

Fig. 6: Long jump performance. The x axis shows distance to the gap (i.e., lookahead), and the y axis shows length of gap. The lower plot shows the same data, with y now representing how much smaller the achievable gap length is, compared with the maximum gap length of 0.82 m.
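The semi-log view suggests a simple way to estimate such a trend: if the achievable gap follows g(d) = g_max - A exp(-d / tau), then log(g_max - g) is linear in the lookahead distance d. The sketch below fits that line by least squares. Only the value g_max = 0.82 m comes from the text; the model form, the function names, and the synthetic data (A = 0.6 m, tau = 1.2 m) are assumptions for illustration and are not the fit shown in the figure.

```python
import numpy as np

def fit_exponential_approach(distance, gap, gap_max):
    """Fit gap(d) = gap_max - A * exp(-d / tau) by a linear fit in log space.

    Returns (A, tau). Points at or above gap_max are ignored, since their
    log-deficit is undefined.
    """
    distance = np.asarray(distance, dtype=float)
    deficit = gap_max - np.asarray(gap, dtype=float)
    mask = deficit > 0
    # log(deficit) = log(A) - d / tau, a straight line in d.
    slope, intercept = np.polyfit(distance[mask], np.log(deficit[mask]), 1)
    return np.exp(intercept), -1.0 / slope

def predicted_gap(distance, gap_max, amplitude, tau):
    """Evaluate the fitted saturating-exponential model."""
    return gap_max - amplitude * np.exp(-np.asarray(distance, dtype=float) / tau)

# Synthetic data generated from assumed parameters (not from the paper).
d = np.linspace(0.5, 4.5, 9)
g = 0.82 - 0.6 * np.exp(-d / 1.2)

amplitude, tau = fit_exponential_approach(d, g, gap_max=0.82)
print(amplitude, tau)
```

Fitting in log space weights small deficits near the asymptote more heavily than a direct nonlinear fit would, which matches the emphasis of the semi-log plot but may not suit noisy data close to g_max.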

Figure 7 shows the same data for the case in which only low-level controllers 1, 3, and 7 are used. The same type of asymptotic improvement occurs, but the maximum jump length is now 0.55 meters, rather than 0.82 meters. We noted significant decreases in performance when any of the controllers was removed from the available set, except for C4, which has no apparent effect on optimal performance for this task.

[Fig. 7 plot, titled "Using Controls 1,3,7 only": length of gap (m) versus distance to gap (m). Legend: Max Jump Possible; Best Jump Possible; Worst Case For Lookahead.]

Fig. 7: Long jump performance, using only Controllers 1, 3, and 7.

Finally, Figure 8 shows the set of footholds on terrain for a few particular optimal policies, for different gap widths and gap locations. The sequence of control actions is listed above the footholds and, as previously described, switching occurs at the apex of each ballistic flight phase in the bounding motion. We note that the sequences for different optimal policies vary significantly, and that we are not able to identify simplified rules to approximate the policy. The mesh-based approach allows us to determine full performance in a dynamic system for which high-level policies cannot be designed through intuition, and it gives a means of quantifying long-term performance easily.

[Fig. 8 plots: five panels of foothold locations on terrain, with the control sequence listed above each panel: (3 2 6 7 2 2 1 6 6 5), (1 3 7 7 7 2 1 7 3 5), (1 7 6 6 2 6 1 5 3 6), (1 3 7 7 7 1 7 7 3 6), (3 5 5 2 7 3 2 3 1 2). Legend: Front Foot Placement; Rear Foot Placement.]

Fig. 8: Examples of optimal policies and their resulting foothold patterns.
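To make the idea of extracting a control sequence from a mesh concrete, the sketch below runs value iteration on a tiny deterministic transition table and then rolls out the greedy policy. Everything here is a toy assumption: the five-state mesh, the two controllers (indexed from 0, unlike the controller labels 1 to 7 used above), the unit cost per step, and the use of -1 to mark a failed transition. It illustrates the general technique only and is not the mesh, cost function, or code used for the results in this paper.

```python
import numpy as np

def min_step_policy(next_state, goal, max_iters=1000):
    """Value iteration for the minimum number of steps to reach `goal`.

    next_state[s, a] is the mesh state reached from state s under
    controller a, or -1 if that transition fails (toy convention).
    Returns (cost, policy); policy is -1 where no action is needed or possible.
    """
    next_state = np.asarray(next_state)
    valid = next_state >= 0
    safe_next = np.where(valid, next_state, 0)  # placeholder index for failed moves
    cost = np.full(next_state.shape[0], np.inf)
    cost[goal] = 0.0
    for _ in range(max_iters):
        # One step of cost plus the cost-to-go of the successor state.
        q = np.where(valid, 1.0 + cost[safe_next], np.inf)
        new_cost = q.min(axis=1)
        new_cost[goal] = 0.0
        converged = np.array_equal(new_cost, cost)
        cost = new_cost
        if converged:
            break
    q = np.where(valid, 1.0 + cost[safe_next], np.inf)
    policy = q.argmin(axis=1)
    policy[~np.isfinite(cost)] = -1  # goal unreachable from these states
    policy[goal] = -1                # already at the goal
    return cost, policy

def rollout(next_state, policy, start, goal):
    """Follow the greedy policy from `start`; return the controller sequence."""
    state, actions = start, []
    while state != goal:
        a = policy[state]
        if a < 0:
            return None  # no feasible sequence predicted by the mesh
        actions.append(int(a))
        state = next_state[state][a]
    return actions

# Toy mesh: 5 states, 2 controllers, state 4 is the goal.
mesh = np.array([[1, 2], [3, -1], [4, 1], [4, 3], [4, 4]])
cost, policy = min_step_policy(mesh, goal=4)
sequence = rollout(mesh, policy, start=0, goal=4)
print(cost[0], sequence)
```

In this toy example the shortest route from state 0 uses controller 1 and then controller 0, which the greedy rollout recovers directly from the cost-to-go table.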


V. CONCLUSIONS AND FUTURE WORK

This paper describes various aspects of a mesh-based approach for multi-step planning of dynamic legged system models, focusing on quantifying agility and investigating robustness. In describing agility, we focus on the ability to reach a desired forward location on terrain in a particular number of steps, and to perform a long jump at variable distances ahead and for variable gap widths. Each task in turn requires access to some region of state space, to change velocity, leg angles, and other variables. Motivated by the desire to improve agility in future work, we present this as a first step toward quantifying some important aspects of agility.

Of particular note, simulations of the dynamics for various initial conditions are generally but not always well predicted by the mesh version of the dynamics when value iteration policies are used (to arrive at an arbitrary state in the mesh), which is a topic for further investigation. This mesh-based framework also provides a means of bootstrapping future design of low-level control, with agreement between mesh-predicted and actual dynamics as a driving goal during optimization of low-level control. That is, accuracy of the mesh is of greater importance than theoretical agility performance based on mesh-based estimates: the estimates are only useful if accurate.

For future work, we also hope to combine various factors in producing optimal policies. For example, [23] uses a metric to trade off robustness with energy use, and we envision eventually combining metrics for these two goals along with various metrics for agility.

REFERENCES

[1] I. Poulakakis, E. Papadopoulos, and M. Buehler, “On the stability of the passive dynamics of quadrupedal running with a bounding gait,” The International Journal of Robotics Research, vol. 25, no. 7, pp. 669–687, 2006. [Online]. Available: http://ijr.sagepub.com/content/25/7/669.abstract

[2] Q. Cao and I. Poulakakis, “Passive quadrupedal bounding with a segmented flexible torso,” in IROS, 2012.

[3] U. Culha and U. Saranlı, “Quadrupedal bounding with an actuated spinal joint,” in Proc. IEEE Int. Conf. Robotics and Automation (ICRA), 2011.

[4] U. Culha, “An actuated flexible spinal mechanism for a bounding quadrupedal robot,” Master’s thesis, Bilkent University, 2012.

[5] M. Khoramshahi, A. Sprowitz, A. Tuleu, M. N. Ahmadabadi, and A. J. Ijspeert, “Benefits of an active spine supported bounding locomotion with a small compliant quadruped robot,” in Proc. IEEE Int. Conf. Robotics and Automation (ICRA), 2013.

[6] K. Byl, B. Satzinger, T. Strizic, P. Terry, and J. Pusey, “Toward agile control of a flexible-spine model for quadruped bounding,” in Proc. SPIE 9468, Unmanned Systems Technology XVII, 94680C, 2015.

[7] G. C. Haynes, J. Pusey, R. Knopf, A. M. Johnson, and D. E. Koditschek, “Laboratory on legs: an architecture for adjustable morphology with legged robots,” vol. 8387, 2012.

[8] J. M. Duperret, G. D. Kenneally, J. L. Pusey, and D. E. Koditschek, “Towards a comparative measure of legged agility,” in Proc. Int. Symp. on Experimental Robotics (ISER 2014), 2015.

[9] G. A. Folkertsma, S. Kim, and S. Stramigioli, “Parallel stiffness in a bounding quadruped with flexible spine,” in Proc. Int. Conf. on Intelligent Robots and Systems (IROS). IEEE, 2012, pp. 2210–2215.

[10] M. Posa, C. Cantu, and R. Tedrake, “A direct method for trajectory optimization of rigid bodies through contact,” International Journal of Robotics Research, vol. 33, no. 1, pp. 69–81, 2014.

[11] L. Righetti, J. Buchli, M. Mistry, M. Kalakrishnan, and S. Schaal, “Optimal distribution of contact forces with inverse-dynamics control,” International Journal of Robotics Research, vol. 32, no. 3, pp. 280–298, 2013.

[12] M. Posa and R. Tedrake, “Direct trajectory optimization of rigid body dynamical systems through contact,” in Algorithmic Foundations of Robotics X. Springer, 2013, pp. 527–542.

[13] I. Mordatch, J. M. Wang, E. Todorov, and V. Koltun, “Animating human lower limbs using contact-invariant optimization,” ACM Transactions on Graphics (TOG), vol. 32, no. 6, p. 203, 2013.

[14] W. Xi, Y. Yesilevskiy, and C. D. Remy, “Selecting gaits for economical locomotion of legged robots,” Int Journal of Robotics Research [online first], Nov 2015, doi 10.1177/0278364915612572.

[15] V. Paris, T. Strizic, J. Pusey, and K. Byl, “Tools for the design of stable yet nonsteady bounding,” in Proc. American Control Conference (ACC) [in press], 2016.

[16] G. Piovan and K. Byl, “Reachability-based control for the active slip model,” Int Journal Robotics Research, 2014.

[17] G. Piovan and K. Byl, “Approximation and control of the slip model dynamics via partial feedback linearization and two-element leg actuation strategy,” IEEE Trans. on Robotics [in press], 2016.

[18] C. O. Saglam and K. Byl, “Robust policies via meshing for metastable rough terrain walking,” in Proc. Robotics: Science and Systems X (RSS X). RSS, 2014, Paper 49.

[19] K. Byl and R. Tedrake, “Metastable walking machines,” International Journal of Robotics Research, vol. 28, no. 8, pp. 1040–1064, 2009.

[20] C. O. Saglam and K. Byl, “Metastable Markov chains,” in Proc. IEEE Conference on Decision and Control (CDC), 2014.

[21] C. O. Saglam and K. Byl, “Stability and gait transition of the five-link biped on stochastically rough terrain using a discrete set of sliding mode controllers,” in Proc. IEEE Int. Conf. Robotics and Automation (ICRA), 2013.

[22] S. H. Strogatz, Nonlinear dynamics and chaos: with applications to physics, biology, chemistry, and engineering. Westview Press, 2014.

[23] C. O. Saglam and K. Byl, “Quantifying the trade-offs between stability versus energy use for underactuated biped walking,” in IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS). IEEE, 2014, pp. 2550–2557.