
Modeling and Control of Soft Robots Using the Koopman Operator and Model Predictive Control

Daniel Bruder, Mechanical Engineering, University of Michigan, Ann Arbor, Michigan 48109. Email: [email protected]

Brent Gillespie, Mechanical Engineering, University of Michigan

C. David Remy, Mechanical Engineering, University of Michigan

Ram Vasudevan, Mechanical Engineering, University of Michigan


Abstract: Controlling soft robots with precision is a challenge due in large part to the difficulty of constructing models that are amenable to model-based control design techniques. Koopman operator theory offers a way to construct explicit linear dynamical models of soft robots and to control them using established model-based linear control methods. This method is data-driven, yet unlike other data-driven models such as neural networks, it yields an explicit control-oriented linear model rather than just a “black-box” input-output mapping. This work describes this Koopman-based system identification method and its application to model predictive controller design. A model and MPC controller of a pneumatic soft robot arm are constructed via the method, and their performance is evaluated over several trajectory following tasks in the real world. On all of the tasks, the Koopman-based MPC controller outperforms a benchmark MPC controller based on a linear state-space model of the same system.

I. INTRODUCTION

Soft robots have bodies made out of intrinsically soft and/or compliant materials. This inherent softness enables them to safely interact with delicate objects, and to passively adapt their shape to unstructured environments [26]. Such traits are desirable for robotic applications that demand safe human-robot interaction such as wearable robots, in-home assistive robots, and medical robots. Unfortunately, the soft bodies of these robots also impose modeling and control challenges, which have restricted their functionality to date. While many novel soft devices such as soft grippers [13], crawlers [32], and swimmers [16] exploit the flexibility of their bodies to achieve coarse behaviors such as grasping and locomotion, they do not exhibit precise control capabilities.

The challenge in constructing such precise control techniques is due in large part to the difficulty of devising models of soft robots that are amenable to model-based control design techniques. Consider for instance a rigid-bodied robotic system that is made up of rigid links connected together by discrete joints. Since joint displacements can be used to fully describe the configuration of a rigid-bodied system, joint displacements and their derivatives make a natural choice for the state variables for rigid-bodied robots [27]. One can use this choice of state variables to describe the dynamics of the rigid-bodied robot. This, as a result, makes the application of model-based control design techniques such as feedback linearization [27], nonlinear model predictive control [2], LQR-trees [29], sequential action control [3], and others feasible.

[Figure 1 panels. Columns: Model, MPC. Rows: Lifted Space (Linear model, Convex MPC), State-space (Nonlinear model, Non-convex MPC). MPC panel axes: cost versus input.]

Fig. 1. A nonlinear dynamical system (bottom-left) has a linear representation in the lifted space made up of all real-valued functions (top-left). While a model predictive controller (MPC) designed for the nonlinear system in state-space requires solving a non-convex optimization problem to choose inputs at each time-step (bottom-right), this problem is convex for an MPC controller designed for the lifted linear system (top-right). This paper develops a data-driven method to construct such a lifted model representation for soft robotic systems in the presence of outliers and to construct a convex, model-based control design technique for such systems.

Soft robots, in contrast, do not exhibit localized deformation at discrete joints, but instead deform continuously along their bodies and have infinite degrees-of-freedom. In the absence of joints, there does not yet exist a canonical choice of state variables to describe the geometry of a soft robot. As a result, existing representations are typically only rich enough to describe the system under restrictive simplifying assumptions. For example, the popular piecewise constant curvature model [35] provides a low-dimensional description of the shape of continuum robots, but only under the assumption that bending occurs in sections of constant curvature. Other simplified models such as pseudo-rigid-body [12] and quasi-static [5, 30, 9, 34] models have proven useful, but they are only able to describe behavior in the subset of conditions over which the simplifying assumptions hold. This can make applying model-based control design techniques impractical.

Alternatively, data-driven methods such as traditional machine learning and deep learning can be applied to construct models for soft robots without making structural simplifying assumptions. Such models provide a “black-box” mapping from inputs to outputs and have been shown to predict behavior well across various configurations of soft robots [8, 30]. However, since no explicit model is constructed, these methods are also not amenable to existing model-based control design techniques.

Koopman operator theory offers an approach that can overcome the challenges of modeling and controlling soft robots. The approach leverages the linear structure of the Koopman operator to construct linear models of nonlinear controlled dynamical systems from input-output data [6, 18], and to control them using established linear control methods [1, 14]. In theory, this approach involves lifting the state-space to an infinite-dimensional space of scalar functions (referred to as observables), where the flow of such observables along trajectories of the nonlinear dynamical system is described by the linear Koopman operator. In practice, however, it is not feasible to compute an infinite-dimensional operator, so a modified version of the Extended Dynamic Mode Decomposition (EDMD) is employed to compute a finite-dimensional projection of the Koopman operator onto a finite-dimensional subspace of all observables (scalar functions). This approximation of the Koopman operator describes the evolution of the output variables themselves, provided that they lie within the finite subspace of observables upon which the operator is projected. Hence, this approach makes it possible to control the output of a nonlinear dynamical system using a controller designed for its linear Koopman representation.

The Koopman approach to modeling and control is well suited for soft robots for several reasons. Soft robots pose less of a physical threat to themselves or their surroundings when subjected to random control inputs than conventional rigid-bodied robots. This makes it possible to safely collect input-output data over a wide range of operating conditions, and to do so in an automated fashion. Furthermore, since the Koopman procedure is entirely data-driven, it inherently captures input-output behavior and avoids the ambiguity involved in choosing a discrete set of states for a structure with infinite degrees of freedom.

The work presented here can be considered an extension of the work on Koopman-based modeling and control of Mauroy and Goncalves [18] and Korda and Mezic [14]. The novel contributions of this work, as depicted in Fig. 1, are:

1) An extension to the Koopman system identification procedure described in [18] to make the resulting Koopman operator both more sparse and less sensitive to outliers and noise in the training data,

2) The application of this identified Koopman model for model predictive control of a physical soft robotic system.

The rest of this paper is organized as follows: In Section II we formally introduce the Koopman operator and describe how it is used to construct linear models of nonlinear dynamical systems. In Section III we describe how the Koopman model can be used to construct a linear model predictive controller (MPC). In Section IV we describe the soft robot and the set of experiments used to evaluate the performance of a Koopman-based MPC controller. In Section V concluding remarks and perspectives are provided.

II. LINEAR SYSTEM IDENTIFICATION

Any finite-dimensional, Lipschitz continuous nonlinear dynamical system has an equivalent infinite-dimensional linear representation in the space of all real-valued functions of the system’s state [15, Definition 3.3.1]. This linear representation, which is called the Koopman operator, describes the flow of functions along trajectories of the system. While it is not possible to numerically represent the infinite-dimensional Koopman operator, it is possible to represent its projection onto a finite-dimensional subspace as a matrix. This section shows that for a given choice of basis functions, a lifted linear dynamical system model can be extracted directly from the matrix approximation of the Koopman operator. The remainder of this section outlines the approach for constructing the Koopman operator approximation and the linear system representation from data. Section III illustrates how this model can be incorporated into a model predictive control algorithm.

A. Koopman Representation of a Dynamical System

Consider a dynamical system

$$\dot{x}(t) = F(x(t)) \tag{1}$$

where $x(t) \in X \subset \mathbb{R}^n$ is the state of the system at time $t \geq 0$, $X$ is a compact subset, and $F$ is a continuously differentiable function. Denote by $\phi(t, x_0)$ the solution to (1) at time $t$ when beginning with the initial condition $x_0$ at time 0. For simplicity, we denote this map, which is referred to as the flow map, by $\phi_t(x_0)$ instead of $\phi(t, x_0)$.

The system can be lifted to an infinite-dimensional function space $\mathcal{F}$ composed of all square-integrable real-valued functions with compact domain $X \subset \mathbb{R}^n$. Elements of $\mathcal{F}$ are called observables. In $\mathcal{F}$, the flow of the system is characterized by the set of Koopman operators $U_t : \mathcal{F} \to \mathcal{F}$, for each $t \geq 0$, which describes the evolution of the observables $f \in \mathcal{F}$ along the trajectories of the system according to the following definition:

$$U_t f = f \circ \phi_t, \tag{2}$$

where $\circ$ indicates function composition. As desired, $U_t$ is a linear operator even if the system (1) is nonlinear, since for $f_1, f_2 \in \mathcal{F}$ and $\lambda_1, \lambda_2 \in \mathbb{R}$

$$U_t(\lambda_1 f_1 + \lambda_2 f_2) = \lambda_1 f_1 \circ \phi_t + \lambda_2 f_2 \circ \phi_t = \lambda_1 U_t f_1 + \lambda_2 U_t f_2. \tag{3}$$

Thus, the Koopman operator provides a linear representation of the flow of a nonlinear system in the infinite-dimensional space of observables (see Fig. 1) [7]. Contrast this representation with the one generated by the (nonlinear) flow map that for each $t \geq 0$ describes how the initial condition evolves according to the dynamics of the system. In particular, if one wants to understand the evolution of an initial condition $x_0$ at time $t$ according to (1), then one could solve the nonlinear differential equation to generate the flow map. On the other hand, one could apply $U_t$ (a linear operator) to the indicator function centered at $x_0$ (i.e. the function that is 1 at $x_0$ and zero everywhere else) to generate an indicator function centered at the point $\phi_t(x_0)$.
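As a small illustration of how lifting can turn nonlinear dynamics into linear ones, consider the following worked example. It is a common textbook-style system and is not taken from this paper; the constants $a$ and $b$ are hypothetical. The state evolves nonlinearly, but the three observables $x_1$, $x_2$ and $x_1^2$ evolve linearly, because $\frac{d}{dt}(x_1^2) = 2 x_1 \dot{x}_1 = 2a x_1^2$.

```latex
% Nonlinear system in the state (x_1, x_2):
\dot{x}_1 = a x_1, \qquad \dot{x}_2 = b\,(x_2 - x_1^2)

% The same dynamics in the observables (x_1, x_2, x_1^2) are linear:
\frac{d}{dt}
\begin{bmatrix} x_1 \\ x_2 \\ x_1^2 \end{bmatrix}
=
\begin{bmatrix} a & 0 & 0 \\ 0 & b & -b \\ 0 & 0 & 2a \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_1^2 \end{bmatrix}
```

For most systems no finite set of observables closes exactly in this way, which is why a finite-dimensional approximation is generally needed.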

B. Identification of Koopman Operator

Since the Koopman operator is an infinite-dimensional object, it cannot be represented by a finite-dimensional matrix. Therefore, we settle for the projection of the Koopman operator onto a finite-dimensional subspace. Using a modified version of the Extended Dynamic Mode Decomposition (EDMD) algorithm [36] originally presented in [18, 19], we identify a finite-dimensional approximation of the Koopman operator via linear regression applied to observed data.

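Define $\bar{\mathcal{F}} \subset \mathcal{F}$ to be the subspace of $\mathcal{F}$ spanned by $N > n$ linearly independent basis functions $\{\psi_i : \mathbb{R}^n \to \mathbb{R}\}_{i=1}^{N}$. We denote the image of $\psi_i$ as $R_i$, which is equal to $\{w \in \mathbb{R} \mid \exists x \in \mathbb{R}^n \text{ such that } \psi_i(x) = w\}$. For convenience, we assume that the first $n$ basis functions are defined as

$$\psi_i(x) = x_i \tag{4}$$

where $x_i$ denotes the $i$th element of $x$. Any observable $f \in \bar{\mathcal{F}}$ can be expressed as a linear combination of these basis functions

$$f = \theta_1 \psi_1 + \cdots + \theta_N \psi_N \tag{5}$$

where each $\theta_i \in \mathbb{R}$. To aid in presentation, we introduce the vector of coefficients $\theta = [\theta_1 \ \cdots \ \theta_N]^\top$ and the lifting function $\psi : \mathbb{R}^n \to \mathbb{R}^N$ defined as:

$$\psi(x) := \begin{bmatrix} x_1 & \cdots & x_n & \psi_{n+1}(x) & \cdots & \psi_N(x) \end{bmatrix}^\top. \tag{6}$$

We denote the image of $\psi$ as $\mathcal{M} = R_1 \times \cdots \times R_N \subset \mathbb{R}^N$. By (5) and (6), $f$ evaluated at a point $x$ in the state space is given by

$$f(x) = \theta^\top \psi(x). \tag{7}$$

We therefore refer to $\psi(x)$ as the lifted state, and $\theta$ as the vector representation of $f$.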
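The lifting function in (6) and the evaluation rule in (7) can be sketched in a few lines of NumPy. This is a minimal illustration that assumes a monomial basis; the helper names `monomial_exponents` and `lift` are hypothetical and do not come from the paper.

```python
from itertools import combinations_with_replacement

import numpy as np


def monomial_exponents(n, max_degree):
    """Exponent tuples of all monomials in n variables with degree 1..max_degree.

    Degree-1 monomials come first, so the first n basis functions are
    psi_i(x) = x_i, matching (4).
    """
    exponents = []
    for degree in range(1, max_degree + 1):
        for combo in combinations_with_replacement(range(n), degree):
            e = [0] * n
            for idx in combo:
                e[idx] += 1
            exponents.append(tuple(e))
    return exponents


def lift(x, exponents):
    """Lifting function psi(x) from (6): evaluate every basis monomial at x."""
    x = np.asarray(x, dtype=float)
    return np.array([np.prod(x ** np.array(e)) for e in exponents])


exponents = monomial_exponents(n=2, max_degree=2)  # x1, x2, x1^2, x1*x2, x2^2
x = np.array([2.0, 3.0])
psi_x = lift(x, exponents)

# Observable f(x) = x1^2 + x2^2 written as a coefficient vector theta, as in (5).
theta = np.array([0.0, 0.0, 1.0, 0.0, 1.0])
f_x = theta @ psi_x  # f(x) = theta^T psi(x), as in (7)
```

Ordering the degree-1 monomials first means the state can later be read off as the first $n$ entries of the lifted state.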

Given this vector representation for observables, a linear operator $L : \bar{\mathcal{F}} \to \bar{\mathcal{F}}$ can be represented as an $N \times N$ matrix. We denote by $\bar{U}_t \in \mathbb{R}^{N \times N}$ the approximation of the Koopman operator in $\bar{\mathcal{F}}$, which operates on observables via matrix multiplication:

$$\bar{U}_t \theta = \theta' \tag{8}$$

where $\theta, \theta'$ are each vector representations of observables in $\bar{\mathcal{F}}$. Our goal is to find a $\bar{U}_t$ that describes the action of the infinite-dimensional Koopman operator $U_t$ as accurately as possible in the $L^2$-norm sense on the finite-dimensional subspace $\bar{\mathcal{F}}$ of all observables.

To perfectly mimic the action of $U_t$ on an observable $f \in \bar{\mathcal{F}} \subset \mathcal{F}$, according to (2) the following should be true for all $x \in X$

$$U_t f(x) = f \circ \phi_t(x) \tag{9}$$

$$(\bar{U}_t \theta)^\top \psi(x) = \theta^\top \psi \circ \phi_t(x) \tag{10}$$

$$\bar{U}_t^\top \psi(x) = \psi \circ \phi_t(x), \tag{11}$$

where the second equation follows by substituting (5) and the last equation follows by cancelling $\theta^\top$. Since this is a linear equation, it follows that for a given $x \in X$, solving (11) for $\bar{U}_t$ yields the best approximation of $U_t$ on $\bar{\mathcal{F}}$ in the $L^2$-norm sense [22]:

$$\bar{U}_t = \left(\psi^\top(x)\right)^\dagger \left(\psi \circ \phi_t(x)\right)^\top \tag{12}$$

where superscript $\dagger$ denotes the least-squares pseudoinverse.

To approximate the Koopman operator from a set of experimental data, we take $K$ discrete state measurements in the form of so-called “snapshot pairs” $(a[k], b[k])$ for each $k \in \{1, \ldots, K\}$ where

$$a[k] = x[k] \tag{13}$$

$$b[k] = \phi_{T_s}(x[k]) + \sigma[k], \tag{14}$$

where $\sigma[k]$ denotes measurement noise, $T_s$ is the sampling period which is assumed to be identical for all snapshot pairs, and $x[k]$ denotes the measured state corresponding to the $k$th measurement. Note that consecutive snapshot pairs do not have to be generated by consecutive state measurements. We then lift all of the snapshot pairs according to (6) and compile them into the following $K \times N$ matrices:

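$$\Psi_a := \begin{bmatrix} \psi(a[1])^\top \\ \vdots \\ \psi(a[K])^\top \end{bmatrix}, \qquad \Psi_b := \begin{bmatrix} \psi(b[1])^\top \\ \vdots \\ \psi(b[K])^\top \end{bmatrix}. \tag{15}$$

$\bar{U}_{T_s}$ is chosen so that it yields the least-squares best fit to all of the observed data, which, following from (12), is given by

$$\bar{U}_{T_s} := \Psi_a^\dagger \Psi_b. \tag{16}$$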
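The least-squares fit in (16) can be sketched as follows. This is an illustrative example, not the authors' code: the toy map `step`, its coefficients, and the three-function basis in `lift` are assumptions chosen so that the lifted dynamics are exactly linear and the result can be checked against a known matrix.

```python
import numpy as np

A_COEF, B_COEF, C_COEF = 0.9, 0.5, 0.2  # hypothetical toy-system parameters


def step(x):
    """One step of a toy nonlinear map (stands in for the flow over one sample)."""
    return np.array([A_COEF * x[0], B_COEF * x[1] + C_COEF * x[0] ** 2])


def lift(x):
    """Lifting function psi(x) = [x1, x2, x1^2]; the first n = 2 entries are the state."""
    return np.array([x[0], x[1], x[0] ** 2])


rng = np.random.default_rng(0)
a = rng.uniform(-1.0, 1.0, size=(200, 2))   # snapshot states a[k]
b = np.array([step(x) for x in a])          # b[k], noise-free in this sketch

Psi_a = np.array([lift(x) for x in a])      # K x N, rows psi(a[k])^T as in (15)
Psi_b = np.array([lift(x) for x in b])

U = np.linalg.pinv(Psi_a) @ Psi_b           # least-squares fit, as in (16)

# For this toy map the lifted dynamics are exactly linear: psi(b) = K_exact psi(a).
K_exact = np.array([
    [A_COEF, 0.0, 0.0],
    [0.0, B_COEF, C_COEF],
    [0.0, 0.0, A_COEF ** 2],
])
max_abs_error = np.max(np.abs(U.T - K_exact))
```

Note that it is the transpose of the fitted matrix that advances the lifted state, consistent with (11). With noisy data or a basis that does not close exactly, `max_abs_error` would not be near zero.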

Sometimes a more accurate model can be attained by incorporating delays into the set of snapshot pairs. To incorporate these delays, we define the snapshot pairs as

$$a[k] = \begin{bmatrix} x[k]^\top & x[k-1]^\top & \cdots & x[k-d]^\top \end{bmatrix}^\top \tag{17}$$

$$b[k] = \begin{bmatrix} \left(\phi_{T_s}(x[k]) + \sigma[k]\right)^\top & x[k]^\top & \cdots & x[k-d+1]^\top \end{bmatrix}^\top \tag{18}$$

where $d$ is the number of delays. We then modify the domain of the lifting function such that $\psi : \mathbb{R}^{n+nd} \to \mathbb{R}^N$ to accommodate the larger dimension of the snapshot pairs. Once these snapshot pairs have been assembled, the model identification procedure is identical to the case without delays.
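The delay-embedded snapshot pairs of (17) and (18) can be assembled from a single recorded trajectory as sketched below. The function name `delay_snapshot_pairs` is hypothetical, and the sketch assumes the measured next state `X[k + 1]` stands in for $\phi_{T_s}(x[k]) + \sigma[k]$.

```python
import numpy as np


def delay_snapshot_pairs(X, d):
    """Build delay-embedded snapshot pairs from consecutive measurements.

    X is a (T, n) array of states x[0], ..., x[T-1] sampled every Ts seconds.
    Returns arrays a and b, each of shape (T - d - 1, n * (d + 1)).
    """
    X = np.asarray(X, dtype=float)
    T = X.shape[0]
    a_rows, b_rows = [], []
    for k in range(d, T - 1):
        # a[k] = [x[k], x[k-1], ..., x[k-d]], as in (17)
        a_rows.append(np.concatenate([X[k - i] for i in range(d + 1)]))
        # b[k] = [x[k+1], x[k], ..., x[k-d+1]], as in (18)
        b_rows.append(np.concatenate([X[k + 1 - i] for i in range(d + 1)]))
    return np.array(a_rows), np.array(b_rows)


# Scalar trajectory 0, 1, 2, 3 with a single delay.
X_demo = np.array([[0.0], [1.0], [2.0], [3.0]])
a_pairs, b_pairs = delay_snapshot_pairs(X_demo, d=1)
```

Each row has dimension $n(d+1) = n + nd$, which matches the enlarged domain of the lifting function.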

C. Building Linear System from Koopman Operator

For dynamical systems with inputs, we are interested in using the Koopman operator to construct discrete linear models of the following form

$$\begin{aligned} z[j+1] &= A z[j] + B u[j] \\ x[j] &= C z[j] \end{aligned} \tag{19}$$

for each $j \in \mathbb{N}$, where $x[0]$ is the initial condition in state space, $z[0] = \psi(x[0])$ is the initial lifted state, $u[j] \in \mathbb{R}^m$ is the input at the $j$th step, and $C$ acts as a projection operator from the lifted space onto the state-space. Specifically, we desire a representation in which (non-lifted) inputs appear linearly, because models of this form are amenable to real-time, convex optimization techniques for feedback control design, as we describe in Section III.

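We construct a model of this form by first applying the system identification method of Section II-B to the following modified snapshot pairs

$$\alpha[k] = \begin{bmatrix} \psi(a[k]) \\ u[k] \end{bmatrix}, \qquad \beta[k] = \begin{bmatrix} \psi(b[k]) \\ u[k] \end{bmatrix} \tag{20}$$

for each $k \in \{1, \ldots, K\}$. The input $u[k]$ in snapshot $k$ is not lifted to ensure that it appears linearly in the resulting model. With these pairs, we define the following $K \times (N + m)$ matrices:

$$\Gamma_\alpha = \begin{bmatrix} \alpha[1]^\top \\ \vdots \\ \alpha[K]^\top \end{bmatrix}, \qquad \Gamma_\beta = \begin{bmatrix} \beta[1]^\top \\ \vdots \\ \beta[K]^\top \end{bmatrix} \tag{21}$$

and solve for the corresponding Koopman operator according to (12)

$$\bar{U}_{T_s} := \Gamma_\alpha^\dagger \Gamma_\beta. \tag{22}$$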

Note that by (11) and (22) the transpose of this Koopman matrix is the best approximation of a transition matrix between the elements of snapshot pairs in the $L^2$-norm sense

$$\bar{U}_{T_s}^\top \begin{bmatrix} \psi(a[k]) \\ u[k] \end{bmatrix} \approx \begin{bmatrix} \psi(b[k]) \\ u[k] \end{bmatrix}, \tag{23}$$

and we desire the best $A, B$ matrices such that

$$A \psi(a[k]) + B u[k] \approx \psi(b[k]). \tag{24}$$

Therefore, the best $A$ and $B$ matrices of (19) are embedded in $\bar{U}_{T_s}^\top$ and can be isolated by partitioning it as follows:

$$\bar{U}_{T_s}^\top = \begin{bmatrix} A_{N \times N} & B_{N \times m} \\ O_{m \times N} & I_{m \times m} \end{bmatrix} \tag{25}$$

where $I$ denotes an identity matrix, $O$ denotes a zero matrix, and the subscripts denote the dimensions of each matrix. The $C$ matrix is defined

$$C = \begin{bmatrix} I_{n \times n} & O_{n \times (N-n)} \end{bmatrix} \tag{26}$$

since by (4), $x = [\psi_1(x), \ldots, \psi_n(x)]$. Note we can also incorporate input delays into the model by appending them to the snapshot pairs as we did in (17) and (18).
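The partitioning in (25) and the output matrix in (26) can be sketched as below. The controlled toy map and its lifting are assumptions made for illustration (they give lifted dynamics that are exactly linear in the lifted state and the input), so the recovered blocks can be compared against known values.

```python
import numpy as np


def step(x, u):
    """Toy controlled map: x1+ = 0.9 x1, x2+ = 0.5 x2 + 0.2 x1^2 + u."""
    return np.array([0.9 * x[0], 0.5 * x[1] + 0.2 * x[0] ** 2 + u[0]])


def lift(x):
    """Hypothetical lifting psi(x) = [x1, x2, x1^2]."""
    return np.array([x[0], x[1], x[0] ** 2])


n, N, m, K = 2, 3, 1, 300
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(K, n))
U_in = rng.uniform(-1.0, 1.0, size=(K, m))

# Rows alpha[k]^T = [psi(a[k])^T, u[k]^T] and beta[k]^T = [psi(b[k])^T, u[k]^T],
# as in (20) and (21). The input is appended without lifting.
Gamma_alpha = np.array([np.concatenate([lift(x), u]) for x, u in zip(X, U_in)])
Gamma_beta = np.array([np.concatenate([lift(step(x, u)), u]) for x, u in zip(X, U_in)])

U_bar = np.linalg.pinv(Gamma_alpha) @ Gamma_beta   # as in (22)

# Partition the transpose as in (25).
Ut = U_bar.T
A = Ut[:N, :N]
B = Ut[:N, N:]
lower_block = Ut[N:, :]                            # expected to be close to [O, I]

C = np.hstack([np.eye(n), np.zeros((n, N - n))])   # as in (26)
```

The lower block carries no model information: it only reproduces the input, which is why it comes out as a zero block next to an identity block.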

Fig. 2. An illustration of the effect of deviating from the image of the lifted functions $\mathcal{M}$ and how it can be remedied by defining a projection operation as described in Section II-D. The evolution of the finite-dimensional system in the state space $X$ from $x_0$ is depicted as a red curve. The lifted version of this evolution is depicted as the blue curve, which is contained in $\mathcal{M}$. The discrete-time system representation in the higher-dimensional space created by iteratively applying the state matrix $A$ to $z[j]$ may generate a solution that is outside of $\mathcal{M}$. Though one can still apply $C$ to $z$ to project it back to $X$, this may result in poor performance. Instead, by projecting $z[j]$ onto the manifold at each discrete time step to define a new lifted state $\hat{z}[j]$, the deviation from $\mathcal{M}$ is reduced, which improves overall predictive performance.

D. Practical Considerations: Overfitting and Sparsity

A pitfall of data-driven modeling approaches is the tendency to overfit. While least-squares regression yields a solution that minimizes the total $L^2$ error with respect to the training data, this solution can be particularly susceptible to outliers and noise [25]. To guard against overfitting to noise while identifying $\bar{U}_{T_s}$, we utilize the $L^1$-regularization method of Least Absolute Shrinkage and Selection Operator (LASSO) [31]:

$$\vec{U}_{T_s} = \arg\min_{\vec{U}_{T_s}} \left\| \vec{\Gamma}_\alpha \vec{U}_{T_s} - \vec{\Gamma}_\beta \right\|_2^2 + \lambda \left\| \vec{U}_{T_s} \right\|_1 \tag{27}$$

where $\lambda \in \mathbb{R}^+$ is the weight of the $L^1$ penalty term, and $\vec{\cdot}$ denotes a vectorized version of each matrix with dimensions consistent with the stated problem. For $\lambda = 0$, (27) provides the same unique least-squares solution as (22); as $\lambda$ increases it drives the elements of $\vec{U}_{T_s}$ to zero. For an overview of the LASSO method and its implementation see Tibshirani [31].

The benefit of using $L^1$-regularization to reduce overfitting rather than $L^2$-regularization (e.g. ridge regression) is its ability to drive elements to zero, rather than just making them small. This promotes sparsity in the resulting Koopman operator matrix (and consequently the $A$ and $B$ matrices). Sparsity is desirable since it reduces the memory needed to store these matrices on a computer, enabling a higher dimensional set of basis functions to be used to construct the lifting function $\psi$.
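The effect of the $L^1$ penalty in (27) can be sketched with a small solver. The paper does not state which LASSO solver was used; the proximal gradient (ISTA) iteration below is swapped in here only because it is short and self-contained, and the data are synthetic. The function names are hypothetical.

```python
import numpy as np


def soft_threshold(V, tau):
    """Entrywise soft-thresholding, the proximal operator of tau * ||.||_1."""
    return np.sign(V) * np.maximum(np.abs(V) - tau, 0.0)


def lasso_ista(Gamma_a, Gamma_b, lam, n_iter=2000):
    """Minimize ||Gamma_a U - Gamma_b||_F^2 + lam * sum(|U|) by proximal gradient.

    The Frobenius norm and entrywise sum equal the 2-norm and 1-norm of the
    vectorized matrices, so this is the same objective as (27).
    """
    L = 2.0 * np.linalg.norm(Gamma_a, 2) ** 2      # Lipschitz constant of the gradient
    U = np.zeros((Gamma_a.shape[1], Gamma_b.shape[1]))
    for _ in range(n_iter):
        grad = 2.0 * Gamma_a.T @ (Gamma_a @ U - Gamma_b)
        U = soft_threshold(U - grad / L, lam / L)
    return U


rng = np.random.default_rng(0)
Gamma_a = rng.standard_normal((200, 4))
U_true = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, -2.0], [0.0, 0.0]])  # sparse ground truth
Gamma_b = Gamma_a @ U_true + 0.01 * rng.standard_normal((200, 2))

U_ls = np.linalg.pinv(Gamma_a) @ Gamma_b                 # unregularized solution
U_lam0 = lasso_ista(Gamma_a, Gamma_b, lam=0.0)           # should agree with U_ls
U_sparse = lasso_ista(Gamma_a, Gamma_b, lam=5.0)         # moderate penalty

# For any lam at or above lam_max, the all-zero matrix is optimal.
lam_max = 2.0 * np.max(np.abs(Gamma_a.T @ Gamma_b))
U_zero = lasso_ista(Gamma_a, Gamma_b, lam=1.01 * lam_max)

density = {name: np.count_nonzero(U) / U.size
           for name, U in [("lam=0", U_lam0), ("lam=5", U_sparse), ("lam_max", U_zero)]}
```

The `density` dictionary gives the fraction of nonzero entries at each penalty weight, which is the quantity traded off against prediction error when choosing $\lambda$.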

Though sparsity is desirable, it can come at the cost of prediction accuracy. As illustrated in Fig. 2, the lifting function $\psi$ maps from $\mathbb{R}^n$ to $\mathcal{M}$, but at some time step $j$, $A\psi(a[j]) + Bu[j]$ may not map onto $\mathcal{M}$. When this happens and we try to simulate our linear model from an initial condition, it may leave the space of legitimate “lifted states” rapidly and fail to predict behavior accurately. We therefore desire the sparsest model that minimizes the distance from $\mathcal{M}$ at each iteration.

Algorithm 1: Koopman Linear System Identification
Input: $\lambda$, $a[k]$, $b[k]$ and $u[k]$ for $k = 1, \ldots, K$
Step 1: Lift data via (6)
Step 2: Combine lifted data and inputs via (20)
Step 3: Approximate Koopman operator $\bar{U}_{T_s}$ via (27)
Step 4: Extract model matrices $A$, $B$ via (25)
Step 5: Identify projection operator $P$ via (30)
Output: $\hat{A} := PA$, $\hat{B} := PB$

This can be accomplished by applying a projection operator at each time step. For each snapshot pair, the ideal projection operator $P$ should satisfy the following for all $k$

$$P\left(A\psi(a[k]) + Bu[k]\right) = \psi(b[k]). \tag{28}$$

To build an approximation to this operator, we construct the following $K \times N$ matrix,

$$\Omega_a := \begin{bmatrix} \left(A\psi(a[1]) + Bu[1]\right)^\top \\ \vdots \\ \left(A\psi(a[K]) + Bu[K]\right)^\top \end{bmatrix}. \tag{29}$$

Then the best projection operator in the $L^2$-norm sense based on our data is given by

$$P := \left(\Omega_a^\dagger \Psi_b\right)^\top. \tag{30}$$

Composing $P$ with the $A$ and $B$ matrices in (19) yields a modified linear model that significantly reduces the distance from $\mathcal{M}$ at each iteration,

$$z[j+1] = \hat{A} z[j] + \hat{B} u[j] \tag{31}$$

where $\hat{A} := PA$ and $\hat{B} := PB$. Algorithm 1 summarizes the proposed model construction process.
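The projection step of (29) to (31) can be sketched as follows. The data come from an assumed toy system, and `A_rough`, `B_rough` are hypothetical stand-ins for a model distorted by heavy regularization; none of these values come from the paper.

```python
import numpy as np


def projection_operator(A, B, Psi_a, U_in, Psi_b):
    """Least-squares projection P, following (29) and (30)."""
    Omega_a = Psi_a @ A.T + U_in @ B.T            # row k is (A psi(a[k]) + B u[k])^T
    return (np.linalg.pinv(Omega_a) @ Psi_b).T


def one_step_error(A, B, Psi_a, U_in, Psi_b):
    """Frobenius norm of the one-step lifted prediction error over all snapshot pairs."""
    return np.linalg.norm(Psi_a @ A.T + U_in @ B.T - Psi_b)


def lift(x):
    return np.array([x[0], x[1], x[0] ** 2])


def step(x, u):
    return np.array([0.9 * x[0], 0.5 * x[1] + 0.2 * x[0] ** 2 + u[0]])


rng = np.random.default_rng(2)
X = rng.uniform(-1.0, 1.0, size=(300, 2))
U_in = rng.uniform(-1.0, 1.0, size=(300, 1))
Psi_a = np.array([lift(x) for x in X])
Psi_b = np.array([lift(step(x, u)) for x, u in zip(X, U_in)])

# Hypothetical "rough" model, e.g. one distorted by heavy regularization.
A_rough = np.array([[0.8, 0.0, 0.0], [0.0, 0.4, 0.0], [0.0, 0.0, 0.7]])
B_rough = np.array([[0.0], [0.9], [0.0]])

P = projection_operator(A_rough, B_rough, Psi_a, U_in, Psi_b)
A_hat, B_hat = P @ A_rough, P @ B_rough          # modified model, as in (31)

err_before = one_step_error(A_rough, B_rough, Psi_a, U_in, Psi_b)
err_after = one_step_error(A_hat, B_hat, Psi_a, U_in, Psi_b)
```

Because the identity matrix is one candidate for $P$ in the least-squares problem, the projected model can never have a larger one-step training error than the original one.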

III. MODEL PREDICTIVE CONTROL

A system model enables the design of model-based controllers that leverage model predictions to choose suitable control inputs for a given task. In particular, model-based controllers can anticipate future events, allowing them to optimally choose control inputs over a finite time horizon. The most popular model-based control design technique is model predictive control (MPC), wherein one optimizes the control input over a finite time horizon, applies that input for a single timestep, and then optimizes again, repeatedly [24]. For linear systems, MPC consists of iteratively solving a convex quadratic program.

Algorithm 2: Koopman-Based MPC
Input: Prediction horizon: $N_h$
       Cost matrices: $G_i, H_i, g_i, h_i$ for $i = 0, \ldots, N_h$
       Constraint matrices: $E_i, F_i, b_i$ for $i = 0, \ldots, N_h$
       Model matrices: $\hat{A}$, $\hat{B}$
for $k = 0, 1, 2, \ldots$ do
    Step 1: Set $z[0] = \psi(x[k])$
    Step 2: Solve (32) to find optimal input $(u[i]^*)_{i=0}^{N_h}$
    Step 3: Set $u[k] = u[0]^*$
    Step 4: Apply $u[k]$ to the system
end

Importantly, this is also the case for Koopman-based MPC control, wherein one solves the following program at each time instance $k$ of the closed-loop operation:

$$\begin{aligned} \min_{u[i],\, z[i]} \quad & z[N_h]^\top G_{N_h} z[N_h] + g_{N_h}^\top z[N_h] + \sum_{i=0}^{N_h - 1} \left( z[i]^\top G_i z[i] + u[i]^\top H_i u[i] + g_i^\top z[i] + h_i^\top u[i] \right) \\ \text{s.t.} \quad & z[i+1] = \hat{A} z[i] + \hat{B} u[i], \quad \forall i \in \{0, \ldots, N_h - 1\} \\ & E_i z[i] + F_i u[i] \leq b_i, \quad \forall i \in \{0, \ldots, N_h - 1\} \\ & z[0] = \psi(x[k]) \end{aligned} \tag{32}$$

where $N_h \in \mathbb{N}$ is the prediction horizon, $G_i \in \mathbb{R}^{N \times N}$ and $H_i \in \mathbb{R}^{m \times m}$ are positive semidefinite matrices, and where each time the program is called, the predictions are initialized from the current lifted state $\psi(x[k])$. The matrices $E_i \in \mathbb{R}^{c \times N}$ and $F_i \in \mathbb{R}^{c \times m}$ and the vector $b_i \in \mathbb{R}^c$ define state and input polyhedral constraints, where $c$ denotes the number of imposed constraints. While the sizes of the cost and constraint matrices in (32) depend on the dimension of the lifted state $N$, Korda and Mezic [14] show these can be rendered independent of $N$ by transforming the problem into its so-called “dense form.” Algorithm 2 summarizes the closed-loop operation of this Koopman-based MPC controller.
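The receding-horizon loop of Algorithm 2 can be sketched as follows. This is a deliberately simplified, hedged illustration: it drops the inequality constraints of (32), uses a plain tracking cost with a small input penalty `rho`, and eliminates the lifted states so that the problem reduces to linear least squares. The plant, the model matrices, the output matrix `C_y` and all function names are assumptions, not the paper's setup.

```python
import numpy as np


def condensed_matrices(A, B, C, Nh):
    """Return Phi, Theta with [y[1]; ...; y[Nh]] = Phi z[0] + Theta [u[0]; ...; u[Nh-1]]."""
    N, m = B.shape
    p = C.shape[0]
    Phi = np.zeros((Nh * p, N))
    Theta = np.zeros((Nh * p, Nh * m))
    A_pow = np.eye(N)
    CAB = []                                        # CAB[j] = C A^j B
    for i in range(Nh):
        CAB.append(C @ A_pow @ B)
        A_pow = A_pow @ A                           # now A^(i+1)
        Phi[i * p:(i + 1) * p, :] = C @ A_pow
        for j in range(i + 1):
            Theta[i * p:(i + 1) * p, j * m:(j + 1) * m] = CAB[i - j]
    return Phi, Theta


def mpc_inputs(A, B, C, z0, ref, rho=1e-6):
    """Unconstrained tracking MPC: minimize sum ||y[i] - r[i]||^2 + rho * sum ||u[i]||^2."""
    Nh, m = ref.shape[0], B.shape[1]
    Phi, Theta = condensed_matrices(A, B, C, Nh)
    lhs = np.vstack([Theta, np.sqrt(rho) * np.eye(Nh * m)])
    rhs = np.concatenate([ref.reshape(-1) - Phi @ z0, np.zeros(Nh * m)])
    u = np.linalg.lstsq(lhs, rhs, rcond=None)[0]
    return u.reshape(Nh, m)


def lift(x):
    return np.array([x[0], x[1], x[0] ** 2])


def plant(x, u):
    """Toy nonlinear plant whose lifted dynamics are exactly z+ = A z + B u."""
    return np.array([0.9 * x[0], 0.5 * x[1] + 0.2 * x[0] ** 2 + u[0]])


A = np.array([[0.9, 0.0, 0.0], [0.0, 0.5, 0.2], [0.0, 0.0, 0.81]])
B = np.array([[0.0], [1.0], [0.0]])
C_y = np.array([[0.0, 1.0, 0.0]])       # hypothetical output: track x2 only

x = np.array([1.0, 0.0])
Nh = 5
for k in range(20):                      # closed loop in the spirit of Algorithm 2
    z0 = lift(x)                         # Step 1: lift the measured state
    u_opt = mpc_inputs(A, B, C_y, z0, ref=np.ones((Nh, 1)))  # Step 2: optimize
    x = plant(x, u_opt[0])               # Steps 3-4: apply only the first input
final_x2 = x[1]
```

With inequality constraints such as actuator limits, the least-squares solve would be replaced by a quadratic programming solver, but the structure of the loop stays the same: lift, optimize over the horizon, apply the first input, repeat.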

Since this optimization problem is convex, it has a unique globally optimal solution that can efficiently be constructed without initialization [4] for models with thousands of states and inputs [21, 28]. This contrasts sharply with the MPC formulation for nonlinear systems (referred to as nonlinear model predictive control or NMPC [2]). NMPC requires solving an optimization problem with nonlinear constraints and a (potentially) nonlinear cost function. As a result, algorithms to solve such problems typically require initialization and can struggle to find globally optimal solutions [23]. Though techniques have been proposed to improve the speed of algorithms to solve NMPC problems [20, 11] or even globally solve such problems without requiring initialization [37], these formulations still take several seconds per iteration, which can make them too slow to be applied during real-time control.

IV. EXPERIMENTS

This section describes the robot and the set of experiments used to demonstrate the efficacy of the modeling and control methods from Sections II and III. Video of the soft robot performing several tasks from the final experiment is included in a supplementary video file (see footnote 1).

A. Robot Description: Soft Arm with Laser Pointer

The robot used for the experiments is a suspended soft arm with a laser pointer attached to the end effector (see Fig. 3). The laser dot is projected onto a 50 cm × 50 cm flat board which sits 34 cm beneath the tip of the laser pointer when the robot is in its relaxed position (i.e. hanging straight down). The position of the laser dot is measured by a digital webcam overlooking the board.

The arm itself consists of two sections that are each composed of three pneumatic artificial muscles or PAMs (also known as McKibben actuators [33]) adhered to a central foam spine by latex rubber bands (see Fig. 3). The PAMs in the upper and lower sections are internally connected so that only three input pressure lines are required, and they are arranged such that for any bending of the upper section, bending in the opposite direction occurs in the bottom section. This ensures that the laser pointer mounted to the end effector points approximately vertically downward so that the laser light strikes the board at all times. The pressures inside the actuators are regulated by three Enfield TR-010-g10-s pneumatic pressure regulators, which accept 0 to 10 V command signals corresponding to pressures of approximately 0 to 140 kPa. In the experiments there is a three-dimensional input corresponding to the voltages into the three pressure regulators and a two-dimensional state corresponding to the position of the laser dot with respect to the center of the board.

B. Characterization of Stochastic Behavior

Most mechanical systems demonstrate stochastic behavior (i.e. an identical input and state can produce a different output) to some extent. Stochastic behavior is characteristic of electronic pressure regulators, which can limit the precision of pneumatically driven soft robotic systems and undermine the predictive capability of models.

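We quantified the stochastic behavior of our soft robot system by observing the variations in output from period to period under sinusoidal inputs to the three actuators of the form

$$u[k] = \begin{bmatrix} 6 \sin\left(\frac{2\pi}{T} k T_s\right) + 3 \\ 6 \sin\left(\frac{2\pi}{T} k T_s - \frac{T}{3}\right) + 3 \\ 6 \sin\left(\frac{2\pi}{T} k T_s - \frac{2T}{3}\right) + 3 \end{bmatrix} \tag{33}$$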

for periods of $T = 6, 7, 8, 9, 10, 11, 12$ seconds and a sampling time of $T_s = 0.1$ seconds with a zero-order hold between samples. Under these inputs, the laser dot traces out a circle with some variability in the trajectory over each period. In Fig. 4 the trajectories over 210 periods are superimposed along with the average over all trials. Nearly all of the observed points fell within 1 cm of the mean trajectory. Given this inherent stochasticity of our soft robotic system, in the best case we expect only to be able to control the output to within approximately 1 cm of a desired trajectory.

Footnote 1: https://youtu.be/e35o2OPsQHs

[Figure 3 labels: Camera, Laser Pointer, PAMs, Foam Spine, Pressure Regulators]

Fig. 3. The soft robot consists of two bending segments with a laser pointer attached to the end effector. A set of three pressure regulators is used to control the pressure inside of the pneumatic actuators (PAMs), and a camera is used to track the position of the laser dot.

Fig. 4. The left plot shows the average response of the system over a single period when the sinusoidal inputs of varying frequencies described by (33) are applied. All of the particular responses are superimposed in light grey. The right plot shows the distribution of trajectories about the mean, with all distances within two standard deviations highlighted in grey. The width of the distribution illustrates how, for the soft robot system, identical inputs can produce outputs that vary by up to 2 cm.

C. Data Collection and Model Identification

To construct a model, we ran the system through 16 trials each lasting approximately 20 minutes. A randomized input was applied during each trial to generate a representative sampling of the system’s behavior over its entire operating range. To ensure randomization, a matrix $\Upsilon \in [0, 10]^{3 \times 1000}$ of uniformly distributed random numbers between zero and ten was generated to be used as an input lookup table. Each control input was smoothly varied between elements in consecutive columns of the table over a transition period $T_u$, with a time offset of $T_u/3$ between each of the three control signals

$$u_i(t) = \frac{\left(\Upsilon_{i,k+1} - \Upsilon_{i,k}\right)}{T_u} \left(t + \frac{(i-1)T_u}{3}\right) + \Upsilon_{i,k} \tag{34}$$

where $k = \mathrm{floor}(t/T_u)$ is the current index into the lookup table at time $t$. The transition period $T_u$ varied from 5 seconds to 10 seconds between trials. After collection, the data was uniformly sampled with period $T_s = 0.1$ seconds.

TABLE I
AVERAGE PREDICTION ERROR OVER 2.5 SECOND HORIZON (CM)

             Period of Sinusoidal Inputs (seconds)
Model        6     7     8     9     10    11    12    Avg.
Koopman      2.21  2.78  1.35  1.53  1.21  0.66  1.41  1.59
Linear S.S.  4.64  4.54  3.94  3.56  3.15  2.72  2.83  3.63

Two models were fit from the data: a Koopman model, and a linear state-space model. The linear state-space model provides a baseline for comparison and was identified from the same data as the Koopman model using the MATLAB System Identification Toolbox [17]. This model is a four-dimensional linear state-space model expressed in observer canonical form. The Koopman model was identified via the method described in Section II on a set of 191,000 snapshot pairs $a[k], b[k]$ that incorporate a single delay $d = 1$:

$$a[k] = \begin{bmatrix} x[k]^\top & x[k-1]^\top & u[k-1]^\top \end{bmatrix}^\top \tag{35}$$

$$b[k] = \begin{bmatrix} \left(\phi_{T_s}(x[k]) + \sigma[k]\right)^\top & x[k]^\top & u[k]^\top \end{bmatrix}^\top, \tag{36}$$

and using an $N = 330$ dimensional set of basis functions consisting of all monomials of maximum degree 4. To find the sparsest acceptable matrix representation of the Koopman operator, equation (27) was solved for $\lambda = 0, 1, 2, \ldots, 50$. Predictions from the resulting models were evaluated against a subset of the training data, with the error quantified as the average Euclidean distance between the prediction and actual trajectory at each point, normalized by dividing by the average Euclidean distance between the actual trajectory and the origin. Fig. 5 shows that as $\lambda$ increases so does this error, but the density of the $A$ matrix of the lifted linear model decreases. The model chosen is the one that minimizes prediction error, which results in an $A$ matrix with 70% of its entries equal to zero.
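The basis size and the normalized error metric described above can be checked with a short sketch. It assumes that the monomial basis includes the constant term and is built over all seven entries of $a[k]$ in (35); under that assumption the count matches $N = 330$. The function name `normalized_error` is hypothetical.

```python
from math import comb

import numpy as np

# Dimension of a[k] in (35): x[k] (2 entries), x[k-1] (2 entries), u[k-1] (3 entries).
n_lifted_inputs = 2 + 2 + 3
max_degree = 4

# Number of monomials of degree <= 4 in 7 variables, constant term included.
N = comb(n_lifted_inputs + max_degree, max_degree)


def normalized_error(predicted, actual):
    """Mean Euclidean prediction error divided by the mean distance of the actual
    trajectory from the origin."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    mean_error = np.mean(np.linalg.norm(predicted - actual, axis=1))
    mean_radius = np.mean(np.linalg.norm(actual, axis=1))
    return mean_error / mean_radius


# Two-point example: errors of 0 and 1, actual points both at distance 5 from the origin.
example_error = normalized_error([[3.0, 4.0], [0.0, 4.0]], [[3.0, 4.0], [0.0, 5.0]])
```

Normalizing by the mean distance from the origin makes the error dimensionless, so models can be compared across trajectories of different sizes.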

D. Experiment 1: Model Prediction Comparison

The accuracy of the predictions generated by each of the two models was evaluated by comparing them to the actual behavior of the system under the sinusoidal inputs defined in (33). The model responses were simulated over a time horizon of 2.5 seconds given the same initial condition and input as the real system. The results of this comparison are summarized by Fig. 6 and Table I. They illustrate that the Koopman model predictions are more accurate over the time horizon.

E. Experiment 2: Model-Based Control Comparison

The two identified models were each used to build a model predictive controller which solves an optimization problem in the form of (32) at each time step using the Gurobi Optimization software [10]. We refer to the two controllers by the abbreviations K-MPC for the one based on the Koopman model, and L-MPC for the one based on the linear state-space model. Both model predictive controllers run in closed-loop at 10 Hz, feature an MPC horizon of 2.5 seconds ($N_h = 25$), and a cost function that penalizes deviations from a reference trajectory $r[k]$ over the horizon with both a running and terminal cost:

Fig. 5. As $\lambda$ (the weight of the $L^1$ penalty term in (27)) increases, the density of the lifted system matrix $A$ decreases. The model generated by solving (27) with the value of $\lambda$ designated by the vertical grey bar has lower error and a sparser $A$ matrix than the least-squares solution to (22), which occurs at $\lambda = 0$.

[Figure 6 panels. Left: laser dot trajectory. Right: Error (cm) versus Time (s), 0 to 2.5 s. Legend: Actual, Koopman Prediction, Linear Prediction.]

Fig. 6. The (average) actual response and the model predictions for the robot over a 2.5 second horizon with the sinusoidal inputs described in (33) with period $T = 10$ seconds applied. The left plot shows the actual trajectory of the laser dot along with model predictions. The error displayed on the right plot is defined as the Euclidean distance between the predicted laser dot position and the actual position at each point in time. The prediction error is smaller for the Koopman model over the entire horizon.

$$\text{Cost} = 100\,(y[N_h] - r[N_h])^2 + \sum_{i=0}^{N_h-1} 0.1\,(y[i] - r[i])^2 \tag{37}$$
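As a concrete reading of (37), the sketch below evaluates the cost for a given sequence of outputs and references. It assumes, for illustration only, that y[i] and r[i] are vectors and that the square denotes a squared Euclidean norm; the function name `tracking_cost` and the toy data are hypothetical, while the weights 100 and 0.1 are taken from (37).

```python
import numpy as np

def tracking_cost(y, r, terminal_weight=100.0, running_weight=0.1):
    """Evaluate the cost in (37).

    y, r: arrays of shape (Nh + 1, m) holding outputs and references for
    i = 0, ..., Nh. The squared term is taken as a squared Euclidean norm,
    which is an assumption made here for vector-valued outputs.
    """
    y = np.asarray(y, dtype=float)
    r = np.asarray(r, dtype=float)
    sq_err = np.sum((y - r) ** 2, axis=1)            # squared error at each step
    running = running_weight * np.sum(sq_err[:-1])   # terms i = 0, ..., Nh - 1
    terminal = terminal_weight * sq_err[-1]          # term i = Nh
    return terminal + running

# Toy check with Nh = 25 and a constant 1 cm offset along one axis.
Nh = 25
r = np.zeros((Nh + 1, 2))
y = r + np.array([1.0, 0.0])
cost = tracking_cost(y, r)   # 100 * 1 + 25 * 0.1 * 1
```

With these weights a unit error at the final step contributes forty times more than unit errors at all 25 earlier steps combined, so the terminal term dominates the cost.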

In the K-MPC case, y[i] = Cz[i], where C is defined as in (26). In the L-MPC case, y[i] = C_L x_L[i], where x_L is the four-dimensional system state and C_L is the projection matrix that

[Fig. 7 plots: rows K-MPC and L-MPC; columns Task 1, Task 2, Task 3.]

Fig. 7. The results of the K-MPC controller (row 1, blue) and the L-MPC controller (row 2, red) in performing trajectory following tasks 1-3. The reference trajectory for each task is superimposed in black, along with a grey buffer with width equal to two standard deviations of the noise probability density shown in Fig. 4.

isolates the states describing the current laser dot coordinates.

The performance of the controllers was assessed with respect to a set of three trajectory following tasks. Each task was to follow a reference trajectory as it traced out one of the following shapes over a specified amount of time:

1) Pacman (90 seconds)
2) Star (180 seconds)
3) Block letter M (300 seconds)

The error for each trial was quantified as the Euclidean distance from the reference trajectory at each time step over the length of the trial.
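The error metric described above can be computed as in the following sketch. It assumes that the measured laser dot positions and the reference are sampled at the same time steps and stored as arrays of planar coordinates in centimeters, and that the per-step errors are averaged over the trial; the function name `average_tracking_error` and the sample data are hypothetical.

```python
import numpy as np

def average_tracking_error(measured, reference):
    """Mean Euclidean distance between measured and reference points over a trial."""
    measured = np.asarray(measured, dtype=float)
    reference = np.asarray(reference, dtype=float)
    per_step = np.linalg.norm(measured - reference, axis=1)  # error at each time step
    return per_step.mean()

# Hypothetical three-step trial (coordinates in cm).
reference = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
measured = np.array([[0.0, 1.0], [1.0, 2.0], [5.0, 4.0]])
avg_err = average_tracking_error(measured, reference)   # per-step errors are 1, 2, 5
```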

The performances of the K-MPC and L-MPC controllers at Tasks 1, 2, and 3 are shown visually in Fig. 7, and the error is quantified in Table II. In all three tasks the K-MPC controller achieved better performance, exhibiting an average tracking error of 1.26 cm compared to the L-MPC controller’s average error of 2.45 cm. This amounts to an average error roughly 25% larger than the maximum magnitude of observed noise (see Fig. 4).

V. CONCLUSION

In this work, a data-driven modeling and control method based on Koopman operator theory was successfully applied

TABLE II
AVERAGE ERROR IN TRAJECTORY FOLLOWING TASKS (CM)

Controller   Task 1   Task 2   Task 3   Avg.   Std. Dev.
K-MPC        1.25     1.19     1.34     1.26   0.07
L-MPC        2.21     2.34     2.73     2.45   0.31

to a soft robot. The Koopman-based MPC controller was shown to be capable of commanding a soft robot to follow a reference trajectory better than an MPC controller based on another linear data-driven model. By making explicit control-oriented models of soft robots easier to construct, this method enables the rapid development of new control strategies and applications.

While these preliminary results are promising, further work is needed to make such methods feasible for higher dimensional robotic systems. Toward that end, this work introduced a method for promoting sparsity in matrix representations of the Koopman model. Additional work will explore strategies for further promoting sparsity, choosing the most effective basis of observables, and building models that can account for external loading and contact forces.

ACKNOWLEDGMENTS

This material is supported by the Toyota Research Institute, and is based upon work supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. 1256260 DGE. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

REFERENCES

[1] Ian Abraham, Gerardo de la Torre, and Todd Murphey. Model-based control using Koopman operators. In Proceedings of Robotics: Science and Systems, Cambridge, Massachusetts, July 2017. doi: 10.15607/RSS.2017.XIII.052.

[2] Frank Allgower and Alex Zheng. Nonlinear model predictive control, volume 26. Birkhauser, 2012.

[3] Alexander R Ansari and Todd D Murphey. Sequential action control: Closed-form optimal control for nonlinear and nonsmooth systems. IEEE Trans. Robotics, 32(5):1196–1214, 2016.

[4] Stephen Boyd and Lieven Vandenberghe. Convex optimization. Cambridge University Press, 2004.

[5] D. Bruder, A. Sedal, R. Vasudevan, and C. D. Remy. Force generation by parallel combinations of fiber-reinforced fluid-driven actuators. IEEE Robotics and Automation Letters, 3(4):3999–4006, Oct 2018. ISSN 2377-3766. doi: 10.1109/LRA.2018.2859441.

[6] Daniel Bruder, C David Remy, and Ram Vasudevan. Nonlinear system identification of soft robot dynamics using Koopman operator theory. arXiv preprint arXiv:1810.06637, 2018.

[7] Marko Budisic, Ryan Mohr, and Igor Mezic. Applied Koopmanism. Chaos: An Interdisciplinary Journal of Nonlinear Science, 22(4):047510, 2012.

[8] Morgan T Gillespie, Charles M Best, Eric C Townsend, David Wingate, and Marc D Killpack. Learning nonlinear dynamic models of soft robots for model predictive control with neural networks. In 2018 IEEE International Conference on Soft Robotics (RoboSoft). IEEE, 2018.

[9] Ian A Gravagne, Christopher D Rahn, and Ian D Walker. Large deflection dynamics and control for planar continuum robots. IEEE/ASME Transactions on Mechatronics, 8(2):299–307, 2003.

[10] LLC Gurobi Optimization. Gurobi optimizer reference manual, 2018. URL http://www.gurobi.com.

[11] Ayonga Hereid and Aaron D Ames. FROST: Fast robot optimization and simulation toolkit. In Intelligent Robots and Systems (IROS), 2017 IEEE/RSJ International Conference on, pages 719–726. IEEE, 2017.

[12] Larry L Howell, Ashok Midha, and TW Norton. Evaluation of equivalent spring stiffness for use in a pseudo-rigid-body model of large-deflection compliant mechanisms. Journal of Mechanical Design, 118(1):126–131, 1996.

[13] Filip Ilievski, Aaron D Mazzeo, Robert F Shepherd, Xin Chen, and George M Whitesides. Soft robotics for chemists. Angewandte Chemie, 123(8):1930–1935, 2011.

[14] Milan Korda and Igor Mezic. Linear predictors for nonlinear dynamical systems: Koopman operator meets model predictive control. Automatica, 93:149–160, 2018.

[15] Andrzej Lasota and Michael C Mackey. Chaos, fractals, and noise: stochastic aspects of dynamics, volume 97. Springer Science & Business Media, 2013.

[16] Andrew D Marchese, Cagdas D Onal, and Daniela Rus. Autonomous soft robotic fish capable of escape maneuvers using fluidic elastomer actuators. Soft Robotics, 1(1):75–87, 2014.

[17] MATLAB. version 7.10.0 (R2017a). The MathWorks Inc., Natick, Massachusetts, 2017.

[18] Alexandre Mauroy and Jorge Goncalves. Linear identification of nonlinear systems: A lifting technique based on the Koopman operator. arXiv preprint arXiv:1605.04457, 2016.

[19] Alexandre Mauroy and Jorge Goncalves. Koopman-based lifting techniques for nonlinear systems identification. arXiv preprint arXiv:1709.02003, 2017.

[20] Michael A Patterson and Anil V Rao. GPOPS-II: A MATLAB software for solving multiple-phase optimal control problems using hp-adaptive Gaussian quadrature collocation methods and sparse nonlinear programming. ACM Transactions on Mathematical Software (TOMS), 41(1):1, 2014.

[21] Joel A Paulson, Ali Mesbah, Stefan Streif, Rolf Findeisen, and Richard D Braatz. Fast stochastic model predictive control of high-dimensional systems. In Decision and Control (CDC), 2014 IEEE 53rd Annual Conference on, pages 2802–2809. IEEE, 2014.

[22] Roger Penrose. On best approximate solutions of linear matrix equations. In Mathematical Proceedings of the Cambridge Philosophical Society, volume 52, pages 17–19. Cambridge University Press, 1956.

[23] Elijah Polak. Optimization: algorithms and consistent approximations, volume 124. Springer Science & Business Media, 2012.

[24] James B Rawlings and David Q Mayne. Model predictive control: Theory and design. Nob Hill Pub. Madison, Wisconsin, 2009.

[25] Peter J Rousseeuw and Annick M Leroy. Robust regression and outlier detection, volume 589. John Wiley & Sons, 2005.

[26] Daniela Rus and Michael T Tolley. Design, fabrication and control of soft robots. Nature, 521(7553):467, 2015.

[27] M.W. Spong and M. Vidyasagar. Robot Dynamics And Control. Wiley India Pvt. Limited, 2008. ISBN 9788126517800. URL https://books.google.com/books?id=PtxYAv7ZUYMC.

[28] Bartolomeo Stellato, Goran Banjac, Paul Goulart, Alberto Bemporad, and Stephen Boyd. OSQP: An operator splitting solver for quadratic programs. In 2018 UKACC 12th International Conference on Control (CONTROL), pages 339–339. IEEE, 2018.

[29] Russ Tedrake, Ian R Manchester, Mark Tobenkin, and John W Roberts. LQR-trees: Feedback motion planning via sums-of-squares verification. The International Journal of Robotics Research, 29(8):1038–1052, 2010.

[30] Thomas George Thuruthel, Egidio Falotico, Federico Renda, and Cecilia Laschi. Model-based reinforcement learning for closed-loop dynamic control of soft robotic manipulators. IEEE Transactions on Robotics, 2018.

[31] Robert Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), pages 267–288, 1996.

[32] Michael T Tolley, Robert F Shepherd, Bobak Mosadegh, Kevin C Galloway, Michael Wehner, Michael Karpelson, Robert J Wood, and George M Whitesides. A resilient, untethered soft robot. Soft Robotics, 1(3):213–223, 2014.

[33] Bertrand Tondu. Modelling of the McKibben artificial muscle: A review. Journal of Intelligent Material Systems and Structures, 23(3):225–253, 2012.

[34] Deepak Trivedi, Amir Lotfi, and Christopher D Rahn. Geometrically exact models for soft robotic manipulators. IEEE Transactions on Robotics, 24(4):773–780, 2008.

[35] Robert J Webster III and Bryan A Jones. Design and kinematic modeling of constant curvature continuum robots: A review. The International Journal of Robotics Research, 29(13):1661–1683, 2010.

[36] Matthew O Williams, Ioannis G Kevrekidis, and Clarence W Rowley. A data-driven approximation of the Koopman operator: Extending dynamic mode decomposition. Journal of Nonlinear Science, 25(6):1307–1346, 2015.

[37] Pengcheng Zhao, Shankar Mohan, and Ram Vasudevan. Control synthesis for nonlinear optimal control via convex relaxations. In American Control Conference (ACC), 2017, pages 2654–2661. IEEE, 2017.
