14
Neural Networks, Vo!. 6, pp, 919-932, 1993 Printed in the USA. All rights reserved. ORIGINAL CONTRIBUTION 0893-6080/93 $6.00 + .00 Copyright © 1993 Pergamon Press Ltd. A Neural Network Model for Arm Trajectory Formation Using Forward and Inverse Dynamics Models YASUHIRO WADA AND MITSUO KAWATO ATR Human Information Processing Research Laboratories (Received 25 May 1992; revised and accepted 14 January 1993) Abstract-The minimum torque-change model predicts and reproduces human multi-joint movement data quite well. However, there are three criticisms of the current neural network models for trajectory formation based on the minimum torque-change criteria: ( I) their spatial representation of time, (2) back propagation is essential, and (3) they require too many iterations. Accordingly, wepropose a new neural network model for trajectory formation based on the minimum torque-change criterion. Our neural network model basically uses a forward dynamics model, an inversedynamics model, and a trajectoryformation mechanism, whichgenerates an approximate minimum torque- change trajectory It does not require spatial representation of time or back propagation. Furthermore, there are less iterations required to obtain an approximate optimal solution. Finally, our neural network model can be broadly applied to the engineeringfield becauseit is a new methodfor solving optimization problems with boundary conditions. Keywords-Minimum torque-change model, Forward dynamics model, Inverse dynamics model, Optimization problem, Trajectory formation. 1. INTRODUCTION To control voluntary movements, one must solve the following three computational ill-posed problems: ( 1) desired trajectory determination, (2) transformation of the task-space coordinates of the desired trajectory to intrinsic body coordinates, and (3) motor command generation to achieve the desired trajectory. One way to solve these problems is to introduce a smoothness performance index. Twoexperimentally confirmed ob- jective functions for voluntary movement were pro- posed. Flash and Hogan (1985) proposed a mathe- matical model, the minimum-jerk model. They pro- posedthat the trajectory followed by the subject's arms tended to minimize the integral of the square of the jerk (rate of change of acceleration) of the hand position in Cartesian coordinate space, integrated overthe entire movement. The unique trajectory that yields the best performance is easily computed by applying the Euler- Acknowledgements: Part of this work was done at ATR Auditory and Visual Perception Research Laboratories. We would like to thank Dr. Y. Uno for discussing this work. We wish to express our thanks to thank Dr. Y. Tohkura of ATR Human Information Processing Research Laboratories for his encouragement. This work was sup- ported by a Human Frontier Science Project Grant to Mitsuo Kawato. Requests for reprints should be sent to Yasuhiro Wada, ATR Hu- man Information Processing Research Laboratories, 2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-02, Japan. 919 Lagrange equation because their model is based solely on the kinematics of movement, independent of the dynamics of the musculoskeletal system. Several hard- ware models that can compute minimum-jerk trajec- tories have been proposed using recurrent neural net- works (Jordan, 1989; Massone & Bizzi, 1989; Hoff & Arbib, 1994)' Based on the idea that the objective function must be related to dynamics, Uno, Kawato, and Suzuki ( 1989) proposed a minimum torque-change modelthat accounts for desired trajectory determination. The model is based on the theory that the trajectory of the human arm is determined so as to minimize the time integral ofthe square of the rate of torque change. Be- cause the dynamics of the human arm or a robotic manipulator are nonlinear, finding the unique trajec- tory based on the minimum torque-change model is a nonlinear optimization problem. This is a rather dif- ficult optimization problem because the smoothness criterion is represented in the motor command space on the one hand. On the other hand, movement con- ditions such as target point locations, via-points, and obstaclesare represented in task-oriented coordinates. Thus, the optimization problem is computationally very intensive to be solved using the Euler-Lagrange equation. Hitherto, two kinds of hardware models have been proposed. To generate a trajectory based on the minimum torque-change model, Kawato, Maeda, Uno,

A neural network model for arm trajectory formation using … · 2020-05-24 · namics model (FDM). Conversely, a neural network model for the minimum torque-change criterion that

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: A neural network model for arm trajectory formation using … · 2020-05-24 · namics model (FDM). Conversely, a neural network model for the minimum torque-change criterion that

Neural Networks, Vo!.6, pp, 919-932, 1993Printed in the USA. All rights reserved.

ORIGINAL CONTRIBUTION

0893-6080/93 $6.00 + .00Copyright © 1993 Pergamon Press Ltd.

A Neural Network Model for Arm Trajectory Formation UsingForward and Inverse Dynamics Models

YASUHIRO WADA AND MITSUO KAWATO

ATR Human Information Processing Research Laboratories

(Received 25 May 1992; revised and accepted 14 January 1993)

Abstract-The minimum torque-change model predicts and reproduces human multi-joint movement data quitewell. However, there are three criticisms ofthe current neural network models for trajectory formation based on theminimum torque-change criteria: ( I ) their spatial representation oftime, (2) back propagation is essential, and (3)they require too many iterations. Accordingly, wepropose a new neural network modelfor trajectory formation basedon the minimum torque-change criterion. Our neural network model basically uses a forward dynamics model, aninverse dynamics model, and a trajectoryformation mechanism, whichgenerates an approximate minimum torque­change trajectory It does not require spatial representation oftime or back propagation. Furthermore, there are lessiterations required to obtain an approximate optimal solution. Finally, our neural network model can be broadlyapplied to the engineeringfield becauseit is a new methodfor solving optimization problems with boundary conditions.

Keywords-Minimum torque-change model, Forward dynamics model, Inverse dynamics model, Optimizationproblem, Trajectory formation.

1. INTRODUCTION

To control voluntary movements, one must solve thefollowing three computational ill-posed problems: ( 1)desired trajectory determination, (2) transformationof the task-space coordinates of the desired trajectoryto intrinsic body coordinates, and (3) motor commandgeneration to achieve the desired trajectory. One wayto solve these problems is to introduce a smoothnessperformance index. Twoexperimentally confirmed ob­jective functions for voluntary movement were pro­posed. Flash and Hogan (1985) proposed a mathe­matical model, the minimum-jerk model. They pro­posed that the trajectory followed by the subject's armstended to minimize the integral of the square of thejerk (rate of changeof acceleration)of the hand positionin Cartesiancoordinate space, integrated overthe entiremovement. The unique trajectory that yields the bestperformance is easily computed by applyingthe Euler-

Acknowledgements: Part of this work was done at ATR Auditoryand Visual Perception Research Laboratories. We would like to thankDr. Y. Uno for discussing this work. We wish to express our thanksto thank Dr. Y. Tohkura of ATR Human Information ProcessingResearch Laboratories for his encouragement. This work was sup­ported by a Human Frontier Science Project Grant to Mitsuo Kawato.

Requests for reprints should be sent to Yasuhiro Wada, ATR Hu­man Information Processing Research Laboratories, 2-2 Hikaridai,Seika-cho, Soraku-gun, Kyoto 619-02, Japan.

919

Lagrange equation because their model is based solelyon the kinematics of movement, independent of thedynamics of the musculoskeletal system. Several hard­ware models that can compute minimum-jerk trajec­tories have been proposed using recurrent neural net­works (Jordan, 1989; Massone & Bizzi, 1989; Hoff &Arbib, 1994)'

Based on the idea that the objective function mustbe related to dynamics, Uno, Kawato, and Suzuki( 1989) proposed a minimum torque-change modelthataccounts for desired trajectory determination. Themodel is based on the theory that the trajectory of thehuman arm is determined so as to minimize the timeintegral ofthe square of the rate of torque change. Be­cause the dynamics of the human arm or a roboticmanipulator are nonlinear, finding the unique trajec­tory based on the minimum torque-change model is anonlinear optimization problem. This is a rather dif­ficult optimization problem because the smoothnesscriterion is represented in the motor command spaceon the one hand. On the other hand, movement con­ditions such as target point locations, via-points, andobstaclesare represented in task-oriented coordinates.Thus, the optimization problem is computationallyvery intensive to be solved using the Euler-Lagrangeequation. Hitherto, two kinds of hardware models havebeen proposed. To generate a trajectory based on theminimum torque-change model, Kawato, Maeda, Uno,

Page 2: A neural network model for arm trajectory formation using … · 2020-05-24 · namics model (FDM). Conversely, a neural network model for the minimum torque-change criterion that

920 Y. Wada and M. Kawato

2. A NEURAL NETWORK FOR OPTIMALARM TRAJECfORY FORMATION

Here, (X, Y) are Cartesian coordinates of the hand,and [fis the movement time. Flash and Hogan (1985)showed that the unique trajectory yielding the best per­formance agreed with the experimental data on move­ment within the region just in front of the body. Theiranalysis was based solely on the kinematics of move­ment, independent of the dynamics of the musculo­skeletal system, and was successful only when formu­lated in terms of hand motion in extracorporeal space.

Uno, Kawato, and Suzuki (1989) proposed the fol­lowing alternative quadratic measure of performance.The objective function of the model is related to armdynamics:

and Suzuki (1990) proposed the cascade neural net­work, which is a cascade structure of the forward dy­namics model (FDM). Conversely, a neural networkmodel for the minimum torque-change criterion thatuses the inverse dynamics model (IDM) was proposedby Nakamura, Uno, Suzuki, and Kawato ( 1990). Thereare three criticisms of these neural networks: (1) theyuse a spatial representation of time, (2) back propa­gation is essential, and (3) they require too many it­erations to obtain an optimal trajectory. In this paper,we propose a new model for trajectory formation thatuses both the FOM and 10M. This model solves thethree shortcomings above, and can be implemented asa biologically plausible neural network. The proposednetwork model can be widely applied to the engineeringfield because it is a new method for solving generaloptimization problems with boundary conditions.

2.1. Minimum Torque-Change Criterion

This section briefly explains the minimum torque­change model. Trajectory formation is an ill-posedproblem because there is an infinite number of possibletrajectories by which the hand can move from the startto the target point. Therefore, a unique trajectory cannot be determined. However, humans can move thearm between two targets, selecting one trajectory fromamong an infinite number of trajectories. Therefore,the brain should be able to compute a unique solutionby attaching an appropriate criterion to the ill-posedproblem.

Flash and Hogan (1985) proposed the minimum­jerk modelthat is based on the kinematics of movement,independent of the dynamics of the musculoskeletalsystem. Their proposed performance index is the fol­lowing quadratic measure:

2.2. A Neural Network for Optimal Arm TrajectoryFormation Using Forward and Inverse Models

In this section, a new neural network model for trajec­tory formation is proposed. A performance index forthe movement between two targets is defined as thesum ofthe smoothness constraint energy multiplied by

where r! is the torque generated by the jth actuator ofM actuators, and {Jis the movement time. The objectivefunction is the sum of the square of the rate of changeof the torque, integrated over the entire movement. Theminimum torque-change model can predict and pro­duce human arm trajectories quite well. The optimi­zation problem is to find the torque that minimizes thecriterion CT' However, it is difficult to get an optimaltrajectory based on minimum torque-change becausetorque should be determined using complex nonlineardynamics. That is, a nonlinear optimization problemwith boundary conditions must be solved.

For movements between a pair oftargetsjust in frontof the body, predictions of both the models were closeto the experimental data. However, the trajectories pre­dicted by the minimum torque-change model werequite different from the minimum-jerk model in fourbehavioral situations. It was found that the minimumtorque-change model predicted the actual data betterthan the minimum-jerk model (Uno et al., 1989). Thefour situations were as follows: (1) discrete point-to­point movement: The starting point is an outstretchedarm to the side and the end point is in front of thebody, (2) movements between two points whileresistinga spring, one end of which is attached to the hand whilethe other is fixed, and (3 ) vertical movements affectedby gravity. In these three cases, the minimum-jerkmodel always predicts a straight path regardless of ex­ternal forces or gravity. On the other hand, the mini­mum torque-change model predicts a curved path andthese predictions are close to the experimental data.Finally, the most compelling evidence was examined:(4) a pair of via-point movements with identical startand end points, but with dictated mirror-image via­points. Because the objective function CJ of the mini­mum-jerk model does not vary under translation, ro­tation and roll, the minimum-jerk model predicts anidentical path for roll as well as identical speed profilesfor the two subcases. On the other hand, the minimumtorque-change model predicts two different trajectories.For a concave path, the speed profile has two peaks.However, for a convex path, the speed profile has onlyone peak. These predictions are close to the humandata (Uno et al., 1989).

However, the two objective functions, CJ and CT ,

are closely related because the rate of torque change islocally proportional to the jerk. If the arm dynamicsare approximated by a point mass system, the two per­formance indexes are identical.

(1)

(2)l lf M(dr J )2Cr = L - di,o J=I dt

= itf {( d3X )2 (d

3y )2}CJ d 3 + d 3 di.ott

Page 3: A neural network model for arm trajectory formation using … · 2020-05-24 · namics model (FDM). Conversely, a neural network model for the minimum torque-change criterion that

Arm Trajectory Formation

the regularization parameter A and three hard con­straints:

1 N M . . 1 M . .

E ="2 A L: L: (r1 - r1_d2 + '2 :L: (8~ - 8~)2;-1 )-1 j=1

+i ~ (8~-O{,)2+i ~ ((j~_ij{,)2, (3))-1 j=1

where r{ is the torque generated by thejth actuator ofM actuators at the time i, The performance index isformulated here using discrete time i, and N shows thefinal time. Let 0, il, and 0denote the position, velocity,and acceleration of the joint angle respectively; 8~,

il~, and O~ represent the desired position, desired ve­locity, and desired acceleration of the jth joint anglerespectively. 8~, 8~, and O~ are the position, velocity,and acceleration ofthejth joint angle at the end timeas predicted by the neural network model. The firstterm of eqn (3) is simply a discrete version of the min­imum torque-change criterion [eq n (2)]. The second,third, and fourth terms ofeqn (3) are hard constraintsregarding the movement conditions, that is, desirableposition, velocity, and acceleration at the end of themovement.

Here, the gradient descent of energy E [eqn (3)] iscalculated as follows:

dr{ _ 8E _ \) ) ) )d - - 8J - 1\(r 1+1 + r 1-1 - 2r1sri

921

Equation (5) shows the operation used to smooththe torque. The trajectory (0, il, 0) generated using thenew torque updated by eqn (5), does not usually satisfythe desired boundary conditions. Thus, an incremental,compensatory trajectory .6.0 is generated to cancel theerror in the terminal conditions. That is, .6.0 is generatedaccording to the following method, and the trajectoryo + .6.0 is obtained, that satisfies the terminal condi­tions. The trajectory .6.0 is computed by addressing alinear optimization problem instead of a nonlinear op­timization problem. When generating .6.0, the controlobject dynamics are approximated by a simple linearsystem. Furthermore, the trajectory is determined bysolvingthe linear optimization that minimizes criterionCr, whose terminal conditions are the terminal con­dition errors induced by eqn (5). The trajectory 0 +.6.0 then satisfies the terminal conditions. However, thetorque 7 and trajectory 0 + .6.0 do not satisfy the orig­inal nonlinear arm dynamics (neither FDM nor IDM).Here, the FDM and IDM are defined as the followingequations:

FDM: 0;+1 = F(0;, r,.)

IDM: r; = 1(0;)

• _ I 2 U "'2 'M "Iwhere 0; - «(); , 8;, '" , () i , () i , 0; , ... , () i , 0 i ,"2 "u _ 1 2 U0i, .. ·,()j ),Ti-(Tj,Ti, ... ,ri)'

Accordingly, the torque that satisfies both terminalconditions and nonlinear arm dynamics is obtained byusing the IDM:

wheres represents a relaxation time that is independentof movement time. The torque 71 that minimizes theperformance index E can be searched according to eqn(4). The second, third, and fourth terms of eqn (4)show that a time-backward calculation (error backpropagation) (Rumelhart et al., 1986) is needed to sat­isfythe desired conditions. In optimal control, this cal­culation is equivalent to solving the adjoint equationof the Euler-Lagrange equation. However, if the ter­minal conditions are satisfied, the time-backward cal­culation (error back propagation) is not needed, be­cause the second, third, and fourth term all vanish.Thus, if the terminal-condition errors can be madeequal to 0, the trajectory can be generated by justsmoothing the torque [the first term of eqn (4 )]. As­sume, for simplicity, that a trajectory at some momentsatisfies the terminal conditions. In this case, the re­laxation rule of eqn (4), that is, the gradient descentof energyE, is expressed as follows:

M OOk M ~k

+ L: (8~-Ot)----!!+ L (iJ~-iJ~)-2.k=1 Qr1 k=1 ar1

M '"k+ "\' (iF _ ijk)aO N

L, d Naj'k=1 r i

(4)

(5)

(6)

In this framework, the gradient descent of energy E iseasy to compute as it is simply expressed by the torquesmoothing term (eqn ( S) and error back propagationand spatial representation of time can be avoided.

The algorithm described above is shown in Figure1. Step 1: The torque and joint angle that satisfy theterminal condition and dynamics are calculated usingthe IDM, where 0 + .6.8 satisfies the terminal condi­tions. Step 2: The torque is smoothed. Step 3: The jointangle trajectory is generated from the torque smoothedin Step 2 through FDM. The trajectory does not satisfythe terminal conditions. The terminal-condition errorsare calculated. Step 4: Byfinding a solution to the linearoptimization problem, the compensatory trajectory ~0

which cancels the terminal-condition errors is obtained.The optimal trajectory based on minimum torque­change is obtained by repeating Steps 1-4.

Any initial trajectory can be chosen as the startingpoint of the calculation. However, if it is a good ap­proximation of the minimum torque-change trajectory,faster convergence is expected. Therefore, two kinds ofreasonable initial trajectories are used; one is a trajec­tory based on the minimum-jerk criterion (Flash et al.,1985). The other is O. When 0 is used as the initialtrajectory, the IDM in the first iteration ofthe proposedschema outputs the torque 0 over time. Therefore, the

Page 4: A neural network model for arm trajectory formation using … · 2020-05-24 · namics model (FDM). Conversely, a neural network model for the minimum torque-change criterion that

922 Y. Wada and M. Kawato

Step 3

FDM(Forward Dynamics Model)

HS(T)

Step 4 Step 2

Torque smoothing

Step 1

10M(Inverse Dynamics Model)

Approximated minimum torquechange for constraint error

S : smoothing operator

FIGURE 1. Neural network schema for arm trajectory formation using the forward dynamics model (FDM) and inverse dynamicsmodel (10M). Step 1 shows the 10Mwhose input is a trajectory that satisfies the boundary conditions, and output is a torque seriesthat satisfies the nonlinear arm dynamics. Torque is smoothed in Step 2. The terminal-condition errors are computed using theFOM (Step 3). In Step 4, the minimum torque-change trajectory for a linear apprOXimated dynamics model of the arm is derived.The output of Step 4 is the trajectory that satisfies the boundary conditions of the original nonlinear optimization problem.

hand position computed by the FDM in this iterationremains the initial position during movement time.Then, in the box "approximated minimum-torque­changefor constraint error" in Figure 1, a trajectory iscomputed by solving a linear optimization problemwith exactly the same boundary conditions as those ofthe nonlinear problem.

The smoothed torque is computed according to thenext equation, which is a discrete version of eqn (5).

. . dT~T~(S + 1) = 7~(S) + ds't..s

= des) + A(-r{+I(S) + .,.L(s) - 2·d(s», (7)

where lis is a time step of discrete time and assumedequal to 1.

In the following section, iteration of eqn (7) is usedas an example of the smoothing operation. The reasonfor this choice will be clarified in Section 5. Let k bean iterative computation index, the smoothed torqueis represented as follows:

des, k + 1)

= 71(s, k) + A(71+1(S, k) + 7~_I(S, k) - 2,d(s, k»

,d (s + 1) = 7~ (s, n + I) (8)

where k :=: 1,2, ... , n.Thus, as the number of iterative computations n in­

creases, the torque becomes smoother. If that numberis quite large, the torque approaches 0 over time. Al­though the above smoothing operation was used in thecomputer simulation, we note that a variety of smooth­ing methods can be applied to the proposed algorithm.

3. COMPUTER SIMULATION OF DISCRETEPOINT-TO-POINT MOVEMENT

This section presents the results of applying the pro­posed method to 2-joint arm trajectory formation. Inthis simulation, the following mathematical model ofFDM and IDM was used. It has already been dem­onstrated that both the FDM and IDM can be achievedusing neural networks (Kawato et al., 1987, 1990; Ka­wato, 1990; Jordan et al., 1992).

71 = (II + h + 2M2L\S2COS ()2 + M 2(L I )2)0,

+ (/2 + M2L1S2COS ()2)82

- M 2LIS2(Z81 + 82)82sin ()2 + biOI (9)

72 = (I2 + M2 L IS2COS 82 )0\

+M2L,S2(O\)2sin02+b202' (10)

M], L, , S, and I, represent the mass, length, distancefrom the mass center to the joint, and the rotary inertiaof the link i around the joint, respectively. Here, thesame physical parameter values as those in Uno's paperare used (Uno et a1., 1989). b, and 7j represent thecoefficients of viscosity and the actuated torque of thejoint i. Joints 1 and 2 correspond to the shoulder andthe elbow. Joint 1 is located at the origin of the X- Ycoordinates.

Two kinds of approximated dynamic arm modelsarm are used to calculate the compensatory trajectory.l8 in the computer simulation. The first model is alinear approximated model along the previous iterationtrajectory. Thus, the approximated dynamics is de­scribed by a linear differential equation with time­varying coefficients. In this case, the optimal trajectory

Page 5: A neural network model for arm trajectory formation using … · 2020-05-24 · namics model (FDM). Conversely, a neural network model for the minimum torque-change criterion that

Arm Trajectory Formation 923

is found by applying the Riccati equation (Bryson &Ho, 1975). The second model is a simple point-massmodel with time-invariant parameters and no inter­action between the joints. The minimum torque-changetrajectory for this model is equivalent to the minimum­jerk trajectory in the joint-angle-coordinate space. Thesecond model is a much poorer approximation thanthe first.

~I(O) = 0 Mtf) = M.

~I(O) = 0 ~I(tf) = t::.iJ 1

~2(0) = 0 h(tf) = M 2

b(o) = 0 ~2(tf) = t::.o2

1)1(0) = 0 1]1(tf) = t::.7 1

1)2(0) = 0 7/2 (tf) = t::.7 2 (13)

00000

a~(A) a~(A) a~(9) a~(A) a~(A) a~(A)

~ ao;- ae:- ao;- --a;:- a;;-ah(A) ali(A) ali(A) ali(A) ali(A) ali(A)ao:- fie;- ae:- ao;-~ BTZ

The linear approximated equation around the tra­jectory 0(t) = (81, 82 , 81,82 , 7 1, 72) , which is generatedby the smoothed torque 7 + S(7) = (71) 72), is de­scribed by eqn (12).

ddiX(t) = A(t)X(t) + BCt)U(t) (12)

XU) = (~ICt) ~2(t) ~ICt) ~2(t) TJl(t) TJ2(t))T

U(t) = (~I(t) ~2(t))T

ACt)

3.1. A Numerical Experiment Using the LinearApproximated Model

In this section, the optimality and convergence of thenew method are examined using the linear approxi­mated model. First, the analytical computation methodof the compensatory trajectory based on minimumtorque-change, is described using the linear approxi­mated model. Equations (9) and (10) are generallyrepresented by eqn ( 11).

d. . . .di O; =Ji(OI> O2, 01, O2, 7 1> T2) (I = 1,2). (11)

TABLE 1X-Y Coordinates of Initial

Target and Intermediate Points

A() I, 6.()2 represent the position errors at the end point,AO I , 6.02 represent the velocity errors at the end point,and A71' AT2 are the torque errors at the end point.These terminal condition errors are induced by thesmoothing operator S.

The minimum torque-change model is formulatedas the following optimization problem:

J = ~ f (UTQU) dt-« Min Q = (~ ~) (14)

The optimization problem of the linear system can besolved by applying the Riccati equation. As a result, acompensatory trajectory, that is an approximated min­imum torque-change trajectory, is generated.

The simulation conditions were as follows: (1 )Movement time: 0.75 s; (2) sample time: 0.01 s; (3)trajectory: from T2 to T6 (the x-y Cartesian coordi­nates of initial and target points are shown in Table 1);and (4) the number of smoothing operation iterations:n = 100. The convergence of the minimum torque­change criterion is shown in Figure 2. The x-axis is thenumber of iterative calculations, the y-axis is the cri­terion function value [eqn (2)]. Two initial values wereused in this simulation. One was a trajectory based onminimum-jerk (Hash and Hogan, 1985). The otherwas equal to O. The optimal value of this problem canbe calculated using a Newton-like method, that is, aniterative scheme to solve the two-point boundary-valueproblem (Uno et al., 1989). It is assumed that a nearlyoptimal solution is obtained by the Newton-likemethod. The criterion function value for the proposedneural network model converged near the optimalvalue, and the converged value ofthe performance index

o

oo

o

o

o

o

oo

o 1

o 0

oo

o 0

o 0

o

o

o

o 0

BCt) = (~

o

Here, let ~l, b denote positions, ~1' ~2 denote velocities,and 1]1,1]2represent torque respectively, The subscriptdenotes joint number. The position ()j + ~i' the velocityOi +L and the torque 7j + 1]1, (i = 1,2) satisfy theboundary condition of discrete point-to-point move­ment.

Therefore, X (t) satisfies the following boundaryconditions:

T1T2T3T4T5T6PiP2

X (em)

-0.92-24.33-19.94

0.1521.0921.24

0.151.31

Y (ern)

30.3630.8947.1158.9249.3332.6358.9336.96

Page 6: A neural network model for arm trajectory formation using … · 2020-05-24 · namics model (FDM). Conversely, a neural network model for the minimum torque-change criterion that

924 Y. Wada and M. Kawato

was obtained in less than 10 iterations. The convergedvalues for the two initial values were almost the same.Thus, the proposed method can produce a trajectoryclose to the minimum torque-change trajectory for theoriginal nonlinear arm dynamics in quite a small num­ber of iterations when the minimum torque-change so­lution of the linear approximated model is used as thecompensatory trajectory.

FIGURE 2. Convergence of the value of the minimum torque­change criterion when the new proposed schema is applied toa T2-T6 movement in front of the body using a 2-joint arm. Inthis simulation, the linear approximated model with time-variantparameters is used as a model to compute the compensatorytrajectory for satisfying the boundary conditions. Two kinds ofinitial trajectory are examined. One is the minimum-jerk tralee­tory. The other corresponds to 0 (no movement). Both resultsare almost the same. Satisfactory solutions are obtained thatapproximate the optimal solution computed using the Newton­like method.

a3 = {lO(~j(tJ) - ~;<O)) - 4~itJ)' tJ+ ~~)(tJ)' t}}lt}

a4 = {-15(~j(tJ) - ~;<O)) + 7k;<lj)' tr: VtJ)' t}}lt}

. r 2} 5as = {6(~itJ) - ~j(O)) - 3~;<tJ)' (J+ 2~)(tJ)' tJ Itf

where 'YJj, ~j> ~j, and ~j represent the torque, position,velocity and acceleration at the jth joint, respectivelyand I is the inertia of the link, a, (i = 0, ... , 5) areparameters that are determined by movement time andboundary conditions (position, velocity, and accelera­tion at the start and end point). The boundary con­ditions are the same as eqn ( 13); however, the boundarycondition for acceleration is given instead of the torqueconditions. In this simulation, mathematical equationsare used to obtain the compensatory trajectory. How­ever, it is known that a recurrent neural network canlearn approximate optimal trajectories (Massone et aI.,1989; Jordan, 1989; Hoff et aI., 1992). Furthermore,Hoff and Arbib ( 1992) have shown that the rigorousminimum-jerk trajectory can be generated using a re­current neural network.

The simulation conditions were as follows: (1)Movement time: 0.75 s (2) Sample time: 0.01 s (3)Trajectory: fivekinds ofmovement in front of the body(The start and end points are shown in Table I). Theparameters used in this simulation, and the number ofiterations to obtain a minimum for the objective func­tion, are shown in Table 2. The trajectories of fivemovements, the speed profile of the T4-T1 movement,and the torque profile of the T4-T1 movement areshown in Figures 3,4, and 5, respectively. Each trajec­tory was obtained in less than 5 iterations. The valueof the criterion function attained at the first minimumpoint during the iterative computation is shown in Ta­ble 3. Each objective function value of the trajectorywas close to the optimal value calculated using theNewton-likemethod. The number of iteration to obtainthe minimum objective function value is almost thesame in the case using the linear approximated model,

-

-

+30

I

I

25

III

I I I

10 15 20

ITERATIONS

I

5

I

- initial state =minimum jerk trajectory

'.".' initial state = 0 (no movement)1.5

1. 6 tt"

1.3 t- \\",":-"':'.:::••~••~-------------,Iresult of Newton-like method I----

1.4

1 .2 c:1+:....-_..L..._---.l__...I:-_~:__-='='-___::':

o

"a:wzw

3.2. A Numerical Experiment Using aPoint-Mass Model

In this section, a simple point-mass model is used. Thismodel is expressed as eqn ( 15). Here, the minimumtorque-change trajectory of a point-mass model [eqn( 15)] is represented as the 5th order polynomial intime [eqn ( 16)] , which is easily derived using the Euler­Lagrange equation. Because, in this case, the change intorque is exactly proportional to the jerk of the jointangle, eqn ( 16) is equivalent to the optimal solution ofthe minimum-jerk criterion in the joint-angle-coordi­nate space.

1]) = J)~) ( 15)

~j(t) = ao + a.t + azt2 + a3t3 + a4t4 + astS (16)

ao = ~JCO) at = 0, a2 = 0

TABLE 2Computer Simulation Parameters and Number of IterationsRequired to Calculate a Minimum for the Objective Function

of Each Movement

Number of Number ofMovement A Smoothings Iterations

T2-T6 0.3 30 5T3-T6 0.3 60 1T1-T3 0.3 30 2T4-T1 0.3 30 3T4-T6 0.3 30 1

Ais the smoothingoperator parameter. Numberof smoothingsshows the number of iterative computations in the smoothing op­eration. Number of iterations shows the number of iterative com­putations required to obtain a minimum for the objective functionpoint in the proposed model for trajectory formation.

Page 7: A neural network model for arm trajectory formation using … · 2020-05-24 · namics model (FDM). Conversely, a neural network model for the minimum torque-change criterion that

925

........... jerk method

- new method

... ,- Newton-like method

2

3

·2 Itorque 11 - new method

...... Newton-like method

-3 .......... jerk method

0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7

~

EzW:::J 0tF------.:"-"'=~~~~-----~oa:oI- -1

0.40.2-0.20.0�ooo_ 1..-__--"'-__--1 ---1

-0.4

Arm Trajectory Formation

0.8...... Newton-llke method

[ill0.6

[!I]

:[ 0.4>-

lID[IT) l!IJ

0.2

FIGURE 3. Trajectory of discrete point-to-point movements. Fivekinds of trajectory in front of the body using a 2-joint arm areshown. The origin of the X- Y coordinates represents the lo­cation of joint 1 (shoulder). The trajectory generated by theproposed method based on the minimum torque-change cri­terion, and that of the Newton-like method are compared.Unlikeminimum-jerk trajectories, neither trajectory is straight, butslightly convex.

TIME(s)

FIGURE 5. Torque profile of a T4-T1 movement in front of thebody. The torque profiles of the proposed model, the Newton­like method which is an optimal profile with respect to the min­imum torque-change model, and the minimum-jerk model arecompared. Torque 1 and torque 2 represent shoulder and elbowtorque, respectively. The profiles of the proposed model andthe Newton-like method are smoother than that of the minimum­jerk model.

TIME(s)

FIGURE 4. Speed profile of a T4-T1 movement in front of thebody. The speed profiles using the proposed model, the New­ton-like method which is an optimal profile with respect to theminimum torque-change model, and the minimum-jerk model,show a single-peaked, bell-shaped speed profile that agreeswith the speed profile observed in human arm movement. Theprofile of the proposed method is most similar to that of theNewton-like method. Thus, this method can generate a trajec­tory based on minimum torque-change.

4. AN EXTENSION TOVIA-POINT MOVEMENT

and the objective function value calculated using thelinear approximated model was closer to the optimalvalue. However, the values calculated using the point­mass model and the linear approximated model werenot so different. Each trajectory generated by the pro­posed method agreed with the hand paths calculatedusing the Newton-like method. For horizontal move­ment between two targets located approximately infront ofthe body, the minimum torque-change criterionpredicts hand paths that are not completely straight,that is, slightly convex. On the other hand, the handpaths based on the minimum-jerk model are completelystraight. Both models can also predict a single-peaked,bell-shaped speed profile of the hand. Accordingly, thehand paths generated by the proposed method wereslightlyconvex (Figure 3). This method also produceda single-peaked, bell-shaped speed profile (Figure 4).The torque produced by the proposed method wassmoother than that of the minimum-jerk model (Fig­ure 5).

4.1. A Point-Mass Model for Via-Point Movement

The proposed method is extended to generate a trajec­tory with via-points. In this case, the position con-

- new method

........... Newton-llke method

..... jerk melhod1.0

1.2 IT4-T11

Ul 0.8.....E>:!:: ,o 0.6 .0 .....I .W ..>

0.4 ,./,

0.2 ...

0.0 "0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7

Page 8: A neural network model for arm trajectory formation using … · 2020-05-24 · namics model (FDM). Conversely, a neural network model for the minimum torque-change criterion that

926 Y. Wada and M. Kawato

TABLE 3Value of the Minimum Torque-Change Criterion

MovementProposedMethod Newton-Like

MinimumJerk

T2-T6T3-T671-T3T4-T1T4·T6

1.3741.183

3.164 x 10-1

1.920 X 10- 1

7.515 X 10-1

1.2291.131

3.051 x 10-1

1.589 X 10- 1

7.156 X 10-1

1.5731.184

3.227 X 10-1

2.968 X 10- 1

7.814 X 10- 1

(t8 )

where

I ... 2= -32 (IOLl8via - IOLlOo - IOLlOO/via - 2A8ot via) ' (24)lvi.

(22)

(21 )

aJ- ..-=0.aLlOvia

r (d~)2J( AOvia, Aita) = [2 0 dt dt

= [' r (6b3 +24b4t + 60bst2

) 2 dt. (20)

For LiOviaand D-.Ovia to minimize eqn (20), the folIowingconditions are necessary:

Therefore, the velocity and acceleration at via-time areobtained as follows:

. _ I ... ,A8via - - (lOAOvia - IOA8o - 6A8otvia - AOotvia) (23)

4tvia

~(O) = ABo ~(tvia) = M via

~(O) = AOo ~(tvia) = AOvia

~(O) = AOo ~(tvia) = AOvia.

D-.0o, b.80 , and D-.80 represent the position, velocity, andacceleration errors for the boundary conditions, re­spectively, at time O. D-.{;Ivia, D-.Ovia, and D-.Ovia representthose errors at time V. However, the values of b.Ovia andLi8viaare not yet specified at this point and will begivenbelow. A solution to eqn ( 18) is given by eqn ( 19) sim­ilar to eqn (16).

W) = bo + bit + b2t' + b3t3 + b4t

4 + bst S (19)

where hi (i = 0, ... , 5) could be determined if theabove six constraints were to be given. Here, the torque­change criterion function of the point-mass model isexpressed as eqn (20). Since the values b.Ovia andD-.8viaare not given, eqn (20) is a function of the velocityand acceleration at via-time.

That is, the via-point time can be chosen and the ve­loeity and acceleration computed, so as to minimizethe torque-change criterion function for the approxi-

straints at the via-points are in addition to the problemof discrete point-to-point movement. The performanceindex of a via-point movement is defined as follows,instead of eqn (3). Trajectory formation using onlyone via-point is explained, however, it is easy to extendthis to a movement with several via-points.

l N M 1 M

E = 2. ALL (T~ - T~_1)2 + - L (8~ - 0-:V)2/-1 j-1 2 j_1

+ ~ ~ (O~ - 0-:V)2 + ~ ~ (O~ - (j~)2j=1 j~1

1 ~ . j 2+"2 LJ (O~ia - 0 v), (t7)j~1

where 8{ia represents the desired position at a via-pointof the jth joint angle. Oiv represents the position at Vtime (1 :s V s; N - 1) ofthejthjoint angle. There aretwo differences between discrete point-to-point move­ment and via-point movement. First, the movementtime between the start and the via-point is not given;however, the movement time is given for discrete point­to-point movement. Second, the velocity and acceler­ation constraints at the time passing through the via­point are not given. Thus, conversely, if the time passingthrough the via-point, the velocity, and the accelerationat the via-point can be determined, a trajectory withthe via-point can easily be generated using the samemethod for discrete point-to-point movements.

Here, the algorithm shown in Figure I is extendedfor via-point movements. The via-point time V for thecompensatory trajectory is chosen so that Lf!,l (O{ia ­{;IJV) 2 in eqn (17) is minimized after Step 3 (FDM) inFigure 1. The point-mass model is used as the approx­imate dynamics model for generating the compensatorytrajectory. The minimum torque-change trajectory ofthe point-mass model is expressed as eqn ( 16). Thus,the trajectory from the starting point to a via-pointcould be calculated if the movement time, position,velocity, and acceleration at the start and the Via-pointwere to be given. The torque-change criterion valuefrom time 0 to tvia when passing through the via-pointis expressed as follows:

rrvJ. (d~)2J = [2 Jo dt dt - Min

Page 9: A neural network model for arm trajectory formation using … · 2020-05-24 · namics model (FDM). Conversely, a neural network model for the minimum torque-change criterion that

Arm Trajectory Formation 927

0.8

0.2-new method- - - Newton-like method.......... ierk method

0.40.20.0X (m)

-0.2-0.4

0.0

of the via-point. In this case, the two via-points PI andP2 were located symmetrically with respect to the lineconnecting the common start and end points. Theminimum-jerk model predicted identical speed profilesfor both cases; however, the minimum torque-changemodel predicted two different profiles: that for via-pointPI had only one peak; however, that for P2 had twopeaks (Uno et aI., 1989). The speed profile for via­point P2 predicted by the proposed method had twopeaks, as shown in Figure 7. These simulation resultsshow that the proposed method for via-point movementcan generate approximately a trajectory passingthrough via-points based on minimum torque-changecriterion in only several iterations. We emphasize thatthe objective functions obtained by the proposedmethod are much smaller than those of the minimum­jerk model (Table 5). If one compares Table 3 andTable 5, it is suggested that the new method is moreefficient for complex movements than it is for simplemovements.

0.6

.s 0.4>-

FIGURE 6. Trajectory of via-point movement in front of the bOdy(T3-P1-T5 and T3-P2-T5). The trajectories of the proposedmethod and the Newton-like method, which are based on theminimum torque-change criterion, are compared to that of theminimum-jerk model. The three trajectories are almost identical.

4.2. Numerical Experiments

The simulation conditions were as follows: ( 1) Move­ment time: 1.0 s; (2) sample time: 0.01 s; and (3) startpoint T3, end point T5, and via point PI or P2 (Thex-y Cartesian coordinates of each point are shown inTable I). The iteration number for smoothings, thesmoothing parameter A, and the number of iterationsneeded to reach the first minimum for the objectivefunction, are shown in Table 4. The minimum-jerktrajectory in the Cartesian coordinates was chosen asthe initial trajectory. The trajectories of the two move­ments, and the speed profiles of T3-P2-T5 are shownin Figures 6 and 7 respectively. Minimal values of thecriterion function for the three schemes (proposedmethod, Newton-like method, and minimum-jerkmodel) are shown in Table 5. Each trajectory was ob­tained in less than five iterations, and the minimumvalue of the criterion function was close to the optimalvalue obtained using the Newton-like method. Thehand paths generated using the proposed method werealmost the same as the hand paths of the Newton-likemethod (minimum torque-change model), and theminimum-jerk model. These two models can predicta curved hand path with a single-peaked or double­peaked speed profile, and this depends on the location

TABLE 4Computer Simulation Parameters and Number of IterationsRequired to Calculate a Minimum for the Objective Function

of Each Movement

mate model. Furthermore, the algorithm describedabove can be applied to trajectory formation with morethan one via-point. Because the algorithm can deter­mine the velocity and acceleration errors at the firstvia-point using only the error of the position, velocity,and acceleration at the start point, the velocity and ac­celeration errors at the second via-point can be similarlydetermined from errors at the first via-point. Equationssuch as (23) and (24) can be derived straightforwardly.The reason for this straightforward extension to mul­tiple via-point cases is that only the objective functionfrom the start point to the first via-point is consideredin eqn (20) and the latter half of the integral is nottaken into account.

Ais thesmoothing operatorparameter. Number of smoothingsshowsthenumberof iterativecomputations in the smoothing op­eration. Number of Iterations shows the numberof iterative com­putations required to obtaina minimum for the objective functionin the proposed model for trajectory formation.

Movement

T3-P1-T5T3-P2-T5

0.30.3

Number ofSmoothings

3030

Number ofIterations

33

5. MATHEMATICAL CONSIDERATIONS OFTHE PROPOSED METHOD FOR

NONLINEAR OPTIMIZATION PROBLEMS

In this section, the proposed network is formulated asa general optimal algorithm, and the theoretical frame­work for this method is described. The optimality andconvergence of the new method applied to a nonlinear

Page 10: A neural network model for arm trajectory formation using … · 2020-05-24 · namics model (FDM). Conversely, a neural network model for the minimum torque-change criterion that

928 Y. Wada andM. Kawato

1.2 _--.,..-----r----.,...---,---.....

TIME(s)

FIGURE 7. Speed profile of a T3-P2-T5 movement. The speedprofile based on the minimum-jerk criterion has only one peak,and in a T3-Pi-T5 movement it also has only one peak. Thespeed profile based on the minimum torque-change criterionhas one peak in a T3-Pi-T5 movement, however, in a T3-P2­T5 movement, it has two peaks that correspond to the handtrajectory observed in human arm movement. Thus, in a T3­P1-T5 movement, the speed profiles based on the two kindsof criteria are not particularly different, but in a T3-P2·T5 theyare quite different, as shown in this figure.

where x and u represent a state variable and a controlvariable, respectively. xdrand trrepresent a desired ter­minal value and an end time, respectively. It is assumedthat the nonlinear function fis differentiable with re­spect to x and u.

Let us first illustrate the iteration rule of the controlvariable by generalizing the neural network model pro­posed in Section 2 as follows. We define an inverse dy­namics model G, that is an inverse function of the for­ward dynamics model F.

G(X) = u,

where (x, x) = X = F(u).The control variable at the j + 1th iteration, as shown

in Figure 1, is output by IDM, whose input is computedby adding the trajectory generated by the smoothedtorque at the jth iteration to the compensatory trajec­tory generated by an approximate linear dynamicsmodel. Thus, the control variable at the j + 1th iterationis derived as follows:

u j +1 = G(F(u j + S(u j » + Z), (29)

where u! + S( u j ) is the smoothed torque, and E = (~,

~) is the trajectory that compensates for the terminal­condition errors induced by u! + S( u j

) and is the so­lution for the linear optimization problem, defined be­low. The following equation is obtained using Taylor'sexpansion of function Fin eqn (29) around u',

u j +1 = G( F(u j ) + a~~tl) S(u j ) + o(S(uj » + Z) .Taking Taylor's expansion of G around X! = F(u i ) :

+1 . aG(Xj) (aF(U j) . . )

u! = ul + --ax-~ S(u J) + o(S(uJ»

aG(Xj) .....+----ax-~

. . aG(Xj) . aG(xJ)= u! + S( uJ

) +--ax- o( S( uJ» + ----ax E,

where a higher order term than the second order termis ignored because the trajectory change between thejth and thej + lth iteration is usually quite small. Thefourth term in the above equation is expressed as -'Y/j.1]) is obtained as the optimal solution to the followinglinear problem. If S( u j

) is small, the following equationis derived because o(S( u j ) ) becomes negligibly small.

u j +1 = u! + S(u j ) - 7/j •

(25)

(26)

(27)

(28)

1.00.80.60.4

dxdt = lex, u),

x(O) = 0, X(lf) = Xd!,

u(O) = 0, u(tf) = 0,

0.2

- new method. - - Newton-like method......... Jerkmethod1.0

0.4

0.0-----'----'----'----......--...0.0

0.8

0.6

0.2

Subject to

optimization problem is discussed mathematically inSections 5.1 and 5.2. The following discussion is forthe scalar case, but it can easily be extended to multi­variable cases.

First, we define the following nonlinear optimizationproblem which minimizes the criterion function J un­der boundary conditions:{Nonlinear Optimization Problem: N]

J(dU)2J= dt dt >» Min

In

E>­!::oo..JW>

TABLE 5Value of the Minimum Torque·Change Criterion

MovementProposedMethod Newton-Like

MinimumJerk

T3-P1-T5T3-P2-T5

3.850 X 10-1

4.348 X 10-13.168 X 10-1

3.322 X 10-16.709 X 10- 1

6.323 X 10-1

Page 11: A neural network model for arm trajectory formation using … · 2020-05-24 · namics model (FDM). Conversely, a neural network model for the minimum torque-change criterion that

Arm TrajectoryFormation 929

However, it is not assumed that S( uJ) is small in thecomputer simulation and mathematical considerationdescribed below. Therefore, we define the iteration ruleof the control variable as follows:

where,

HO) = 0, WJ) = t1x;

1/(0) = 0, 17(f:r) = S(U*)(tf)' (36)

(30)

where S(u i) is siur, + [oG(Xi)/aX]o(S(u i».Note that the two following trajectories satisfy ex­

actly the same boundary conditions.

aF(u j ) . .-- S(u}) + o(S(u}»

au,..,

-><,.....

The linear optimization problem is defined to com­pute a trajectory that compensates for boundary con­ditions.[Linear Optimization Problem: L]

J~ = J(~;r dt ... Min, (31)

Subject to

d~ aj(x j , u j ) alex}, uj )-;Ii = ax ~ + au 1/ (32)

~(O) = 0, WJ) = ~xj(tf) (33)

11(0) = 0, 1/(tJ) = S(Uj)(tf)' (34)

Here, xi, u! are the trajectory and the control variableat the jth iteration, respectively. ~xj(tJ) shows the ter­minal-condition errors when the input u! + S( ui ) isfed to the nonlinear dynamics eqn (26). Let (u*, x*)be an isolated optimal solution that minimizes the cri­terion function J, and let (1/*i, ~*J) be an isolated op­timal solution that minimizes the linear optimal prob­lem L. Thus, ~*i is the compensatory trajectory.

5.1. Optimality of the Converged Solution

In this section we will show that the convergence of theproposed algorithm is equivalent to the optimality ofthe solution. First we discuss the necessary condition:if u! is equal to u* (u l =: u*), then 'IJ*lbecomes equalto S(u"') (1/*i = S(ui)), thus the iteration converges.That is, if the control variable is equal to the optimalvalue, the iterative algorithm eqn (30) converges. Con­versely, we then discuss the sufficient condition: if S( ul )is equal to 'IJ*i ('IJ*i = S( ui)), then u! becomes equalto u* (u l =: u*). That is, if the control variable con­verges, the control variable u! becomes the optimal so­lution.

A linear dynamic equation around the optimal so­lution is represented by the following equation:

d~ _ al(x*, u*) + al(x*, u*)-;Ii - ax ~ au '11, (35)

It is evident that the above necessary condition isequivalent to the following Lemma 1.

LEMMA 1. The optimal solution 1/* lor the linear op­timization problem [L] at xi = x* is equal to S(u*).Proof We use reductio ad absurdum. Assume that theoptimal solution 1/* for [L] is not equal to S( u*)( 1/*+ S( u* )). The following shows that this assumptioncontradicts the assumption that u* is the optimal so­lution for [N]. The following new control variable isconstructed for [N].

tl=u*+e(S(u*)-71*) (Iel~l). (37)

If e is small, a- u" is approximated by the solutionto the linear eqn (35) with the input f(S(U*) - 1/*).Furthermore, the boundary conditions for S(u*) and1/* are exactly the same with that of [L]: according toeqn (36). Therefore, i1 satisfies the boundary conditionseqns (27) and (28), x(O) = 0, x(tJ) = Xd!, u(O) = 0,and U(lJ) = O. If the following inequality is shown, thisproof is completed, as the inequality contradicts theassumption that u* is the optimal solution for [N].

J(d£i)2 J(dU*)~dt dt < dt dt.

The next equation is obtained by ignoring the term E2•

f(~~Y dt= fl~{u* +E(S(U*)-71*nY dt

;; f(d~t* Ydi

+2 fdU*(dS(U*)_dll*)d (38)€ dt dt dt t.

Let I be an integral of the second term on the right­hand side of eqn (38). e's sign can be determined ac­cording to I's sign. That is,

if I > 0 then € < 0

if I < 0 then e > O.

Thus the required inequality can be derived.Here, if I=: 0, because [dS( u*)/dt] - (d1/*/ dt) is notalways equal to °over time ('.: 1/* +S( u*)), i1 is alsooptimal to the first approximation. Therefore, U* doesnot become an isolated extreme value in problem [N],because the first variation ofthe criterion function cor­responds to 0 in this case. (Q.E.D.)

The following Lemma 2 is equivalent to the sufficientcondition.

LEMMA 2. IIS( u i) is equal 1/*i (7/*i = S( ui)), then u!becomes equal to u* (u l =: u*).

Page 12: A neural network model for arm trajectory formation using … · 2020-05-24 · namics model (FDM). Conversely, a neural network model for the minimum torque-change criterion that

(40)

930

Proof We use reductio ad absurdum again. Let us as­sume that the control variable u! is not the optimalsolution for [N] (u l =1= u*), even when the iterationconverges.

The following new control variable is constructedfor [L].

~ = S(u l ) + €(u" - u'), (39)

First, it will be shown that 7j satisfies the boundary con­ditions of [L]. The next equation shows the dynamicequation at the optimal point.

dx"----;j[=j(x*, u*).

Equation (41 ) is the linear approximated equation ofeqn (40) around u l where u* - u ' ,

d~ = aj(xl , til) ~ + aj(x}, ul) (u* _ ul) . (4\)m ax au

Then, both u* and u! satisfy the final conditions x(tj)=xdjof the dynamic eqn (26) , so ~, that is, the differ­ence between the trajectory generated by u" and thatby u', satisfies the condition ~(tf) = 0. Furthermore,~ satisfies the boundary conditions ofeqn (33) becauseS( ul ) satisfies the boundary conditions of eqn (33).Moreover, 7j satisfies the boundary conditions of eqn(34) according to the following conditions:

1l*(0) = u*(tf} = lll(O) = Ul(tf) = O.

Equation (43) can be derived by using eqn (42) andignoring the term e2 •

d~ = dS(lIl) + €(dU* _ dUl) (42)

dt dt dt dt

f (~;r dt ~ f (dS~tUl)r dt

f dS(ul) (dU* dllJ ) d+2€ -d-t- &-dt t . (43)

Let I be an integral of the second term on the right­hand side of eqn (43) . E'S sign can be determined ac­cording to I's sign in the same manner as the previousproof.

if J > 0 then e < 0

if J < 0 then E > o.Therefore , the required inequality is obtained:

f (~;r dt < f (dS~~J)r dt .This contradicts the basic assumption that r;* l =S(u l)is optimal for [L].If! = 0,

dll* du l---

dt dt'

is not always equal to 0 over time. Therefore, 'rj....J

Y. Wada andM. Kawato

does not become an isolated extreme value for[L]. (Q .E.D.)

5.2. Convergence of the Solution

This section discusses the monotone convergence ofthe criterion function . We ask whether the proposedalgorithm decreases the criterion function for every it­eration (J (u l +1

) ~ J(ul ».The criterion function at the j + 1th iteration is

calculated using eqn (30 ).

J( u l + l) - J( u l )

= 2 J(duJ + dS(ul))(dS(U

J) _ d7l l) dt

dt dt dt dt

- J(dS(Ul) + d7l l)(dS(UJ

) _ d7l l) dt. (44)

dt dt dt dt

Also, the smoothing operator S in Sections 2, 3, and 4is represented in continuous time as follows:

. d2u{k)U{k+l ) = uhl + A----;R2 (k = 1,2, . .. , n A < I) ,

where n is the number of smoothings. For very largen, we assume that u! + S( ul ) is quite small. This isthe most important condition for convergence. We notethat this is not a reasonable assumption when the op­timal solutions to [N] and [L] are very different. Onthe other hand when u! has not converged yet, then

S(U J} - 71 l ", O [·:eqn(30)].

Furthermore, S( u l ) + 7]j = u! + S( ul ) - (u! - 7]1) isnot small since u! - 7]1 is not smalL Because the firstterm of the right-hand side of eqn (44) becomes neg­ligibly small compared to the second term, it Can beignored.

J( u l+! ) - .l(ul ) =J(~7 dt> J(dSd:1i)y dt ,

The following inequality holds because r;1 is optimalfor [L]

f (~lr dt - f (dS~tUl)r dt ~ O.

Finally, the warranted inequality that shows the mono­tone convergence is derived.

J(U)+I ) ~ J(uJ) .

Because J(u) is lower bounded, it converges to theminimum value.

6. DISCUSSION

A new neural network model for trajectory formationwas developed which basicallyuses a forward dynamicsmodel, an inverse dynamics model, and an approximate

Page 13: A neural network model for arm trajectory formation using … · 2020-05-24 · namics model (FDM). Conversely, a neural network model for the minimum torque-change criterion that

Arm Trajectory Formation

TargetMovement I-ill-[ End

velocityTime (start)Forward f+ positionacceleration Dynamics velocityModel I. accelera tion

Target (final) + i(Time?~l4rlacceleration olJ

velocity .::l-... -'- 6-~¢-'

via-position "'"Via-points SpositionI~~ 1a.timevia-position Time Torque

•r , Search Smoothing

Approximated MinimumTorque Change Model Cl.l

'& =r E'

acceleration ~+~ Inverse S

velocityDynamics ~

,..:f- + Modelposition 1-

~----_._-----_._.__._-_..._....-...._.__..__..._-_..._-_......._..._..-.._-_.

931

FIGURE 8. Neutral network structure of arm trajectory formation. The five main parts, that is, the forward dynamics model, inversedynamics model, approximated minimum torque change model, torque smoother and via-points Time Search, can all be implementedas neural networks. In this tigure, the tlow of the signals is illustrated in detail. The inputs of the proposed neural network arepositions, velocities and accelerations at the start and the end times, movement time, and via-point positions. The position, velocity,and acceleration time courses that satisfy the boundary conditions are fed Into the Inverse dynamics model and it outputs thetorque. The smoothed torque is ted into the forward dynamics model. It outputs the time series of the velocity and acceleration,and then the errors of the position, velocity and acceleration at the end point are computed when compared with the desired values.The position and velocity at the end point are calculated using integral operations from the start time to the end time. The via-pointpositions and via..point passing times are solved in the vla·polnts time search. All the boundary condition errors are fed intoapproximated minimum torque change model. It outputs the compensatory trajectory. Finally, the position, velocity, and accelerationthat satisfy the boundary conditions are computed by adding the compensatory trajectory to the output trajectory of the forwarddynamics model. t1Tshows the time step in computation.

minimum torque-change trajectory generation mech­anism. This model can solve the three difficult criti­cisms relating to the cascade, and other neural net­works. These are: ( I ) they use a spatial representationof time, (2) back propagation is essential, and (3) toomany iterations are required to obtain the optimal tra­jectory. However, this model does not use a spatial rep­resentation of time. It does not require back propaga­tion in iterative computation. It needs only an approx­imate linear model and an IDM that is an inversefunction of the FDM to satisfythe boundary conditions.Furthermore, this model can reach an optimal solutionin a few iterations as shown in the computer simulation.Accordingly, the new neural network model solves thethree criticisms of the previous models.

In the mathematical proofand computer simulation,a smoothing operator that needs many iterative cal­culations was used. Such an iterative smoothing oper­ator is not necessary, however, and one-shot-typesmoothing filters can easily be designed.

We emphasize that the entire model can be imple­mented using neural networks. The model consists offive main parts; an FDM, an IDM, an approximatedlinear model, a smoother, and a via-point seeker. Asalready mentioned, an FDM can be obtained usingJordan's recurrent neural network (Jordan et al., 1992).Kawato et al. (Kawato et al., 1987;Kawato, 1990) have

already reported that an IDM can be obtained by neuralnetwork learning. The torque smoother and via-pointseeker are simple enough to be implemented as neuralnetworks. In the case of the point-mass model, theminimum torque-change trajectory can be expressedas the minimum-jerk trajectory in the joint angle space.Hoff and Arbib ( 1992) have already pointed out thata minimum-jerk trajectory can be obtained using a re­current neural network when the final velocity and ac­celeration are equal to O. It is easy to extend their modelto general conditions, so that the velocity and accel­eration at the start and end points are not necessarilyO. These conditions are needed in via-point trajectoryformation. That is, the minimum-jerk trajectory underthe above conditions can be defined using the followingdynamics:

ddt 0 = A0 + Be v . (45)

A=[ ~ 6 ~]-60/D3 -36/D 2 -9/D

B=[ ~ ~ ~]60/D 3 -24/D2 3/D

0== (e e ~)T

e, == (eo eo 80 ) T,

Page 14: A neural network model for arm trajectory formation using … · 2020-05-24 · namics model (FDM). Conversely, a neural network model for the minimum torque-change criterion that

932

where v shows the end point and Bv , O,H and ev definethe given position, velocity, and acceleration at the endpoint. D represents the remaining movement time. Ac­cordingly, when Ov = 0 and 8v= 0, the above equationcorresponds to the dynamic equation proposed by Hoffand Arbib ( 1992). Thus, the point-mass model can beobtained using a recurrent network in the same manneras Hoff and Arbib. The detailed model for trajectoryformation and its five main parts obtained usingneuralnetworks, are shown in Figure 8.

The new method is a general method for nonlinearoptimization problems with boundary conditions. Ithas several advantages in engineering applications. Forexample, it does not require an inversion of matricesin iteration, it reaches an optimal solution in a shorttime, and the output after passing through the IDMalwayssatisfies the boundary conditions and dynamicsequation. Therefore, it can be extended to many otherengineering problems.

In the future, the potential applications of this pro­posed model to a minimum muscle-tension-changemodel and a minimum motor-command-change modelwill be studied. The efficiency of the proposed modelwill be checked when the IDM is not a perfect inversefunction of the FDM. Furthermore, it is expected thatthis trajectory formation model can be used as a patternrecognition network because a kind of duality existsbetween pattern formation and recognition in thisframework (Kawato, 1989).

REFERENCESBryson, A. E., & Ho, Y. C. (1975). Applied optimal control. New

York: Wiley.

Y. Wada andM. Kawato

Flash, T., & Hogan, N. (1985). The coordination of arm movements:An experimentally confirmed mathematical model, The Journalof Neuroscience, 5(7), 1688-1703.

Hoff, B., & Arbib, M. A. ( 1992). Models oftrajectory formation andtemporal interaction of reach and grasp. Manuscript submittedfor publication.

Jordan, M. 1., & Rumelhart, D. E. (1992). Forward models: Super­vised learning with a distal teacher. Cognitive Science, 16(3), 307­354.

Jordan, M. 1. (1989). Indeterminate motor skill learning problems.In M. Jeannerod (Ed.), Attention and performance, XIII. Cam­bridge, MA: MIT Press.

Kawato, M., Furukawa, K., & Suzuki, R. (1987). A hierarchicalneural-network model for control and learning of voluntarymovement, Biological Cybernetics, 57, 169-185.

Kawato, M. ( 1989). Motor theory of speech perception revisited fromminimum torque-change neural network model. Proceedings ofthe8th Symposium on Future Electron Devices. 141-150.

Kawato, M., Maeda, Y., Uno, Y, & Suzuki, R. (1990). Trajectoryformation of arm movement by cascade neural network modelbased on minimum torque-change criterion, Biological Cyber­netics, 62, 275-288.

Kawato, M. (1990). Computational schemes and neural networkmodels for formation and control of multijoint trajectory. In T.Miller, R. S. Sutton, & P. J. Werbos (Eds.), Neural networks forcontrol (pp. 197-228), Cambridge, MA: MIT Press.

Massone, L., & Bizzi, E. ( 1989). A neural network model for limbtrajectory formation. Biological Cybernetics, 61,417-425.

Nakamura, M., Uno, Y, Suzuki, R., & Kawato, M. (1990). Formationof optimaltrajectory in arm movement using inverse dynamicsmodel(Technical Report NC89-63) Japan: IEICE (in Japanese).

Rumeihart, D. E., Hinton, G. E., & Williams, R. J. ( 1986). Learningrepresentations by back propagating errors. Nature, 323 (9), 533­536.

Uno, Y., Kawato, M., & Suzuki, R. (1989). Formation and controlof optimal trajectory in human arm movement-minimum torque­change model. Biological Cybernetics, 61, 89-101.