
Research Article

Gait and Trajectory Optimization by Self-Learning for Quadrupedal Robots with an Active Back Joint

Ariel Masuri, Oded Medina, Shlomi Hacohen, and Nir Shvalb

Mechanical Engineering, Ariel University, Science Park 3, Ariel 40700, Israel

Correspondence should be addressed to Shlomi Hacohen; shlomiha@ariel.ac.il

Received 13 January 2020; Accepted 18 May 2020; Published 10 June 2020

Academic Editor: L. Fortuna

Copyright © 2020 Ariel Masuri et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This paper presents an efficient technique for a self-learning dynamic walk for a quadrupedal robot. The cost function for such a task is typically complicated, and the number of parameters to be optimized is high. Therefore, a simple optimization technique is of importance. We apply a genetic algorithm (GA) which uses real experimental data rather than simulations to evaluate the fitness of a tested gait. The algorithm actively optimizes 12 of the robot’s dynamic walking parameters. These include the step length and duration and the bending of an active back. To this end, a simple quadrupedal robot was designed and fabricated in a structure inspired by small animals. The fitness function was then computed based on experimental data collected from a camera located above the scene, coupled with data collected from the actuators’ sensors. The experimental results demonstrate how walking abilities improve in the course of learning and indicate that an active back should be considered to improve walking performance.

1. Introduction

The main advantage of legged robots over their wheeled counterparts is their ability to overcome obstacles such as stairs and hard outdoor terrains. For example, in [1], the authors presented a continuous free gait generation method for quadrupedal robots walking on rough terrains, based on the center of gravity (CoG) trajectory planning method. Upright robots can manipulate objects and interact in a human environment using their hands [2]. Nevertheless, dynamic walking gaits are harder to achieve for these robots than for quadrupedal ones (see [3, 4] for a survey on robot learning from demonstrations). Also, note that there is a clear distinction between bipedal and quadrupedal mechanisms in the literature. The term gait, which is often used in the characterization of legged robot walking patterns, refers to the patterns of the limbs during locomotion on a solid surface.

Running and walking gaits differ by the interaction time with the ground. For walking, the interaction period with the ground is more than 50% of the entire gait cycle [5]. Quadruped and hexapod robots have stability and speed advantages over bipedal robots (see, for example, [6], where Juarez-Campos et al. implemented a Peaucellier–Lipkin mechanism for a hexapedal robot). Moreover, the trafficability of these robots increases with their degrees of freedom [7–9].

An important approach in designing biped walking gaits is the zero moment point (ZMP) [10]. It is defined as the point on the floor where equilibrium is established while the horizontal component of the reaction moments vanishes. The main idea is that when the ZMP is within the convex hull of the contact points between the feet and the floor (the support polygon), stable walking may be maintained. For example, see [11], which used particle swarm optimization (PSO) in order to maintain stable walking, and [12], which used PSO for quadruped gait adaptation using terrain classification and gait optimization.

The reader is referred to [13] for some additional stabilization methods. Controlling a real-life walking robot according to its dynamic model alone is difficult (see [14]). The associated dynamics is complicated, with a multidimensional state vector, and is basically nonlinear and time-variant. Moreover, uncertainties in the robot’s model and state add complexity (see [15]).

A considerable portion of research in the field aims to ease calculations and enhance the robustness of walking. Explicitly, improvement is aimed at [16] (1) trajectory planning, (2) achieving a ZMP-optimal walking gait, (3) calculating the ZMP’s position feedback-force system, and (4) planning reference trajectories of the center of gravity (CoG) of the body. Many researchers apply reinforcement learning to calculate the CoG of the robot, such as [17], which used reinforcement learning to enhance posture stabilization by exerting random disturbances. In [18], the researchers implemented a hybrid adaptive fuzzy dynamic evolutionary neural network technique. Their performances are demonstrated in simulations.

Hindawi, Journal of Robotics, Volume 2020, Article ID 8051510, 7 pages, https://doi.org/10.1155/2020/8051510

The genetic algorithm is yet another approach for optimizing a set of parameters according to an objective function. For biped motion control, Taherkhorsandi et al. [19] presented a sliding control which was used to optimize an adaptive robust hybrid PID controller, while a GA was applied to select the controller’s coefficients (from the Pareto front). Kato et al. [20] presented a research study where they used a GA in a simulation for gait optimization and implemented their results in a real model. Some of the gaits generated by the GA in the simulated environment performed well on the real model, though not all did. Therefore, it is preferable to apply a GA to real-world experiments. This is the main aim of this paper.

In real implementations, it is most common to close the loop using sensors. Loffler et al. [21] used IMU sensors and encoders for every motor shaft and also applied a 6-axis force-torque sensor placed in the robot’s foot. As expected, they reported an overall improvement in walking and jogging. The same strategy is used for recent quadrupedal robots such as HyQ2Max and HyQ2Centaur [22], Cheetah 2 [23], and the well-known BigDog [24], LittleDog [25], and WildCat [26]. These robots also map out their environment by using infrared cameras, retroreflective markers, and range sensors. Cameras may be used for position and velocity sensing. A camera may be placed on the robot or, alternatively, used as an external sensor, and data on the robot’s performance are gathered during its motion. Obviously, using cameras as sensors in the gait-optimizing procedure requires real-time techniques for analyzing the video stream (see [27, 28]).

The back joint (spine) was investigated in [29], where the researchers presented simulations showing that an active back is a key driving factor for the improvement of speed and cost of transport in quadrupeds. In [30], Khoramshahi et al. presented a simplified (wheeled) locomotion system which is behaviorally and structurally similar to a galloping quadruped and showed that fast locomotion requires a flexible spine.

The examples presented above involve sophisticated robots in terms of their sensory equipment and mechanical structure (at least 3 degrees of freedom per leg), which lead to impressive trafficability and load-bearing abilities. In this work, we use a machine learning approach based on a genetic algorithm (GA) for improving a four-legged robot’s walking abilities. Such an approach was introduced in [31] for quadrupedal robots with a rigid back, where the researchers optimized the forward speed as their objective function. Here, we extend their work to the case where an active back is present by optimizing the “straightness” of movement coupled with the power efficiency. After optimizing these gaits, the robot will have the ability to track a predefined path, such as in [32].

This paper is organized as follows: in Section 2, we introduce the mechanical concept of our quadrupedal robot. In Section 3, we specify the initial gaits for the first generation and explain the genetic algorithm which we use. Section 4 provides a description of the experimental setup and the optimization results. We conclude the paper in Section 5.

2. Mechanical Design

Nine servo motors were incorporated into the robot’s structure (Figure 1): two MG90S microservos per leg and one VEX-EDR-393 for the back joint (see Figure 2(b)). The robot’s body was designed and built as a two-dimensional platform of approximately 200 mm length made of 4 mm thick Perspex. Each leg was connected to the robot body by a shaft with torsion springs (see Figure 2(a)). These are used for maintaining elasticity during the robot’s movement. The robot’s body consists of two parts connected by a shaft actuated by a servo motor that enables the back curvature (see Figure 3).

The leg was designed as a five-bar mechanism (see Figure 2(b)), that is, a two-DOF leg. The advantage of such a structure is that the motors are mounted on the robot’s skeleton rather than on its joints (see also [33]). This design reduces the leg’s physical volume and mass. In addition, such a design enables scaling the motors without changing the leg’s structure and its moment of inertia [34]. To enable a proper foot path $(x(t), y(t))$, where $t \in [0, 1]$ (Figure 2(b)), one solves the inverse-kinematics problem. This yields the motors’ angles $\theta_1(t)$, $\theta_2(t)$ for a single cycle duration.
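The inverse-kinematics step can be illustrated with a minimal Python sketch. It assumes a symmetric planar five-bar leg with both motors fixed to the body; the link lengths and motor spacing are placeholder values rather than the robot’s actual dimensions, and the function names are hypothetical.

```python
import math

def proximal_angle(base, foot, l_prox, l_dist, elbow_sign):
    """Angle (rad) of one motor-driven link so that its distal link reaches `foot`."""
    dx, dy = foot[0] - base[0], foot[1] - base[1]
    r = math.hypot(dx, dy)
    if r == 0.0 or r > l_prox + l_dist or r < abs(l_prox - l_dist):
        raise ValueError("foot position is out of reach")
    # Law of cosines: angle between the base-to-foot line and the proximal link.
    cos_beta = (l_prox**2 + r**2 - l_dist**2) / (2.0 * l_prox * r)
    beta = math.acos(max(-1.0, min(1.0, cos_beta)))
    return math.atan2(dy, dx) + elbow_sign * beta

def five_bar_ik(foot, base_sep=30.0, l_prox=40.0, l_dist=70.0):
    """Return (theta1, theta2) for a foot point (x, y) given in the leg frame.

    Assumed geometry (hypothetical values, in mm): motors at (-base_sep/2, 0)
    and (+base_sep/2, 0), with identical link lengths on both sides.
    """
    left = (-base_sep / 2.0, 0.0)
    right = (base_sep / 2.0, 0.0)
    # Opposite elbow signs select the mirror-image branch on each side.
    theta1 = proximal_angle(left, foot, l_prox, l_dist, -1.0)
    theta2 = proximal_angle(right, foot, l_prox, l_dist, +1.0)
    return theta1, theta2
```

Sampling a foot path at several values of t in [0, 1] and calling `five_bar_ik` on each point would give the two motor-angle profiles for one cycle.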

3. Genetic Algorithm for Self-Learning

The set of parameters controlling the robot’s motion is

$$G = (T, \varphi_1, \psi_1, \ldots, \varphi_4, \psi_4, \varphi_b, \psi_b). \qquad (1)$$

The parameter $T$ is the time duration of a single cycle. The rest of the parameters are dimensionless and are given with respect to $T$. So, $T\varphi_i$ indicates the $i$-th leg’s time phase from the cycle’s beginning, and $T\psi_i$ indicates the time duration for the $i$-th leg ($i = 1, 2, 3, 4$) to complete its individual cycle. $T\varphi_b$ and $T\psi_b$ indicate the time phase of the back joint and its time duration, respectively (see Figure 4).
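One way to hold the parameter vector (1) in code is sketched below in Python. The class and method names are hypothetical, and the wrap-around of a leg’s motion across the end of the gait cycle is an assumption made for this illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Gait:
    T: float                                  # cycle duration (seconds)
    phi: Tuple[float, float, float, float]    # leg phases, as fractions of T
    psi: Tuple[float, float, float, float]    # leg durations, as fractions of T
    phi_b: float                              # back-joint phase, fraction of T
    psi_b: float                              # back-joint duration, fraction of T

    def leg_progress(self, leg: int, t: float) -> Optional[float]:
        """Progress in [0, 1) of `leg` through its own cycle at time t, or None if idle."""
        tau = (t % self.T) / self.T             # normalized time within the gait cycle
        offset = (tau - self.phi[leg]) % 1.0    # time since the leg's phase start (wrapped)
        if offset < self.psi[leg]:
            return offset / self.psi[leg]
        return None
```

The returned progress value could serve as the foot-path parameter t in [0, 1] for that leg.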

We initiate the algorithm by selecting the gait’s parameter vector (1). To do so, we manually generated the following gaits: walk, rack, amble, canter, trot, and gallop. Their values were extracted from [35]; see Figure 4. Additionally, 4 initial gaits were randomly chosen.

In general, the GA searches for an optimal gait by applying genetic operators to a population of gaits [36]. Gaits that perform well are rewarded and proliferate through the population, whereas gaits that perform poorly are removed. In our GA, from the second generation onward, in order to avoid a local stationary solution, we used a random operator function to generate random new gaits. Our GA maintains a population of the 6 best gaits and uses mutation and pairing to manipulate the gaits in the population. Each generation of tests consists of a random function (20%), mutation (40%), and pairing (40%).
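The bookkeeping of one generation can be sketched in Python as follows. The function names are hypothetical, and the generation size of 10 gaits is taken from the experimental description; the sketch only splits a generation among the three operators and keeps the best-scoring gaits.

```python
def plan_generation(n_gaits=10,
                    shares=(("random", 0.2), ("mutation", 0.4), ("pairing", 0.4))):
    """Split one generation of n_gaits test gaits among the three operators."""
    counts = {name: round(n_gaits * share) for name, share in shares}
    if sum(counts.values()) != n_gaits:
        raise ValueError("operator shares do not add up to a whole generation")
    return counts

def keep_best(scored_gaits, k=6):
    """Keep the k highest-scoring (gait, score) pairs."""
    return sorted(scored_gaits, key=lambda pair: pair[1], reverse=True)[:k]
```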

Figure 1: Robot’s side view.

Figure 2: The overall robot structure: (a) the passive DOF; (b) the five-bar mechanism leg structure and the implemented foot’s path (dimension labels in the panel: 35, 42, 68, 35).

Figure 3: The active back joint.

Figure 4: A quadrupedal robot gallop gait example (rows: left hind leg, left foreleg, right hind leg, right foreleg). $T$ indicates the gait cycle time duration. Here, $T\varphi_i$ and $T\psi_i$ indicate the time phase and time duration of the left hind leg, respectively.

The random operator randomly generates a new gait within UB and LB, the upper and lower parameter bounds, respectively. The quadrupedal feet are impacted by the floor during the gait. So, when $T\psi_i$ is too short, the robot’s limbs jerk, which may harm the mechanism. On the contrary, long time periods are not desirable as well (see, for example, Figure 4 for the gallop gait). These lower and upper constraints are used in order to reduce the search area and to avoid such undesirable gaits.
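A minimal Python sketch of the random operator is given below. It assumes each parameter is drawn uniformly between its bounds; the bound values are hypothetical placeholders, not the ones used in the experiments.

```python
import random

def random_gait(lower, upper, rng=random):
    """Draw every gait parameter uniformly between its lower and upper bound."""
    if len(lower) != len(upper):
        raise ValueError("LB and UB must have the same length")
    return [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]

# Hypothetical bounds for (T, phi_1, psi_1, ..., phi_4, psi_4, phi_b, psi_b).
LB = [0.5] + [0.0, 0.2] * 5
UB = [2.0] + [1.0, 0.8] * 5
```

Passing a seeded `random.Random` instance as `rng` makes a run repeatable.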

The mutation operator acts on the gait by randomly altering its parameters with some predefined probability $\rho$:

$$G_{best} = G_{best} + \rho\,(UB - LB). \qquad (2)$$

Here, we used $\rho = 0.2$.

The pairing operator generates a new gait from two “parent” gaits by performing multipoint recombination of their sets of parameters. The gaits chosen for “reproduction” were (1) the gait $G_{best}$, which gained the best score, paired with (2) a random gait $G_{rand}$ chosen from the top eight. We define $\alpha = \mathrm{score}(G_{best}) / (\mathrm{score}(G_{best}) + \mathrm{score}(G_{rand}))$ as the prioritizing weight. The offspring gait is then obtained from the two parents according to this weight (equation (3)).
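A minimal Python sketch of the mutation and pairing operators follows. Two points are assumptions rather than the authors’ stated method: the sign of the mutation step in equation (2) is drawn at random per parameter and the result is clipped to [LB, UB], and the recombination takes each parameter from $G_{best}$ with probability $\alpha$ and from $G_{rand}$ otherwise. All function names are hypothetical.

```python
import random

def mutate(gait, lower, upper, rho=0.2, rng=random):
    """Shift each parameter by rho * (UB - LB) with a random sign, clipped to the bounds."""
    child = []
    for g, lo, hi in zip(gait, lower, upper):
        step = rho * (hi - lo) * rng.choice((-1.0, 1.0))  # assumed random sign
        child.append(min(hi, max(lo, g + step)))
    return child

def prioritizing_weight(score_best, score_rand):
    """alpha = score(G_best) / (score(G_best) + score(G_rand))."""
    return score_best / (score_best + score_rand)

def pair(best, rand, score_best, score_rand, rng=random):
    """Assumed recombination: inherit each parameter from `best` with probability alpha."""
    alpha = prioritizing_weight(score_best, score_rand)
    return [b if rng.random() < alpha else r for b, r in zip(best, rand)]
```

With this reading, a higher-scoring $G_{best}$ contributes proportionally more parameters to the offspring.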


3.1. The GA Weight Function. The weight function (also known as the evaluation or cost function) evaluates how close a given solution is to the optimum solution of the desired problem. Each solution is given a score specifying its fitness to the desired solution. We are interested in a smooth trajectory that requires minimal power consumption while maximizing the resulting velocity. The efficiency is the ratio between the useful output and the power consumption (i.e., the ratio between the average kinetic energy, $\sim v^2$, and the input power $W$; compare with [29]). We also include an additional dimensionless ratio $d/T$, which measures the “straightness” of the path (compare with [37]). The fitness function used for scoring the robot’s gait is then

$$S = \frac{v^{k_v} \cdot d^{k_d}}{W^{k_W} \cdot T}. \qquad (4)$$

Here, $v$, $d$, and $T$ are the average velocity, distance, and overall trajectory length, respectively (in this expression, $T$ denotes the trajectory length rather than the cycle duration of equation (1)). To comply with the dimensions indicated above, we set $k_W = k_d = 1$ and $k_v = 2$, though these may be chosen differently for other purposes.
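The fitness function (4) translates directly into a few lines of Python. The function name and the input check are illustrative additions; the default exponents are the values quoted above.

```python
def gait_score(v, d, W, T, k_v=2.0, k_d=1.0, k_W=1.0):
    """Score S = v**k_v * d**k_d / (W**k_W * T) for one tested gait.

    v: average velocity, d: distance, W: power consumption,
    T: overall trajectory length (same length unit as d).
    """
    if W <= 0 or T <= 0:
        raise ValueError("W and T must be positive")
    return (v**k_v * d**k_d) / (W**k_W * T)
```

With the default exponents this reduces to $(v^2 / W) \cdot (d / T)$, so a gait that is faster, straighter, or cheaper in power scores higher.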

The power consumption $W$ of the motors is calculated by summing the electrical current sensor’s value at every constant time interval. Since the voltage is maintained constant, the current summation suffices. The power consumption was extracted from an INA219 high-side DC current sensor placed on the robot that sampled the current consumption of the motors during walking. In addition, an IR LED was mounted on the robot’s back, facing upward. A webcam with an IR filter was positioned above the experimental area in order to detect the robot’s location. The images received from the camera underwent image processing for identifying the locomotion of the robot on the experimental surface.
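The quantities entering the score could be derived from the logged samples roughly as in the Python sketch below. It assumes the camera pipeline already yields a list of (x, y) LED positions, takes $d$ as the straight-line distance between the first and last positions, and takes the average velocity as path length over elapsed time; these definitions and the function names are assumptions for illustration.

```python
import math

def trajectory_metrics(points, duration):
    """Return (v, d, path_length) from tracked (x, y) positions and elapsed time."""
    if len(points) < 2 or duration <= 0:
        raise ValueError("need at least two points and a positive duration")
    path_length = sum(math.dist(p, q) for p, q in zip(points, points[1:]))
    d = math.dist(points[0], points[-1])   # assumed: net start-to-end displacement
    v = path_length / duration             # assumed: average speed along the path
    return v, d, path_length

def power_proxy(current_samples):
    """Sum of equally spaced current samples; proportional to energy at constant voltage."""
    return sum(current_samples)
```

Under these definitions the ratio of `d` to `path_length` is at most 1 and equals 1 only for a perfectly straight path.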

4. Experimentation and Results

The experimental setup (see a short movie in [38]) included a 400 mm × 600 mm horizontal surface (Figure 5). In order to avoid skidding during the experiment, the surface and the robot’s feet were covered with fabric. The computation lasted for fifteen generations, in which the algorithm tested a total of 146 different gaits and converged to a solution. It began with 10 initial gaits, followed by 14 generations having 10 gaits each. During the experiments, the trajectories’ length and smoothness improved as the number of generations increased (Figures 6 and 7). Each experiment begins by placing the robot at a predefined starting point. The experiment ends after a duration of $7T$ or, alternatively, when a nonmotion scenario occurs. Each gait was repeated 3 times, and the results were averaged.

The results show that both the mutation and the random functions helped the algorithm to converge. The authors believe that the random procedure was required in order to move “freely” in the solution space at the first stages of the algorithm, preventing convergence to local minima (compare with a simulated annealing approach). The pairing function had little effect on finding the optimal gait. These insights were manually examined in the course of the experimentation by tracking the convergence rate.

The back joint was found to be significant. We performed two sets of experiments:

(1) The back joint was activated in accordance with the parameters $T\varphi_b$ and $T\psi_b$, indicating the time phase of the back joint and its time duration, respectively.

(2) The back joint was fixed at a flat angle during the optimization.

Figure 8 depicts the corresponding results of these two setups. At the first stages of the optimization, the back-joint activation did not play a significant role. A possible explanation for this is that in these early stages, robot locomotion was far from optimal, and thus improvements due to any of the parameters were of the same importance. Nevertheless, in generations 7 to 10, where fine tuning of the parameters took place, having an additional joint that completely changes the robot’s dynamics, such as the back joint, is expected to be of importance, and indeed it was.

Figure 5: The experimental setup. A camera extracts the robot’s trajectory.

The final walking gait that includes the back parameters was found to be G = (79, 58, 56, 13, 30, 13, 43, 20, 52, 8, 50). The final walking gait that ignores the back parameters was found to be G = (95, 61, 46, 31, 42, 11, 0, 26, 96, 0, 0).

5. Conclusions

We introduced a real-world set of experiments for optimizing a robot’s walking gait using a genetic algorithm. In order to demonstrate our solution, a low-cost mechanical model of a quadrupedal robot was designed and fabricated. Our results show that after 15 generations, the robot’s trajectory improved significantly relative to the first-generation gaits. The improvement was shown in terms of the robot’s velocity, trajectory length, and smoothness, as well as power consumption. The authors believe that the concept described in this paper may be implemented in a totally autonomous experimental system, which removes the need to lift the robot to the start point at each test and enables a larger number of generations. In addition, as shown in the literature [39], the back structure has a significant function in the gait quality. Here, we implemented an active back to examine these advantages.

Future work will include a mechanism having the ability to perform a large number of experiments without a human in the loop. Moreover, we shall examine, in addition to the active back, the qualities of an active tail.

Figure 6: Images of trajectory paths in different generations. As the number of generations increases, the line’s length and smoothness improve: (a) Gen. 1; (b) Gen. 4; (c) Gen. 10; (d) Gen. 15.

Figure 7: The GA performance for the active back case (score, scaled by 10^4, versus generation). The resulting trajectories are shown in Figure 6.

Figure 8: Comparing performances (vertical axis scaled by 10^4, versus generation) of the case of an active back joint (diamond markers) and the case where the back joint is passive (circle markers).

Data Availability

The authors confirm that the data supporting the findings of this study are available within the article.

Conflicts of Interest

No potential conflicts of interest were reported by the authors.

References

[1] S. Zhang, M. Fan, Y. Li, X. Rong, and M. Liu, “Generation of a continuous free gait for quadruped robot over rough terrains,” Advanced Robotics, vol. 33, no. 2, pp. 74–89, 2019.

[2] M. Lee, K.-S. Kim, and S. Kim, “Design of robot hand for bipedal/quadrupedal transformable locomotive robot in safety, security, and rescue robotics (SSRR),” in Proceedings of the 2016 IEEE International Symposium on IEEE, pp. 160–165, IEEE, Lausanne, Switzerland, October 2016.

[3] Q. Wu, C. Liu, J. Zhang, and Q. Chen, “Survey of locomotion control of legged robots inspired by biological concept,” Science in China Series F: Information Sciences, vol. 52, no. 10, pp. 1715–1729, 2009.

[4] B. D. Argall, S. Chernova, M. Veloso, and B. Browning, “A survey of robot learning from demonstration,” Robotics and Autonomous Systems, vol. 57, no. 5, pp. 469–483, 2009.

[5] J. Ashar, D. Claridge, B. Hall et al., “Robocup standard platform league-runswift 2010,” in Proceedings of the Australasian Conference on Robotics and Automation, University of New South Wales (UNSW), Brisbane, Australia, December 2010.

[6] I. Juarez-Campos, D. A. Nuñez-Altamirano, L. Marquez-Perez, L. Romero-Muñoz, and B. Juarez-Campos, “Hexapod with legs based on Peaucellier–Lipkin mechanisms: a mathematical structure used in reconfiguration for path planning,” International Journal of Advanced Robotic Systems, vol. 15, no. 4, 2018.

[7] E. Kagan, N. Shvalb, and I. Ben-Gal, Autonomous Mobile Robots and Multi-Robot Systems: Motion Planning, Communication, and Swarming, John Wiley & Sons, Hoboken, NJ, USA, 2019.

[8] N. Shvalb, B. B. Moshe, and O. Medina, “A real-time motion planning algorithm for a hyper-redundant set of mechanisms,” Robotica, vol. 31, no. 8, pp. 1327–1335, 2013.

[9] O. Medina, A. Shapiro, and N. Shvalb, “Motion planning for an actuated flexible polyhedron manifold,” Advanced Robotics, vol. 29, no. 18, pp. 1195–1203, 2015.

[10] S. Park and J. Oh, “Real-time continuous ZMP pattern generation of a humanoid robot using an analytic method based on capture point,” Advanced Robotics, vol. 33, no. 1, pp. 33–48, 2019.

[11] F. A. Raheem and M. K. Flayyih, “Quadruped robot creeping gait stability analysis and optimization using PSO,” in Proceedings of the 2017 Second Al-Sadiq International Conference on Multidisciplinary in IT and Communication Science and Applications (AIC-MITCSA), pp. 79–84, IEEE, Baghdad, Iraq, December 2017.

[12] J.-J. Kim and J.-J. Lee, “Adaptation of quadruped gaits using surface classification and gait optimization,” in Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 716–721, IEEE, Tokyo, Japan, November 2013.

[13] M. Dekker, Zero-Moment Point Method for Stable Biped Walking, Eindhoven University of Technology, Eindhoven, Netherlands, 2009.

[14] M. Bucolo, A. Buscarino, C. Famoso, L. Fortuna, and M. Frasca, “Control of imperfect dynamical systems,” Nonlinear Dynamics, vol. 98, no. 4, pp. 2989–2999, 2019.

[15] S. Hacohen, S. Shoval, and N. Shvalb, “Probability navigation function for stochastic static environments,” International Journal of Control, Automation and Systems, vol. 17, no. 8, pp. 2097–2113, 2019.

[16] D. Katic and M. Vukobratovic, “Survey of intelligent control techniques for humanoid robots,” Journal of Intelligent and Robotic Systems, vol. 37, no. 2, pp. 117–141, 2003.

[17] W. Wu and L. Gao, “Posture self-stabilizer of a biped robot based on training platform and reinforcement learning,” Robotics and Autonomous Systems, vol. 98, pp. 42–55, 2017.

[18] T. T. Huan, P. D. Huynh, C. Van Kien, and H. P. H. Anh, “Implementation of hybrid adaptive fuzzy sliding model control and evolutionary neural observer for biped robot systems,” in Proceedings of the International Conference on System Science and Engineering (ICSSE), pp. 77–82, IEEE, Ho Chi Minh City, Vietnam, July 2017.

[19] M. Taherkhorsandi, M. J. Mahmoodabadi, M. Talebipour, and K. K. Castillo-Villar, “Pareto design of an adaptive robust hybrid of PID and sliding control for a biped robot via genetic algorithm optimization,” Nonlinear Dynamics, vol. 79, no. 1, pp. 251–263, 2015.

[20] T. Kato, K. Shiromi, M. Nagata, H. Nakashima, and K. Matsuo, “Gait pattern acquisition for four-legged mobile robot by genetic algorithm,” in Proceedings of the 41st Annual Conference of the IEEE Industrial Electronics Society, November 2015.

[21] K. Loffler, M. Gienger, and F. Pfeiffer, “Sensor and control design of a dynamically stable biped robot in robotics and automation,” in Proceedings of the IEEE International Conference on Robotics and Automation, vol. 1, IEEE, Taipei, Taiwan, pp. 484–490, September 2003.

[22] Y. Yang, C. Semini, N. G. Tsagarakis, E. Guglielmino, and D. G. Caldwell, “Leg mechanisms for hydraulically actuated robots in intelligent robots and systems,” in Proceedings of the IEEE/RSJ International Conference on IEEE, pp. 4669–4675, IEEE, St. Louis, MO, USA, October 2009.

[23] H.-W. Park, S. Park, and S. Kim, “Variable-speed quadrupedal bounding using impulse planning: untethered high-speed 3D running of MIT Cheetah 2,” in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pp. 5163–5170, IEEE, Seattle, WA, USA, May 2015.

[24] M. Raibert, K. Blankespoor, G. Nelson, and R. Playter, “BigDog, the rough-terrain quadruped robot,” IFAC Proceedings Volumes, vol. 41, no. 2, pp. 10822–10825, 2008.

[25] M. P. Murphy, A. Saunders, C. Moreira, A. A. Rizzi, and M. Raibert, “The LittleDog robot,” The International Journal of Robotics Research, vol. 30, no. 2, pp. 145–149, 2011.

[26] J. R. Rebula, P. D. Neuhaus, B. V. Bonnlander, M. J. Johnson, and J. E. Pratt, “A controller for the LittleDog quadruped walking on rough terrain in robotics and automation,” in Proceedings of the 2007 IEEE International Conference on Robotics and Automation, pp. 1467–1473, IEEE, Roma, Italy, May 2007.

[27] F. Sapuppo, M. Bucolo, M. Intaglietta, L. Fortuna, and P. Arena, “Cellular non-linear networks for microcirculation applications,” International Journal of Circuit Theory and Applications, vol. 34, no. 4, pp. 471–488, 2006.

[28] F. Sapuppo, M. Bucolo, M. Intaglietta, L. Fortuna, and P. Arena, “A cellular nonlinear network real-time technology for the analysis of microfluidic phenomena in blood vessels,” Nanotechnology, vol. 17, no. 4, pp. S54–S63, 2006.

[29] S. Bhattacharya, A. Singla, D. Dholakiya et al., “Learning active spine behaviors for dynamic and efficient locomotion in quadruped robots,” in Proceedings of the 28th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), October 2019.

[30] M. Khoramshahi, H. Jalaly Bidgoly, S. Shafiee, A. Asaei, A. J. Ijspeert, and M. Nili Ahmadabadi, “Piecewise linear spine for speed-energy efficiency trade-off in quadruped robots,” Robotics and Autonomous Systems, vol. 61, no. 12, pp. 1350–1359, 2013.

[31] N. Kohl and P. Stone, “Machine learning for fast quadrupedal locomotion,” in Proceedings of the Nineteenth National Conference on Artificial Intelligence (AAAI), vol. 4, pp. 611–616, San Francisco, CA, USA, July 2004.

[32] S. Hacohen, S. Shoval, and N. Shvalb, “Applying probability navigation function in dynamic uncertain environments,” Robotics and Autonomous Systems, vol. 87, pp. 237–246, 2017.

[33] C. Zhou, B. Wang, Q. Zhu, and J. Wu, “An online gait generator for quadruped walking using motor primitives,” International Journal of Advanced Robotic Systems, vol. 13, no. 6, 2016.

[34] F. Luo, G. Xie, Q. Wang, and L. Wang, “Development and gait analysis of five-bar mechanism implemented quadruped amphibious robot,” in Proceedings of the 2010 IEEE/ASME International Conference on Advanced Intelligent Mechatronics, pp. 633–638, IEEE, Montreal, Canada, July 2010.

[35] T.-C. Huang, Y.-J. Huang, and W.-C. Lin, “Real-time horse gait synthesis,” Computer Animation and Virtual Worlds, vol. 24, no. 2, pp. 87–95, 2013.

[36] M. Gen and R. Cheng, Genetic Algorithms and Engineering Optimization, vol. 7, John Wiley & Sons, Hoboken, NJ, USA, 2000.

[37] M. Azarkaman, M. Aghaabbasloo, and M. E. Salehi, “Evaluating GA and PSO evolutionary algorithms for humanoid walk pattern planning,” in Proceedings of the 22nd Iranian Conference on Electrical Engineering (ICEE), pp. 868–873, IEEE, Tehran, Iran, May 2014.

[38] A. Masuri, O. Medina, S. Hacohen, and N. Shvalb, “Genetic algorithm for quadrupedal robot walking/running gait,” 2017, https://www.youtube.com/watch?v=sLo64NP0dsA.

[39] C. Semini, V. Barasuol, J. Goldsmith et al., “Design of the hydraulically actuated, torque-controlled quadruped robot HyQ2Max,” IEEE/ASME Transactions on Mechatronics, vol. 22, no. 2, pp. 635–646, 2017.

