[IEEE 2008 9th Symposium on Neural Network Applications in Electrical Engineering - Belgrade, Serbia (2008.09.25-2008.09.27)] 2008 9th Symposium on Neural Network Applications in Electrical

9th Symposium on Neural Network Applications in Electrical Engineering, NEUREL-2008, Faculty of Electrical Engineering, University of Belgrade, Serbia, September 25-27, 2008


Trajectory Prediction and Path Planning of Intelligent Autonomous Biped Robots: Learning and Decision Making through Perception and Spatial Reasoning

Aleksandar Rodic, Dusko Katic

Abstract-This paper is inspired by the FP6 IST RobotCub European research project. It addresses the synthesis of intelligent autonomous locomotion (artificial gait) of biped robots in unknown and unstructured dynamic environments through perception, learning, environment understanding and spatial reasoning. Focusing the research on embodied cognition, the paper contributes to extending intelligent robot behavior through dynamic environment understanding, simultaneous localization and mapping, trajectory prediction and path planning, obstacle avoidance, collision avoidance and scenario-based behavior. The paper is conceptual in character. It includes some simulation results obtained by applying fuzzy algorithms and artificial neural networks to learning and decision making.

Index Terms-Humanoid robots, cognitive systems, trajectory prediction, path planning

I. INTRODUCTION

NOWADAYS there are humanoid robots (e.g. commercially available or research-oriented platforms) capable of walking, maintaining dynamic balance and performing a limited number of manipulation tasks. Predominantly they are remotely controlled by a human operator or by a remote computer and have no cognitive capabilities to carry out the imposed tasks autonomously. The aim of this paper, related to the IST FP6 RobotCub project [1], is to contribute to advancements in embodied cognition by giving humanoid robots extended autonomy, better environment understanding, advanced reasoning and bio-inspired adaptation to the external world.

This paper addresses the synthesis of intelligent, autonomous, bio-inspired locomotion (artificial biped gait) in an unknown/unstructured dynamic environment. Such cognition, embodied in the form of free, autonomous biped locomotion, can be reached through sensory perception, learning and imitation of anthropomorphic, psychophysical (mental and locomotive) reactions, object recognition, object-based localization and mapping in an unknown space, environment understanding and scenario-based reasoning.

Authors are with the Robotics Laboratory, Mihajlo Pupin Institute, Volgina 15, 11060 Belgrade, Serbia (e-mail: {roda,dusko}@robot.imp.bg.ac.yu).

978-1-4244-2904-2/08/$25.00 ©2008 IEEE

It is well known from infant psychology that a child rapidly increases its intellectual abilities from the moment it stands on its legs and begins walking, thereby gathering a great deal of necessary information about the 3D world. In this period of life children intensively learn locomotive, manipulative and cognitive skills through training, learning by imitation and learning by trial and error. This learning process is much more effective if the child is able to walk autonomously, extending the space in which he/she explores the external world. Our idea in this paper is to give the humanoid robot the same chance to learn about the 3D world by changing its position autonomously and performing desired tasks according to scenario-based behavior.

Free locomotion is one of the most important factors that enable a robotic system to develop its embodied cognitive characteristics, i.e. artificial intelligence capable of moving the robot autonomously in an unknown environment. Nowadays, there are a few high-tech humanoid robot platforms (Asimo/Honda [2], Hoap-2/Fujitsu [3], QRIO/Sony [4], iCub [1]) with advanced technical performance dedicated to R&D purposes (see Fig. 1). Among the essential research challenges considered in the paper is the development of a biped robot control system that can mimic human behavior with a large movement repertoire, variable speed, various constraints and many uncertainties in the dynamic environment in a fast and reactive manner. Hence, our approach will involve learning from experience and creating an adaptive intelligent control architecture. For that purpose, we will develop advanced algorithms for simultaneous localization and environment mapping as well as an approach for learning locomotion and navigation in a space of fixed obstacles through experimentation. The learning models allow the biped robot to predict or plan its own motions as well as to interpret the motions performed by experienced people. As a result, the model allows the system to imitate such observed motions in its own way. A significant amount of the research work will be oriented to the design and validation of reinforcement learning algorithms and fuzzy logic algorithms for locomotion control and efficient control policy estimation.

Fig. 1. Contemporary humanoid robots developed for research purposes: a) HOAP-2 and b) iCub

II. STRATEGY OF AUTONOMOUS LOCOMOTION

The main research goal of this paper is to build advanced bio-inspired methodologies of perception, understanding, artificial reasoning and autonomous locomotion of humanoid robots in an unknown environment. In that sense, we begin from the following facts. Healthy adults move freely and autonomously in the 3D world based on their natural perception (visual, auditory, vestibular, olfactory, etc.), knowledge, experience and skill of logical thinking (reasoning). Infants learn the skill of navigation and walking in free space through training (trial and error) and natural instinct. In this learning process they have only their perception and still undeveloped intellectual capabilities. They have no significant experience of terrain topology and dimensional relationships between the objects in their surroundings. In spite of that, infants learn quickly by exploring the space around them.

In this process, visual feedback, i.e. object-based localization, is the crucial natural principle that enables humans to guide themselves in an unknown environment. In space, people determine their relative position and direction of motion with respect to characteristic, clearly visible object(s), as shown in Fig. 2. A lantern (light) represents the landmark object in the example shown in Fig. 2. By the notion of landmark object or marker we mean a real object or figure/shape that dominates in some way over other objects/shapes in the surroundings by its dimensions (size, height, width), brightness, color, etc. Conventionally, biped robots are equipped with a stereo-vision system (two cameras). The role of the cameras is to identify the relative position $d$ and the direction of motion (azimuth) $p$ of the biped robot with respect to the chosen landmark object (Fig. 2) with satisfactory accuracy.

Fig. 2. Bio-inspired localization of a biped robot: a) relative position and dimensions of surrounding objects, b) azimuth angle towards the landmark/target object
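The geometry of Fig. 2 can be sketched in a few lines of Python. The sketch below is only an illustration: it assumes the two relations this section uses for the landmark distance d and the landmark height H (camera height h divided by the tangent of the first elevation angle, and d times the tangent of the second), with angles in radians. The function names and the robot-frame convention (x forward, y to the left) are assumptions introduced here, not part of the paper.

```python
import math


def localize(h, alpha1, alpha2):
    """Return (d, H): distance to the landmark and its estimated height.

    h      -- camera height above the ground [m]
    alpha1 -- elevation angle used for the distance estimate [rad]
    alpha2 -- elevation angle used for the height estimate [rad]
    """
    if not 0.0 < alpha1 < math.pi / 2:
        raise ValueError("alpha1 must lie in (0, pi/2) rad")
    d = h / math.tan(alpha1)   # d = h / tan(alpha1)
    H = d * math.tan(alpha2)   # H = d * tan(alpha2)
    return d, H


def landmark_position(d, azimuth):
    """Landmark coordinates in an assumed robot frame (x forward, y left)."""
    return d * math.cos(azimuth), d * math.sin(azimuth)
```

For example, `localize(1.0, math.radians(45), math.radians(45))` gives a distance and a height of about 1 m each, and `landmark_position` turns the distance and the measured azimuth into planar coordinates that a mapping routine could store.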

The choice of an appropriate marker can be made by training an artificial connectionist structure (a kind of robot brain/memory) built into the robot's high-level control block. Such a network structure is trained off-line to recognize potential landmark objects, i.e. bright, large, high or colored objects that are likely to be clearly visible in the unknown scene (depending on indoor or outdoor applications). An arbitrary indoor scene with a biped robot and obstacles is presented in Fig. 2 in two geometric perspectives: side view and top view. The elevation angles of the robot eyes/cameras $\alpha_1$ and $\alpha_2$ as well as their altitude (height) $h$ are known (measurable). The robot relative position (distance $d$) is calculated from the relation $d = h / \tan(\alpha_1)$, while the height of the object $H$ can be estimated as $H = d \cdot \tan(\alpha_2)$. The elevation angles $\alpha_1$ and $\alpha_2$ can be obtained from the encoder sensor situated in the neck's pitch joint (Fig. 2) or, alternatively, by measuring the tilt (pitch) angles of the cameras. The azimuth angle, i.e. the angle of the direction of motion $p$, is estimated by measuring the relative yaw angle of the neck joint, as shown in Fig. 2. In this way, a simple, bio-inspired form of robot SLAM (Simultaneous Localization and Mapping) and advanced navigation algorithms will be realized. The geometry values $\alpha_1$, $\alpha_2$, $p$, $d$ and $H$ identified by the robot acquisition system are forwarded to the high-level (cognitive) control block of the humanoid robot.

Besides the visual feedback provided by a pair of video cameras (see Fig. 2b), additional information about the existence of obstacles is necessary for obstacle avoidance as well as robot trajectory prediction. The accurate distance(s) of the obstacle(s) inside a circle of radius $r \approx 1.00$-$1.50$ [m] can be obtained from appropriate distance sensors. For that purpose an Ultrasonic Range Finder (USRF) or a laser scanner receiver (LSR) is commonly used in robotic practice, depending on the desired accuracy, the possibilities for mounting on the mechanical structure, price, etc. With USRF sensors it is possible to detect obstacles in the robot collision zone as well as the direction of motion of possible mobile obstacles/objects in the robot's surroundings. By numerical differentiation of the identified/measured distance(s) between the robot and moving object(s) it is possible to estimate their speed(s) and acceleration(s). These are important indicators to be used in forming the collision avoidance strategy.

During motion in an unknown environment people come into zones close to obstacles (Fig. 2b). In order to avoid obstacles they take appropriate actions: change the course, i.e. the direction of motion $\varepsilon$, and vary the forward velocity $v$, step length $s$, step period $T$, foot lifting height $h_f$, etc. The variables $\varepsilon$, $v$, $s$, $T$, $h_f$ represent the gait parameters $G_p$. These parameters are the output variables of the new cognitive block for robot trajectory prediction and planning (generation of feet cycloids) that will be integrated into the robot's high-level control structure. During a walk, humans do not know the numeric values of how far they are from the closest obstacle or how fast they move. They have linguistic, i.e. symbolic, information in mind: their relative position is in the range "near-far", i.e. very near (beside), near, moderately far, far, very far (indefinitely far). A similar gradation of values appears for the forward speed $v$, e.g. immobile, very slow, slow, moderately fast, fast and very fast. Concerning the dimensions of obstacles (height, width, depth), the following descriptive indicators are of importance: very small, small, moderate, large and very large/huge. These linguistic/symbolic indicators/states can be formulated mathematically using fuzzy functions. A robot can be taught to distinguish the mentioned ranges of symbolic/linguistic indicators in order to predict the desired collision-free motion in a 3D world. By implementing fuzzy rules and fuzzy reasoning within the cognitive control block, the robot will be capable of understanding the environment and making appropriate decisions in response to the real circumstances in its surroundings. In that sense, elements of artificial intelligence will be incorporated into the biped robot control structure to extend the existing cognitive system behavior. The foundation of the robot's artificial intelligence, to be built into the advanced intelligent control structure, is a corresponding cognitive block. It consists of an artificial connectionist structure as well as a fuzzy system. Both tools give the robot fast learning, environment understanding and decision making capabilities. To build such an intelligent control structure, corresponding geometric and kinematic scenario model(s) should be developed. For that purpose, the following scenario models of obstacle avoidance and collision avoidance were developed. They are presented in Figs. 3a and 3b.

As a demonstration of the intelligent reasoning that is planned to be implemented on a humanoid robot, some characteristic simulation results are presented in Fig. 4. A CAD model of an arbitrary environment with stationary obstacles as well as a target trajectory generated by the corresponding cognitive block are illustrated. The simulation results presented in Fig. 4 were obtained with the HRSP software [5] for a biped robot whose parameters are defined in advance.

Fig. 3. Scenario model of obstacle avoidance, with the geometry and kinematic indices used for building the cognitive robot block: a) fixed obstacle avoidance, b) collision avoidance of mobile objects
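For the mobile-object scenario of Fig. 3b, the idea of estimating the speed and acceleration of a moving object by numerically differentiating successive range readings can be sketched as follows. This is a minimal illustration, assuming range samples taken at a fixed period `dt` and plain backward differences; the function names and the time-to-contact helper are hypothetical additions, and a real sensor signal would normally be filtered first.

```python
def range_rates(distances, dt):
    """Backward finite differences of equally spaced range samples.

    distances -- measured robot-to-object distances [m]
    dt        -- sampling period [s]
    Returns (speeds, accelerations); negative speed means the object approaches.
    """
    if dt <= 0:
        raise ValueError("dt must be positive")
    samples = list(distances)
    speeds = [(b - a) / dt for a, b in zip(samples, samples[1:])]
    accelerations = [(b - a) / dt for a, b in zip(speeds, speeds[1:])]
    return speeds, accelerations


def time_to_contact(distance, speed):
    """Assumed constant-speed estimate of the time until the range reaches zero."""
    if speed >= 0:
        return float("inf")   # object is not approaching
    return distance / -speed
```

With readings of 2.0, 1.5, 1.0 and 0.75 m taken every 0.5 s, the estimated range rates are -1.0, -1.0 and -0.5 m/s, so the object is approaching but slowing down.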

The step lengths $s$ (successive distances between the footprints shown in Fig. 4b) and the forward speed $v$ of the biped robot along the target trajectory are determined by the previously described fuzzy inference engine. To create the fuzzy rules, simple human logic was imitated, which can be described briefly as follows. If there is an obstacle in front, decrease the speed and check whether it is possible to step over the obstacle (check its height) or to move left or right. Adapt the speed of the cornering maneuver; track the contour of the obstacle while keeping the ultimate direction of motion towards the assumed landmark object (i.e. target point). The actual forward speed of locomotion is adapted so that it takes into account the distance from the obstacle as well as the yaw rate of the cornering maneuver. Implementation aspects of collision avoidance are conditioned by the technical limits of the biped robot (e.g. robot speed during the maneuver) as well as by processing time.

Once the target trajectory is determined by the cognitive block of the biped robot system, it is a relatively easy task to synthesize a robot gait. Joint trajectories of the biped robot are determined by calculating its inverse kinematics. For that purpose, using experimental measurements from a motion capture studio [6] (in order to ensure bio-inspired, anthropomorphic locomotion), soft-computing algorithms based on artificial neural networks as non-linear identifiers are derived. These algorithms are based on the off-line, fast-convergent Levenberg-Marquardt back-propagation algorithm [6]. The multi-layer perceptron structure chosen to learn the inverse robot kinematics takes as inputs the Cartesian coordinates of the hip joint centers, hip link mass centers and feet cycloids of motion. At the output it gives the generalized coordinates of the biped legs that enable robot locomotion in an anthropomorphic way.

Joint angular velocities and the corresponding joint accelerations are calculated by numerical differentiation. Joint trajectory tracking, posture stability and dynamic balance will be ensured using position/velocity feedback in joint space, impedance control and feedback on the dynamic reactions (i.e. Zero Moment Point, ZMP [7]) at the foot soles. Control of the biped robot dynamics will be realized at the low control level (servo level) using the corresponding sensor system (encoders, tension/torque sensors, gyro, etc.). Additional contact force/torque sensors attached to the foot soles of the biped robot are necessary. For that purpose, industrial Force Sensing Resistors or 6-axis force-torque sensors are used.
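As an illustration of how the foot-sole sensors mentioned above could feed the balance feedback, the sketch below computes the center of pressure from vertical force readings. It rests on assumptions not stated in the paper: a flat foot on level ground in single support, where the center of pressure can be taken as the ZMP estimate, and a hypothetical layout with one force sensor at each known position on the sole. The function names are likewise hypothetical.

```python
def center_of_pressure(sensors):
    """Force-weighted average of sensor positions.

    sensors -- iterable of (x, y, fz): sensor position on the sole [m]
               and measured vertical force [N]
    Returns (x_cop, y_cop), or None if the foot carries no load.
    """
    readings = list(sensors)
    total = sum(fz for _, _, fz in readings)
    if total <= 0.0:
        return None
    x_cop = sum(x * fz for x, _, fz in readings) / total
    y_cop = sum(y * fz for _, y, fz in readings) / total
    return x_cop, y_cop


def inside_support(cop, x_min, x_max, y_min, y_max, margin=0.0):
    """Check that the point lies inside a rectangular sole, less a safety margin."""
    if cop is None:
        return False
    x, y = cop
    return (x_min + margin <= x <= x_max - margin
            and y_min + margin <= y <= y_max - margin)
```

A controller could use the distance of this point from the sole boundary as the balance error; when the check fails, the posture would have to be corrected before the next step.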

The control algorithms described above will be implemented within the control system structure shown in Fig. 5.

Fig. 4. Example of intelligent autonomous locomotion in an unknown environment: a) CAD model of the environment, b) obstacle avoidance and target trajectory prediction with the corresponding footprints

Two control blocks represent the brain of the system: the high-level control block (i.e. cognitive block) and the low-level servo control block. Control of the robot dynamics (biped locomotion) will be designed at the servo level, while the intelligent control algorithms (cognitive behavior) enabling unrestricted autonomous locomotion and advanced reasoning will be synthesized at the high control level. Corresponding data-acquisition blocks ensure state feedback (block 1) as well as information about the surrounding world (actual relative position, distance range, obstacle position and velocity, etc.). A relay station enables reliable communication between these two hierarchical levels via Ethernet lines. The human-robot interface will be upgraded according to the chosen demonstration (simulation) scenarios in order to enable task definition: introducing the start and goal sites/positions, memorizing the image of the object to be found and manipulated, understanding manual and/or voice commands from a human operator, etc.

Fig. 5. Control system architecture providing intelligent autonomous locomotion of a biped robot (high-level and low-level blocks linked by a relay station with Ethernet communication)

III. CONCLUSION

During autonomous locomotion human beings use their experience to improve their navigation skill and to speed up locomotion in a space with obstacles. Robots use their memory and cognitive capabilities to do the same. Every path performed (i.e. its gait parameters $G_p$) is stored in the memory of the robot controller. After more experimental walking, the robot's control system acquires information about different paths. For example, a memorized path enables the robot to go back along the same trajectory much faster than before. In this way, the robot's cognitive block creates a kind of topological map of available paths that can be used in later tasks. Using its perception system, the robot is able to determine its relative position in the scene at any time instant. The high control level then compares the robot's current position with the memorized discrete positions on the map of paths. By finding the closest point on another path saved in the map of paths, the robot is able to continue its motion along a known trajectory. In that case, the robot uses the already memorized gait parameters $G_p$ as the new trajectory parameters. Using known trajectory parameters as the desired path information, the robot saves processing time otherwise used for localization and navigation. That potentially enables the robot to move faster along a trajectory known in advance. The same happens with humans. Initially, humans move through an unknown environment slowly and carefully. On any later occasion, they pass along the same path faster because they do not have to navigate or check the global position of obstacles again. To avoid a collision with obstacles due to a small deviation from the memorized desired path, the robot has to check continuously the distance to the obstacles in the collision zone. For learning the map of paths as well as for adapting the actual robot position to the motion along the already generated paths, it is suitable to implement fast-convergent on-line reinforcement learning algorithms. In this way, experience can also be synthesized as a kind of artificial machine experience of humanoid robots.
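The map-of-paths lookup described in this section can be sketched as a small data structure. This is only an illustration under stated assumptions: positions are planar (x, y) pairs, each memorized waypoint carries a dictionary of gait parameters, and the closest point is found by a brute-force Euclidean search. The class name, method names and dictionary keys are hypothetical, and the reinforcement learning part is not covered.

```python
import math


class PathMap:
    """Memorized paths: each waypoint pairs a position with gait parameters."""

    def __init__(self):
        self._paths = {}  # path name -> list of ((x, y), gait_params)

    def store(self, name, waypoints):
        """Memorize a walked path under the given name."""
        self._paths[name] = list(waypoints)

    def closest(self, position):
        """Return (name, index, distance) of the nearest memorized waypoint,
        or None if nothing has been stored yet."""
        best = None
        for name, waypoints in self._paths.items():
            for index, (point, _gait) in enumerate(waypoints):
                dist = math.dist(position, point)
                if best is None or dist < best[2]:
                    best = (name, index, dist)
        return best

    def gait_from(self, name, index):
        """Gait parameters to replay from the given waypoint onward."""
        return [gait for _point, gait in self._paths[name][index:]]
```

A robot standing at (0.9, 0.2) would query `closest((0.9, 0.2))`, join the returned path at the returned index and replay the stored gait parameters from there, while still checking the distance sensors for deviations.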

REFERENCES

[1] G. Metta, G. Sandini, et al., "The RobotCub Project - an open framework for research in embodied cognition", Proc. of the 2005 IEEE-RAS Int. Conf. on Humanoid Robots, 2005.

[2] http://world.honda.com/ASIMO/

[3] http://jp.fujitsu.com/group/automation/en/services/humanoid-robot/hoap2/

[4] http://www.plyojump.com/qrio.html

[5] HRSP: Humanoid Robot Simulation Platform, Matlab/Simulink software toolbox for modeling, simulation & control of biped robots, Robotics Lab., Mihajlo Pupin Institute, http://www.institutepupin.com/RnDProfile/ROBOTIKA/comprod.htm

[6] A. Rodic, M. Vukobratovic, K. Addi, G. Dalleau, "Contribution to the Modeling of Non-smooth, Multi-point Contact Dynamics of Biped Locomotion - Theory and Experiments", Robotica, Cambridge Univ. Press, Vol. 26, Issue 02, pp. 157-175, March 2008.

[7] M. Vukobratovic, A. Rodic, "Contribution to the Integrated Control of Biped Locomotion Mechanisms", International Journal of Humanoid Robotics, World Scientific Publishing Company, New Jersey, London, Singapore, Vol. 4, No. 1, pp. 49-95, March 2007.

[8] D. Katic, A. Rodic, "Dynamic Control Algorithm for Biped Walking Based on Policy Gradient Fuzzy Reinforcement Learning", 17th IFAC World Congress, Seoul, Republic of Korea, July 2008.
