Jorge Munoz, German Gutierrez, Araceli Sanchis. Presented by: Itay Bittan

Controller for TORCS created by imitation


Page 1: Controller for TORCS created by imitation

Jorge Munoz, German Gutierrez, Araceli Sanchis

Presented by: Itay Bittan

Page 2: Controller for TORCS created by imitation

1. Introduction 2. Related work 3. TORCS competition 4. Controllers 5. Controller learning by imitation 6. Results 7. Conclusions 8. Future works

Page 3: Controller for TORCS created by imitation

An initial approach to creating a controller for TORCS by learning how other controllers or humans play the game.

The data was obtained from three controllers, one human player and two programmed controllers:

• The winner of the WCCI 2008 Simulated Car Racing competition

• A hand-coded controller that completes a lap on every track

Page 4: Controller for TORCS created by imitation

First, each kind of controller is imitated separately, then a mix of data is used to create new controllers.

The imitation is performed by training a feed-forward neural network on the data, using the backpropagation algorithm for learning.

ANN - Artificial Neural Networks

Page 5: Controller for TORCS created by imitation

Human players realize they are not playing against another human, and find a way to beat the NPC (non-player character).

The NPC sometimes cheats in order to beat human players. Another option is to play over the Internet against other human players.

Facing an NPC that cheats heavily, or an experienced human, means losing every game, which is boring!

Page 6: Controller for TORCS created by imitation

Create opponents as intelligent as a human player.

The AI must be able to adapt its behavior to the opponent (playing at the same level as the human).

In this way the AI will provide better entertainment for the player!

Page 7: Controller for TORCS created by imitation

A realistic game in which a human plays against one or more NPCs.

Results can be compared with those of other researchers.

It allows the analysis of behaviors that take place in a short period of time.

Page 8: Controller for TORCS created by imitation

In all the experiments the controller created is a feed-forward ANN (Artificial Neural Network) that was trained with data generated by the controllers.

The learning algorithm for the ANN was backpropagation.

Page 9: Controller for TORCS created by imitation

1. Introduction 2. Related work 3. TORCS competition 4. Controllers 5. Controller learning by imitation 6. Results 7. Conclusions 8. Future works

Page 10: Controller for TORCS created by imitation

Creating computational intelligence for games with ANNs is a wide area of research.

NEAT – NeuroEvolution of Augmenting Topologies

NEAT is an effective method (algorithm) for creating ANNs: it alters both the weights and the structure of the networks.

It starts with a small population of random ANNs (with only input and output layers) that evolves to fit the problem.

Page 11: Controller for TORCS created by imitation

Approaches for adapting the AI of the game to the player

Examples:

• Rapidly adaptive game AI: a method that continuously applies small adaptations to the AI, based on observation and evaluation of the user's actions.

• Dynamic Scripting: based on a set of rules used by the game; the weights that determine which rule is selected are modified by a machine learning algorithm.
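The weighted rule selection behind Dynamic Scripting can be illustrated with a much simplified sketch. The rule names, the reward values and the clamping bounds below are hypothetical, and the published method has more to it than this; the code only shows weight-proportional selection followed by a weight update.

```python
import random

def select_rule(weights, rng=random):
    """Pick a rule name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]

def update_weight(weights, rule, reward, w_min=1.0, w_max=100.0):
    """Add the reward to one rule's weight and clamp it to [w_min, w_max]."""
    weights[rule] = min(w_max, max(w_min, weights[rule] + reward))
    return weights

# Hypothetical rule base: every rule starts with the same weight.
weights = {"overtake_left": 10.0, "block": 10.0, "brake_early": 10.0}
update_weight(weights, "block", reward=-15.0)         # rule performed badly
update_weight(weights, "overtake_left", reward=20.0)  # rule performed well
chosen = select_rule(weights)
```

After the two updates, "overtake_left" is far more likely to be chosen than "block", which is the adaptation effect the slide describes.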

Page 12: Controller for TORCS created by imitation

One researcher cloned the behavior of a RoboCup player using case-based reasoning (solving new problems based on the solutions of similar past problems).

RoboCup: a simulated soccer league.

Page 13: Controller for TORCS created by imitation

Another researcher programmed robosoccer agents by modelling human behavior, with successful results.

RoboCup: a simulated soccer league.

Page 14: Controller for TORCS created by imitation

1. Introduction 2. Related work 3. TORCS competition 4. Controllers 5. Controller learning by imitation 6. Results 7. Conclusions 8. Future works

Page 15: Controller for TORCS created by imitation

A very realistic simulator with a sophisticated physics engine that takes into account many aspects of racing, such as fuel consumption, collisions and traction.

It provides many tracks and cars with different features.

TORCS is open-source software, which allows researchers to modify the game and adapt it to their requirements.

Page 16: Controller for TORCS created by imitation

Information provided:

• The lap (current lap time, best lap time, distance raced, race position)

• The car status (damage, fuel, current gear, speed, lateral speed and R.P.M.)

• Distance between the car and the track edges

• More...

Actions sent:

• Acceleration level

• Brake level

• Gear

• Steering of the wheel
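The sensor and action lists above map naturally onto two small records. This is only a sketch: the class and field names are hypothetical and are not the competition API. The count of 19 track sensors and the [0,1] and [-1,1] ranges follow the values quoted later in the slides for Simmerson's controller.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class CarState:
    """Information the game provides to the controller (names are hypothetical)."""
    current_lap_time: float = 0.0
    best_lap_time: float = 0.0
    distance_raced: float = 0.0
    race_position: int = 1
    damage: float = 0.0
    fuel: float = 0.0
    gear: int = 0
    speed: float = 0.0
    lateral_speed: float = 0.0
    rpm: float = 0.0
    # Distances between the car and the track edges, one per range sensor.
    track_edge_distances: List[float] = field(default_factory=lambda: [0.0] * 19)

@dataclass
class Action:
    """Action the controller sends back to the game."""
    accel: float = 0.0
    brake: float = 0.0
    gear: int = 1
    steering: float = 0.0

def clamp_action(action):
    """Clamp accel and brake to [0, 1] and steering to [-1, 1]."""
    return Action(
        accel=min(1.0, max(0.0, action.accel)),
        brake=min(1.0, max(0.0, action.brake)),
        gear=action.gear,
        steering=min(1.0, max(-1.0, action.steering)),
    )

safe = clamp_action(Action(accel=1.5, brake=-0.2, gear=3, steering=-2.0))
```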

Page 17: Controller for TORCS created by imitation

1. Introduction 2. Related work 3. TORCS competition 4. Controllers 5. Controller learning by imitation 6. Results 7. Conclusions 8. Future works

Page 18: Controller for TORCS created by imitation

In the experiments we use the data obtained from three different controllers:

• Human player

• The winner of the WCCI 2008 competition

• Hand-coded controller

Page 19: Controller for TORCS created by imitation

The information that the human gets from watching the game monitor is much richer than the sensor data available to the programmed controllers.

He drove the car trying to stay in the middle of the road, with gentle acceleration and braking, braking well before the next curve started, and without fast, sharp turns.

The human tried to drive the car as a programmed controller would, but with the mistakes a human makes.

Page 20: Controller for TORCS created by imitation

Created by Matt Simmerson. This controller was the winner of the WCCI 2008 simulated car racing competition.

As inputs to the ANN, which was created with NEAT, he selected: the current speed, the angle to the track axis, the track position with respect to the left and right edges, the current gear selection, the four wheel spin sensors, the current R.P.M. and 19 track sensors.

All these inputs were scaled to the range [0,1].
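Scaling an input to [0,1] is commonly done with min-max normalization. The slides do not say which scaling was used, so the sketch below is an assumption, and the bounds in the example (a speed range of 0 to 300) are hypothetical.

```python
def scale_to_unit(value, lo, hi):
    """Min-max scale a sensor reading to [0, 1], clipping out-of-range values."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    clipped = min(hi, max(lo, value))
    return (clipped - lo) / (hi - lo)

# Hypothetical bounds: a speed sensor assumed to range from 0 to 300.
half = scale_to_unit(150.0, 0.0, 300.0)
capped = scale_to_unit(400.0, 0.0, 300.0)
```

Clipping before scaling keeps an unexpected reading from pushing an input outside the range the network was trained on.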

Page 21: Controller for TORCS created by imitation

The outputs of the ANN were the power (accelerate and brake), the gear change and the steering. The first two are in the range [0,1] and the last one is in the range [-1,1].

The fitness function used to evaluate the ANN in NEAT took into account the distance raced, the R.P.M., the maximum speed reached and a measure of how much the car stayed on the track.

Page 22: Controller for TORCS created by imitation

Another controller was created because the human sometimes makes mistakes, and Simmerson's controller does not complete a lap on every track and sometimes leaves the track.

Thus, the requirements are:

• To produce the same outputs for the same inputs (without mistakes), i.e. to be deterministic

• To complete a lap without leaving the track

Page 23: Controller for TORCS created by imitation

To calculate the values for the acceleration and the brake, we first calculate the speed the car should have (the estimated speed).

Where sum_sensors is the sum of the three front sensors (which give the distance between the car and the edge of the track), and alpha and beta are predefined parameters.

With this value (the estimated speed) we calculate the difference:

Page 24: Controller for TORCS created by imitation

The acceleration and brake values are proportional to the absolute value of the difference between the current speed and the estimated speed:

Where again there are adjustment parameters.
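The slide formulas are not reproduced in this transcript, so the following is only a sketch of the idea. It assumes, without confirmation from the source, that the estimated speed is a linear function of sum_sensors with parameters alpha and beta. The gains k_accel and k_brake stand in for the unnamed adjustment parameters and are hypothetical.

```python
def estimated_speed(front_sensors, alpha, beta):
    """ASSUMPTION: linear form alpha * sum_sensors + beta (not confirmed by the slides)."""
    return alpha * sum(front_sensors) + beta

def accel_and_brake(speed, target_speed, k_accel, k_brake):
    """Return (accel, brake), each proportional to |target_speed - speed| and capped at 1."""
    diff = target_speed - speed
    if diff > 0:
        # Slower than the estimated speed: accelerate, do not brake.
        return min(1.0, k_accel * abs(diff)), 0.0
    # At or above the estimated speed: brake, do not accelerate.
    return 0.0, min(1.0, k_brake * abs(diff))

target = estimated_speed([10.0, 20.0, 10.0], alpha=2.0, beta=5.0)
speeding_up = accel_and_brake(50.0, 100.0, k_accel=0.01, k_brake=0.02)
slowing_down = accel_and_brake(100.0, 50.0, k_accel=0.01, k_brake=0.02)
```

Because the outputs depend only on the current sensor values, the same inputs always produce the same action, which matches the determinism requirement stated for this controller.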

Page 25: Controller for TORCS created by imitation

The steering value calculation: first, we check whether the car is on a straight or in a curve. We assume the car is on a straight when any of the 3 front sensors reads its maximum value, and in a curve otherwise.

Straight equation:

Curve equation:
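The straight/curve test described on this slide can be sketched in a few lines. The maximum sensor range is passed in as a parameter, and the value used in the example (100.0) is hypothetical. The two steering equations are not reproduced in this transcript and are not guessed at here.

```python
def is_straight(front_sensors, max_range):
    """True when any front sensor reads its maximum value, i.e. sees no track edge."""
    return any(reading >= max_range for reading in front_sensors)

# Hypothetical readings from the three front sensors.
on_straight = is_straight([100.0, 80.0, 60.0], max_range=100.0)
in_curve = is_straight([40.0, 30.0, 20.0], max_range=100.0)
```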

Page 26: Controller for TORCS created by imitation

Finally the gear is calculated by:

Where lambda is an adjustment parameter and speed is the current speed of the car.

(The controller does not allow the gear to change twice in less than a second.)
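The one-second restriction on gear changes can be sketched as a small rate limiter. The gear formula itself is not reproduced in this transcript, so the desired gear is passed in from outside. The class name and its interface are hypothetical.

```python
class GearLimiter:
    """Accept a gear change only if at least min_interval seconds have passed."""

    def __init__(self, initial_gear=1, min_interval=1.0):
        self.gear = initial_gear
        self.min_interval = min_interval
        self.last_change = None  # time of the last accepted change

    def update(self, desired_gear, now):
        """Return the gear to send at time `now` (in seconds)."""
        allowed = (self.last_change is None
                   or now - self.last_change >= self.min_interval)
        if desired_gear != self.gear and allowed:
            self.gear = desired_gear
            self.last_change = now
        return self.gear

limiter = GearLimiter()
first = limiter.update(2, now=0.0)   # accepted: no previous change
second = limiter.update(3, now=0.5)  # rejected: only 0.5 s since the last change
third = limiter.update(3, now=1.0)   # accepted: a full second has passed
```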

Page 27: Controller for TORCS created by imitation

1. Introduction 2. Related work 3. TORCS competition 4. Controllers 5. Controller learning by imitation 6. Results 7. Conclusions 8. Future works

Page 28: Controller for TORCS created by imitation

To learn the behavior of a controller we used an ANN.

First, we obtain the data from the controller we want to imitate; then the ANN is trained with the backpropagation algorithm; finally, the new controller is tested on the tracks.

Page 29: Controller for TORCS created by imitation

Inputs:

• Current speed of the car

• Angle of the car with the track axis

• The current gear

• The lateral speed of the car

• The R.P.M.

• The 4 wheel spins

• 19 sensors: distance between the car and the edges of the track

Outputs:

• The accelerate / brake value

• The gear

• The steering
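Counting the inputs listed above gives 5 scalar values, 4 wheel spins and 19 track sensors, for 28 values per pattern. A minimal sketch of assembling that vector follows. The function name and argument order are hypothetical, and any scaling of the values is left out.

```python
def build_input_vector(speed, angle, gear, lateral_speed, rpm,
                       wheel_spins, track_sensors):
    """Flatten the listed inputs into one list of 28 floats."""
    if len(wheel_spins) != 4:
        raise ValueError("expected 4 wheel spin values")
    if len(track_sensors) != 19:
        raise ValueError("expected 19 track sensor values")
    return [speed, angle, float(gear), lateral_speed, rpm,
            *wheel_spins, *track_sensors]

# Hypothetical readings for one time step.
vector = build_input_vector(
    speed=120.0, angle=0.05, gear=4, lateral_speed=0.3, rpm=6500.0,
    wheel_spins=[80.0, 80.5, 79.8, 80.1],
    track_sensors=[10.0] * 19,
)
```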

Page 30: Controller for TORCS created by imitation

In all the experiments the ANN has 3 hidden layers of 28 neurons each; it was trained for 1000 cycles, with a learning rate that starts at 0.9 and ends at 0.0001.

The data was taken from 17 road tracks (next slide), but only if the controller completed almost 3 laps.

[Figure: network diagram showing the input layer, the hidden layers and the output layer]
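The training setup described on this slide can be sketched with NumPy as a feed-forward network with 3 hidden layers of 28 neurons, trained by backpropagation for 1000 cycles. Several details are assumptions because the slides do not state them: the tanh activations, the geometric shape of the learning-rate decay from 0.9 to 0.0001, the weight initialization, and the random stand-in data used in place of recorded driving patterns.

```python
import numpy as np

def learning_rate(cycle, n_cycles=1000, lr_start=0.9, lr_end=0.0001):
    """Geometric decay from lr_start to lr_end (the decay shape is an assumption)."""
    t = cycle / (n_cycles - 1)
    return lr_start * (lr_end / lr_start) ** t

def init_network(sizes, rng):
    """One (weights, biases) pair per layer, with small random weights."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(n_in), (n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Return the activations of every layer, starting with the input itself."""
    activations = [x]
    for w, b in params:
        activations.append(np.tanh(activations[-1] @ w + b))
    return activations

def train(params, x, y, n_cycles=1000):
    """Full-batch backpropagation on the squared error."""
    for cycle in range(n_cycles):
        lr = learning_rate(cycle, n_cycles)
        acts = forward(params, x)
        # Error signal at the output layer (tanh derivative is 1 - a^2).
        delta = (acts[-1] - y) * (1.0 - acts[-1] ** 2)
        for i in reversed(range(len(params))):
            w, b = params[i]
            grad_w = acts[i].T @ delta / len(x)
            grad_b = delta.mean(axis=0)
            if i > 0:
                # Propagate the error with the weights as they were before the update.
                delta = (delta @ w.T) * (1.0 - acts[i] ** 2)
            params[i] = (w - lr * grad_w, b - lr * grad_b)
    return params

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, (64, 28))  # stand-in for 64 recorded input patterns
y = np.tanh(x[:, :3] - 0.5)          # stand-in targets for the 3 outputs
params = train(init_network([28, 28, 28, 28, 3], rng), x, y)
prediction = forward(params, x)[-1]
```

A real run would replace `x` and `y` with the patterns logged from the controller being imitated.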

Page 31: Controller for TORCS created by imitation
Page 32: Controller for TORCS created by imitation

Data obtained for each controller:

Page 33: Controller for TORCS created by imitation

1. Introduction 2. Related work 3. TORCS competition 4. Controllers 5. Controller learning by imitation 6. Results 7. Conclusions 8. Future works

Page 34: Controller for TORCS created by imitation

For the “controllers learning by imitation” we used the data from all tracks to train the ANN of the controller and then tested it on each track.

Page 35: Controller for TORCS created by imitation

The times obtained on each track by the controllers described earlier.

Page 36: Controller for TORCS created by imitation

The results of the learned controllers for each of the 3 controllers.

Page 37: Controller for TORCS created by imitation
Page 38: Controller for TORCS created by imitation

A controller trained on the combined data of the human, Simmerson's and the hand-coded controllers was not created, because the mixed configurations did not give good results, as shown in the last 3 tables.

Page 39: Controller for TORCS created by imitation

1. Introduction 2. Related work 3. TORCS competition 4. Controllers 5. Controller learning by imitation 6. Results 7. Conclusions 8. Future works

Page 40: Controller for TORCS created by imitation

It is very complicated to learn human behavior in a video game:

• The human takes different actions in the same situations.

• He does not perform all actions in the proper way.

• He makes a lot of mistakes that have to be corrected.

This sort of data makes it impossible for an ANN to learn anything useful.

Page 41: Controller for TORCS created by imitation

The gear problem: the gear change was not learned, even though it is probably the easiest output to learn. This may be because of the large amount of data and because the gear is also an input of the ANN.

Page 42: Controller for TORCS created by imitation

The mixed data from two types of controllers: the results show that those controllers do not work.

The learned controller has mixed features, but none of them is learned properly, so the car easily leaves the track and the simulation ends.

Page 43: Controller for TORCS created by imitation

1. Introduction 2. Related work 3. TORCS competition 4. Controllers 5. Controller learning by imitation 6. Results 7. Conclusions 8. Future works

Page 44: Controller for TORCS created by imitation

Pre-process the data before using it to train the neural network. Two ideas:

• Decrease the amount of data by removing duplicates.

• Remove data patterns that have the same input but different outputs.
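Both pre-processing ideas can be sketched in one pass over the data: group the patterns by input, keep a single copy when all outputs agree, and drop the input entirely when they conflict. Rounding the values so that near-identical patterns compare equal is an assumption, as is the pattern format of (inputs, outputs) tuples.

```python
from collections import defaultdict

def clean_patterns(patterns, decimals=3):
    """Drop duplicate patterns and patterns whose input maps to different outputs."""
    outputs_by_input = defaultdict(set)
    for inputs, outputs in patterns:
        key = tuple(round(v, decimals) for v in inputs)
        outputs_by_input[key].add(tuple(round(v, decimals) for v in outputs))
    # Keep an input only when every occurrence agreed on the output.
    return [(key, next(iter(outs)))
            for key, outs in outputs_by_input.items() if len(outs) == 1]

data = [
    ((0.1, 0.2), (1.0,)),
    ((0.1, 0.2), (1.0,)),  # exact duplicate: kept once
    ((0.3, 0.4), (0.0,)),
    ((0.3, 0.4), (1.0,)),  # same input, different output: both removed
    ((0.5, 0.6), (0.5,)),
]
cleaned = clean_patterns(data)
```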

Page 45: Controller for TORCS created by imitation

Train the controllers with only part of each controller's data.

For example, use the straight-section data from a controller that drives the straights well and the curve data from another controller that takes the turns well.
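This selection by track segment can be sketched as a filter over two recorded datasets. The pattern format, the `front` key, the output labels and the maximum sensor range of 100.0 are all hypothetical. The straight test reuses the rule stated earlier for the hand-coded controller (any front sensor at its maximum value).

```python
def mix_by_segment(straight_source, curve_source, is_straight):
    """Keep straight patterns from one dataset and curve patterns from another."""
    mixed = [p for p in straight_source if is_straight(p[0])]
    mixed += [p for p in curve_source if not is_straight(p[0])]
    return mixed

MAX_RANGE = 100.0  # hypothetical maximum sensor reading

def front_is_clear(inputs):
    """Straight when any front sensor reads its maximum value."""
    return any(s >= MAX_RANGE for s in inputs["front"])

# Hypothetical (inputs, outputs) patterns from two controllers.
controller_a = [({"front": [100.0, 80.0, 60.0]}, "a-straight"),
                ({"front": [30.0, 20.0, 10.0]}, "a-curve")]
controller_b = [({"front": [100.0, 100.0, 90.0]}, "b-straight"),
                ({"front": [25.0, 15.0, 5.0]}, "b-curve")]

mixed = mix_by_segment(controller_a, controller_b, front_is_clear)
labels = [outputs for _, outputs in mixed]
```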

Page 46: Controller for TORCS created by imitation

If we want to imitate human behavior, we have to increase the information used to train the ANN. There is a lack of information because the human player senses more information from the domain than the other controllers use. The human also remembers and improves his/her behavior in each lap.

The human also does not perform the same action under the same circumstances.

Page 47: Controller for TORCS created by imitation

The ANN: there is no context information; the controller does not remember its past actions and cannot make decisions based on them.

Recurrent neural networks may be needed, and different learning algorithms may be worth trying.
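To show what "context" means here, the following sketch implements one step of a simple (Elman-style) recurrent layer in NumPy. The slides only suggest recurrent networks as a possibility; the layer sizes, the random weights and the tanh activation are assumptions, and no training is shown.

```python
import numpy as np

def elman_step(x, h_prev, w_in, w_rec, b):
    """New hidden state computed from the current input and the previous hidden state."""
    return np.tanh(x @ w_in + h_prev @ w_rec + b)

rng = np.random.default_rng(0)
n_inputs, n_hidden = 28, 28
w_in = rng.normal(0.0, 0.3, (n_inputs, n_hidden))
w_rec = rng.normal(0.0, 0.3, (n_hidden, n_hidden))
b = np.zeros(n_hidden)

x = rng.uniform(0.0, 1.0, n_inputs)  # the same sensor reading at two time steps
h_first = elman_step(x, np.zeros(n_hidden), w_in, w_rec, b)
h_second = elman_step(x, h_first, w_in, w_rec, b)
```

The same input produces a different hidden state on the second step because the state carries over from the first, which is exactly the memory a plain feed-forward network lacks.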
