Hybrid Systems

Hybridization

• Integrated architectures for machine learning have been shown to provide performance improvements over single representation architectures.

• Integration, or hybridization, is achieved using a spectrum of module or component architectures ranging from those sharing independently functioning components to architectures in which different components are combined in inherently inseparable ways.

Neural Expert System

Neural Network training with a genetic algorithm

Fuzzy Case Based System

Neural Network hidden layer complexification with a GA

Evolution of Neural Network game Players

Hybridization Research

Evolving Neural Networks and Competitive Co-evolution

Blondie and Mancala

Mancala

The board is initialized with 3 markers in each of six bowls. One side and one empty bowl belong to each player. The two end bowls are initially empty and the object is for each player to accumulate markers in their bowl. At each turn a player chooses a bowl and distributes the seeds into the bowls immediately following in the counter-clockwise direction. There is a capturing rule and a rule for making a sequence of moves, and the game ends when one side of the board is empty, the winner being the player holding the greatest number of markers.

Planning Ahead

Four pieces (marbles or stones) are placed in each of the 12 holes. Each player has a 'store' to the right side of the Mancala board. The game begins with one player picking up all of the pieces in any one of the holes on his side. Moving counter-clockwise, the player deposits one of the stones in each hole until the stones run out. If you run into your own store, deposit one piece in it. If you run into your opponent's store, skip it. If the last piece you drop is in your own store, you get a free turn. If the last piece you drop is in an empty hole on your side, you capture that piece and any pieces in the hole directly opposite. Always place all captured pieces in your store. The game ends when all six spaces on one side of the Mancala board are empty. The player who still has pieces on his side of the board when the game ends captures all of those pieces. Count all the pieces in each store. The winner is the player with the most pieces. Try to plan two or three moves into the future.

Competitive Co-evolution

Fitness is based on direct competition among individuals selected from two independently evolving populations

The Problem

• Conventional intelligent gaming systems rely on rule-based systems or heuristic functions with look ahead to challenge opponents.

• However, systems that do not incorporate learning lose the interest of players, who quickly find ways to beat them.

• Game developers are currently developing integrated techniques that combine neural networks with evolutionary programming to enable products with self-improving strategies.

Gaming Agents

• Entertainment software is using competitively co-evolving populations of neural nets to evolve entertainment agents that can adapt their characters or learn to play competitively in games to extend the novelty of intelligent board games and other products [Fog1, Fog2, Ros].

• In such a co-evolving system the fitness measure is determined by the outcome of competitions between individuals in the competing populations.

• Hard to devise strategies to keep players interested in games

• Players are at different levels

• Players improve and get bored

• Typically there is no set of training examples to learn from

Evolving Weights

[Figure: feedforward network with inputs x1, x2, x3 (neurons 1 to 3), hidden neurons 4 to 7, and output y (neuron 8), shown with its connection weights.]

Connection weight matrix (rows: to neuron; columns: from neuron):

To \ From     1     2     3     4     5     6     7     8
1             0     0     0     0     0     0     0     0
2             0     0     0     0     0     0     0     0
3             0     0     0     0     0     0     0     0
4             0.9  -0.3  -0.7   0     0     0     0     0
5            -0.8   0.6   0.3   0     0     0     0     0
6             0.1  -0.2   0.2   0     0     0     0     0
7             0.4   0.5   0.8   0     0     0     0     0
8             0     0     0    -0.6   0.1  -0.2   0.9   0

Chromosome: 0.9 -0.3 -0.7 -0.8 0.6 0.3 0.1 -0.2 0.2 0.4 0.5 0.8 -0.6 0.1 -0.2 0.9
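
The encoding on this slide can be sketched in a few lines: the chromosome is the weight matrix read row by row, keeping only the entries where a connection exists. The function names `encode` and `decode` and the separate connectivity `MASK` are assumptions made for this illustration, not part of the original slides.

```python
# Weight matrix from the slide: row = "to" neuron, column = "from" neuron.
# Neurons 1-3 are inputs, 4-7 hidden, 8 the output.
WEIGHTS = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0.9, -0.3, -0.7, 0, 0, 0, 0, 0],
    [-0.8, 0.6, 0.3, 0, 0, 0, 0, 0],
    [0.1, -0.2, 0.2, 0, 0, 0, 0, 0],
    [0.4, 0.5, 0.8, 0, 0, 0, 0, 0],
    [0, 0, 0, -0.6, 0.1, -0.2, 0.9, 0],
]

# Hypothetical connectivity mask: 1 where a connection exists.
# Derived from the non-zero entries here; a real system would store it separately
# so that a connection whose weight happens to be 0 is not lost.
MASK = [[1 if w != 0 else 0 for w in row] for row in WEIGHTS]


def encode(weights, mask):
    """Flatten the matrix row by row, keeping only existing connections."""
    return [w for w_row, m_row in zip(weights, mask)
            for w, m in zip(w_row, m_row) if m]


def decode(chromosome, mask):
    """Rebuild the weight matrix from a chromosome and the connectivity mask."""
    genes = iter(chromosome)
    return [[next(genes) if m else 0 for m in row] for row in mask]


CHROMOSOME = encode(WEIGHTS, MASK)
```

Decoding with the same mask restores the original matrix, so genetic operators can work on the flat list while the network itself is rebuilt only for evaluation.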

Crossover in weight optimisation

[Figure: Parent 1, Parent 2 and Child networks, each with inputs x1 and x2 (neurons 1 and 2), hidden neurons 3 to 5, and output y (neuron 6), shown with their connection weights.]

Parent 1 chromosome: 0.1 -0.7 -0.6 0.5 -0.8 | -0.2 0.9 0.4 -0.3

Parent 2 chromosome: 0.3 0.2 0.3 -0.9 0.6 | 0.9 -0.5 -0.8 -0.1

Child chromosome: 0.1 -0.7 -0.6 0.5 -0.8 | 0.9 -0.5 -0.8 -0.1
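
A minimal sketch of the crossover shown on this slide, assuming a single cut point: the child takes the genes before the cut from Parent 1 and the rest from Parent 2. The function name `one_point_crossover` and the choice of a cut after the fifth gene are assumptions read off the slide's chromosomes, not a statement of how the original system was implemented.

```python
import random


def one_point_crossover(parent1, parent2, point=None, rng=random):
    """Return a child with parent1's genes before `point` and parent2's after it."""
    if len(parent1) != len(parent2):
        raise ValueError("parents must have the same number of genes")
    if point is None:
        # Pick a cut strictly inside the chromosome so both parents contribute.
        point = rng.randint(1, len(parent1) - 1)
    return parent1[:point] + parent2[point:]


PARENT1 = [0.1, -0.7, -0.6, 0.5, -0.8, -0.2, 0.9, 0.4, -0.3]
PARENT2 = [0.3, 0.2, 0.3, -0.9, 0.6, 0.9, -0.5, -0.8, -0.1]

# Cutting after the fifth gene reproduces the child weights drawn on the slide.
CHILD = one_point_crossover(PARENT1, PARENT2, point=5)
```

Because every chromosome has the same length and gene order, the child always decodes to a valid network of the same topology.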

Mutation in weight optimisation

[Figure: the original network and the mutated network, each with inputs x1 and x2 (neurons 1 and 2), hidden neurons 3 to 5, and output y (neuron 6). Two weights change: 0.4 becomes -0.1 and -0.3 becomes 0.2.]

Original network chromosome: 0.1 -0.7 -0.6 0.5 -0.8 -0.2 0.9 0.4 -0.3

Mutated network chromosome: 0.1 -0.7 -0.6 0.5 -0.8 -0.2 0.9 -0.1 0.2
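
Weight mutation can be sketched as adding a small random value to some of the genes. The slide does not state the mutation rule, so the per-gene probability `rate`, the uniform perturbation of width `scale`, and the name `mutate_weights` are all assumptions made for illustration.

```python
import random


def mutate_weights(chromosome, rate=0.2, scale=1.0, rng=None):
    """Return a copy in which each gene is perturbed with probability `rate`.

    A perturbed gene has a value drawn uniformly from [-scale, scale] added to it.
    The input chromosome is left unchanged.
    """
    rng = rng or random.Random()
    return [w + rng.uniform(-scale, scale) if rng.random() < rate else w
            for w in chromosome]


ORIGINAL = [0.1, -0.7, -0.6, 0.5, -0.8, -0.2, 0.9, 0.4, -0.3]
MUTATED = mutate_weights(ORIGINAL, rate=0.2, scale=1.0, rng=random.Random(0))
```

In the slide's example only the last two genes change (by -0.5 and +0.5), which is consistent with a perturbation of this kind but does not confirm it.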

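Evolving the network topology

[Figure: network with inputs x1 and x2 (neurons 1 and 2), hidden neurons 3 to 5, and output y (neuron 6).]

Connectivity matrix (rows: to neuron; columns: from neuron; 1 = connection present):

To \ From     1   2   3   4   5   6
1             0   0   0   0   0   0
2             0   0   0   0   0   0
3             1   1   0   0   0   0
4             1   0   0   0   0   0
5             0   1   0   0   0   0
6             0   1   1   1   1   0

Chromosome: 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 1 1 1 1 0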

A crossover operator randomly chooses a row index and simply swaps the corresponding rows between two parents, creating two offspring. A mutation operator flips a bit in the chromosome with a specified probability.
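
The two topology operators just described can be sketched directly on the connectivity matrix. The function names `to_chromosome`, `swap_row` and `flip_bits` are hypothetical, chosen for this illustration.

```python
import random

# Connectivity matrix from the slide: row = "to" neuron, column = "from" neuron.
TOPOLOGY = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
]


def to_chromosome(matrix):
    """Concatenate the rows of the connectivity matrix into one bit string."""
    return [bit for row in matrix for bit in row]


def swap_row(parent1, parent2, row=None, rng=random):
    """Crossover: swap one (randomly chosen) row between two parents."""
    if row is None:
        row = rng.randrange(len(parent1))
    child1 = [list(r) for r in parent1]
    child2 = [list(r) for r in parent2]
    child1[row], child2[row] = child2[row], child1[row]
    return child1, child2


def flip_bits(chromosome, rate, rng=random):
    """Mutation: flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in chromosome]
```

Swapping a whole row exchanges all incoming connections of one neuron at once, so each offspring keeps complete neurons from its parents.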

NEAT ALGORITHM

• NeuroEvolution of Augmenting Topologies (NEAT) improved genetic algorithms by including complexification and speciation in the algorithm

• Alignment during crossover through synapsis

• Speciation protects complexification

Co-evolution and Game Playing

• Can an ANN be used in the gaming situation?

• What set of examples can be used?

Problems

• Adding up the score in all competitions may cause the extinction of important types that are not widely present in the populations, thus diminishing overall diversity.

• As genetic material transforms, lack of diversity may result in newer individuals that gradually lose the ability to defeat older and perhaps weaker opponents from previous generations.

• To promote diversity, authors have developed methods of computing fitness that lessen the possibility of extinction.

Exercise: Compute fitness in a tournament

A 1 in row Px, column Oy indicates that Px beat Oy.

        O1   O2   O3   O4   O5   O6   AddS   FS
P1      1    0    1    1    0    0
P2      0    0    1    1    1    0
P3      1    1    1    1    0    0
P4      1    1    0    0    0    0
P5      1    0    0    1    0    0
P6      0    0    0    0    1    1

Fitness Sharing

• Under fitness sharing, rare individuals (those that can defeat opponents that few others can defeat) receive a greater fitness than simply adding up scores would give.

• More prevalent types have their fitness shared among a larger number of individuals and thus score lower.

Fitness Sharing

• Using fitness sharing, if a competitor i defeats all opponents j in set X, then the fitness of competitor i is computed using

$$f_i = \sum_{j \in X} \frac{1}{N_j}$$

• where N_j is the total number of competitors defeating opponent j.
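
As a worked illustration, the sketch below applies both scoring schemes to the tournament table from the exercise: a plain sum of wins, and the shared fitness in which each win over opponent j is worth 1/N_j. The names `RESULTS`, `simple_fitness` and `shared_fitness` are assumptions made for this example.

```python
from fractions import Fraction

# Tournament table from the exercise: RESULTS[p][k] == 1 means p beat opponent O(k+1).
RESULTS = {
    "P1": [1, 0, 1, 1, 0, 0],
    "P2": [0, 0, 1, 1, 1, 0],
    "P3": [1, 1, 1, 1, 0, 0],
    "P4": [1, 1, 0, 0, 0, 0],
    "P5": [1, 0, 0, 1, 0, 0],
    "P6": [0, 0, 0, 0, 1, 1],
}


def simple_fitness(results):
    """Fitness as the plain number of opponents defeated."""
    return {p: sum(row) for p, row in results.items()}


def shared_fitness(results):
    """Fitness as the sum of 1/N_j over the opponents j that each player defeated."""
    # N_j: how many competitors defeated opponent j (column sums of the table).
    n_defeating = [sum(col) for col in zip(*results.values())]
    return {
        p: sum(Fraction(1, n) for win, n in zip(row, n_defeating) if win)
        for p, row in results.items()
    }
```

On this table P3 has the highest summed score, while P6, the only player to beat O6, has the highest shared fitness. Exact fractions are used so that the values can be compared without rounding.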

Opponent Sampling

• Choosing opponents wisely will reduce the number of competitions.

• Because information is available about the performance of each opponent at each generation, ways exist to choose a smaller number of opponents without losing performance.

• As with competitors, selecting only the best individuals may result in a lack of diversity in the opponents.

• Diversity can be preserved by again choosing opponents with maximal shared fitness.
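
One way to read "choosing opponents with maximal shared fitness" is a greedy selection: repeatedly pick the opponent whose wins are least covered by the opponents already in the sample. The sketch below follows that reading; the function name `shared_sample`, the `defeats` data layout and the tie-breaking by name are assumptions for illustration, not the procedure from the slides.

```python
def shared_sample(defeats, sample_size):
    """Greedily pick a diverse sample of opponents.

    `defeats` maps each opponent to the set of competitors it beat in the
    previous generation. A win over a competitor counts for less once
    already-chosen opponents also beat that competitor.
    """
    covered = {}              # competitor -> number of chosen opponents that beat it
    remaining = dict(defeats)
    sample = []
    while remaining and len(sample) < sample_size:
        def sample_fitness(name):
            return sum(1.0 / (1 + covered.get(c, 0)) for c in remaining[name])

        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=sample_fitness)
        sample.append(best)
        for competitor in remaining.pop(best):
            covered[competitor] = covered.get(competitor, 0) + 1
    return sample


EXAMPLE = {"O1": {"A", "B", "C"}, "O2": {"A", "B"}, "O3": {"D", "E"}}
CHOSEN = shared_sample(EXAMPLE, 2)
```

In the example O1 is picked first; O3 is then preferred over O2 because O2 only beats competitors that O1 already beats.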

Elitism

A finite population has a very short memory. A genotype must be successful almost every generation to stay in the population. Once gone it can be difficult to rediscover. In using the GA for static optimization problems this is not very significant because old genotypes were strictly worse according to the fitness measure.

In competitive coevolution such old genotypes might have become successful again. And losing them from the population allows the opposing population to be successful with new types that would lose against old extinct genotypes.

The result is that progress towards the optimum is no longer guaranteed.

Hall of Fame

In competitive coevolution we have two distinct reasons to save individuals.

One reason is to contribute genetic material to future generations; this is important in any evolutionary algorithm. Selection serves this purpose. Elitism serves this purpose directly by making complete copies of top individuals.

The second reason to save individuals is for purposes of testing. To ensure progress we may want to save individuals for an arbitrarily long time and continue testing against them.

To this end we introduce the hall of fame, which extends elitism in time for purposes of testing. The best individual from every generation is retained for future testing. Hosts are tested against both current parasites and a sample of the hall of fame. Successful new innovations cannot overspecialize: they are required to be robustly successful against old parasites.

Design of Blondie24

• Checkers neural network

• Values for input nodes
  – Red: positive
  – White: negative
  – Empty: zero

• Piece differential

Slides from Adam Duffy and Josh Hill

“A player’s move was determined by evaluating the presumed quality of the resulting future positions. The evaluation function was structured as an artificial neural network comprising an input layer, three internal processing (hidden) layers, and an output node (Figure 2). The first internal processing layer was designed to indicate the spatial characteristics of the checkerboard without indicating explicitly how such knowledge might be applied. The remaining internal and output neurons operated based on the dot product of the evolvable weighted connections between nodes and the activation strength that was emitted by each preceding node. Each node used a hyperbolic tangent (tanh, bounded by + and - 1) with an evolvable bias term. In addition, the sum of all entries in the input vector was supplied directly to the output node, constituting a measure of the relative piece differential” (see Chellapilla and Fogel, website).

Updating the weights

Design of Blondie24

• Connections between squares

• Subsections

Refined Architecture

Design of Blondie24

• Search methods
  – Minimax with Alpha-Beta Pruning
  – Iterative deepening
  – Hash table of previously evaluated positions (maximum of 270,000)

Minimax Searching

Minimax is a strategy for determining good moves in computer game playing: choose the move that minimizes the loss the player can expect to incur, under the assumption that the player's opponent is adopting the same strategy (and hence trying to cause the maximum loss). The possible gain or loss for a sequence of moves is usually assessed using a static evaluation function applied to the state of play at the end of the sequence.
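
A minimal sketch of minimax with alpha-beta pruning on an explicit game tree may help. Here a tree is a nested list whose leaves are static evaluation scores from the maximizing player's point of view; that representation and the function name `alphabeta` are assumptions for this illustration. A real checkers program would generate moves and call its evaluation function instead.

```python
import math


def alphabeta(node, maximizing=True, alpha=-math.inf, beta=math.inf):
    """Return the minimax value of `node`, skipping branches that cannot matter."""
    if not isinstance(node, list):
        return node                      # leaf: static evaluation score
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:            # the minimizer will never allow this line
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:                # the maximizer already has a better option
            break
    return value


# Three candidate moves, each answered by one of two opponent replies.
TREE = [[3, 5], [2, 9], [0, 7]]
BEST_VALUE = alphabeta(TREE)
```

In `TREE` the first move guarantees at least 3, so the search stops examining the second and third moves as soon as a reply worth 2 or 0 is found.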

• Initial population of 30

• Each neural network plays 5 games as red
  – +1 for a win
  – 0 for a draw
  – -2 for a loss

• Top 15 kept, lowest 15 eliminated

• Copy top 15 and mutate the weights
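
The generation loop in these bullets can be sketched as follows. The callbacks `play_game` and `mutate`, the random choice of opponents, and the decision to score only the red player are assumptions made for illustration; the slides do not say how opponents were picked.

```python
import random

# Points for the red player, taken from the slide: win +1, draw 0, loss -2.
POINTS = {1: 1, 0: 0, -1: -2}


def evolve_generation(population, play_game, mutate, games=5, rng=random):
    """Run one generation: score every network, keep the top half, add mutated copies.

    play_game(red, white) must return 1, 0 or -1 for a red win, draw or loss.
    mutate(network) must return a new, perturbed copy of the network.
    """
    scores = [0] * len(population)
    for i, red in enumerate(population):
        others = [k for k in range(len(population)) if k != i]
        for _ in range(games):
            white = population[rng.choice(others)]   # assumed: random opponent
            scores[i] += POINTS[play_game(red, white)]
    ranked = sorted(range(len(population)), key=lambda i: scores[i], reverse=True)
    survivors = [population[i] for i in ranked[:len(population) // 2]]
    return survivors + [mutate(s) for s in survivors]
```

With a population of 30 this keeps the top 15 and fills the other 15 places with their mutated copies, so the population size stays constant.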

Blondie24: Advantages

• Can learn new strategies

• Doesn’t have human biases

Blondie24: Disadvantages

• Evolution takes a long time

• Doesn’t make use of expert knowledge

Tests

• Played games on zone.com

• 165 games total (84 as red, 81 as white)

Results

• zone.com rating: 2045.85

• In top 500 of over 120,000

• Better than 99.61% of registered players

Chinook

• Primary feature - piece count

• Looks for certain features

• Over 40,000 opening lines of play

• Every ending with <= 8 pieces

• Higher value to positions with more pieces

• Rated 2,814 at retirement in 1996

Other applications

• Industry, medicine, and defense

• Pattern recognition

• Cancer

More information

• Blondie24: Playing at the Edge of AI by David B. Fogel

• Learning to play games using a PSO-based competitive learning approach by L. Messerschmidt and A.P. Engelbrecht

• The Advantages of Evolutionary Computation by David B. Fogel

• Solving the Game of Checkers by Jonathan Schaeffer and Robert Lake

Co-evolution of a Mancala Agent

The board is initialized with 3 markers in each of six bowls. One side and one empty bowl belong to each player. The two end bowls are initially empty and the object is for each player to accumulate markers in their bowl. At each turn a player chooses a bowl and distributes the seeds into the bowls immediately following in the counter-clockwise direction. There is a capturing rule and a rule for making a sequence of moves, and the game ends when one side of the board is empty, the winner being the player holding the greatest number of markers.

Detailed Rules

• Mancala is played with seven pits per player.

• Your pits are the 6 small pits on your side of the board, and the larger Kalaha pit on the right hand side.

• Each player starts the game by placing 3 stones into each of their 6 small pits.

• A turn consists of taking all the stones from one of your pits, and then dropping a stone into each successive pit in a counter-clockwise fashion

• If the final stone is placed in your Kalaha, then you get another turn.

• If the final stone ends in one of your empty pits, then that stone plus any stones in the opposite pit are placed into your Kalaha.

• If you drop a stone in your Kalaha and have stones left, then you continue dropping stones counter-clockwise into your opponent's pits.

• The winner is the person with the most stones in his Kalaha.

• The game ends when all of a player's pits are empty. At that point, the other player places the remaining stones in her Kalaha.
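
The sowing, extra-turn and capture rules above can be sketched as a single move function. The 14-entry board layout (indices 0 to 5 and 7 to 12 for the small pits, 6 and 13 for the two Kalahas) and the names `new_board` and `sow` are assumptions for this illustration. Skipping the opponent's Kalaha is taken from the earlier "Planning Ahead" rules rather than this list, and end-of-game handling is left out.

```python
def new_board(stones=3):
    """Board as a list: pits 0-5 and Kalaha 6 for player 0, pits 7-12 and Kalaha 13 for player 1."""
    return [stones] * 6 + [0] + [stones] * 6 + [0]


def sow(board, player, pit):
    """Play `pit` (0-5, counted from the player's left). Returns (new_board, extra_turn)."""
    board = list(board)
    own_store, opp_store = (6, 13) if player == 0 else (13, 6)
    pos = pit + (0 if player == 0 else 7)
    stones, board[pos] = board[pos], 0
    if stones == 0:
        raise ValueError("cannot play an empty pit")
    while stones:
        pos = (pos + 1) % 14             # counter-clockwise
        if pos == opp_store:
            continue                     # never sow into the opponent's Kalaha
        board[pos] += 1
        stones -= 1
    if pos == own_store:
        return board, True               # last stone in own Kalaha: play again
    own_pits = range(0, 6) if player == 0 else range(7, 13)
    if pos in own_pits and board[pos] == 1:
        opposite = 12 - pos              # pit directly across the board
        board[own_store] += board[pos] + board[opposite]
        board[pos] = board[opposite] = 0
    return board, False
```

For example, from the starting position `sow(new_board(), 0, 3)` drops the last of three stones into player 0's Kalaha and so grants another turn.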

Design of the Neural Network

• Number of inputs: 14

• Number of hidden layers: 1

• Number of neurons in hidden layer: 10

• Activation function: sigmoid
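
A minimal forward pass for a network of this shape is sketched below. The slide gives only the input and hidden sizes, so the six outputs (one score per pit), the bias weights, the weight range and the names `random_network`, `layer` and `choose_pit` are all assumptions made for illustration.

```python
import math
import random

N_INPUTS, N_HIDDEN = 14, 10   # from the slide
N_OUTPUTS = 6                 # assumption: one score per playable pit


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def random_network(rng=random):
    """Random weights in [-1, 1]; the last weight in each row is the bias."""
    return {
        "hidden": [[rng.uniform(-1, 1) for _ in range(N_INPUTS + 1)]
                   for _ in range(N_HIDDEN)],
        "output": [[rng.uniform(-1, 1) for _ in range(N_HIDDEN + 1)]
                   for _ in range(N_OUTPUTS)],
    }


def layer(weights, inputs):
    """One fully connected sigmoid layer (zip stops before the bias weight)."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + row[-1])
            for row in weights]


def choose_pit(network, board, legal_pits):
    """Feed the 14 pit counts through the network and pick the best legal pit."""
    scores = layer(network["output"], layer(network["hidden"], board))
    return max(legal_pits, key=lambda pit: scores[pit])
```

Flattening the two weight lists gives the chromosome that the genetic algorithm would evolve; with these sizes it has 10 × 15 + 6 × 11 = 216 genes.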

Results

• The system is implemented with two populations of neural networks with weights trained genetically.

• Several experiments were done. In the first, populations of 30 networks competed in each generation, taking the top 10 and randomly selecting the remainder using crossover and mutation. This method did not produce an interesting strategy.

• In the second, the populations were increased to 100, taking the top two in each generation and picking the rest based on their scores. After 5,000 generations this method produced a strategy that could beat a college-level opponent on jgames.com, but further iterations gave no improvement.

• Implementing fitness sharing produced a good college-level player after 5,000 iterations and, after 15,000 generations, a player that beat the highest-level opponent on jgames.com but had lost the ability to remain competitive against weaker opponents.

Observations

• By studying the agent in action several patterns and useful rules of play began to emerge that would help beginners improve.

• It would 'look ahead' for combinations to enable sequences of moves. It clearly denied the opponent good moves by avoiding placing beans across from the opponent's goal and it used a kind of 'starvation strategy' of stockpiling markers farthest away from the opponent's goal.

Conclusions

• In competitive co-evolution the fitness selection function is based on direct competition between individual strategies in the competing populations.

• As in software testing, the problem of sampling is very difficult for a number of reasons.

• New innovations may disappear before maturing. New competitor types contain chromosomes that defeat older types but may lose the ability to defeat older opponents that have been eliminated from the opposing population.

• Some can defeat large numbers of opponents while others can defeat smaller numbers of perhaps better opponents. It is hard to keep a balanced and diverse gene pool.

• Fitness sharing and shared sampling are techniques that have been used to develop more diverse populations [Ros].
