Even More Random Number Generators Using Genetic Programming


Joe Barker

Topics

• Genetic Programming
• Random Numbers
• Previous Efforts
• Design & Implementation
• Results
• Conclusion
• Future
• Bibliography

Genetic Programming

• Evolve programs for solutions, instead of solutions

• Difficulty of representation

• Works at a higher level than a standard EA, which compounds the standard EA problems

Genetic Programming

Gene Expression Programming

• Encodes operations in a string, much as genes are encoded in DNA

• Mutation and crossover are the obvious string operations

• Care is required to avoid gibberish

Genetic Programming

Gene Expression Programming

• Example
  AG-CC-GT-TA-CC
  2 + 1 * 3

Genetic Programming

Expression Trees

• Encodes operations in a natural tree structure
  – Internal nodes are operations
  – Leaf nodes are variables or constants

• Mutation & Crossover follow from the structure

• Layout of the tree avoids nonsensical results

Genetic Programming

Expression Trees

• Example
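As a minimal illustration of the expression-tree encoding, the sketch below stores internal nodes as tuples and leaves as constants or the variable J, then evaluates the tree recursively. The tuple layout, the `OPS` table, and the function name `evaluate` are assumptions made for this example, not the representation used in the project.

```python
import operator

# Internal nodes are (symbol, left, right); leaves are integer constants
# or the string "J" standing for the input variable. (Illustrative layout.)
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(node, j=0):
    """Recursively evaluate a tree; `j` is the value bound to the variable J."""
    if isinstance(node, tuple):
        symbol, left, right = node
        return OPS[symbol](evaluate(left, j), evaluate(right, j))
    return j if node == "J" else node

# The expression 2 + 1 * 3 as a tree: "+" at the root, "*" as its right child.
tree = ("+", 2, ("*", 1, 3))
result = evaluate(tree)
```

Because every internal node carries exactly two children, any subtree can be swapped for another without producing a malformed expression.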

Random Numbers

• Why?
  – Evolutionary Algorithms
  – Monte-Carlo Simulations
  – Software Regression Testing
  – Game Playing

Random Numbers

• What?
  – “Random” is difficult to define
  – Even statistical definitions do not necessarily describe what we would consider random
  – Uniform
    • 1-2-3-4-5 is uniformly distributed but not what we would consider random
  – Tests exist to try to cover the important aspects of randomness

Random Numbers

• Tests
  – Chi-Squared test for closeness of fit

$$V = \sum_i \frac{(Y_i - n \cdot p_i)^2}{n \cdot p_i}$$

$$\chi^2(V,\, n-1) \approx 0.5$$
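The statistic V can be computed directly from observed counts and expected probabilities. The following is a small sketch; the function name and argument names are chosen for this example. Converting V to a percentile needs the chi-squared distribution function, which is left out here.

```python
def chi_squared_statistic(observed, probabilities):
    """V = sum over categories of (Y_i - n*p_i)^2 / (n*p_i).

    observed      -- list of counts Y_i, one per category
    probabilities -- list of expected probabilities p_i (should sum to 1)
    """
    n = sum(observed)  # total number of observations
    return sum((y - n * p) ** 2 / (n * p)
               for y, p in zip(observed, probabilities))

# Two equally likely categories, 40 observations split 30/10.
v = chi_squared_statistic([30, 10], [0.5, 0.5])
```

A perfectly even split gives V = 0, which is itself suspicious for a random source; this is why a percentile near the middle of the distribution is the target rather than the smallest possible V.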

Random Numbers

• Tests
  – Frequency or Equidistribution test
    • Break number space into a small number of blocks
    • Use the counts for these blocks in a Chi-Squared test for Uniform distribution
  – Gap test
    • Break number space into two classes (normally upper and lower parts)
    • Count length of runs of class 2 between class 1
    • Use a Chi-Squared test with the following distribution:

$$p_l = n \cdot p_{one} \cdot (1 - p_{one})^l$$
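The counting steps of both tests can be sketched as follows. This is an illustration under stated assumptions: values are integers in [0, modulus), class 1 is taken to be values below a threshold and class 2 values at or above it, and gaps longer than `max_gap` are pooled into the last bin. The function names are invented for this example. `gap_probabilities` returns the per-gap probabilities; multiplying them by the number of gaps gives the expected counts in the formula above.

```python
def block_counts(values, num_blocks, modulus):
    """Frequency test: count values falling in each equal-width block of [0, modulus)."""
    counts = [0] * num_blocks
    for v in values:
        counts[v * num_blocks // modulus] += 1
    return counts

def gap_counts(values, threshold, max_gap):
    """Gap test: histogram of run lengths of class-2 values (>= threshold)
    that end at a class-1 value (< threshold). Runs of max_gap or more
    share the last bin."""
    counts = [0] * (max_gap + 1)
    run = 0
    for v in values:
        if v < threshold:            # class 1 closes the current gap
            counts[min(run, max_gap)] += 1
            run = 0
        else:                        # class 2 extends the current gap
            run += 1
    return counts

def gap_probabilities(p_one, max_gap):
    """Probability of each gap length 0..max_gap-1, plus the pooled tail."""
    q = 1 - p_one
    return [p_one * q ** l for l in range(max_gap)] + [q ** max_gap]

freq = block_counts([0, 1, 2, 3, 4, 5, 6, 7], num_blocks=4, modulus=8)
gaps = gap_counts([1, 9, 9, 1, 9, 1, 1], threshold=5, max_gap=3)
```

Either histogram can then be passed to a chi-squared statistic together with its expected probabilities.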

Random Numbers

• Tests
  – Entropy
    • Arrange the numbers as a bitstring and count occurrences with certain lengths
    • 101111110101010110011001001110
      – 10-11-11-11-01-01-01-01-10-01-10-01-00-11-10
      – 101-111-110-101-010-110-011-001-001-110
    • Use percent occurrences in the following formula:

$$E = \sum_{i=0}^{2^n - 1} p_i \cdot \log_2 \frac{1}{p_i}, \qquad E_{\max} = n$$
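A short sketch of the entropy calculation for one chunk length follows. It splits the bitstring into non-overlapping n-bit chunks, as in the hyphenated example above, and sums p * log2(1/p) over the chunks that occur. The function name is an assumption for this example.

```python
from collections import Counter
from math import log2

def chunk_entropy(bits, n):
    """Entropy (in bits) of the non-overlapping n-bit chunks of `bits`.

    Chunks that never occur contribute nothing, so only observed
    chunks are summed. The maximum possible value is n.
    """
    chunks = [bits[i:i + n] for i in range(0, len(bits) - n + 1, n)]
    total = len(chunks)
    counts = Counter(chunks)
    return sum((c / total) * log2(total / c) for c in counts.values())

sample = "101111110101010110011001001110"   # bitstring from the slide
e2 = chunk_entropy(sample, 2)
e3 = chunk_entropy(sample, 3)
```

Summing the values for several chunk lengths gives a combined score whose ideal is the sum of the lengths.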

Random Numbers

• How?
  – Computers are deterministic, so we must approximate
  – Several classes of pseudo-random number generators (PRNGs)

Random Numbers

• PRNGs
  – Linear congruential randomizers
    • Some of the earliest known

$$x_{i+1} = (a \cdot x_i + b) \bmod c, \qquad 0 \le a < c, \quad 0 \le b < c, \quad 0 \le x_0 < c$$

    • Common choices
      – Park-Miller: a=7^5, b=0, c=2^31-1
      – URN08/RANDU: a=65539, b=0, c=2^31
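The recurrence is short enough to write as a generator. The sketch below uses the slide's symbols a, b and c; the function name `lcg` and the seed value 1 are choices made for this example.

```python
from itertools import islice

def lcg(a, b, c, seed):
    """Yield x_1, x_2, ... from x_{i+1} = (a * x_i + b) mod c, starting at x_0 = seed."""
    x = seed
    while True:
        x = (a * x + b) % c
        yield x

# The two parameter sets listed above, both seeded with x_0 = 1.
park_miller = list(islice(lcg(7**5, 0, 2**31 - 1, seed=1), 3))
randu = list(islice(lcg(65539, 0, 2**31, seed=1), 3))
```

With b = 0 the seed must be nonzero, otherwise every output is 0.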

Random Numbers

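• PRNGs
  – Shift register randomizers
    • SR[a,b,c]
    • A common choice is SR[3,28,31]

[Equation not recoverable from the slide: a two-step recurrence in which a temporary t is formed from x_i and the shift a, and x_{i+1} is formed from t and the shift b, with 2^c - 1 appearing in both steps.]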

Random Numbers

• PRNGs
  – Shuffling randomizer
    • Uses two other PRNGs
    • The first PRNG fills and refills a list of numbers
    • The second PRNG selects a number from the list
  – Inversive
  – Mersenne Twister
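The shuffling scheme described above can be sketched with two generators and a table. The table size of 64, the function names, and the use of linear congruential generators for both roles are assumptions for illustration only.

```python
from itertools import islice

def lcg(a, b, c, seed):
    """Simple linear congruential source used for both roles below."""
    x = seed
    while True:
        x = (a * x + b) % c
        yield x

def shuffled(fill_rng, pick_rng, table_size=64):
    """Yield numbers from fill_rng in an order chosen by pick_rng."""
    table = [next(fill_rng) for _ in range(table_size)]   # initial fill
    while True:
        idx = next(pick_rng) % table_size   # second PRNG selects a slot
        yield table[idx]
        table[idx] = next(fill_rng)         # first PRNG refills that slot

stream = shuffled(lcg(7**5, 0, 2**31 - 1, seed=1),
                  lcg(65539, 0, 2**31, seed=1))
first_five = list(islice(stream, 5))
```

Every output is a value produced by the first generator; only the order of delivery depends on the second.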

Previous Efforts

• This project is based largely on the work by John R. Koza
  – Used expression trees as individuals
  – The tree was executed on numbers 1..16K to obtain a random sequence
  – Bit entropy (lengths 1..7)
  – Non-Terminals=+,-,*,/,%
  – Terminals=J,0,1,2,3

Design & Implementation

• Individuals
  – Expression Tree
  – Non-Terminals=+,-,*,/,%
    • =XOR
    • Each non-terminal is equally likely in a random tree
  – Terminals=J,0,1,2,3
    • 2^i = Power of 2 (i=1-31, uniform)
    • Each terminal is equally likely in a random tree
  – Output range is 0..2^32-1
  – Aged some number of steps before mature
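A rough sketch of how such individuals might be generated and executed is shown below. Several details are assumptions, not taken from the slides: the tuple representation, the 0.3 probability of stopping at a leaf, treating division or modulus by zero as returning 1, masking every intermediate result to 32 bits, and the label "xor" for the XOR operator whose symbol did not survive in the slide text.

```python
import random

NON_TERMINALS = ["+", "-", "*", "/", "%", "xor"]
MASK = 2**32 - 1          # keeps results in 0..2^32-1

def random_terminal(rng):
    """Pick a terminal kind uniformly; "2^i" draws i uniformly from 1..31."""
    kind = rng.choice(["J", 0, 1, 2, 3, "2^i"])
    return 2 ** rng.randint(1, 31) if kind == "2^i" else kind

def random_tree(rng, max_depth):
    """Build a random tree no deeper than max_depth (leaf chance 0.3 is arbitrary)."""
    if max_depth <= 1 or rng.random() < 0.3:
        return random_terminal(rng)
    return (rng.choice(NON_TERMINALS),
            random_tree(rng, max_depth - 1),
            random_tree(rng, max_depth - 1))

def evaluate(node, j):
    """Evaluate the tree with the variable J bound to j."""
    if not isinstance(node, tuple):
        return j if node == "J" else node
    op, left, right = node
    a, b = evaluate(left, j), evaluate(right, j)
    if op == "+":
        r = a + b
    elif op == "-":
        r = a - b
    elif op == "*":
        r = a * b
    elif op == "xor":
        r = a ^ b
    elif b == 0:
        r = 1                 # protected "/" and "%" (assumed convention)
    elif op == "/":
        r = a // b
    else:
        r = a % b
    return r & MASK

rng = random.Random(0)
tree = random_tree(rng, max_depth=10)
sequence = [evaluate(tree, j) for j in range(1, 17)]
```

Running the tree on J = 1, 2, 3, ... yields the candidate random sequence that the fitness tests then score.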

Design & Implementation

• Evaluation – Fitness
  – Bit entropy (lengths 5,6,7,8)
  – Frequency tests (512 blocks)
  – Gap test for “runs above the mean” (Up to 10)
  – The alpha value calculated from the above two tests was adjusted by the formula:

$$F = 1 - \frac{(p - 0.5)^2}{0.25}$$

  – All three values were normalized to a maximum of 1 and summed
  – These are then averaged over the life of the individual
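The adjustment rewards percentiles near 0.5 and penalizes both tails. The sketch below applies it and sums the three components; dividing the entropy by 26 (the sum of the bit lengths 5+6+7+8) is an assumed normalization, and the function names are invented for this example.

```python
def adjusted_score(p):
    """Map a chi-squared percentile p in [0, 1] to a score that peaks at p = 0.5."""
    return 1 - (p - 0.5) ** 2 / 0.25

def combined_fitness(entropy, freq_percentile, gap_percentile, max_entropy=26.0):
    """Sum of three components, each at most 1.

    max_entropy=26 assumes entropy is summed over bit lengths 5..8;
    how the project normalized entropy is not stated on the slide.
    """
    return (entropy / max_entropy
            + adjusted_score(freq_percentile)
            + adjusted_score(gap_percentile))

# A generator that is ideal on all three tests scores the maximum of 3.
best_possible = combined_fitness(26.0, 0.5, 0.5)
```

A percentile of 0 or 1 on either chi-squared test contributes nothing, so a generator that fits the expected distribution too well is treated the same as one that fits it too poorly.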

Design & Implementation

• Selection
  – 2 mature parents selected uniformly
  – There is a small chance of both crossover and mutation, but most likely only one
• Crossover
  – Subtrees selected from each parent and swapped
• Mutation
  – A subtree is replaced with an equal or smaller random tree
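Subtree crossover and mutation can be sketched on tuple-based trees as follows. The representation, the helper names, reading "equal or smaller" as a depth limit, and the `make_tree(rng, max_depth)` callback are all assumptions made for this example.

```python
import random

def paths(tree, prefix=()):
    """Yield the path (tuple of child indices) to every node in the tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i in (1, 2):                      # index 0 holds the operator symbol
            yield from paths(tree[i], prefix + (i,))

def get(tree, path):
    """Return the subtree found by following `path` from the root."""
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, subtree):
    """Return a copy of `tree` with the node at `path` replaced by `subtree`."""
    if not path:
        return subtree
    children = list(tree)
    children[path[0]] = replace(tree[path[0]], path[1:], subtree)
    return tuple(children)

def depth(tree):
    """Depth of a tree; a lone leaf has depth 1."""
    if isinstance(tree, tuple):
        return 1 + max(depth(tree[1]), depth(tree[2]))
    return 1

def crossover(a, b, rng):
    """Swap one randomly chosen subtree of `a` with one of `b`."""
    pa = rng.choice(list(paths(a)))
    pb = rng.choice(list(paths(b)))
    return replace(a, pa, get(b, pb)), replace(b, pb, get(a, pa))

def mutate(tree, rng, make_tree):
    """Replace a random subtree with a new tree no deeper than the one removed."""
    path = rng.choice(list(paths(tree)))
    return replace(tree, path, make_tree(rng, depth(get(tree, path))))

parent_a = ("+", 1, ("*", "J", 2))
parent_b = ("-", ("%", 3, "J"), 0)
child_a, child_b = crossover(parent_a, parent_b, random.Random(1))
```

Because whole subtrees are exchanged, the two children together contain exactly as many nodes as the two parents, and every child is still a well-formed expression.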

Design & Implementation

• Crossover - Mutation


Design & Implementation

• Competition
  – Offspring replace the two least-fit mature population members
• Other
  – No termination, runs indefinitely
  – HUP signal causes population to be dumped to file

Results

• Population size=100

• Mature Age=30

• Initial Maximum Tree Depth=10

• Crossover only chance=0.8

• Both chance=0.15
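With these settings, the operator choice and the replacement step can be sketched as below. The remaining probability (1 - 0.8 - 0.15) is assumed here to mean mutation only, and the population layout (a list of dicts with `age` and `fitness` keys) and the function names are invented for this example.

```python
import random

def choose_operators(rng, crossover_only=0.8, both=0.15):
    """Return (do_crossover, do_mutation) for one breeding step."""
    r = rng.random()
    if r < crossover_only:
        return True, False
    if r < crossover_only + both:
        return True, True
    return False, True        # assumed: the leftover chance is mutation only

def replace_worst_two(population, children, mature_age=30):
    """Overwrite the two least-fit mature members with the new children."""
    mature = sorted(
        (i for i, ind in enumerate(population) if ind["age"] >= mature_age),
        key=lambda i: population[i]["fitness"],
    )
    for slot, child in zip(mature[:2], children):
        population[slot] = child

population = [
    {"age": 40, "fitness": 0.1},
    {"age": 5, "fitness": 0.0},    # immature, so protected from replacement
    {"age": 31, "fitness": 0.9},
    {"age": 30, "fitness": 0.2},
]
replace_worst_two(population, [{"age": 0, "fitness": 0.0, "id": "c1"},
                               {"age": 0, "fitness": 0.0, "id": "c2"}])
```

Restricting replacement to mature members gives each new individual time to accumulate an averaged fitness before it has to compete.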

Results

• After 10 hours on 5 machines, the best candidate was:


Results

• Performance - Entropy Test

PRNG           Avg. Entropy   Std. Deviation
Stage 1        25.999974      5.52E-07
Stage 2        25.108252      0.0348352
R250           25.932738      8.43E-05
glibc rand()   25.932582      7.79E-05
Ideal          26.000000      0.000000

Results

• Performance - Frequency Test

PRNG           Chi-Sq. Statistic   Chi-Sq. Percentile
Stage 1        1.43748             0.0000%
Stage 2        509.50400           47.7160%
R250           537.42400           78.8859%
glibc rand()   488.47300           23.3957%
Ideal          511.33349           50.0000%

Results

• Performance - Gap Test

PRNG           Chi-Sq. Statistic   Chi-Sq. Percentile
Stage 1        3160280.00000       100.0000%
Stage 2        7.58965             33.1510%
R250           173.47400           100.0000%
glibc rand()   7.23900             29.7294%
Ideal          9.34182             50.0000%

Results

• Performance - Speed Test (random nums/sec)

PRNG               Speed (random numbers/sec)
Stage 2            290444
Compiled Stage 2   3.1546E6
R250               2.3256E7
glibc rand()       6.2893E6

Conclusion

• It appears that employing EAs in this manner has promise

• I hesitate to recommend using Stage 2 as a production randomizer as of now, but it does bear more investigation

Future Work

• Add prime numbers to the available terminals

• Add more tests, such as periodicity, to the fitness function

• Some type of runtime compilation instead of interpreting the expression trees

Bibliography

• Koza, John R., Evolving a Computer Program to Generate Random Numbers Using the Genetic Programming Paradigm, Proceedings of the Fourth International Conference on Genetic Algorithms, Morgan Kaufmann Publishers, Inc., pages 37-44, 1991. http://citeseer.nj.nec.com/john91evolving.html

• Knuth, D. E., The Art of Computer Programming, Volume 2, Second Edition, Addison-Wesley, pages 9-114, Reading, MA, 1981.

• Koza, John R., Genetically Breeding Populations of Computer Programs to Solve Problems in Artificial Intelligence, Proceedings of the Second International Conference on Tools for AI, Washington, November, 1990, IEEE Computer Society Press, Los Alamitos, CA, 1990. http://citeseer.nj.nec.com/koza90genetically.html

• Kinnear, Kenneth E. Jr., Evolving a Sort: Lessons in Genetic Programming, Proceedings of the 1993 International Conference on Neural Networks, volume 2, IEEE Press, 1993. http://citeseer.nj.nec.com/kinnear93evolving.html

• Kinnear, Kenneth E. Jr., Generality and Difficulty in Genetic Programming: Evolving a Sort, Proceedings of the Fifth International Conference on Genetic Algorithms, Morgan Kaufmann Publishers, Inc., pages 287-294, 1993. http://citeseer.nj.nec.com/kinnear93generality.html
