Random-Number Generation Andy Wang CIS 5930-03 Computer Systems Performance Analysis


Generate Random Values

• Two steps
  – Random-number generation
    • Get a sequence of random numbers distributed uniformly between 0 and 1
  – Random-variate generation
    • Transform the sequence to produce random values satisfying the desired distribution

Background

• The most common method
  – Use a recursive function
    x_n = f(x_{n-1}, x_{n-2}, …)

Example

• x_n = (5x_{n-1} + 1) % 16
  – Suppose x_0 = 5
  – The first 32 numbers are between 0 and 15
• Divide x_n by 15 to get numbers between 0 and 1
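The example generator can be reproduced in a few lines. This is a minimal sketch; the function name `lcg_example` and its parameters are illustrative choices, not part of the slides.

```python
def lcg_example(seed=5, count=32):
    """Yield `count` values of x_n = (5*x_{n-1} + 1) % 16, starting after the seed."""
    x = seed
    for _ in range(count):
        x = (5 * x + 1) % 16
        yield x

raw = list(lcg_example())          # integers between 0 and 15
scaled = [x / 15 for x in raw]     # divide by 15 to map onto [0, 1]

print(raw[:16])   # [10, 3, 0, 1, 6, 15, 12, 13, 2, 11, 8, 9, 14, 7, 4, 5]
```

The second half of `raw` repeats the first half, which is the repetition discussed under cycle length.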

[Figures: random number versus Nth number for the first 32 values, raw (0 to 15) and scaled (0 to 1)]

Basic Terms

• x_0 = seed
  – Given a function, the entire sequence can be regenerated with x_0
• Generated numbers are pseudorandom
  – Deterministic
  – Can pass statistical tests for randomness
  – Preferred to fully random numbers so that simulated results can be repeated

Cycle Length

• Note that starting with the 17th number, the sequence repeats
  – Cycle length of 16

[Figure: random number (0 to 1) versus Nth number]

More Terms

• Some generators do not repeat the initial part (tail) of the sequence

• Period of a generator = tail + cycle length

[Figure: a sequence split into a tail followed by the cycle length; together they span the period]
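Tail and cycle length can be measured directly for small generators by recording where each value first appears. The sketch below is illustrative only: `tail_and_cycle` is a hypothetical helper, and the second step function (squaring modulo 16) is a made-up example chosen because it has a non-empty tail.

```python
def tail_and_cycle(step, seed):
    """Return (tail, cycle_length) of the sequence seed, step(seed), step(step(seed)), ..."""
    first_seen = {}              # value -> index at which it first appeared
    x, n = seed, 0
    while x not in first_seen:
        first_seen[x] = n
        x = step(x)
        n += 1
    tail = first_seen[x]         # number of values before the cycle starts
    return tail, n - tail

print(tail_and_cycle(lambda x: (5 * x + 1) % 16, 5))   # (0, 16)
print(tail_and_cycle(lambda x: (x * x) % 16, 2))       # (2, 1)
```

The dictionary grows with the period, so this approach only suits generators with small state.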

Question

• How to choose seeds and random-number generation functions?
  1. Efficiently computable
     • Heavily used in simulations
  2. The period should be large
  3. Successive values should be independent and uniformly distributed

Types of Random-Number Generators

• Linear-congruential generators
• Tausworthe generators
• Extended Fibonacci generators
• Combined generators
• Others

Linear-Congruential Generators

• In 1951, Lehmer found that residues of successive powers of a number have good randomness properties
  x_n = a^n % m = (a · a^{n-1}) % m = (a x_{n-1}) % m
• Lehmer’s choices of a and m
  a = 23 (multiplier)
  m = 10^8 + 1 (modulus)
• Implemented on ENIAC

(Mixed) Linear-Congruential Generators (LCG)

• x_n = (a x_{n-1} + b) % m
• x_n is between 0 and m – 1
• a and b are non-negative integers
• “Mixed”: uses both multiplication by a and addition of b

The Choice of a, b, and m

• m should be large
  – Period is never longer than m
• To compute % m efficiently
  – Make m = 2^k
  – Just truncate the result to its k low-order bits
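Truncating to the k low-order bits is a single bitwise AND. A small sketch, with k = 16 picked arbitrarily for illustration:

```python
K = 16
M = 1 << K            # m = 2^k
MASK = M - 1          # k low-order bits set to 1

def mod_pow2(value):
    """Compute value % 2^K by keeping only the K low-order bits."""
    return value & MASK

print(mod_pow2(123456789) == 123456789 % M)   # True
```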

The Choice of a, b, and m

• If b > 0, the maximum period m is obtained when
  – m = 2^k
  – a = 4c + 1
  – b is odd
  – c, b, and k are positive integers
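For small k the full-period conditions can be checked by brute force. The following is a sketch under the assumption that the whole cycle fits in memory; both helper names are hypothetical.

```python
def satisfies_conditions(a, b, k):
    """Check the listed conditions for m = 2^k: a = 4c + 1 with c >= 1, and b odd."""
    return k >= 1 and a > 1 and a % 4 == 1 and b > 0 and b % 2 == 1

def cycle_length(a, b, m, seed=0):
    """Length of the cycle that x_n = (a*x_{n-1} + b) % m eventually enters."""
    seen = {}
    x, n = seed, 0
    while x not in seen:
        seen[x] = n
        x = (a * x + b) % m
        n += 1
    return n - seen[x]

print(satisfies_conditions(5, 1, 4), cycle_length(5, 1, 16))   # True 16
print(satisfies_conditions(3, 1, 4), cycle_length(3, 1, 16))   # False 8
```

With a = 3 the multiplier is not of the form 4c + 1, and the generator only reaches half of the 16 possible values.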

Full-Period Generators

• Generators with maximum possible periods
• Not equally good
  – Look for low autocorrelations between successive numbers
  – x_n = ((2^34 + 1)x_{n-1} + 1) % 2^35 has an autocorrelation of 0.25
  – x_n = ((2^18 + 1)x_{n-1} + 1) % 2^35 has an autocorrelation of 2^-18

Multiplicative LCG

• x_n = a x_{n-1} % m, b = 0
• Can compute more efficiently when m = 2^k
• However, maximum period is only 2^{k-2}
• Problem: cyclic patterns in the lower bits

Multiplicative LCG with m = 2^k

• When a = 8i ± 3
  – E.g., x_n = 5x_{n-1} % 2^5
    • Period is only 8
    • Which is ¼ of 2^5
• When a ≠ 8i ± 3
  – E.g., x_n = 7x_{n-1} % 2^5
    • Period is only 4
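Both periods can be confirmed by stepping each generator until it returns to its seed. This sketch assumes an odd seed (x_0 = 1 here, which the slide does not state) and an odd multiplier, so that the sequence is guaranteed to come back to the seed.

```python
def multiplicative_period(a, m, seed=1):
    """Steps until x_n = a*x_{n-1} % m returns to `seed` (requires a, seed coprime to m)."""
    x, steps = (a * seed) % m, 1
    while x != seed:
        x = (a * x) % m
        steps += 1
    return steps

print(multiplicative_period(5, 2 ** 5))   # 8
print(multiplicative_period(7, 2 ** 5))   # 4
```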

[Figures: two plots of random number (0 to 30) versus Nth number]

Multiplicative LCG with m ≠ 2^k

• To get a longer period, use m = prime number
  – With proper choice of a, it is possible to get a period of m – 1
  – a needs to be a primitive root of m
    • If and only if a^n % m ≠ 1 for n = 1..m – 2

Multiplicative LCG with m ≠ 2^k

• x_n = 3x_{n-1} % 31
  – x_0 = 1
  – Period is 30
  – 3 is a primitive root of 31
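The root condition (a^n % m ≠ 1 for n = 1..m – 2) translates directly into a loop. A brute-force sketch suitable only for small prime m; the function name is an illustrative choice.

```python
def is_primitive_root(a, m):
    """True if a^n % m != 1 for every n = 1 .. m-2 (m is assumed to be prime)."""
    power = 1
    for _ in range(1, m - 1):
        power = (power * a) % m
        if power == 1:
            return False
    return True

print(is_primitive_root(3, 31))   # True
print(is_primitive_root(2, 31))   # False, because 2^5 % 31 == 1
print([a for a in range(2, 31) if is_primitive_root(a, 31)])
```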

[Figure: random number (0 to 30) versus Nth number]

Multiplicative LCG with m ≠ 2^k

• x_n = 7^5 x_{n-1} % (2^31 – 1)
  – 7^5 is a primitive root of 2^31 – 1
  – But watch out for computational errors
    • Multiplication overflow
      – Need to apply tricks mentioned in p. 442
    • Truncation due to the number of digits available
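One widely used way to avoid the overflow is Schrage's method, which rewrites a·x % m so that no intermediate value exceeds m. Whether this is the same trick as the one on p. 442 is an assumption. Python integers never overflow, so the sketch below only illustrates the arithmetic one would use with 32-bit integers.

```python
A = 7 ** 5              # 16807
M = 2 ** 31 - 1         # 2147483647
Q, R = divmod(M, A)     # M = A*Q + R, with Q = 127773 and R = 2836

def next_schrage(x):
    """One step of x -> A*x % M with every intermediate value below M in magnitude."""
    hi, lo = divmod(x, Q)
    t = A * lo - R * hi
    return t if t >= 0 else t + M

x = 1
for _ in range(5):
    x = next_schrage(x)
    print(x)
```

The method relies on R < Q, which holds for this choice of a and m.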

Tausworthe Generators

• How to generate large random numbers?
• The Tausworthe generator produces a random sequence of binary digits
  – The generator then divides the sequence into strings of desired lengths
  – Based on a characteristic polynomial

Tausworthe Example

• Suppose we use the following characteristic polynomial
  x^7 + x^3 + 1
  – The corresponding generation function is (⊕ is exclusive-or)
    • b_{n+7} ⊕ b_{n+3} ⊕ b_n = 0
      Or
    • b_n = b_{n-4} ⊕ b_{n-7}
  – Need a 7-bit seed

Tausworthe Example

• The bit stream sequence
  1111111000011101111001011001…
• Convert to random numbers between 0 and 1, with 8-bit numbers
  x_0 = 0.11111110_2 = 0.99219_10
  x_1 = 0.00011101_2 = 0.11328_10
  x_2 = 0.11100101_2 = 0.89453_10
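The bit stream and the three 8-bit fractions can be regenerated from the recurrence b_n = b_{n-4} XOR b_{n-7}. A minimal sketch, assuming the all-ones 7-bit seed implied by the start of the stream; function names are illustrative.

```python
def tausworthe_bits(count, seed=(1, 1, 1, 1, 1, 1, 1)):
    """Generate bits with b_n = b_{n-4} XOR b_{n-7} (polynomial x^7 + x^3 + 1)."""
    bits = list(seed)
    while len(bits) < count:
        bits.append(bits[-4] ^ bits[-7])
    return bits[:count]

def to_fractions(bits, width=8):
    """Read consecutive `width`-bit groups as binary fractions 0.b1 b2 ... b_width."""
    out = []
    for i in range(0, len(bits) - width + 1, width):
        value = int("".join(map(str, bits[i:i + width])), 2)
        out.append(value / 2 ** width)
    return out

print("".join(map(str, tausworthe_bits(28))))   # 1111111000011101111001011001
print(to_fractions(tausworthe_bits(24)))        # [0.9921875, 0.11328125, 0.89453125]
```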

Tausworthe Generator Characteristics

• For the L-bit numbers generated
  + E[x_n] = ½
  + V[x_n] = 1/12
  + The serial correlation is zero
  + Good results over the complete cycle
  - Poor local behavior within a sequence

Tausworthe Example

• If a characteristic polynomial of order q has a period of 2^q – 1, it is a primitive polynomial
• For x^7 + x^3 + 1
  • q = 7
  • Sequence repeats after 127 bits = 2^7 – 1
  • A primitive polynomial

Tausworthe Implementation

• Can be easily generated via linear-feedback shift registers
• For x^5 + x^3 + 1

[Figure: shift register with stages b_n, b_{n-1}, b_{n-2}, b_{n-3}, b_{n-4}, b_{n-5}]
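In software, the shift register can be modelled as a 5-bit integer. For x^5 + x^3 + 1 the recurrence is b_n = b_{n-2} XOR b_{n-5}. The bit layout chosen below (bit 0 holds the most recent output) is an assumption made for this sketch, not something taken from the figure.

```python
def lfsr_bits(count, state=0b11111):
    """5-stage LFSR for x^5 + x^3 + 1, i.e. b_n = b_{n-2} XOR b_{n-5}.

    `state` holds the five most recent bits: bit 0 is b_{n-1}, bit 4 is b_{n-5}.
    """
    out = []
    for _ in range(count):
        new_bit = ((state >> 1) ^ (state >> 4)) & 1   # b_{n-2} XOR b_{n-5}
        state = ((state << 1) | new_bit) & 0b11111    # shift the new bit in
        out.append(new_bit)
    return out

bits = lfsr_bits(62)
print(bits[:31] == bits[31:])   # True: the stream repeats after 2^5 - 1 = 31 bits
```

The all-zero state must be avoided as a seed, because it only ever produces zeros.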

Extended Fibonacci Generators

• x_n = (x_{n-1} + x_{n-2}) % m
  – Does not have good randomness properties
  – High serial correlation
• An extension
  – x_n = (x_{n-5} + x_{n-17}) % 2^k
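The extended recurrence needs the last 17 values, which a fixed-length deque provides. This is a sketch: the seed values 1..17 and k = 32 are arbitrary placeholders, since the slides do not say how the initial 17 values should be chosen.

```python
from collections import deque
from itertools import islice

def extended_fibonacci(seed_values, k=32):
    """Yield x_n = (x_{n-5} + x_{n-17}) % 2^k, given 17 initial values."""
    if len(seed_values) != 17:
        raise ValueError("need exactly 17 seed values")
    mask = (1 << k) - 1
    history = deque(seed_values, maxlen=17)   # history[0] is x_{n-17}, history[-1] is x_{n-1}
    while True:
        x = (history[-5] + history[0]) & mask
        history.append(x)                     # the oldest value drops out automatically
        yield x

print(list(islice(extended_fibonacci(list(range(1, 18))), 2)))   # [14, 16]
```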

Combined Generators

• Add random numbers obtained from two or more generators
  – Can considerably increase the period and randomness
    x_n = 40014x_{n-1} % 2147483563
    y_n = 40692y_{n-1} % 2147483399
    w_n = (x_n - y_n) % 2147483562
  – This generator has a period of 2.3 × 10^18
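The two-generator combination is short enough to write out directly. In this sketch both seeds default to 1, which is an arbitrary choice. Python's `%` always returns a non-negative result, so the subtraction needs no special handling.

```python
from itertools import islice

def combined_lcg(x_seed=1, y_seed=1):
    """Yield w_n = (x_n - y_n) % 2147483562 from the two multiplicative LCGs above."""
    x, y = x_seed, y_seed
    while True:
        x = (40014 * x) % 2147483563
        y = (40692 * y) % 2147483399
        yield (x - y) % 2147483562

print(list(islice(combined_lcg(), 3)))
```

In a language where `%` can return negative values, the usual fix is to add the modulus when the difference is negative.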

Combined Generators

w_n = 157w_{n-1} % 32363
x_n = 146x_{n-1} % 31727
y_n = 142y_{n-1} % 31657
v_n = (w_n - x_n + y_n) % 32362
– This generator has a period of 8.1 × 10^12
– Can avoid the multiplication overflow problem

Combined Generators

• XOR random numbers obtained from two or more generators

Combined Generators

• Shuffle
  – One sequence serves as an index
    • Into an array filled with random numbers generated by the second sequence
  – The chosen number in the second sequence is replaced by a new random number
  – Problem
    • Cannot skip to the nth random number
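The shuffle scheme can be sketched with two generators that produce values in (0, 1). The table size of 100, the helper `lcg_uniform`, and the choice of the two underlying LCGs are all assumptions made for illustration.

```python
def lcg_uniform(a, m, seed=1):
    """Yield x_n / m for the multiplicative LCG x_n = a*x_{n-1} % m."""
    x = seed
    while True:
        x = (a * x) % m
        yield x / m

def shuffled(index_gen, value_gen, table_size=100):
    """Combine two generators: the first picks a table slot, the second refills it."""
    table = [next(value_gen) for _ in range(table_size)]
    while True:
        i = int(next(index_gen) * table_size)   # the first sequence acts as the index
        yield table[i]
        table[i] = next(value_gen)              # replace the chosen number

stream = shuffled(lcg_uniform(40014, 2147483563), lcg_uniform(40692, 2147483399))
print([next(stream) for _ in range(3)])
```

Because the output depends on the table's history, there is no shortcut for jumping to the nth value, which is the problem noted in the slide.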

A Survey of Random-number Generators

• Some published generator functions
  x_n = 7^5 x_{n-1} % (2^31 – 1)
  – Full period of 2^31 – 2
  – Low-order bits are randomly distributed
• Many others (see textbook)
  – All have problems
• General lessons: use established ones; do not invent your own

Seed Selection

• If the generator has a full period and only one random variable is required
  – Any seed value is good
• However, with more than one random variable, the story is different for multistream simulations
  – E.g., random arrival and service times
  – Should use two streams of random numbers

Seed Selection Guidelines

• Do not use zero
  – Not good for multiplicative LCGs and Tausworthe generators
• Avoid even values
  – Not good if a generator does not have a full period
• Do not use one stream for all variables
  – May yield strong correlations among variables

Seed Selection Guidelines

• Use nonoverlapping streams
  – Each stream requires a separate seed
  – Otherwise…
    • A long interarrival time may correlate with a long service time
  – Suppose we need 10,000 random numbers for interarrival times and 10,000 for service times: seed the two streams with the 1st and the 10,001st numbers of the sequence
  – x_n = [a^n x_0 + c(a^n – 1)/(a – 1)] % m
    • For multiplicative LCGs, c = 0
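For a multiplicative LCG (c = 0) the formula reduces to x_n = a^n x_0 % m, which modular exponentiation evaluates without stepping through the sequence. The sketch below assumes the generator x_n = 7^5 x_{n-1} % (2^31 – 1) and a first seed of 1; the second stream starts where the first 10,000 draws end.

```python
A, M = 7 ** 5, 2 ** 31 - 1

def jump_ahead(x0, n, a=A, m=M):
    """Return x_n = a^n * x0 % m for a multiplicative LCG (c = 0)."""
    return (pow(a, n, m) * x0) % m

arrival_seed = 1
service_seed = jump_ahead(arrival_seed, 10_000)   # x_10000, computed in O(log n) multiplications
print(arrival_seed, service_seed)
```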

Seed Selection Guidelines

• Do not reuse seeds in successive simulation runs
  – No point in running a simulation again with the same seed
  – Just continue with the last random number as the seed for the successive runs

Seed Selection Guidelines

• Do not use random seeds
  – E.g., do not use the time of day or /dev/random to seed simulations
  – Simulations should be repeatable
  – Cannot guarantee that multiple streams will not overlap
• Do not use numbers generated by random-number generators as seeds

Myths About Random-number Generation

• A complex set of operations leads to random results
  – Hard to guess does not mean random
• Random numbers are not predictable
  – Given a few successive numbers from an LCG
  – Can solve for a, c, and m
  – Not suitable for cryptographic applications
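To see why an LCG is predictable, consider the simpler case where m is already known. Three successive outputs then determine a and c by modular arithmetic. This is a sketch under that assumption; recovering m as well takes more outputs and is not shown. The parameters in the usage lines are made up for the demonstration, and `pow(..., -1, m)` requires Python 3.8 or later.

```python
def recover_lcg(x1, x2, x3, m):
    """Recover (a, c) of x_n = (a*x_{n-1} + c) % m from three successive outputs.

    Assumes m is known and (x2 - x1) is invertible modulo m.
    """
    a = ((x3 - x2) * pow((x2 - x1) % m, -1, m)) % m
    c = (x2 - a * x1) % m
    return a, c

# Hypothetical generator used only to produce test data.
m, a, c = 2 ** 31 - 1, 16807, 12345
x1 = (a * 42 + c) % m
x2 = (a * x1 + c) % m
x3 = (a * x2 + c) % m
print(recover_lcg(x1, x2, x3, m))   # (16807, 12345)
```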

Myths About Random-number Generation

• Some seeds are better than others
  – True
  – Avoid generators whose period and randomness depend on the seed
• Accurate implementation is not important
  – Watch out for overflows and truncations

Myths About Random-number Generation

• Bits of successive words generated by a random-number generator are equally randomly distributed
  – Nope

Myths About Random-number Generation

• x_n = 25173x_{n-1} % 2^16
  – x_0 = 1
  – Least significant bit is always 1
  – Bit 2 is always 0
  – Bit 3 has a cycle of 2
  – Bit 4 has a cycle of 4
  – Bit 5 has a cycle of 8

n   decimal   binary
1   25173     01100010 01010101
2   12345     00110000 00111001
3   54509     11010100 11101101
4   27825     01101100 10110001
5   55493     11011000 11000101
6   25449     01100011 01101001
7   13277     00110011 11011101
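The values in the table are reproduced by the multiplicative recurrence x_n = 25173 x_{n-1} % 2^16 with x_0 = 1, and the period of each low-order bit can be measured from a sample of the output. A sketch; the sample size of 256 is an arbitrary choice that comfortably covers periods up to 8.

```python
def bit_periods(a=25173, k=16, seed=1, bits=5, samples=256):
    """Return the period of each of the `bits` lowest bits of x_n = a*x_{n-1} % 2^k."""
    m = 1 << k
    values, x = [], seed
    for _ in range(samples):
        x = (a * x) % m
        values.append(x)
    periods = []
    for b in range(bits):
        column = [(v >> b) & 1 for v in values]
        p = 1
        # Smallest shift p under which the bit column matches itself everywhere.
        while any(column[i] != column[i + p] for i in range(samples - p)):
            p += 1
        periods.append(p)
    return periods

print(bit_periods())   # [1, 1, 2, 4, 8]
```

A period of 1 means the bit never changes, which matches the two lowest bits in the table.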

Myths About Random-number Generation

• For all multiplicative LCGs
  • The Lth bit has a period that is at most 2^L
• For LCGs of the form x_n = a x_{n-1} % 2^k
  – The least significant bit is always 0 or always 1
• High-order bits are more random

More on Random-Number Generation

• Mersenne twister
  – Period = 2^19937 – 1
• /dev/random
  – Extract randomness from physical devices
  – Truly random
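Both kinds of source are available from Python's standard library. The `random` module uses the Mersenne Twister as its core generator, so a fixed seed gives a repeatable stream, while `os.urandom` returns bytes from the operating system's randomness source. A short sketch; the seed 12345 is arbitrary.

```python
import os
import random

rng_a = random.Random(12345)   # seeded Mersenne Twister: repeatable
rng_b = random.Random(12345)
same = [rng_a.random() for _ in range(5)] == [rng_b.random() for _ in range(5)]
print(same)                    # True

entropy = os.urandom(8)        # 8 bytes from the OS randomness source: not repeatable
print(len(entropy))            # 8
```

For simulations the seeded generator is the better fit, since runs should be repeatable.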
