
Statistical Programming with R

Lecture 6: Programming Examples

Bisher M. Iqelan

Department of Mathematics, Faculty of Science,

The Islamic University of Gaza

2019-2020, Semester 1


Hilbert Matrix

We explained the structure of Hilbert matrices of different orders in Lecture 4. Here we write R code to produce the Hilbert matrix of any order.

H=function(n){

H=matrix(0,n,n)

for(i in 1:n){

for(j in 1:n){

H[i,j]=1/(i+j-1)

}

}

H

}

You can try it now:

> H(7)

Bisher M. Iqelan (IUG) Lecture 6: Programming Examples 1st Semester 2019 1 / 43
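As a hedged alternative to the double loop in H(), the same matrix can be built without explicit loops using outer(). The function name hilbert is an assumption introduced here for illustration; it is not part of the lecture.

```r
# Vectorized sketch of the Hilbert matrix; hilbert() is a hypothetical name.
hilbert <- function(n) {
  idx <- seq_len(n)
  # outer(idx, idx, "+") holds i + j, so each entry becomes 1 / (i + j - 1)
  1 / (outer(idx, idx, "+") - 1)
}

hilbert(3)
```

Both versions should agree entry by entry; the loop version makes the definition explicit, while the outer() version is shorter.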

Selecting Random Samples in R: sample() Function

Sometimes we need to draw a sample from a large population. R provides the sample() function for this purpose: we supply the population and the sample size.

Additionally, we can specify whether to sample with replacement. By default, sampling is done without replacement.

In its simplest form, sample() returns a random permutation of a vector.

> x <- 1:10

# Now, use sample to create a random permutation of the vector x.

> sample(x)

[1] 4 2 6 7 8 10 3 5 1 9

Note that if you give sample() a single integer n (e.g., n = 10), it does exactly the same thing as above, that is, it creates a random permutation of the integers from 1 to 10.

> sample(10)

[1] 7 2 1 4 10 9 8 5 6 3


Warning!

This can be a source of confusion if you're not careful. Consider the following example from the sample help file.

> x <- 1:10

> sample(x[x > 8])

[1] 10 9

> sample(x[x > 9])

[1] 4 5 9 10 3 7 2 8 1 6

Notice that the first output has length 2, since only two numbers in our vector are greater than eight. But because only one number (namely 10) is greater than nine, sample() receives the single value 10, interprets it as a request to sample from 1:10, and therefore returns a vector of length 10.
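One way to avoid this surprise is to sample positions rather than values. The sketch below follows the resample() helper suggested in the examples of the sample help page; it is not defined in base R, so it must be created before use.

```r
# Sample by index so that a length-one vector is treated as a population of size 1.
resample <- function(x, ...) x[sample.int(length(x), ...)]

x <- 1:10
resample(x[x > 8])  # a random ordering of 9 and 10
resample(x[x > 9])  # always 10, never a permutation of 1:10
```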


Sampling with option replace = TRUE

If you don't specify the replace argument, R assumes that you are sampling without replacement. In other words, each element can be sampled only once.

If you want to sample with replacement, use the replace = TRUE option:

# Draw 10 samples from the integers 1:5 with replacement

> sample(x = 1:5, size = 10, replace = TRUE)

[1] 5 5 1 4 1 4 5 4 2 4

But if you try to draw a sample larger than the vector without replacement, R will return an error because it runs out of things to draw:

# You CAN'T draw 10 samples without replacement from

# a vector with length 5

> sample(x = 1:5, size = 10)

Error in sample.int(length(x), size, replace, prob) :

cannot take a sample larger than the population when 'replace=FALSE'


Experiment: Simulating coin flips

Let's simulate 10 flips of a fair coin, where the probability of getting either a "Head" or a "Tail" is p = 0.50. Because all values are equally likely, we don't need to specify the prob=... argument.

> coin <- c("Head", "Tail")

> sample(coin, # The possible values of the coin

size = 10, # 10 flips

replace = TRUE) # Sampling with replacement

[1] "Tail" "Head" "Tail" "Tail" "Head" "Head" "Head" "Tail"

[9] "Tail" "Tail"
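For a coin that is not fair, the prob argument mentioned above supplies the weights. The following is a minimal sketch; the weights 0.7 and 0.3, the seed, and the name flips are assumed values chosen only for illustration.

```r
# Sketch of a biased coin: the 0.7 / 0.3 weights are assumed, not from the lecture.
set.seed(42)                      # fix the seed so the draw is repeatable
coin <- c("Head", "Tail")
flips <- sample(coin, size = 1000, replace = TRUE, prob = c(0.7, 0.3))
table(flips) / length(flips)      # proportions should be near 0.7 and 0.3
```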

Now, let's perform our coin-flipping experiment 100 times.

> sample(coin, size = 100)

Error in sample.int(length(x), size, replace, prob) :

cannot take a sample larger than the population when 'replace=FALSE'


Experiment: Simulating coin flips (cont.)

So we can't take a sample of size 100 from a vector of size 2 unless we set the replace option to TRUE.

> Result = sample(coin, size = 100, replace=TRUE)

> table(Result)

Result

Head Tail

52 48


The Law of Large Numbers

Use the following R code to explore the probability of obtaining each of the six faces when tossing a fair die. It illustrates that the relative frequencies become nearly equal (close to 1/6 each) as the number of tosses grows larger and larger.

w=list()

freq=list()

n=c(20,100,500,1000,5000,20000)

for(i in 1:6){

w[[i]]=sample(1:6,n[i],replace=TRUE)

freq[[i]]=table(w[[i]])/length(w[[i]])

}

> freq

[[1]]

   1    2    3    4    5    6
0.15 0.20 0.15 0.20 0.10 0.20

[[2]]

   1    2    3    4    5    6
0.13 0.19 0.12 0.14 0.24 0.18

[[3]]

    1     2     3     4     5     6
0.162 0.156 0.176 0.160 0.172 0.174

[[4]]

    1     2     3     4     5     6
0.163 0.155 0.170 0.172 0.169 0.171

[[5]]

     1      2      3      4      5      6
0.1706 0.1624 0.1644 0.1710 0.1644 0.1672

[[6]]

     1      2      3      4      5      6
0.1668 0.1656 0.1698 0.1653 0.1720 0.1607
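To summarize the tables above in a single number per sample size, one can look at the largest gap between an observed relative frequency and 1/6. This is a sketch under assumptions: the seed and the name max_dev are introduced here, and the exact values will differ from the lecture's output because the draws are random.

```r
# Largest deviation from 1/6 for each number of tosses (illustrative sketch).
set.seed(1)
n <- c(20, 100, 500, 1000, 5000, 20000)
max_dev <- sapply(n, function(m) {
  w <- sample(1:6, m, replace = TRUE)
  # factor(..., levels = 1:6) keeps a face in the table even if it never appears
  freq <- table(factor(w, levels = 1:6)) / m
  max(abs(freq - 1/6))
})
max_dev
```

The deviations tend to shrink as the number of tosses grows, although not necessarily at every step.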


The Law of Large Numbers (cont.)

We can easily view bar plots of the relative frequencies from these trials with the following R code (it depends on the previous code):

par(mfrow=c(3,2))

for(i in 1:6){

barplot(freq[[i]])

title( main = substitute(paste('No. of trials='

, a),list(a = n[i])))

}


Root Finding

Finding the solutions of equations of the form

h(x) = c

is a very important problem in applied mathematics (and statistics).

By writing f(x) = h(x) − c, it is clear that solving the above equation is equivalent to solving

f(x) = 0,

and we can restrict our attention to finding the solutions of this type of equation.

The solutions are called roots or zeros of the function f(x). A function may have zero, one, or many roots.


The Roots of a General Polynomial

R has a built-in function, called polyroot(), for finding all the roots of a polynomial.

The coefficients of the polynomial

$$a_n x^n + \cdots + a_1 x + a_0$$

are passed to polyroot() as a vector $(a_0, a_1, \ldots, a_n)$, that is, in increasing order of power.

The computed roots are returned in a vector.

Since, in general, the roots of polynomials are complex-valued, this vector has mode "complex".


Examples

Let us find the roots or zeros of the polynomial

$$p(x) = 1 - 2x + x^2$$

> polyroot(c(1,-2,1))

[1] 1+0i 1-0i

and for the polynomial

$$p(x) = 3 - 4x + 5x^2 - 7x^3 + x^4$$

> polyroot(c(3,-4,5,-7,1))

[1] -0.0351879+0.7830124i

[2] -0.0351879-0.7830124i

[3] 0.7757827+0.0000000i

[4] 6.2945931-0.0000000i
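A simple sanity check is to substitute the computed roots back into the polynomial and to extract the real roots. The sketch below is illustrative: the helper p, the names resid and real_roots, and the 1e-8 tolerance on the imaginary part are assumptions introduced here.

```r
# Check the roots returned by polyroot() for p(x) = 3 - 4x + 5x^2 - 7x^3 + x^4.
a <- c(3, -4, 5, -7, 1)                      # coefficients in increasing order of power
r <- polyroot(a)

# Evaluate the polynomial at a single (possibly complex) point.
p <- function(x) sum(a * x^(0:(length(a) - 1)))

resid <- sapply(r, function(z) Mod(p(z)))    # should all be close to zero
real_roots <- Re(r[abs(Im(r)) < 1e-8])       # keep roots whose imaginary part is negligible
resid
real_roots
```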


Finding square roots

Suppose we want to create a function mySqrt for finding the square root of its argument x.

Our purpose in this example is to introduce a general numerical method for solving equations: the Newton-Raphson method.

We want to find $\sqrt{x}$, i.e. the number(s) $z$ such that $z^2 = x$. (Recall that $z^2 = 4$ has the two solutions $z = \pm 2$.)

Thus for a given value x (e.g. x = 10) we want to solve the equation

$$f(z) = z^2 - x = 0$$

Consider the function $f(z) = z^2 - 10$. The function f is nonlinear, but it can be approximated by a linear function (a straight line) near a given point. E.g. close to z = 2 the function is approximated by a line, call it $\ell(z)$.

The linear approximation to f close to a specific point $z_0$ (e.g. $z_0 = 2$) is given by:

$$f(z) \approx f(z_0) + f'(z_0)(z - z_0) = [f(z_0) - f'(z_0) z_0] + f'(z_0) z$$

where $f'$ denotes the derivative of f.


Newton-Raphson method

Hence the linear approximation is

$$\ell(z) = [f(z_0) - f'(z_0) z_0] + f'(z_0) z = a + bz.$$

For the function $f(z) = z^2 - 10$ we have $f'(z) = 2z$. With $z_0 = 2$ we have $f'(z_0) = 4$ and $f(z_0) = -6$, so $\ell(z)$ becomes

$$\ell(z) = -14 + 4z$$

The idea in Newton-Raphson is the following: Suppose $z_0 = 2$ is our initial guess for a solution to the equation $f(z) = 0$ (which is difficult to solve). Now $\ell(z)$ is an approximation to f, and it is easy to solve $\ell(z) = 0$:

$$z = -\frac{a}{b} = \frac{14}{4} = 3.5$$

So we may take 3.5 to be a new approximation to the solution to $f(z) = 0$. We write $z_1 = 3.5$.



Now we find the line which approximates f close to $z_1 = 3.5$. The line is (check this for yourself):

$$\ell(z) = -22.25 + 7z.$$

Solving $\ell(z) = 0$ gives

$$z = 22.25/7 = 3.1786$$

So the next approximation is $z_2 = 3.1786$. Iterate this scheme until the values for z stop changing. This is in essence the Newton-Raphson method.



Before going to programming, let us again consider the solution to $\ell(z) = a + bz = 0$, i.e.

$$z = -\frac{a}{b},$$

where $a = f(z_0) - f'(z_0) z_0$ and $b = f'(z_0)$.

Inserting the latter in the former gives the form in which the Newton-Raphson method is usually presented in the literature:

$$z = -\frac{a}{b} = -\frac{f(z_0) - f'(z_0) z_0}{f'(z_0)} = z_0 - \frac{f(z_0)}{f'(z_0)}$$
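The equivalence of the two forms can be checked numerically for the running example $f(z) = z^2 - 10$ with $z_0 = 2$. In this sketch the names g, dg, z_line and z_nr are hypothetical and chosen so that they do not clash with the functions defined later in the lecture.

```r
# Numerical check: solving the tangent line gives the same value as the update formula.
g  <- function(z) z^2 - 10   # the function whose root we seek
dg <- function(z) 2 * z      # its derivative

z0 <- 2
a <- g(z0) - dg(z0) * z0     # intercept of the tangent line: -14
b <- dg(z0)                  # slope of the tangent line: 4

z_line <- -a / b             # root of the tangent line
z_nr   <- z0 - g(z0) / dg(z0)  # Newton-Raphson update
c(z_line, z_nr)              # both are 3.5
```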


Implementing mySqrt

We shall content ourselves with an implementation which only returns the positive root.

First, we need to implement our function $f = z^2 - x$, which is a function of z and x:

f <- function(z, x) {

value <- z^2 - x

return(value)

}

> f(z = 2, x = 10)

[1] -6

We also need the derivative $f'$ of f (which here doesn't depend on x):

deriv.f <- function(z) {

value <- 2 * z

return(value)

}


Implementing mySqrt

So now we are almost there. The Newton-Raphson steps become:

z0 <- 2

x <- 10

z1 <- z0 - f(z0, x)/deriv.f(z0)

z1

[1] 3.5

z2 <- z1 - f(z1, x)/deriv.f(z1)

z2

[1] 3.178571

z3 <- z2 - f(z2, x)/deriv.f(z2)

z3

[1] 3.162319


Implementing mySqrt

It would be nice to write a single function for doing the iterations above, and such a function can be made in different ways.

We create a simple function which performs 5 Newton-Raphson steps:

NRsteps <- function(x, z0) {

z <- z0

for (i in 1:5) {

z <- z - f(z, x)/deriv.f(z)

print(z)

}

return(z)

}

Having done so, it is easy to implement our mySqrt function:

mySqrt <- function(x, z0) {

z <- NRsteps(x, z0)

return(z)

}


Implementing mySqrt

Calling the function gives

> mySqrt(x = 10, z0 = 2)

[1] 3.5

[1] 3.178571

[1] 3.162319

[1] 3.162278

[1] 3.162278

[1] 3.162278

Clearly, no changes after 4 iterations!

We want to refine our NRsteps function so that it does not always make 5 iterations. Instead it should iterate as many times as needed until the change between successive estimates is smaller than, e.g., 0.001. We would also like the function to print the number of iterations:


Refining mySqrt

This can be done as follows:

NRsteps <- function(x, z0) {

z <- z0

itcount <- 1

repeat {

z.new <- z - f(z, x)/deriv.f(z)

if (abs(z.new - z) < 0.001) {

break

}

itcount <- itcount + 1

z <- z.new

}

cat("NR iterations:", itcount, "\n")

return(z.new)

}
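A repeat loop with a single exit test runs forever if the estimates never settle. One possible safeguard, sketched below, is to cap the number of iterations and report whether the tolerance was reached. The name NRsteps2 and the arguments tol and max.iter are assumptions introduced for this illustration; f and deriv.f are repeated so that the block runs on its own.

```r
# Functions from the lecture, repeated so the sketch is self-contained.
f <- function(z, x) z^2 - x
deriv.f <- function(z) 2 * z

# Hypothetical variant of NRsteps with a tolerance and an iteration cap.
NRsteps2 <- function(x, z0, tol = 0.001, max.iter = 100) {
  z <- z0
  for (itcount in seq_len(max.iter)) {
    z.new <- z - f(z, x) / deriv.f(z)
    if (abs(z.new - z) < tol) {
      # Converged: successive estimates differ by less than tol
      return(list(root = z.new, iterations = itcount, converged = TRUE))
    }
    z <- z.new
  }
  # Cap reached without meeting the tolerance
  list(root = z, iterations = max.iter, converged = FALSE)
}

res <- NRsteps2(x = 10, z0 = 10)
res$root
res$converged
```

Returning a list instead of printing leaves it to the caller to decide what to do when convergence fails.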


Refining mySqrt

Having to specify a starting value is annoying. An option could be to take x itself as a default starting value. This can be implemented as:

mySqrt <- function(x, z0=x) {

z <- NRsteps(x, z0)

return(z)

}

> mySqrt(10)

NR iterations: 5

[1] 3.162278

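Notice that our method does not work for x < 0: the iteration cannot converge, because $z^2 = x$ then has no real solution, so the function should print an informative message instead. Moreover, x = 0 must be handled separately, since with the default starting value z0 = x = 0 the derivative is 0 and the update divides by zero.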


Refining mySqrt


mySqrt<-function(x, z0=x) {

if (x < 0) {

cat("Can not take square root of a negative number...\n")

}

else {

if (x == 0) {

return(0)

}

else {

z <- NRsteps(x, z0)

return(z)

}

}

}


Testing mySqrt function

Now, calling our function mySqrt() gives:

> mySqrt(-7)

Can not take square root of a negative number...

> mySqrt(17)

NR iterations: 6

[1] 4.123106

> mySqrt(0)

[1] 0


Try this version, which prints the same text that R's sqrt() shows for a negative argument:


mySqrt<-function(x, z0=x) {

if (x < 0) {

cat("[1] NaN\n","Warning message:\n","In sqrt(",x,"):

NaNs produced\n")

}

else {

if (x == 0) {

return(0)

}

else {

z <- NRsteps(x, z0)

return(z)

}

}

}
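The version above only prints text that looks like a warning. A hedged alternative is to raise a real warning with warning() and return NaN, which is closer to how sqrt() itself behaves. The name mySqrtW is an assumption; f, deriv.f and a silent NRsteps are repeated here so the sketch runs on its own.

```r
# Helper functions repeated from the lecture (NRsteps without the printed count).
f <- function(z, x) z^2 - x
deriv.f <- function(z) 2 * z
NRsteps <- function(x, z0) {
  z <- z0
  repeat {
    z.new <- z - f(z, x) / deriv.f(z)
    if (abs(z.new - z) < 0.001) break
    z <- z.new
  }
  z.new
}

# Hypothetical variant: signal a genuine warning and return NaN for x < 0.
mySqrtW <- function(x, z0 = x) {
  if (x < 0) {
    warning("NaNs produced")
    return(NaN)
  }
  if (x == 0) return(0)
  NRsteps(x, z0)
}

mySqrtW(17)
```

Because the result is a value rather than printed text, calling code can test it with is.nan() or catch the warning with tryCatch().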


Loops in R

For Loop

The most commonly used loop structures in R are for, while and apply loops. Less common are repeat loops.

for(i in 1:n){

statements

}

OR, similarly;

for(variable in sequence) {

statements

}


For loops: Examples

Example: Print a statement

for(i in 1:3){

print("Hello R Students!")

cat("i=",i,"\n")

}

[1] "Hello R Students!"

i= 1

[1] "Hello R Students!"

i= 2

[1] "Hello R Students!"

i= 3

Everything inside the curly brackets {...} is done 3 times.

Looped commands can depend on i (or any other counter).

R assigns each element of 1:3 to i in turn.

A for loop is flexible, but slow when looping over a large number of fields (e.g. thousands of rows or columns).
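Many element-by-element loops can be replaced by a single vectorized call. As a small sketch, the sum of the first n squares is computed below both ways; the names n, total and total_vec are chosen here for illustration only.

```r
# Sum of squares 1^2 + 2^2 + ... + n^2, with a loop and with a vectorized call.
n <- 10000

total <- 0
for (i in 1:n) {
  total <- total + i^2       # one addition per iteration
}

total_vec <- sum((1:n)^2)    # same result in a single vectorized expression

all.equal(total, total_vec)
```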


For loops with break statement: Example

In the following example, the innermost loop terminates when it encounters the break statement. Note that break exits only the loop that contains it; the two outer loops keep running.

for (i in 1:5)

{

for (j in 1:5)

{

for (k in 1:5)

{

cat(i," ",j," ",k,"\n")

if (k ==3) break

}

}

}
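A quick way to confirm that break leaves only the innermost loop is to count how many times the loop body runs. This sketch replaces the cat() call with a counter; the name count is introduced here for illustration.

```r
# Count the executions of the innermost body in the nested loops above.
count <- 0
for (i in 1:5) {
  for (j in 1:5) {
    for (k in 1:5) {
      count <- count + 1
      if (k == 3) break      # leaves the k loop only
    }
  }
}
count  # 5 * 5 * 3 = 75
```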


For loops with next statement: Example

A next statement is useful when we want to skip the current iteration of a loop without terminating it. On encountering next, R skips further evaluation and starts the next iteration of the loop.

x <- 1:5

for (i in x) {

if (i == 3){

next

}

print(i)

}

[1] 1

[1] 2

[1] 4

[1] 5


For loops: Examples

Calculating factorials for an integer x: $x! = x(x-1)(x-2) \cdots 3 \cdot 2 \cdot 1$

fact=function(x){

fact=1

if (x<2) {fact=1}

else { for (i in 2:x) fact=fact*i }

return (fact)

}

> fact(5)

[1] 120

Another function, written recursively without a loop:

My.factorial <- function(x) {

if (x == 0) return (1)

else return (x * My.factorial(x-1))

}

> My.factorial(5)

[1] 120
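A third possibility, sketched here as an assumption rather than as part of the lecture, is to take the product of 1, 2, ..., x with prod(). The name fact_prod is hypothetical; the result can be compared with R's built-in factorial().

```r
# Factorial as a product; seq_len(0) is empty and prod() of nothing is 1, so 0! = 1.
fact_prod <- function(x) prod(seq_len(x))

fact_prod(5)                                 # 120
all.equal(fact_prod(10), factorial(10))      # agrees with the built-in function
```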


unique() function: Examples

The unique() function removes duplicated elements/rows from a vector, data frame or array.

> x <- c(2:8,4:10)

> unique(x)

[1] 2 3 4 5 6 7 8 9 10

> A=rbind(1:3,11:13,1:3,15:17,1:3)

> unique(A)

[,1] [,2] [,3]

[1,] 1 2 3

[2,] 11 12 13

[3,] 15 16 17


For loops: Examples

Example: Try this example from the R documentation:

f = sample(letters[1:5], 10, replace=TRUE)

for( i in unique(f) ) print(i)

Exercises

Try these for loops:

> for(objs in names(cars)) print(objs)

> for(trig in c(sin,cos)) print(trig(pi))

> for (i in seq(2, 10, 3)) print(i)


while loops

This kind of looping structure is suitable when the number of times the computations contained within the loop are repeated is not known in advance, and the termination of the loop depends on some other criterion. The general form of the while loop is:

while(cond) expr

The simple or compound R expression, expr, is repeatedly executed until the logical expression cond evaluates to FALSE.

So, when using while, we need to set up an indicator variable and change its value within each iteration.


while loops: Examples

Consider the following simple example of a while loop:

> y = 1

> while(y < 7){

print(y)

y = y+1

}

[1] 1

[1] 2

[1] 3

[1] 4

[1] 5

[1] 6



Let us use a break statement to stop the loop when x = 3:

> x <- 1

> while(x < 5) {x <- x+1; if (x == 3) break; print(x)}

[1] 2

Now, let us use a next statement to skip one step when x = 3:

> x <- 1

> while(x < 5) {x <- x+1; if (x == 3) next; print(x)}

[1] 2

[1] 4

[1] 5



Example: Calculation of the factorial

int=4

fact=1

ind=int

while (ind > 1){

fact=fact*ind

ind=ind-1

}

> fact

[1] 24

In the above example, ind is the indicator variable, which changes its value within each iteration.

Exercise: Generalize the above code to a function, call it Myfact, to compute the factorial of any number.


while loops Examples

In practice, a while loop is preferred when the cond expression is used for purposes other than just counting. It can be used, for example, to test a terminating condition, such as whether the error in a computed answer has fallen below a pre-specified level and is therefore acceptable. For example, the following loop accumulates a sum to compute exp(5) using the power series expansion $1 + x + x^2/2! + \cdots$:

> i=0; term=1; sum=1; x=5

> while(term>.0001){

i=i+1

term=x^i/factorial(i)

sum=sum+term

}

> sum

[1] 148.4131

> exp(5)

[1] 148.4132


while loops Examples

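In the previous example, the loop is terminated when the value of the next term to be added to the series is less than or equal to .0001.

Note that this loop will not work for negative values of x: the first computed term is then negative, so the condition term > .0001 fails and the loop stops after a single iteration.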
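One possible remedy, sketched here under assumptions, is to test the absolute value of the term so that terms of either sign are handled. The function name exp_series and the argument tol are introduced for illustration; the variable total is used instead of sum to avoid masking the built-in function.

```r
# Power-series approximation of exp(x) that also works for negative x (sketch).
exp_series <- function(x, tol = 1e-4) {
  i <- 0
  term <- 1
  total <- 1
  while (abs(term) > tol) {      # compare the size of the term, not its signed value
    i <- i + 1
    term <- x^i / factorial(i)
    total <- total + term
  }
  total
}

exp_series(5)
exp_series(-2)
exp(-2)
```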

Of course, if we knew that we need 21 terms in the series to achieve this accuracy, we could have used the expression:

> 1+sum(5^(1:20)/factorial(1:20))

[1] 148.4131


repeat loops

The repeat loop is similar to the while loop except that the condition for termination is tested inside the loop. This allows more than a single condition to be checked, and allows these conditions to occur at different places in the loop. The general form of the repeat loop is:

repeat {

expression

if(condition) {

break

}

}

where the expression is usually an R compound expression. The expression is evaluated repeatedly, so at least one break statement must be in the loop. The loop is exited only when the condition guarding a break is satisfied, and any number of break statements may appear at different places in the loop with different cond logical expressions.
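As an illustration of several exits placed at different points, the sketch below sums the series for exp(5) with two break statements: a safety cap on the number of iterations and a test on the size of the term. The names tol, max.iter and total are assumptions introduced here.

```r
# A repeat loop with two separate exit tests (illustrative sketch).
x <- 5
tol <- 1e-4
max.iter <- 100
i <- 0
total <- 1

repeat {
  i <- i + 1
  if (i > max.iter) break          # exit 1: safety cap on the number of iterations
  term <- x^i / factorial(i)
  total <- total + term
  if (abs(term) <= tol) break      # exit 2: the latest term is small enough
}

total
```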


repeat loops Examples

> x <- 1

> repeat {

print(x)

x = x+1

if (x == 7){

break

}

}

[1] 1

[1] 2

[1] 3

[1] 4

[1] 5

[1] 6



For example, the while loop in the previous example may be rewritten as follows:

i=0

term=1

sum=1

x=5

repeat{

i=i+1

term=x^i/factorial(i)

if(term<=.0001) break

sum=sum+term

}

> sum

[1] 148.4131



Try this R code:

i=0

term=1

sum=1

x=5

repeat{

i=i+1

term=x^i/factorial(i)

sum=sum+term

if(term<=.0001)break

cat("i = ",i, "Term = ", term, "Sum = ",sum,fill=T)

}

cat (" Final Value=", sum, fill=T)


Evaluating Polynomials

A polynomial f(x) of degree n is a function of the form

$$f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n.$$

This form is the standard method of representing polynomial functions for mathematical purposes. It is called the power form.

f(x) above can be rewritten in the following representation, called Horner's rule:

$$f(x) = a_0 + (a_1 + (a_2 + \cdots + (a_{n-1} + a_n x)x \cdots)x)x.$$


Evaluating Polynomials Examples

Consider the evaluation of the polynomial with power form:

$$f(x) = 7 + 3x + 4x^2 - 9x^3 + 5x^4 + 2x^5.$$

An expression for evaluating f(x) using Horner's rule is

7.0 + (3.0 + (4.0 + (-9.0 + (5.0 + 2.0 * x) * x) * x) * x) * x

which requires only 5 multiplications. (How many multiplications are there in the power form?)

This representation can have the following R code:

n=6; x=2; a=c(7,3,4,-9,5,2); sum= a[n]

for (i in (n-1):1){

sum=sum*x+a[i]

}

cat(" Series sum = ",sum,fill=T)

where a is a vector containing the coefficients $a_0, a_1, \ldots, a_5$ (so a[i] holds $a_{i-1}$, and n = 6 is the number of coefficients).
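The same loop can be wrapped in a reusable function and compared with a direct evaluation of the power form. This is a minimal sketch; the function name horner is an assumption and is not defined in base R.

```r
# Horner's rule for a polynomial with coefficients a = (a_0, a_1, ..., a_n).
horner <- function(a, x) {
  result <- a[length(a)]                      # start from the leading coefficient
  for (i in rev(seq_len(length(a) - 1))) {    # then work down to a_0
    result <- result * x + a[i]
  }
  result
}

a <- c(7, 3, 4, -9, 5, 2)
horner(a, 2)            # Horner's rule
sum(a * 2^(0:5))        # power form, for comparison
```

Using rev(seq_len(length(a) - 1)) rather than (n-1):1 means a constant polynomial (a single coefficient) skips the loop instead of indexing backwards.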

End of Lecture 6. Thank you!