Programming in R

These notes are based on Chapter 3-5 of JMR — Jones, O., Maillardet, R., & Robinson, A. (2009). Introduction to scientific programming and simulation using R. CRC Press.

Function

Just like functions in mathematics, R functions take arguments and return a value. For example, if you have a mathematical function: \[f:x\mapsto \sin(x^2+1)\] the corresponding R function can be written as f <- function(x) sin(x^2+1). To use multiple lines for the definition, use a pair of brackets { } and return keyword as follows:

f <- function(x) {
  tmp <- x^2 # tmp is a dummy name
  tmp <- tmp + 1
  tmp <- sin(tmp)
  return(tmp)
}

Notice the use of # for comments.

for loop

JMR 3.9.2

Let \(h(x,n) = 1 + x + x^2+\dots+ x^n = \sum_{i=0}^n x^i\). Write an R program to calculate \(h(x,n)\) using for loop.

h <- function(x,n) {
  sum <- 0
  for (i in 0:n) {
    sum <- sum + x^i
  }
  return(sum)
}

JMR 3.9.3

The function \(h(x,n)\) from JMR 3.9.2 is the finite sum of a geometric sequence. It has the following explicit formula, for \(x\neq1\), \[ h(x,n) = \frac{1-x^{n+1}}{1-x} .\] Test your program from JMR 3.9.2 against this formula using the following values

\(x\) \(n\) \(h(x,n)\) \(\frac{1-x^{n+1}}{1-x}\)
0.3 55 1.4285714 1.4285714
6.6 8 4.243335510^{6} 4.243335510^{6}

You may use format(h(6.6,8), scientific=FALSE) for a nicer look.

while loop

JMR 3.9.4

First write a program that achieves the same result as in JMR 3.9.2 but using a while loop. Then write a program that does this using vector operations (and no loops).

h2 <- function(x,n) {
  sum <- 0
  i <- 0
  while (i <= n) {
    sum <- sum + x^i
    i <- i + 1
  }
  return(sum)
}
x <- 6.6
n <- 8
h2(x,n)
## [1] 4243336
sum(x^(1:n)) # Just one line!
## [1] 4243335

Vector operations are faster and cool. We should use them when performance is required (e.g., Size of the Mandelbrot set). But, if they run for a second or two, I recommend using loops to be explicit about what you are doing.

if statement

Are you prime?

Write a function that returns whether an integer \(n\geq2\) is prime. (Recall that \(n\geq2\) is prime if it has only two factors: \(1\) and \(n\).)

prime <- function(n) {
  if (n > 2){
    for (m in 2:(n/2)) {
    # for (m in 2:(sqrt(n))) {
      if (n %% m == 0) {
        return(FALSE)
      }
    }
  }
  return(TRUE)
}
prime(7)
## [1] TRUE

We may replace n/2 with sqrt(n) and make it more efficient because if \(n=m_1m_2\), at least one of \(m_1\) and \(m_2\) must be less than or equal to \(\sqrt{n}\).

JMR 3.9.8

How does program provided in JMR p.32 behave if a2=0 and/or a1=0? Using if statements, modify the program so that it gives sensible answers for all possible (numerical) inputs.

a2 <- 0
a1 <- 2
a0 <- 1
discrim <- a1^2 - 4*a2*a0
if (discrim > 0) {
  roots <- c( (-a1 + sqrt(a1^2 - 4*a2*a0))/(2*a2),
              (-a1 - sqrt(a1^2 - 4*a2*a0))/(2*a2) )
} else {
  if (discrim == 0) {
    roots <- -a1/(2*a2)
  } else {
    roots <- c()
  }
}
roots
## [1]  NaN -Inf

When using the quadratic formula, a problematic case is a2=0, which makes the function no longer quadratic but linear. So, let’s separate this case using if.

if (a2 != 0) {
  if (discrim > 0) {
    roots <- c( (-a1 + sqrt(a1^2 - 4*a2*a0))/(2*a2),
                (-a1 - sqrt(a1^2 - 4*a2*a0))/(2*a2) )
  } else {
    if (discrim == 0) {
      roots <- -a1/(2*a2)
    } else {
      roots <- c()
    }
  }
} else if (a1 != 0) {
  roots <- -a0/a1
} else {
  roots <- c()
}
roots
## [1] -0.5

Recursive function

A recursive function is a function that calls itself. Although powerful and useful because many algorithms are recursive in nature, it sometimes takes time to follow the flow of executions.

JMR 5.5.1

A factorial function \(f(n)\) (commonly denoted by \(n!\)) is defined as follows: \[ f(n) = \begin{cases} n\cdot(n-1)\cdots1 & \text{for } n \geq 1 \\ 1 & \text{for } n = 0 \end{cases} \]

Notice the recursion for \(n\geq1\), \[f(n) = n\cdot(n-1)\cdots1 = n\cdot f(n-1) .\] That is, to return a value for an argument \(n\), the factorial function calls itself with another argument \(n-1\).

Let’s implement this simple recursive function.

factorial <- function(n) {
  if (n >= 1) {
    return(n*factorial(n-1)) # calling itself!
  } else {
    return(1)
  }
}

factorial(5)
## [1] 120

It is illuminating, but the use of recursion for factorial with fixed \(n\) seems like overkill as we can simply use for loop. The next example is more practical.

Which of you are prime?

Write a function that returns a subset of \(\{2,3,\dots,n\}\) which contains only primes.

An easy solution is to use a for loop to call prime() defined in Are you prime? \(n-1\) times.

primesieve1 <- function(n) {
  S <- c()
  for (i in 2:n) {
    if (prime(i)) {
      S <- c(S, i)
    }
  }
  return(S)
}

According to JMR 5.5.2, however, there is a lot more efficient algorithm called the Sieve of Eratosthenes, which uses a recursive function.

  1. Set \(p = 2\), \(P = \{\}\) and \(C = \{2,3,\dots,n\}\).
  2. Append \(p\) to \(P\) and remove all multiples of \(p\) from \(C\).
  3. Set \(p = \min(C)\) and \(n = \max(C)\).
  4. Return \(P+C\) if \(p > \sqrt{n}\). Otherwise, go back to #1.
primesieve2 <- function(P,C) {
  p <- C[1] # faster than min(C)
  n <- C[length(C)] # faster than max(C)
  if (p > sqrt(n)) {
    # cat("P = {", P, "}\n")
    # cat("C = {", C, "}\n")
    return(c(P,C))
  } else {
    P <- c(P, p) # append p
    C <- C[C%%p != 0] # removing multiples of p
    return(primesieve2(P,C))
  }
}

Let’s see how faster the recursive algorithm performs for the same result.

n = 1e5
tic()
r1 = primesieve1(n)
toc()
## 24.954 sec elapsed
tic()
r2 = primesieve2(c(), 2:n)
toc()
## 0.024 sec elapsed
if (identical(r1,r2)) {
  cat("Both produce the same set containing", length(r2), "primes." )
} else {
  cat("Well, something went wrong...")
}
## Both produce the same set containing 9592 primes.

We will see another recursive function in Numerical Integration

Plotting

For practice, let’s draw a hyperbolic tangent with three fixed points at \(x = 0, \pm1.915\). \[y = 2\tanh(x) = 2\frac{e^{x}-e^{-x}}{e^{x}+e^{-x}}\]

x <- c(-1.915,0,1.915)
y <- c(-1.915,0,1.915)
plot(x, y, pch=19, xlim=c(-2,2), ylim=c(-2,2))
title(expression(y==2*tanh(x)))
curve(2*tanh(x), from=-3, to=3, add=TRUE)
curve(x*1, from=-3, to=3, add=TRUE)
text(-1.55, -.8, "fixed point at -1.915")
text(0, -.2, "fixed point at 0", pos=4)
text(1.55, .8, "fixed point at 1.915")
arrows(-1.8, -1, -1.9, -1.8, length=0.1)
arrows(1.8, 1, 1.9, 1.8, length=0.1)

RStudio

The interface of RStudio may look like this.