These notes are based on Chapter 11 of Jones, O., Maillardet, R.,
& Robinson, A. (2009). *Introduction to scientific programming
and simulation using R*. CRC Press.

A definite integral \(\int_a^b
f(x)dx\) is easy to compute if we can find an antiderivative
\(F\) such that \(F'(x) = f(x)\) because \(\int_a^b f(x)dx = F(b)-F(a)\).
Unfortunately, for many functions, it is impossible to write down an
antiderivative in a closed form. This is a key motivation for numerical
integration, which could provide good approximations of definite
integrals. For example, the cumulative distribution function (CDF) of
the standard normal distribution \[\Phi(z) =
\int_{-\infty}^z\frac{1}{\sqrt{2\pi}}e^{-x^2/2}dx\] is commonly
computed via numerical integration. R has a built-in `pnorm`

for \(\Phi\) and, for example, the
following is a good approximation of \(\Phi(1.96)\).

`pnorm(1.96)`

`## [1] 0.9750021`

A general strategy in both analytical and numerical mathematics is
** when a function is too complex to handle, we approximate it
by simple functions**. However, there is a key difference
between two realms. In analytical integration, an integral is defined as
a limit. So, it usually does not matter whether we approximate the
integrand by functions or constant values. In numerical integration, on
the other hand, we always have to settle for finite granularity, so the
choice of simple functions affects the performance of approximation.

In these notes, we consider linear and quadratic approximations. Specifically, the following are three methods in ascending order of sophistication:

- Trapezoidal rule
- Simpsonâ€™s rule
- Adaptive quadrature

These are

deterministicmethods for numerical integration. In contrast, Monte Carlo integration is a class ofstochasticmethods.

A test problem is to integrate \(f(x) = x\sin(x)+5\) over \([0,3\pi]\), which can be analytically computed using integration by parts. \[\begin{align} \int_0^{3\pi} x\sin(x)+5 dx &= \int_0^{3\pi} x\sin(x) dx + \int_0^{3\pi} 5 dx\\ &= \left( -x\cos(x)\Bigr\rvert_0^{3\pi} + \int_0^{3\pi}\cos(x)dx \right) + 15\pi\\ &= 18\pi\\ &\approx 56.5486677646 \end{align}\]

```
<- function(x) x*sin(x)+5
f integrate(f, 0, 3*pi) # R built-in function
```

`## 56.54867 with absolute error < 8e-12`

Given a finite set of subintervals, the trapezoidal rule approximates the area under \(y=f(x)\) by the finite sum of trapezoid areas, which are easy to compute. For example, using \(n=6\) subintervals,

```
curve(x*sin(x)+5, xaxt="n", lwd=1.5, from=-.5, to=9.9, ylab="y")
abline(h=0)
title("Illustration of the trapezoidal rule")
<- seq(0, 3*pi, .5*pi)
x # How can we generate these expressions using a loop?
<- c(expression(x[0]==0), expression(x[1]), expression(x[2]),
letters expression(x[3]), expression(x[4]), expression(x[5]),
expression(x[6]==3*pi))
axis(1, at=x, labels=letters)
<- c(0, .5*pi, pi, 1.5*pi, 2*pi, 2.5*pi, 3*pi)
from_x <- rep(0,7)
from_y <- from_x
to_x <- sapply(from_x, function(x) x*sin(x)+5)
to_y segments(from_x, from_y, to_x, to_y, lty=3)
<- from_x[-length(from_x)]
from_x <- to_y[-length(to_y)]
from_y <- to_x[-1]
to_x <- to_y[-1]
to_y segments(from_x, from_y, to_x, to_y, lty=3)
```

and using \(n=12\) subintervals,

```
curve(x*sin(x)+5, xaxt="n", lwd=1.5, from=-.5, to=9.9, ylab="y")
abline(h=0)
title("Illustration of the trapezoidal rule")
<- seq(0, 3*pi, length.out=13)
x # How can we generate these expressions using a loop?
<- c(expression(x[0]==0), expression(x[1]), expression(x[2]),
letters expression(x[3]), expression(x[4]), expression(x[5]),
expression(x[6]), expression(x[7]), expression(x[8]),
expression(x[9]), expression(x[10]), expression(x[11]),
expression(x[12]==3*pi))
axis(1, at=x, labels=letters)
<- seq(0, 3*pi, length.out=13)
from_x <- rep(0,13)
from_y <- from_x
to_x <- sapply(from_x, function(x) x*sin(x)+5)
to_y segments(from_x, from_y, to_x, to_y, lty=3)
<- from_x[-length(from_x)]
from_x <- to_y[-length(to_y)]
from_y <- to_x[-1]
to_x <- to_y[-1]
to_y segments(from_x, from_y, to_x, to_y, lty=3)
```