Implementing a particle filter in C++ using Rcpp (part 1)

In one of my previous blog posts I demonstrated how to implement a particle filter in R.

In this post I’m going to implement the same particle filter, but this time I’ll program the filter in C++. The R packages Rcpp (version 0.12.11) and RcppArmadillo (version 0.7.800.2.0) will make it possible to subsequently run this filter in R, since these two R packages allow us to connect C++ to R.

However, to run this particle filter in R you need a working C++ compiler. For help with installing a C++ compiler you may want to check out this Rcpp manual by Hadley Wickham.

My next blog post will show how to implement backward-simulation particle smoothing in C++, together with a method for forecasting with the particle filter.

Suggestions and/or questions? Please contact Stefan Gelissen (email: info at datall-analyse.nl).
See this page for an overview of all of Stefan’s R code blog posts.

 

Implementation of the particle filter in C++
The C++ code implements the particle filter (PF) as described by Crassidis and Junkins in their 2012 book Optimal Estimation of Dynamic Systems. This PF is also referred to as the bootstrap filter.
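
To give a feel for what the bootstrap filter does at every time step, the sketch below shows a single filter step for a one-dimensional state, written in an RcppArmadillo style. This is not the code in PF.cpp (which handles multidimensional states, missing values, control input, several resampling schemes, roughening, and more); the function name pfStep, its signature, and the N/2 resampling threshold are purely illustrative.

#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]

// One time step of a bootstrap particle filter (illustrative sketch only):
// xp: particles from the previous time step, w: normalized importance weights,
// y: current measurement, W/V: process and measurement noise variances,
// GG/FF: state transition and measurement functions (scalar state and measurement).
void pfStep(arma::vec& xp, arma::vec& w, double y, double W, double V,
            double (*GG)(double), double (*FF)(double)) {
  const int N = xp.n_elem;

  // 1. Propagate every particle through the state equation and add process noise
  for (int i = 0; i < N; i++)
    xp(i) = GG(xp(i)) + R::rnorm(0.0, std::sqrt(W));

  // 2. Reweight the particles with the Gaussian measurement likelihood
  for (int i = 0; i < N; i++)
    w(i) *= R::dnorm(y, FF(xp(i)), std::sqrt(V), 0);
  w /= arma::sum(w);

  // 3. Stratified resampling when the effective sample size gets too small
  double ess = 1.0 / arma::sum(arma::square(w));
  if (ess < N / 2.0) {
    arma::vec cdf = arma::cumsum(w);
    arma::uvec idx(N);
    int j = 0;
    for (int i = 0; i < N; i++) {
      double u = (i + R::runif(0.0, 1.0)) / N;  // one uniform draw per stratum
      while (j < N - 1 && cdf(j) < u) j++;
      idx(i) = j;
    }
    arma::vec resampled = xp.elem(idx);
    xp = resampled;
    w.fill(1.0 / N);  // equal weights after resampling
  }
  // The filtered state estimate at this step is the weighted mean sum(w % xp).
}

The actual filter in PF.cpp exposes this machinery through the PFcpp() function that is called in the R code below.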

Download the C++ code (called PF.cpp) for the particle/bootstrap filter from here and save it to some folder on your computer.
For Example 1 to work, you also need to download this C++ header file (called PF_example1.h), and save it into the same folder on your computer as the PF.cpp file.

R code

library(Rcpp)
library(RcppArmadillo)

#Note that the implemented particle filter has the following arguments:
#-y is the data. The data has to be an n x d matrix, where n is the number of
# observations/time steps (i.e., sample size), and d the number of dimensions.
# The data may be one-dimensional (one-dimensional measurements / univariate time
# series) or multi-dimensional (multidimensional measurements / multivariate time
# series). In the multi-dimensional case, each of the columns contains the
# measurements on a single dimension (or, similarly, a single [univariate] time
# series).
# Missing values (NA's) are allowed.
#-mod is a list with the components m0 (initial state estimates),
# C0 (error covariances of the initial state estimates), V (measurement noise),
# and W (process noise). All components of this list have to be in matrix form.
#-inputControl is an optional matrix that specifies the control input for each of
# the states. For control input that is known to be constant over time, the matrix
# is 1 x p (where p is the number of states). For time varying control input, the
# matrix is n x p (where n is the number of observations/time steps, and p the number
# of states). 
#-whichGGfunction is an optional integer number that is used by the particle filter
# to select among differently specified GGfunctions. Note that the GGfunction
# specifies the state transition function.
#-whichFFfunction is an optional integer number that is used by the particle filter
# to choose among differently specified FFfunctions. Note that the FFfunction
# specifies the observation/measurement function.
#-N is the number of particles.
#-Nthreshold is a number specifying the resampling threshold. That is, resampling
# takes place when the effective sample size (i.e., 1/sum(w^2), with w the normalized
# importance weights) falls below this threshold.
#-resampling is the resampling method to be applied: either "mult"
# (=multinomial resampling), "strat" (=stratified resampling), or
# "sys" (=systematic resampling).
#-roughening: if TRUE, then perform roughening.
#-Grough is a number specifying the roughening tuning parameter.
#-MCparticles: if TRUE, then include the state estimates (=xpt), and
# the importance weights (=wt) of each particle at each time step in the output.
#-logLik: if TRUE, then return the negative loglikelihood of the specified model.
#-pseudoInverse: if TRUE, then use the pseudo-inverse for computing the inverse
# of the V matrix (=measurement noise). Consider using the pseudo-inverse in case
# of a nearly singular V matrix.
#-simplify: if TRUE, then do not include the data (=y) in the output.

#Finally, note that this Particle Filter assumes (time invariant) additive
#Gaussian noise.



##EXAMPLE 1

#The following example can be found in Crassidis and Junkins's
#2012 book "Optimal Estimation of Dynamic Systems".

#In this example we will estimate the state of the following
#nonlinear discrete-time system:
#x(k+1) = x(k)/2 + 25*x(k)/(1+x(k)^2) + 8*cos(1.2*k) + w(k)
#where x(k+1) is the state at time step k+1, x(k) the state at time
#step k, and w(k) the process noise. The process noise variance W is set
#to 10 in this example.

#The measurement equation is:
#y(k) = x(k)^2/20 + v(k)
#where v(k) is the measurement noise. The measurement noise variance V is set to 1.



##TRUE STATES AND MEASUREMENTS

#sample time
dT <- 1

#observation times
t <- seq(1, 75, dT)

#process noise (standard deviation)
Wk <- sqrt(10)

#measurement noise (standard deviation)
Vk <- sqrt(1)

#state at t=0
X0 <- 0 + rnorm(1)*sqrt(5)

#simulate true state for x at the observation times
Xk <- matrix(NA, nrow=length(t)+1, ncol=1)
Yk <- matrix(NA, nrow=length(t), ncol=1)
Xk[1] <- X0

for (i in 2:(length(t)+1)) {
  Xk[i] <- Xk[i-1]/2 + (25*Xk[i-1])/(1+Xk[i-1]^2) + 8*cos(1.2*(i-1)) + rnorm(1)*Wk
  Yk[i-1] <- (Xk[i]^2)/20 + rnorm(1)*Vk}

#plot simulated states
plot(c(0,t), Xk, type="o", xlab="Time, k", ylab="x(k)")

#plot measurements
plot(t, Yk, type="o", xlab="Time, k", ylab="y(k)")

#store measurement data
dataEx1 <- Yk



##PARTICLE/BOOTSTRAP FILTER (PF)

#Dynamic model:
#specifying 1 state, namely [x].
#Note that you may change the initial state estimate (at t=0) below for x
#and see how it influences the behavior of the particle filter.
ex1 <- list(m0=as.matrix(0), #initial state estimate
            #error covariance of the initial state estimate:
            #this vector reflects the uncertainty in our initial state estimate
            #you may change this value and see how it influences
            #the behavior of the particle filter
            C0=as.matrix(5),
            #measurement noise
            V=as.matrix(Vk^2),
            #process noise
            W=as.matrix(Wk^2))
#CAUTION: all the list components above have to be in matrix form (if not,
#the particle filter in C++ won't work and will return an error message).


##Compute the filtered (a posteriori) state estimates with the PF

#Compile the C++ code using Rcpp
sourceCpp("yourFolder/PF.cpp")

#The GGfunction (=state transition function) and FFfunction (=observation/measurement
#function) for this dynamic model are coded in C++, and can be found in the
#header file called "PF_example1.h".
#The compiled Particle Filter will use the defined GGfunction and FFfunction
#due to the statement "#include <PF_example1.h>" (which can be found at the top
#of the C++ code in the "PF.cpp" file). 

#Run the particle filter in R
pf1 <- PFcpp(y=as.matrix(dataEx1), mod=ex1, N=500, resampling="strat",
             MCparticles=TRUE)

#Plot the filtered state estimate for the state x
plot(c(0,t), Xk, type="o", col=c(gray(level=.5)),
     ylim=range(cbind(pf1$m, Xk)),
     ylab="x(k)", xlab="Time, k")
lines(c(0,t), pf1$m, lty=2, col="blue", lwd=1)
legend("topleft", lty=c(1, 2),
       col=c(gray(level=.5), "blue"),
       legend=c("true state", "PF"),
       bty="n", y.intersp=1.2, cex=.7)


#We will also use a wrongly specified state transition function. This wrong model
#is also defined in the "PF_example1.h" file, and is selected by setting
#whichGGfunction=2.
pf1a <- PFcpp(y=as.matrix(dataEx1), mod=ex1, N=500, resampling="strat",
              whichGGfunction=2, MCparticles=FALSE)

#Plot the filtered state estimate for the wrongly specified state transition function
plot(c(0,t), Xk, type="o", col=c(gray(level=.5)),
     ylim=range(cbind(pf1a$m, Xk)),
     ylab="x(k)", xlab="Time, k")
lines(c(0,t), pf1a$m, lty=2, col="blue", lwd=1)
legend("topleft", lty=c(1, 2),
       col=c(gray(level=.5), "blue"),
       legend=c("true state", "PF"),
       bty="n", y.intersp=1.2, cex=.7)


#Construct error plots
par(mfrow=c(1, 2))

#Error plot for the correct state transition function
#(the error range of the wrong model is used for both panels, so that the
#two plots share the same y-scale)
rangeError <- max(abs(Xk-pf1a$m))
plot(c(0,t), abs(Xk-pf1$m), type="l", lty=1, col="blue",
     ylim=c(-1, rangeError), ylab="Error", xlab="Time, k",
     main="correct model")
abline(h=0, col=gray(level=.8), lty=2)
legend("topright", lty=c(1),
       col=c("blue"),
       legend=c("PF"),
       bty="n", y.intersp=1.2, cex=.7)

#Error plot for the wrong state transition function
rangeError <- max(abs(Xk-pf1a$m))
plot(c(0,t), abs(Xk-pf1a$m), type="l", lty=1, col="blue",
     ylim=c(-1, rangeError), ylab="Error", xlab="Time, k",
     main="wrong model")
abline(h=0, col=gray(level=.8), lty=2)
legend("topright", lty=c(1),
       col=c("blue"),
       legend=c("PF"),
       bty="n", y.intersp=1.2, cex=.7)

par(mfrow=c(1, 1))


#From now on, we will proceed with the correct state transition function

#Compute the confidence intervals for the filtered state estimates of the PF
Clist <- sapply(1:dim(pf1$C)[3], function(x) pf1$C[ , , x], simplify=FALSE)
#95% ci for x
seFX <- sqrt(unlist(lapply(Clist, function(x) diag(x)[1])))
ciFX <- pf1$m[,1] + qnorm(.05/2)*seFX%o%c(1, -1)

#Plot confidence intervals for the state x
plot(c(0,t), Xk, type="l", col=c(gray(level=.7)), lwd=1.5,
     ylab="x(k)", xlab="Time, k")
lines(c(0,t), ciFX[,1], lty=2, col="blue")
lines(c(0,t), ciFX[,2], lty=2, col="blue")
legend("topright", lty=c(1, 2), 
       col=c(gray(level=.5), "blue"),
       legend=c("true state", "filtered state"),
       bty="n", y.intersp=1.2, cex=.7)


#Densities for the a posteriori states (estimated by the PF) at several
#specified time steps (i.e., k=20, k=30, k=40, and k=60).

par(mfrow=c(2, 2))

#at time, k=20
plot(density(pf1$xpt[,,20], weights=pf1$wt[20,]), main="Time, k=20")

#at time, k=30
plot(density(pf1$xpt[,,30], weights=pf1$wt[30,]), main="Time, k=30")

#at time, k=40
plot(density(pf1$xpt[,,40], weights=pf1$wt[40,]), main="Time, k=40")

#at time, k=60
plot(density(pf1$xpt[,,60], weights=pf1$wt[60,]), main="Time, k=60")

par(mfrow=c(1, 1))

In the following example we are going to employ a dynamic model that differs from the model used in the previous example. As a consequence, we have to change our state transition and observation functions. More specifically, we have to change the body of the C++ code for the GGfunction (=state transition function) and the FFfunction (=observation function).
For ease of use, I have already changed the body of these functions, and you can download the C++ code of these newly defined functions from here.
After downloading this file, save it into exactly the same folder where you already saved the C++ code for the particle/bootstrap filter (the file called PF.cpp). Save the downloaded file as a header file (i.e., with the extension .h), and use a name such as PF_example2.h.
Subsequently, the particle filter code in PF.cpp needs to know that it has to use this new header file PF_example2.h. To this end, change the statement #include <PF_example1.h> (found at the top of the C++ code in the PF.cpp file) into #include <PF_example2.h>.
Finally, note that every time you change the body of the GGfunction or FFfunction, you should save the header file under a new name (with the extension .h) and update the #include statement as described above. That way you can be sure that the particle filter actually uses the changed GGfunction and FFfunction when you run it in R. A rough sketch of what the body of such a header file might look like is given below.
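
To give an impression of what the body of such a header file might contain, here is a minimal sketch of a GGfunction and FFfunction for the Lotka-Volterra model of Example 2 below: one forward-Euler step of the prey-predator equations (with the sample time dT=.01 baked in), and a measurement function that only observes the prey population. The function names follow the convention used in this post, but the exact signatures expected by PF.cpp are assumptions on my part; treat the downloadable PF_example2.h as the authoritative version.

#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]

// State transition function (illustrative): one forward-Euler step of the
// Lotka-Volterra equations, with x = [prey, predators] and sample time dT = .01
arma::colvec GGfunction(const arma::colvec& x) {
  const double alpha = 1.0, beta = 1.0/300, delta = 1.0/200, gamma = 1.0;
  const double dT = .01;
  arma::colvec xNew(2);
  xNew(0) = x(0) + (alpha*x(0) - beta*x(0)*x(1))*dT;   // prey
  xNew(1) = x(1) + (delta*x(0)*x(1) - gamma*x(1))*dT;  // predators
  return xNew;
}

// Observation function (illustrative): only the prey population is measured
arma::colvec FFfunction(const arma::colvec& x) {
  arma::colvec y(1);
  y(0) = x(0);  // the measurement is the (noise-free) amount of prey
  return y;
}

Setting whichGGfunction=2 would then simply select an alternatively defined state transition function (for instance one with a different sample time dT) from the same header file.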

##EXAMPLE 2: ESTIMATING PARAMETERS OF A PARTICLE FILTER MODEL USING MAXIMUM LIKELIHOOD

#In this example we will apply the Particle Filter (PF) to the
#Lotka-Volterra system of nonlinear differential equations. The Lotka-Volterra
#equations describe how prey and predators interact.
#See the following Wikipedia article for more information on these equations:
#en.wikipedia.org/wiki/Lotka–Volterra_equations



##TRUE STATES

#Generate the amount of prey and predators based on the Lotka-Volterra equations.
#Data are generated using the forward Euler method.
dt <- 1/5000 #time step for Euler integration
tt <- 7 #upper bound of time window
st <- seq(0, tt, by=dt) #lower time bounds of the integration intervals
ns <- length(st) #number of Euler integrations
x <- matrix(0, ncol=2, nrow=ns) #specify matrix for states
x[1,] <- c(400, 200) #respective amounts of prey and predators at t=0
colnames(x) <- c("Prey", "Predators")

#Parameters in the Lotka-Volterra model
alpha <- 1
beta <- 1/300
delta <- 1/200
gamma <- 1

#Simulate true states
for (i in 2:ns) {
  #prey population
  x[i,1] <- x[i-1,1] + (alpha*x[i-1,1] - beta*x[i-1,1]*x[i-1,2])*dt
  #predator population
  x[i,2] <- x[i-1,2] + (delta*x[i-1,1]*x[i-1,2] - gamma*x[i-1,2])*dt
}

#Phase-space plot of simulated predator-prey system
plot(x[,1], x[,2], type="l", xlab="Prey", ylab="Predators")



##MEASUREMENTS:
#Take measurements of the prey population

#Take measurements with a sample time of .01
dT <- .01 #sample time for measurements
#You may change the value of dT and see how it influences the behavior of the
#particle filter.
#However, if you want to change this value for dT, then you also need to change
#the same value in the C++ code of the GGfunction. One such different value is
#implemented for whichGGfunction=2 in "PF_example2.h".
nm <- tt/dT #total number of measurements
mt <- seq(dT, tt, dT) #measurement times

#Standard deviations for the measurement noise
sigmaPrey <- 7 #prey

#Measurements at specified measurement times
yprey <- sapply(1:nm, function(i) x[i*((ns-1)/nm) + 1, 1] + rnorm(1, 0, sigmaPrey))

#Store measurement data
dataEx2 <- yprey

#Plot the true prey population and the measurements against time
plot(st, x[,1], type="l", xlab="Time", ylab="Prey",
     ylim=range(yprey))
lines(mt, yprey, col="darkgreen")



#However, we assume that the actual measurements are only made at every 5th
#measurement time (i.e., with a sample time of .05, and not the sample time of
#.01 that was used above). Therefore, we are going to replace the values in
#between the actual measurements with NA.
#Note, however, that the particle filter can handle missing values (NA's).
#Moreover, the particle filter will actually compute the two state
#values (for amount of prey and amount of predators) at the steps in between the
#measurements (i.e., where we will now set the values to NA).
dataEx2prey <- rep(NA, length(yprey))
#keep the measurements at every 5th measurement time (the values in between remain NA)
dataEx2prey[c(seq(.05/dT, length(yprey), .05/dT))] <- yprey[c(seq(.05/dT, length(yprey), .05/dT))]
dataEx2a <- as.matrix(dataEx2prey)

#Check the data, and notice that we now have measurements at every 5th occasion
head(dataEx2a, 15)

#Plot the true prey population and the retained measurements against time
plot(st, x[,1], type="l", xlab="Time", ylab="Prey",
     ylim=range(yprey))
points(mt, dataEx2a, col="darkgreen", cex=.7)



##PARTICLE FILTER (PF)

#Dynamic model:
#specifying 2 states, namely [amount of prey, amount of predators].
#Note that the specified initial state estimates (at t=0) below for prey and
#predators deviate from the values that were used for generating the data
#(i.e., 400 for prey and 200 for predators).
#You may change these initial state estimates too and see how they
#influence the behavior of the particle filter.
ex2 <- list(m0=as.matrix(c(390, 220)), #initial state estimates
            #error covariances of the initial state estimates:
            #this matrix reflects the uncertainty in our initial state estimates
            #you may change the values in this matrix and see how they influence
            #the behavior of the particle filter
            C0=diag(rep(100,2)),
            #measurement noise
            V=as.matrix(sigmaPrey^2),
            #process noise
            W=diag(rep(0,2)))
#CAUTION: all the specified list elements above have to be in matrix form (if not,
#the particle filter in C++ won't work and will return an error message).

#Furthermore, note that all entries of the process noise matrix (W) are set to
#zero. This makes sense since the change in the amount of prey and predators at
#each time step is fully explained by our Lotka-Volterra model describing the
#prey-predator interaction.



##Compute the filtered (a posteriori) state estimates

#Compile the C++ code using Rcpp:
#before compiling the C++, don't forget to use the correct C++ code for the
#GGfunction and FFfunction (i.e., use "#include <PF_example2.h>" in the
#"PF.cpp" file).
sourceCpp("yourFolder/PF.cpp")

#One note on the state transition function (=GGfunction in the C++ code
#of the header file "PF_example2.h"):
#as the state transition function we use the difference equations given by
#Euler's forward method (cf. the illustrative header sketch shown above, just
#before the Example 2 code). These difference equations yield valid estimates
#for the amounts of prey and predators at each time step, and approximate the
#exact solution of the system of nonlinear differential equations, as long as
#the specified value for dT above is relatively small.

#Run the particle filter in R
pf2 <- PFcpp(y=as.matrix(dataEx2a), mod=ex2, N=500, resampling="strat",
              Nthreshold=50, roughening=TRUE, Grough=.01, logLik=TRUE,
              MCparticles=FALSE)


#Plot the filtered state estimates
plot(x[,1], x[,2], type="l", lwd=2, col=gray(level=.5), 
     xlab="Prey", ylab="Predators")
lines(pf2$m[,1], pf2$m[,2], lty=2, col="blue", lwd=2)
legend("topright", lty=c(1, 2), lwd=c(2, 2),
       col=c(gray(level=.5), "blue"),
       legend=c("true state", "filtered state"),
       bty="n", y.intersp=1.2, cex=.7)

#Compute the 95% confidence intervals for the filtered state estimates
Clist <- sapply(1:dim(pf2$C)[3], function(x) pf2$C[ , , x], simplify=FALSE)
#95% ci for prey
sePe <- sqrt(unlist(lapply(Clist, function(x) diag(x)[1])))
ciPe <- pf2$m[,1] + qnorm(.05/2)*sePe%o%c(1, -1)
#95% ci for predators
sePr <- sqrt(unlist(lapply(Clist, function(x) diag(x)[2])))
ciPr <- pf2$m[,2] + qnorm(.05/2)*sePr%o%c(1, -1)

#Prey population
plot(st, x[,1], type="l", xlab="time", ylab="Prey",
     col=gray(level=.7), lwd=1.5)
lines(c(0,mt), ciPe[,1], lty=2, col="blue")
lines(c(0,mt), ciPe[,2], lty=2, col="blue")
legend("bottomright", lty=c(1, 2), 
       col=c(gray(level=.5), "blue"),
       legend=c("true state", "filtered state"),
       bty="n", y.intersp=1.2, cex=.7)

#Predator population
plot(st, x[,2], type="l", xlab="time", ylab="Predators",
     col=gray(level=.7), lwd=1.5)
lines(c(0,mt), ciPr[,1], lty=2, col="blue")
lines(c(0,mt), ciPr[,2], lty=2, col="blue")
legend("topright", lty=c(1, 2), 
       col=c(gray(level=.5), "blue"),
       legend=c("true state", "filtered state"),
       bty="n", y.intersp=1.2, cex=.7)



##Estimate parameters using Maximum likelihood:
#we will estimate the measurement variance (=V) of the prey population by ML. 

#First, however, we are going to explore the negative log-likelihood function in the
#neighborhood of the true value.

#Function for computing the negative log-likelihood
llikPF <- function (theta) {
  
  mod <- list(
    m0=as.matrix(c(390, 220)),
    C0=diag(100, 2),
    V=as.matrix(theta),
    W=diag(rep(0,2)))
  
  PFcpp(y=as.matrix(dataEx2a), mod=mod, N=500, resampling="strat",
        Nthreshold=50, roughening=TRUE, Grough=.01, logLik=TRUE,
        MCparticles=FALSE)$logLik
}


#Calculate negative log-likelihood values in the neighborhood of the true value
ll <- sapply(29:89, function (x) llikPF(x))

#Plot of the computed negative log-likelihood values
plot(29:89, ll, type="l",
     xlab=expression(sigma["prey"]^2), ylab="negative log-likelihood")
#Add true value
abline(v=sigmaPrey^2, col="red", lty=2)

#Notice that the curve of the negative log-likelihood function is very irregular.
#This irregular shape will be problematic for gradient-based optimization methods
#such as BFGS. Nonetheless, we might try simulated annealing to find
#the minimum of this irregular negative log-likelihood function.

library(GenSA)
#Use some plausible start value obtained from the plot of
#computed negative log-likelihood values
startValue <- 55
#Perform simulated annealing
mod2 <- GenSA(startValue, llikPF, lower=30, upper=80,
              control=list(threshold.stop=min(ll)-2, temperature=mean(ll),
                           visiting.param=2, acceptance.param=-1, max.time=60))
#Estimated parameter for V (true value: 49)
mod2$par
#Minimum value of the negative log-likelihood at the estimated parameter
mod2$value
#Compare with the minimum value from the plot of the computed negative log-likelihood values
min(ll)