R code for estimating the parameters of an unscented Kalman filter model using likelihood maximization

In one of my previous blog posts I showed how to implement and apply the Unscented Kalman Filter (UKF) in R.
In this post I will demonstrate how to fit unknown parameters of an UKF model by means of likelihood maximization.

Suggestions and/or questions? Please contact Stefan Gelissen (email: info at datall-analyse.nl).
See this page for an overview of all of Stefan’s R code blog posts.

R code

#The R functions for the Unscented Kalman filter (UKF) and its
#smoother (=UKFsmooth) can be downloaded from: 
source("http://www.datall-analyse.nl/R/UKF.R")
#Take a look at the function UKF, and notice that at the beginning of
#the script you will find a description of the function's arguments.
UKF



##EXAMPLE 1

#The following example can be found in Julier, Uhlmann, and Durrant-Whyte's
#2000 paper "A New Method for the Nonlinear Transformation of Means
#and Covariances in Filters and Estimators".

#In this example we will estimate the altitude (=y), velocity (=vy),
#and a (constant) ballistic coefficient (=bc) of a vertically falling body.
#A radar is located at a specific altitude (=H) above ground level.
#The horizontal distance between the radar and the
#vertically falling body will be denoted by M.
#The radar measures the range (=r), which is equal to the
#hypotenuse of a triangle with base M and height y-H. Thus, the range
#can be computed as sqrt(M^2 + (y-H)^2).

#The 3 state equations for this system are:
#1) d(y)/dt = -vy + w1
#2) d(vy)/dt = -exp(-gamma*y)*vy^2*bc + w2
#3) d(bc)/dt = w3
#where w1, w2, and w3 are process noises, and gamma is a constant 
#defining the relationship between air density and altitude.
#In this example we will set gamma to 5e-5 ft.

#The measurement equation is:
#yk = sqrt(M^2 + (y-H)^2) + v
#where v is the measurement noise (variance), which is given as 10000 ft^2.
#Furthermore, both M and H are set to 100000 ft.  



##TRUE STATES

#Generate the true states for y (=altitude), vy (=velocity), and
#bc (=ballistic coefficient).
#Data are generated using the forward Euler method.
dt <- 1/5000 #time step for Euler integration
tt <- 30 #upper bound of time window
st <- seq(0, tt, by=dt) #lower time bounds of the integration intervals
ns <- length(st) #number of Euler integrations
x <- matrix(0, ncol=3, nrow=ns) #specify matrix for states
x[1,] <- c(3e+5, 2e+4, 1e-3) #state values at t=0
colnames(x) <- c("y", "vy", "bc")

#parameters in the model
gamma <- 5e-5
M <- 1e+5
H <- 1e+5

#simulate true states
for (i in 2:ns) {
  #altitude (=y)
  x[i,1] <- x[i-1,1] + (-x[i-1,2])*dt
  #velocity (=vy)
  x[i,2] <- x[i-1,2] + (-exp(-gamma*x[i-1,1])*x[i-1,2]^2*x[i-1,3])*dt
  #ballistic coefficient (=bc)
  x[i,3] <- x[i-1,3]
}

par(mfrow=c(2, 1))
#plot of altitude (y) against time
plot(st, x[,1], type="l", ylab="Altitude", xlab="Time (seconds)")

#plot of velocity (vy) against time
plot(st, x[,2], type="l", ylab="Velocity", xlab="Time (seconds)")
par(mfrow=c(1, 1))



##MEASUREMENTS

#Take measurements with a sample time of .1
dT <- .1 #sample time for measurements
#you may change the value of dT and see how it influences
#the behavior of the unscented Kalman filter
nm <- tt/dT #total number of measurements
mt <- seq(dT, tt, dT) #measurement times

#Standard deviation for the measurement noise
sigmaR <- sqrt(1e+4)

#Measurements at specified measurement times
r <- sapply(1:nm, function(i) sqrt(M^2 + (x[i*((ns-1)/nm) + 1, 1] - H)^2) +
              rnorm(1, 0, sigmaR))
head(cbind(mt, r),n=15) #check the generated measurements

#However, the actual measurements are made with a frequency of 2 Hz (and not
#with the sample time of .1 that was used above). Therefore, we are going to
#replace the values in between the actual measurements with NA.
#Note, however, that the Kalman filter can handle missing values (NA's).
#Moreover, the Kalman filter will actually compute the state
#values (for y, vy, and bc) at the steps in between the
#measurements (i.e., where we will now set the values to NA).
dataEx1 <- rep(NA, length(r))
#set the measurement values at intermediate steps to NA
dataEx1[c(seq(.5/dT, length(r), .5/dT))] <- r[c(seq(.5/dT, length(r), .5/dT))]
head(cbind(mt, dataEx1),n=15) #check the generated measurements

#plot the generated measurements (including the NA's)
plot(st, sqrt(M^2 + (x[,1]-H)^2), type="l", xlab="Time", ylab="Range",
     ylim=range(r, na.rm=TRUE), col=gray(level=.8))
points(mt, dataEx1, col="darkgreen", cex=.5)
legend("topright", pch=c(NA, 1), lty=c(1, NA),
       col=c(gray(level=.8), "darkgreen"),
       legend=c("true", "measured"),
       bty="n", y.intersp=1.2, cex=.7)



##ESTIMATING THE PARAMETERS OF AN UNSCENTED KALMAN FILTER (UKF) MODEL

#Specify the state transition function:
#note that we will use as state functions the difference equations
#given by Euler's forward method. These difference equations will yield valid 
#estimates for y, vy, and bc at each time step as long
#as the specified value for dT above is relatively small.
#WARNING: always use arguments x and k when specifying the GGfunction
GGfunction <- function (x, k){
  y <- x[1]; vy <- x[2]; bc <- x[3]
  c(y + (-vy)*dT,
    vy + (-exp(-gamma*y)*vy^2*bc)*dT,
    bc)}

#Specify the observation/measurement function:
#WARNING: always use arguments x and k when specifying the FFfunction
FFfunction <- function (x, k){
  y <- x[1]; vy <- x[2]; bc <- x[3]
  c(sqrt(M^2 + (y-H)^2))}



#Function for computing the negative log-likelihood of an UKF model
llikss <- function (x, data) {
  #Specify dynamic model having 3 states, namely [y, vy, bc].
  mod <- list(
    #Initial state estimates for altitude (=y), velocity (=vy), and
    #ballistic coefficient (=bc).
    #Note that the specified initial state estimate (at t=0) below for bc
    #deviates from the value that was used for generating the data (i.e., 1e-3).
    #You may change these initial state estimates too and see how they
    #influence the behavior of the unscented Kalman filter.
    m0=c(3e+5, 2e+4, 3e-5),
    #Error covariances of the initial state estimates:
    #this matrix reflects the uncertainty in our initial state estimates.
    #You may change the values in this matrix and see how they influence
    #the behavior of the Kalman filter.
    C0=diag(c(1e+6, 4e+6, 1e-2)),
    #Measurement noise (to be estimated)
    V=exp(x[1]),
    #Process noise
    W=diag(rep(0,3)))
  #Note that all covariances in the process noise matrix (W) are set to zero.
  #This makes sense since the change in y, vy and bc at each time step is fully
  #explained by the model describing the vertically falling object. In other
  #words, we assume that we correctly specified the model for our vertically
  #falling object, and we furthermore assume that there are no random disturbances
  #influencing the three states (y, vy, and bc) at each specified time step.
  
  #compute and return the negative log-likelihood
  UKF(y=data, mod=mod,
      FFfunction=FFfunction, GGfunction=GGfunction,
      simplify=TRUE, logLik=TRUE)$logLik
}

#Minimize the negative log-likelihood:
#instead of maximizing the log-likelihood, we will minimize the negative
#log-likelihood.
#Use some arbitrary starting value for the to be estimated measurement noise
#(here we will use log(1e+2), but it is possible to use some other
#arbitrary starting value).
mod <- optim(log(1e+2), llikss,
             lower=log(1e-8),
             method="L-BFGS-B", hessian=TRUE,
             data=dataEx1)
#In case the BFGS method generates an error, then try using the
#Brent method for this one-dimensional optimization problem.

#Check for convergence
mod$message
mod$convergence #successful optimization yields value 0

#Estimated measurement noise
exp(mod$par)

#Construct 95% confidence interval for the estimated measurement variance
seParms <- sqrt(diag(solve(mod$hessian)))
exp(mod$par + qnorm(.05/2)*seParms%o%c(1,-1))

#True value for the measurement noise
sigmaR^2




##EXAMPLE 2
#Similar to Example 1, but in this example we suppose that we also
#measured the velocity at each time step.

##MEASUREMENTS

#Take measurements with a sample time of .1
dT <- .1 #sample time for measurements
#you may change the value of dT and see how it influences
#the behavior of the unscented Kalman filter
nm <- tt/dT #total number of measurements
mt <- seq(dT, tt, dT) #measurement times

#Standard deviations for the measurement noise
sigmaR <- sqrt(1e+4)
sigmaVY <- sqrt(1e+5)

#Measurements at specified measurement times

#range
r <- sapply(1:nm, function(i) sqrt(M^2 + (x[i*((ns-1)/nm) + 1, 1] - H)^2) +
              rnorm(1, 0, sigmaR))
head(cbind(mt, r), n=15) #check the generated measurements

#velocity
vym <- sapply(1:nm, function(i) (x[i*((ns-1)/nm) + 1, 2]) + rnorm(1, 0, sigmaVY))
head(cbind(mt, vym), n=15) #check the generated measurements

#However, we assume that the actual measurements are again taken
#with a frequency of 2 Hz. Thus, we will set generated
#values at steps in between these actual measurements to NA.

#set the values at intermediate steps to NA for the range measurements
dataEx2r <-rep(NA, length(r))
dataEx2r[c(seq(.5/dT,length(r), .5/dT))] <- r[c(seq(.5/dT,length(r), .5/dT))]

#set the values at intermediate steps to NA for the velocity measurements
dataEx2v <-rep(NA, length(vym))
dataEx2v[c(seq(.5/dT,length(vym), .5/dT))] <- vym[c(seq(.5/dT,length(vym), .5/dT))]

#plot the generated measurements for velocity (including NA's)
plot(st, x[,2], type="l", xlab="Time", ylab="Velocity",
     ylim=range(vym, na.rm=TRUE), col=gray(level=.8))
points(mt, dataEx2v, col="darkgreen", cex=.5)
legend("topright", pch=c(NA, 1), lty=c(1, NA),
       col=c(gray(level=.8), "darkgreen"),
       legend=c("true", "measured"),
       bty="n", y.intersp=1.2, cex=.7)

#Store measurements
dataEx2 <- cbind(dataEx2r, dataEx2v)
head(cbind(mt, dataEx2), n=15) #check the generated measurements



##ESTIMATING THE PARAMETERS OF AN UNSCENTED KALMAN FILTER (UKF) MODEL

#Specify the state transition function:
#WARNING: always use arguments x and k when specifying the GGfunction
GGfunction <- function (x, k){
  y <- x[1]; vy <- x[2]; bc <- x[3]
  c(y + (-vy)*dT,
    vy + (-exp(-gamma*y)*vy^2*bc)*dT,
    bc)}

#Specify the observation/measurement function:
#WARNING: always use arguments x and k when specifying the FFfunction
FFfunction <- function (x, k){
  y <- x[1]; vy <- x[2]; bc <- x[3]
  c(sqrt(M^2 + (y-H)^2),
    vy)}


#Function for computing the negative log-likelihood
llikss <- function (x, data) {
  #Specify dynamic model having 3 states, namely [y, vy, bc].
  mod <- list(
    #Initial state estimates for altitude (=y), velocity (=vy), and
    #ballistic coefficient (=bc).
    #Note that the specified initial state estimate (at t=0) below for bc
    #deviates from the value that was used for generating the data (i.e., 1e-3).
    #You may change these initial state estimates too and see how they
    #influence the behavior of the unscented Kalman filter.
    m0=c(3e+5, 2e+4, 3e-5),
    #Error covariances of the initial state estimates:
    #this matrix reflects the uncertainty in our initial state estimates.
    #You may change the values in this matrix and see how they influence
    #the behavior of the Kalman filter.
    C0=diag(c(1e+6, 4e+6, 1e-2)),
    #Measurement noise (to be estimated)
    V=matrix(c(exp(x[1]), 0,
               0, exp(x[2])), nrow=2, byrow=TRUE),
    #Process noise
    W=diag(rep(0,3)))
  #Note that all covariances in the process noise matrix (W) are set to zero.
  #This makes sense since the change in y, vy and bc at each time step is fully
  #explained by the model describing the vertically falling object. In other
  #words, we assume that we correctly specified the model for our vertically
  #falling object, and we furthermore assume that there are no random disturbances
  #influencing the three states (y, vy, and bc) at each specified time step.
  
  #Before computing the negative log-likelihood for the current parameter
  #estimates, first check if the resulting covariance matrix
  #for the measurement noise (=V) is positive semidefinite.
  #If the resulting covariance matrix appears not to be positive
  #semidefinite, then assign NaN to the negative log-likelihood value.
  if (all(eigen(mod$V)$values >= 0))
    UKF(data, mod,
        FFfunction=FFfunction, GGfunction=GGfunction,
        simplify=TRUE, logLik=TRUE)$logLik
  else NaN
}


#Minimize the negative log-likelihood.
#Use some arbitrary starting values for the to be estimated measurement
#covariances (here we will use log(1e+2), but it is possible
#to use some other arbitrary starting value).
mod <- optim(rep(log(1e+2), 2), llikss,
             lower=rep(log(1), 2),
             method="L-BFGS-B", hessian=TRUE,
             data=dataEx2)

#Check for convergence
mod$message
mod$convergence #successful optimization yields value 0

#Estimated measurement covariances
exp(mod$par)

#Construct 95% confidence interval for the estimated measurement covariances
seParms <- sqrt(diag(solve(mod$hessian)))
exp(mod$par + qnorm(.05/2)*seParms%o%c(1,-1))

#True values for the measurement covariances
c(sigmaR^2, sigmaVY^2)

#Check by hand if the resulting V-matrix is indeed positive semidefinite
v <- matrix(c(exp(mod$par)[1], 0,
              0, exp(mod$par)[2]), nrow=2, byrow=TRUE)
all(eigen(v)$values >= 0)



#In case the BFGS method generates an error, then try using the
#more robust Nelder-Mead method
mod <- optim(rep(log(1e+2), 2), llikss,
             method="Nelder-Mead", hessian=TRUE,
             data=dataEx2)
#In case this Nelder-Mead generates an error, then try using
#the same Nelder-Mead but with hessian=FALSE

#Check for convergence
mod$convergence #successful optimization yields value 0

#Estimated measurement covariances
exp(mod$par)

#Construct 95% confidence interval for the estimated measurement covariances
seParms <- sqrt(diag(solve(mod$hessian)))
exp(mod$par + qnorm(.05/2)*seParms%o%c(1,-1))