In this practical you will get a taste of fitting (Bayesian) multilevel models for intensive longitudinal data, specifically a multilevel VAR(1) model. It takes a while to run such models, so it is wise to first run all of the code in one R session so that the analyses can proceed, and then walk through this practical more slowly while that session is busy estimating.
The analyses depend on four R packages: foreign, for loading SPSS data files; rjags, for fitting the Bayesian models; lme4, for running frequentist multilevel analyses; and coda, for computing summary statistics and plots of the Bayesian analysis results. In order for these analyses to run on your computer, you will also need to download and install JAGS, because rjags depends on this program.
#install.packages("foreign")
#install.packages("rjags")
#install.packages("coda")
#install.packages("lme4")
library(foreign)
library(rjags)
library(coda)
library(lme4)
We will work with a simulated dataset that consists of two variables, measured for 50 individuals with 50 repeated measures each. We will fit a model with random means, random autoregressive effects, and random cross-lagged effects. Take a look at the structure of the data.
mldata=read.table("multileveldata.txt", header=TRUE) ##load the simulated data
nr=6 ##number of random parameters
tt=50 ##number of time points
np=50 ##number of subjects
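For instance (assuming the column names used later in this practical: ppn for the person ID, Y1 and Y2 for the observed variables, and Y1lagcent and Y2lagcent for the centered lagged predictors used by lme4):
head(mldata) ##first few rows of the data
str(mldata) ##50 persons x 50 time points = 2500 rows in long format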
For the Bayesian model, we need to specify priors for all parameters, and today we aim to specify uninformative priors. For most of the parameters in the model it is not difficult to specify uninformative priors, but for the precision (inverse covariance) matrix of the random means, autoregressive effects, and cross-lagged effects this is trickier.
We will specify a Wishart prior for this matrix, but the Wishart prior easily becomes very informative when variances are close to zero. Because our autoregressive effects are restricted in range (typically between 0 and 1 for psychological data), the variance of these coefficients will be small. The variance of the cross-lagged effects is also typically small. To deal with this, we will specify a data-based prior: we will obtain a pre-estimate of the precision matrix from a simplified model, and use that to specify our prior. Although this means we use the data twice, which results in slightly too narrow credible intervals for this covariance matrix, this approach performs better than most other priors (Schuurman, Grasman, & Hamaker, 2016). Mplus v8 solves this in a different way, by specifying default uninformative improper priors (not an option in JAGS), which works well for multilevel VAR models that are not very extensive (although problems may return with more complicated models, such as multivariate models with measurement error). It is always important to keep an eye on your priors.
To obtain pre-estimates of the variances in our covariance matrix for the random effects, we will use lme4 to fit frequentist multilevel VAR models to the data. This is also what is done to fit multilevel VAR models in the popular R package mlVAR. We will fit the model equation by equation: one model with Y1 as the dependent variable, and one model with Y2 as the dependent variable. Of course, this means we ignore the associations between the residuals of the two equations, and we ignore potential correlations between the random effects of the first and second equation. Furthermore, missing observations are typically listwise deleted in these packages, which can be highly problematic in time series models. Here we have simulated data without missings, so this will not be an issue. We will extract the estimated variances for the random effects, specify a precision matrix based on these for our Wishart prior, and later feed this as data to our JAGS model code.
An alternative to these lme4-based priors is, for instance, to fit a simplified Bayesian model with uniform priors on the variances of the random effects (ignoring the covariances between the random effects) to get pre-estimates of those variances, or to use other (weakly) informative priors. Stan allows for specifying weakly informative priors on decompositions of covariance matrices, which may also help.
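As a rough sketch of that first alternative (not used in the remainder of this practical; the object name simpleREpriors and the upper bound of 10 on the standard deviations are illustrative choices), the following JAGS fragment could replace the multivariate normal distribution and Wishart prior for the random effects in the full model below:
simpleREpriors <- "
for (j in 1:np){
for (k in 1:6){
b[j,k] ~ dnorm(bmu[k], bprec[k]) ##independent random effects; covariances between them are ignored
}
}
for (k in 1:6){
bmu[k] ~ dnorm(0,.000000001) ##vague normal prior for each fixed effect
bsd[k] ~ dunif(0,10) ##uniform prior on each random-effect standard deviation
bprec[k] <- pow(bsd[k],-2) ##dnorm in JAGS is parameterized with a precision
}
"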
out1=lmer(Y1 ~ Y1lagcent + Y2lagcent + (1+Y1lagcent + Y2lagcent|ppn), data=mldata) ###estimate first univariate model in lme4
out2=lmer(Y2 ~ Y1lagcent + Y2lagcent + (1+Y1lagcent + Y2lagcent|ppn), data=mldata) ###estimate second univariate model in lme4
####Calculate ML prior input####
###Read out the estimated variances from the lme4 output and put them into matrix W, for the prior specification of the
###covariance matrix of the random parameters###
###see Schuurman, Grasman, and Hamaker (2016)###
wmat1=VarCorr(out1)
wmat2=VarCorr(out2)
wvars1= c(wmat1$ppn[1],wmat1$ppn[5],wmat1$ppn[9])
wvars2= c(wmat2$ppn[1],wmat2$ppn[5],wmat2$ppn[9])
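The indices 1, 5, and 9 pick out the diagonal elements (the variances) of the 3x3 random-effects covariance matrix stored in wmat1$ppn, in the order intercept, Y1lagcent, Y2lagcent. An equivalent, somewhat more transparent way to extract them:
wvars1= diag(wmat1$ppn) ##variances of the random intercept, Y1lagcent effect, and Y2lagcent effect
wvars2= diag(wmat2$ppn)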
df=6 ###degrees of freedom for the Wishart prior
W=matrix(0,6,6) #W is the scale matrix for the Wishart prior in WinBUGS/JAGS
W[1,1]= df*wvars1[1] ##variance of the random mean of Y1
W[2,2]= df*wvars2[1] ##variance of the random mean of Y2
W[3,3]= df*wvars1[3] ##variance of the random cross-lagged effect of Y2 on Y1
W[4,4]= df*wvars2[2] ##variance of the random cross-lagged effect of Y1 on Y2
W[5,5]= df*wvars1[2] ##variance of the random autoregressive effect of Y1
W[6,6]= df*wvars2[3] ##variance of the random autoregressive effect of Y2
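As an optional sanity check: in JAGS, dwish(W, df) implies a prior expectation of df times the inverse of W for the precision matrix bpre, so with W = df*diag(variances) the prior is centered on the inverses of the lme4 pre-estimates:
df*solve(W) ##prior expectation of the precision matrix bpre
1/diag(df*solve(W)) ##implied prior variances; these should match the lme4 estimates above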
Take a look at the multilevel model code for JAGS below:
MLVAR1model <-"
####WinBUGS/JAGS model specification for a bivariate multilevel autoregressive model with a Wishart prior for the
#covariance matrix of the random parameters (the scale matrix W was calculated in R above).####
model{
###model for time point 2 to tt######
for (j in 1:np)
{
for (t in ((j-1)*tt+2):(j*tt))
{
f[t,1:2]~dmnorm(muf[t,1:2],Qpre[1:2,1:2]) ###the data are multivariate normally distributed
muf[t,1]<-b[j,1] + b[j,3]*z[t-1,2] + b[j,5]*z[t-1,1] ###Y1: mean, cross-lagged effect of Y2, autoregressive effect
muf[t,2]<-b[j,2] + b[j,6]*z[t-1,2] + b[j,4]*z[t-1,1] ###Y2: mean, autoregressive effect, cross-lagged effect of Y1
z[t,1]<- f[t,1] - b[j,1] ###person-centered score for Y1
z[t,2]<- f[t,2] - b[j,2] ###person-centered score for Y2
}
}
####Model specification for time point 1###
for (j in 1:np)
{
f[((j-1)*tt+1),1:2]~dmnorm(muf1[j,1:2],Qpre[1:2,1:2]) ###timepoint 1
z[((j-1)*tt+1),1]<- f[((j-1)*tt+1),1] - b[j,1] ###timepoint 1 z
z[((j-1)*tt+1),2]<- f[((j-1)*tt+1),2] - b[j,2] ###timepoint 1 z
muf1[j,1]<-b[j,1] + b[j,3]*z0[j,2] + b[j,5]*z0[j,1]
muf1[j,2]<-b[j,2] + b[j,6]*z0[j,2] + b[j,4]*z0[j,1]
z0[j,1] ~dnorm(0,.5) ###latent person-centered scores at the (unobserved) time point before the first observation
z0[j,2] ~dnorm(0,.5)
}
####priors###########
###for the 2x2 covariance matrix of the innovations, we can avoid the Wishart using uniform priors on the variances and correlation###
Qpre[1:2,1:2]<- inverse(Qvar[1:2,1:2]) ###Qpre is the precision matrix for the innovations
Qvar[1,1] ~dunif(0,10)
Qvar[2,2] ~dunif(0,10)
Qvar[1,2] <- Qcor*sqrt(Qvar[1,1])*sqrt(Qvar[2,2])
Qcor ~ dunif(-1,1)
Qvar[2,1] <- Qvar[1,2]
###prior for random parameters###
for (j in 1:np)
{
b[j,1:6]~dmnorm(bmu[1:6],bpre[1:6,1:6]) ## the means and regression coefficients are multivariate normally distributed
}
####priors for the fixed effects (normal with a tiny precision, i.e., a huge variance)###
bmu[1]~dnorm(0,.000000001)
bmu[2]~dnorm(0,.000000001)
bmu[3]~dnorm(0,.000000001)
bmu[4]~dnorm(0,.000000001)
bmu[5]~dnorm(0,.000000001)
bmu[6]~dnorm(0,.000000001)
###priors for the precision matrix of the random effects, and calculate the covariance matrix and correlations
bpre[1:6,1:6]~dwish(W[,],df) ###the scale matrix W for the Wishart prior was calculated in R and passed to WinBUGS/JAGS as data.
df<- 6 ###degrees of freedom for the Wishart prior, fixed in the model code
bcov[1:6,1:6] <- inverse(bpre[1:6,1:6])
for (d in 1:6)
{
for (g in 1:6){
bcor[d,g] <- bcov[d,g] / ( sqrt(bcov[d,d]) * sqrt(bcov[g,g]) )
}
}
}
"
MLVAR1model.txt<-textConnection(MLVAR1model)
Next, we will specify initial values and prepare the data so it can be entered into JAGS.
f=mldata[,2:3] ##the observed data for our model
dat=list("np"=np,"f"=f,"tt"=tt,"W"=W) ###list with all the data for JAGS: the number of individuals, the observations, the number of time points, and the scale matrix we prepared for the Wishart prior.
###specify some starting values, for example: ####
stbpre1 = matrix(c(.01, 0, 0, 0, 0, 0, ###starting values for the covariance matrix of the random effects
0,.01,0, 0,0,0,
0,0, .01, 0 ,0,0,
0, 0, 0,0.01, 0, 0,
0, 0, 0, 0, .01, 0,
0, 0, 0, 0, 0, .01),6,6,byrow=TRUE)
stbpre2 = matrix(c(.015, 0, 0, 0, 0, 0,
0,.015,0, 0,0,0,
0,0, .015, 0 ,0,0,
0, 0, 0,0.015, 0, 0,
0, 0, 0, 0, .015, 0,
0, 0, 0, 0, 0, .015),6,6,byrow=TRUE)
stbpre3 = matrix(c(.005, 0, 0, 0, 0, 0,
0,.005,0, 0,0,0,
0,0, .005, 0 ,0,0,
0, 0, 0,0.005, 0, 0,
0, 0, 0, 0, .005, 0,
0, 0, 0, 0, 0, .005),6,6,byrow=TRUE)
phistart = matrix(0,np,nr) ##initial values for the individual-specific coefficients
phistart2 = phistart + .1
phistart3 = phistart + .2
inits1 <- list( bmu=c(0,0,0,0,1,1), b=phistart, bpre=stbpre1) ###for chain1
inits2 <- list( bmu=c(0.1,0.1,0.1,0.1,2,2), b=phistart2, bpre=stbpre2) ###for chain 2
inits3 <- list( bmu=c(0.2,0.2,0.2,0.2,1.5,1.5), b=phistart3, bpre=stbpre3) ###for chain 3
ins <- list(inits1,inits2) ###inits that are provided to JAGS. Currently specified for two chains; list all three inits if running 3 chains
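Optionally, you can make the chains reproducible by adding a random number generator and a seed to each initial-values list; rjags recognizes the special elements .RNG.name and .RNG.seed (the seed values below are arbitrary):
#inits1 <- c(inits1, list(.RNG.name="base::Wichmann-Hill", .RNG.seed=101))
#inits2 <- c(inits2, list(.RNG.name="base::Marsaglia-Multicarry", .RNG.seed=102))
#ins <- list(inits1,inits2)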
Now we will fit the model to the data using JAGS. We specify that the model is estimated based on the data and initial values prepared in the previous step, with 2 chains. We will use a burn-in of 10000 samples/iterations, and the same number of samples for the results (typically we would run more iterations, but we have limited time during the practical). We save samples for the parameters bmu, bcov, Qvar, Qcor, and b. Take a look at the model code to figure out which parameter is which.
jags <- jags.model(MLVAR1model.txt, ###pass the textConnection created above, not a quoted file name
data = dat,
inits=ins,
n.chains = 2) ###compile and initialize jags AR model
update(jags, 10000) ####burnin
samps=coda.samples(jags,
c('bmu','bcov', 'Qvar','Qcor', 'b'),
10000) ### take samples next to burnin
Now that we have fitted the model, let's evaluate the convergence of the model using trace plots (these should look like well-mixing, fat hairy caterpillars), density plots (these should look smooth, and usually approximately normal), and the Gelman-Rubin statistic (potential scale reduction factor), which should be very close to 1 and can also be computed for specific parameters.
plot(samps,trace=FALSE) ##plot densities
#gelman.plot(samps) ###plot Gelman-Rubin statistics
#gelman.diag(samps) ###table of Gelman-Rubin statistics
plot(samps,density=FALSE) ##traceplots
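A further useful check is the effective sample size: values far below the total number of retained samples (2 chains x 10000 iterations here) indicate strongly autocorrelated chains, for example:
effectiveSize(samps)[c("bmu[1]","bmu[5]","Qcor")] ##effective sample sizes for a few selected parameters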
Now let's take a look at the estimated effects. In the summary results you will find the mean estimate for each parameter, the posterior standard deviation, and the quantiles of the posterior distribution. The first and last quantiles in the results (2.5% and 97.5%) can be used as the lower and upper bound of a 95% credible interval for each parameter.
What does the model look like for the average person? Is there a lot of variance in the model parameters across persons? What does the model look like for individual 5? And what about individual 25? Draw a path model for the average person, and both individuals.
summary(samps) ##results parameters estimates
summary(samps)$quantiles["bmu[1]",] ##you can request the quantiles for specific parameters like this. Here we request the quantiles for parameter bmu[1] in the model.
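To answer the questions about individuals 5 and 25, you can extract their six person-specific parameters in the same way (recall from the model code: b[j,1] and b[j,2] are the means, b[j,3] and b[j,4] the cross-lagged effects, and b[j,5] and b[j,6] the autoregressive effects):
summary(samps)$statistics[paste0("b[5,",1:6,"]"),] ##posterior means and SDs for individual 5
summary(samps)$statistics[paste0("b[25,",1:6,"]"),] ##and for individual 25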
This is the end of the multilevel practical. I hope you've been able to get an impression of multilevel modeling of intensive longitudinal data, and of the individual differences in dynamic processes that it can capture.