SOCR - User contributions [en]

SMHS BigDataBigSci SEM

2016-05-24T15:29:36Z

Pineaumi: /* See also */

==[[SMHS_BigDataBigSci| Model-based Analytics]] - Structural Equation Modeling (SEM) ==

SEM allows re-parameterization of random effects to specify latent variables that may affect measures at different time points using structural equations. SEM diagrams show variables having predictive (possibly causal) effects on other variables (denoted by arrows), where coefficients index the strength and direction of the predictive relations. SEM does not offer much more than classical regression methods do, but it does allow simultaneous estimation of multiple equations modeling complementary relations.

SEM is a general multivariate statistical analysis technique that can be used for causal modeling/inference, path analysis, confirmatory factor analysis (CFA), covariance structure modeling, and correlation structure modeling.

===SEM Advantages===
* Allows testing models with multiple dependent variables
* Provides mechanisms for modeling mediating variables
* Enables modeling of error terms
* Facilitates modeling of challenging data (longitudinal with auto-correlated errors, multi-level data, non-normal data, incomplete data)

SEM allows separation of observed and latent variables. Other standard statistical procedures may be viewed as special cases of SEM. Statistical significance is less important in SEM than in other techniques, and covariances are the core of structural equation models.

===Definitions===
*The disturbance, D, is the variance in Y unexplained by a variable X that is assumed to affect Y.
X → Y ← D

* Measurement error, E, is the variance in X unexplained by A, where X is an observed variable that is presumed to measure a latent variable, A.
A → X ← E

* Categorical variables in a model are exogenous (independent) or endogenous (dependent).
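
As a rough numerical illustration of the disturbance and measurement-error definitions above, the following R sketch simulates both path diagrams. All variable names, coefficients and sample sizes are hypothetical choices made for this illustration. The latent variable A is available only because the data are simulated; in real data it is unobserved.

```r
# Hypothetical simulation of the two path diagrams (all values are illustrative)
set.seed(1234)
n <- 5000

# X -> Y <- D : the disturbance D is the part of Y not explained by X
X <- rnorm(n)
D <- rnorm(n, sd = 0.5)
Y <- 0.8 * X + D
fit_xy <- lm(Y ~ X)
disturbance_var <- var(residuals(fit_xy))   # estimate of var(D); simulated value is 0.25

# A -> X2 <- E : the measurement error E is the part of the indicator X2
# that is not explained by the latent variable A
A  <- rnorm(n)
E  <- rnorm(n, sd = 0.6)
X2 <- 1.0 * A + E
error_var   <- var(X2 - A)                  # estimate of var(E); simulated value is 0.36
reliability <- var(A) / var(X2)             # share of indicator variance due to A

round(c(disturbance = disturbance_var, error = error_var, reliability = reliability), 3)
```

The residual variance of the regression recovers the disturbance variance, and the ratio var(A)/var(X2) shows how much of the indicator's variance reflects the latent variable rather than measurement error.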

===Notation===

* In SEM, observed (or manifest) indicators are represented by squares/rectangles, whereas latent variables (or factors) are represented by circles/ovals.

<center>[[Image:SMHS_BigDataBigSci1.png|500px]]</center>

*'''Relations: Direct effects''' (→), '''Reciprocal effects''' (↔ or ⇆), and '''Correlation or covariance''' (⤻ or ⤺) all have different appearance in SEM models.

===Model Components===

The measurement part of an SEM model deals with the latent variables and their indicators. A pure measurement model is a confirmatory factor analysis (CFA) model with unmeasured covariance (bidirectional arrows) between each possible pair of latent variables. There are straight arrows from the latent variables to their respective indicators and straight arrows from the error and disturbance terms to their respective variables, but no direct effects (straight arrows) connecting the latent variables. The measurement model is evaluated using goodness-of-fit measures (Chi-square test, BIC, AIC, etc.). The measurement model is always validated first.

Then we proceed to the structural model, which includes a set of exogenous and endogenous variables together with the direct effects (straight arrows) connecting them, along with the disturbance and error terms for these variables, which reflect the effects of unmeasured variables not in the model.

===Notes===

* Sample-size considerations: mostly the same as for regression; more is always better.
* Model assessment strategies: Chi-square test, Comparative Fit Index, Root Mean Square Error of Approximation (RMSEA), Tucker Lewis Index, Goodness of Fit Index, AIC, and BIC.
* Choice of the number of indicator variables: depends on pilot data analyses and a priori concerns; fewer is better.
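
The measurement-then-structural workflow and the fit indices listed above can be sketched in R as follows. This is a minimal illustration, assuming the '''lavaan''' package is installed; the three-factor model and the <code>HolzingerSwineford1939</code> dataset are taken from lavaan's own documentation and are not necessarily the models used in the hands-on examples on this page. The regression <code>speed ~ visual + textual</code> is a hypothetical structural path added only for illustration.

```r
# A minimal sketch, assuming the lavaan package is installed.
if (requireNamespace("lavaan", quietly = TRUE)) {
  library(lavaan)

  # Measurement model: each latent variable (left of =~) is measured by three indicators
  cfa_model <- '
    visual  =~ x1 + x2 + x3
    textual =~ x4 + x5 + x6
    speed   =~ x7 + x8 + x9
  '
  fit_cfa <- cfa(cfa_model, data = HolzingerSwineford1939)

  # Goodness-of-fit indices for the measurement model
  print(fitMeasures(fit_cfa, c("chisq", "df", "pvalue", "cfi", "tli", "rmsea", "aic", "bic")))

  # Structural model: add direct effects (regressions, ~) between latent variables
  # (this particular path is a hypothetical choice for illustration)
  sem_model <- paste(cfa_model, '
    speed ~ visual + textual
  ')
  fit_sem <- sem(sem_model, data = HolzingerSwineford1939)
  print(fitMeasures(fit_sem, c("chisq", "df", "cfi", "rmsea")))
}
```

The measurement model is fitted and inspected first; only then are direct effects between the latent variables added.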

===[[SMHS_BigDataBigSci_SEM_Ex1|Hands-on Example 1 (School Kids Mental Abilities)]]===

===[[SMHS_BigDataBigSci_SEM_Ex2|Hands-on Example 2 (Parkinson’s Disease data)]]===

==See also==
* [[SMHS_BigDataBigSci| Back to Model-based Analytics]]
* [[SMHS_BigDataBigSci_SEM_sem_vs_cfa| Differences and Similarities between '''sem'''() and '''cfa'''() ]]
* [[SMHS_BigDataBigSci_GCM| Next Section: Growth Curve Modeling]]
* [[SMHS_BigDataBigSci_GCM| Next Section: Generalized Estimating Equation (GEE) Modeling]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php?title=SMHS_BigDataBigSci_SEM}}

SMHS TimeSeriesAnalysis LOS

2016-05-24T14:37:07Z

Pineaumi: /* Footnotes */

==[[SMHS_TimeSeriesAnalysis| SMHS: Time-series Analysis]] - Applications ==

===Time series regression studies in environmental epidemiology (London Ozone Study 2002-2006)===
A time series regression analysis of a London ozone dataset including daily observations from 1 January 2002 to 31 December 2006. Each day has records of (mean) '''ozone''' levels that day, and the total number of '''deaths''' that occurred in the city.

====Questions====
*Is there an association between day-to-day variation in ozone levels and daily risk of death?
*Is ozone exposure associated with death after accounting for other confounders, such as temperature and relative humidity?

'''Reference:''' Bhaskaran K, Gasparrini A, Hajat S, Smeeth L, Armstrong B. Time series regression studies in environmental epidemiology. ''International Journal of Epidemiology''. 2013;42(4):1187-1195. doi:10.1093/ije/dyt092.
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3780998/

Load the Data
library(foreign)
#07_LondonOzonPolutionData_2006_TS.csv
#data <- read.csv("https://umich.instructure.com/files/720873/download?download_frd=1")
data <- read.dta("https://umich.instructure.com/files/721042/download?download_frd=1")

#Set the Default Action for Missing Data to na.exclude
options(na.action="na.exclude")
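
The choice of <code>na.exclude</code> matters later, when residuals and predictions are plotted against the original dates. A small sketch with made-up toy data (not the London dataset) shows the difference from <code>na.omit</code>:

```r
# Toy data with one missing outcome (values are made up for illustration)
toy <- data.frame(x = 1:6, y = c(2.1, 3.9, NA, 8.2, 9.8, 12.1))

fit_omit    <- lm(y ~ x, data = toy, na.action = na.omit)
fit_exclude <- lm(y ~ x, data = toy, na.action = na.exclude)

length(residuals(fit_omit))      # 5: the incomplete row is dropped
length(residuals(fit_exclude))   # 6: padded with NA, so it lines up with the rows of toy
residuals(fit_exclude)
```

With <code>na.exclude</code>, residuals and fitted values keep one entry per row of the input data, so they can be plotted directly against a date column that contains all days.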

Exploratory Analyses

#set the plotting parameters for the plot

oldpar <- par(no.readonly=TRUE)
par(mex=0.8,mfrow=c(2,1))

#sub-plot for daily deaths, with vertical lines defining years

plot(data$\$$date,data$\$$numdeaths,pch=".",main="Daily deaths over time",
ylab="Daily number of deaths",xlab="Date")
abline(v=data$\$$date[grep("-01-01",data$\$$date)],col=grey(0.6),lty=2)

#plot for ozone levels

plot(data$\$$date,data$\$$ozone,pch=".",main="Ozone levels over time",
ylab="Daily mean ozone level(ug/m3)",xlab="Date")
abline(v=data$\$$date[grep("-01-01",data$\$$date)],col=grey(0.6),lty=2)
par(oldpar)
layout(1)

#descriptive statistics

summary(data)

#correlations

cor(data[,2:4])
#scale exposure
data$\$$ozone10 <- data$\$$ozone/10

Modelling Seasonality and Long-Term Trend

#option 1: time-stratified model 
#generate month and year

data$\$$month <- as.factor(months(data$\$$date,abbr=TRUE))
data$\$$year <- as.factor(substr(data$\$$date,1,4))

#fit a Poisson model with a stratum for each month nested in year 
#(use of quasi-Poisson family for scaling the standard errors)

model1 <- glm(numdeaths ~ month/year,data,family=quasipoisson)
summary(model1)

#compute predicted number of deaths from this model
pred1 <- predict(model1,type="response")

#Figure 2a: Three alternative ways of modelling long-term patterns in the data (seasonality and trends)

plot(data$\$$date,data$\$$numdeaths,ylim=c(100,300),pch=19,cex=0.2,col=grey(0.6),
main="Time-stratified model (month strata)",ylab="Daily number of deaths", xlab="Date")
lines(data$\$$date, pred1,lwd=2)

#Option 2: periodic functions model (fourier terms) 
#use function harmonic, in package '''tsModel'''

install.packages("tsModel"); library(tsModel)

#4 sine-cosine pairs representing different harmonics with period 1 year

data$\$$time <- seq(nrow(data))
fourier <- harmonic(data$\$$time,nfreq=4,period=365.25)

#fit a Poisson model Fourier terms + linear term for trend 
#(use of quasi-Poisson family for scaling the standard errors)

model2 <- glm(numdeaths ~ fourier +time,data,family=quasipoisson)
summary(model2)

#compute predicted number of deaths from this model

pred2 <- predict(model2,type="response")

#Figure 2b

plot(data$\$$date, data$\$$numdeaths,ylim=c(100,300),pch=19,cex=0.2,col=grey(0.6),
main="Sine-cosine functions (Fourier terms)",ylab="Daily number of deaths", xlab="Date")
lines(data$\$$date, pred2,lwd=2)
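
To make the idea of Fourier terms concrete, the sine-cosine pairs can also be built by hand in base R. The helper <code>fourier_terms()</code> below is hypothetical and written only for this illustration; it is not the '''tsModel''' implementation, and its column order may differ from that of <code>harmonic()</code>.

```r
# Hand-built Fourier (sine-cosine) terms; fourier_terms() is a hypothetical helper
fourier_terms <- function(time, nfreq, period) {
  terms <- lapply(seq_len(nfreq), function(k) {
    angle <- 2 * pi * k * time / period   # k-th harmonic of the base period
    cbind(sin(angle), cos(angle))
  })
  out <- do.call(cbind, terms)
  colnames(out) <- paste0(c("sin", "cos"), rep(seq_len(nfreq), each = 2))
  out
}

time_idx <- 1:1826                        # five years of daily observations (2002-2006)
ft <- fourier_terms(time_idx, nfreq = 4, period = 365.25)
dim(ft)                                   # 1826 rows, 8 columns (4 sine-cosine pairs)
head(round(ft, 3), 3)
```

Each pair of columns describes one smooth wave that repeats k times per year; a regression on these columns combines the waves into a flexible but strictly periodic seasonal curve.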

#Option 3: Spline Model: Flexible Spline Functions 
#generate spline terms, use function '''bs''' in package '''splines'''
library(splines) 
#A CUBIC B-SPLINE WITH 32 EQUALLY-SPACED KNOTS + 2 BOUNDARY KNOTS 
#Note: the 35 basis variables are set as df, with default knots placement. see '''?bs''' 
#other types of splines can be produced with the function ns. see '''?ns'''
spl <- bs(data$\$$time,degree=3,df=35) 
#fit a Poisson model with the spline terms (quasi-Poisson family for scaling the standard errors)

model3 <- glm(numdeaths ~ spl,data,family=quasipoisson)
summary(model3)

#compute predicted number of deaths from this model

pred3 <- predict(model3,type="response")

#FIGURE 2C

plot(data$\$$date,data$\$$numdeaths,ylim=c(100,300),pch=19,cex=0.2,col=grey(0.6),
main="Flexible cubic spline model",ylab="Daily number of deaths", xlab="Date")
lines(data$\$$date,pred3,lwd=2)
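
The relation between <code>df</code>, the spline degree and the number of knots can be checked directly. The sketch below uses a plain daily index of length 1826 (the number of days in 2002-2006) as a stand-in for the time variable; the object names are chosen for this illustration only.

```r
library(splines)

time_idx <- 1:1826                        # daily index for 2002-2006
spl_demo <- bs(time_idx, degree = 3, df = 35)

ncol(spl_demo)                            # 35 basis columns
interior <- attr(spl_demo, "knots")
length(interior)                          # df - degree = 32 interior knots
range(diff(interior))                     # constant spacing: knots sit at quantiles of time
attr(spl_demo, "Boundary.knots")          # 1 and 1826
```

Because the default knots are placed at quantiles of the time index, they are equally spaced here (roughly 55 days apart), which corresponds to about 7 degrees of freedom per year.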

Plot Response Residuals Over Time From Model 3

#GENERATE RESIDUALS
res3 <- residuals(model3,type="response")
#Figure 3: Residual variation in daily deaths after ‘removing’ (i.e. modelling) season and long-term trend.
plot(data$\$$date,res3,ylim=c(-50,150),pch=19,cex=0.4,col=grey(0.6),
main="Residuals over time",ylab="Residuals (observed-fitted)",xlab="Date")
abline(h=0,lty=2,lwd=2)

Estimate ozone-mortality association - controlling for confounders

#compare the RR (and CI using '''ci.lin''' in package '''Epi''')

install.packages("Epi"); library(Epi)

#unadjusted model

model4 <- glm(numdeaths ~ ozone10,data,family=quasipoisson)
summary(model4)
(eff4 <- ci.lin(model4,subset="ozone10",Exp=T))

#control for seasonality (with spline as in model 3)

model5 <- update(model4, .~. + spl)
summary(model5)
(eff5 <- ci.lin(model5,subset="ozone10",Exp=T))

#control for temperature - temperature modelled with categorical variables for deciles

cutoffs <- quantile(data$\$$temperature,probs=0:10/10)
tempdecile <- cut(data$\$$temperature,breaks=cutoffs,include.lowest=TRUE)
model6 <- update(model5,.~.+tempdecile)
summary(model6)
(eff6 <- ci.lin(model6,subset="ozone10",Exp=T))

Build a summary table with effect as percent increase

tabeff <- rbind(eff4,eff5,eff6)[,5:7]
tabeff <- (tabeff-1)*100
dimnames(tabeff) <- list(c("Unadjusted","Plus season/trend","Plus temperature"), c("RR","ci.low","ci.hi"))
round(tabeff,2)
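
The arithmetic behind the table (rate ratio, confidence interval, percent increase) can be reproduced in base R. The sketch below uses simulated counts with a hypothetical true effect; it computes a Wald interval of the general form exp(beta ± 1.96·SE), and is meant as an illustration of the calculation rather than a re-implementation of <code>ci.lin</code>.

```r
# Simulated daily counts; the data and the true effect size are hypothetical
set.seed(2002)
n_days  <- 1500
ozone10 <- runif(n_days, min = 0, max = 8)             # exposure in units of 10 ug/m3
deaths  <- rpois(n_days, lambda = exp(5 + 0.01 * ozone10))

fit <- glm(deaths ~ ozone10, family = quasipoisson)
est <- coef(summary(fit))["ozone10", ]

# Rate ratio and Wald 95% CI per 10 ug/m3 increase: exp(beta +/- 1.96 * SE)
rr <- exp(est[["Estimate"]] +
          c(RR = 0, ci.low = -1, ci.hi = 1) * qnorm(0.975) * est[["Std. Error"]])

# Percent increase in risk = (RR - 1) * 100
pct <- (rr - 1) * 100
round(rbind(ratio = rr, percent = pct), 3)
```

A rate ratio of 1.01, for example, corresponds to a 1% increase in the daily death count per 10 ug/m3 increase in ozone.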

#explore the lagged (delayed) effects

#SINGLE-LAG MODELS

#prepare the table with estimates

tablag <- matrix(NA,7+1,3,dimnames=list(paste("Lag",0:7), c("RR","ci.low","ci.hi")))

#iterate

for(i in 0:7) {
#lag ozone and temperature variables
ozone10lag <- Lag(data$\$$ozone10,i)
tempdecilelag <- cut(Lag(data$\$$temperature,i),breaks=cutoffs, include.lowest=TRUE)

#define the transformation for temperature

#lag same as above, but with strata terms instead of linear

mod <- glm(numdeaths ~ ozone10lag + tempdecilelag + spl,data, family=quasipoisson)
tablag[i+1,] <- ci.lin(mod,subset="ozone10lag",Exp=T)[5:7]
}
tablag

#Figure 4A: Modelling lagged (delayed) associations between ozone exposure and survival/death outcome.

plot(0:7,0:7,type="n",ylim=c(0.99,1.03),main="Lag terms modelled one at a time", xlab="Lag (days)",
ylab="RR and 95%CI per 10ug/m3 ozone increase")
abline(h=1)
arrows(0:7,tablag[,2],0:7,tablag[,3],length=0.05,angle=90,code=3)
points(0:7,tablag[,1],pch=19)
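
What lagging a series means can be shown with a few made-up values. The helper <code>lag_vec()</code> below is a hypothetical base-R function written for this illustration; the analysis above uses <code>Lag()</code> from the '''tsModel''' package.

```r
# lag_vec() is a hypothetical helper: shift x forward by k days, padding with NA
lag_vec <- function(x, k) {
  stopifnot(k >= 0, k < length(x))
  c(rep(NA, k), head(x, length(x) - k))
}

ozone_toy <- c(31, 28, 40, 52, 47, 35)    # made-up daily exposure values
cbind(lag0 = lag_vec(ozone_toy, 0),
      lag1 = lag_vec(ozone_toy, 1),
      lag2 = lag_vec(ozone_toy, 2))
```

In row t, the column lag1 holds the exposure from day t-1, so regressing the deaths on day t against it estimates the association with the previous day's ozone level. The first k rows are NA and drop out of the fit.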

Model Checking

#generate deviance residuals from model6 (ozone, adjusted for season/trend and temperature)

res6 <- residuals(model6,type="deviance")

#Figure A1: Plot of deviance residuals over time (London data)

plot(data$\$$date,res6,ylim=c(-5,10),pch=19,cex=0.7,col=grey(0.6),
main="Residuals over time",ylab="Deviance residuals",xlab="Date")
abline(h=0,lty=2,lwd=2)

#Figure A2a: Partial autocorrelation of the deviance residuals from model6.
#Note: the spike in the plot of residuals over time relates to the 2003 European heat wave,
#and indicates that the current model does not explain the data well over this period.

pacf(res6,na.action=na.omit,main="From original model")

#Include the 1-Day Lagged Residual in the Model

model9 <- update(model6,.~.+Lag(res6,1))

#Figure A2b: Partial autocorrelation of the deviance residuals from model9
#(model6 with the 1-day lagged residual added as a predictor)

pacf(residuals(model9,type="deviance"),na.action=na.omit,
main="From model adjusted for residual autocorrelation")
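
The logic of this check can be seen on simulated residuals. The sketch below generates a series with first-order autocorrelation (the AR coefficient 0.6 is a hypothetical value), inspects the lag-1 partial autocorrelation, and then repeats the check after adjusting for the 1-day lagged value.

```r
# Simulated residuals with first-order autocorrelation (hypothetical AR coefficient 0.6)
set.seed(2003)
res_sim <- as.numeric(arima.sim(model = list(ar = 0.6), n = 2000))

pacf_before <- pacf(res_sim, plot = FALSE)[["acf"]]

# Regress the series on its own 1-day lag and re-check what is left
res_lag1 <- c(NA, head(res_sim, -1))
fit_ar   <- lm(res_sim ~ res_lag1, na.action = na.omit)   # drop the first (NA) row
pacf_after <- pacf(residuals(fit_ar), plot = FALSE)[["acf"]]

round(c(before_lag1 = pacf_before[1], after_lag1 = pacf_after[1]), 2)
```

A lag-1 partial autocorrelation near zero after the adjustment suggests that the added lagged term has absorbed most of the short-term dependence.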

====Irish Longitudinal Study on Ageing Example====

The Irish Longitudinal Study on Ageing (TILDA), 2009-2011 
http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/34315 

Bibliographic Citation: Kenny, Rose Anne. The Irish Longitudinal Study on Ageing (TILDA), 2009-2011. ICPSR34315-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2014-07-16. 
http://doi.org/10.3886/ICPSR34315.v1

The Irish Longitudinal Study on Ageing (TILDA) is a major inter-institutional initiative led by Trinity College, Dublin, to improve the quantity and quality of data, research and information related to aging in Ireland. Eligible respondents for this study include individuals aged ≥ 50 and their spouses or partners of any age. Participants (N=8,504 people) in Ireland are interviewed every two years, providing detailed information on all aspects of their lives, including the economic (pensions, employment, living standards), health (physical, mental, service needs and usage) and social aspects (contact with friends and kin, formal and informal care, social participation). Survey interviews, physical, and biological data are collected along with demographic variables (e.g., age, sex, marital status, household composition, education, and employment), covering topics such as activities of daily living (ADL), aging, childhood, depression (psychology), education, employment, exercise, eyesight, families, family life, etc.

# download the RDA data object (ICPSR_34315.zip)
# load in the data into RStudio
dataURL <- "https://umich.instructure.com/files/703606/download?download_frd=1"
load(url(dataURL))
head(da34315.0001); data_colnames <- colnames(da34315.0001)
vars <- da34315.0001

vars; head(vars); summary(vars); data_colnames

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
|[1]||"ID"||"HOUSEHOLD"
|-
|[3]|| "CLUSTER"||"STRATUM"
|-
|[5]||"REGION"|| "CAPIWEIGHT"
|-
|[7]|| "IN_SCQ"||"SCQ_WEIGHT"
|-
|[9]|| "AGE"||"SEX"
|-
|[11]|| "NML"||"CM003"
|-
|...||||
|-
|[1673]||"HA_WEIGHT"||"IN_HA"
|-
|[1675]|| "SR_HEIGHT_CENTIMETRES"||"HEIGHT"
|-
|[1677]|| "SR_WEIGHT_KILOGRAMMES"||"WEIGHT"
|-
|[1679]||"COGMMSE"||"FRGRIPSTRENGTHD"
|-
|[1681]||"FRGRIPSTRENGTHND"||"VISUALACUITYLEFT"
|-
|[1683]||"VISUALACUITYRIGHT" ||"BPSEATEDSYSTOLIC1"
|-
|[1685]||"BPSEATEDSYSTOLIC2"||"BPSEATEDDIASTOLIC1"
|-
|[1687]||"BPSEATEDDIASTOLIC2"||"BPSEATEDSYSTOLICMEAN"
|-
|[1689]||"BPSEATEDDIASTOLICMEAN"||"BPHYPERTENSION"
|-
|[1691]||"FRBMI"||"FRWAIST"
|-
|[1693]||"FRHIP"||"FRWHR"
|-
|[1695]||"WEARGLASSES"||"WOREGLASSESDURINGTEST"
|-
|[1697]||"BLOODS_CHOL"||"BLOODS_HDL"
|-
|[1699]||"BLOODS_LDL"||"BLOODS_TRIG"
|-
|[1701]||"BLOODS_TIMEBETWEENLASTMEALANDASS"||"DELAY_HA"
|-
|[1703]||"PICMEMSCORE"||"PICRECALLSCORE"
|-
|[1705]||"PICRECOGSCORE"||"VISREASONING"
|-
|[1707]||"GRIPTEST1D"||"GRIPTEST2D"
|-
|[1709]||"GRIPTEST1ND"||"GRIPTEST2ND"
|-
|[1711]||"GRIPTESTDOMINANT"||"GRIPTESTSITTING"
|-
|[1713]||"TEMPERATURE"||"SCQSOCACT1"
|-
|...||||
|-
|[1981]||"SOCPROXCHLD4"||"SCRFLU"
|-
|[1983]||"SCRCHOL"||"SCRPROSTATE"
|-
|[1985]||"SCRBREASTLUMPS"||"SCRMAMMOGRAM"
|-
|[1987]||"BEHALC_FREQ_WEEK"||"BEHALC_DRINKSPERDAY"
|-
|[1989]||"BEHALC_DRINKSPERWEEK"||"BEHALC_DOH_LIMIT"
|-
|[1991]||"BEHSMOKER"||"BEHCAGE"
|}
</center>

# extract some data elements
df1 <- data.frame(vars)

df_Irish_small <- df1[, c("ID", "HOUSEHOLD", "AGE", "SEX" , "HA_WEIGHT", "HEIGHT" ,
"WEIGHT", "COGMMSE", "FRGRIPSTRENGTHD", "VISUALACUITYLEFT",
"VISUALACUITYRIGHT", "BPSEATEDSYSTOLIC1",
"BPSEATEDSYSTOLIC2", "BPSEATEDDIASTOLIC1",
"BPSEATEDDIASTOLIC2", "BPSEATEDSYSTOLICMEAN",
"BPSEATEDDIASTOLICMEAN", "BPHYPERTENSION",
"WEARGLASSES", "WOREGLASSESDURINGTEST",
"BLOODS_CHOL", "BLOODS_HDL",
"BLOODS_LDL", "BLOODS_TRIG",
"PICMEMSCORE", "PICRECALLSCORE",
"PICRECOGSCORE", "VISREASONING",
"TEMPERATURE", "SOCPROXCHLD4", "SCRFLU", "SCRCHOL", "SCRPROSTATE",
"SCRBREASTLUMPS", "SCRMAMMOGRAM",
"BEHALC_FREQ_WEEK", "BEHALC_DRINKSPERDAY",
"BEHALC_DRINKSPERWEEK", "BEHALC_DOH_LIMIT",
"BEHSMOKER", "BEHCAGE" )
]

summary(df_Irish_small); head(df_Irish_small)
write.table(df_Irish_small , "data.csv", sep=",")

===Applications===

====Frailty associations with sustained attention measures<sup>5</sup>====

Multinomial logistic regression analyses, with frailty as the outcome variable, were performed to determine associations between the sustained attention measures and prefrailty or frailty. Binary logistic regression analyses determined significant associations between the sustained attention measures and the individual frailty components. The regression models included age and gender and were also extended to include additional measures of cognitive processing speed (cognitive RT from CRT), executive function (Delta CTT), number of chronic conditions, and number of medications. The quadratic term age<sup>2</sup> was also included in each regression model to allow for any potential nonlinear effects of age on frailty. For the independent variables in the multinomial logistic regression models, relative risk (RR) ratios with 95% confidence intervals (CIs) were provided. For the independent variables in the binary logistic regression models, odds ratios (OR) with 95% CIs were provided.
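
The binary logistic regression step described above can be sketched in R on simulated data. Everything in this block is hypothetical (variable names, sample size, coefficients); it does not use the TILDA data and does not reproduce the published analysis. It only shows how a quadratic age term enters the model and how odds ratios with Wald 95% CIs are obtained.

```r
# Simulated data only: variable names, sample size and coefficients are hypothetical
set.seed(34315)
n      <- 4000
age    <- round(runif(n, 50, 90))
female <- rbinom(n, 1, 0.5)
age_c  <- (age - 70) / 10                  # centred age in decades, for numerical stability

# Simulated truth: the log-odds of the outcome rise non-linearly with age
eta  <- -2 + 0.8 * age_c + 0.3 * age_c^2 + 0.2 * female
weak <- rbinom(n, 1, plogis(eta))          # a binary frailty component, e.g. weakness

fit_bin <- glm(weak ~ age_c + I(age_c^2) + female, family = binomial)

# Odds ratios with Wald 95% confidence intervals
or_table <- exp(cbind(OR = coef(fit_bin), confint.default(fit_bin)))
round(or_table, 2)
```

A multinomial analogue for a three-level outcome (non-frail, prefrail, frail) could be fitted in a similar way, for example with <code>multinom()</code> from the '''nnet''' package, where exponentiated coefficients are interpreted as relative risk ratios.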

====Multivariable logistic regression examining the association between social relationships and depression, anxiety, and suicidal ideation<sup>6</sup>====

===Footnotes===

* 5 http://psychsocgerontology.oxfordjournals.org/content/early/2013/03/13/geronb.gbt009.full
* 6 http://www.jad-journal.com/article/S0165-0327%2815%2900145-7/fulltext

===Appendix===

==See also==
* [[SMHS_TimeSeriesAnalysis| Previous Section on Time-series analysis]]

<hr>
* SOCR Home page: http://www.socr.ucla.edu

{{translate|pageName=http://wiki.stat.ucla.edu/socr/index.php?title=SMHS_TimeSeriesAnalysis_LOS}}

SMHS TimeSeriesAnalysis

2016-05-24T14:29:03Z

Pineaumi: /* Footnotes */

==[[SMHS| Scientific Methods for Health Sciences]] - Time Series Analysis ==

===Questions===
* Why are trends, patterns or predictions from models/data important?
* How to detect, model and utilize trends in longitudinal data?

Time series analysis is a class of statistical methods for serially observed data that aim to extract meaningful information, trends, and a characterization of the underlying process from observed longitudinal data. These trends may be used for time series forecasting, i.e., prediction of future values based on retrospective observations. Note that classical linear modeling (e.g., regression analysis) may also be employed for prediction and testing of associations, using the values of one or more independent variables and their effect on the value of another variable. However, time series analysis also allows dependencies between observations (e.g., seasonal effects) to be accounted for.

===Time-series representation===

There are 3 (distinct and complementary) types of time series patterns that most time-series analyses are trying to identify, model and analyze. These include:

* Trend: A trend is a long-term increase or decrease in the data that may be linear or non-linear, but is generally continuous (mostly monotonic). The trend may be referred to as direction.
* Seasonal: A seasonal pattern is an influence on the data by seasonal factors (e.g., the quarter of the year, the month, or the day of the week), which always has a fixed, known period.
* Cyclic: A cyclic pattern of fluctuations corresponds to rises and falls that are not of fixed period.

<center>[[Image:SMHS_TimeSeries1.png|300px]]</center>

For example, the following code shows several time series with different types of time series patterns.

par(mfrow=c(3,2))

n <- 98
X <- cbind(1:n) # time points (annually)
Trend1 <- LakeHuron+0.2*X # series 1
Trend2 <- LakeHuron-0.5*X # series 2

Season1 <- X; Season2 <- X; # series 1 & 2
for(i in 1:n) {
Season1[i] <- LakeHuron[i] + 5*(i%%4)
Season2[i] <- LakeHuron[i] -2*(i%%10)
}

Cyclic1 <- X; Cyclic2 <- X; # series 1 & 2
for(i in 1:n) {
rand1 <- as.integer(runif(1, 1, 10))
Cyclic1[i] <- LakeHuron[i] + 3*(i%%rand1)
Cyclic2[i] <- LakeHuron[i] - 1*(i%%rand1)
}

plot(X, Trend1, xlab="Year",ylab=" Trend1", main="Trend1 (LakeHuron+0.2*X)")
plot(X, Trend2, xlab="Year",ylab=" Trend2" , main="Trend2 (LakeHuron-0.5*X)")
plot(X, Season1, xlab="Year",ylab=" Season1", main="Season1 (LakeHuron+5*(i%%4))")
plot(X, Season2, xlab="Year",ylab=" Season2", main="Season2 (LakeHuron-2*(i%%10))")
plot(X, Cyclic1, xlab="Year",ylab=" Cyclic1", main="Cyclic1 (LakeHuron+3*(i%%rand1))")
plot(X, Cyclic2, xlab="Year",ylab=" Cyclic2", main="Cyclic2 (LakeHuron-(i%%rand1))")

Note: If you get this run-time graphics error:
“Error in plot.new() : figure margins too large” 
You need to make sure your graphics window is large enough or print to PDF:

pdf("myplot.pdf"); plot(x); dev.off()

<center>[[Image:SMHS_TimeSeries2.png|300px]]</center>

Let’s look at the delta (Δ) changes - Lagged Differences, using diff, which returns suitably lagged and iterated differences.

## Default lag = 1
par(mfrow=c(1,1))
hist(diff(Trend1), prob=T, col="red") # Plot histogram
lines(density(diff(Trend1)),lwd=2) # plot density estimate
x<-seq(-4,4,length=100); y<-dnorm(x, mean(diff(Trend1)), sd(diff(Trend1)))
lines(x,y,lwd=2,col="blue") # plot MLE Normal Fit

===Time series decomposition===

Denote by $y_t$ a time series with three components: a seasonal effect, a trend-cycle effect (containing both trend and cycle), and a remainder component (containing the residual variability in the time series).

Additive model:
$y_t=S_t+T_t+E_t,$ where $y_t$ is the data at period $t$, $S_t$ is the seasonal component at period $t$, $T_t$ is the trend-cycle component at period $t$, and $E_t$ is the remainder (error) component at period $t$. This additive model is appropriate if the magnitude of the seasonal fluctuations or the variation around the trend-cycle does not vary with the level of the time series.

Multiplicative model: $y_t=S_t\times T_t\times E_t$. When the variation in the seasonal pattern, or the variation around the trend-cycle, is proportional to the level of the time series, a multiplicative model is more appropriate. Note that when using a multiplicative model, we can transform the data to stabilize the variation in the series over time, and then use an additive model. For instance, a log transformation converts the multiplicative model:

$y_t=S_t\times T_t\times E_t$ 
to the additive model: 
$\log(y_t)=\log(S_t)+\log(T_t)+\log(E_t).$
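
As a quick numerical check of this identity, the following sketch builds a multiplicative series from made-up components and confirms that its logarithm equals the sum of the logged components. The names trend_c, season_c and remainder_c and all numeric values are illustrative assumptions, not part of the data used in this section.

```r
# Hypothetical components of a monthly series (illustration only)
set.seed(1)
t_idx       <- 1:48
trend_c     <- 10 + 0.5 * t_idx                     # increasing trend-cycle, T_t
season_c    <- 1 + 0.3 * sin(2 * pi * t_idx / 12)   # period-12 seasonal factor, S_t
remainder_c <- exp(rnorm(48, sd = 0.05))            # positive multiplicative noise, E_t

y_mult <- season_c * trend_c * remainder_c          # multiplicative model

# After the log transform the three components combine additively
log_sum <- log(season_c) + log(trend_c) + log(remainder_c)
isTRUE(all.equal(log(y_mult), log_sum))
```

The transform requires strictly positive values, which is why the remainder here is simulated on the exponential scale.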

We can examine the seasonal trends by decomposing the time series into seasonal, trend, and irregular components using loess (local polynomial regression fitting) via the stl function in the default “stats” package:

# using Monthly Males Deaths from Lung Diseases in UK from bronchitis, emphysema and asthma, 1974–1979
mdeaths # is.ts(mdeaths)
fit <- stl(mdeaths, s.window=5)
plot(mdeaths, col="gray", main=" Lung Diseases in UK ", ylab=" Lung Diseases Deaths", xlab="")
lines(fit$\$$time.series[,2],col="red") # overlay the trend component (column 2)
plot(fit) # data, seasonal, trend, residuals

<center>“stl” function parameters</center>

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
|x||Univariate time series to be decomposed. This should be an object of class "ts" with a frequency greater than one.
|-
|s.window||either the character string "periodic" or the span (in lags) of the loess window for seasonal extraction, which should be odd and at least 7, according to Cleveland et al. This has no default.
|-
|s.degree||degree of locally-fitted polynomial in seasonal extraction. Should be zero or one.
|-
|t.window||the span (in lags) of the loess window for trend extraction, which should be odd. If NULL, the default, nextodd(ceiling((1.5*period) / (1-(1.5/s.window)))), is taken.
|-
|t.degree||degree of locally-fitted polynomial in trend extraction. Should be zero or one.
|-
|l.window||the span (in lags) of the loess window of the low-pass filter used for each subseries. Defaults to the smallest odd integer greater than or equal to frequency(x) which is recommended since it prevents competition between the trend and seasonal components. If not an odd integer its given value is increased to the next odd one.
|-
|l.degree||degree of locally-fitted polynomial for the subseries low-pass filter. Must be 0 or 1.
|-
|s.jump, t.jump, l.jump||integers at least one to increase speed of the respective smoother. Linear interpolation happens between every *.jumpth value.
|-
|robust||logical indicating if robust fitting be used in the loess procedure.
|-
|inner||integer; the number of ‘inner’ (backfitting) iterations; usually very few (2) iterations suffice.
|-
|outer||integer; the number of ‘outer’ robustness iterations.
|-
|na.action||action on missing values.
|}
</center>

<center>[[Image:SMHS_TimeSeries3.png|400px]] [[Image:SMHS_TimeSeries4.png|400px]]</center>

monthplot(fit$\$$time.series[,"seasonal"], main="", ylab="Seasonal", lwd=5)
#As the “fit <- stl(mdeaths, s.window=5)” object has 3 time-series components (seasonal; trend; remainder)
#we can alternatively plot them separately:
#monthplot(fit, choice = "seasonal", cex.axis = 0.8)
#monthplot(fit, choice = "trend", cex.axis = 0.8)
#monthplot(fit, choice = "remainder", type = "h", cex.axis = 1.2) # histogramatic

<center>[[Image:SMHS_TimeSeries5.png|400px]]</center>

These are the seasonal plots and seasonal sub-series plots of the seasonal component illustrating the variation in the seasonal component over time (over the years).

Using historical weather (average daily temperature at the University of Michigan, Ann Arbor):
[http://weather-warehouse.com/WeatherHistory/PastWeatherData_AnnArborUnivOfMi_AnnArbor_MI_January.html]
(See meta-data description and provenance online: [http://weather-warehouse.com/WxWfaqs.html]).

<center>Mean Temperature, (F), UMich, Ann Arbor (1900-2015)</center>
<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
!Year||Jan||Feb||Mar||Apr||May||Jun||Jul||Aug||Sep||Oct||Nov||Dec
|-
|2015||26.3||14.4||34.9||49||64.2||68||71.2||70.2||68.7||53.9||NR||NR
|-
|2014||24.4||19.4||29||48.9||60.7||69.7||68.8||70.8||63.2||52.1||35.4||33.3
|-
|2013||22.7||26.1||33.3||46||63.1||68.5||72.9||70.2||64.6||53.2||37.6||26.7
|-
|2012||22.4||32.8||50.7||49.2||65.2||71.4||78.9||72.2||63.9||51.7||39.6||34.8
|-
|...|| || || || || || || || || || || ||
|-
|...||17||15.3||31.4||47.3||57||69||76.6||72||63.4||52.2||35.2||23.7
|-
|1900||21.4||19.2||24.7||47.8||60.2||66.3||72||75.4||67.2||59||37.6||29.2
|}
</center>

# data: 07_UMich_AnnArbor_MI_TempPrecipitation_HistData_1900_2015.csv
# more complete data is available here: 07_UMich_AnnArbor_MI_TempPrecipitation_HistData_1900_2015.xls
umich_data <- read.csv("https://umich.instructure.com/files/702739/download?download_frd=1", header=TRUE)

head(umich_data)

# https://cran.r-project.org/web/packages/mgcv/mgcv.pdf
# install.packages("mgcv"); require(mgcv)

# install.packages("gamair"); require(gamair)
par(mfrow=c(1,1))

The data are in wide format – convert to long format for plotting

library("reshape2") # provides melt(), used below
long_data <- melt(umich_data, id.vars = c("Year"), value.name = "temperature")
l.sort <- long_data[order(long_data$\$$Year),]
head(l.sort); tail(l.sort)

plot(l.sort$\$$temperature, data = l.sort, type = "l")

Fit the GAMM Model (Generalized Additive Mixed Model)

<center>[[Image:SMHS_TimeSeries6.png|400px]]</center>

Fit a model with trend and seasonal components --- computation may be slow:

# define the parameters controlling the process of model-fitting/parameter-estimation
ctrl <- list(niterEM = 0, msVerbose = TRUE, optimMethod="L-BFGS-B")

# First try this model
mod <- gamm(as.numeric(temperature) ~ s(as.numeric(Year)) + s(as.numeric(variable)), data = l.sort, method = "REML", correlation=corAR1(form = ~ 1|Year), knots=list(Variable = c(1, 12)), na.action=na.omit, control = ctrl)

#Correlation: corStruct object defining correlation structures in lme. Grouping factors in the formula for this 
#object are assumed to be nested within any random effect grouping factors, without the need to make this 
#explicit in the formula (somewhat different from the behavior of lme). 
#This is similar to the GEE approach to correlation in the generalized case. 
#Knots: an optional list of user specified knot values to be used for basis construction -- 
#different terms can use different numbers of knots, unless they share a covariate. 
#If you revise the model like this (below), it will compare nicely with 3 ARMA models (later) 
mod <- gamm(as.numeric(temperature) ~ s(as.numeric(Year), k=116) + s(as.numeric(variable), k=12),
data = l.sort, correlation = corAR1(form = ~ 1|Year), control = ctrl)

Summary of the fitted model:

summary(mod$\$$gam)

Visualize the model trend (year) and seasonal terms (months)

plot(mod$\$$gam, pages = 1)
t <- cbind(1: 1392) # define the time

<center>[[Image:SMHS_TimeSeries7.png|500px]]</center>

Plot the trend on the observed data -- with prediction:

pred2 <- predict(mod$\$$gam, newdata = l.sort, type = "terms")
ptemp2 <- attr(pred2, "constant") + pred2[,1]

# pred2[,1] = trend; pred2[,2] = seasonal effects
# mod$\$$gam is a GAM object containing information to use predict, summary and print methods, but not to use e.g. the anova method function to compare models
plot(temperature ~ t, data = l.sort, type = "l", xlab = "year", ylab = expression(Temperature ~ (degree*F)))
lines(ptemp2 ~ t, data = l.sort, col = "blue", lwd = 2)

<center>[[Image:SMHS_TimeSeries8.png|500px]]</center>

Plot the seasonal model

pred <- predict(mod$\$$gam, newdata = l.sort, type = "terms")
ptemp <- attr(pred, "constant") + pred[,2]

plot(l.sort$\$$temperature ~ t, data = l.sort, type = "l", xlab = "year", ylab = expression(Temperature ~ (degree*F)))
lines(ptemp, data = l.sort, col = "red", lwd = 0.5)

<center>[[Image:SMHS_TimeSeries9.png|500px]]</center>

Zoom in on the first 120 monthly observations (xlim = c(0, 120), i.e., the first 10 years)

plot(l.sort$\$$temperature ~ t, data = l.sort, type = "l", xlim=c(0, 120), xlab = "year", ylab = expression(Temperature ~ (degree*F))); lines(ptemp, data = l.sort, col = "red", lwd = 0.5)

<center>[[Image:SMHS_TimeSeries10.png|500px]]</center>

To examine how much the estimated trend has changed over the 116 year period, we can use the data contained in pred to compute the difference between the start (Jan 1900) and the end (Dec 2015) of the series in the trend component only:

tail(pred[,1], 1) - head(pred[,1], 1) # subtract the predicted temp [,1] in 1900 (head) from the temp in 2015 (tail)

# names(attributes(pred)); str(pred) # to see the components of the GAM prediction model object (pred)

Assess autocorrelation in residuals

# head(umich_data); tail(umich_data)
acf(resid(mod$\$$lme), lag.max = 36, main = "ACF")
# acf = Auto-correlation and Cross-Covariance Function computes and plots the estimates of the autocovariance or autocorrelation function.
# pacf is the function used for the partial autocorrelations.
# ccf computes the cross-correlation or cross-covariance of two univariate series.
pacf(resid(mod$\$$lme), lag.max = 36, main = "pACF")

Looking at the residuals of this model, using the (partial) autocorrelation function, we see that there may be some residual autocorrelation in the data that the trend term didn’t account for. The shapes of the ACF and the pACF suggest an AR(p) model might be appropriate.

Fit and compare 4 alternative autoregressive models (original mod, AR1, AR2 and AR3)

## AR(1)
m1 <- gamm(as.numeric(temperature) ~ s(as.numeric(Year), k=116) + s(as.numeric(variable), k=12),
data = l.sort, correlation = corARMA(form = ~ 1|Year, p = 1), control = ctrl)

## AR(2)
m2 <- gamm(as.numeric(temperature) ~ s(as.numeric(Year), k=116) + s(as.numeric(variable), k=12),
data = l.sort, correlation = corARMA(form = ~ 1|Year, p = 2), control = ctrl)

## AR(3)
m3 <- gamm(as.numeric(temperature) ~ s(as.numeric(Year), k=116) + s(as.numeric(variable), k=12),
data = l.sort, correlation = corARMA(form = ~ 1|Year, p = 3), control = ctrl)

Note that the correlation argument is specified by corARMA(form = ~ 1|Year, p = x), which fits an ARMA (auto-regressive moving average) process to the residuals, where p indicates the order for the AR part of the ARMA model, and form = ~ 1|Year specifies that the ARMA is nested within each year. This may expedite the model fitting but may also hide potential residual variation from one year to another.

Let’s compare the candidate models by using the generalized likelihood ratio test via the anova() method for lme objects; see our previous mixed effects modeling notes<sup>1, 2</sup>. This model selection is justified as we work with nested models -- going from the AR(3) to the AR(1) by setting some of the AR coefficients to 0. The models also vary in terms of the coefficient estimates for the spline terms, which may require fixing some values while choosing the AR structure.

<center>anova(mod$\$$lme, m1$\$$lme, m2$\$$lme, m3$\$$lme)</center>
<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Model||df||AIC||BIC||logLik||Test||L.Ratio||p-value
|-
|mod$\$$lme||1||7||7455.609||7492.228|| -3720.805|| || ||
|-
|m1$\$$lme||2|| 7||7455.609||7492.228|| -3720.805|| || ||
|-
|m2$\$$lme||3|| 8||7453.982||7495.832|| -3718.991||2 vs 3||3.627409||0.0568
|-
|m3$\$$lme||4|| 9||7455.966||7503.048|| -3718.983|| 3 vs 4||0.015687||0.9003
|}
</center>

Interpretation 

The AR(1) model (m1) has the same fit as the initial model (mod), which already specified an AR(1) error structure via corAR1, and the AR(2) model (m2) provides only a marginal improvement over the AR(1) fit (m1; p = 0.0568). There is no improvement in moving from m2 to the AR(3) model (m3; p = 0.9003).
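
The L.Ratio and p-value columns can be reproduced by hand from the logLik and df columns of the anova() table above. The sketch below does this for the m1 vs. m2 comparison, assuming the usual chi-square reference distribution with degrees of freedom equal to the difference in the number of parameters; the variable names are illustrative.

```r
# Values copied from the anova() table above
loglik_m1 <- -3720.805; df_m1 <- 7   # AR(1) model
loglik_m2 <- -3718.991; df_m2 <- 8   # AR(2) model

# Likelihood ratio statistic: twice the gain in log-likelihood
lr_stat <- 2 * (loglik_m2 - loglik_m1)

# Upper-tail chi-square probability with df_m2 - df_m1 = 1 degree of freedom
p_val <- pchisq(lr_stat, df = df_m2 - df_m1, lower.tail = FALSE)

round(c(L.Ratio = lr_stat, p.value = p_val), 4)
```

Up to rounding of the tabulated log-likelihoods, these values should agree with the "2 vs 3" row of the table.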

Let’s plot the AR(2) model (m2) to check whether the initial model (mod) over-fitted the trend term; the m2 trend shows similar smoothness to that of the initial model.

plot(m2$\$$gam, scale = 0) # plot(mod2$\$$gam, scale = 0) # “scale=0” ensures optimal y-axis cropping of plot

<center>[[Image:SMHS_TimeSeries11.png|500px]]</center>

Investigation of residual patterns

layout(matrix(1:2, ncol = 2))
# original (mod) model
acf(resid(mod$\$$lme), lag.max = 36, main = "ACF"); pacf(resid(mod$\$$lme), lag.max = 36, main = "pACF")
# pACF controls for the values of the time series at all shorter lags, which contrasts the ACF which does not control for other lags.

<center>[[Image:SMHS_TimeSeries12.png|500px]]</center>

This illustrates that there is some residual auto-correlation (ACF) and partial auto-correlation (pACF) at a lag of 1 month.

# AR(2) model (m2)
layout(matrix(1:2, ncol = 2))
res <- resid(m2$\$$lme, type = "normalized");
acf(res, lag.max = 36, main = "ACF - AR(2) errors"); pacf(res, lag.max = 36, main = "pACF- AR(2) errors")
layout(1)

<center>[[Image:SMHS_TimeSeries13.png|500px]]</center>

No residual auto-correlation remains in m2. The resulting fitted Generalized Additive Mixed Model (GAMM) object contains information about the trend and the contributions to the fitted values. The package '''mgcv'''<sup>3</sup> can split this information using predict() for each of the 4 models.

# require(mgcv); require(gamair)
# m2 <- gamm(as.numeric(temperature) ~ s(as.numeric(Year), k=116) + s(as.numeric(variable), k=12), data = l.sort, correlation = corARMA(form = ~ 1|Year, p = 2), control = ctrl)

pred2 <- predict(m2$\$$gam, newdata = l.sort, type = "terms")
pred_trend2 <- attr(pred2, "constant") + pred2[,1] # trend
pred_season2 <- attr(pred2, "constant") + pred2[,2] # seasonal effects
# plot(m2$\$$gam, scale = 0) # plot pure effects

# Convert the 2 columns (Year and Month/variable) to R Date object (df_time is needed by the plots below)
df_time <- as.Date(paste(as.numeric(l.sort$\$$Year), as.numeric(l.sort$\$$variable), "1", sep="-")); df_time

plot(x=df_time, y=l.sort$\$$temperature, data = l.sort, type = "l", xlim=c(as.Date("1950-02-01"),as.Date("1960-01-01")), xlab = "year", ylab = expression(Temperature ~ (degree*F)))
lines(x=df_time, y=pred_trend2, data = l.sort, col = "red", lwd = 2);
lines(x=df_time, y=pred_season2, data = l.sort, col = "blue", lwd = 2)

<center>[[Image:SMHS_TimeSeries14.png|500px]]</center>

===Moving average smoothing===

A moving average of order $m=2k+1$ can be expressed as:
$T_{t}=\frac{1}{2k+1}\sum_{j=-k}^{k}Y_{t+j}$ .

The ''m''-MA represents an order m moving average, $T_t$, or the estimate of the trend-cycle at time ''t'', obtained by averaging values of the time series within ''k'' periods (left and right) of ''t''. This averaging process denoises the data (eliminates randomness in the data) and produces a smoother trend-cycle component.
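
The averaging formula can be sketched directly in base R. The helper name ma_centered and the toy vector are assumptions made for illustration; stats::filter with equal weights and sides = 2 is used here as one possible way to compute a centered moving average of odd order.

```r
# Centered moving average of order m = 2k + 1 (hypothetical helper, base R only)
ma_centered <- function(y, k) {
  m <- 2 * k + 1
  # equal weights 1/m applied to the k values on each side of t and to Y_t itself
  as.numeric(stats::filter(y, rep(1 / m, m), sides = 2))
}

y_toy <- c(2, 4, 6, 8, 10, 12, 14)
ma5   <- ma_centered(y_toy, k = 2)   # 5-MA
ma5

# The first and last k values are NA: there are not enough neighbours to average
sum(is.na(ma5))
```

For this linear toy series the 5-MA reproduces the middle values exactly, while the two positions at each end remain undefined.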

The 5-MA contains the values of $T_t$ with ''k''=2. To see what the trend-cycle estimate looks like, we plot it along with the original data:

# plot moving-average smoothers of order 12 (1 year) and 36 (3 years) over the raw data
library("forecast") # provides ma()
plot(l.sort$\$$temperature, data = l.sort, type = "l", main=" UMich/AA Temp (1900-2015) ", ylab=" Temperature (F)", xlab="Year")
lines(ma(l.sort$\$$temperature, 12), col="red", lwd=5)
lines(ma(l.sort$\$$temperature, 36), col="blue", lwd=3)

legend(0, 80, # places a legend at the appropriate place
c("Raw", "k=12 smoother", "k=36 smoothest"), # puts text in the legend
lty=c(1,1,1), # gives the legend appropriate symbols (lines)
cex=1.0, # label sizes
lwd=c(2.5,2.5), col=c("black", "red", "blue")) # gives the legend lines the correct color and width

<center>[[Image:SMHS_TimeSeries15.png|500px]]</center>

The blue trend (''k''=36) (3 yrs) is smoother than the original (raw) data (black) and the 1-yr average (''k''=12). It captures the main movement of the time series without all the minor fluctuations. We can’t estimate $T_t$ where ''t'' is close to the ends as there is not enough data there to compute the averages. The red trend (''k''=12) is smoother than the original (raw) data (black) but more jagged than the 3-yr average. The order of the moving average (''m'') determines the smoothness of the trend-cycle estimate. A larger order implies a smoother curve.

===Simulation of a time-series analysis and prediction===

(1) Simulate a time series

# the ts() function converts a numeric vector into an R time series object.
# format is ts(vector, start=, end=, frequency=) where start and end are the times of the first and last observation
# and frequency is the number of observations per unit time (1=annual, 4=quarterly, 12=monthly, etc.)
Note that ''Sampling Rate'' = $\frac{1}{Frequency}$

# save a numeric vector containing 16-years (192 monthly) observations
# from Jan 2000 to Dec 2015 as a time series object
sim_ts <- ts(as.integer(runif(192,0,10)), start=c(2000, 1), end=c(2015, 12), frequency=12)
sim_ts

# subset the time series (June 2014 to December 2015)
sim_ts2 <- window(sim_ts, start=c(2014, 6), end=c(2015, 12))
sim_ts2

# plot series
plot(sim_ts)
lines(sim_ts2, col="blue", lwd=3)

<center>[[Image:SMHS_TimeSeries16.png|500px]]</center>

====Seasonal Decomposition====

*A time series with additive trend, seasonal, and irregular components can be decomposed using the stl() function. Series with multiplicative effects can be transformed into series with additive effects through a log transformation (i.e., '''ln_sim_ts <- log(sim_ts)''').
# Seasonal decomposition
fit_stl <- stl(sim_ts, s.window="period") '''# Seasonal Decomposition of Time Series by Loess'''
plot(fit_stl)

# inspect the distribution of the residuals
hist(fit_stl$\$$time.series[,3]); # this contains the residuals: fit_stl$\$$time.series [,"remainder"], or seasonal, trend

<center>[[Image:SMHS_TimeSeries17.png|500px]]</center>

# additional plots
monthplot(sim_ts) # plots the seasonal subseries of a time series. For each season, a time series is plotted.

# library(forecast)
seasonplot(sim_ts)

====Exponential Models====

*The '''HoltWinters()''' function ('''stats''' package), and the '''ets()''' function ('''forecast''' package) can fit exponential models.
# simple exponential - models level
fit_HW <- HoltWinters(sim_ts, beta=FALSE, gamma=FALSE)

# double exponential - models level and trend
fit_HW2<- HoltWinters(sim_ts, gamma=FALSE)

# triple exponential - models level, trend, and seasonal components
fit_HW3 <- HoltWinters(sim_ts)

plot(fit_HW, col='black')
par(new=TRUE)
plot(fit_HW2, ann=FALSE, axes=FALSE, col='blue')
par(new=TRUE)
plot(fit_HW3, axes=FALSE, col='red')
# clear plot:
# dev.off()
<center>[[Image:SMHS_TimeSeries18.png|500px]]</center>

===Auto-regressive Integrated Moving Average (ARIMA) Models<sup>4</sup>===

There are 2 types of ARIMA time-series models, non-seasonal and seasonal. Both build on the ARMA(p, q) model: 
$ X_t= \mu+ \underbrace{\sum_{i=1}^{p}{\phi_i X_{t-i}}}_\text{auto-regressive (p) part} +
\underbrace{\sum_{j=1}^{q}{\theta_j \varepsilon_{t-j}}}_\text{moving-average (q) part} +
\underbrace{ \varepsilon_t }_\text{error term}.$

====Non-seasonal ARIMA models====
The non-seasonal ARIMA models are denoted by ARIMA(p, d, q), where the parameters p, d, and q are non-negative integers:
* p = order of the auto-regressive model,
* d = degree of differencing, when ''d''=2, the '''''dth'' difference''' is $(X_t-X_{t-1})-(X_{t-1}-X_{t-2})= X_t-2X_{t-1}+X_{t-2}$. That is, the second difference of ''X'' (d=2) is not the difference between the current period and the value 2 periods ago. It is the first-difference-of-the-first difference, the discrete analog of a second derivative, representing the local acceleration of the series rather than its local trend (first derivative).
* q = order of the moving-average model.
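
The distinction between the second difference and a lag-2 difference can be checked numerically. This is a minimal sketch on a made-up vector (the squares 1, 4, 9, ...), using only diff() from base R.

```r
x_sq <- c(1, 4, 9, 16, 25, 36)   # toy series (illustration only)
n    <- length(x_sq)

# d = 2: the first difference of the first difference
d2 <- diff(x_sq, differences = 2)

# Same quantity written out as X_t - 2*X_{t-1} + X_{t-2}
d2_manual <- x_sq[3:n] - 2 * x_sq[2:(n - 1)] + x_sq[1:(n - 2)]

# A lag-2 difference, X_t - X_{t-2}, is a different quantity
lag2 <- diff(x_sq, lag = 2)

d2; d2_manual; lag2
```

For a quadratic sequence the second difference is constant, which matches its reading as a discrete second derivative.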

====Seasonal ARIMA models====
The seasonal ARIMA models are denoted by ''ARIMA(p, d, q)(P, D, Q)<sub>m</sub>'', where
* m = number of periods in each season,
* uppercase P, D, Q represent the auto-regressive, differencing, and moving average terms for the seasonal part of the ARIMA model, and the lower case (p,d,q) are as with non-seasonal ARIMA.

If 2 of the 3 terms are trivial (zero), the model is abbreviated using only the non-zero parameter, dropping the unused "AR", "I" or "MA" from the acronym. For example,

*ARIMA(1,0,0) = AR(1), a stationary and auto-correlated series can be predicted as a multiple of its own previous value, plus a constant. $X_t=μ + φ_1 × X_{t-1}+ \epsilon_t.$ Note that $ε_t=X_t-\hat{X}_t.$

*ARIMA(0,1,0) = I(1), a non-stationary series. This is the limiting case of an AR(1) model in which the auto-regressive coefficient equals 1, i.e., a series with infinitely slow mean reversion: $X_t=μ+X_{t-1}+ε_t,$ a random walk (with drift $μ$).
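
These two special cases can be illustrated by simulation. The sketch below is an assumption-laden example (seed, sample sizes and the coefficient 0.7 are arbitrary choices): it simulates an AR(1) series with arima.sim(), recovers the coefficient with arima(), and builds a random walk as a cumulative sum of noise.

```r
set.seed(123)

# ARIMA(1,0,0) = AR(1): simulate with phi_1 = 0.7 and re-estimate it
ar1_sim <- arima.sim(model = list(ar = 0.7), n = 2000)
ar1_fit <- arima(ar1_sim, order = c(1, 0, 0))
phi_hat <- coef(ar1_fit)[["ar1"]]
phi_hat

# ARIMA(0,1,0) = I(1): a random walk is the cumulative sum of white noise,
# so its first difference returns the noise itself
eps <- rnorm(500)
rw  <- cumsum(eps)
isTRUE(all.equal(diff(rw), eps[-1]))
```

With a long simulated series the estimate should land close to the value used to generate the data.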

For more complex models:
*An ARIMA(1,1,0), differenced first-order auto-regressive model. $X_t=μ+X_{t-1}+α×(X_{t-1}-X_{t-2})+ε_t.$

*An ARIMA(0,2,2) model is given by $X_t=2X_{t-1}-X_{t-2}+α×ε_{t-1}+β×ε_{t-2}+ ε_t,$ where $α$ and $β$ are the MA(1) and MA(2) coefficients (sometimes these are defined with negative signs). This is a general linear exponential smoothing model that uses exponentially weighted moving averages to estimate both a local level and a local trend in the series. The long-term forecasts from this model converge to a straight line whose slope depends on the average trend observed toward the end of the series.

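*An ARIMA(1,1,2) model is given by $X_t=μ+X_{t-1}+φ_1×(X_{t-1}-X_{t-2})+α×ε_{t-1}+β×ε_{t-2}+ε_t,$ where $φ_1$ is the AR(1) coefficient and $α$ and $β$ are the MA(1) and MA(2) coefficients.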

The '''arima'''() function ('''stats''' package) can be used to fit an auto-regressive integrated moving averages model. Other useful functions include:
* lag(sim_ts, k): lagged version of the time series, shifted back k observations
* diff(sim_ts, differences=d): difference the time series d times
* ndiffs(sim_ts): number of differences required to achieve stationarity (from the forecast package)
* acf(sim_ts): auto-correlation function
* pacf(sim_ts): partial auto-correlation function
* adf.test(sim_ts): Augmented Dickey-Fuller test. Rejecting the null hypothesis suggests that a time series is stationary (from the tseries package)
* Box.test(x, type="Ljung-Box"): Portmanteau test that observations in vector or time series x are independent.
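
A short sketch of how some of these helpers behave, using only functions from the stats package on a simulated random walk (the seed, series length and lag are arbitrary assumptions):

```r
set.seed(2016)
rw_sim <- cumsum(rnorm(300))            # simulated random walk (non-stationary)
d_rw   <- diff(rw_sim, differences = 1) # first difference

# Sample auto-correlations without plotting; element 1 is lag 0, element 2 is lag 1
acf_rw   <- acf(rw_sim, lag.max = 10, plot = FALSE)
lag1_acf <- acf_rw[["acf"]][2]

# Ljung-Box portmanteau tests of independence
bt_rw <- Box.test(rw_sim, lag = 10, type = "Ljung-Box")
bt_d  <- Box.test(d_rw,   lag = 10, type = "Ljung-Box")

lag1_acf
c(random_walk = bt_rw[["p.value"]], differenced = bt_d[["p.value"]])
```

The raw random walk is strongly auto-correlated, so the test rejects independence; the p-value for the differenced series depends on the particular simulated draw.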

The '''forecast''' package has alternative versions of '''acf()''' and '''pacf()''' called '''Acf()''' and '''Pacf()''' respectively.
# fit an '''ARIMA(p, d, q) model''' of order:
* p, the AR order
* d, the degree of differencing
* q, the MA order.

fit_arima1 <- arima(sim_ts, order=c(3, 1, 2))
# predictive accuracy
library(forecast)
accuracy(fit_arima1)

# predict next 20 observations
library(forecast)
forecast(fit_arima1, 20)
plot(forecast(fit_arima1, 20))

<center>[[Image:SMHS_TimeSeries19.png|600px]]</center>

===Automated Forecasting===

The '''forecast''' package provides functions for the automatic selection of exponential and ARIMA models. The '''ets()''' (exponential TS) function supports both additive and multiplicative models. The '''auto.arima()''' function searches over seasonal and non-seasonal ARIMA models and selects the one that minimizes an information criterion (AICc by default).

# library(forecast)

# Automated forecasting using an exponential model
fit_ets <- ets(sim_ts)

# Automated forecasting using an ARIMA model
fit_arima2 <- auto.arima(sim_ts)

# Compare the AIC (model quality) for both models
fit_ets$\$$aic; fit_arima2$\$$aic
accuracy(fit_ets); accuracy(fit_arima2);

'''Akaike’s Information Criterion (AIC)''' = ''-2Log(Likelihood)+2p,'' where ''p'' is the number of estimated parameters.
summary(fit_ets); summary(fit_arima2)
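
The AIC definition can be verified by hand on a small fitted model. This sketch uses a simulated AR(1) series and stats::arima() rather than the models above (an assumption made to keep the example self-contained); the parameter count is taken from the "df" attribute of logLik().

```r
set.seed(42)
x_sim   <- arima.sim(model = list(ar = 0.5), n = 200)
fit_sim <- arima(x_sim, order = c(1, 0, 0))

ll    <- logLik(fit_sim)
p_par <- attr(ll, "df")   # estimated parameters: AR coefficient, mean, innovation variance

# AIC = -2 * log(Likelihood) + 2 * p
aic_manual <- -2 * as.numeric(ll) + 2 * p_par

c(manual = aic_manual, reported = AIC(fit_sim))
```

Lower AIC values indicate a better trade-off between goodness of fit and the number of parameters.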

If the ACF plot of the model residuals shows all correlations within the threshold limits, the residuals are behaving like white noise. A large p-value from a portmanteau test supports the same conclusion.
# acf computes (and by default plots) estimates of the autocovariance or autocorrelation function
acf(residuals(fit_ets))

# Box–Pierce or Ljung–Box test statistic for examining the null hypothesis of independence in a given time series.
# These are sometimes known as ‘portmanteau’ tests.
Box.test(residuals(fit_ets), lag=24, fitdf=4, type="Ljung")
# plot forecast

plot(forecast(fit_arima2))
# more on ARIMA https://www.otexts.org/fpp/8/7

===Footnotes===
* 1 https://umich.instructure.com/files/689861/download?download_frd=1
* 2 https://umich.instructure.com/courses/38100/files
* 3 https://cran.r-project.org/web/packages/mgcv/mgcv.pdf
* 4 http://arxiv.org/ftp/arxiv/papers/1302/1302.6613.pdf

==See also==
* [[SMHS_TimeSeriesAnalysis_LOS| Applications of Time-series]]

<hr>
* SOCR Home page: http://www.socr.ucla.edu
{{translate|pageName=http://wiki.stat.ucla.edu/socr/index.php?title=SMHS_TimeSeriesAnalysis}}

SMHS BigDataBigSci CrossVal LDA QDA

2016-05-24T14:09:20Z

Pineaumi: /* See also */

==[[SMHS_BigDataBigSci_CrossVal| Big Data Science and Cross Validation]] - Foundation of LDA and QDA for prediction, dimensionality reduction or forecasting==

===Summary===
Both LDA (Linear Discriminant Analysis) and QDA (Quadratic Discriminant Analysis) use probabilistic models of the class conditional distribution of the data $P(X|Y=k)$ for each class $k$. Their predictions are obtained by using Bayes' theorem (http://wiki.socr.umich.edu/index.php/SMHS_BayesianInference#Bayesian_Rule):

\begin{equation}
P(Y=k|X)=\frac{P(X|Y=k)P(Y=k)}{P(X)}=\frac{P(X|Y=k)P(Y=k)}{\sum_{l}P(X|Y=l)P(Y=l)}
\end{equation}

and we select the class $k$ that '''maximizes''' this conditional (posterior) probability, i.e., the maximum a posteriori class.
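
A one-dimensional toy version of this rule is sketched below. The two classes A and B, their means, the common standard deviation, the priors and the query point are all made-up values for illustration.

```r
x_new       <- 0.5                     # observation to classify
class_means <- c(A = 0, B = 2)         # hypothetical class means
class_sd    <- 1                       # common standard deviation
priors      <- c(A = 0.5, B = 0.5)     # P(Y = k)

# Class-conditional Gaussian densities P(X | Y = k)
lik <- dnorm(x_new, mean = class_means, sd = class_sd)

# Bayes' theorem: posterior = likelihood * prior / evidence
post <- lik * priors / sum(lik * priors)

# Select the class with the largest posterior probability
pred_class <- names(post)[which.max(post)]
post; pred_class
```

Because the query point is closer to the mean of class A and the priors are equal, class A receives the larger posterior probability.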

In linear and quadratic discriminant analysis, $P(X|Y)$ is modeled as a multivariate Gaussian distribution with density:

\begin{equation}
P(X|Y=k)=\frac{1}{(2\pi)^{n/2}|\Sigma_k|^{1/2}}\exp\Big(-\frac{1}{2}(X-\mu_k)^T\Sigma_k^{-1}(X-\mu_k)\Big)
\end{equation}

This model can be used to classify data by using the training data to '''estimate''':

(1) the class prior probabilities $P(Y = k)$ by counting the proportion of observed instances of class $k$,

(2) the class means $μ_k$ by computing the empirical sample class means, and

(3) the covariance matrices by computing either the empirical sample class covariance matrices or a regularized estimator (e.g., lasso).
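
The three estimation steps can be sketched in base R. The built-in iris data are used purely as a stand-in (an assumption, since they are not the data analyzed in this section), and the pooled covariance shown at the end is the usual shared-covariance estimate that LDA relies on.

```r
x_mat <- as.matrix(iris[, 1:4])   # continuous features
y_cls <- iris[["Species"]]        # class labels
n_k   <- table(y_cls)             # class counts

# (1) class priors: proportion of observations in each class
priors <- n_k / length(y_cls)

# (2) class means: per-class column sums divided by the class counts
class_means <- rowsum(x_mat, y_cls) / as.vector(n_k)

# (3) per-class empirical covariance matrices (one per class, as used by QDA)
class_covs <- lapply(split(as.data.frame(x_mat), y_cls), cov)

# Pooled covariance (shared across classes, as assumed by LDA)
pooled_cov <- Reduce(`+`, Map(function(S, n) (n - 1) * S, class_covs, as.vector(n_k))) /
  (length(y_cls) - nlevels(y_cls))

priors; class_means; round(pooled_cov, 3)
```

Keeping one covariance matrix per class corresponds to QDA, while replacing them by the pooled estimate corresponds to LDA.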

In the linear case (LDA), the Gaussians for each class are assumed to share the same covariance matrix:

$Σ_k=Σ$ for each class $k$. This leads to linear decision surfaces between classes. This is clear from comparing the log-probability ratios of 2 classes ($k$ and $l$):

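$LOR=\log\Big(\frac{P(Y=k|X)}{P(Y=l|X)}\Big)$
(LOR=0 $\Leftrightarrow$ the two posterior probabilities are equal, i.e., $X$ lies on the decision boundary between classes $k$ and $l$)

$LOR=\log\Big(\frac{P(Y=k|X)}{P(Y=l|X)}\Big)=0 \Leftrightarrow (\mu_k-\mu_l)^T\Sigma^{-1}X=\frac{1}{2}({\mu_k}^T\Sigma^{-1}\mu_k-{\mu_l}^T\Sigma^{-1}\mu_l)-\log\Big(\frac{P(Y=k)}{P(Y=l)}\Big),$

which is linear in $X$; for equal class priors the last term vanishes.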
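
The linearity of the LDA boundary can be checked numerically. In the sketch below (all means and the shared covariance matrix are made-up values, and equal class priors are assumed), the midpoint between the two class means has a log-odds ratio of zero and satisfies the linear condition $(\mu_k-\mu_l)^T\Sigma^{-1}X=\frac{1}{2}({\mu_k}^T\Sigma^{-1}\mu_k-{\mu_l}^T\Sigma^{-1}\mu_l)$.

```r
mu_k  <- c(1, 2)                               # hypothetical class means
mu_l  <- c(3, 0)
Sigma <- matrix(c(2, 0.5, 0.5, 1), 2, 2)       # shared covariance matrix
Sigma_inv <- solve(Sigma)

# Gaussian log-density up to terms that are identical for both classes
log_gauss <- function(x, mu) {
  d <- x - mu
  as.numeric(-0.5 * t(d) %*% Sigma_inv %*% d)
}

x_mid   <- (mu_k + mu_l) / 2                   # midpoint between the class means
lor_mid <- log_gauss(x_mid, mu_k) - log_gauss(x_mid, mu_l)   # equal priors assumed

# Both sides of the linear boundary condition evaluated at the midpoint
lhs <- as.numeric(t(mu_k - mu_l) %*% Sigma_inv %*% x_mid)
rhs <- 0.5 * as.numeric(t(mu_k) %*% Sigma_inv %*% mu_k - t(mu_l) %*% Sigma_inv %*% mu_l)

c(LOR = lor_mid, lhs = lhs, rhs = rhs)
```

With unequal priors the boundary stays linear but shifts away from the midpoint, toward the mean of the less probable class.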

But, in the more general, quadratic case of QDA, there are no assumptions on the covariance matrices $Σ_k$ of the Gaussians, leading to quadratic decision surfaces.

==LDA (Linear Discriminant Analysis)==

LDA is similar to GLM (e.g., ANOVA and regression analyses), as it also attempts to express one dependent variable as a linear combination of other features or data elements. However, ANOVA uses categorical independent variables and a continuous dependent variable, whereas LDA has continuous independent variables and a categorical dependent variable (i.e., Dx/class label). Logistic regression and probit regression are more similar to LDA than ANOVA is, as they also explain a categorical variable by the values of continuous independent variables.

predfun.lda = function(train.x, train.y, test.x, test.y, neg)
{
require("MASS")
lda.fit = lda(train.x, grouping=train.y)
ynew = predict(lda.fit, test.x)$\$$class
out.lda = confusionMatrix(test.y, ynew, negative=neg)
return( out.lda )
}

==QDA (Quadratic Discriminant Analysis)==

predfun.qda = function(train.x, train.y, test.x, test.y, neg)
{
require("MASS") # for qda function
qda.fit = qda(train.x, grouping=train.y)
ynew = predict(qda.fit, test.x)$\$$class
out.qda = confusionMatrix(test.y, ynew, negative=neg)
return( out.qda )
}

==k-Nearest Neighbors algorithm==

k-Nearest Neighbors algorithm (''k''-NN) is a non-parametric method for either classification or regression, where the input consists of the ''k'' closest '''training examples''' in the feature space, but the output depends on whether ''k''-NN is used for classification or regression:

*In ''k''-NN '''classification''', the output is a class membership (label). Objects in the testing data are classified by a majority vote of their neighbors. Each object is assigned to the class that is most common among its ''k'' nearest neighbors (''k'' is typically a small positive integer). When ''k''=1, an object is assigned to the class of its single nearest neighbor.

*In ''k''-NN '''regression''', the output is the property value for the object, computed as the average of the values of its ''k'' nearest neighbors.
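
The two variants can be sketched in a few lines of base R. This is a minimal illustration under stated assumptions, not the implementation in the "class" package: knn_predict() is a hypothetical helper, Euclidean distance is assumed, and the toy data are made up.

```r
# Minimal k-NN sketch in base R (knn_predict is a hypothetical helper)
knn_predict <- function(train_x, train_y, new_x, k = 3,
                        type = c("classification", "regression")) {
  type <- match.arg(type)
  train_x <- as.matrix(train_x)
  # Euclidean distances from the new point to every training example
  d  <- sqrt(rowSums(sweep(train_x, 2, new_x)^2))
  nn <- order(d)[seq_len(k)]            # indices of the k closest training examples
  if (type == "classification") {
    votes <- table(train_y[nn])
    names(votes)[which.max(votes)]      # majority vote (ties go to the first label)
  } else {
    mean(train_y[nn])                   # average of the neighbors' values
  }
}

# Toy data: two well-separated groups in 2D
train_x <- rbind(c(0, 0), c(0, 1), c(1, 0), c(5, 5), c(5, 6), c(6, 5))
labels  <- c("A", "A", "A", "B", "B", "B")
values  <- c(1, 2, 3, 10, 11, 12)

class_near_origin <- knn_predict(train_x, labels, c(0.2, 0.2), k = 3)
value_far_corner  <- knn_predict(train_x, values, c(5.5, 5.5), k = 3, type = "regression")
```

Here the point near the origin is voted into group "A" by its three nearest neighbors, and the regression prediction near (5.5, 5.5) is the mean of the three closest values.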

#X = as.matrix(input)     # Predictor variables X = as.matrix(input.short2)

#Y = as.matrix(output)     # Outcome

'''#KNN (k-nearest neighbors)'''

library("class")
#knn.fit.test <- knn(X, X, cl = Y, k=3, prob=F); predict(as.matrix(knn.fit.test), X) $\$$class 
#table(knn.fit.test, Y); confusionMatrix(Y, knn.fit.test, negative="1")
#This can be used for polytomous variable (multiple classes)

predfun.knn = function(train.x, train.y, test.x, test.y, neg)
{
require("class")
knn.fit = knn(train.x, test.x, cl = train.y, prob=T) # knn is already a prediction function!!!
#ynew = predict(knn.fit, test.x)$\$$class # no need of another prediction, in this case
out.knn = confusionMatrix(test.y, knn.fit, negative=neg)
return( out.knn )
}
cv.out.knn = '''crossval::crossval'''(predfun.knn, X, Y, K=5, B=2, neg="1")

Compare all 4 classifiers (lda, qda, knn, and logit)

diagnosticErrors(cv.out.lda$\$$stat); diagnosticErrors(cv.out.qda$\$$stat); diagnosticErrors(cv.out.knn$\$$stat);
diagnosticErrors(cv.out.logit$\$$stat);

[[Image:SMHS BigDataBigSci CrossVal5.png|500px]]

'''Now let’s look at the actual prediction models!'''

There are different approaches to split the data (partition the data) into Training and Testing sets.

#TRAINING: 75% of the sample size

sample_size <- floor(0.75 * nrow(input))
##set the seed to make your partition reproducible
set.seed(1234)
input.train.ind <- sample(seq_len(nrow(input)), size = sample_size)
input.train <- input[input.train.ind, ]
output.train <- as.matrix(output)[input.train.ind, ]

#TESTING DATA

input.test <- input[-input.train.ind, ]
output.test <- as.matrix(output)[-input.train.ind, ]

==k-Means Clustering (k-MC)==

k-MC aims to partition ''n'' observations into ''k'' clusters, where each observation belongs to the cluster with the nearest mean, which acts as a prototype of the cluster. The k-MC partitions the data space into Voronoi cells. In general, there is no computationally tractable solution (the problem is NP-hard), but there are efficient heuristic algorithms that converge quickly to local optima. These are similar to the expectation-maximization algorithm for mixtures of Gaussian distributions, as both algorithms employ an iterative refinement approach<sup>2</sup>.

kmeans_model <- kmeans(input.train, 2)
layout(matrix(1,1))
plot(input.train, col = kmeans_model$\$$cluster)
points(kmeans_model$\$$centers, col = 1:2, pch = 8, cex = 2)

##cluster centers "fitted" to each obs.:
fitted.kmeans <- fitted(kmeans_model); head(fitted.kmeans)
resid.kmeans <- (input.train - fitted(kmeans_model))
#define the sum of squares function
ss <- function(data) sum(scale(data, scale = FALSE)^2)

##Equalities
cbind(kmeans_model[c("betweenss", "tot.withinss", "totss")], # the same two columns
c (ss(fitted.kmeans), ss(resid.kmeans), ss(input.train)))

#validation
stopifnot(all.equal(kmeans_model$\$$totss, ss(input.train)),
all.equal(kmeans_model$\$$tot.withinss, ss(resid.kmeans)),
##these three are the same:
all.equal(kmeans_model$\$$betweenss, ss(fitted.kmeans)),
all.equal(kmeans_model$\$$betweenss, kmeans_model$\$$totss - kmeans_model$\$$tot.withinss),
##and hence also
all.equal(ss(input.train), ss(fitted.kmeans) + ss(resid.kmeans))
)
kmeans(input.train,1)$\$$withinss     # trivial one-cluster, (its W.SS == ss(input.train))

<sup>2</sup> http://escholarship.org/uc/item/1rb70972

'''(1) k-Nearest Neighbor Classification'''

library("class")
knn_model <- knn(train= input.train, input.test, cl=as.factor(output.train), k=2)
plot(knn_model)
summary(knn_model)
attributes(knn_model)

#cross-validation
knn_model.cv <- knn.cv(train= input.train, cl=as.factor(output.train), k=2)
summary(knn_model.cv)

==Appendix: R Debugging==

Most programs that give incorrect results are affected by logical errors. When errors (bugs, exceptions) occur, we need to explore deeper; this procedure of identifying and fixing bugs is called “debugging”.

R tools for debugging: traceback(), debug(), browser(), trace(), recover()

'''traceback():''' A failing R function immediately reports the error to the screen. Calling traceback() then prints the sequence of function calls that led to the error, which shows the function where the error occurred.
The function calls are printed in reverse order (the most recent call first).

f1<-function(x) { r<- x-g1(x); r }

g1<-function(y) { r<-y*h1(y); r }

h1<-function(z) { r<-log(z); if(r<10) r^2 else r^3}

f1(-1)

Error in if (r < 10) r^2 else r^3 : missing value where TRUE/FALSE needed
In addition: Warning message:
In log(z) : NaNs produced

traceback()
3: h1(y)
2: g1(x)
1: f1(-1)

'''debug():''' traceback() does not tell you where in the function the error occurred. To find out which line causes the error, we may step through the function using debug().

debug(foo) flags the function foo() for debugging. undebug(foo) unflags the function.

When a function is flagged for debugging, its statements are executed one at a time. After each statement is executed, the function suspends and the user can interact with the R shell.

This allows us to inspect a function line-by-line.

'''Example''': compute sum of squared error SS

## compute sum of squares
SS<-function(mu,x) { d<-x-mu; d2<-d^2; ss<-sum(d2); ss }
set.seed(100); x<-rnorm(100); SS(1,x)

## to debug
debug(SS); SS(1,x)
debugging in: SS(1, x)
debug: {
d <- x - mu
d2 <- d^2
ss <- sum(d2)
ss
}

In the debugging shell (“Browse[1]>”), users can:

* Enter '''n''' (next) to execute the current line and print the next one;
* Enter '''c''' (continue) to execute the rest of the function without stopping;
* Enter '''Q''' to quit debugging;
* Enter '''ls()''' to list all objects in the local environment;
* Enter an object name or print(<object name>) to display the current value of an object.

Example:

debug(SS)
SS(1,x)
debugging in: SS(1, x)
debug: {
d <- x - mu
d2 <- d^2
...
Browse[1]> n
debug: d <- x - mu ## the next command
Browse[1]> ls() ## current environment
[1] "mu" "x" ## there is no d
Browse[1]> n ## go one step
debug: d2 <- d^2 ## the next command
Browse[1]> ls() ## current environment
[1] "d" "mu" "x" ## d has been created
Browse[1]> d[1:3] ## first three elements of d
[1] -1.5021924 -0.8684688 -1.0789171
Browse[1]> hist(d) ## histogram of d
Browse[1]> where ## current position in call stack
where 1: SS(1, x)
Browse[1]> n
debug: ss <- sum(d2)
Browse[1]> Q ## quit

'''undebug(SS)''' ## remove the debug flag, stop the debugging process
SS(1,x) ## calling SS again now runs without debugging

You can flag a function for debugging while debugging another function:

f<-function(x) { r<-x-g(x); r }
g<-function(y) { r<-y*h(y); r }
h<-function(z) { r<-log(z); if(r<10) r^2 else r^3 }

debug(f) ## if you only debug f, you will not step into g
f(-1)
Browse[1]> n
Browse[1]> n
Error in if (r < 10) r^2 else r^3 : missing value where TRUE/FALSE needed
In addition: Warning message:
In log(z) : NaNs produced

But we can also flag ''g'' and ''h'' for debugging while we debug ''f'':

f(-1)
Browse[1]> n
Browse[1]> debug(g)
Browse[1]> debug(h)
Browse[1]> n

Inserting a call to '''browser()''' in a function pauses execution at the point where browser() is called.
This is similar to using debug(), except that you control where execution gets paused.

'''Example:'''
h<-function(z) {
browser() ## a break point inserted here
r<-log(z); if(r<10) r^2 else r^3
}

f(-1)
Browse[1]> ls()
Browse[1]> z
Browse[1]> n
Browse[1]> n
Browse[1]> ls()
Browse[1]> c

Calling '''trace()''' on a function allows inserting new code into a function. The syntax for trace() may be challenging.

as.list(body(h))
trace("h",quote(if(is.nan(r)) {browser()}), at=3, print=FALSE)
f(1)
f(-1)

trace("h",quote(if(z<0) {z<-1}), at=2, print=FALSE)
f(-1)
untrace("h") ## remove the tracing code from h (untrace() requires the function name)

During the debugging process, '''recover()''' allows checking the status of variables in upper-level functions. recover() can be used as an error handler using '''options()''' (e.g., options(error=recover)). When functions throw exceptions, execution stops at the point of failure. Browsing the function calls and examining the environment may indicate the source of the problem.

==See also==
* [[SMHS_BigDataBigSci_CrossVal_| Back to Big Data Science and Cross-Validation]]
* [[SMHS_BigDataBigSci_SEM| Structural Equation Modeling (SEM)]]
* [[SMHS_BigDataBigSci_GCM| Growth Curve Modeling (GCM)]]
* [[SMHS_BigDataBigSci_GCM| Generalized Estimating Equation (GEE) Modeling]]
* [[SMHS_BigDataBigSci|Back to Big Data Science]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php?title=SMHS_BigDataBigSci_CrossVal_LDA_QDA}}

SMHS BigDataBigSci CrossVal

2016-05-24T14:08:48Z

Pineaumi: /* See also */

==[[SMHS_BigDataBigSci| Big Data Science]] - (Internal) Statistical Cross-Validation ==

== Questions ==
* What does it mean to validate a result, a method, approach, protocol, or data?
* Can we do “pretend” validations that closely mimic reality?

<center>[[Image:SMHS_BigDataBigSci_CrossVal1.png|250px]]</center>

''Validation'' is the scientific process of determining the degree of accuracy of a mathematical, analytic or computational model as a representation of the real world based on the intended model use. There are various challenges with using observed experimental data for model validation:

1. Incomplete details of the experimental conditions may be subject to boundary and initial conditions, sample or material properties, geometry or topology of the system/process.

2. Limited information about measurement errors due to lack of experimental uncertainty estimates.

Empirically observed data may be used to evaluate models, with conventional statistical tests applied subsequently to test null hypotheses (e.g., that the model output is correct). In this process, model-predicted values are compared to their corresponding observed counterparts and the discrepancy between them is assessed. For example, the values predicted by a regression model may be compared to empirical observations. Under parametric assumptions of normal residuals and linearity, we could test null hypotheses like $slope = 1$ or $intercept = 0$. When comparing the model obtained on one training dataset to an independent dataset, the slope may be different from 1 and/or the intercept may be different from 0. The purpose of the regression comparison is a formal test of a hypothesis (e.g., $slope = 1$, $mean_{observed} = mean_{predicted}$), so the distributional properties of the adjusted estimates are critical in making an accurate inference. The logistic regression test is another example of comparing predicted and observed values. Measurement errors may creep in due to sampling or analytical biases, instrument reading or recording errors, temporal or spatial discrepancies in sample collection, etc.

==Overview==

====Cross-validation====

Cross-validation is a method for validating models by assessing the reliability and stability of the results of a statistical analysis (e.g., model predictions) based on independent datasets. For prediction of trend, association, clustering, etc., a model is usually trained on one dataset (training data) and tested on new, unknown data (testing dataset). The cross-validation method defines a test dataset to evaluate the model and thereby guard against overfitting (which occurs when a computational model describes random error, or noise, instead of the underlying relationships in the data).

====Overfitting====

'''Example (US Presidential Elections):''' By 2014, there had been only '''56 presidential elections and 43 presidents'''. That is a small dataset, and learning from it may be challenging. '''If the predictor space expands to include things like having false teeth, it's pretty easy for the model to go from fitting the generalizable features of the data (the signal) to matching the noise.''' When this happens, the quality of the fit on the historical data may improve (e.g., better $R^2$), but the model may fail miserably when used to make inferences about future presidential elections.

(Figure from http://xkcd.com/1122/)

<center>[[Image:SMHS BigDataBigSci_CrossVal2.png|400px]]</center>

'''Example (Google Flu Trends):''' A March 14, 2014 article in Science (''DOI: 10.1126/science.1248506'') identified problems in Google Flu Trends (GFT, http://www.google.org/flutrends/about/#US, DOI 10.1371/journal.pone.0023610), which may be attributed in part to overfitting. In February 2013, Nature reported that GFT was predicting more than double the proportion of doctor visits for influenza-like illness (ILI) reported by the Centers for Disease Control and Prevention (CDC), despite the fact that GFT was built to predict CDC reports.

The GFT model found the best matches among 50 million search terms to fit 1,152 data points. The odds of finding search terms that match the propensity of the flu but are structurally unrelated, and so do not predict the future, were quite high. GFT developers, in fact, reported weeding out seasonal search terms unrelated to the flu but strongly correlated to the CDC data, e.g., high school basketball season. The big GFT data may have overfitted the small number of cases. The GFT approach missed the non-seasonal 2009 influenza A–H1N1 pandemic.

'''Example (Autism).''' Autistic brains constantly overfit visual and cognitive stimuli. To an autistic person, a general conversation of several adults may seem like a cacophony due to super-sensitive detail-oriented hearing and perception tuned to literally pick up all elements of the conversation and the environment but downplay body language, sarcasm and non-literal cues. We can miss the forest for the trees when we start "overfitting," over-interpreting the noise on top of the actual signal. Ambient noise, trivial observations and unrelated perceptions may hide the true communication details.

During each communication (conversation) there are exchanges of both information and random noise. Fitting a perfect model is only listening to the “relevant” information. Over-fitting is when your attention is (excessively) consumed with the noise, or worse, letting the noise drown out the information exchange.

Any dataset is a mix of signal and noise. The main task of our brains is to sort these components and interpret the information (i.e., ignore the noise).

Our predictions are most accurate if we can model as much of the signal as possible and as little of the noise as possible. Note that in these terms, $R^2$ is a poor metric of predictive power, as it measures how much of the signal '''and''' the noise is explained by our model. In practice, it is hard to always identify what is signal and what is noise. This is why practical applications tend to favor simpler models: the more complicated a model is, the easier it is to overfit the noise component in the information.
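
A small simulated illustration of this point, using only base R: the data-generating line, the noise level, the sample size, and the polynomial degrees below are all arbitrary assumptions chosen for the sketch. The in-sample $R^2$ can only go up when the model becomes more flexible, whereas the error on fresh data from the same process need not improve.

```r
set.seed(2016)                          # arbitrary seed, for reproducibility only
n <- 30
x <- runif(n, 0, 1)
y <- 2 + 3 * x + rnorm(n, sd = 0.5)     # true signal is linear; the rest is noise
x_new <- runif(n, 0, 1)                 # fresh data from the same process
y_new <- 2 + 3 * x_new + rnorm(n, sd = 0.5)

fit_simple  <- lm(y ~ poly(x, 1))       # matches the form of the signal
fit_complex <- lm(y ~ poly(x, 10))      # flexible enough to chase the noise

r2_simple  <- summary(fit_simple)[["r.squared"]]
r2_complex <- summary(fit_complex)[["r.squared"]]

# Out-of-sample mean squared error on the fresh data
mse_new <- function(fit) mean((y_new - predict(fit, data.frame(x = x_new)))^2)
c(r2_simple = r2_simple, r2_complex = r2_complex,
  mse_simple = mse_new(fit_simple), mse_complex = mse_new(fit_complex))
```

Comparing the two $R^2$ values with the two out-of-sample errors shows why the in-sample fit alone is not a reliable guide for model selection.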

'''Cross-validation is an iterative process''', where each step involves:

*Randomly partitioning a sample of data into 2 complementary subsets (training + testing),

*Performing the analysis on the training subset

*Validating the analysis on the testing subset

*Increasing the iteration index and repeating the process (termination criteria can involve a fixed number of iterations, or a desired (e.g., mean) variability or error-rate).

<center>[[Image:SMHS BigDataBigSci_CrossVal3.png|400px]]</center>

The validation results at each iteration are averaged, to reduce noise/variability, and reported.
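
The iterative scheme can be sketched in base R as follows. This is only an illustration under stated assumptions: it uses the built-in attitude data with a linear model, a 75%/25% split, a fixed number of 50 iterations as the termination criterion, and RMSE as the validation measure; none of these choices come from the case studies.

```r
set.seed(1234)                          # arbitrary seed
data("attitude")                        # built-in data: 30 departments, response "rating"
n_iter <- 50                            # fixed number of iterations (termination criterion)
n      <- nrow(attitude)
rmse   <- numeric(n_iter)

for (b in seq_len(n_iter)) {
  train_ind <- sample(seq_len(n), size = floor(0.75 * n))   # random partition
  fit  <- lm(rating ~ ., data = attitude[train_ind, ])      # analysis on the training subset
  pred <- predict(fit, attitude[-train_ind, ])              # validation on the testing subset
  rmse[b] <- sqrt(mean((attitude[-train_ind, "rating"] - pred)^2))
}

cv_rmse   <- mean(rmse)                 # average over iterations
cv_spread <- sd(rmse)                   # iteration-to-iteration variability
c(cv_rmse = cv_rmse, cv_spread = cv_spread)
```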

Cross-validation guards against testing hypotheses suggested by the data themselves (aka: "Type III errors", False-Suggestion) in cases when new observations are hard to obtain (due to costs, reliability, time or other constraints).

Cross-validation is different from ''conventional-validation'' (e.g., a single 80%-20% partitioning of the data set into training and testing subsets). In conventional validation, the error (e.g., Root Mean Square Error, RMSE) on the training data is not a useful estimator of model performance, as it does not generalize across multiple samples, and the error on a single test subset depends on the particular split, so it does not assess model performance in general. A fairer way to estimate model prediction performance is to use cross-validation, which combines (averages) prediction errors or measures of fit to correct for the stochastic nature of training and testing data partitions and generates a more accurate and robust estimate of real model performance.

A more complex model ''overfits-the-data'', relative to a simpler model, when the former generates accurate fitting results for known data but less accurate results when predicting based on new data (foresight). Knowledge from past experience includes information that is either ''relevant or irrelevant'' (noise) for future predictions. In challenging data-driven prediction models, when uncertainty (entropy) is high, more noise is present in past information that needs to be ignored in future forecasting. However, it is generally hard to discriminate patterns from noise in complex systems (i.e., deciding which part to model and which to ignore). Models that reduce the chance of fitting noise are called '''robust'''.

====Example (Linear Regression)====

We can demonstrate model assessment using linear regression. Suppose we observe response values $\{y_1,...,y_n\}$ and the corresponding $k$ predictors, represented as $k$-dimensional vectors of covariates $\{x_1,...,x_n\}$, where subjects/cases are indexed by $1 ≤ i ≤ n$, and the data-elements (variables) are indexed by $1 ≤ j ≤ k$.

$$
\begin{pmatrix}
x_{1,1} & \cdots & x_{1,k} \\
\vdots & \ddots & \vdots \\
x_{n,1} & \cdots & x_{n,k}
\end{pmatrix}
$$

Using least squares to estimate the intercept $\alpha$ and the linear function parameters (effect-sizes), $\{\beta_1,...,\beta_k\}$, allows us to compute a hyperplane $y = \alpha + x\beta$ that best fits the observed data $\{x_i,y_i\}_{1≤i≤n}$.

$$
\begin{equation}
\begin{pmatrix}
y_{1} \\
\vdots \\
y_{n}
\end{pmatrix}
=
\begin{pmatrix}
\alpha \\
\vdots \\
\alpha
\end{pmatrix}
+
\begin{pmatrix}
x_{1,1} & \cdots & x_{1,k} \\
\vdots & \ddots & \vdots \\
x_{n,1} & \cdots & x_{n,k}
\end{pmatrix}
\begin{pmatrix}
\beta_{1} \\
\vdots \\
\beta_{k}
\end{pmatrix}
\end{equation}
$$

$$
\begin{array}{l}
y_1=\alpha+x_{1,1}\beta_1+x_{1,2}\beta_2+...+x_{1,k}\beta_k\\
y_2=\alpha+x_{2,1}\beta_1+x_{2,2}\beta_2+...+x_{2,k}\beta_k \\
...\\
y_n=\alpha+x_{n,1}\beta_1+x_{n,2}\beta_2+...+x_{n,k}\beta_k
\end{array}$$

The model fit may be evaluated using the mean squared error (MSE) or its square root (RMSE). For given values of the parameters ''α'' and ''β'', the MSE on the observed training data $\{x_i,y_i\}_{1 ≤ i ≤ n}$ is:

$$
\begin{equation} MSE=\frac{1}{n}\sum_{i=1}^{n} \Bigg(y_i-\underbrace{(\alpha+x_{i,1}\beta_1+x_{i,2}\beta_2+\cdots+x_{i,k}\beta_k)}_{(\text{predicted value, } \hat{y_i} \text{, at }\{x_{i,1},\cdots,x_{i,k}\})} \Bigg)^2
\end{equation}
$$
<center>vs.</center>
$$
\begin{equation} RMSE=\sqrt{\frac{1}{n}\sum_{i=1}^{n} \Bigg(y_i-\underbrace{(\alpha+x_{i,1}\beta_1+x_{i,2}\beta_2+\cdots+x_{i,k}\beta_k)}_{(\text{predicted value, } \hat{y_i} \text{, at }\{x_{i,1},\cdots,x_{i,k}\})} \Bigg)^2 }.
\end{equation}
$$
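
As a small worked example of these two quantities, the sketch below fits a least-squares model to the built-in attitude data (used here purely as an illustrative assumption) and computes the in-sample MSE and RMSE directly from the definitions.

```r
data("attitude")
fit   <- lm(rating ~ ., data = attitude)   # least-squares estimates of the intercept and slopes
y     <- attitude[["rating"]]
y_hat <- fitted(fit)                       # predicted values at the observed covariates

mse  <- mean((y - y_hat)^2)                # mean squared error (in-sample)
rmse <- sqrt(mse)                          # same information, on the scale of y
c(MSE = mse, RMSE = rmse)
```

The RMSE is often easier to interpret because it is expressed in the units of the response.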

The expected value of the MSE (over the distribution of training sets) for the '''training set''' is $\frac{(n-k-1)}{(n + k + 1)} \times E,$ where $E$ is the expected value of the MSE for the '''testing'''/'''validation''' data. Therefore, when we fit a model and compute the MSE on the training set, we get an overly optimistic assessment (smaller MSE) of how well the model may fit another dataset. This biased value is the ''in-sample'' estimate of the fit, whereas the cross-validation estimate is an ''out-of-sample'' estimate, which is what we are interested in.

In the linear regression model, cross-validation is not strictly necessary, as we can compute the '''exact''' correction factor $\frac{(n - k - 1)}{(n + k + 1)}$ and correctly estimate the ''out-of-sample'' fit from the ''in-sample'' MSE estimate (which underestimates it). However, even in this situation, cross-validation remains useful as it can be used to select an optimal regularized cost function.
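
The factor $\frac{(n - k - 1)}{(n + k + 1)}$ can be checked with a short simulation. The sketch below makes several assumptions that are not part of the text above: Gaussian errors, arbitrary values of $n$, $k$ and the coefficients, and a "validation" set consisting of fresh responses observed at the same covariate values as the training data (a fixed-design setting).

```r
set.seed(2016)                          # arbitrary seed
n <- 30; k <- 5; sigma <- 1             # illustrative sizes (assumptions)
X <- matrix(rnorm(n * k), n, k)         # covariates, held fixed across replicates
beta  <- rep(1, k)
n_rep <- 2000
mse_train <- mse_test <- numeric(n_rep)

for (r in seq_len(n_rep)) {
  y     <- as.numeric(1 + X %*% beta + rnorm(n, sd = sigma))   # training responses
  y_new <- as.numeric(1 + X %*% beta + rnorm(n, sd = sigma))   # fresh responses, same covariates
  y_hat <- fitted(lm(y ~ X))
  mse_train[r] <- mean((y - y_hat)^2)
  mse_test[r]  <- mean((y_new - y_hat)^2)
}

ratio_simulated <- mean(mse_train) / mean(mse_test)
ratio_formula   <- (n - k - 1) / (n + k + 1)
c(simulated = ratio_simulated, formula = ratio_formula)
```

Under these assumptions the simulated ratio should land close to the value given by the formula, which is about 0.67 for $n=30$ and $k=5$.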

In most other modeling procedures (e.g., logistic regression), there are '''in general''' no simple closed-form expressions (formulas) for deriving the out-of-sample error estimate from the in-sample fit estimate. Cross-validation is a generally applicable way to predict the performance of a model on a validation set using stochastic computation instead of obtaining experimental, theoretical, mathematical, or analytic error estimates.

====Cross-validation methods====

There are 2 classes of cross-validation approaches – exhaustive and non-exhaustive.

'''Exhaustive cross-validation'''

Exhaustive cross-validation methods are based on determining all possible ways to divide the original sample into training and testing data. For example, the ''Leave-m-out cross-validation'' involves using $m$ observations for testing and the remaining ($n-m$) observations for training (when $m=1$, this is the leave-1-out method). This process is repeated on all partitions of the original sample. This method requires fitting and validating the model $C_m^n = \binom{n}{m}$ times ($n$ is the total number of observations in the original sample and $m$ is the number left out for validation). This requires a very large number of steps<sup>1</sup>.
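
To give a feeling for how quickly the number of leave-m-out partitions grows, the following base R sketch counts and enumerates them for small, arbitrarily chosen values of $n$ and $m$ (these sizes are assumptions for illustration only).

```r
n <- 10; m <- 2                          # illustrative sizes (assumptions)
n_fits    <- choose(n, m)                # number of train/test partitions
test_sets <- combn(n, m)                 # each column holds the m held-out indices

first_test  <- test_sets[, 1]                       # held-out cases of the first partition
first_train <- setdiff(seq_len(n), first_test)      # remaining n - m training cases

# The count grows quickly with n and m
growth <- c(choose(10, 2), choose(100, 2), choose(100, 10))
```

Even for a modest sample of 100 observations, leaving out 10 at a time already requires more than $10^{13}$ model fits, which is why non-exhaustive schemes are usually preferred.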

'''Non-exhaustive cross-validation'''

Non-exhaustive cross-validation methods avoid computing estimates/errors using all possible partitionings of the original sample, and instead approximate them. For example, in '''''k''-fold cross-validation''', the original sample is randomly partitioned into $k$ equal-sized subsamples. Of the $k$ subsamples, a single subsample is kept as the testing data for validation of the model. The other $k-1$ subsamples are used as training data. The cross-validation process is then repeated $k$ times ($k$ folds), so that each of the $k$ subsamples is used exactly once as the validation data. The corresponding $k$ results are averaged (or otherwise aggregated) to generate a final model-quality estimate. In $k$-fold validation, all observations are used for both training and validation, and each observation is used for validation exactly once. In general, $k$ is a parameter that needs to be selected by the investigator (common values are 5 and 10).
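
A minimal base R sketch of the $k$-fold procedure is shown below. The dataset (the built-in attitude data), the linear model, the MSE as the validation measure, and $k=5$ are assumptions made only for this illustration.

```r
set.seed(1234)                           # arbitrary seed
data("attitude")
n <- nrow(attitude); k <- 5
fold <- sample(rep(seq_len(k), length.out = n))   # random fold label for each observation
fold_mse <- numeric(k)

for (j in seq_len(k)) {
  test_rows <- which(fold == j)                              # fold j is the testing data
  fit  <- lm(rating ~ ., data = attitude[-test_rows, ])      # train on the other k-1 folds
  pred <- predict(fit, attitude[test_rows, ])
  fold_mse[j] <- mean((attitude[test_rows, "rating"] - pred)^2)
}

cv_mse <- mean(fold_mse)                 # aggregate the k results
```

Because the fold labels form a partition of the rows, every observation is used for validation exactly once.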

A special case of the $k$-fold validation is $k=n$ (the total number of observations), when it coincides with the '''leave-one-out cross-validation'''.

A variation of the $k$-fold validation is '''stratified k-fold cross-validation''', where each fold has the same (approximately) mean response value. For instance, if the model represents a binary classification of cases (e.g., NC vs. PD), this implies that each fold contains roughly the same proportion of the 2 class labels.
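
One simple way to build stratified folds for a binary outcome is to assign fold labels separately within each class. The sketch below is an illustration in base R; the label vector with a 3:1 class ratio and the number of folds are hypothetical.

```r
set.seed(1234)                           # arbitrary seed
# Hypothetical binary labels with a 3:1 class ratio
y <- factor(rep(c("NC", "PD"), times = c(60, 20)))
k <- 5
fold <- integer(length(y))

for (cls in levels(y)) {
  idx <- which(y == cls)
  # deal the members of each class evenly across the k folds, in random order
  fold[idx] <- sample(rep(seq_len(k), length.out = length(idx)))
}

class_by_fold <- table(fold, y)          # class counts within each fold
class_by_fold
```

Each fold then preserves the 3:1 ratio of the two class labels found in the full sample.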

'''Repeated random sub-sampling validation''' randomly splits the entire dataset into training data (where the model is fit) and testing data (where the predictive accuracy is assessed). Again, the results are averaged over all iterative splits. This method has an advantage over $k$-fold cross-validation in that the proportion of the training/testing split does not depend on the number of iterations (folds). However, its drawback is that some observations may never be selected for the testing/validation subsample, whereas others may be selected multiple times, as validation subsets may overlap, and the results will vary each time we repeat the validation protocol (unless we set a seed point in the algorithm).

Asymptotically, as the number of random splits increases, the ''repeated random sub-sampling validation'' approaches the ''leave-m-out cross-validation''.

====Case-Studies====

'''Example 1: Parkinson’s Disease Study''' involving neuroimaging, genetics, clinical, and phenotypic data for over 600 volunteers produced multivariate data for 3 cohorts (HC=Healthy Controls, PD=Parkinson’s, SWEDD=subjects without evidence for dopaminergic deficit).

# update packages
# update.packages()

#load the data: 06_PPMI_ClassificationValidationData_Short.csv
ppmi_data <-read.csv("https://umich.instructure.com/files/330400/download?download_frd=1",header=TRUE)

#binarize the Dx classes
ppmi_data$\$$ResearchGroup <- ifelse(ppmi_data$\$$ResearchGroup == "Control", "Control", "Patient")
attach(ppmi_data)

head(ppmi_data)

# Model-free analysis, classification
# install.packages("crossval")
#library("crossval")
require(crossval)
require(ada)
#set up adaboosting prediction function

#Define a new classification result-reporting function
my.ada <- function (train.x, train.y, test.x, test.y, negative, formula){
ada.fit <- ada(train.x, train.y)
predict.y <- predict(ada.fit, test.x)
#count TP, FP, TN, FN, Accuracy, etc.
out <- confusionMatrix(test.y, predict.y, negative = negative)
#negative is the label of a negative "null" sample (default: "control").
return (out)
}

#balance cases
#SMOTE: Synthetic Minority Oversampling Technique to handle class imbalance in binary classification.
set.seed(1000)
#install.packages("unbalanced") to deal with unbalanced group data
require(unbalanced)
ppmi_data$\$$PD <- ifelse(ppmi_data$\$$ResearchGroup=="Control",1,0)
uniqueID <- unique(ppmi_data$\$$FID_IID)
ppmi_data <- ppmi_data[ppmi_data$\$$VisitID==1,]
'''ppmi_data$\$$PD <- factor(ppmi_data$\$$PD)'''

colnames(ppmi_data)
#ppmi_data.1<-ppmi_data[,c(3:281,284,287,336:340,341)]
n <- ncol(ppmi_data)
output.1 <- ppmi_data$\$$PD

# remove Default Real Clinical subject classifications!
ppmi_data$\$$PD <- ifelse(ppmi_data$\$$ResearchGroup=="Control",1,0)
input <- ppmi_data[ ,-which(names(ppmi_data) %in% c("ResearchGroup","PD", "X", "FID_IID"))]
# output <- as.matrix(ppmi_data[ ,which(names(ppmi_data) %in% {"PD"})])
output <- as.factor(ppmi_data$\$$PD)
c(dim(input), dim(output))

# balance the dataset
data.1<-ubBalance(X= input, Y=output, type="ubSMOTE", percOver=300, percUnder=150, verbose=TRUE)
balancedData<-cbind(data.1$\$$X, data.1$\$$Y)
nrow(data.1$\$$X); ncol(data.1$\$$X)
nrow(balancedData); ncol(balancedData)
nrow(input); ncol(input)

colnames(balancedData) <- c(colnames(input), "PD")

###Check balance
##T test
alpha.0.05 <- 0.05
test.results.bin <- NULL # binarized/dichotomized p-values
test.results.raw <- NULL # raw p-values

# get a better error-handling t.test function that gracefully handles NA’s and trivial variances
my.t.test.p.value <- function(input1, input2) {
obj <- try(t.test(input1, input2), silent=TRUE)
if (is(obj, "try-error")) return(NA) else return('''obj$\$$p.value''')
}

for (i in 1:(ncol(balancedData)-1)) # skip the last column ("PD", the class label), which is not in input
{
test.results.raw[i] <- my.t.test.p.value(input[,i], balancedData [,i])
test.results.bin[i] <- ifelse(test.results.raw[i] > alpha.0.05, 1, 0)
# binarize the p-value (0=significant, 1=otherwise)
print(c("i=", i, "var=", colnames(balancedData[i]), "t-test_raw_p_value=", test.results.raw[i]))
}

#we can also employ (e.g., FDR, Bonferonni) '''correction for multiple testing'''!
#test.results.corr <- '''stats::p.adjust'''(test.results.raw, method = "fdr", n = length(test.results.raw))
#where methods are "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none")
#plot(test.results.raw, test.results.corr)
#sum(test.results.raw < alpha.0.05, na.rm=T)/length(test.results.raw) #check proportion of inconsistencies
#sum(test.results.corr < alpha.0.05, na.rm =T)/length(test.results.corr)

#as the sample-size is changed:
length(input[,5]); length(balancedData [,5])
#to plot raw vs. rebalanced pairs (e.g., var="L_insular_cortex_Volume"), we need to equalize the lengths
plot (input[,5] +0*balancedData [,5], balancedData [,5]) # [,5] == "L_insular_cortex_Volume"

print(c("T-test results: ", test.results.bin))
#zeros (0) are significant independent between-group T-test differences, ones (1) are insignificant

for (i in 1:(ncol(balancedData)-1))
{
test.results.raw [i] <- wilcox.test(input[,i], balancedData [,i])$\$$p.value
test.results.bin [i] <- ifelse(test.results.raw [i] > alpha.0.05, 1, 0)
print(c("i=", i, "Wilcoxon-test=", test.results.raw [i]))
}
print(c("Wilcoxon test results: ", test.results.bin))
#test.results.corr <- '''stats::p.adjust'''(test.results.raw, method = "fdr", n = length(test.results.raw))
#where methods are "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none")
#plot(test.results.raw, test.results.corr)

'''#cross validation'''
#using '''raw data''':
X <- as.data.frame(input); Y <- output
neg <- "1" # "Control" == "1"

#using '''Rebalanced data''':
X <- as.data.frame(data.1$\$$X); Y <- data.1$\$$Y
#balancedData<-cbind(data.1$\$$X, data.1$\$$Y); dim(balancedData)

#'''Side note''': There is a function name collision for “crossval”, the same method is present in

#the “'''mlr'''” (machine Learning in R) package and in the “'''crossval'''” package.

#To specify a function call from a specific package do: packagename::functionname()

set.seed(115)
#cv.out <- crossval::crossval(my.ada, X, Y, K = 5, B = 1, negative = neg)
cv.out <- '''crossval::crossval'''(my.ada, X, Y, K = 5, B = 1, negative = neg)
#the label of a negative "null" sample (default: "control")
out <- diagnosticErrors(cv.out$\$$stat)

print(cv.out$\$$stat)
print(out)
{| class="wikitable" style="text-align:center; " border="1"
|-
|FP||TP||TN||FN
|-
|1.0||59.8||23.4||0.2
|}
{| class="wikitable" style="text-align:center; " border="1"
|-
|acc||sens||spec||ppv||npv||lor
|-
|0.9857820||0.9966667||0.9590164||0.9835526||0.9915254||8.8531796
|}
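
The summary measures in the second table can be reproduced from the averaged counts in the first one. The sketch below uses the standard definitions of these measures; that they match what diagnosticErrors() computes internally is an assumption, supported only by the fact that they recover the tabled values.

```r
# Averaged confusion-matrix counts from the first table above
FP <- 1.0; TP <- 59.8; TN <- 23.4; FN <- 0.2

diag_stats <- c(
  acc  = (TP + TN) / (TP + TN + FP + FN),   # accuracy
  sens = TP / (TP + FN),                    # sensitivity
  spec = TN / (TN + FP),                    # specificity
  ppv  = TP / (TP + FP),                    # positive predictive value
  npv  = TN / (TN + FN),                    # negative predictive value
  lor  = log((TP * TN) / (FP * FN))         # log-odds ratio
)
round(diag_stats, 7)
```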

#Define a new LDA = Linear discriminant analysis predicting function

require("MASS") # for lda function

'''predfun.lda = function(train.x, train.y, test.x, test.y, negative)'''
{ lda.fit = lda(train.x, grouping=train.y)
ynew = predict(lda.fit, test.x)$\$$class
#count TP, FP etc.
out = confusionMatrix(test.y, ynew, negative=negative)
return( out )
}

'''(1) a simple example using the sleep dataset''' (containing the effect of two soporific drugs to increase hours of sleep (treatment-compared design) on 10 patients)

data(sleep)
X = as.matrix(sleep[,1, drop=FALSE]) # increase in hours of sleep,
# drop is logical, if TRUE the result is coerced to the lowest possible dimension.
#The default is to drop if only one column is left, but not to drop if only one row is left.
Y = sleep[,2] # drug given
plot(X ~ Y)
levels(Y) # "1" "2"
dim(X) # 20 1

set.seed(123456)
cv.out <- '''crossval::crossval'''(predfun.lda, X, Y, K=5, B=20, negative="1")
cv.out$\$$stat
diagnosticErrors(cv.out$\$$stat)

'''(2) A model-based example (linear regression) using the attitude dataset:'''

'''#?attitude, colnames(attitude)'''

#"rating" "complaints" "privileges" "learning" "raises" "critical" "advance"

#aggregated survey of clerical employees of an organization, representing approximately 35 employees in each of 30

#(randomly selected) departments. Data=percent proportion of favorable responses to 7 questions in each department.

#Note: when using a data frame, a time-saver is to use “.” to indicate “include all covariates” in the DF.

#E.g., fit <- lm(Y ~ ., data = D)

data("attitude")
y = attitude[,1] # rating variable
x = attitude[,-1] # data frame with the remaining variables
is.factor(y)
summary( lm(y ~ . , data=x) ) # R-squared: 0.7326
#set up lm prediction function

'''predfun.lm = function(train.x, train.y, test.x, test.y)'''
{ lm.fit = lm(train.y ~ . , data=train.x)
ynew = predict(lm.fit, test.x )
#compute squared error risk (MSE)
out = mean( (ynew - test.y)^2)
#note that, in general, when fitting linear model to continuous outcome variable (Y),
#we can’t use the '''out<-confusionMatrix(test.y, ynew, negative=negative)''', as it requires a binary outcome
#this is why we use the MSE as an estimate of the discrepancy between observed & predicted values
return(out)
}

#prediction MSE using all variables
set.seed(123456)
cv.out.lm = '''crossval::crossval'''(predfun.lm, x, y, K=5, B=20) # predfun.lm takes no "negative" argument
c(cv.out.lm$\$$stat, cv.out.lm$\$$stat.se) # 72.581198 3.736784
#reducing to using only two variables
cv.out.lm = '''crossval::crossval'''(predfun.lm, x[,c(1,3)], y, K=5, B=20)
c(cv.out.lm$\$$stat, cv.out.lm$\$$stat.se) # 52.563957 2.015109
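What crossval() automates can be sketched by hand. The following base-R sketch is a minimal illustration, not the crossval implementation: it splits the attitude data into K=5 random folds, fits the linear model on four folds, and computes the MSE on the held-out fold. The names fold, fold.mse, and cv.mse are illustrative assumptions, and only a single repetition (B=1) is performed, so the numbers are not expected to match the output above.

```r
# Manual K-fold cross-validation of a linear model (base R only; single repetition)
data("attitude")
set.seed(123456)
K <- 5
n <- nrow(attitude)
fold <- sample(rep(seq_len(K), length.out = n))  # random fold label for every row
fold.mse <- sapply(seq_len(K), function(k) {
  train <- attitude[fold != k, ]                 # K-1 folds used for fitting
  test <- attitude[fold == k, ]                  # held-out fold used for testing
  fit <- lm(rating ~ ., data = train)
  mean((predict(fit, newdata = test) - test[["rating"]])^2)  # fold-specific MSE
})
cv.mse <- mean(fold.mse)                         # plays the role of the stat component
cv.mse
```

Repeating the random split B times and averaging would mimic the B argument of crossval().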

'''(3) a real example using the ppmi_data'''

#ppmi_data <-read.csv("https://umich.instructure.com/files/330400/download?download_frd=1",header=TRUE)
#ppmi_data$\$$ResearchGroup <- ifelse(ppmi_data$\$$ResearchGroup == "Control", "Control", "Patient")
#attach(ppmi_data); head(ppmi_data)
#install.packages("crossval")
#library("crossval")
#ppmi_data$\$$PD <- ifelse(ppmi_data$\$$ResearchGroup=="Control",1,0)
#input <- ppmi_data[ ,-which(names(ppmi_data) %in% c("ResearchGroup","PD", "X", "FID_IID"))]
#output <- as.factor(ppmi_data$\$$PD)

#remove the irrelevant variables (e.g., visit ID)
output <- as.factor(ppmi_data$\$$PD)
input <- ppmi_data[, -which(names(ppmi_data) %in% c("ResearchGroup","PD", "X", "FID_IID", "VisitID"))]
X = as.matrix(input) # Predictor variables
Y = as.matrix(output) # Actual PD clinical assessment
dim(X); dim(Y)

layout(matrix(c(1,2,3,4),2,2)) # optional 4 graphs/page
fit <- lm(Y~X); plot(fit) # plot the fit
levels(as.factor(Y)) # "0" "1"
c(dim(X), dim(Y)) # 1043 103

set.seed(12345)
#cv.out.lm = '''crossval::crossval'''(predfun.lm, as.data.frame(X), as.numeric(Y), K=5, B=20)

cv.out.lda = crossval::crossval(predfun.lda, X, Y, K=5, B=20, negative="1")
#K=Number of folds; '''B=Number of repetitions.'''

#Results

cv.out.lda$\$$stat; cv.out.lda; diagnosticErrors(cv.out.lda$\$$stat)
cv.out.lm$\$$stat; cv.out.lm; diagnosticErrors(cv.out.lm$\$$stat)

'''The cross-validation (CV) output object includes the following components:'''

*stat.cv: ''Vector'' of statistics returned by predfun for each cross validation run

*stat: ''Mean'': the statistic returned by predfun averaged over all cross validation runs

*stat.se: ''Variability'': the corresponding standard error.

{| class="wikitable" style="text-align:center; " border="1"
|-
|FP||TP||TN||FN
|-
|0.06||96.94||33.14||2.06
|}

{| class="wikitable" style="text-align:center; " border="1"
|-
|acc||sens||spec||ppv||npv||lor
|-
|0.9839637||0.9791919||0.9981928||0.9993814||0.9414773||10.1655380
|}
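The diagnostic measures in the second table follow directly from the averaged counts in the first one. The sketch below recomputes them with the standard definitions in base R; it is meant as a check of the arithmetic and is not the source code of diagnosticErrors().

```r
# Diagnostic measures from the averaged confusion-matrix counts reported above
FP <- 0.06; TP <- 96.94; TN <- 33.14; FN <- 2.06
acc <- (TP + TN) / (FP + TP + TN + FN)   # accuracy
sens <- TP / (TP + FN)                   # sensitivity
spec <- TN / (TN + FP)                   # specificity
ppv <- TP / (TP + FP)                    # positive predictive value
npv <- TN / (TN + FN)                    # negative predictive value
lor <- log((TP * TN) / (FP * FN))        # log odds ratio
round(c(acc = acc, sens = sens, spec = spec, ppv = ppv, npv = npv, lor = lor), 4)
```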

[[Image:SMHS_BigDataBigSci_CrossVal4.png|500px]]

====Alternative predictor functions====

Logistic Regression

(See the earlier batch of class notes, https://umich.instructure.com/files/421847/download?download_frd=1)

#ppmi_data <- read.csv("https://umich.instructure.com/files/330400/download?download_frd=1",header=TRUE)
# ppmi_data$\$$ResearchGroup <- ifelse(ppmi_data$\$$ResearchGroup == "Control", "Control", "Patient")
#install.packages("crossval"); library("crossval")
#ppmi_data$\$$PD <- ifelse(ppmi_data$\$$ResearchGroup=="Control",1,0)

#remove the irrelevant variables (e.g., visit ID)

output <- as.factor(ppmi_data$\$$PD)
input <- ppmi_data[, -which(names(ppmi_data) %in% c("ResearchGroup","PD", "X", "FID_IID", "VisitID"))]
X = as.matrix(input) # Predictor variables
Y = as.matrix(output)

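'''Note that predict() returns values on the link (log-odds) scale, so they must be transformed before they can be interpreted. Exponentiating them (as done below) yields odds; applying the inverse logit, plogis(), or calling predict(..., type="response"), yields probabilities.'''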

lm.logit <- glm(as.numeric(Y) ~ ., data = as.data.frame(X), family = "binomial")
ynew <- predict(lm.logit, as.data.frame(X)); plot(ynew)
ynew2 <- ifelse(exp(ynew)<0.5, 0, 1); plot(ynew2)

'''predfun.logit = function(train.x, train.y, test.x, test.y, neg)'''
{ lm.logit <- glm(train.y ~ ., data = train.x, family = "binomial")
ynew = predict(lm.logit, test.x )
#compute TP, FP, TN, FN
ynew2 <- ifelse(exp(ynew)<0.5, 0, 1)
out = confusionMatrix(test.y, ynew2, negative=neg) # Binary outcome, we can use confusionMatrix
return( out )
}

#Reduce the bag of explanatory variables, purely to simplify the interpretation of the analytics in this example!

input.short <- input[, which(names(input) %in% c("R_fusiform_gyrus_Volume",
"R_fusiform_gyrus_ShapeIndex", "R_fusiform_gyrus_Curvedness",
"Sex", "Weight", "Age" , "chr12_rs34637584_GT", "chr17_rs11868035_GT",
"UPDRS_Part_I_Summary_Score_Baseline", "UPDRS_Part_I_Summary_Score_Month_03",
"UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Baseline",
"UPDRS_Part_III_Summary_Score_Baseline",
"X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Baseline"
))]
X = as.matrix(input.short)

cv.out.logit = '''crossval::crossval'''(predfun.logit, as.data.frame(X), as.numeric(Y), K=5, B=2, neg="1")
cv.out.logit$\$$stat.cv
diagnosticErrors(cv.out.logit$\$$stat)

Caution: If you forget to transform the predicted logistic model values from the log-odds scale (see ynew2 in predfun.logit), you will get nonsense results (e.g., all cases are predicted to be in one class, trivial sensitivity or NPV).
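To make the scales concrete, the following base-R sketch converts a few hypothetical linear-predictor values (eta, an illustrative vector and not output from the PPMI model) into odds and probabilities, and shows that a 0.5 cutoff applied to the odds is not the same rule as a 0.5 cutoff applied to the probability.

```r
# Log-odds, odds, and probabilities for a few illustrative linear-predictor values
eta <- c(-2, -0.5, 0, 2)                 # hypothetical log-odds, the scale of predict(glm.fit)
odds <- exp(eta)                         # exponentiating yields odds
prob <- plogis(eta)                      # inverse logit yields probabilities: 1/(1+exp(-eta))
cbind(eta, odds, prob)
class.odds <- ifelse(odds < 0.5, 0, 1)   # cutoff applied to the odds
class.prob <- ifelse(prob < 0.5, 0, 1)   # cutoff applied to the probability
rbind(class.odds, class.prob)            # the second case is classified differently
```

An odds cutoff of 0.5 corresponds to a probability cutoff of 1/3, so the choice of scale changes the classification rule.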

predfun.qda = function(train.x, train.y, test.x, test.y, negative)
{
require("MASS") # for qda function
qda.fit = qda(train.x, grouping=train.y)
ynew = predict(qda.fit,test.x)$\$$class
out.qda = confusionMatrix(test.y, ynew, negative=negative)
return( out.qda )
}

cv.out.qda = '''crossval::crossval'''(predfun.qda, as.data.frame(input.short), as.factor(Y), K=5, B=20, neg="1")
diagnosticErrors(cv.out.lda$\$$stat); diagnosticErrors(cv.out.qda$\$$stat);

This error message: “Error in qda.default(x, grouping, ...) : rank deficiency in group 1” indicates that there is a rank deficiency, i.e. some variables are collinear and one or more covariance matrices cannot be inverted to obtain the estimates in group 1 (Controls)!

If you remove the strongly correlated data elements ("R_fusiform_gyrus_Volume","R_fusiform_gyrus_ShapeIndex", and "R_fusiform_gyrus_Curvedness"), the rank-deficiency problem goes away!

input.short2 <- input[, which(names(input) %in% c("R_fusiform_gyrus_Volume",
"Sex", "Weight", "Age" , "chr17_rs11868035_GT",
"UPDRS_Part_I_Summary_Score_Baseline",
"UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Baseline",
"UPDRS_Part_III_Summary_Score_Baseline",
"X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Baseline"
))]
X = as.matrix(input.short2)
cv.out.qda = crossval::crossval(predfun.qda, as.data.frame(X), as.numeric(Y), K=5, B=2, neg="1")

Compare the QDA and GLM/Logit predictions:

diagnosticErrors(cv.out.qda$\$$stat); diagnosticErrors(cv.out.logit$\$$stat)

1http://www.ohrt.com/odds/binomial.php

==See also==
* [[SMHS_BigDataBigSci_CrossVal_LDA_QDA| Next Section: Foundation of LDA and QDA for prediction, dimensionality reduction or forecasting]]
* [[SMHS_BigDataBigSci_SEM| Structural Equation Modeling (SEM)]]
* [[SMHS_BigDataBigSci_GCM| Growth Curve Modeling (GCM)]]
* [[SMHS_BigDataBigSci_GCM| Generalized Estimating Equation (GEE) Modeling]]
* [[SMHS_BigDataBigSci|Back to Big Data Science]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php?title=SMHS_BigDataBigSci_CrossVal}}

SMHS BigDataBigSci SEM

2016-05-24T14:07:27Z

Pineaumi: /* See also */

==[[SMHS_BigDataBigSci| Model-based Analytics]] - Structural Equation Modeling (SEM) ==

SEM allows re-parameterization of random effects to specify latent variables that may affect measures at different time points using structural equations. SEM shows variables having predictive (possibly causal) effects on other variables (denoted by arrows), where coefficients index the strength and direction of predictive relations. SEM does not offer much more than what classical regression methods do, but it does allow simultaneous estimation of multiple equations modeling complementary relations.

SEM is a general multivariate statistical analysis technique that can be used for causal modeling/inference, path analysis, confirmatory factor analysis (CFA), covariance structure modeling, and correlation structure modeling.

===SEM Advantages===
* Allows testing models with multiple dependent variables
* Provides mechanisms for modeling mediating variables
* Enables modeling of error terms
* Facilitates modeling of challenging data (longitudinal with auto-correlated errors, multi-level data, non-normal data, incomplete data)

SEM allows separation of observed and latent variables. Other standard statistical procedures may be viewed as special cases of SEM. Statistical significance is less important in SEM than in other techniques, and covariances are the core of structural equation models.

===Definitions===
*The disturbance, D, is the variance in Y unexplained by a variable X that is assumed to affect Y.
X → Y ← D

* Measurement error, E, is the variance in X unexplained by A, where X is an observed variable that is presumed to measure a latent variable, A.
A → X ← E

* Categorical variables in a model are exogenous (independent) or endogenous (dependent).

===Notation===

* In SEM, observed (or manifest) indicators are represented by squares/rectangles, whereas latent variables (or factors) are represented by circles/ovals.

<center>[[Image:SMHS_BigDataBigSci1.png|500px]]</center>

*'''Relations: Direct effects''' (→), '''Reciprocal effects''' (↔ or ⇆), and '''Correlation or covariance''' (⤻ or ⤺) all have different appearance in SEM models.

===Model Components===

The measurement part of an SEM model deals with the latent variables and their indicators. A pure measurement model is a confirmatory factor analysis (CFA) model with unmeasured covariance (bidirectional arrows) between each possible pair of latent variables. There are straight arrows from the latent variables to their respective indicators and straight arrows from the error and disturbance terms to their respective variables, but no direct effects (straight arrows) connecting the latent variables. The measurement model is evaluated using goodness of fit measures (Chi-Square test, BIC, AIC, etc.). Validation of the measurement model always comes first.

Then we proceed to the structural model, which includes a set of exogenous and endogenous variables together with the direct effects (straight arrows) connecting them, along with the disturbance and error terms for these variables that reflect the effects of unmeasured variables not in the model.

===Notes===

* Sample-size considerations: mostly same as for regression - more is always better.
* Model assessment strategies: Chi-square test, Comparative Fit Index, Root Mean Square Error of Approximation, Tucker Lewis Index, Goodness of Fit Index, AIC, and BIC.
* Choice for number of Indicator variables: depends on pilot data analyses, a priori concerns, fewer is better.

===[[SMHS_BigDataBigSci_SEM_Ex1|Hands-on Example 1 (School Kids Mental Abilities)]]===

===[[SMHS_BigDataBigSci_SEM_Ex2|Hands-on Example 2 (Parkinson’s Disease data)]]===

==See also==
* [[SMHS_BigDataBigSci| Back to Model-based Analytics]]
* [[SMHS_BigDataBigSci_GCM| Next Section: Growth Curve Modeling]]
* [[SMHS_BigDataBigSci_GCM| Next Section: Generalized Estimating Equation (GEE) Modeling]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php?title=SMHS_BigDataBigSci_SEM}}

SMHS BigDataBigSci

2016-05-24T13:57:39Z

Pineaumi: /* Generalized Estimating Equation (GEE) Modeling */

==[[SMHS| Scientific Methods for Health Sciences]] - Model-based Analyses ==

Structural Equation Modeling (SEM), Growth Curve Models (GCM), and Generalized Estimating Equation (GEE) Modeling

==Questions ==

*How to represent dependencies in linear models and examine causal effects?
*Is there a way to study population average effects of a covariate against specific individual effects?

==Overview==

SEM allows re-parameterization of random effects to specify latent variables that may affect measures at different time points using structural equations. SEM shows variables having predictive (possibly causal) effects on other variables (denoted by arrows), where coefficients index the strength and direction of predictive relations. SEM does not offer much more than what classical regression methods do, but it does allow simultaneous estimation of multiple equations modeling complementary relations.

Growth Curve (or latent growth) modeling is a statistical technique employed in SEM for estimating growth trajectories for longitudinal data (over time). It represents repeated measures of dependent variables as functions of time and other covariates. When subjects or units are observed repeatedly over known time points, latent growth curve models reveal the trend of an individual as a function of an underlying growth process where the growth curve parameters can be estimated for each subject/unit.

GEE is a marginal longitudinal method that directly assesses the mean relations of interest (i.e., how the mean dependent variable changes over time), accounting for covariances among the observations within subjects, and getting a better estimate and valid significance tests of the relations. Thus, GEE estimates two different equations, (1) for the mean relations, and (2) for the covariance structure. An advantage of GEE over random-effect models is that it does not require the dependent variable to be normally distributed. However, a disadvantage of GEE is that it is less flexible and versatile – commonly employed algorithms for it require a small-to-moderate number of time points evenly (or approximately evenly) spaced, and similarly spaced across subjects. Nevertheless, it is a little more flexible than repeated-measure ANOVA because it permits some missing values and has an easy way to test for and model away the specific form of autocorrelation within subjects.

GEE is mostly used when the study is focused on uncovering the population average effect of a covariate vs. the individual specific effect. These two effects are equivalent for linear models, but not for non-linear models.

For instance, suppose $Y_{i,j}$, the $j^{th}$ observation of the $i^{th}$ subject, follows the random effects logistic model
$
\log\Bigg(\frac{p_{i,j}}{1-p_{i,j}} \Bigg)=μ+ν_i,
$
where $ν_i \sim N(0,σ^2)$ is a random effect for subject $i$ and $p_{i,j}=P(Y_{i,j}=1|ν_i).$

(1) When using a random effects model on such data, the estimate of μ accounts for the fact that a mean zero normally distributed perturbation was applied to each individual, making it ''individual-specific''.

(2) When using a GEE model on the same data, we estimate the population average log odds,

\begin{equation}
δ=\log\Bigg(\frac{E_{ν}\big(\frac{1}{1+e^{-μ-ν_i}}\big)}{1-E_{ν}\big(\frac{1}{1+e^{-μ-ν_i}}\big)}
\Bigg),
\end{equation}

in general $μ≠δ$.

If $μ=1$ and $σ^2=1$, then $δ≈.83$.

empirically:

m <- 1; s <- 1; v<-rnorm(1000, 0,s); v2 <- 1/(1+exp(-m+v)); v_mean <- mean(v2)

d <- log(v_mean/(1-v_mean)); d
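The Monte Carlo estimate above varies from run to run because no seed is set and only 1000 draws are used. As a cross-check, the expectation can also be evaluated by numerical integration over the normal random effect. This is a minimal base-R sketch; the names integrand, p.avg, and d.exact are illustrative assumptions.

```r
# Population-average log odds by numerical integration over the random effect
m <- 1; s <- 1
integrand <- function(v) plogis(m + v) * dnorm(v, mean = 0, sd = s)  # P(Y=1 | v) times the density of v
p.avg <- integrate(integrand, lower = -Inf, upper = Inf)[["value"]]  # E_v[ P(Y=1 | v) ]
d.exact <- log(p.avg / (1 - p.avg))                                  # population-average log odds
d.exact                                                              # smaller than m = 1
```

The result should agree with the value of roughly 0.83 quoted above, up to integration error.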

Note that the random effects have mean zero on the transformed, linked, scale, but their effect is not mean zero on the original scale of the data. We can also simulate data from a mixed effects logistic regression model and compare the population level average with the inverse-logit of the intercept to see that they are not equal. This leads to a difference of the interpretation of the coefficients between GEE and random effects models, or SEM.

That is, there will be a difference between the GEE population average coefficients and the individual specific coefficients (random effects models).

Theoretically, if it can be computed:

the expected individual-specific log odds are $E\Bigg(\log\Bigg[\frac{P(Y_{i,j}=1|ν_i)}{1-P(Y_{i,j}=1|ν_i)}\Bigg]\Bigg)=μ=1$ (in this specific case), but the population average log odds, $δ$, would be $< 1$ (see footnote 1).
Note that this is related to the fact that a grand-total average need not be equal to an average of partial averages.

The mean of the $i^{th}$ person in the $j^{th}$ observation (e.g., location, time, etc.) can be expressed by:

$E(Y_{ij} | X_{ij},α_j)= g[μ(X_{ij}|β)+U_{ij}(α_j,X_{ij})]$,

where $μ(X_{ij}|β)$ is the average “response” of a person with the same covariates $X_{ij}$, $β$ is a set of fixed effect coefficients, $U_{ij}(α_j,X_{ij})$ is an error term that is a function of the (time, space) random effects, $α_j$, and also a function of the covariates $X_{ij}$, and $g$ is the inverse of the '''link function''', $g^{-1}$, which specifies the regression type -- e.g.,

*'''linear:''' $g^{-1} (u)=u,$

*'''log:''' $g^{-1} (u)= \log(u),$

*'''logistic:''' $g^{-1} (u)=\log(\frac{u}{1-u})$

*$E(U_{ij}(α_j,X_{ij})|X_{ij})=0.$

The link function, $g^{-1}(u)$, provides the relationship between the mean of the distribution function and the linear predictor. For practical applications there are many commonly used link functions. It makes sense to try to match the domain of the link function to the range of the distribution function's mean.
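The link and mean-function pairs listed above can be inspected in base R through stats::make.link(), which returns both directions of the mapping. The sketch below is only an illustration; the variable names lk.logit, lk.log, p, eta, and p.back are assumptions.

```r
# Link and inverse-link (mean) functions for common regression types (base R, stats)
lk.logit <- make.link("logit")
lk.log <- make.link("log")
p <- 0.3
eta <- lk.logit[["linkfun"]](p)        # log(p/(1-p)): maps a mean in (0,1) to the real line
p.back <- lk.logit[["linkinv"]](eta)   # 1/(1+exp(-eta)): maps the linear predictor back to (0,1)
c(eta = eta, p.back = p.back)
lk.log[["linkfun"]](2)                 # log link applied to a positive mean: log(2)
```

The logit link illustrates the matching of domains: the mean of a binary outcome lies in (0,1), while the linear predictor may take any real value.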

<center>Common distributions with typical uses and canonical link functions</center>
<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
|Distribution ||Support of distribution||Typical uses||Link name||Link function||Mean function
|-
|Normal||real: $(-∞, +∞)$||Linear-response data||Identity||$X\beta=\mu$||$\mu=X\beta$
|-
|Exponential, Gamma||real:$(0, +∞)$||Exponential-response data, scale parameters||Inverse||$X\beta=-\mu^{-1}$||$\mu=-(X\beta)^{-1}$
|-
|Inverse Gaussian||real:$(0, +∞)$|| ||Inverse squared||$X\beta=-\mu^{-2}$||$\mu=(-X\beta)^{-1/2}$
|}
</center>

===Footnotes===

*1 http://www.researchgate.net/publication/41895248

==Model-based Analytics==

===[[SMHS_BigDataBigSci_SEM| Structural Equation Modeling (SEM)]]===

===[[SMHS_BigDataBigSci_GCM| Growth Curve Modeling (GCM)]]===

===[[SMHS_BigDataBigSci_GCM| Generalized Estimating Equation (GEE) Modeling]]===

===[[SMHS_BigDataBigSci_CrossVal|Internal Validation - Statistical n-fold cross-validation]]===

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php?title=SMHS_BigDataBigSci}}

SMHS BigDataBigSci GCM

2016-05-23T20:00:01Z

Pineaumi: /* See also */

==[[SMHS_BigDataBigSci| Model-based Analytics]] - Growth Curve Models==

Latent growth curve models may be used to analyze longitudinal or temporal data where the outcome measure is assessed on multiple occasions, and we examine its change over time, e.g., the trajectory over time can be modeled as a linear or quadratic function. Random effects are used to capture individual differences by conveniently representing (continuous) latent variables, aka growth factors. To fit a linear growth model we may specify a model with two latent variables: a random intercept, and a random slope:

#load data 05_PPMI_top_UPDRS_Integrated_LongFormat.csv ( dim(myData) 661 71), wide
# setwd("/dir/")
myData <- read.csv("https://umich.instructure.com/files/330395/download?download_frd=1&verifier=v6jBvV4x94ka3EYcGKuXXg5BZNaOLBVp0xkJih0H",header=TRUE)
attach(myData)

# dichotomize the "ResearchGroup" variable
table(myData$\$$ResearchGroup)
myData$\$$ResearchGroup <- ifelse(myData$\$$ResearchGroup == "Control", 1, 0)

# linear growth model with 4 timepoints
# intercept (i) and slope (s) with fixed coefficients
# i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4 (intercept/constant)
# s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4 (slope/linear term)
# ??? =~ 0*t1 + 1*t2 + 4*t3 + 9*t4 (quadratic term: squared time scores)

In this model, we have fixed all the coefficients of the linear growth functions:

model4 <-
'
i =~ 1*UPDRS_Part_I_Summary_Score_Baseline + 1*UPDRS_Part_I_Summary_Score_Month_03 +
1*UPDRS_Part_I_Summary_Score_Month_06 + 1*UPDRS_Part_I_Summary_Score_Month_09 +
1*UPDRS_Part_I_Summary_Score_Month_12 + 1*UPDRS_Part_I_Summary_Score_Month_18 +
1*UPDRS_Part_I_Summary_Score_Month_24 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Baseline +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_03 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_06 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_09 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_12 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_18 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_24 +
1*UPDRS_Part_III_Summary_Score_Baseline + 1*UPDRS_Part_III_Summary_Score_Month_03 +
1*UPDRS_Part_III_Summary_Score_Month_06 + 1*UPDRS_Part_III_Summary_Score_Month_09 +
1*UPDRS_Part_III_Summary_Score_Month_12 + 1*UPDRS_Part_III_Summary_Score_Month_18 +
1*UPDRS_Part_III_Summary_Score_Month_24 +
1*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Baseline +
1*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_06 +
1*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_12 +
1*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_24 +
1*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Baseline +
1*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_06 +
1*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_12 +
1*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_24
s =~ 0*UPDRS_Part_I_Summary_Score_Baseline + 1*UPDRS_Part_I_Summary_Score_Month_03 +
2*UPDRS_Part_I_Summary_Score_Month_06 + 3*UPDRS_Part_I_Summary_Score_Month_09 +
4*UPDRS_Part_I_Summary_Score_Month_12 + 5*UPDRS_Part_I_Summary_Score_Month_18 +
6*UPDRS_Part_I_Summary_Score_Month_24 +
0*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Baseline +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_03 +
2*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_06 +
3*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_09 +
4*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_12 +
5*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_18 +
6*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_24 +
0*UPDRS_Part_III_Summary_Score_Baseline + 1*UPDRS_Part_III_Summary_Score_Month_03 +
2*UPDRS_Part_III_Summary_Score_Month_06 + 3*UPDRS_Part_III_Summary_Score_Month_09 +
4*UPDRS_Part_III_Summary_Score_Month_12 + 5*UPDRS_Part_III_Summary_Score_Month_18 +
6*UPDRS_Part_III_Summary_Score_Month_24 +
0*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Baseline +
2*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_06 +
4*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_12 +
6*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_24 +
0*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Baseline +
2*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_06 +
4*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_12 +
6*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_24
'

# growth() is provided by the lavaan package: install.packages("lavaan"); library("lavaan")
fit4 <- growth(model4, data=myData)
summary(fit4)
parameterEstimates(fit4) # extracts the values of the estimated parameters, the standard errors,
# the z-values, the standardized parameter values, and returns a data frame
fitted(fit4) # return the model-implied (fitted) covariance matrix (and mean vector) of a fitted model

# the resid() function returns the (unstandardized) residuals of a fitted model, including the difference between
# the observed and implied covariance matrix and mean vector
resid(fit4)

==Measures of model quality (Comparative Fit Index (CFI), Root Mean Square Error of Approximation (RMSEA))==

# report the fit measures as a signature vector: Comparative Fit Index (CFI), Root Mean Square Error of
# Approximation (RMSEA)
fitMeasures(fit4, c("cfi", "rmsea", "srmr"))

====Comparative Fit Index====

(CFI) is an incremental measure directly based on the non-centrality measure. If $d = χ^2 - df$, where $df$ are the degrees of freedom of the model, the Comparative Fit Index is:
$
\frac{d(\text{Null Model})-d(\text{Proposed Model})}{d(\text{Null Model})}.
$

$0≤CFI≤1$ (by definition). It is interpreted as:

*$CFI<0.9$ - model fitting is poor.

*$0.9≤CFI≤0.95$ is considered marginal,

*$CFI>0.95$ is good.

CFI is a relative index of model fit – it compares the fit of your model to the fit of the (worst-fitting) null model.

====Root Mean Square Error of Approximation====
(RMSEA) - “Ramsey”

An absolute measure of fit based on the non-centrality parameter:

$\sqrt{\frac{χ^2-df}{df×(N - 1)}}$,

where $N$ is the sample size and $df$ the degrees of freedom of the model. If $χ^2 < df$, then the RMSEA is set to 0. It has a penalty for complexity via the chi-square to df ratio. The RMSEA is a popular measure of model fit.

*RMSEA < 0.01, excellent,

*RMSEA < 0.05, good

*RMSEA > 0.10 cutoff for poor fitting models
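Both indices can be computed by hand from chi-square statistics. The sketch below uses hypothetical helper functions cfi() and rmsea() (they are not lavaan functions) that truncate the non-centrality estimates at zero. The RMSEA call uses the chi-square (3.703), degrees of freedom (1), and sample size (661) from the lavaan output for fit5 reported further below; the CFI call uses made-up values, because the null-model chi-square is not reported in this section.

```r
# CFI and RMSEA from chi-square statistics (illustrative helpers, not lavaan functions)
cfi <- function(chisq.null, df.null, chisq.model, df.model) {
  d.null <- chisq.null - df.null        # non-centrality of the null model
  d.model <- chisq.model - df.model     # non-centrality of the proposed model
  1 - max(d.model, 0) / max(d.model, d.null, 0)
}
rmsea <- function(chisq, df, N) sqrt(max(chisq - df, 0) / (df * (N - 1)))

rmsea(chisq = 3.703, df = 1, N = 661)   # about 0.064, as in the fit5 output
cfi(chisq.null = 100, df.null = 10, chisq.model = 12, df.model = 8)  # hypothetical inputs
```

Software may differ in details (for example, using N instead of N - 1 in the denominator), so small discrepancies from reported values are possible.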

====Standardized Root Mean Square Residual====
(SRMR) is an absolute measure of fit defined as the standardized difference between the observed correlation and the predicted correlation. A value of zero indicates perfect fit. The SRMR has no penalty for model complexity. SRMR <0.08 is considered a good fit.

# inspect the model results (report parameter table)
inspect(fit4)

#install.packages("semTools")
# library("semTools")

A Simpler Model (fit5)

model5 <- '
# intercept and slope with fixed coefficients
i =~ UPDRS_Part_I_Summary_Score_Baseline + UPDRS_Part_I_Summary_Score_Month_03 + UPDRS_Part_I_Summary_Score_Month_24
s =~ 0*UPDRS_Part_I_Summary_Score_Baseline + 1*UPDRS_Part_I_Summary_Score_Month_03 + 6*UPDRS_Part_I_Summary_Score_Month_24
# regressions
i ~ R_fusiform_gyrus_Volume + Weight + ResearchGroup + Age + chr12_rs34637584_GT
s ~ R_fusiform_gyrus_Volume + Weight + ResearchGroup + Age + chr12_rs34637584_GT
# time-varying covariates
UPDRS_Part_I_Summary_Score_Baseline ~ Weight
UPDRS_Part_I_Summary_Score_Month_03 ~ ResearchGroup
UPDRS_Part_I_Summary_Score_Month_24 ~ Age
'

fit5 <- growth(model5, data=myData)
summary(fit5); fitMeasures(fit5, c("cfi", "rmsea", "srmr"))
parameterEstimates(fit5) # extracts the values of the estimated parameters, the standard errors,
# the z-values, the standardized parameter values, and returns a data frame

lavaan (0.5-18) converged normally after 99 iterations
Number of observations 661
Estimator ML
Minimum Function Test Statistic 3.703
Degrees of freedom 1
P-value (Chi-square) 0.054
Parameter estimates:
Information Expected
Standard Errors Standard
Estimate Std.err Z-value P(>|z|)
Latent variables:
i =~
UPDRS_P_I_S_S 1.000
UPDRS_P_I_S_S 1.074
UPDRS_P_I_S_S 1.172
s =~
UPDRS_P_I_S_S 0.000
UPDRS_P_I_S_S 1.000
UPDRS_P_I_S_S 6.000

Regressions:
i ~
R_fsfrm_gyr_V 0.000
Weight 0.003
ResearchGroup -0.880
Age -0.009
c12_34637584_ -0.907
s ~
R_fsfrm_gyr_V -0.000
Weight -0.000
ResearchGroup -0.084
Age 0.002
c12_34637584_ -0.047
UPDRS_Part_I_Summary_Score_Baseline ~
Weight -0.000
UPDRS_Part_I_Summary_Score_Month_03 ~
ResearchGroup 0.693
UPDRS_Part_I_Summary_Score_Month_24 ~
Age -0.002

Covariances:
i ~~
s 0.074

Intercepts:
UPDRS_P_I_S_S 0.000
UPDRS_P_I_S_S 0.000
UPDRS_P_I_S_S 0.000
i 1.633
s -0.023

Variances:
UPDRS_P_I_S_S 1.017
UPDRS_P_I_S_S 1.093
UPDRS_P_I_S_S 2.993
i 1.019
s -0.025

cfi rmsea srmr
0.996 0.064 0.008

fitted(fit5) # return the model-implied (fitted) covariance matrix (and mean vector) of a fitted model
# write.table(fitted(fit5), file="C:\\Users\\Dinov\\Desktop\\test1.txt")

# the resid() function returns the (unstandardized) residuals of a fitted model, including the difference between
# the observed and implied covariance matrix and mean vector
resid(fit5)

# report the fit measures as a signature vector
fitMeasures(fit5, c("cfi", "rmsea", "srmr")) # comparative fit index (CFI)

# inspect the model results (report parameter table)
inspect(fit5)

Note: See discussion of SEM modeling pros/cons 2.

==Generalized Estimating Equation (GEE) Modeling==

Generalized Estimating Equations (GEE) modeling<sup>3</sup> is used for analyzing data with the following characteristics:
(1) the observations within a group may be correlated, (2) observations in separate clusters are independent, (3) a monotone transformation of the expectation is linearly related to the explanatory variables, and (4) the variance is a function of the expectation. The expectation (#3) and the variance (#4) are conditional given group-level or individual-level covariates.

GEE is applied to handle correlated discrete and continuous outcome variables. For the outcome variables, it only requires specification of the first 2 moments and the correlation among them. The goal is to estimate fixed parameters without specifying their joint distribution. The correlation is specified by one of these 4 alternatives, which is selected through the corstr argument of the R call, e.g., geeglm(outcome ~ center + treat + sex + baseline + age, data = respiratory, family = "binomial", id = id, corstr = "exchangeable", scale.fix = TRUE):

<center>[[Image:SMHS_BigDataBigSci8.png|300px]]</center>

===Respiratory Illness GEE R example===

This example is based on a data set on respiratory illness<sup>4</sup> and the geepack package. The data are from a clinical study of the treatment effects on patients with respiratory illness. N=111 patients from 2 clinical centers were randomized to receive either placebo or active treatment. 4 temporal examinations assessed the respiratory state of patients as good (=1) or poor (=0). Explanatory variables characterizing a patient were: center (1,2), treatment (A=active, P=placebo), sex (M=male, F=female), and age (in years) at baseline. The values of the covariates were constant for the repeated elementary observations on each patient.

Table 1 shows the number of patients for the response patterns across the 4 visits split by baseline-status and treatment. Patients with baseline respiratory status = 0 appear to have either a low or a high number of positive responses. Patients with baseline respiratory status = 1 tend to respond positively. Table 2 describes the distribution of the number of positive responses per patient for sex and center.

# library("geepack")

Table 1: Distribution of patients for different response patterns classified by baseline-respiratory response and treatment. The patterns are ordered according to increasing numbers of positive responses.

<center>
{| class="wikitable" style="text-align:center; width:75%" border="1"
|-
! ||Visit|| colspan="15"| All Possible Response Patterns (2*2*2*2=16 permutation patterns)||
|-
|||1||0||1||0||0||0||1||1||1||0||0||1||1||1||0||1||
|-
|||2||0||0||1||0||0||1||0||0||1||0||1||1||0||1||1||
|-
|||3||0||0||0||1||0||0||1||0||1||1||1||0||1||1||1||
|-
|||4||0||0||0||0||1||0||0||1||0||1||0||1||1||1||1||
|-
!Baseline||Treatment||||||||||||||||||||||||||||||||Sum
|-
| rowspan="2"|0||A||7||2||2||2||1||0||1||0||1||0||1||2||0||4||7||30
|-
|P||18||1||0||2||1||2||0||0||1||0||0||1||2||0||3||31
|-
|rowspan="2"|1||A||0||0||0||0||0||0||1||1||0||0||4||0||1||0||17||24
|-
|P||1||4||1||0||0||0||0||1||1||3||1||1||2||1||10||26
|-
|Sum||||26||7||3||4||2||2||2||2||3||3||6||4||5||5||37||111
|}
</center>

Table 2: Distribution of patients for the number of positive responses across the 4 visits for Sex and Center.

<center>
{| class="wikitable" style="text-align:center; width:75%" border="1"
|-
! colspan="2" rowspan="2"| ||colspan="5"|Number of positive responses
|-
| 0||1||2||3||4
|-
|rowspan="2"|Sex || F||7||3||3||3||7
|-
|M||19||13||9||17||30
|-
|rowspan="2"|Center|| 1||18||9||6||11||12
|-
|2||8||7||6||9||25
|}
</center>

Figure 1 shows a plot of age against the proportion of positive responses for each patient. It indicates a quadratic relationship between the proportions and the age. Below, we fit a logistic model with a quadratic age term to the data, which would be appropriate if there were no time effects and no spread in the response probabilities for patients with the same covariate values.

# install.packages("geepack")
library("geepack")

# the data come from a clinical trial in which 111 patients with respiratory illness from two different clinics were randomized to receive either
# placebo (P) or an active (A) treatment. Patients were examined at baseline and at four visits during treatment.
# At each examination, respiratory status was categorized as 1 = good, 0 = poor
data("respiratory")
head(respiratory)
myData <- respiratory

<center>head(myData)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Center||ID||Treat||Sex||Age||Baseline||Visit||Outcome
|-
|1 ||1||1||P||M||46||0||1||0
|-
|2 ||1||1||P||M||46||0||2||0
|-
|3 ||1||1||P||M||46||0||3||0
|-
|4 ||1||1||P||M||46||0||4||0
|-
|5||1||2||P||M||28||0||1||0
|-
|6||1||2||P||M||28||0||2||0
|}
</center>

# Get proportions of positive responses
responses <- factor(myData$outcome, labels = c("OutcomePositive", "OutcomeNegative"))
data.frame <- data.frame(responses, myData$age)
head(data.frame)
tab <- prop.table(table(data.frame), 1); tab # compute proportions
sum(tab[1,]) # check proportions (sums to 1.0)?
prop <- tab[1,] # save the proportions of positive responses for each patient
plot(as.numeric(dimnames(tab)$myData.age), tab[1,], xlab = "Age", ylab = "Proportion of Positive Outcomes")
# dimnames(tab) # to see/inspect positive/negative outcomes

[[Image:SMHS_BigDataBigSci9.png|500px]]

x <- as.numeric(dimnames(tab)$myData.age)
poly <- loess(prop ~ x) # local polynomial regression (LOESS) fit
plot(x, prop)
lines(x, predict(poly), col='red', lwd=2) # pass x explicitly so the curve is drawn against age, not the index

smoothingSpline <- smooth.spline(x, prop, spar=0.6)
plot(x, prop)
lines(smoothingSpline, col='red', lwd=1.5)
smoothPolySpline <- smooth.spline(x, predict(poly), spar=0.6)
lines(smoothPolySpline, col='blue', lwd=2)
legend("topright", inset=.05, title="Polynomial regression models", c("Raw Poly","Smooth Poly"), fill=c('red', 'blue'), horiz=TRUE)

[[Image:SMHS_BigDataBigSci10.png|500px]]

model.glm <- glm(outcome ~ baseline + center + sex + treat + age + I(age^2), data = respiratory, family = binomial)

summary(model.glm)

<center>Deviance Residuals:
{| class="wikitable" style="text-align:center; " border="1"
|-
|Min||1Q||Median||3Q||Max
|-
| -2.5951||-0.9108||0.4034||0.8336||2.0951
|}
</center>

<center>Coefficients:
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Estimate||Std. Error||z value||$Pr( \gt |z|)$
|-
|(Intercept)||3.3579727||1.0285292||3.265||0.0011 **
|-
|baseline||1.8850421||0.2482959||7.592||3.15e-14 ***
|-
|center||0.5099244||0.2453982||2.078||0.0377 *
|-
|sexM||-0.4510595||0.3166570||-1.424||0.1543
|-
|treatP||-1.3231587||0.2431603||-5.442||5.28e-08 ***
|-
|age||-0.2072815||0.0472538||-4.387||1.15e-05 ***
|-
|I(age^2)||0.0025650||0.0006324||4.056||4.99e-05 ***
|}
</center>

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 609.41 on 443 degrees of freedom

Residual deviance: 468.62 on 437 degrees of freedom

AIC: 482.62

The correlation matrix of the outcome measures across visits is shown in Table 3.

attach(myData)
mat1 <- matrix(c(outcome[visit==1], outcome [visit==2], outcome [visit==3],
outcome[visit==4]), ncol = 4)
cor(mat1)

Table 3: Correlation matrix for the outcome measurements at different visits.

<center>cor(mat1)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||[,1]||[,2]||[,3]||[,4]
|-
|[,1]||1.0000000||0.5087944||0.4431438||0.5139016
|-
|[,2]||0.5087944||1.0000000||0.5821877||0.5301611
|-
|[,3]||0.4431438||0.5821877||1.0000000||0.5871276
|-
|[,4]||0.5139016||0.5301611||0.5871276||1.0000000
|}
</center>

# We can also examine for multicollinearity problem, using the correlation matrix for X
cor(model.matrix(model.glm)[,-1])

# GEE modeling: R function arguments/options

*corstr= for defining the correlation structure within groups in a GEE model

*id= is used to identify the grouping variable in a GEE model

*scale.fix= when TRUE causes the scale parameter to be fixed (by default at 1) rather than estimated

*waves= names a positive integer-valued variable that is used to identify the order and spacing of observations within groups in a GEE model. This argument is crucial when there are missing values and gaps in the data

gee.model1 <- geeglm(outcome ~ center + treat + sex + baseline + age, data = respiratory, family = "binomial", id = id, corstr = "exchangeable", scale.fix = TRUE)

# The column labeled Wald in the summary table is the square of the z-statistic. The reported p-values are the
# upper-tail probabilities from a chi-square distribution with 1 degree of freedom; they test the null hypothesis that the true parameter value is 0.
summary(gee.model1)

# To test the effect of ''treatment'' using anova()
gee.model1 <- geeglm(outcome ~ center + treat + sex + baseline + age, data = respiratory, family=binomial(link="logit"), id = id, corstr = "exchangeable", std.err="san.se")
gee.model2 <- geeglm(outcome ~ center + sex + baseline + age, data = respiratory, family=binomial(link="logit"), id=id, corstr = "exchangeable", std.err="san.se")
anova(gee.model1, gee.model2)

# To test whether a categorical predictor with more than two levels should be retained in a GEE model we need
# to test the entire set of dummy variables simultaneously as a single construct.
# The geepack package provides a method for the anova function for a multivariate Wald test
# When the anova function is applied to a single geeglm object it returns sequential Wald tests for
# individual predictors with the tests carried out in the order the predictors are listed in the model formula.
anova(gee.model1)

===PD GEE example===

This example uses the PPMI Parkinson's disease (PD) data to illustrate GEE analysis.

# 05_PPMI_top_UPDRS_Integrated_LongFormat1.csv
longData <- read.csv("https://umich.instructure.com/files/330397/download?download_frd=1",header=TRUE)

# library("geepack")

# Data Elements: FID_IID L_insular_cortex_ComputeArea L_insular_cortex_Volume R_insular_cortex_ComputeArea R_insular_cortex_Volume L_cingulate_gyrus_ComputeArea L_cingulate_gyrus_Volume R_cingulate_gyrus_ComputeArea R_cingulate_gyrus_Volume L_caudate_ComputeArea L_caudate_Volume R_caudate_ComputeArea R_caudate_Volume L_putamen_ComputeArea L_putamen_Volume R_putamen_ComputeArea R_putamen_Volume Sex Weight ResearchGroup Age chr12_rs34637584_GT chr17_rs11868035_GT chr17_rs11012_GT chr17_rs393152_GT chr17_rs12185268_GT chr17_rs199533_GT UPDRS_part_I UPDRS_part_II UPDRS_part_III time_visit

dim(longData)

data1 = na.omit(longData)
attach(data1)
ControlGroup <- ifelse(ResearchGroup == "Control", 1, 0)

# these calculations take a long time!!!
# if you get “Error in geese.fit(xx, yy, id, offset, soffset, w, waves = waves, zsca, :
# nrow(zsca) and length(y) not match” – this indicates some of the variables are of different lengths
# if you get “glm.fit: algorithm did not converge” – see this discussion: http://goo.gl/lrjBjB

gee.model0 <- geeglm(ControlGroup ~ L_insular_cortex_ComputeArea+L_insular_cortex_Volume+ Sex + Weight + Age + chr17_rs11012_GT + chr17_rs199533_GT + UPDRS_part_I + UPDRS_part_II + time_visit, data = data1, family=binomial(link="logit"), id = FID_IID, corstr = "unstructured", std.err="san.se")

gee.model1 <- geeglm(ControlGroup ~ L_insular_cortex_ComputeArea+L_insular_cortex_Volume+ R_putamen_ComputeArea + R_putamen_Volume + Sex + Weight + Age + chr17_rs11012_GT + chr17_rs199533_GT + UPDRS_part_I + UPDRS_part_II + time_visit, data = data1, family=binomial(link="logit"), id = FID_IID, corstr = "unstructured", std.err="san.se")

# compare 2 gee models
# anova(gee.model0,gee.model1)

# you can try the “family = poisson(link = "log")” model for the ResearchGroup response, as well

gee.model2 <- geeglm(ControlGroup
~ L_insular_cortex_ComputeArea+L_insular_cortex_Volume+R_insular_cortex_ComputeArea+ R_insular_cortex_Volume +L_cingulate_gyrus_ComputeArea + L_cingulate_gyrus_Volume + R_cingulate_gyrus_ComputeArea + R_cingulate_gyrus_Volume + L_caudate_ComputeArea + L_caudate_Volume + R_caudate_ComputeArea + R_caudate_Volume + L_putamen_ComputeArea + L_putamen_Volume + R_putamen_ComputeArea + R_putamen_Volume + Sex + Weight + Age + chr12_rs34637584_GT + chr17_rs11868035_GT + chr17_rs11012_GT + chr17_rs393152_GT + chr17_rs12185268_GT + chr17_rs199533_GT + UPDRS_part_I + UPDRS_part_II + time_visit, data = data1, family=binomial(link="logit"), id = FID_IID, corstr = "unstructured", std.err="san.se")

Remember that we do not interpret GEE coefficients as relating to individuals: GEE models are marginal models, and the conclusions drawn are interpreted as population-based. Also, the time element in the model (time_visit) is just another controlling factor. The effect-sizes (betas) associated with each variable/predictor represent the slopes associated with the corresponding covariate, while holding time constant. To examine interactions (e.g., Weight change over Time), we need to include an interaction term in the model formula (e.g., + Weight*time_visit).

summary (gee.model2)

# Individual Wald test and confidence intervals for each covariate
predictors2 <- coef(summary(gee.model2))
CI2 <- with(as.data.frame(predictors2), cbind(lwr=Estimate-1.96*Std.err, est=Estimate, upr=Estimate+1.96*Std.err))
rownames(CI2) <- rownames(predictors2)
CI2

==Appendix==

SEM References

*http://socserv.mcmaster.ca/jfox/Misc/sem/SEM-paper.pdf

GEE References

*https://cran.r-project.org/web/packages/geepack/geepack.pdf

*http://www.jstatsoft.org/v15/i02/paper

===Footnotes===

*2 http://www.imachordata.com/ecological-sems-and-composite-variables-what-why-and-how/
*3 http://www.jstatsoft.org/v15/i02/
*4 https://books.google.com/books?id=mdEqBgAAQBAJ

==See also==
* [[SMHS_BigDataBigSci| Back to Model-based Analytics]]
* [[SMHS_BigDataBigSci_SEM| Structural Equation Modeling (SEM)]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php?title=SMHS_BigDataBigSci_GCM}}

SMHS BigDataBigSci GCM

2016-05-23T19:58:06Z

Pineaumi: /* Model-based Analytics - Growth Curve Models */

==[[SMHS_BigDataBigSci| Model-based Analytics]] - Growth Curve Models==

Latent growth curve models may be used to analyze longitudinal or temporal data where the outcome measure is assessed on multiple occasions and we examine its change over time; e.g., the trajectory over time can be modeled as a linear or quadratic function. Random effects are used to capture individual differences by conveniently representing (continuous) latent variables, also known as growth factors. To fit a linear growth model we may specify a model with two latent variables: a random intercept, and a random slope:

#load data 05_PPMI_top_UPDRS_Integrated_LongFormat.csv ( dim(myData) 661 71), wide
# setwd("/dir/")
myData <- read.csv("https://umich.instructure.com/files/330395/download?download_frd=1&verifier=v6jBvV4x94ka3EYcGKuXXg5BZNaOLBVp0xkJih0H",header=TRUE)
attach(myData)

# dichotomize the "ResearchGroup" variable
table(myData$ResearchGroup)
myData$ResearchGroup <- ifelse(myData$ResearchGroup == "Control", 1, 0)

# linear growth model with 4 timepoints
# intercept (i) and slope (s) with fixed coefficients
# i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4 (intercept/constant)
# s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4 (slope/linear term)
# q =~ 0*t1 + 1*t2 + 4*t3 + 9*t4 (quadratic term; loadings are the squared time scores)

In this model, we have fixed all the coefficients of the linear growth functions:

model4 <-
'
i =~ 1*UPDRS_Part_I_Summary_Score_Baseline + 1*UPDRS_Part_I_Summary_Score_Month_03 +
1*UPDRS_Part_I_Summary_Score_Month_06 + 1*UPDRS_Part_I_Summary_Score_Month_09 +
1*UPDRS_Part_I_Summary_Score_Month_12 + 1*UPDRS_Part_I_Summary_Score_Month_18 +
1*UPDRS_Part_I_Summary_Score_Month_24 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Baseline +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_03 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_06 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_09 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_12 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_18 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_24 +
1*UPDRS_Part_III_Summary_Score_Baseline + 1*UPDRS_Part_III_Summary_Score_Month_03 +
1*UPDRS_Part_III_Summary_Score_Month_06 + 1*UPDRS_Part_III_Summary_Score_Month_09 +
1*UPDRS_Part_III_Summary_Score_Month_12 + 1*UPDRS_Part_III_Summary_Score_Month_18 +
1*UPDRS_Part_III_Summary_Score_Month_24 +
1*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Baseline +
1*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_06 +
1*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_12 +
1*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_24 +
1*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Baseline +
1*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_06 +
1*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_12 +
1*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_24
s =~ 0*UPDRS_Part_I_Summary_Score_Baseline + 1*UPDRS_Part_I_Summary_Score_Month_03 +
2*UPDRS_Part_I_Summary_Score_Month_06 + 3*UPDRS_Part_I_Summary_Score_Month_09 +
4*UPDRS_Part_I_Summary_Score_Month_12 + 5*UPDRS_Part_I_Summary_Score_Month_18 +
6*UPDRS_Part_I_Summary_Score_Month_24 +
0*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Baseline +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_03 +
2*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_06 +
3*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_09 +
4*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_12 +
5*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_18 +
6*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_24 +
0*UPDRS_Part_III_Summary_Score_Baseline + 1*UPDRS_Part_III_Summary_Score_Month_03 +
2*UPDRS_Part_III_Summary_Score_Month_06 + 3*UPDRS_Part_III_Summary_Score_Month_09 +
4*UPDRS_Part_III_Summary_Score_Month_12 + 5*UPDRS_Part_III_Summary_Score_Month_18 +
6*UPDRS_Part_III_Summary_Score_Month_24 +
0*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Baseline +
2*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_06 +
4*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_12 +
6*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_24 +
0*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Baseline +
2*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_06 +
4*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_12 +
6*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_24
'

library("lavaan") # provides growth(), parameterEstimates(), fitMeasures(), inspect()
fit4 <- growth(model4, data=myData)
summary(fit4)
parameterEstimates(fit4) # extracts the values of the estimated parameters, the standard errors,
# the z-values, the standardized parameter values, and returns a data frame
fitted(fit4) # return the model-implied (fitted) covariance matrix (and mean vector) of a fitted model

# resid() function return (unstandardized) residuals of a fitted model including the difference between
# the observed and implied covariance matrix and mean vector
resid(fit4)
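The long model string above repeats one pattern: each indicator is multiplied by a fixed loading (1 for the intercept, a time score for the slope). As a minimal sketch, and not the method used to produce the results on this page, the same syntax could be assembled programmatically in base R. The helper name `growth_line` is hypothetical, and only the UPDRS Part I indicators are included, so this is a reduced version of model4.

```r
# Hypothetical helper: build one lavaan latent-variable line such as
#   "s =~ 0*y_t0 + 1*y_t1 + 2*y_t2"
growth_line <- function(factor_name, loadings, indicators) {
  stopifnot(length(loadings) == length(indicators))
  terms <- paste0(loadings, "*", indicators)
  paste(factor_name, "=~", paste(terms, collapse = " + "))
}

# Visit labels as they appear in the UPDRS Part I column names of model4
visits <- c("Baseline", "Month_03", "Month_06", "Month_09",
            "Month_12", "Month_18", "Month_24")
indicators <- paste0("UPDRS_Part_I_Summary_Score_", visits)

# Same time scores as in model4 (0, 1, ..., 6)
time_scores <- 0:6

model_syntax <- paste(
  growth_line("i", rep(1, length(indicators)), indicators),
  growth_line("s", time_scores, indicators),
  sep = "\n"
)
cat(model_syntax, "\n")
```

The resulting string can be passed to growth() in place of a hand-typed model. Changing `time_scores` (for example to the actual months 0, 3, 6, 9, 12, 18, 24) is then a one-line edit.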

==Measures of model quality (Comparative Fit Index (CFI), Root Mean Square Error of Approximation (RMSEA))==

# report the fit measures as a signature vector: Comparative Fit Index (CFI), Root Mean Square Error of
# Approximation (RMSEA)
fitMeasures(fit4, c("cfi", "rmsea", "srmr"))

====Comparative Fit Index====

(CFI) is an incremental measure directly based on the non-centrality measure. If $d = \chi^2 - df$, where df are the degrees of freedom of the model, the Comparative Fit Index is:
$
\frac{d(\text{Null Model})-d(\text{Proposed Model})}{d(\text{Null Model})}.
$

$0≤CFI≤1$ (by definition). It is interpreted as:

*$CFI<0.9$ - model fitting is poor.

*$0.9≤CFI≤0.95$ is considered marginal,

*$CFI>0.95$ is good.

CFI is a relative index of model fit: it compares the fit of your model to the fit of the null model (the worst-fitting model).
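The CFI definition can be checked by hand from the chi-square statistics and degrees of freedom of the proposed and null models. The sketch below is a base R illustration with made-up input values. Truncating the non-centrality at 0 and returning 1 when both non-centralities are 0 are assumptions that follow common software conventions, and the function names are hypothetical.

```r
# Non-centrality estimate d = chi-square - df, truncated at 0 (assumption)
noncentrality <- function(chisq, df) max(chisq - df, 0)

# CFI = (d_null - d_model) / d_null, guarded against a zero denominator
cfi <- function(chisq_model, df_model, chisq_null, df_null) {
  d_model <- noncentrality(chisq_model, df_model)
  d_null  <- noncentrality(chisq_null, df_null)
  denom   <- max(d_null, d_model)
  if (denom == 0) return(1)  # assumed convention when neither model shows misfit
  1 - d_model / denom
}

# Hypothetical values: proposed model chi-square = 30 on 20 df,
# null model chi-square = 540 on 40 df
cfi_example <- cfi(chisq_model = 30, df_model = 20, chisq_null = 540, df_null = 40)
cfi_example
```

With these made-up inputs the non-centralities are 10 and 500, so the index is 490/500, which falls in the "good" range listed above.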

====Root Mean Square Error of Approximation====
(RMSEA) - “Ramsey”

An absolute measure of fit based on the non-centrality parameter:

$\sqrt{\frac{\chi^2-df}{df \times (N - 1)}}$,

where N is the sample size and df is the degrees of freedom of the model. If $\chi^2 < df$, then the RMSEA is set to 0. It has a penalty for complexity via the chi-square to df ratio. The RMSEA is a popular measure of model fit.

*RMSEA < 0.01, excellent,

*RMSEA < 0.05, good

*RMSEA > 0.10 cutoff for poor fitting models
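The RMSEA formula is simple enough to evaluate directly. The following base R sketch applies it to the test statistic, degrees of freedom, and sample size reported in the fit5 summary further down this page (3.703, 1, and 661). The function name `rmsea` is hypothetical, and some software uses N rather than N - 1 in the denominator, which changes the result only slightly.

```r
# RMSEA = sqrt( max(chi-square - df, 0) / (df * (N - 1)) )
rmsea <- function(chisq, df, n) {
  sqrt(max(chisq - df, 0) / (df * (n - 1)))
}

# Values taken from the fit5 output on this page
rmsea_fit5 <- rmsea(chisq = 3.703, df = 1, n = 661)
round(rmsea_fit5, 3)

# When chi-square < df the index is set to 0
rmsea(chisq = 0.5, df = 1, n = 100)
```

Rounded to three decimals, the first call agrees with the rmsea value printed alongside cfi and srmr in the fit5 output.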

====Standardized Root Mean Square Residual====
(SRMR) is an absolute measure of fit defined as the standardized difference between the observed correlation and the predicted correlation. A value of zero indicates perfect fit. The SRMR has no penalty for model complexity. SRMR <0.08 is considered a good fit.

# inspect the model results (report parameter table)
inspect(fit4)

#install.packages("semTools")
# library("semTools")

A Simpler Model (fit5)

model5 <- '
# intercept (first loading fixed to 1 by default, the others estimated) and slope (fixed time scores)
i =~ UPDRS_Part_I_Summary_Score_Baseline + UPDRS_Part_I_Summary_Score_Month_03 + UPDRS_Part_I_Summary_Score_Month_24
s =~ 0*UPDRS_Part_I_Summary_Score_Baseline + 1*UPDRS_Part_I_Summary_Score_Month_03 + 6*UPDRS_Part_I_Summary_Score_Month_24
# regressions
i ~ R_fusiform_gyrus_Volume + Weight + ResearchGroup + Age + chr12_rs34637584_GT
s ~ R_fusiform_gyrus_Volume + Weight + ResearchGroup + Age + chr12_rs34637584_GT
# time-varying covariates
UPDRS_Part_I_Summary_Score_Baseline ~ Weight
UPDRS_Part_I_Summary_Score_Month_03 ~ ResearchGroup
UPDRS_Part_I_Summary_Score_Month_24 ~ Age
'

fit5 <- growth(model5, data=myData)
summary(fit5); fitMeasures(fit5, c("cfi", "rmsea", "srmr"))
parameterEstimates(fit5) # extracts the values of the estimated parameters, the standard errors,
# the z-values, the standardized parameter values, and returns a data frame

lavaan (0.5-18) converged normally after 99 iterations
Number of observations 661
Estimator ML
Minimum Function Test Statistic 3.703
Degrees of freedom 1
P-value (Chi-square) 0.054
Parameter estimates:
Information Expected
Standard Errors Standard
Estimate Std.err Z-value P(>|z|)
Latent variables:
i =~
UPDRS_P_I_S_S 1.000
UPDRS_P_I_S_S 1.074
UPDRS_P_I_S_S 1.172
s =~
UPDRS_P_I_S_S 0.000
UPDRS_P_I_S_S 1.000
UPDRS_P_I_S_S 6.000

Regressions:
i ~
R_fsfrm_gyr_V 0.000
Weight 0.003
ResearchGroup -0.880
Age -0.009
c12_34637584_ -0.907
s ~
R_fsfrm_gyr_V -0.000
Weight -0.000
ResearchGroup -0.084
Age 0.002
c12_34637584_ -0.047
UPDRS_Part_I_Summary_Score_Baseline ~
Weight -0.000
UPDRS_Part_I_Summary_Score_Month_03 ~
ResearchGroup 0.693
UPDRS_Part_I_Summary_Score_Month_24 ~
Age -0.002

Covariances:
i ~~
s 0.074

Intercepts:
UPDRS_P_I_S_S 0.000
UPDRS_P_I_S_S 0.000
UPDRS_P_I_S_S 0.000
i 1.633
s -0.023

Variances:
UPDRS_P_I_S_S 1.017
UPDRS_P_I_S_S 1.093
UPDRS_P_I_S_S 2.993
i 1.019
s -0.025

cfi rmsea srmr
0.996 0.064 0.008

fitted(fit5) # return the model-implied (fitted) covariance matrix (and mean vector) of a fitted model
# write.table(fitted(fit5), file="C:\\Users\\Dinov\\Desktop\\test1.txt")

# resid() function return (unstandardized) residuals of a fitted model including the difference between
# the observed and implied covariance matrix and mean vector
resid(fit5)

# report the fit measures as a signature vector
fitMeasures(fit5, c("cfi", "rmsea", "srmr")) # comparative fit index (CFI)

# inspect the model results (report parameter table)
inspect(fit5)

Note: See the discussion of SEM modeling pros/cons<sup>2</sup>.

==Generalized Estimating Equation (GEE) Modeling==

Generalized Estimating Equations (GEE) modeling<sup>3</sup> is used for analyzing data with the following characteristics:
(1) the observations within a group may be correlated, (2) observations in separate clusters are independent, (3) a monotone transformation of the expectation is linearly related to the explanatory variables, and (4) the variance is a function of the expectation. The expectation (#3) and the variance (#4) are conditional on group-level or individual-level covariates.

GEE is applied to handle correlated discrete and continuous outcome variables. For the outcome variables, it only requires specification of the first 2 moments and the correlation among them. The goal is to estimate fixed parameters without specifying the joint distribution of the outcomes. The correlation is specified by one of these 4 alternatives, selected via the corstr argument in the R call, e.g., geeglm(outcome ~ center + treat + sex + baseline + age, data = respiratory, family = "binomial", id = id, corstr = "exchangeable", scale.fix = TRUE):

<center>[[Image:SMHS_BigDataBigSci8.png|300px]]</center>
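To make the working-correlation choices concrete, the sketch below builds small example matrices in base R for 4 visits. It assumes the alternatives are the independence, exchangeable, AR-1, and unstructured forms that the corstr argument of geeglm accepts. The value alpha = 0.5 is hypothetical, and the unstructured example simply reuses the rounded sample correlations from Table 3 for illustration, not a fitted working correlation.

```r
n_visits <- 4
alpha <- 0.5  # hypothetical within-patient correlation

# Independence: no within-patient correlation
independence <- diag(n_visits)

# Exchangeable: the same correlation for every pair of visits
exchangeable <- matrix(alpha, n_visits, n_visits)
diag(exchangeable) <- 1

# AR-1: correlation decays as alpha^|s - t| with the gap between visits
ar1 <- alpha ^ abs(outer(1:n_visits, 1:n_visits, "-"))

# Unstructured: every pair of visits has its own correlation
# (rounded Table 3 values, used here only as an illustration)
unstructured <- matrix(c(1.00, 0.51, 0.44, 0.51,
                         0.51, 1.00, 0.58, 0.53,
                         0.44, 0.58, 1.00, 0.59,
                         0.51, 0.53, 0.59, 1.00),
                       nrow = n_visits, byrow = TRUE)

ar1
```

The exchangeable form needs one parameter, AR-1 also needs one, and the unstructured form needs one per pair of visits (6 for 4 visits), which is why it is the most demanding to estimate.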

===Respiratory Illness GEE R example===

This example is based on a data set on respiratory illness<sup>4</sup> and the geepack package. The data are from a clinical study of the treatment effects on patients with respiratory illness. N=111 patients from 2 clinical centers were randomized to receive either placebo or active treatment. Four examinations over time assessed the respiratory state of each patient as good (=1) or poor (=0). Explanatory variables characterizing a patient were: center (1,2), treatment (A=active, P=placebo), sex (M=male, F=female), and age (in years) at baseline. The values of the covariates were constant for the repeated elementary observations on each patient.

Table 1 shows the number of patients for the response patterns across the 4 visits split by baseline-status and treatment. Patients with baseline respiratory status = 0 appear to have either a low or a high number of positive responses. Patients with baseline respiratory status = 1 tend to respond positively. Table 2 describes the distribution of the number of positive responses per patient for sex and center.

# library("geepack")

Table 1: Distribution of patients for different response patterns classified by baseline-respiratory response and treatment. The patterns are ordered according to increasing numbers of positive responses.

<center>
{| class="wikitable" style="text-align:center; width:75%" border="1"
|-
! ||Visit|| colspan="15"| All Possible Response Patterns (2*2*2*2=16 permutation patterns)||
|-
|||1||0||1||0||0||0||1||1||1||0||0||1||1||1||0||1||
|-
|||2||0||0||1||0||0||1||0||0||1||0||1||1||0||1||1||
|-
|||3||0||0||0||1||0||0||1||0||1||1||1||0||1||1||1||
|-
|||4||0||0||0||0||1||0||0||1||0||1||0||1||1||1||1||
|-
!Baseline||Treatment||||||||||||||||||||||||||||||||Sum
|-
| rowspan="2"|0||A||7||2||2||2||1||0||1||0||1||0||1||2||0||4||7||30
|-
|P||18||1||0||2||1||2||0||0||1||0||0||1||2||0||3||31
|-
|rowspan="2"|1||A||0||0||0||0||0||0||1||1||0||0||4||0||1||0||17||24
|-
|P||1||4||1||0||0||0||0||1||1||3||1||1||2||1||10||26
|-
|Sum||||26||7||3||4||2||2||2||2||3||3||6||4||5||5||37||111
|}
</center>

Table 2: Distribution of patients for the number of positive responses across the 4 visits for Sex and Center.

<center>
{| class="wikitable" style="text-align:center; width:75%" border="1"
|-
! colspan="2" rowspan="2"| ||colspan="5"|Number of positive responses
|-
| 0||1||2||3||4
|-
|rowspan="2"|Sex || F||7||3||3||3||7
|-
|M||19||13||9||17||30
|-
|rowspan="2"|Center|| 1||18||9||6||11||12
|-
|2||8||7||6||9||25
|}
</center>

Figure 1 shows a plot of age against the proportion of positive responses for each patient. It indicates a quadratic relationship between the proportions and the age. We therefore start by fitting a logistic model with a quadratic age term to the data, which would be appropriate if there were no time effects and no spread in the response probabilities for patients with the same covariate values.

# install.packages("geepack")
library("geepack")

# The data come from a clinical trial in which 111 patients with respiratory illness from two different clinics were randomized to receive either
# placebo (P) or an active (A) treatment. Patients were examined at baseline and at four visits during treatment.
# At each examination, respiratory status was recorded (categorized as 1 = good, 0 = poor).
data("respiratory")
head(respiratory)
myData <- respiratory

<center>head(myData)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Center||ID||Treat||Sex||Age||Baseline||Visit||Outcome
|-
|1 ||1||1||P||M||46||0||1||0
|-
|2 ||1||1||P||M||46||0||2||0
|-
|3 ||1||1||P||M||46||0||3||0
|-
|4 ||1||1||P||M||46||0||4||0
|-
|5||1||2||P||M||28||0||1||0
|-
|6||1||2||P||M||28||0||2||0
|}
</center>

# Get proportions of positive responses
responses <- factor(myData$outcome, labels = c("OutcomePositive", "OutcomeNegative"))
data.frame <- data.frame(responses, myData$age)
head(data.frame)
tab <- prop.table(table(data.frame), 1); tab # compute proportions
sum(tab[1,]) # check proportions (sums to 1.0)?
prop <- tab[1,] # save the proportions of positive responses for each patient
plot(as.numeric(dimnames(tab)$myData.age), tab[1,], xlab = "Age", ylab = "Proportion of Positive Outcomes")
# dimnames(tab) # to see/inspect positive/negative outcomes
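As a cross-check on the tabulation above, the share of positive (outcome = 1) observations at each age can also be obtained by averaging the 0/1 outcome within each age value. The sketch below is a base R illustration on a small made-up data frame named `toy` (hypothetical values, not the respiratory data). It is offered as an alternative calculation under these assumptions, not as the code that produced the figure.

```r
# Hypothetical toy data: two ages, four observations each
toy <- data.frame(
  age     = c(30, 30, 30, 30, 45, 45, 45, 45),
  outcome = c(1, 1, 0, 1, 0, 0, 1, 0)   # 1 = good, 0 = poor
)

# The mean of a 0/1 variable within each age is the proportion of 1s at that age
prop_by_age <- tapply(toy$outcome, toy$age, mean)
ages <- as.numeric(names(prop_by_age))

prop_by_age
plot(ages, prop_by_age, xlab = "Age", ylab = "Proportion of Positive Outcomes")
```

Because each proportion is computed within an age group, the values need not sum to 1 across ages. This differs from row-wise proportions of a two-way table, which describe how one outcome level is distributed over ages.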

[[Image:SMHS_BigDataBigSci9.png|500px]]

x <- as.numeric(dimnames(tab)$myData.age)
poly <- loess(prop ~ x) # local polynomial regression (LOESS) fit
plot(x, prop)
lines(x, predict(poly), col='red', lwd=2) # pass x explicitly so the curve is drawn against age, not the index

smoothingSpline <- smooth.spline(x, prop, spar=0.6)
plot(x, prop)
lines(smoothingSpline, col='red', lwd=1.5)
smoothPolySpline <- smooth.spline(x, predict(poly), spar=0.6)
lines(smoothPolySpline, col='blue', lwd=2)
legend("topright", inset=.05, title="Polynomial regression models", c("Raw Poly","Smooth Poly"), fill=c('red', 'blue'), horiz=TRUE)

[[Image:SMHS_BigDataBigSci10.png|500px]]

model.glm <- glm(outcome ~ baseline + center + sex + treat + age + I(age^2), data = respiratory, family = binomial)

summary(model.glm)

<center>Deviance Residuals:
{| class="wikitable" style="text-align:center; " border="1"
|-
|Min||1Q||Median||3Q||Max
|-
| -2.5951||-0.9108||0.4034||0.8336||2.0951
|}
</center>

<center>Coefficients:
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Estimate||Std. Error||z value||$Pr( \gt |z|)$
|-
|(Intercept)||3.3579727||1.0285292||3.265||0.0011 **
|-
|baseline||1.8850421||0.2482959||7.592||3.15e-14 ***
|-
|center||0.5099244||0.2453982||2.078||0.0377 *
|-
|sexM||-0.4510595||0.3166570||-1.424||0.1543
|-
|treatP||-1.3231587||0.2431603||-5.442||5.28e-08 ***
|-
|age||-0.2072815||0.0472538||-4.387||1.15e-05 ***
|-
|I(age^2)||0.0025650||0.0006324||4.056||4.99e-05 ***
|}
</center>

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 609.41 on 443 degrees of freedom

Residual deviance: 468.62 on 437 degrees of freedom

AIC: 482.62

The correlation matrix of the outcome measures across visits is shown in Table 3.

attach(myData)
mat1 <- matrix(c(outcome[visit==1], outcome [visit==2], outcome [visit==3],
outcome[visit==4]), ncol = 4)
cor(mat1)

Table 3: Correlation matrix for the outcome measurements at different visits.

<center>cor(mat1)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||[,1]||[,2]||[,3]||[,4]
|-
|[,1]||1.0000000||0.5087944||0.4431438||0.5139016
|-
|[,2]||0.5087944||1.0000000||0.5821877||0.5301611
|-
|[,3]||0.4431438||0.5821877||1.0000000||0.5871276
|-
|[,4]||0.5139016||0.5301611||0.5871276||1.0000000
|}
</center>
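The matrix construction used for Table 3 relies on every patient having all four visits stored in the same order. A slightly more defensive sketch, shown here in base R on made-up data (the data frame `toy` and its values are hypothetical), pivots the long data to one row per patient by id and visit before computing the correlations.

```r
# Hypothetical long-format data: 5 patients, 4 visits each
toy <- data.frame(
  id      = rep(1:5, each = 4),
  visit   = rep(1:4, times = 5),
  outcome = c(0, 0, 0, 0,
              1, 1, 1, 1,
              0, 1, 1, 1,
              1, 0, 1, 0,
              0, 0, 1, 1)
)

# One row per patient, one outcome column per visit (outcome.1 ... outcome.4)
wide <- reshape(toy, idvar = "id", timevar = "visit", direction = "wide")
visit_matrix <- as.matrix(wide[, paste0("outcome.", 1:4)])

# Correlation of the outcome between visits
visit_cor <- cor(visit_matrix)
visit_cor
```

When the rows are already sorted by patient and visit, this gives the same matrix as stacking the per-visit vectors column by column. The explicit pivot also makes missing visits appear as NA instead of silently misaligning patients.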

# We can also examine for multicollinearity problem, using the correlation matrix for X
cor(model.matrix(model.glm)[,-1])

# GEE modeling: R function arguments/options

*corstr= for defining the correlation structure within groups in a GEE model

*id= is used to identify the grouping variable in a GEE model

*scale.fix= when TRUE causes the scale parameter to be fixed (by default at 1) rather than estimated

*waves= names a positive integer-valued variable that is used to identify the order and spacing of observations within groups in a GEE model. This argument is crucial when there are missing values and gaps in the data

gee.model1 <- geeglm(outcome ~ center + treat + sex + baseline + age, data = respiratory, family = "binomial", id = id, corstr = "exchangeable", scale.fix = TRUE)

# The column labeled Wald in the summary table is the square of the z-statistic. The reported p-values are the
# upper-tail probabilities from a chi-square distribution with 1 degree of freedom; they test the null hypothesis that the true parameter value is 0.
summary(gee.model1)
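The relation between the Wald statistic and the z-statistic can be verified numerically. The base R sketch below uses the estimate and standard error of the baseline term from the logistic regression table above (not GEE output) purely as example inputs. It shows that the upper-tail chi-square probability with 1 degree of freedom equals the two-sided normal p-value.

```r
# Example inputs: baseline row of the glm coefficient table on this page
estimate  <- 1.8850421
std_error <- 0.2482959

z    <- estimate / std_error  # z-statistic
wald <- z^2                   # Wald statistic is the squared z-statistic

# Upper-tail chi-square (1 df) probability and two-sided normal p-value
p_chisq  <- pchisq(wald, df = 1, lower.tail = FALSE)
p_normal <- 2 * pnorm(-abs(z))

c(z = z, wald = wald, p_chisq = p_chisq, p_normal = p_normal)
```

The two p-values agree up to numerical precision, so the Wald column in a geeglm summary carries the same information as the z value column in a glm summary.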

# To test the effect of ''treatment'' using anova()
gee.model1 <- geeglm(outcome ~ center + treat + sex + baseline + age, data = respiratory, family=binomial(link="logit"), id = id, corstr = "exchangeable", std.err="san.se")
gee.model2 <- geeglm(outcome ~ center + sex + baseline + age, data = respiratory, family=binomial(link="logit"), id=id, corstr = "exchangeable", std.err="san.se")
anova(gee.model1, gee.model2)

# To test whether a categorical predictor with more than two levels should be retained in a GEE model we need
# to test the entire set of dummy variables simultaneously as a single construct.
# The geepack package provides a method for the anova function for a multivariate Wald test
# When the anova function is applied to a single geeglm object it returns sequential Wald tests for
# individual predictors with the tests carried out in the order the predictors are listed in the model formula.
anova(gee.model1)

===PD GEE example===

This example uses the PPMI Parkinson's disease (PD) data to illustrate GEE analysis.

# 05_PPMI_top_UPDRS_Integrated_LongFormat1.csv
longData <- read.csv("https://umich.instructure.com/files/330397/download?download_frd=1",header=TRUE)

# library("geepack")

# Data Elements: FID_IID L_insular_cortex_ComputeArea L_insular_cortex_Volume R_insular_cortex_ComputeArea R_insular_cortex_Volume L_cingulate_gyrus_ComputeArea L_cingulate_gyrus_Volume R_cingulate_gyrus_ComputeArea R_cingulate_gyrus_Volume L_caudate_ComputeArea L_caudate_Volume R_caudate_ComputeArea R_caudate_Volume L_putamen_ComputeArea L_putamen_Volume R_putamen_ComputeArea R_putamen_Volume Sex Weight ResearchGroup Age chr12_rs34637584_GT chr17_rs11868035_GT chr17_rs11012_GT chr17_rs393152_GT chr17_rs12185268_GT chr17_rs199533_GT UPDRS_part_I UPDRS_part_II UPDRS_part_III time_visit

dim(longData)

data1 = na.omit(longData)
attach(data1)
ControlGroup <- ifelse(ResearchGroup == "Control", 1, 0)

# these calculations take a long time!!!
# if you get “Error in geese.fit(xx, yy, id, offset, soffset, w, waves = waves, zsca, :
# nrow(zsca) and length(y) not match” – this indicates some of the variables are of different lengths
# if you get “glm.fit: algorithm did not converge” – see this discussion: http://goo.gl/lrjBjB

gee.model0 <- geeglm(ControlGroup ~ L_insular_cortex_ComputeArea+L_insular_cortex_Volume+ Sex + Weight + Age + chr17_rs11012_GT + chr17_rs199533_GT + UPDRS_part_I + UPDRS_part_II + time_visit, data = data1, family=binomial(link="logit"), id = FID_IID, corstr = "unstructured", std.err="san.se")

gee.model1 <- geeglm(ControlGroup ~ L_insular_cortex_ComputeArea+L_insular_cortex_Volume+ R_putamen_ComputeArea + R_putamen_Volume + Sex + Weight + Age + chr17_rs11012_GT + chr17_rs199533_GT + UPDRS_part_I + UPDRS_part_II + time_visit, data = data1, family=binomial(link="logit"), id = FID_IID, corstr = "unstructured", std.err="san.se")

# compare 2 gee models
# anova(gee.model0,gee.model1)

# you can try the “family = poisson(link = "log")” model for the ResearchGroup response, as well

gee.model2 <- geeglm(ControlGroup
~ L_insular_cortex_ComputeArea+L_insular_cortex_Volume+R_insular_cortex_ComputeArea+ R_insular_cortex_Volume +L_cingulate_gyrus_ComputeArea + L_cingulate_gyrus_Volume + R_cingulate_gyrus_ComputeArea + R_cingulate_gyrus_Volume + L_caudate_ComputeArea + L_caudate_Volume + R_caudate_ComputeArea + R_caudate_Volume + L_putamen_ComputeArea + L_putamen_Volume + R_putamen_ComputeArea + R_putamen_Volume + Sex + Weight + Age + chr12_rs34637584_GT + chr17_rs11868035_GT + chr17_rs11012_GT + chr17_rs393152_GT + chr17_rs12185268_GT + chr17_rs199533_GT + UPDRS_part_I + UPDRS_part_II + time_visit, data = data1, family=binomial(link="logit"), id = FID_IID, corstr = "unstructured", std.err="san.se")

Remember that we do not interpret GEE coefficients as relating to individuals: GEE models are marginal models, and the conclusions drawn are interpreted as population-based. Also, the time element in the model (time_visit) is just another controlling factor. The effect-sizes (betas) associated with each variable/predictor represent the slopes associated with the corresponding covariate, while holding time constant. To examine interactions (e.g., Weight change over Time), we need to include an interaction term in the model formula (e.g., + Weight*time_visit).
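To see what an interaction term adds to a model formula, it can help to inspect the design matrix that R builds. The sketch below uses base R and a made-up data frame (`toy`, with hypothetical values) to show that Weight*time_visit expands to both main effects plus their product.

```r
# Hypothetical toy data with the two covariates named in the text
toy <- data.frame(
  Weight     = c(70, 82, 65, 90),
  time_visit = c(0, 6, 12, 24)
)

# In an R formula, A*B is shorthand for A + B + A:B
X <- model.matrix(~ Weight * time_visit, data = toy)
colnames(X)

# The interaction column is the element-wise product of the two covariates
X[, "Weight:time_visit"]
```

The coefficient on the Weight:time_visit column then describes how the Weight slope changes per unit of time_visit, which is the "Weight change over Time" effect referred to above.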

summary (gee.model2)

# Individual Wald test and confidence intervals for each covariate
predictors2 <- coef(summary(gee.model2))
CI2 <- with(as.data.frame(predictors2), cbind(lwr=Estimate-1.96*Std.err, est=Estimate, upr=Estimate+1.96*Std.err))
rownames(CI2) <- rownames(predictors2)
CI2

==Appendix==

SEM References

*http://socserv.mcmaster.ca/jfox/Misc/sem/SEM-paper.pdf

GEE References

*https://cran.r-project.org/web/packages/geepack/geepack.pdf

*http://www.jstatsoft.org/v15/i02/paper

===Footnotes===

*2 http://www.imachordata.com/ecological-sems-and-composite-variables-what-why-and-how/
*3 http://www.jstatsoft.org/v15/i02/
*4 https://books.google.com/books?id=mdEqBgAAQBAJ

==See also==
* [[SMHS_BigDataBigSci| Back to Model-based Analytics]]
* [[SMHS_BigDataBigSci_SEM| Structural Equation Modeling (SEM)]]
* [[SMHS_BigDataBigSci_GEE| Next Section: Generalized Estimating Equation (GEE) Modeling]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php?title=SMHS_BigDataBigSci_GCM}}

SMHS BigDataBigSci GEE

2016-05-23T19:52:39Z

Pineaumi: /* See also */

SMHS BigDataBigSci GEE

2016-05-23T19:50:18Z

Pineaumi: /* Overview */

==See also==
*[[SMHS_BigDataBigSci| Back to Model-based Analytics]]
*[[SMHS_BigDataBigSci_GCM| Back to Growth Curve Modeling ]]
*[[SMHS_BigDataBigSci_SEM| Back to Structural Equation Modeling (SEM)]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php?title=SMHS_BigDataBigSci_GEE}}

SMHS BigDataBigSci GEE

2016-05-23T19:49:33Z

Pineaumi: /* Questions */

==Overview==

GEE is a marginal longitudinal method that directly assesses the mean relations of interest (i.e., how the mean of the dependent variable changes over time) while accounting for the covariances among observations within subjects, which yields better estimates and valid significance tests of these relations. Thus, GEE estimates two different equations: (1) one for the mean relations, and (2) one for the covariance structure. An advantage of GEE over random-effect models is that it does not require the dependent variable to be normally distributed. A disadvantage is that GEE is less flexible and versatile: commonly employed algorithms require a small-to-moderate number of time points that are evenly (or approximately evenly) spaced and similarly spaced across subjects. Nevertheless, it is somewhat more flexible than repeated-measures ANOVA because it permits some missing values and provides an easy way to test for, and model away, the specific form of autocorrelation within subjects.

GEE is mostly used when the study focuses on the population-average effect of a covariate rather than the individual-specific effect. The two effects coincide for linear models, but not for non-linear models.
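The difference between the two effects can be sketched for a logistic model with a random intercept. The notation ($\beta^{SS}$, $\beta^{PA}$, $b_i$, $\sigma^2$) is introduced here only for illustration, and the attenuation formula is a well-known approximation rather than an exact identity:

```latex
% Linear model: averaging over the random intercept leaves the slope unchanged
E[Y_{ij} \mid b_i] = x_{ij}^T \beta + b_i, \quad b_i \sim N(0, \sigma^2)
\;\Rightarrow\; E[Y_{ij}] = x_{ij}^T \beta

% Logistic model: the subject-specific (SS) and population-average (PA) slopes differ
\text{logit}\, P(Y_{ij} = 1 \mid b_i) = x_{ij}^T \beta^{SS} + b_i, \quad b_i \sim N(0, \sigma^2)

\text{logit}\, P(Y_{ij} = 1) \approx x_{ij}^T \beta^{PA}, \qquad
\beta^{PA} \approx \frac{\beta^{SS}}{\sqrt{1 + c^2 \sigma^2}}, \quad
c = \frac{16\sqrt{3}}{15\pi} \approx 0.588
```

The larger the between-subject variance $\sigma^2$, the more the population-average slope is attenuated relative to the subject-specific slope; with $\sigma^2 = 0$ the two coincide.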

==See also==
*[[SMHS_BigDataBigSci| Back to Model-based Analytics]]
*[[SMHS_BigDataBigSci_GCM| Back to Growth Curve Modeling ]]
*[[SMHS_BigDataBigSci_SEM| Back to Structural Equation Modeling (SEM)]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php?title=SMHS_BigDataBigSci_GEE}}

SMHS BigDataBigSci GEE

2016-05-23T19:49:27Z

Pineaumi: /* Model-based Analytics - Generalized Estimating Equation (GEE) Modeling */

==Questions==

*How to represent dependencies in linear models and examine causal effects?
*Is there a way to study population average effects of a covariate against specific individual effects?

==See also==
*[[SMHS_BigDataBigSci| Back to Model-based Analytics]]
*[[SMHS_BigDataBigSci_GCM| Back to Growth Curve Modeling ]]
*[[SMHS_BigDataBigSci_SEM| Back to Structural Equation Modeling (SEM)]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php?title=SMHS_BigDataBigSci_GEE}}

SMHS BigDataBigSci GEE

2016-05-23T19:47:11Z

Pineaumi: /* Footnotes */

==[[SMHS_BigDataBigSci| Model-based Analytics]] - Generalized Estimating Equation (GEE) Modeling ==

==Questions==

*How to represent dependencies in linear models and examine causal effects?
*Is there a way to study population average effects of a covariate against specific individual effects?

==See also==
*[[SMHS_BigDataBigSci| Back to Model-based Analytics]]
*[[SMHS_BigDataBigSci_GCM| Back to Growth Curve Modeling ]]
*[[SMHS_BigDataBigSci_SEM| Back to Structural Equation Modeling (SEM)]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php?title=SMHS_BigDataBigSci_GEE}}

SMHS BigDataBigSci GCM

2016-05-23T19:43:57Z

Pineaumi: /* Footnotes */

==[[SMHS_BigDataBigSci| Model-based Analytics]] - Growth Curve Models==

Latent growth curve models may be used to analyze longitudinal or temporal data in which the outcome measure is assessed on multiple occasions and we examine its change over time; e.g., the trajectory over time can be modeled as a linear or quadratic function. Random effects capture individual differences and are conveniently represented as (continuous) latent variables, also known as growth factors. To fit a linear growth model, we may specify a model with two latent variables, a random intercept and a random slope:

# install.packages("lavaan")
library(lavaan) # provides growth(), fitMeasures(), parameterEstimates()
# load the wide-format data 05_PPMI_top_UPDRS_Integrated_LongFormat.csv (dim(myData): 661 x 71)
# setwd("/dir/")
myData <- read.csv("https://umich.instructure.com/files/330395/download?download_frd=1&verifier=v6jBvV4x94ka3EYcGKuXXg5BZNaOLBVp0xkJih0H",header=TRUE)
attach(myData)

# dichotomize the "ResearchGroup" variable
table(myData$\$$ResearchGroup)
myData$\$$ResearchGroup <- ifelse(myData$\$$ResearchGroup == "Control", 1, 0)

# linear growth model with 4 timepoints
# intercept (i) and slope (s) with fixed coefficients
# i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4 (intercept/constant)
# s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4 (slope/linear term)
# q =~ 0*t1 + 1*t2 + 4*t3 + 9*t4 (quadratic term: squared time scores)

In this model, we have fixed all the coefficients of the linear growth functions:

model4 <-
'
i =~ 1*UPDRS_Part_I_Summary_Score_Baseline + 1*UPDRS_Part_I_Summary_Score_Month_03 +
1*UPDRS_Part_I_Summary_Score_Month_06 + 1*UPDRS_Part_I_Summary_Score_Month_09 +
1*UPDRS_Part_I_Summary_Score_Month_12 + 1*UPDRS_Part_I_Summary_Score_Month_18 +
1*UPDRS_Part_I_Summary_Score_Month_24 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Baseline +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_03 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_06 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_09 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_12 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_18 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_24 +
1*UPDRS_Part_III_Summary_Score_Baseline + 1*UPDRS_Part_III_Summary_Score_Month_03 +
1*UPDRS_Part_III_Summary_Score_Month_06 + 1*UPDRS_Part_III_Summary_Score_Month_09 +
1*UPDRS_Part_III_Summary_Score_Month_12 + 1*UPDRS_Part_III_Summary_Score_Month_18 +
1*UPDRS_Part_III_Summary_Score_Month_24 +
1*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Baseline +
1*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_06 +
1*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_12 +
1*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_24 +
1*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Baseline +
1*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_06 +
1*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_12 +
1*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_24
s =~ 0*UPDRS_Part_I_Summary_Score_Baseline + 1*UPDRS_Part_I_Summary_Score_Month_03 +
2*UPDRS_Part_I_Summary_Score_Month_06 + 3*UPDRS_Part_I_Summary_Score_Month_09 +
4*UPDRS_Part_I_Summary_Score_Month_12 + 5*UPDRS_Part_I_Summary_Score_Month_18 +
6*UPDRS_Part_I_Summary_Score_Month_24 +
0*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Baseline +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_03 +
2*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_06 +
3*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_09 +
4*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_12 +
5*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_18 +
6*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_24 +
0*UPDRS_Part_III_Summary_Score_Baseline + 1*UPDRS_Part_III_Summary_Score_Month_03 +
2*UPDRS_Part_III_Summary_Score_Month_06 + 3*UPDRS_Part_III_Summary_Score_Month_09 +
4*UPDRS_Part_III_Summary_Score_Month_12 + 5*UPDRS_Part_III_Summary_Score_Month_18 +
6*UPDRS_Part_III_Summary_Score_Month_24 +
0*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Baseline +
2*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_06 +
4*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_12 +
6*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_24 +
0*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Baseline +
2*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_06 +
4*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_12 +
6*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_24
'

fit4 <- growth(model4, data=myData)
summary(fit4)
parameterEstimates(fit4) # returns a data frame with the parameter estimates, the standard errors,
# the z-values, the p-values, and the confidence intervals
fitted(fit4) # returns the model-implied (fitted) covariance matrix (and mean vector) of a fitted model

# resid() returns the (unstandardized) residuals of a fitted model, i.e., the differences between
# the observed and the model-implied covariance matrix and mean vector
resid(fit4)

==Measures of model quality (Comparative Fit Index (CFI), Root Mean Square Error of Approximation (RMSEA))==

# report selected fit measures as a named vector: Comparative Fit Index (CFI), Root Mean Square Error of
# Approximation (RMSEA), and Standardized Root Mean Square Residual (SRMR)
fitMeasures(fit4, c("cfi", "rmsea", "srmr"))

====Comparative Fit Index====

(CFI) is an incremental measure based directly on the non-centrality measure. If $d = \chi^2 - df$, where $df$ are the degrees of freedom of the model, the Comparative Fit Index is:
$
CFI = \frac{d(\text{Null Model})-d(\text{Proposed Model})}{d(\text{Null Model})}.
$

$0 \le CFI \le 1$ (by definition). It is interpreted as:

*$CFI<0.9$ - model fitting is poor,

*$0.9 \le CFI \le 0.95$ is considered marginal,

*$CFI>0.95$ is good.

CFI is a relative index of model fit: it compares the fit of your model to the fit of the null (worst-fitting) model.
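The CFI definition above can be checked by hand against the value that lavaan reports. The following is a minimal sketch, assuming the Demo.growth example data shipped with lavaan (not the PPMI data used on this page); the object names model.demo, fit.demo, d.model and d.null are illustrative only.

```r
library(lavaan)

# Linear growth model with 4 time points (t1..t4 are columns of lavaan's Demo.growth data)
model.demo <- '
  i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4
  s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4
'
fit.demo <- growth(model.demo, data = Demo.growth)

# Chi-square statistics and degrees of freedom of the proposed and the null (baseline) models
fm <- fitMeasures(fit.demo, c("chisq", "df", "baseline.chisq", "baseline.df", "cfi"))

# Non-centrality d = chi-square - df, truncated at 0
d.model <- max(as.numeric(fm["chisq"] - fm["df"]), 0)
d.null  <- max(as.numeric(fm["baseline.chisq"] - fm["baseline.df"]), 0)

# CFI = (d(Null Model) - d(Proposed Model)) / d(Null Model)
cfi.manual <- (d.null - d.model) / d.null
c(manual = cfi.manual, lavaan = as.numeric(fm["cfi"]))
```

The two values should agree whenever the null model fits worse than the proposed model, which is the usual situation.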

====Root Mean Square Error of Approximation====
(RMSEA) - “Ramsey”

An absolute measure of fit based on the non-centrality parameter:

$RMSEA = \sqrt{\frac{\chi^2-df}{df \times (N - 1)}}$,

where $N$ is the sample size and $df$ are the degrees of freedom of the model. If $\chi^2 < df$, then the RMSEA is set to 0. It penalizes model complexity via the chi-square to $df$ ratio. The RMSEA is a popular measure of model fit.

*RMSEA < 0.01, excellent,

*RMSEA < 0.05, good

*RMSEA > 0.10 cutoff for poor fitting models
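As a worked example of the RMSEA formula, we can plug in the values printed in the lavaan output for the simpler model (fit5) on this page: test statistic 3.703, 1 degree of freedom, and 661 observations.

```latex
RMSEA = \sqrt{\frac{\chi^2 - df}{df \times (N - 1)}}
      = \sqrt{\frac{3.703 - 1}{1 \times (661 - 1)}}
      = \sqrt{\frac{2.703}{660}}
      \approx 0.064
```

This agrees with the rmsea value of 0.064 reported by fitMeasures() for fit5, which falls between the "good" (0.05) and "poor" (0.10) cutoffs listed above.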

====Standardized Root Mean Square Residual====
(SRMR) is an absolute measure of fit defined as the standardized difference between the observed correlation and the predicted correlation. A value of zero indicates perfect fit. The SRMR has no penalty for model complexity. SRMR <0.08 is considered a good fit.

# inspect the model results (report parameter table)
inspect(fit4)

#install.packages("semTools")
# library("semTools")

A Simpler Model (fit5)

model5 <- '
# intercept (first loading fixed to 1 by default, remaining loadings free) and slope with fixed coefficients
i =~ UPDRS_Part_I_Summary_Score_Baseline + UPDRS_Part_I_Summary_Score_Month_03 + UPDRS_Part_I_Summary_Score_Month_24
s =~ 0*UPDRS_Part_I_Summary_Score_Baseline + 1*UPDRS_Part_I_Summary_Score_Month_03 + 6*UPDRS_Part_I_Summary_Score_Month_24
# regressions
i ~ R_fusiform_gyrus_Volume + Weight + ResearchGroup + Age + chr12_rs34637584_GT
s ~ R_fusiform_gyrus_Volume + Weight + ResearchGroup + Age + chr12_rs34637584_GT
# time-varying covariates
UPDRS_Part_I_Summary_Score_Baseline ~ Weight
UPDRS_Part_I_Summary_Score_Month_03 ~ ResearchGroup
UPDRS_Part_I_Summary_Score_Month_24 ~ Age
'

fit5 <- growth(model5, data=myData)
summary(fit5); fitMeasures(fit5, c("cfi", "rmsea", "srmr"))
parameterEstimates(fit5) # extracts the values of the estimated parameters, the standard errors,
# the z-values, the standardized parameter values, and returns a data frame

lavaan (0.5-18) converged normally after 99 iterations
Number of observations 661
Estimator ML
Minimum Function Test Statistic 3.703
Degrees of freedom 1
P-value (Chi-square) 0.054
Parameter estimates:
Information Expected
Standard Errors Standard
Estimate Std.err Z-value P(>|z|)
Latent variables:
i =~
UPDRS_P_I_S_S 1.000
UPDRS_P_I_S_S 1.074
UPDRS_P_I_S_S 1.172
s =~
UPDRS_P_I_S_S 0.000
UPDRS_P_I_S_S 1.000
UPDRS_P_I_S_S 6.000

Regressions:
i ~
R_fsfrm_gyr_V 0.000
Weight 0.003
ResearchGroup -0.880
Age -0.009
c12_34637584_ -0.907
s ~
R_fsfrm_gyr_V -0.000
Weight -0.000
ResearchGroup -0.084
Age 0.002
c12_34637584_ -0.047
UPDRS_Part_I_Summary_Score_Baseline ~
Weight -0.000
UPDRS_Part_I_Summary_Score_Month_03 ~
ResearchGroup 0.693
UPDRS_Part_I_Summary_Score_Month_24 ~
Age -0.002

Covariances:
i ~~
s 0.074

Intercepts:
UPDRS_P_I_S_S 0.000
UPDRS_P_I_S_S 0.000
UPDRS_P_I_S_S 0.000
i 1.633
s -0.023

Variances:
UPDRS_P_I_S_S 1.017
UPDRS_P_I_S_S 1.093
UPDRS_P_I_S_S 2.993
i 1.019
s -0.025

cfi rmsea srmr
0.996 0.064 0.008

fitted(fit5) # return the model-implied (fitted) covariance matrix (and mean vector) of a fitted model
# write.table(fitted(fit5), file="C:\\Users\\Dinov\\Desktop\\test1.txt")

# resid() function return (unstandardized) residuals of a fitted model including the difference between
# the observed and implied covariance matrix and mean vector
resid(fit5)

# report the fit measures as a signature vector
fitMeasures(fit5, c("cfi", "rmsea", "srmr")) # comparative fit index (CFI)

# inspect the model results (report parameter table)
inspect(fit5)

Note: See the discussion of SEM modeling pros and cons (footnote 2).

==Generalized Estimating Equation (GEE) Modeling==

Generalized Estimating Equations (GEE) modeling (footnote 3) is used for analyzing data with the following characteristics:
(1) the observations within a group may be correlated, (2) observations in separate clusters are independent, (3) a monotone transformation of the expectation is linearly related to the explanatory variables, and (4) the variance is a function of the expectation. The expectation (3) and the variance (4) are conditional on group-level or individual-level covariates.
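Characteristics (3) and (4), together with the working correlation, can be written compactly in the standard GEE notation. The symbols below ($g$, $v$, $\phi$, $A_i$, $R(\alpha)$, $D_i$, $K$) are introduced here for illustration and do not appear elsewhere on this page:

```latex
% (3) link function g relates the mean to the covariates; (4) variance is a function v of the mean
g(\mu_{ij}) = x_{ij}^T \beta, \qquad Var(Y_{ij}) = \phi \, v(\mu_{ij})

% working covariance of the observations in cluster i,
% with A_i = diag( v(\mu_{i1}), ..., v(\mu_{i n_i}) ) and working correlation matrix R(\alpha)
V_i = \phi \, A_i^{1/2} R(\alpha) A_i^{1/2}

% estimating equations solved for \beta over the K independent clusters
\sum_{i=1}^{K} D_i^T V_i^{-1} (Y_i - \mu_i) = 0, \qquad D_i = \frac{\partial \mu_i}{\partial \beta}
```

Only the mean and variance functions and the working correlation $R(\alpha)$ are specified; no joint distribution of the outcomes is needed.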

GEE is applied to handle correlated discrete and continuous outcome variables. For the outcome variables, it only requires specification of the first 2 moments and the correlation among them. The goal is to estimate fixed parameters without specifying the joint distribution of the outcomes. The correlation is specified by one of these 4 alternatives, selected via the corstr argument of the R call, e.g., geeglm(outcome ~ center + treat + sex + baseline + age, data = respiratory, family = "binomial", id = id, corstr = "exchangeable", scale.fix = TRUE):

<center>[[Image:SMHS_BigDataBigSci8.png|300px]]</center>
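To see how the choice of working correlation affects the results, the same mean model can be fitted under each structure. This is a minimal sketch, assuming the four alternatives are the geepack corstr values "independence", "exchangeable", "ar1" and "unstructured", and using the respiratory data shipped with geepack; the object names structures, fits, est and se are illustrative only.

```r
library(geepack)
data("respiratory")

# Candidate working correlation structures (assumed to be the four alternatives referred to above)
structures <- c("independence", "exchangeable", "ar1", "unstructured")

# Fit the same mean model under each working correlation structure
fits <- lapply(structures, function(cs) {
  geeglm(outcome ~ center + treat + sex + baseline + age, data = respiratory,
         family = binomial(link = "logit"), id = id, corstr = cs)
})
names(fits) <- structures

# Compare coefficient estimates and robust (sandwich) standard errors side by side
est <- sapply(fits, coef)
se  <- sapply(fits, function(f) coef(summary(f))[, "Std.err"])
round(est, 3)
round(se, 3)
```

If the estimates and standard errors are similar across the columns, the conclusions are not sensitive to the working correlation.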

===Respiratory Illness GEE R example===

This example is based on a data set on respiratory illness (footnote 4) and the geepack package. The data come from a clinical study of treatment effects in patients with respiratory illness: N=111 patients from 2 clinical centers were randomized to receive either placebo or active treatment. At each of 4 examinations, the respiratory state of each patient was assessed as good (=1) or poor (=0). The explanatory variables characterizing a patient were: center (1,2), treatment (A=active, P=placebo), sex (M=male, F=female), and age (in years) at baseline. The values of the covariates were constant across the repeated observations on each patient.

Table 1 shows the number of patients for the response patterns across the 4 visits, split by baseline status and treatment. Patients with baseline respiratory status = 0 appear to have either a low or a high number of positive responses, whereas patients with baseline respiratory status = 1 tend to respond positively. Table 2 describes the distribution of the number of positive responses per patient by sex and center.

# library("geepack")

Table 1: Distribution of patients for different response patterns classified by baseline-respiratory response and treatment. The patterns are ordered according to increasing numbers of positive responses.

<center>
{| class="wikitable" style="text-align:center; width:75%" border="1"
|-
! ||Visit|| colspan="15"| Observed Response Patterns (15 of the 2*2*2*2=16 possible patterns)||
|-
|||1||0||1||0||0||0||1||1||1||0||0||1||1||1||0||1||
|-
|||2||0||0||1||0||0||1||0||0||1||0||1||1||0||1||1||
|-
|||3||0||0||0||1||0||0||1||0||1||1||1||0||1||1||1||
|-
|||4||0||0||0||0||1||0||0||1||0||1||0||1||1||1||1||
|-
!Baseline||Treatment||||||||||||||||||||||||||||||||Sum
|-
| rowspan="2"|0||A||7||2||2||2||1||0||1||0||1||0||1||2||0||4||7||30
|-
|P||18||1||0||2||1||2||0||0||1||0||0||1||2||0||3||31
|-
|rowspan="2"|1||A||0||0||0||0||0||0||1||1||0||0||4||0||1||0||17||24
|-
|P||1||4||1||0||0||0||0||1||1||3||1||1||2||1||10||26
|-
|Sum||||26||7||3||4||2||2||2||2||3||3||6||4||5||5||37||111
|}
</center>

Table 2: Distribution of patients for the number of positive responses across the 4 visits for Sex and Center.

<center>
{| class="wikitable" style="text-align:center; width:75%" border="1"
|-
! colspan="2" rowspan="2"| ||colspan="5"|Number of positive responses
|-
| 0||1||2||3||4
|-
|rowspan="2"|Sex || F||7||3||3||3||7
|-
|M||19||13||9||17||30
|-
|rowspan="2"|Center|| 1||18||9||6||11||12
|-
|2||8||7||6||9||25
|}
</center>

Figure 1 shows a plot of age against the proportion of positive responses for each patient. It suggests a quadratic relationship between the proportions and age. We first fit a logistic model to the data, which would be appropriate if there were no time effects and no spread in the response probabilities for patients with the same covariate values.

# install.packages("geepack")
library("geepack")

# the data come from a clinical trial in which 111 patients with respiratory illness from two different clinics were randomized to receive either
# placebo (P) or an active (A) treatment. Patients were examined at baseline and at four visits during treatment.
# At each examination, respiratory status was categorized as 1 = good or 0 = poor
data("respiratory")
head(respiratory)
myData <- respiratory

<center>head(myData)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||center||id||treat||sex||age||baseline||visit||outcome
|-
|1 ||1||1||P||M||46||0||1||0
|-
|2 ||1||1||P||M||46||0||2||0
|-
|3 ||1||1||P||M||46||0||3||0
|-
|4 ||1||1||P||M||46||0||4||0
|-
|5||1||2||P||M||28||0||1||0
|-
|6||1||2||P||M||28||0||2||0
|}
</center>

# Get proportions of positive responses at each age
# outcome is coded 0 = poor (negative) and 1 = good (positive), so the labels follow that order
responses <- factor(myData$\$$outcome, levels = c(0, 1), labels = c("OutcomeNegative", "OutcomePositive"))
data.frame <- data.frame(responses, myData$\$$age)
head(data.frame)
tab <- prop.table(table(data.frame), 2); tab # compute proportions within each age (column proportions)
colSums(tab) # check proportions (each age column sums to 1.0)
prop <- tab["OutcomePositive", ] # save the proportions of positive responses at each age
plot(as.numeric(dimnames(tab)$\$$myData.age), prop, xlab = "Age", ylab = "Proportion of Positive Outcomes")
# dimnames(tab) # to see/inspect positive/negative outcomes

[[Image:SMHS_BigDataBigSci9.png|500px]]

x <- as.numeric(dimnames(tab)$\$$myData.age)
poly <- loess(prop ~ x) # fit a local polynomial regression (loess)
plot(x, prop)
lines(x, predict(poly), col='red', lwd=2) # draw the fitted values against age, not against the index

smoothingSpline <- smooth.spline(x, prop, spar=0.6)
plot(x, prop)
lines(smoothingSpline, col='red', lwd=1.5)
smoothPolySpline <- smooth.spline(x, predict(poly), spar=0.6)
lines(smoothPolySpline, col='blue', lwd=2)
legend("topright", inset=.05, title="Polynomial regression models", c("Raw Poly","Smooth Poly"), fill=c('red', 'blue'), horiz=TRUE)

[[Image:SMHS_BigDataBigSci10.png|500px]]

model.glm <- glm(outcome ~ baseline + center + sex + treat + age + I(age^2), data = respiratory, family = binomial)

summary(model.glm)

<center>Deviance Residuals:
{| class="wikitable" style="text-align:center; " border="1"
|-
|Min||1Q||Median||3Q||Max
|-
| -2.5951||-0.9108||0.4034||0.8336||2.0951
|}
</center>

<center>Coefficients:
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Estimate||Std. Error||z value||$Pr( \gt |z|)$
|-
|(Intercept)||3.3579727||1.0285292||3.265||0.0011 **
|-
|baseline||1.8850421||0.2482959||7.592||3.15e-14 ***
|-
|center||0.5099244||0.2453982||2.078||0.0377 *
|-
|sexM||-0.4510595||0.3166570||-1.424||0.1543
|-
|treatP||-1.3231587||0.2431603||-5.442||5.28e-08 ***
|-
|age||-0.2072815||0.0472538||-4.387||1.15e-05 ***
|-
|I(age^2)||0.0025650||0.0006324||4.056||4.99e-05 ***
|}
</center>

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 609.41 on 443 degrees of freedom

Residual deviance: 468.62 on 437 degrees of freedom

AIC: 482.62
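The fitted coefficients above can be read directly on the log-odds scale. As a short worked example (keeping in mind that this logistic model ignores the correlation among repeated visits of the same patient), the quadratic age effect has its turning point at the following age, and the treatP coefficient converts to an odds ratio as follows:

```latex
% turning point of the quadratic age effect: d/d(age) of (beta_age * age + beta_age2 * age^2) = 0
age^{*} = -\frac{\beta_{age}}{2 \, \beta_{age^2}}
        = \frac{0.2072815}{2 \times 0.0025650}
        \approx 40.4 \text{ years}

% odds ratio of a good respiratory status for placebo (P) relative to active (A) treatment
OR_{treatP} = \exp(-1.3231587) \approx 0.27
```

Because the coefficient of I(age^2) is positive, the estimated log-odds of a good outcome are lowest near age 40 and increase for younger and older patients.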

The correlation matrix of the outcome measures across visits is shown in Table 3.

attach(myData)
mat1 <- matrix(c(outcome[visit==1], outcome [visit==2], outcome [visit==3],
outcome[visit==4]), ncol = 4)
cor(mat1)

Table 3: Correlation matrix for the outcome measurements at different visits.

<center>cor(mat1)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||[,1]||[,2]||[,3]||[,4]
|-
|[,1]||1.0000000||0.5087944||0.4431438||0.5139016
|-
|[,2]||0.5087944||1.0000000||0.5821877||0.5301611
|-
|[,3]||0.4431438||0.5821877||1.0000000||0.5871276
|-
|[,4]||0.5139016||0.5301611||0.5871276||1.0000000
|}
</center>

# We can also check for multicollinearity using the correlation matrix of the design matrix X
cor(model.matrix(model.glm)[,-1])

# GEE modeling: R function arguments/options

*corstr= for defining the correlation structure within groups in a GEE model

*id= is used to identify the grouping variable in a GEE model

*scale.fix= when TRUE causes the scale parameter to be fixed (by default at 1) rather than estimated

*waves= names a positive integer-valued variable that is used to identify the order and spacing of observations within groups in a GEE model. This argument is crucial when there are missing values and gaps in the data
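The waves argument described above can be sketched as follows. This example assumes the respiratory data from geepack and removes a few rows at random purely to mimic missed visits; the names resp.miss and fit.waves are illustrative only.

```r
library(geepack)
data("respiratory")

# Drop 20 randomly chosen rows to mimic missed visits (for illustration only)
set.seed(1)
resp.miss <- respiratory[-sample(nrow(respiratory), 20), ]

# waves = visit tells geeglm which visit each remaining row belongs to, so that
# the AR(1) working correlation respects the gaps left by the missing visits
fit.waves <- geeglm(outcome ~ center + treat + sex + baseline + age, data = resp.miss,
                    family = binomial(link = "logit"), id = id, waves = visit,
                    corstr = "ar1", scale.fix = TRUE)
summary(fit.waves)
```

Without waves, the remaining observations of a patient would be treated as consecutive visits, which matters for order-dependent structures such as "ar1" and "unstructured".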

gee.model1 <- geeglm(outcome ~ center + treat + sex + baseline + age, data = respiratory, family = "binomial", id = id, corstr = "exchangeable", scale.fix = TRUE)

# The column labeled Wald in the summary table is the square of the z-statistic. The reported p-values are the
# upper-tail probabilities from a chi-square distribution with 1 degree of freedom and test the null hypothesis that the true parameter value equals 0.
summary(gee.model1)

# To test the effect of ''treatment'' using anova()
gee.model1 <- geeglm(outcome ~ center + treat + sex + baseline + age, data = respiratory, family=binomial(link="logit"), id = id, corstr = "exchangeable", std.err="san.se")
gee.model2 <- geeglm(outcome ~ center + sex + baseline + age, data = respiratory, family=binomial(link="logit"), id=id, corstr = "exchangeable", std.err="san.se")
anova(gee.model1, gee.model2)

# To test whether a categorical predictor with more than two levels should be retained in a GEE model we need
# to test the entire set of dummy variables simultaneously as a single construct.
# The geepack package provides a method for the anova function for a multivariate Wald test
# When the anova function is applied to a single geeglm object it returns sequential Wald tests for
# individual predictors with the tests carried out in the order the predictors are listed in the model formula.
anova(gee.model1)
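The simultaneous test of a categorical predictor with more than two levels can be sketched as follows. The respiratory data contain no such predictor, so this example assumes a hypothetical 3-level age group (agegrp) created with cut(); the cut points and the names resp, fit.full, fit.reduced and wald.agegrp are illustrative only.

```r
library(geepack)
data("respiratory")

# Hypothetical 3-level age group, created only to illustrate a multi-level categorical predictor
resp <- transform(respiratory,
                  agegrp = cut(age, breaks = c(0, 25, 40, Inf),
                               labels = c("young", "middle", "older")))

fit.full <- geeglm(outcome ~ center + treat + sex + baseline + agegrp, data = resp,
                   family = binomial(link = "logit"), id = id, corstr = "exchangeable")
fit.reduced <- geeglm(outcome ~ center + treat + sex + baseline, data = resp,
                      family = binomial(link = "logit"), id = id, corstr = "exchangeable")

# Multivariate Wald test of both agegrp dummy variables at once
wald.agegrp <- anova(fit.full, fit.reduced)
wald.agegrp
```

The test has 2 degrees of freedom, one for each dummy variable that distinguishes the 3 age groups.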

===PD GEE example===

This example uses the PPMI/PD data to demonstrate GEE analysis.

# 05_PPMI_top_UPDRS_Integrated_LongFormat1.csv
longData <- read.csv("https://umich.instructure.com/files/330397/download?download_frd=1",header=TRUE)

# library("geepack")

# Data Elements: FID_IID L_insular_cortex_ComputeArea L_insular_cortex_Volume R_insular_cortex_ComputeArea R_insular_cortex_Volume L_cingulate_gyrus_ComputeArea L_cingulate_gyrus_Volume R_cingulate_gyrus_ComputeArea R_cingulate_gyrus_Volume L_caudate_ComputeArea L_caudate_Volume R_caudate_ComputeArea R_caudate_Volume L_putamen_ComputeArea L_putamen_Volume R_putamen_ComputeArea R_putamen_Volume Sex Weight ResearchGroup Age chr12_rs34637584_GT chr17_rs11868035_GT chr17_rs11012_GT chr17_rs393152_GT chr17_rs12185268_GT chr17_rs199533_GT UPDRS_part_I UPDRS_part_II UPDRS_part_III time_visit

dim(longData)

data1 = na.omit(longData)
attach(data1)
ControlGroup <- ifelse(ResearchGroup == "Control", 1, 0)

# these calculations take a long time!!!
# if you get “Error in geese.fit(xx, yy, id, offset, soffset, w, waves = waves, zsca, :
# nrow(zsca) and length(y) not match” – this indicates some of the variables are of different lengths
# if you get “glm.fit: algorithm did not converge” – see this discussion: http://goo.gl/lrjBjB

gee.model0 <- geeglm(ControlGroup ~ L_insular_cortex_ComputeArea+L_insular_cortex_Volume+ Sex + Weight + Age + chr17_rs11012_GT + chr17_rs199533_GT + UPDRS_part_I + UPDRS_part_II + time_visit, data = data1, family=binomial(link="logit"), id = FID_IID, corstr = "unstructured", std.err="san.se")

gee.model1 <- geeglm(ControlGroup ~ L_insular_cortex_ComputeArea+L_insular_cortex_Volume+ R_putamen_ComputeArea + R_putamen_Volume + Sex + Weight + Age + chr17_rs11012_GT + chr17_rs199533_GT + UPDRS_part_I + UPDRS_part_II + time_visit, data = data1, family=binomial(link="logit"), id = FID_IID, corstr = "unstructured", std.err="san.se")

# compare 2 gee models
# anova(gee.model0,gee.model1)

# you can try the “family = poisson(link = "log")” model for the ResearchGroup response, as well

gee.model2 <- geeglm(ControlGroup
~ L_insular_cortex_ComputeArea+L_insular_cortex_Volume+R_insular_cortex_ComputeArea+ R_insular_cortex_Volume +L_cingulate_gyrus_ComputeArea + L_cingulate_gyrus_Volume + R_cingulate_gyrus_ComputeArea + R_cingulate_gyrus_Volume + L_caudate_ComputeArea + L_caudate_Volume + R_caudate_ComputeArea + R_caudate_Volume + L_putamen_ComputeArea + L_putamen_Volume + R_putamen_ComputeArea + R_putamen_Volume + Sex + Weight + Age + chr12_rs34637584_GT + chr17_rs11868035_GT + chr17_rs11012_GT + chr17_rs393152_GT + chr17_rs12185268_GT + chr17_rs199533_GT + UPDRS_part_I + UPDRS_part_II + time_visit, data = data1, family=binomial(link="logit"), id = FID_IID, corstr = "unstructured", std.err="san.se")

Remember that GEE coefficients are not interpreted as subject-specific effects: GEE models are marginal models, so the conclusions drawn are population-averaged. The time element in the model (time_visit) is just another controlling covariate. The effect size (beta) associated with each predictor is the slope for that covariate on the logit scale, holding time and the other covariates constant. To examine interactions (e.g., whether the Weight effect changes over time), include an interaction term in the model formula (e.g., + Weight*time_visit).

summary (gee.model2)

# Individual Wald test and confidence intervals for each covariate
predictors2 <- coef(summary(gee.model2))
CI2 <- with(as.data.frame(predictors2), cbind(lwr=Estimate-1.96*Std.err, est=Estimate, upr=Estimate+1.96*Std.err))
rownames(CI2) <- rownames(predictors2)
CI2
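The interaction term and the Wald confidence intervals discussed in this example can be tried on a smaller data set. This is a minimal sketch that uses the respiratory data from geepack instead of the PPMI data, with a treatment-by-visit interaction standing in for Weight*time_visit; the names resp, fit.int, est and CI are illustrative only.

```r
library(geepack)
data("respiratory")

# Make sure the visit number is treated as a numeric time variable
resp <- transform(respiratory, visit = as.numeric(visit))

# treat * visit expands to treat + visit + treat:visit (main effects plus interaction)
fit.int <- geeglm(outcome ~ center + sex + baseline + age + treat * visit, data = resp,
                  family = binomial(link = "logit"), id = id,
                  corstr = "exchangeable", std.err = "san.se")

# Wald 95% confidence intervals on the logit scale, as in the PD example
est <- coef(summary(fit.int))
CI <- with(as.data.frame(est),
           cbind(lwr = Estimate - 1.96 * Std.err, est = Estimate, upr = Estimate + 1.96 * Std.err))
rownames(CI) <- rownames(est)

# Exponentiate to report population-averaged odds ratios
round(exp(CI), 3)
```

The row treatP:visit describes how the placebo versus active difference in the log-odds changes per visit; an interval for its odds ratio that contains 1 gives no evidence that the treatment effect varies over time.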

==Appendix==

SEM References

*http://socserv.mcmaster.ca/jfox/Misc/sem/SEM-paper.pdf

GEE References

*https://cran.r-project.org/web/packages/geepack/geepack.pdf

*http://www.jstatsoft.org/v15/i02/paper

===Footnotes===

*2 http://www.imachordata.com/ecological-sems-and-composite-variables-what-why-and-how/
*3 http://www.jstatsoft.org/v15/i02/
*4 https://books.google.com/books?id=mdEqBgAAQBAJ

==See also==
* [[SMHS_BigDataBigSci| Back to Model-based Analytics]]
* [[SMHS_BigDataBigSci_SEM| Structural Equation Modeling (SEM)]]
* [[SMHS_BigDataBigSci_GEE| Next Section: Generalized Estimating Equation (GEE) Modeling]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php?title=SMHS_BigDataBigSci_GCM}}

SMHS BigDataBigSci GEE

2016-05-23T19:19:44Z

Pineaumi: /* Overview */

==[[SMHS_BigDataBigSci| Model-based Analytics]] - Generalized Estimating Equation (GEE) Modeling ==

==Questions==

*How to represent dependencies in linear models and examine causal effects?
*Is there a way to study population average effects of a covariate against specific individual effects?

===Footnotes===

*3 http://www.jstatsoft.org/v15/i02/

*4 https://books.google.com/books?id=mdEqBgAAQBAJ

==See also==
*[[SMHS_BigDataBigSci| Back to Model-based Analytics]]
*[[SMHS_BigDataBigSci_GCM| Back to Growth Curve Modeling ]]
*[[SMHS_BigDataBigSci_SEM| Back to Structural Equation Modeling (SEM)]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php?title=SMHS_BigDataBigSci_GEE}}

SMHS BigDataBigSci GCM

2016-05-23T19:14:35Z

Pineaumi: /* Footnotes */

==[[SMHS_BigDataBigSci| Model-based Analytics]] - Growth Curve Models==

Latent growth curve models may be used to analyze longitudinal or temporal data where the outcome measure is assessed on multiple occasions and we examine its change over time; e.g., the trajectory over time can be modeled as a linear or quadratic function. Random effects capture individual differences and are conveniently represented as (continuous) latent variables, also known as growth factors. To fit a linear growth model, we may specify a model with two latent variables, a random intercept and a random slope:

# load data 05_PPMI_top_UPDRS_Integrated_LongFormat.csv (dim(myData): 661 x 71, wide format)
# install.packages("lavaan")
library("lavaan") # provides growth(), parameterEstimates(), fitMeasures(), inspect()
# setwd("/dir/")
myData <- read.csv("https://umich.instructure.com/files/330395/download?download_frd=1&verifier=v6jBvV4x94ka3EYcGKuXXg5BZNaOLBVp0xkJih0H",header=TRUE)
attach(myData)

# dichotomize the "ResearchGroup" variable
table(myData$\$$ResearchGroup)
myData$\$$ResearchGroup <- ifelse(myData$\$$ResearchGroup == "Control", 1, 0)

# linear growth model with 4 timepoints
# intercept (i) and slope (s) with fixed coefficients
# i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4 (intercept/constant)
# s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4 (slope/linear term)
# q =~ 0*t1 + 1*t2 + 4*t3 + 9*t4 (quadratic term: squared time scores)
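The fixed coefficients in the schematic above can be read as the columns of a loading matrix that maps the growth factors to the expected trajectory. The base-R sketch below assumes four equally spaced time points and hypothetical growth-factor means; none of these numbers come from the PPMI data.

```r
# Hypothetical time scores for 4 equally spaced occasions
time_scores <- 0:3

# Loading matrix: constant (i), linear (s) and quadratic (q) columns
Lambda <- cbind(i = 1, s = time_scores, q = time_scores^2)

# Hypothetical means of the growth factors (intercept, slope, quadratic)
alpha <- c(i = 10, s = 2, q = -0.25)

# Model-implied mean of the outcome at each occasion
implied_means <- as.numeric(Lambda %*% alpha)
implied_means
```

Dropping the q column gives the linear growth model, in which the implied means change by a constant amount per unit of the time score.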

In this model, we have fixed all the coefficients of the linear growth functions:

model4 <-
'
i =~ 1*UPDRS_Part_I_Summary_Score_Baseline + 1*UPDRS_Part_I_Summary_Score_Month_03 +
1*UPDRS_Part_I_Summary_Score_Month_06 + 1*UPDRS_Part_I_Summary_Score_Month_09 +
1*UPDRS_Part_I_Summary_Score_Month_12 + 1*UPDRS_Part_I_Summary_Score_Month_18 +
1*UPDRS_Part_I_Summary_Score_Month_24 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Baseline +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_03 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_06 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_09 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_12 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_18 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_24 +
1*UPDRS_Part_III_Summary_Score_Baseline + 1*UPDRS_Part_III_Summary_Score_Month_03 +
1*UPDRS_Part_III_Summary_Score_Month_06 + 1*UPDRS_Part_III_Summary_Score_Month_09 +
1*UPDRS_Part_III_Summary_Score_Month_12 + 1*UPDRS_Part_III_Summary_Score_Month_18 +
1*UPDRS_Part_III_Summary_Score_Month_24 +
1*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Baseline +
1*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_06 +
1*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_12 +
1*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_24 +
1*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Baseline +
1*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_06 +
1*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_12 +
1*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_24
s =~ 0*UPDRS_Part_I_Summary_Score_Baseline + 1*UPDRS_Part_I_Summary_Score_Month_03 +
2*UPDRS_Part_I_Summary_Score_Month_06 + 3*UPDRS_Part_I_Summary_Score_Month_09 +
4*UPDRS_Part_I_Summary_Score_Month_12 + 5*UPDRS_Part_I_Summary_Score_Month_18 +
6*UPDRS_Part_I_Summary_Score_Month_24 +
0*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Baseline +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_03 +
2*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_06 +
3*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_09 +
4*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_12 +
5*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_18 +
6*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_24 +
0*UPDRS_Part_III_Summary_Score_Baseline + 1*UPDRS_Part_III_Summary_Score_Month_03 +
2*UPDRS_Part_III_Summary_Score_Month_06 + 3*UPDRS_Part_III_Summary_Score_Month_09 +
4*UPDRS_Part_III_Summary_Score_Month_12 + 5*UPDRS_Part_III_Summary_Score_Month_18 +
6*UPDRS_Part_III_Summary_Score_Month_24 +
0*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Baseline +
2*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_06 +
4*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_12 +
6*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_24 +
0*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Baseline +
2*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_06 +
4*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_12 +
6*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_24
'

fit4 <- growth(model4, data=myData)
summary(fit4)
parameterEstimates(fit4) # extracts the values of the estimated parameters, the standard errors,
# the z-values, the standardized parameter values, and returns a data frame
fitted(fit4) # returns the model-implied (fitted) covariance matrix (and mean vector) of a fitted model

# resid() returns the (unstandardized) residuals of a fitted model, i.e., the differences between
# the observed and model-implied covariance matrix and mean vector
resid(fit4)

==Measures of model quality (Comparative Fit Index (CFI), Root Mean Square Error of Approximation (RMSEA))==

# report the fit measures as a signature vector: Comparative Fit Index (CFI), Root Mean Square Error of
# Approximation (RMSEA), and Standardized Root Mean Square Residual (SRMR)
fitMeasures(fit4, c("cfi", "rmsea", "srmr"))

====Comparative Fit Index====

(CFI) is an incremental measure directly based on the non-centrality measure. If $d = \chi^2 - df$, where $df$ are the degrees of freedom of the model, the Comparative Fit Index is:
$
CFI = \frac{d(\text{Null Model}) - d(\text{Proposed Model})}{d(\text{Null Model})}.
$

$0 \le CFI \le 1$ (by definition). It is interpreted as:

*$CFI < 0.9$: the model fit is poor,

*$0.9 \le CFI \le 0.95$: the fit is considered marginal,

*$CFI > 0.95$: the fit is good.

CFI is a relative index of model fit: it compares the fit of your model to the fit of the (worst-fitting) null model.
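As a rough illustration of this definition, the helper below computes CFI from two chi-square statistics. It is a sketch, not lavaan's implementation: the function name cfi() is made up for this example, negative non-centrality values are truncated at 0 to keep the index in [0, 1], and the null-model statistic (500 on 10 degrees of freedom) is hypothetical because the page does not report it. The proposed-model values (3.703 on 1 degree of freedom) are those shown in the fit5 output on this page.

```r
# Hypothetical helper: CFI from the chi-square statistics of the proposed and null models
cfi <- function(chisq_model, df_model, chisq_null, df_null) {
  d_model <- max(chisq_model - df_model, 0)        # non-centrality of the proposed model
  d_null  <- max(chisq_null - df_null, d_model, 0) # non-centrality of the null model
  if (d_null == 0) return(1)                       # both models fit perfectly
  (d_null - d_model) / d_null
}

# Proposed model: values from the fit5 output; null model: hypothetical values
cfi_value <- cfi(chisq_model = 3.703, df_model = 1, chisq_null = 500, df_null = 10)
cfi_value
```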

====Root Mean Square Error of Approximation====
(RMSEA) - “Ramsey”

An absolute measure of fit based on the non-centrality parameter:

$RMSEA = \sqrt{\frac{\chi^2 - df}{df \times (N - 1)}}$,

where $N$ is the sample size and $df$ the degrees of freedom of the model. If $\chi^2 < df$, then the RMSEA is set to 0. It penalizes model complexity via the chi-square to $df$ ratio. The RMSEA is a popular measure of model fit.

*RMSEA < 0.01, excellent,

*RMSEA < 0.05, good

*RMSEA > 0.10 cutoff for poor fitting models
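The RMSEA formula can be checked by hand against the lavaan output reported for the fit5 model on this page (test statistic 3.703, 1 degree of freedom, 661 observations). The function name rmsea() below is introduced only for this sketch.

```r
# Hypothetical helper implementing the RMSEA formula given above
rmsea <- function(chisq, df, n) {
  sqrt(max(chisq - df, 0) / (df * (n - 1)))  # set to 0 when chisq < df
}

# Values taken from the fit5 lavaan output on this page
rmsea_fit5 <- rmsea(chisq = 3.703, df = 1, n = 661)
round(rmsea_fit5, 3)
```

The rounded value agrees with the rmsea entry (0.064) that fitMeasures() reports for fit5.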

====Standardized Root Mean Square Residual====
(SRMR) is an absolute measure of fit defined as the standardized difference between the observed correlation and the predicted correlation. A value of zero indicates perfect fit. The SRMR has no penalty for model complexity. SRMR <0.08 is considered a good fit.
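A simplified version of this computation is sketched below. It assumes both inputs are already correlation matrices and averages the squared residuals over the lower triangle including the diagonal; software packages differ in the exact standardization, so this is an illustration rather than a reproduction of lavaan's srmr value. The function name and the two small matrices are hypothetical.

```r
# Hypothetical helper: root mean square of the residual correlations
srmr <- function(observed, implied) {
  residuals <- observed - implied
  idx <- lower.tri(residuals, diag = TRUE)  # unique elements of a symmetric matrix
  sqrt(mean(residuals[idx]^2))
}

# Hypothetical observed and model-implied correlation matrices
obs <- matrix(c(1, 0.5, 0.5, 1), nrow = 2)
imp <- matrix(c(1, 0.4, 0.4, 1), nrow = 2)
srmr_value <- srmr(obs, imp)
srmr_value
```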

# inspect the model results (report parameter table)
inspect(fit4)

#install.packages("semTools")
# library("semTools")

===A Simpler Model (fit5)===

model5 <- '
# intercept (first loading fixed to 1, remaining loadings estimated) and slope (fixed coefficients)
i =~ UPDRS_Part_I_Summary_Score_Baseline + UPDRS_Part_I_Summary_Score_Month_03 + UPDRS_Part_I_Summary_Score_Month_24
s =~ 0*UPDRS_Part_I_Summary_Score_Baseline + 1*UPDRS_Part_I_Summary_Score_Month_03 + 6*UPDRS_Part_I_Summary_Score_Month_24
# regressions
i ~ R_fusiform_gyrus_Volume + Weight + ResearchGroup + Age + chr12_rs34637584_GT
s ~ R_fusiform_gyrus_Volume + Weight + ResearchGroup + Age + chr12_rs34637584_GT
# time-varying covariates
UPDRS_Part_I_Summary_Score_Baseline ~ Weight
UPDRS_Part_I_Summary_Score_Month_03 ~ ResearchGroup
UPDRS_Part_I_Summary_Score_Month_24 ~ Age
'

fit5 <- growth(model5, data=myData)
summary(fit5); fitMeasures(fit5, c("cfi", "rmsea", "srmr"))
parameterEstimates(fit5) # extracts the values of the estimated parameters, the standard errors,
# the z-values, the standardized parameter values, and returns a data frame

lavaan (0.5-18) converged normally after 99 iterations
Number of observations 661
Estimator ML
Minimum Function Test Statistic 3.703
Degrees of freedom 1
P-value (Chi-square) 0.054
Parameter estimates:
Information Expected
Standard Errors Standard
Estimate Std.err Z-value P(>|z|)
Latent variables:
i =~
UPDRS_P_I_S_S 1.000
UPDRS_P_I_S_S 1.074
UPDRS_P_I_S_S 1.172
s =~
UPDRS_P_I_S_S 0.000
UPDRS_P_I_S_S 1.000
UPDRS_P_I_S_S 6.000

Regressions:
i ~
R_fsfrm_gyr_V 0.000
Weight 0.003
ResearchGroup -0.880
Age -0.009
c12_34637584_ -0.907
s ~
R_fsfrm_gyr_V -0.000
Weight -0.000
ResearchGroup -0.084
Age 0.002
c12_34637584_ -0.047
UPDRS_Part_I_Summary_Score_Baseline ~
Weight -0.000
UPDRS_Part_I_Summary_Score_Month_03 ~
ResearchGroup 0.693
UPDRS_Part_I_Summary_Score_Month_24 ~
Age -0.002

Covariances:
i ~~
s 0.074

Intercepts:
UPDRS_P_I_S_S 0.000
UPDRS_P_I_S_S 0.000
UPDRS_P_I_S_S 0.000
i 1.633
s -0.023

Variances:
UPDRS_P_I_S_S 1.017
UPDRS_P_I_S_S 1.093
UPDRS_P_I_S_S 2.993
i 1.019
s -0.025

cfi rmsea srmr
0.996 0.064 0.008

fitted(fit5) # return the model-implied (fitted) covariance matrix (and mean vector) of a fitted model
# write.table(fitted(fit5), file="C:\\Users\\Dinov\\Desktop\\test1.txt")

# resid() returns the (unstandardized) residuals of a fitted model, i.e., the differences between
# the observed and model-implied covariance matrix and mean vector
resid(fit5)

# report the fit measures as a signature vector
fitMeasures(fit5, c("cfi", "rmsea", "srmr")) # CFI, RMSEA and SRMR

# inspect the model results (report parameter table)
inspect(fit5)

Note: See the discussion of SEM modeling pros/cons (footnote 2).

==Generalized Estimating Equation (GEE) Modeling==

Generalized Estimating Equations (GEE) modeling (footnote 3) is used for analyzing data with the following characteristics:
(1) the observations within a group may be correlated, (2) observations in separate clusters are independent, (3) a monotone transformation of the expectation is linearly related to the explanatory variables, and (4) the variance is a function of the expectation. The expectation (#3) and the variance (#4) are conditional on group-level or individual-level covariates.

GEE is applied to handle correlated discrete and continuous outcome variables. For the outcome variables, it only requires specification of the first 2 moments and the correlation among them. The goal is to estimate fixed parameters without specifying their joint distribution. The correlation is specified by one of these 4 alternatives, selected via the corstr argument of the R call, e.g., geeglm(outcome ~ center + treat + sex + baseline + age, data = respiratory, family = "binomial", id = id, corstr = "exchangeable", scale.fix = TRUE):

<center>[[Image:SMHS_BigDataBigSci8.png|300px]]</center>
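To make the working-correlation choices concrete, the base-R sketch below builds three commonly used structures for 4 repeated measurements. It assumes that the alternatives are independence, exchangeable, AR(1) and unstructured (the image itself is not reproduced here), and the correlation value rho is hypothetical.

```r
n_times <- 4    # four visits per subject
rho     <- 0.5  # hypothetical within-subject correlation

# Independence: no within-subject correlation
independence <- diag(n_times)

# Exchangeable: the same correlation for every pair of visits
exchangeable <- matrix(rho, n_times, n_times)
diag(exchangeable) <- 1

# AR(1): correlation decays with the lag between visits
ar1 <- rho^abs(outer(1:n_times, 1:n_times, "-"))

# Unstructured: every pair of visits has its own correlation parameter
n_unstructured_params <- choose(n_times, 2)

exchangeable
ar1
```

The unstructured form is the most general but needs choose(4, 2) = 6 parameters here, whereas the exchangeable and AR(1) forms each need only one.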

===Respiratory Illness GEE R example===

This example is based on a data set on respiratory illness (footnote 4) and the geepack package. The data come from a clinical study of treatment effects on patients with respiratory illness, in which N=111 patients from 2 clinical centers were randomized to receive either placebo or active treatment. Four follow-up examinations assessed the respiratory state of each patient as good (=1) or poor (=0). The explanatory variables characterizing a patient were: center (1, 2), treatment (A=active, P=placebo), sex (M=male, F=female), and age (in years) at baseline. The values of the covariates were constant across the repeated observations on each patient.

Table 1 shows the number of patients for each response pattern across the 4 visits, split by baseline status and treatment. Patients with baseline respiratory status = 0 appear to have either a low or a high number of positive responses. Patients with baseline respiratory status = 1 tend to respond positively. Table 2 describes the distribution of the number of positive responses per patient by sex and center.

# library("geepack")

Table 1: Distribution of patients for different response patterns classified by baseline-respiratory response and treatment. The patterns are ordered according to increasing numbers of positive responses.

<center>
{| class="wikitable" style="text-align:center; width:75%" border="1"
|-
! ||Visit|| colspan="15"| Response Patterns (15 of the 2*2*2*2=16 possible patterns; the pattern 0,1,0,1 is not listed)||
|-
|||1||0||1||0||0||0||1||1||1||0||0||1||1||1||0||1||
|-
|||2||0||0||1||0||0||1||0||0||1||0||1||1||0||1||1||
|-
|||3||0||0||0||1||0||0||1||0||1||1||1||0||1||1||1||
|-
|||4||0||0||0||0||1||0||0||1||0||1||0||1||1||1||1||
|-
!Baseline||Treatment||||||||||||||||||||||||||||||||Sum
|-
| rowspan="2"|0||A||7||2||2||2||1||0||1||0||1||0||1||2||0||4||7||30
|-
|P||18||1||0||2||1||2||0||0||1||0||0||1||2||0||3||31
|-
|rowspan="2"|1||A||0||0||0||0||0||0||1||1||0||0||4||0||1||0||17||24
|-
|P||1||4||1||0||0||0||0||1||1||3||1||1||2||1||10||26
|-
|Sum||||26||7||3||4||2||2||2||2||3||3||6||4||5||5||37||111
|}
</center>
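The full set of possible response patterns, ordered by the number of positive responses as in Table 1, can be enumerated with a short base-R sketch. The column names visit1 to visit4 are chosen for this illustration only.

```r
# All 2^4 binary response patterns over the four visits
patterns <- expand.grid(visit1 = 0:1, visit2 = 0:1, visit3 = 0:1, visit4 = 0:1)

# Order the patterns by increasing number of positive responses
n_positive <- rowSums(patterns)
patterns <- patterns[order(n_positive), ]

# Number of distinct patterns for each count of positive responses (0 to 4)
pattern_counts <- table(rowSums(patterns))
pattern_counts
```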

Table 2: Distribution of patients for the number of positive responses across the 4 visits for Sex and Center.

<center>
{| class="wikitable" style="text-align:center; width:75%" border="1"
|-
! colspan="2" rowspan="2"| ||colspan="5"|Number of positive responses
|-
| 0||1||2||3||4
|-
|rowspan="2"|Sex || F||7||3||3||3||7
|-
|M||19||13||9||17||30
|-
|rowspan="2"|Center|| 1||18||9||6||11||12
|-
|2||8||7||6||9||25
|}
</center>

Figure 1 shows a plot of age against the proportion of positive responses. It suggests a quadratic relationship between the proportions and age. We first fit a logistic model to the data, which would be appropriate if there were no time effects and no spread in the response probabilities among patients with the same covariate values.

# install.packages("geepack")
library("geepack")

# the data come from a clinical trial in which 111 patients with respiratory illness from two different clinics were randomized
# to receive either placebo (P) or an active (A) treatment. Patients were examined at baseline and at four visits during treatment.
# At each examination, respiratory status was categorized as 1 = good or 0 = poor.
data("respiratory")
head(respiratory)
myData <- respiratory

<center>head(myData)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Center||ID||Treat||Sex||Age||Baseline||Visit||Outcome
|-
|1 ||1||1||P||M||46||0||1||0
|-
|2 ||1||1||P||M||46||0||2||0
|-
|3 ||1||1||P||M||46||0||3||0
|-
|4 ||1||1||P||M||46||0||4||0
|-
|5||1||2||P||M||28||0||1||0
|-
|6||1||2||P||M||28||0||2||0
|}
</center>

# Get proportions of positive responses at each age (outcome: 1 = good/positive, 0 = poor/negative)
responses <- factor(myData$\$$outcome, levels = c(0, 1), labels = c("OutcomeNegative", "OutcomePositive"))
resp.age <- data.frame(responses, myData$\$$age) # named resp.age to avoid masking the data.frame() function
head(resp.age)
tab <- prop.table(table(resp.age), 2); tab # compute column-wise proportions (within each age)
colSums(tab) # check proportions (each age column sums to 1.0)
prop <- tab["OutcomePositive", ] # save the proportions of positive responses at each age
plot(as.numeric(dimnames(tab)$\$$myData.age), prop, xlab = "Age", ylab = "Proportion of Positive Outcomes")
# dimnames(tab) # to see/inspect positive/negative outcomes

[[Image:SMHS_BigDataBigSci9.png|500px]]

x <- as.numeric(dimnames(tab)$\$$myData.age)
poly <- loess(prop ~ x) # fit a local polynomial regression (loess)
plot(x, prop)
lines(x, predict(poly), col='red', lwd=2) # plot the fitted values against age, not against the index

smoothingSpline <- smooth.spline(x, prop, spar=0.6)
plot(x, prop)
lines(smoothingSpline, col='red', lwd=1.5)
smoothPolySpline <- smooth.spline(x, predict(poly), spar=0.6)
lines(smoothPolySpline, col='blue', lwd=2)
legend("topright", inset=.05, title="Polynomial regression models", c("Raw Poly","Smooth Poly"), fill=c('red', 'blue'), horiz=TRUE)

[[Image:SMHS_BigDataBigSci10.png|500px]]

model.glm <- glm(outcome ~ baseline + center + sex + treat + age + I(age^2), data = respiratory, family = binomial)

summary(model.glm)

<center>Deviance Residuals:
{| class="wikitable" style="text-align:center; " border="1"
|-
|Min||1Q||Median||3Q||Max
|-
| -2.5951||-0.9108||0.4034||0.8336||2.0951
|}
</center>

<center>Coefficients:
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Estimate||Std. Error||z value||$Pr( \gt |z|)$
|-
|(Intercept)||3.3579727||1.0285292||3.265||0.0011 **
|-
|baseline||1.8850421||0.2482959||7.592||3.15e-14 ***
|-
|center||0.5099244||0.2453982||2.078||0.0377 *
|-
|sexM||-0.4510595||0.3166570||-1.424||0.1543
|-
|Treatp||-1.3231587||0.2431603||-5.442||5.28e-08 ***
|-
|age||-0.2072815||0.0472538||-4.387||1.15e-05 ***
|-
|I(age^2)||0.0025650||0.0006324||4.056||4.99e-05 ***
|}
</center>

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 609.41 on 443 degrees of freedom

Residual deviance: 468.62 on 437 degrees of freedom

AIC: 482.62

The correlation matrix of the outcome measures across visits is shown in Table 3.

attach(myData)
mat1 <- matrix(c(outcome[visit==1], outcome[visit==2], outcome[visit==3],
outcome[visit==4]), ncol = 4)
cor(mat1)

Table 3: Correlation matrix for the outcome measurements at different visits.

<center>Correlations:
{| class="wikitable" style="text-align:center; " border="1"
|-
|||[,1]||[,2]||[,3]||[,4]
|-
|[,1]||1.0000000||0.5087944||0.4431438||0.5139016
|-
|[,2]||0.5087944||1.0000000||0.5821877||0.5301611
|-
|[,3]||0.4431438||0.5821877||1.0000000||0.5871276
|-
|[,4]||0.5139016||0.5301611||0.5871276||1.0000000
|}
</center>

# We can also check for multicollinearity problems using the correlation matrix of the design matrix X
cor(model.matrix(model.glm)[,-1])

# GEE modeling: R function arguments/options

*corstr= for defining the correlation structure within groups in a GEE model

*id= is used to identify the grouping variable in a GEE model

*scale.fix= when TRUE causes the scale parameter to be fixed (by default at 1) rather than estimated

*waves= names a positive integer-valued variable that is used to identify the order and spacing of observations within groups in a GEE model. This argument is crucial when there are missing values and gaps in the data

gee.model1 <- geeglm(outcome ~ center + treat + sex + baseline + age, data = respiratory, family = "binomial", id = id, corstr = "exchangeable", scale.fix = TRUE)

# The column labeled Wald in the summary table is the square of the z-statistic. The reported p-values are the
# upper-tail probabilities of a chi-square distribution with 1 degree of freedom and test whether the true parameter value ≠ 0.
summary(gee.model1)
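The relation between the Wald statistic and the z-statistic can be verified in base R. The sketch below borrows the estimate and standard error of the treatment effect from the logistic regression (glm) table above, purely to illustrate the arithmetic; the GEE fit will generally report a different standard error.

```r
# Estimate and standard error of the treatment effect from the glm table above
estimate <- -1.3231587
std_err  <- 0.2431603

z    <- estimate / std_err  # z-statistic
wald <- z^2                 # Wald statistic

# Upper-tail chi-square (1 df) p-value and the equivalent two-sided normal p-value
p_chisq  <- pchisq(wald, df = 1, lower.tail = FALSE)
p_normal <- 2 * pnorm(-abs(z))
c(z = z, wald = wald, p_chisq = p_chisq, p_normal = p_normal)
```

Both p-values coincide, so testing with the Wald chi-square statistic is equivalent to a two-sided z-test.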

# To test the effect of ''treatment'' using anova()
gee.model1 <- geeglm(outcome ~ center + treat + sex + baseline + age, data = respiratory, family=binomial(link="logit"), id = id, corstr = "exchangeable", std.err="san.se")
gee.model2 <- geeglm(outcome ~ center + sex + baseline + age, data = respiratory, family=binomial(link="logit"), id=id, corstr = "exchangeable", std.err="san.se")
anova(gee.model1, gee.model2)

# To test whether a categorical predictor with more than two levels should be retained in a GEE model we need
# to test the entire set of dummy variables simultaneously as a single construct.
# The geepack package provides a method for the anova function for a multivariate Wald test
# When the anova function is applied to a single geeglm object it returns sequential Wald tests for
# individual predictors with the tests carried out in the order the predictors are listed in the model formula.
anova(gee.model1)

===PD GEE example===

This example uses the PPMI/PD data to demonstrate GEE analysis.

# 05_PPMI_top_UPDRS_Integrated_LongFormat1.csv
longData <- read.csv("https://umich.instructure.com/files/330397/download?download_frd=1",header=TRUE)

# library("geepack")

# Data Elements: FID_IID L_insular_cortex_ComputeArea L_insular_cortex_Volume R_insular_cortex_ComputeArea R_insular_cortex_Volume L_cingulate_gyrus_ComputeArea L_cingulate_gyrus_Volume R_cingulate_gyrus_ComputeArea R_cingulate_gyrus_Volume L_caudate_ComputeArea L_caudate_Volume R_caudate_ComputeArea R_caudate_Volume L_putamen_ComputeArea L_putamen_Volume R_putamen_ComputeArea R_putamen_Volume Sex Weight ResearchGroup Age chr12_rs34637584_GT chr17_rs11868035_GT chr17_rs11012_GT chr17_rs393152_GT chr17_rs12185268_GT chr17_rs199533_GT UPDRS_part_I UPDRS_part_II UPDRS_part_III time_visit

dim(longData)

data1 = na.omit(longData)
attach(data1)
ControlGroup <- ifelse(ResearchGroup == "Control", 1, 0)

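# these calculations take a long time!!!
# note: geeglm() assumes that the rows of each cluster (id) are contiguous; sort data1 by FID_IID first if they are not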
# if you get “Error in geese.fit(xx, yy, id, offset, soffset, w, waves = waves, zsca, :
# nrow(zsca) and length(y) not match” – this indicates some of the variables are of different lengths
# if you get “glm.fit: algorithm did not converge” – see this discussion: http://goo.gl/lrjBjB

gee.model0 <- geeglm(ControlGroup ~ L_insular_cortex_ComputeArea+L_insular_cortex_Volume+ Sex + Weight + Age + chr17_rs11012_GT + chr17_rs199533_GT + UPDRS_part_I + UPDRS_part_II + time_visit, data = data1, family=binomial(link="logit"), id = FID_IID, corstr = "unstructured", std.err="san.se")

gee.model1 <- geeglm(ControlGroup ~ L_insular_cortex_ComputeArea+L_insular_cortex_Volume+ R_putamen_ComputeArea + R_putamen_Volume + Sex + Weight + Age + chr17_rs11012_GT + chr17_rs199533_GT + UPDRS_part_I + UPDRS_part_II + time_visit, data = data1, family=binomial(link="logit"), id = FID_IID, corstr = "unstructured", std.err="san.se")

# compare 2 gee models
# anova(gee.model0,gee.model1)

# you can try the “family = poisson(link = "log")” model for the ResearchGroup response, as well

gee.model2 <- geeglm(ControlGroup
~ L_insular_cortex_ComputeArea+L_insular_cortex_Volume+R_insular_cortex_ComputeArea+ R_insular_cortex_Volume +L_cingulate_gyrus_ComputeArea + L_cingulate_gyrus_Volume + R_cingulate_gyrus_ComputeArea + R_cingulate_gyrus_Volume + L_caudate_ComputeArea + L_caudate_Volume + R_caudate_ComputeArea + R_caudate_Volume + L_putamen_ComputeArea + L_putamen_Volume + R_putamen_ComputeArea + R_putamen_Volume + Sex + Weight + Age + chr12_rs34637584_GT + chr17_rs11868035_GT + chr17_rs11012_GT + chr17_rs393152_GT + chr17_rs12185268_GT + chr17_rs199533_GT + UPDRS_part_I + UPDRS_part_II + time_visit, data = data1, family=binomial(link="logit"), id = FID_IID, corstr = "unstructured", std.err="san.se")

Remember that we do not interpret GEE coefficients as relating to individuals: GEE models are marginal models, and the conclusions drawn are interpreted as population-based. Also, the time element in the model (time_visit) is just another controlling factor. The effect sizes (betas) associated with each predictor represent the slopes of the corresponding covariate on the scale of the link function (here, log-odds), while holding time and the other covariates constant. If we need to examine interactions (e.g., Weight change over time), we need to include an interaction term in the model (e.g., + Weight*time_visit).
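In R formula notation, the * operator adds both main effects and their interaction. The short sketch below only inspects a formula and does not fit a model; the reduced formula is used for illustration.

```r
# Illustrative formula with an interaction between Weight and time_visit
f <- ControlGroup ~ Weight * time_visit

# Terms that R expands the right-hand side into
term_labels <- attr(terms(f), "term.labels")
term_labels
```

The term Weight:time_visit is the interaction itself; its coefficient describes how the Weight effect changes with time.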

summary(gee.model2)

# Individual Wald test and confidence intervals for each covariate
predictors2 <- coef(summary(gee.model2))
CI2 <- with(as.data.frame(predictors2), cbind(lwr=Estimate-1.96*Std.err, est=Estimate, upr=Estimate+1.96*Std.err))
rownames(CI2) <- rownames(predictors2)
CI2
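Because the models above use a logit link, the interval endpoints are on the log-odds scale, and exponentiating them gives odds ratios. The sketch below uses a hypothetical estimate and standard error rather than values from gee.model2.

```r
# Hypothetical coefficient and robust standard error on the log-odds scale
estimate <- -1.32
std_err  <- 0.24

z_crit <- qnorm(0.975)  # approximately 1.96, the multiplier used for 95% intervals

ci_logit <- c(lwr = estimate - z_crit * std_err,
              est = estimate,
              upr = estimate + z_crit * std_err)

# Odds ratio and its 95% confidence interval
odds_ratio <- exp(ci_logit)
odds_ratio
```

An interval for the odds ratio that excludes 1 corresponds to a log-odds interval that excludes 0.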

==Appendix==

SEM References

*http://socserv.mcmaster.ca/jfox/Misc/sem/SEM-paper.pdf

GEE References

*https://cran.r-project.org/web/packages/geepack/geepack.pdf

*http://www.jstatsoft.org/v15/i02/paper

===Footnotes===

*2 http://www.imachordata.com/ecological-sems-and-composite-variables-what-why-and-how/

==See also==
* [[SMHS_BigDataBigSci| Back to Model-based Analytics]]
* [[SMHS_BigDataBigSci_SEM| Structural Equation Modeling (SEM)]]
* [[SMHS_BigDataBigSci_GEE| Next Section: Generalized Estimating Equation (GEE) Modeling]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php?title=SMHS_BigDataBigSci_GCM}}

SMHS BigDataBigSci GCM

2016-05-23T19:14:10Z

Pineaumi: /* Footnotes */

==[[SMHS_BigDataBigSci| Model-based Analytics]] - Growth Curve Models==

Latent growth curve models may be used to analyze longitudinal or temporal data where the outcome measure is assessed on multiple occasions and we examine its change over time; e.g., the trajectory over time can be modeled as a linear or quadratic function. Random effects capture individual differences and are conveniently represented as (continuous) latent variables, also known as growth factors. To fit a linear growth model, we may specify a model with two latent variables, a random intercept and a random slope:

# load data 05_PPMI_top_UPDRS_Integrated_LongFormat.csv (dim(myData): 661 x 71, wide format)
# install.packages("lavaan")
library("lavaan") # provides growth(), parameterEstimates(), fitMeasures(), inspect()
# setwd("/dir/")
myData <- read.csv("https://umich.instructure.com/files/330395/download?download_frd=1&verifier=v6jBvV4x94ka3EYcGKuXXg5BZNaOLBVp0xkJih0H",header=TRUE)
attach(myData)

# dichotomize the "ResearchGroup" variable
table(myData$\$$ResearchGroup)
myData$\$$ResearchGroup <- ifelse(myData$\$$ResearchGroup == "Control", 1, 0)

# linear growth model with 4 timepoints
# intercept (i) and slope (s) with fixed coefficients
# i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4 (intercept/constant)
# s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4 (slope/linear term)
# q =~ 0*t1 + 1*t2 + 4*t3 + 9*t4 (quadratic term: squared time scores)

In this model, we have fixed all the coefficients of the linear growth functions:

model4 <-
'
i =~ 1*UPDRS_Part_I_Summary_Score_Baseline + 1*UPDRS_Part_I_Summary_Score_Month_03 +
1*UPDRS_Part_I_Summary_Score_Month_06 + 1*UPDRS_Part_I_Summary_Score_Month_09 +
1*UPDRS_Part_I_Summary_Score_Month_12 + 1*UPDRS_Part_I_Summary_Score_Month_18 +
1*UPDRS_Part_I_Summary_Score_Month_24 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Baseline +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_03 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_06 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_09 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_12 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_18 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_24 +
1*UPDRS_Part_III_Summary_Score_Baseline + 1*UPDRS_Part_III_Summary_Score_Month_03 +
1*UPDRS_Part_III_Summary_Score_Month_06 + 1*UPDRS_Part_III_Summary_Score_Month_09 +
1*UPDRS_Part_III_Summary_Score_Month_12 + 1*UPDRS_Part_III_Summary_Score_Month_18 +
1*UPDRS_Part_III_Summary_Score_Month_24 +
1*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Baseline +
1*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_06 +
1*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_12 +
1*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_24 +
1*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Baseline +
1*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_06 +
1*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_12 +
1*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_24
s =~ 0*UPDRS_Part_I_Summary_Score_Baseline + 1*UPDRS_Part_I_Summary_Score_Month_03 +
2*UPDRS_Part_I_Summary_Score_Month_06 + 3*UPDRS_Part_I_Summary_Score_Month_09 +
4*UPDRS_Part_I_Summary_Score_Month_12 + 5*UPDRS_Part_I_Summary_Score_Month_18 +
6*UPDRS_Part_I_Summary_Score_Month_24 +
0*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Baseline +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_03 +
2*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_06 +
3*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_09 +
4*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_12 +
5*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_18 +
6*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_24 +
0*UPDRS_Part_III_Summary_Score_Baseline + 1*UPDRS_Part_III_Summary_Score_Month_03 +
2*UPDRS_Part_III_Summary_Score_Month_06 + 3*UPDRS_Part_III_Summary_Score_Month_09 +
4*UPDRS_Part_III_Summary_Score_Month_12 + 5*UPDRS_Part_III_Summary_Score_Month_18 +
6*UPDRS_Part_III_Summary_Score_Month_24 +
0*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Baseline +
2*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_06 +
4*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_12 +
6*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_24 +
0*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Baseline +
2*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_06 +
4*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_12 +
6*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_24
'

fit4 <- growth(model4, data=myData)
summary(fit4)
parameterEstimates(fit4) # extracts the values of the estimated parameters, the standard errors,
# the z-values, the standardized parameter values, and returns a data frame
fitted(fit4) # returns the model-implied (fitted) covariance matrix (and mean vector) of a fitted model

# resid() returns the (unstandardized) residuals of a fitted model, i.e., the differences between
# the observed and model-implied covariance matrix and mean vector
resid(fit4)

==Measures of model quality (Comparative Fit Index (CFI), Root Mean Square Error of Approximation (RMSEA))==

# report the fit measures as a signature vector: Comparative Fit Index (CFI), Root Mean Square Error of
# Approximation (RMSEA), and Standardized Root Mean Square Residual (SRMR)
fitMeasures(fit4, c("cfi", "rmsea", "srmr"))

====Comparative Fit Index====

(CFI) is an incremental measure directly based on the non-centrality measure. If $d = \chi^2 - df$, where $df$ are the degrees of freedom of the model, the Comparative Fit Index is:
$
CFI = \frac{d(\text{Null Model}) - d(\text{Proposed Model})}{d(\text{Null Model})}.
$

$0 \le CFI \le 1$ (by definition). It is interpreted as:

*$CFI < 0.9$: the model fit is poor,

*$0.9 \le CFI \le 0.95$: the fit is considered marginal,

*$CFI > 0.95$: the fit is good.

CFI is a relative index of model fit: it compares the fit of your model to the fit of the (worst-fitting) null model.

====Root Mean Square Error of Approximation====
(RMSEA) - “Ramsey”

An absolute measure of fit based on the non-centrality parameter:

$RMSEA = \sqrt{\frac{\chi^2 - df}{df \times (N - 1)}}$,

where $N$ is the sample size and $df$ the degrees of freedom of the model. If $\chi^2 < df$, then the RMSEA is set to 0. It penalizes model complexity via the chi-square to $df$ ratio. The RMSEA is a popular measure of model fit.

*RMSEA < 0.01, excellent,

*RMSEA < 0.05, good

*RMSEA > 0.10 cutoff for poor fitting models

====Standardized Root Mean Square Residual====
(SRMR) is an absolute measure of fit defined as the standardized difference between the observed correlation and the predicted correlation. A value of zero indicates perfect fit. The SRMR has no penalty for model complexity. SRMR <0.08 is considered a good fit.

# inspect the model results (report parameter table)
inspect(fit4)

#install.packages("semTools")
# library("semTools")

===A Simpler Model (fit5)===

model5 <- '
# intercept (first loading fixed to 1, remaining loadings estimated) and slope (fixed coefficients)
i =~ UPDRS_Part_I_Summary_Score_Baseline + UPDRS_Part_I_Summary_Score_Month_03 + UPDRS_Part_I_Summary_Score_Month_24
s =~ 0*UPDRS_Part_I_Summary_Score_Baseline + 1*UPDRS_Part_I_Summary_Score_Month_03 + 6*UPDRS_Part_I_Summary_Score_Month_24
# regressions
i ~ R_fusiform_gyrus_Volume + Weight + ResearchGroup + Age + chr12_rs34637584_GT
s ~ R_fusiform_gyrus_Volume + Weight + ResearchGroup + Age + chr12_rs34637584_GT
# time-varying covariates
UPDRS_Part_I_Summary_Score_Baseline ~ Weight
UPDRS_Part_I_Summary_Score_Month_03 ~ ResearchGroup
UPDRS_Part_I_Summary_Score_Month_24 ~ Age
'

fit5 <- growth(model5, data=myData)
summary(fit5); fitMeasures(fit5, c("cfi", "rmsea", "srmr"))
parameterEstimates(fit5) # extracts the values of the estimated parameters, the standard errors,
# the z-values, the standardized parameter values, and returns a data frame

lavaan (0.5-18) converged normally after 99 iterations
Number of observations 661
Estimator ML
Minimum Function Test Statistic 3.703
Degrees of freedom 1
P-value (Chi-square) 0.054
Parameter estimates:
Information Expected
Standard Errors Standard
Estimate Std.err Z-value P(>|z|)
Latent variables:
i =~
UPDRS_P_I_S_S 1.000
UPDRS_P_I_S_S 1.074
UPDRS_P_I_S_S 1.172
s =~
UPDRS_P_I_S_S 0.000
UPDRS_P_I_S_S 1.000
UPDRS_P_I_S_S 6.000

Regressions:
i ~
R_fsfrm_gyr_V 0.000
Weight 0.003
ResearchGroup -0.880
Age -0.009
c12_34637584_ -0.907
s ~
R_fsfrm_gyr_V -0.000
Weight -0.000
ResearchGroup -0.084
Age 0.002
c12_34637584_ -0.047
UPDRS_Part_I_Summary_Score_Baseline ~
Weight -0.000
UPDRS_Part_I_Summary_Score_Month_03 ~
ResearchGroup 0.693
UPDRS_Part_I_Summary_Score_Month_24 ~
Age -0.002

Covariances:
i ~~
s 0.074

Intercepts:
UPDRS_P_I_S_S 0.000
UPDRS_P_I_S_S 0.000
UPDRS_P_I_S_S 0.000
i 1.633
s -0.023

Variances:
UPDRS_P_I_S_S 1.017
UPDRS_P_I_S_S 1.093
UPDRS_P_I_S_S 2.993
i 1.019
s -0.025

cfi rmsea srmr
0.996 0.064 0.008

fitted(fit5) # return the model-implied (fitted) covariance matrix (and mean vector) of a fitted model
# write.table(fitted(fit5), file="C:\\Users\\Dinov\\Desktop\\test1.txt")

# resid() function return (unstandardized) residuals of a fitted model including the difference between
# the observed and implied covariance matrix and mean vector
resid(fit5)

# report the fit measures as a signature vector
fitMeasures(fit5, c("cfi", "rmsea", "srmr")) # comparative fit index (CFI)

# inspect the model results (report parameter table)
inspect(fit5)

Note: See the discussion of SEM modeling pros and cons (footnote 2).

==Generalized Estimating Equation (GEE) Modeling==

Generalized Estimating Equations (GEE) modeling (footnote 3) is used for analyzing data with the following characteristics:
(1) the observations within a group may be correlated, (2) observations in separate clusters are independent, (3) a monotone transformation of the expectation is linearly related to the explanatory variables, and (4) the variance is a function of the expectation. The expectation (#3) and the variance (#4) are conditional on group-level or individual-level covariates.

GEE is applied to handle correlated discrete and continuous outcome variables. For the outcome variables, it only requires specification of the first 2 moments and the correlation among them. The goal is to estimate fixed parameters without specifying the joint distribution of the outcomes. The working correlation is specified by one of these 4 alternatives, chosen via the corstr argument of the R call, e.g., geeglm(outcome ~ center + treat + sex + baseline + age, data = respiratory, family = "binomial", id = id, corstr = "exchangeable", scale.fix = TRUE):

<center>[[Image:SMHS_BigDataBigSci8.png|300px]]</center>
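To make such alternatives concrete, here is a minimal base-R sketch of three common working correlation structures for 4 repeated measurements. The value rho = 0.5 is an arbitrary illustration and the matrix names are ours; in geepack the corresponding corstr values are "independence", "exchangeable", and "ar1", while "unstructured" estimates every off-diagonal entry separately.

```r
n_visits <- 4   # number of repeated measurements per subject
rho <- 0.5      # illustrative within-subject correlation (assumed value)

# independence: no within-subject correlation
R_independence <- diag(n_visits)

# exchangeable: the same correlation between any two visits
R_exchangeable <- matrix(rho, n_visits, n_visits)
diag(R_exchangeable) <- 1

# AR(1): correlation decays as rho^|j - k| with the lag between visits
R_ar1 <- rho^abs(outer(1:n_visits, 1:n_visits, "-"))

R_exchangeable
R_ar1
```

The choice only affects the working correlation used for estimation; with robust (sandwich) standard errors the parameter estimates remain consistent even if the chosen structure is not exactly right.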

===Respiratory Illness GEE R example===

This example is based on a data set on respiratory illness (footnote 4) and the geepack package. The data come from a clinical study of treatment effects on patients with respiratory illness. N=111 patients from 2 clinical centers were randomized to receive either placebo or active treatment. Four examinations over time assessed the respiratory state of each patient as good (=1) or poor (=0). The explanatory variables characterizing a patient were: center (1, 2), treatment (A=active, P=placebo), sex (M=male, F=female), and age (in years) at baseline. The values of the covariates were constant across the repeated observations on each patient.

Table 1 shows the number of patients for each response pattern across the 4 visits, split by baseline status and treatment. Patients with baseline respiratory status = 0 appear to have either a low or a high number of positive responses. Patients with baseline respiratory status = 1 tend to respond positively. Table 2 describes the distribution of the number of positive responses per patient by sex and center.

# library("geepack")

Table 1: Distribution of patients for different response patterns classified by baseline-respiratory response and treatment. The patterns are ordered according to increasing numbers of positive responses.

<center>
{| class="wikitable" style="text-align:center; width:75%" border="1"
|-
! ||Visit|| colspan="15"| All Possible Response Patterns (2*2*2*2=16 permutation patterns)||
|-
|||1||0||1||0||0||0||1||1||1||0||0||1||1||1||0||1||
|-
|||2||0||0||1||0||0||1||0||0||1||0||1||1||0||1||1||
|-
|||3||0||0||0||1||0||0||1||0||1||1||1||0||1||1||1||
|-
|||4||0||0||0||0||1||0||0||1||0||1||0||1||1||1||1||
|-
!Baseline||Treatment||||||||||||||||||||||||||||||||Sum
|-
| rowspan="2"|0||A||7||2||2||2||1||0||1||0||1||0||1||2||0||4||7||30
|-
|P||18||1||0||2||1||2||0||0||1||0||0||1||2||0||3||31
|-
|rowspan="2"|1||A||0||0||0||0||0||0||1||1||0||0||4||0||1||0||17||24
|-
|P||1||4||1||0||0||0||0||1||1||3||1||1||2||1||10||26
|-
|Sum||||26||7||3||4||2||2||2||2||3||3||6||4||5||5||37||111
|}
</center>

Table 2: Distribution of patients for the number of positive responses across the 4 visits for Sex and Center.

<center>
{| class="wikitable" style="text-align:center; width:75%" border="1"
|-
! colspan="2" rowspan="2"| ||colspan="5"|Number of positive responses
|-
| 0||1||2||3||4
|-
|rowspan="2"|Sex || F||7||3||3||3||7
|-
|M||19||13||9||17||30
|-
|rowspan="2"|Center|| 1||18||9||6||11||12
|-
|2||8||7||6||9||25
|}
</center>

Figure 1 shows a plot of age against the proportion of positive responses for each patient. It indicates a quadratic relationship between the proportions and age. We therefore fit a logistic model with a quadratic age term to the data (this would be appropriate if there were no time effects and no spread in the response probabilities for patients with the same covariate values).

# install.packages("geepack")
library("geepack")

# The data come from a clinical trial in which 111 patients with respiratory illness from two different clinics were randomized to receive either
# placebo (P) or an active (A) treatment. Patients were examined at baseline and at four visits during treatment.
# At each examination, respiratory status was categorized as 1 = good or 0 = poor.
data("respiratory")
head(respiratory)
myData <- respiratory

<center>head(myData)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Center||ID||Treat||Sex||Age||Baseline||Visit||Outcome
|-
|1 ||1||1||P||M||46||0||1||0
|-
|2 ||1||1||P||M||46||0||2||0
|-
|3 ||1||1||P||M||46||0||3||0
|-
|4 ||1||1||P||M||46||0||4||0
|-
|5||1||2||P||M||28||0||1||0
|-
|6||1||2||P||M||28||0||2||0
|}
</center>

# Get proportions of positive responses (outcome: 1 = good, 0 = poor)
responses <- factor(myData$outcome, levels = c(1, 0), labels = c("OutcomePositive", "OutcomeNegative"))
respData <- data.frame(responses, myData$age) # the second column is automatically named "myData.age"
head(respData)
tab <- prop.table(table(respData), 2); tab # column proportions: share of each outcome within every age
sum(tab[, 1]) # check that the proportions within an age sum to 1.0
prop <- tab["OutcomePositive", ] # save the proportion of positive responses at each age
plot(as.numeric(dimnames(tab)$myData.age), prop, xlab = "Age", ylab = "Proportion of Positive Outcomes")
# dimnames(tab) # to see/inspect positive/negative outcomes

[[Image:SMHS_BigDataBigSci9.png|500px]]

x <- as.numeric(dimnames(tab)$myData.age)
poly <- loess(prop ~ x) # local polynomial regression (LOESS) fit
plot(x, prop)
lines(x, predict(poly), col='red', lwd=2) # draw the fitted curve against age

smoothingSpline <- smooth.spline(x, prop, spar=0.6)
plot(x, prop)
lines(smoothingSpline, col='red', lwd=1.5)
smoothPolySpline <- smooth.spline(x, predict(poly), spar=0.6)
lines(smoothPolySpline, col='blue', lwd=2)
legend("topright", inset=.05, title="Polynomial regression models", c("Raw Poly","Smooth Poly"), fill=c('red', 'blue'), horiz=TRUE)

[[Image:SMHS_BigDataBigSci10.png|500px]]

model.glm <- glm(outcome ~ baseline + center + sex + treat + age + I(age^2), data = respiratory, family = binomial)

summary(model.glm)

<center>Deviance Residuals:
{| class="wikitable" style="text-align:center; " border="1"
|-
|Min||1Q||Median||3Q||Max
|-
| -2.5951||-0.9108||0.4034||0.8336||2.0951
|}
</center>

<center>Coefficients:
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Estimate||Std. Error||z value||$Pr( \gt |z|)$
|-
|(Intercept)||3.3579727||1.0285292||3.265||0.0011 **
|-
|baseline||1.8850421||0.2482959||7.592||3.15e-14 ***
|-
|center||0.5099244||0.2453982||2.078||0.0377 *
|-
|sexM||-0.4510595||0.3166570||-1.424||0.1543
|-
|Treatp||-1.3231587||0.2431603||-5.442||5.28e-08 ***
|-
|age||-0.2072815||0.0472538||-4.387||1.15e-05 ***
|-
|I(age^2)||0.0025650||0.0006324||4.056||4.99e-05 ***
|}
</center>

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 609.41 on 443 degrees of freedom

Residual deviance: 468.62 on 437 degrees of freedom

AIC: 482.62
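The signs of the two age coefficients imply a U-shaped relation between age and the log-odds of a good outcome. A small sketch, using only the estimates printed in the coefficient table above, locates the age at which the fitted log-odds is lowest (the vertex of the quadratic); the variable names are ours.

```r
# coefficient estimates copied from the summary(model.glm) table above
b_age  <- -0.2072815
b_age2 <-  0.0025650

# the quadratic b_age*age + b_age2*age^2 has its minimum at -b_age / (2*b_age2)
age_min <- -b_age / (2 * b_age2)
age_min
```

The turning point falls at about 40 years: holding the other covariates fixed, the fitted odds of a good outcome decrease with age up to roughly that age and increase afterwards.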

The correlation matrix of the outcome measures across visits is shown in Table 3.

attach(myData)
mat1 <- matrix(c(outcome[visit==1], outcome [visit==2], outcome [visit==3],
outcome[visit==4]), ncol = 4)
cor(mat1)

Table 3: Correlation matrix for the outcome measurements at different visits.

<center>Correlations:
{| class="wikitable" style="text-align:center; " border="1"
|-
|||[,1]||[,2]||[,3]||[,4]
|-
|[,1]||1.0000000||0.5087944||0.4431438||0.5139016
|-
|[,2]||0.5087944||1.0000000||0.5821877||0.5301611
|-
|[,3]||0.4431438||0.5821877||1.0000000||0.5871276
|-
|[,4]||0.5139016||0.5301611||0.5871276||1.0000000
|}
</center>
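The off-diagonal entries of Table 3 are of similar size, which is informally consistent with an exchangeable working correlation. As a rough sketch (not the estimator geepack uses), their average can be computed directly from the table values; the object names are ours.

```r
# correlation matrix copied from Table 3
tab3 <- matrix(c(1.0000000, 0.5087944, 0.4431438, 0.5139016,
                 0.5087944, 1.0000000, 0.5821877, 0.5301611,
                 0.4431438, 0.5821877, 1.0000000, 0.5871276,
                 0.5139016, 0.5301611, 0.5871276, 1.0000000),
               nrow = 4, byrow = TRUE)

# average of the 6 distinct between-visit correlations
rho_bar <- mean(tab3[lower.tri(tab3)])
round(rho_bar, 3)
```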

# We can also examine for multicollinearity problem, using the correlation matrix for X
cor(model.matrix(model.glm)[,-1])

# GEE modeling: R function arguments/options

*corstr= for defining the correlation structure within groups in a GEE model

*id= is used to identify the grouping variable in a GEE model

*scale.fix= when TRUE causes the scale parameter to be fixed (by default at 1) rather than estimated

*waves= names a positive integer-valued variable that is used to identify the order and spacing of observations within groups in a GEE model. This argument is crucial when there are missing values and gaps in the data

gee.model1 <- geeglm(outcome ~ center + treat + sex + baseline + age, data = respiratory, family = "binomial", id = id, corstr = "exchangeable", scale.fix = TRUE)

# The column labeled Wald in the summary table is the square of the z-statistic. The reported p-values are the
# upper-tail probabilities of a chi-square distribution with 1 degree of freedom; they test the null hypothesis that the true parameter value equals 0.
summary(gee.model1)

# To test the effect of ''treatment'' using anova()
gee.model1 <- geeglm(outcome ~ center + treat + sex + baseline + age, data = respiratory, family=binomial(link="logit"), id = id, corstr = "exchangeable", std.err="san.se")
gee.model2 <- geeglm(outcome ~ center + sex + baseline + age, data = respiratory, family=binomial(link="logit"), id=id, corstr = "exchangeable", std.err="san.se")
anova(gee.model1, gee.model2)

# To test whether a categorical predictor with more than two levels should be retained in a GEE model we need
# to test the entire set of dummy variables simultaneously as a single construct.
# The geepack package provides a method for the anova function for a multivariate Wald test
# When the anova function is applied to a single geeglm object it returns sequential Wald tests for
# individual predictors with the tests carried out in the order the predictors are listed in the model formula.
anova(gee.model1)

===PD GEE example===

This example uses the PPMI/PD data to illustrate GEE analysis.

# 05_PPMI_top_UPDRS_Integrated_LongFormat1.csv
longData <- read.csv("https://umich.instructure.com/files/330397/download?download_frd=1",header=TRUE)

# library("geepack")

# Data Elements: FID_IID L_insular_cortex_ComputeArea L_insular_cortex_Volume R_insular_cortex_ComputeArea R_insular_cortex_Volume L_cingulate_gyrus_ComputeArea L_cingulate_gyrus_Volume R_cingulate_gyrus_ComputeArea R_cingulate_gyrus_Volume L_caudate_ComputeArea L_caudate_Volume R_caudate_ComputeArea R_caudate_Volume L_putamen_ComputeArea L_putamen_Volume R_putamen_ComputeArea R_putamen_Volume Sex Weight ResearchGroup Age chr12_rs34637584_GT chr17_rs11868035_GT chr17_rs11012_GT chr17_rs393152_GT chr17_rs12185268_GT chr17_rs199533_GT UPDRS_part_I UPDRS_part_II UPDRS_part_III time_visit

dim(longData)

data1 = na.omit(longData)
attach(data1)
ControlGroup <- ifelse(ResearchGroup == "Control", 1, 0)

# these calculations take a long time!!!
# if you get “Error in geese.fit(xx, yy, id, offset, soffset, w, waves = waves, zsca, :
# nrow(zsca) and length(y) not match” – this indicates some of the variables are of different lengths
# if you get “glm.fit: algorithm did not converge” – see this discussion: http://goo.gl/lrjBjB

gee.model0 <- geeglm(ControlGroup ~ L_insular_cortex_ComputeArea+L_insular_cortex_Volume+ Sex + Weight + Age + chr17_rs11012_GT + chr17_rs199533_GT + UPDRS_part_I + UPDRS_part_II + time_visit, data = data1, family=binomial(link="logit"), id = FID_IID, corstr = "unstructured", std.err="san.se")

gee.model1 <- geeglm(ControlGroup ~ L_insular_cortex_ComputeArea+L_insular_cortex_Volume+ R_putamen_ComputeArea + R_putamen_Volume + Sex + Weight + Age + chr17_rs11012_GT + chr17_rs199533_GT + UPDRS_part_I + UPDRS_part_II + time_visit, data = data1, family=binomial(link="logit"), id = FID_IID, corstr = "unstructured", std.err="san.se")

# compare 2 gee models
# anova(gee.model0,gee.model1)

# you can try the “family = poisson(link = "log")” model for the ResearchGroup response, as well

gee.model2 <- geeglm(ControlGroup
~ L_insular_cortex_ComputeArea+L_insular_cortex_Volume+R_insular_cortex_ComputeArea+ R_insular_cortex_Volume +L_cingulate_gyrus_ComputeArea + L_cingulate_gyrus_Volume + R_cingulate_gyrus_ComputeArea + R_cingulate_gyrus_Volume + L_caudate_ComputeArea + L_caudate_Volume + R_caudate_ComputeArea + R_caudate_Volume + L_putamen_ComputeArea + L_putamen_Volume + R_putamen_ComputeArea + R_putamen_Volume + Sex + Weight + Age + chr12_rs34637584_GT + chr17_rs11868035_GT + chr17_rs11012_GT + chr17_rs393152_GT + chr17_rs12185268_GT + chr17_rs199533_GT + UPDRS_part_I + UPDRS_part_II + time_visit, data = data1, family=binomial(link="logit"), id = FID_IID, corstr = "unstructured", std.err="san.se")

Remember that we do not interpret GEE coefficients as relating to individuals: GEE models are marginal models, and the conclusions drawn are interpreted as population-based (population-averaged). Also, the time element in the model (time_visit) is just another controlling factor. The effect-sizes (betas) associated with each variable/predictor represent the slopes associated with the corresponding covariate, while holding time constant. If we need to examine interactions (e.g., Weight change over Time), we need to include an interaction term in the model (i.e., + Weight*time_visit).
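In an R model formula, the * operator expands to both main effects plus their interaction. The base-R sketch below shows this expansion for the Weight and time_visit terms; the response name y is a placeholder, and no data are needed to inspect the formula.

```r
# 'y' is a placeholder response; only the right-hand side matters here
f <- y ~ Weight * time_visit

# term labels after expansion: two main effects and one interaction
term_labels <- attr(terms(f), "term.labels")
term_labels
```

The coefficient of the Weight:time_visit term then describes how the Weight effect changes per unit of time_visit.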

summary (gee.model2)

# Individual Wald test and confidence intervals for each covariate
predictors2 <- coef(summary(gee.model2))
CI2 <- with(as.data.frame(predictors2), cbind(lwr=Estimate-1.96*Std.err, est=Estimate, upr=Estimate+1.96*Std.err))
rownames(CI2) <- rownames(predictors2)
CI2
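Because these models use a logit link, the estimates and interval limits are on the log-odds scale; exponentiating gives odds ratios. The sketch below applies this to a small made-up coefficient table (the predictor names and numbers are hypothetical, chosen only to mimic the Estimate and Std.err columns used above).

```r
# hypothetical coefficient table with the same column names as coef(summary(gee.model2))
predictors_demo <- data.frame(Estimate = c(0.50, -1.20),
                              Std.err  = c(0.25,  0.30),
                              row.names = c("x1", "x2"))

z <- qnorm(0.975)  # approximately 1.96
CI_demo <- with(predictors_demo,
                cbind(lwr = Estimate - z * Std.err,
                      est = Estimate,
                      upr = Estimate + z * Std.err))
rownames(CI_demo) <- rownames(predictors_demo)

# odds ratios with 95% confidence limits
OR_demo <- exp(CI_demo)
OR_demo
```

An interval for the odds ratio that excludes 1 corresponds to a log-odds interval that excludes 0.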

==Appendix==

SEM References

*http://socserv.mcmaster.ca/jfox/Misc/sem/SEM-paper.pdf

GEE References

*https://cran.r-project.org/web/packages/geepack/geepack.pdf

*http://www.jstatsoft.org/v15/i02/paper

===Footnotes===

*2 http://www.imachordata.com/ecological-sems-and-composite-variables-what-why-and-how/

*3 http://www.jstatsoft.org/v15/i02/

*4 https://books.google.com/books?id=mdEqBgAAQBAJ

==See also==
* [[SMHS_BigDataBigSci| Back to Model-based Analytics]]
* [[SMHS_BigDataBigSci_SEM| Structural Equation Modeling (SEM)]]
* [[SMHS_BigDataBigSci_GEE| Next Section: Generalized Estimating Equation (GEE) Modeling]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php?title=SMHS_BigDataBigSci_GCM}}

SMHS BigDataBigSci GCM

2016-05-23T19:12:55Z

Pineaumi: /* Comparative Fit Index */

==[[SMHS_BigDataBigSci| Model-based Analytics]] - Growth Curve Models==

Latent growth curve models may be used to analyze longitudinal or temporal data where the outcome measure is assessed on multiple occasions and we examine its change over time; e.g., the trajectory over time can be modeled as a linear or quadratic function. Random effects capture individual differences and are conveniently represented as (continuous) latent variables, also known as growth factors. To fit a linear growth model, we may specify a model with two latent variables, a random intercept and a random slope:

#load data 05_PPMI_top_UPDRS_Integrated_LongFormat.csv ( dim(myData) 661 71), wide
# setwd("/dir/")
myData <- read.csv("https://umich.instructure.com/files/330395/download?download_frd=1&verifier=v6jBvV4x94ka3EYcGKuXXg5BZNaOLBVp0xkJih0H",header=TRUE)
attach(myData)

# dichotomize the "ResearchGroup" variable
table(myData$ResearchGroup)
myData$ResearchGroup <- ifelse(myData$ResearchGroup == "Control", 1, 0)

# linear growth model with 4 timepoints
# intercept (i) and slope (s) with fixed coefficients
# i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4 (intercept/constant)
# s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4 (slope/linear term)
# ??? =~ 0*t1 + 1*t2 + 4*t3 + 9*t4 (quadratic term: squared time scores)
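Loading patterns of this kind can also be generated programmatically, which helps avoid typing errors in long model strings. This is a minimal sketch in base R: the indicator names t1 to t4 follow the comments above, the latent-variable name q for a quadratic factor is our placeholder, and the quadratic loadings are taken to be the squared time scores.

```r
vars  <- paste0("t", 1:4)   # indicator names for the 4 measurement occasions
times <- 0:3                # time scores

i_line <- paste("i =~", paste0("1*", vars, collapse = " + "))          # intercept: all loadings fixed to 1
s_line <- paste("s =~", paste0(times, "*", vars, collapse = " + "))    # slope: loadings equal the time scores
q_line <- paste("q =~", paste0(times^2, "*", vars, collapse = " + "))  # quadratic: squared time scores

model_syntax <- paste(i_line, s_line, q_line, sep = "\n")
cat(model_syntax, "\n")
```

The resulting string could be passed to a lavaan fitting function in place of a hand-typed model, provided the indicator names match the columns of the data.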

In this model, we have fixed all the coefficients of the linear growth functions:

model4 <-
'
i =~ 1*UPDRS_Part_I_Summary_Score_Baseline + 1*UPDRS_Part_I_Summary_Score_Month_03 +
1*UPDRS_Part_I_Summary_Score_Month_06 + 1*UPDRS_Part_I_Summary_Score_Month_09 +
1*UPDRS_Part_I_Summary_Score_Month_12 + 1*UPDRS_Part_I_Summary_Score_Month_18 +
1*UPDRS_Part_I_Summary_Score_Month_24 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Baseline +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_03 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_06 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_09 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_12 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_18 +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_24 +
1*UPDRS_Part_III_Summary_Score_Baseline + 1*UPDRS_Part_III_Summary_Score_Month_03 +
1*UPDRS_Part_III_Summary_Score_Month_06 + 1*UPDRS_Part_III_Summary_Score_Month_09 +
1*UPDRS_Part_III_Summary_Score_Month_12 + 1*UPDRS_Part_III_Summary_Score_Month_18 +
1*UPDRS_Part_III_Summary_Score_Month_24 +
1*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Baseline +
1*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_06 +
1*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_12 +
1*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_24 +
1*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Baseline +
1*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_06 +
1*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_12 +
1*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_24
s =~ 0*UPDRS_Part_I_Summary_Score_Baseline + 1*UPDRS_Part_I_Summary_Score_Month_03 +
2*UPDRS_Part_I_Summary_Score_Month_06 + 3*UPDRS_Part_I_Summary_Score_Month_09 +
4*UPDRS_Part_I_Summary_Score_Month_12 + 5*UPDRS_Part_I_Summary_Score_Month_18 +
6*UPDRS_Part_I_Summary_Score_Month_24 +
0*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Baseline +
1*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_03 +
2*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_06 +
3*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_09 +
4*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_12 +
5*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_18 +
6*UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_24 +
0*UPDRS_Part_III_Summary_Score_Baseline + 1*UPDRS_Part_III_Summary_Score_Month_03 +
2*UPDRS_Part_III_Summary_Score_Month_06 + 3*UPDRS_Part_III_Summary_Score_Month_09 +
4*UPDRS_Part_III_Summary_Score_Month_12 + 5*UPDRS_Part_III_Summary_Score_Month_18 +
6*UPDRS_Part_III_Summary_Score_Month_24 +
0*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Baseline +
2*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_06 +
4*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_12 +
6*X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_24 +
0*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Baseline +
2*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_06 +
4*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_12 +
6*X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_24
'

fit4 <- growth(model4, data=myData)
summary(fit4)
parameterEstimates(fit4) # extracts the values of the estimated parameters, the standard errors,
# the z-values, the standardized parameter values, and returns a data frame
fitted(fit4) # return the model-implied (fitted) covariance matrix (and mean vector) of a fitted model

# resid() function return (unstandardized) residuals of a fitted model including the difference between
# the observed and implied covariance matrix and mean vector
resid(fit4)

==Measures of model quality (Comparative Fit Index (CFI), Root Mean Square Error of Approximation (RMSEA))==

# report the fit measures as a signature vector: Comparative Fit Index (CFI), Root Mean Square Error of
# Approximation (RMSEA)
fitMeasures(fit4, c("cfi", "rmsea", "srmr"))

====Comparative Fit Index====

(CFI) is an incremental measure directly based on the non-centrality measure. If $d = \chi^2 - df$, where df are the degrees of freedom of the model, the Comparative Fit Index is:
$
CFI = \frac{d(\text{Null Model}) - d(\text{Proposed Model})}{d(\text{Null Model})}.
$

$0 \le CFI \le 1$ (by definition). It is interpreted as:

*$CFI < 0.9$ - model fitting is poor,

*$0.9 \le CFI \le 0.95$ is considered marginal,

*$CFI > 0.95$ is good.

CFI is a relative index of model fit: it compares the fit of your model to the fit of the (worst-fitting) null model.
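A minimal sketch of this computation in R, assuming d is the chi-square statistic minus its degrees of freedom, truncated at 0 (the usual convention). The chi-square and df of the proposed model are those reported for fit5 in this section; the null-model values are hypothetical, since the baseline chi-square is not shown here. The function name cfi_manual is ours.

```r
# Hypothetical helper: CFI from the chi-square statistics of the null and proposed models
cfi_manual <- function(chisq_null, df_null, chisq_model, df_model) {
  d_null  <- max(chisq_null - df_null, 0)     # non-centrality of the null model
  d_model <- max(chisq_model - df_model, 0)   # non-centrality of the proposed model
  if (d_null == 0) return(1)                  # guard: avoid dividing by zero
  (d_null - min(d_model, d_null)) / d_null    # bounded to [0, 1]
}

# proposed model: fit5 values from this section; null model: made-up values
cfi_demo <- cfi_manual(chisq_null = 500, df_null = 15, chisq_model = 3.703, df_model = 1)
round(cfi_demo, 3)
```

The closer the proposed model's non-centrality is to 0 relative to that of the null model, the closer the index is to 1.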

====Root Mean Square Error of Approximation====
(RMSEA) - “Ramsey”

An absolute measure of fit based on the non-centrality parameter:

$RMSEA = \sqrt{\frac{\chi^2 - df}{df \times (N - 1)}}$,

where N is the sample size and df is the degrees of freedom of the model. If $\chi^2 < df$, the RMSEA is set to 0. It penalizes model complexity through the chi-square to df ratio. The RMSEA is a popular measure of model fit.

*RMSEA < 0.01, excellent,

*RMSEA < 0.05, good

*RMSEA > 0.10 cutoff for poor fitting models

====Standardized Root Mean Square Residual====
(SRMR) is an absolute measure of fit defined as the standardized difference between the observed correlation and the predicted correlation. A value of zero indicates perfect fit. The SRMR has no penalty for model complexity. SRMR <0.08 is considered a good fit.
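The idea can be sketched in base R as the root mean square of the differences between observed and model-implied correlations, taken over the unique elements of the matrices. Both matrices below are made up for illustration, and the SRMR reported by lavaan uses its own standardization (and, for models with a mean structure, may include the means), so its values need not match this simplified version exactly. The function name srmr_manual is ours.

```r
# Hypothetical helper: simplified SRMR from observed and model-implied correlation matrices
srmr_manual <- function(r_obs, r_imp) {
  idx <- lower.tri(r_obs, diag = TRUE)     # unique elements, including the diagonal
  sqrt(mean((r_obs[idx] - r_imp[idx])^2))
}

r_obs <- matrix(c(1, 0.5, 0.5, 1), nrow = 2)   # made-up observed correlations
r_imp <- matrix(c(1, 0.4, 0.4, 1), nrow = 2)   # made-up model-implied correlations

srmr_demo <- srmr_manual(r_obs, r_imp)
srmr_demo
```

Because the measure averages residuals without reference to the number of free parameters, adding parameters can only keep it the same or lower it, which is why it carries no complexity penalty.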

# inspect the model results (report parameter table)
inspect(fit4)

#install.packages("semTools")
# library("semTools")

A Simpler Model (fit5)

model5 <- '
# intercept and slope with fixed coefficients
i =~ UPDRS_Part_I_Summary_Score_Baseline + UPDRS_Part_I_Summary_Score_Month_03 + UPDRS_Part_I_Summary_Score_Month_24
s =~ 0*UPDRS_Part_I_Summary_Score_Baseline + 1*UPDRS_Part_I_Summary_Score_Month_03 + 6*UPDRS_Part_I_Summary_Score_Month_24
# regressions
i ~ R_fusiform_gyrus_Volume + Weight + ResearchGroup + Age + chr12_rs34637584_GT
s ~ R_fusiform_gyrus_Volume + Weight + ResearchGroup + Age + chr12_rs34637584_GT
# time-varying covariates
UPDRS_Part_I_Summary_Score_Baseline ~ Weight
UPDRS_Part_I_Summary_Score_Month_03 ~ ResearchGroup
UPDRS_Part_I_Summary_Score_Month_24 ~ Age
'

fit5 <- growth(model5, data=myData)
summary(fit5); fitMeasures(fit5, c("cfi", "rmsea", "srmr"))
parameterEstimates(fit5) # extracts the values of the estimated parameters, the standard errors,
# the z-values, the standardized parameter values, and returns a data frame

lavaan (0.5-18) converged normally after 99 iterations
Number of observations 661
Estimator ML
Minimum Function Test Statistic 3.703
Degrees of freedom 1
P-value (Chi-square) 0.054
Parameter estimates:
Information Expected
Standard Errors Standard
Estimate Std.err Z-value P(>|z|)
Latent variables:
i =~
UPDRS_P_I_S_S 1.000
UPDRS_P_I_S_S 1.074
UPDRS_P_I_S_S 1.172
s =~
UPDRS_P_I_S_S 0.000
UPDRS_P_I_S_S 1.000
UPDRS_P_I_S_S 6.000

Regressions:
i ~
R_fsfrm_gyr_V 0.000
Weight 0.003
ResearchGroup -0.880
Age -0.009
c12_34637584_ -0.907
s ~
R_fsfrm_gyr_V -0.000
Weight -0.000
ResearchGroup -0.084
Age 0.002
c12_34637584_ -0.047
UPDRS_Part_I_Summary_Score_Baseline ~
Weight -0.000
UPDRS_Part_I_Summary_Score_Month_03 ~
ResearchGroup 0.693
UPDRS_Part_I_Summary_Score_Month_24 ~
Age -0.002

Covariances:
i ~~
s 0.074

Intercepts:
UPDRS_P_I_S_S 0.000
UPDRS_P_I_S_S 0.000
UPDRS_P_I_S_S 0.000
i 1.633
s -0.023

Variances:
UPDRS_P_I_S_S 1.017
UPDRS_P_I_S_S 1.093
UPDRS_P_I_S_S 2.993
i 1.019
s -0.025

cfi rmsea srmr
0.996 0.064 0.008
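The chi-square test reported in the output above can be reproduced from the test statistic and its degrees of freedom with base R. The values are copied from the lavaan output; the variable names are ours.

```r
chisq_fit5 <- 3.703   # Minimum Function Test Statistic from the output above
df_fit5    <- 1       # Degrees of freedom from the output above

# upper-tail probability of the chi-square distribution
p_fit5 <- pchisq(chisq_fit5, df = df_fit5, lower.tail = FALSE)
round(p_fit5, 3)
```

This agrees with the reported P-value of 0.054: at the 0.05 level the model is not rejected, although only narrowly.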

fitted(fit5) # return the model-implied (fitted) covariance matrix (and mean vector) of a fitted model
# write.table(fitted(fit5), file="C:\\Users\\Dinov\\Desktop\\test1.txt")

# resid() function return (unstandardized) residuals of a fitted model including the difference between
# the observed and implied covariance matrix and mean vector
resid(fit5)

# report the fit measures as a signature vector
fitMeasures(fit5, c("cfi", "rmsea", "srmr")) # comparative fit index (CFI)

# inspect the model results (report parameter table)
inspect(fit5)

Note: See the discussion of SEM modeling pros and cons (footnote 2).

==Generalized Estimating Equation (GEE) Modeling==

Generalized Estimating Equations (GEE) modeling (footnote 3) is used for analyzing data with the following characteristics:
(1) the observations within a group may be correlated, (2) observations in separate clusters are independent, (3) a monotone transformation of the expectation is linearly related to the explanatory variables, and (4) the variance is a function of the expectation. The expectation (#3) and the variance (#4) are conditional on group-level or individual-level covariates.

GEE is applied to handle correlated discrete and continuous outcome variables. For the outcome variables, it only requires specification of the first 2 moments and the correlation among them. The goal is to estimate fixed parameters without specifying the joint distribution of the outcomes. The working correlation is specified by one of these 4 alternatives, chosen via the corstr argument of the R call, e.g., geeglm(outcome ~ center + treat + sex + baseline + age, data = respiratory, family = "binomial", id = id, corstr = "exchangeable", scale.fix = TRUE):

<center>[[Image:SMHS_BigDataBigSci8.png|300px]]</center>

===Respiratory Illness GEE R example===

This example is based on a data set on respiratory illness (footnote 4) and the geepack package. The data come from a clinical study of treatment effects on patients with respiratory illness. N=111 patients from 2 clinical centers were randomized to receive either placebo or active treatment. Four examinations over time assessed the respiratory state of each patient as good (=1) or poor (=0). The explanatory variables characterizing a patient were: center (1, 2), treatment (A=active, P=placebo), sex (M=male, F=female), and age (in years) at baseline. The values of the covariates were constant across the repeated observations on each patient.

Table 1 shows the number of patients for each response pattern across the 4 visits, split by baseline status and treatment. Patients with baseline respiratory status = 0 appear to have either a low or a high number of positive responses. Patients with baseline respiratory status = 1 tend to respond positively. Table 2 describes the distribution of the number of positive responses per patient by sex and center.

# library("geepack")

Table 1: Distribution of patients for different response patterns classified by baseline-respiratory response and treatment. The patterns are ordered according to increasing numbers of positive responses.

<center>
{| class="wikitable" style="text-align:center; width:75%" border="1"
|-
! ||Visit|| colspan="15"| All Possible Response Patterns (2*2*2*2=16 permutation patterns)||
|-
|||1||0||1||0||0||0||1||1||1||0||0||1||1||1||0||1||
|-
|||2||0||0||1||0||0||1||0||0||1||0||1||1||0||1||1||
|-
|||3||0||0||0||1||0||0||1||0||1||1||1||0||1||1||1||
|-
|||4||0||0||0||0||1||0||0||1||0||1||0||1||1||1||1||
|-
!Baseline||Treatment||||||||||||||||||||||||||||||||Sum
|-
| rowspan="2"|0||A||7||2||2||2||1||0||1||0||1||0||1||2||0||4||7||30
|-
|P||18||1||0||2||1||2||0||0||1||0||0||1||2||0||3||31
|-
|rowspan="2"|1||A||0||0||0||0||0||0||1||1||0||0||4||0||1||0||17||24
|-
|P||1||4||1||0||0||0||0||1||1||3||1||1||2||1||10||26
|-
|Sum||||26||7||3||4||2||2||2||2||3||3||6||4||5||5||37||111
|}
</center>

Table 2: Distribution of patients for the number of positive responses across the 4 visits for Sex and Center.

<center>
{| class="wikitable" style="text-align:center; width:75%" border="1"
|-
! colspan="2" rowspan="2"| ||colspan="5"|Number of positive responses
|-
| 0||1||2||3||4
|-
|rowspan="2"|Sex || F||7||3||3||3||7
|-
|M||19||13||9||17||30
|-
|rowspan="2"|Center|| 1||18||9||6||11||12
|-
|2||8||7||6||9||25
|}
</center>

Figure 1 shows a plot of age against the proportion of positive responses for each patient. It indicates a quadratic relationship between the proportions and age. We therefore fit a logistic model with a quadratic age term to the data (this would be appropriate if there were no time effects and no spread in the response probabilities for patients with the same covariate values).

# install.packages("geepack")
library("geepack")

# The data come from a clinical trial in which 111 patients with respiratory illness from two different clinics were randomized to receive either
# placebo (P) or an active (A) treatment. Patients were examined at baseline and at four visits during treatment.
# At each examination, respiratory status was categorized as 1 = good or 0 = poor.
data("respiratory")
head(respiratory)
myData <- respiratory

<center>head(myData)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Center||ID||Treat||Sex||Age||Baseline||Visit||Outcome
|-
|1 ||1||1||P||M||46||0||1||0
|-
|2 ||1||1||P||M||46||0||2||0
|-
|3 ||1||1||P||M||46||0||3||0
|-
|4 ||1||1||P||M||46||0||4||0
|-
|5||1||2||P||M||28||0||1||0
|-
|6||1||2||P||M||28||0||2||0
|}
</center>

# Get proportions of positive responses (outcome: 1 = good, 0 = poor)
responses <- factor(myData$outcome, levels = c(1, 0), labels = c("OutcomePositive", "OutcomeNegative"))
respData <- data.frame(responses, myData$age) # the second column is automatically named "myData.age"
head(respData)
tab <- prop.table(table(respData), 2); tab # column proportions: share of each outcome within every age
sum(tab[, 1]) # check that the proportions within an age sum to 1.0
prop <- tab["OutcomePositive", ] # save the proportion of positive responses at each age
plot(as.numeric(dimnames(tab)$myData.age), prop, xlab = "Age", ylab = "Proportion of Positive Outcomes")
# dimnames(tab) # to see/inspect positive/negative outcomes

[[Image:SMHS_BigDataBigSci9.png|500px]]

x <- as.numeric(dimnames(tab)$myData.age)
poly <- loess(prop ~ x) # local polynomial regression (LOESS) fit
plot(x, prop)
lines(x, predict(poly), col='red', lwd=2) # draw the fitted curve against age

smoothingSpline <- smooth.spline(x, prop, spar=0.6)
plot(x, prop)
lines(smoothingSpline, col='red', lwd=1.5)
smoothPolySpline <- smooth.spline(x, predict(poly), spar=0.6)
lines(smoothPolySpline, col='blue', lwd=2)
legend("topright", inset=.05, title="Polynomial regression models", c("Raw Poly","Smooth Poly"), fill=c('red', 'blue'), horiz=TRUE)

[[Image:SMHS_BigDataBigSci10.png|500px]]

model.glm <- glm(outcome ~ baseline + center + sex + treat + age + I(age^2), data = respiratory, family = binomial)

summary(model.glm)

<center>Deviance Residuals:
{| class="wikitable" style="text-align:center; " border="1"
|-
|Min||1Q||Median||3Q||Max
|-
| -2.5951||-0.9108||0.4034||0.8336||2.0951
|}
</center>

<center>Coefficients:
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Estimate||Std. Error||z value||$Pr( \gt |z|)$
|-
|(Intercept)||3.3579727||1.0285292||3.265||0.0011 **
|-
|baseline||1.8850421||0.2482959||7.592||3.15e-14 ***
|-
|center||0.5099244||0.2453982||2.078||0.0377 *
|-
|sexM||-0.4510595||0.3166570||-1.424||0.1543
|-
|Treatp||-1.3231587||0.2431603||-5.442||5.28e-08 ***
|-
|age||-0.2072815||0.0472538||-4.387||1.15e-05 ***
|-
|I(age^2)||0.0025650||0.0006324||4.056||4.99e-05 ***
|}
</center>

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 609.41 on 443 degrees of freedom

Residual deviance: 468.62 on 437 degrees of freedom

AIC: 482.62
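The summary quantities above are linked by simple arithmetic, which the following sketch verifies with the printed values (variable names are ours). The drop in deviance gives a likelihood-ratio test of the 6 predictors jointly, and for 0/1 outcomes the AIC equals the residual deviance plus twice the number of estimated coefficients.

```r
null_dev  <- 609.41; null_df  <- 443   # null deviance and its df
resid_dev <- 468.62; resid_df <- 437   # residual deviance and its df

# likelihood-ratio (deviance) test of all predictors jointly
lr_stat <- null_dev - resid_dev
lr_df   <- null_df - resid_df
lr_p    <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# for 0/1 outcomes the residual deviance equals -2*logLik, so AIC = deviance + 2*k
k <- 7                                  # intercept + 6 predictors
aic_check <- resid_dev + 2 * k

c(lr_stat = lr_stat, lr_df = lr_df, aic = aic_check)
lr_p
```

The recomputed AIC matches the printed 482.62, and the very small p-value indicates that the predictors jointly improve on the intercept-only model. Note that this test, like the glm fit itself, ignores the within-patient correlation that motivates the GEE analysis.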

The correlation matrix of the outcome measures across visits is shown in Table 3.

attach(myData)
mat1 <- matrix(c(outcome[visit==1], outcome [visit==2], outcome [visit==3],
outcome[visit==4]), ncol = 4)
cor(mat1)

Table 3: Correlation matrix for the outcome measurements at different visits.

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
|||[,1]||[,2]||[,3]||[,4]
|-
|[,1]||1.0000000||0.5087944||0.4431438||0.5139016
|-
|[,2]||0.5087944||1.0000000||0.5821877||0.5301611
|-
|[,3]||0.4431438||0.5821877||1.0000000||0.5871276
|-
|[,4]||0.5139016||0.5301611||0.5871276||1.0000000
|}
</center>

# We can also check for multicollinearity problems using the correlation matrix of the predictors (the design matrix X)
cor(model.matrix(model.glm)[,-1])

# GEE modeling: R function arguments/options

*corstr= for defining the correlation structure within groups in a GEE model

*id= is used to identify the grouping variable in a GEE model

*scale.fix= when TRUE causes the scale parameter to be fixed (by default at 1) rather than estimated

*waves= names a positive integer-valued variable that is used to identify the order and spacing of observations within groups in a GEE model. This argument is crucial when there are missing values and gaps in the data
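To make the corstr= option more concrete, the following minimal sketch builds three common working correlation matrices for 4 repeated visits using base R only. The value alpha = 0.5 is an illustrative assumption (roughly in the range of the empirical correlations in Table 3), not an estimate produced by a GEE fit.

```r
# Sketch: working correlation structures for 4 repeated visits.
# alpha = 0.5 is an illustrative value, not an estimate from the data.
n_visits <- 4
alpha <- 0.5

# Exchangeable: every pair of visits shares the same correlation alpha
R_exch <- matrix(alpha, n_visits, n_visits)
diag(R_exch) <- 1

# AR(1): correlation decays as alpha^|i - j| with the lag between visits
lag <- abs(outer(1:n_visits, 1:n_visits, "-"))
R_ar1 <- alpha^lag

# Independence: no within-subject correlation
R_ind <- diag(n_visits)

print(R_exch)
print(R_ar1)
```

The empirical correlations in Table 3 are fairly similar across all pairs of visits, which is one informal argument for the exchangeable structure used in the models below.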

gee.model1 <- geeglm(outcome ~ center + treat + sex + baseline + age, data = respiratory, family = "binomial", id = id, corstr = "exchangeable", scale.fix = TRUE)

# The column labeled Wald in the summary table is the square of the z-statistic. The reported p-values are the
# upper-tail probabilities from a chi-square distribution with 1 degree of freedom; they test the null hypothesis that the true parameter value is 0.
summary(gee.model1)
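The relation between the z-statistic, the Wald statistic, and the reported p-value can be checked by hand. The sketch below borrows the estimate and standard error of the ''baseline'' coefficient from the logistic regression table shown earlier purely as illustrative input; the GEE estimates themselves will differ.

```r
# Illustrative values: the 'baseline' row of the logistic regression table above
est <- 1.8850421
se  <- 0.2482959

z    <- est / se   # z-statistic
wald <- z^2        # Wald statistic = z squared

# Upper tail of a chi-square distribution with 1 degree of freedom
p_chisq <- pchisq(wald, df = 1, lower.tail = FALSE)
# Equivalent two-sided p-value from the standard normal distribution
p_norm  <- 2 * pnorm(-abs(z))

print(c(z = z, wald = wald, p_chisq = p_chisq, p_norm = p_norm))
```

Both p-values agree, because the square of a standard normal variable follows a chi-square distribution with 1 degree of freedom.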

# To test the effect of ''treatment'' using anova()
gee.model1 <- geeglm(outcome ~ center + treat + sex + baseline + age, data = respiratory, family=binomial(link="logit"), id = id, corstr = "exchangeable", std.err="san.se")
gee.model2 <- geeglm(outcome ~ center + sex + baseline + age, data = respiratory, family=binomial(link="logit"), id=id, corstr = "exchangeable", std.err="san.se")
anova(gee.model1, gee.model2)

# To test whether a categorical predictor with more than two levels should be retained in a GEE model we need
# to test the entire set of dummy variables simultaneously as a single construct.
# The geepack package provides a method for the anova function for a multivariate Wald test
# When the anova function is applied to a single geeglm object it returns sequential Wald tests for
# individual predictors with the tests carried out in the order the predictors are listed in the model formula.
anova(gee.model1)
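As a rough illustration of what a multivariate Wald test computes, the sketch below evaluates the statistic W = b' V^(-1) b for two dummy-variable coefficients and compares it to a chi-square distribution with 2 degrees of freedom. The coefficient vector and covariance matrix are hypothetical numbers chosen for the example; in practice they would come from the fitted model (its coefficient estimates and robust covariance matrix).

```r
# Hypothetical estimates for the two dummy variables of a 3-level factor
beta <- c(levelB = 0.40, levelC = -0.25)

# Hypothetical (robust) covariance matrix of those two estimates
V <- matrix(c(0.04, 0.01,
              0.01, 0.09), nrow = 2, byrow = TRUE)

# Wald statistic W = beta' V^{-1} beta, tested jointly against chi-square(df = 2)
W <- as.numeric(t(beta) %*% solve(V) %*% beta)
p_value <- pchisq(W, df = length(beta), lower.tail = FALSE)

print(c(W = W, df = length(beta), p = p_value))
```

Testing both coefficients jointly accounts for their covariance, which two separate single-coefficient tests would ignore.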

===PD GEE example===

This example uses the PPMI Parkinson's disease (PD) data to demonstrate GEE analysis.

# 05_PPMI_top_UPDRS_Integrated_LongFormat1.csv
longData <- read.csv("https://umich.instructure.com/files/330397/download?download_frd=1",header=TRUE)

# library("geepack")

# Data Elements: FID_IID L_insular_cortex_ComputeArea L_insular_cortex_Volume R_insular_cortex_ComputeArea R_insular_cortex_Volume L_cingulate_gyrus_ComputeArea L_cingulate_gyrus_Volume R_cingulate_gyrus_ComputeArea R_cingulate_gyrus_Volume L_caudate_ComputeArea L_caudate_Volume R_caudate_ComputeArea R_caudate_Volume L_putamen_ComputeArea L_putamen_Volume R_putamen_ComputeArea R_putamen_Volume Sex Weight ResearchGroup Age chr12_rs34637584_GT chr17_rs11868035_GT chr17_rs11012_GT chr17_rs393152_GT chr17_rs12185268_GT chr17_rs199533_GT UPDRS_part_I UPDRS_part_II UPDRS_part_III time_visit

dim(longData)

data1 = na.omit(longData)
attach(data1)
ControlGroup <- ifelse(ResearchGroup == "Control", 1, 0)

# these calculations take a long time!!!
# if you get “Error in geese.fit(xx, yy, id, offset, soffset, w, waves = waves, zsca, :
# nrow(zsca) and length(y) not match” – this indicates some of the variables are of different lengths
# if you get “glm.fit: algorithm did not converge” – see this discussion: http://goo.gl/lrjBjB

gee.model0 <- geeglm(ControlGroup ~ L_insular_cortex_ComputeArea+L_insular_cortex_Volume+ Sex + Weight + Age + chr17_rs11012_GT + chr17_rs199533_GT + UPDRS_part_I + UPDRS_part_II + time_visit, data = data1, family=binomial(link="logit"), id = FID_IID, corstr = "unstructured", std.err="san.se")

gee.model1 <- geeglm(ControlGroup ~ L_insular_cortex_ComputeArea+L_insular_cortex_Volume+ R_putamen_ComputeArea + R_putamen_Volume + Sex + Weight + Age + chr17_rs11012_GT + chr17_rs199533_GT + UPDRS_part_I + UPDRS_part_II + time_visit, data = data1, family=binomial(link="logit"), id = FID_IID, corstr = "unstructured", std.err="san.se")

# compare 2 gee models
# anova(gee.model0,gee.model1)

# you can try the “family = poisson(link = "log")” model for the ResearchGroup response, as well

gee.model2 <- geeglm(ControlGroup
~ L_insular_cortex_ComputeArea+L_insular_cortex_Volume+R_insular_cortex_ComputeArea+ R_insular_cortex_Volume +L_cingulate_gyrus_ComputeArea + L_cingulate_gyrus_Volume + R_cingulate_gyrus_ComputeArea + R_cingulate_gyrus_Volume + L_caudate_ComputeArea + L_caudate_Volume + R_caudate_ComputeArea + R_caudate_Volume + L_putamen_ComputeArea + L_putamen_Volume + R_putamen_ComputeArea + R_putamen_Volume + Sex + Weight + Age + chr12_rs34637584_GT + chr17_rs11868035_GT + chr17_rs11012_GT + chr17_rs393152_GT + chr17_rs12185268_GT + chr17_rs199533_GT + UPDRS_part_I + UPDRS_part_II + time_visit, data = data1, family=binomial(link="logit"), id = FID_IID, corstr = "unstructured", std.err="san.se")

Remember that we do not interpret GEE coefficients as relating to individuals: GEE models are marginal models, and the conclusions drawn are interpreted at the population level. Also, the time element in the model (time_visit) is just another controlling factor. The effect-sizes (betas) associated with each predictor represent the slopes for the corresponding covariate, while holding time (and the other covariates) constant. To examine interactions (e.g., Weight change over time), we need to include an interaction term in the model (e.g., + Weight*time_visit).
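The effect of writing Weight*time_visit in an R formula can be inspected without fitting any model. The sketch below uses a small toy data frame with made-up values (only the column names mirror the PPMI variables) to show which terms and design-matrix columns the shorthand generates.

```r
# Toy data with hypothetical values; column names mirror the PPMI variables
toy <- data.frame(
  Weight     = c(70, 82, 65, 90),
  time_visit = c(0, 12, 24, 48)
)

# 'Weight*time_visit' is shorthand for both main effects plus their product
f <- ~ Weight * time_visit
term_labels <- attr(terms(f), "term.labels")
print(term_labels)

# The design matrix gains a column holding the elementwise product
X <- model.matrix(f, data = toy)
print(colnames(X))
print(X[, "Weight:time_visit"])
```

The coefficient of the Weight:time_visit column then describes how the Weight slope changes per unit of time.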

summary(gee.model2)

# Individual Wald test and confidence intervals for each covariate
predictors2 <- coef(summary(gee.model2))
CI2 <- with(as.data.frame(predictors2), cbind(lwr=Estimate-1.96*Std.err, est=Estimate, upr=Estimate+1.96*Std.err))
rownames(CI2) <- rownames(predictors2)
CI2
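Because the models above use a logit link, the Wald intervals are on the log-odds scale. A common follow-up step, sketched below, is to exponentiate them to obtain odds ratios. The coefficient table here is hypothetical (two made-up predictors x1 and x2), laid out with the same Estimate and Std.err columns used in the code above.

```r
# Hypothetical coefficient table in the same layout as the one used above
predictors <- data.frame(
  Estimate  = c(0.50, -1.20),
  Std.err   = c(0.20, 0.30),
  row.names = c("x1", "x2")
)

# 95% Wald interval on the log-odds scale
zcrit <- qnorm(0.975)
ci_logit <- with(predictors, cbind(lwr = Estimate - zcrit * Std.err,
                                   est = Estimate,
                                   upr = Estimate + zcrit * Std.err))
rownames(ci_logit) <- rownames(predictors)

# Exponentiate to report odds ratios with their confidence limits
ci_or <- exp(ci_logit)
print(round(ci_or, 3))
```

An interval for the odds ratio that excludes 1 corresponds to a log-odds interval that excludes 0.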

==Appendix==

SEM References

*http://socserv.mcmaster.ca/jfox/Misc/sem/SEM-paper.pdf

GEE References

*https://cran.r-project.org/web/packages/geepack/geepack.pdf

*http://www.jstatsoft.org/v15/i02/paper

===Footnotes===

* 3 http://www.jstatsoft.org/v15/i02/

* 4 https://books.google.com/books?id=mdEqBgAAQBAJ

==See also==
* [[SMHS_BigDataBigSci| Back to Model-based Analytics]]
* [[SMHS_BigDataBigSci_SEM| Structural Equation Modeling (SEM)]]
* [[SMHS_BigDataBigSci_GEE| Next Section: Generalized Estimating Equation (GEE) Modeling]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php?title=SMHS_BigDataBigSci_GCM}}

SMHS BigDataBigSci SEM Ex2

2016-05-23T19:10:55Z

Pineaumi: /* Output */

==[[SMHS_BigDataBigSci_SEM| Structural Equation Modeling (SEM)]] - Hands-on Example 2 (Parkinson’s Disease data) ==

# Data: PPMI Integrated imaging, demographics, genetics, clinical and cognitive (UPDRS) data.
# Dinov et al., 2016

<center>
{| class="wikitable" style="text-align:center; width:75%" border="1"
|-
!Index||FID_IID||L_cingulate_gyrus_ComputeArea||L_cingulate_gyrus_Volume||R_cingulate_gyrus_ComputeArea||R_cingulate_gyrus_Volume||L_caudate_ComputeArea||L_caudate_Volume||R_caudate_ComputeArea||R_caudate_Volume||L_putamen_ComputeArea||L_putamen_Volume||R_putamen_ComputeArea||R_putamen_Volume||L_hippocampus_ComputeArea||L_hippocampus_Volume||R_hippocampus_ComputeArea||R_hippocampus_Volume||cerebellum_ComputeArea||cerebellum_Volume||L_fusiform_gyrus_ComputeArea||L_fusiform_gyrus_Volume||R_fusiform_gyrus_ComputeArea||R_fusiform_gyrus_Volume||Sex||Weight||ResearchGroup||Age||chr12_rs34637584_GT||chr17_rs11868035_GT||chr17_rs11012_GT||chr17_rs393152_GT||chr17_rs12185268_GT||UPDRS_part_I||UPDRS_part_II||UPDRS_part_III||UPDRS_part_IV||time_visit
|-
|2||3001||4381.93||11205.13||4610.447||12246.55||621.5344||821.8991||1302.146||2526.248||1029.175||1543.017||1680.197||3792.201||1769.672||4737.038||1578.946||3817.621||20909.58||185742.6||4534.707||15830.32||3945.037||14471.84||1||74.2||PD||65.1808||0||1||1||1||1||0||2||12||NA||0
|-
|2||3001||4381.93||11205.13||4610.447||12246.55||621.5344||821.8991||1302.146||2526.248||1029.175||1543.017||1680.197||3792.201||1769.672||4737.038||1578.946||3817.621||20909.58||185742.6||4534.707||15830.32||3945.037||14471.84||1||74.2||PD||65.1808||0||1||1||1||1||0||2||18||NA||42
|-
|2||3001||4381.93||11205.13||4610.447||12246.55||621.5344||821.8991||1302.146||2526.248||1029.175||1543.017||1680.197||3792.201||1769.672||4737.038||1578.946||3817.621||20909.58||185742.6||4534.707||15830.32||3945.037||14471.84||1||74.2||PD||65.1808||0||1||1||1||1||0||3||23||NA||24
|-
|2||3001||4381.93||11205.13||4610.447||12246.55||621.5344||821.8991||1302.146||2526.248||1029.175||1543.017||1680.197||3792.201||1769.672||4737.038||1578.946||3817.621||20909.58||185742.6||4534.707||15830.32||3945.037||14471.84||1||74.2||PD||65.1808||0||1||1||1||1||1||3||19||NA||9
|-
|2||3001||4381.93||11205.13||4610.447||12246.55||621.5344||821.8991||1302.146||2526.248||1029.175||1543.017||1680.197||3792.201||1769.672||4737.038||1578.946||3817.621||20909.58||185742.6||4534.707||15830.32||3945.037||14471.84||1||74.2||PD||65.1808||0||1||1||1||1||4||3||20||NA||0
|-
|2||3001||4381.93||11205.13||4610.447||12246.55||621.5344||821.8991||1302.146||2526.248||1029.175||1543.017||1680.197||3792.201||1769.672||4737.038||1578.946||3817.621||20909.58||185742.6||4534.707||15830.32||3945.037||14471.84||1||74.2||PD||65.1808||0||1||1||1||1||1||4||29||NA||42
|-
|2||3001||4381.93||11205.13||4610.447||12246.55||621.5344||821.8991||1302.146||2526.248||1029.175||1543.017||1680.197||3792.201||1769.672||4737.038||1578.946||3817.621||20909.58||185742.6||4534.707||15830.32||3945.037||14471.84||1||74.2||PD||65.1808||0||1||1||1||1||0||2||39||NA||24
|-
|2||3001||4381.93||11205.13||4610.447||12246.55||621.5344||821.8991||1302.146||2526.248||1029.175||1543.017||1680.197||3792.201||1769.672||4737.038||1578.946||3817.621||20909.58||185742.6||4534.707||15830.32||3945.037||14471.84||1||74.2||PD||65.1808||0||1||1||1||1||0||5||25||NA||9
|-
|2||3001||4381.93||11205.13||4610.447||12246.55||621.5344||821.8991||1302.146||2526.248||1029.175||1543.017||1680.197||3792.201||1769.672||4737.038||1578.946||3817.621||20909.58||185742.6||4534.707||15830.32||3945.037||14471.84||1||74.2||PD||65.1808||0||1||1||1||1||1||6||34||NA||0
|-
|2||3001||4381.93||11205.13||4610.447||12246.55||621.5344||821.8991||1302.146||2526.248||1029.175||1543.017||1680.197||3792.201||1769.672||4737.038||1578.946||3817.621||20909.58||185742.6||4534.707||15830.32||3945.037||14471.84||1||74.2||PD||65.1808||0||1||1||1||1||1||11||42||0||42
|-
|2||3001||4381.93||11205.13||4610.447||12246.55||621.5344||821.8991||1302.146||2526.248||1029.175||1543.017||1680.197||3792.201||1769.672||4737.038||1578.946||3817.621||20909.58||185742.6||4534.707||15830.32||3945.037||14471.84||1||74.2||PD||65.1808||0||1||1||1||1||1||5||39||0||24
|-
|2||3001||4381.93||11205.13||4610.447||12246.55||621.5344||821.8991||1302.146||2526.248||1029.175||1543.017||1680.197||3792.201||1769.672||4737.038||1578.946||3817.621||20909.58||185742.6||4534.707||15830.32||3945.037||14471.84||1||74.2||PD||65.1808||0||1||1||1||1||NA||NA||NA||NA||9
|-
|3||3002||3221.54||7439.645||3194.348||7264.683||876.9414||1364.86||1056.22||1965.206||1275.905||2696.695||1375.725||2966.682||1529.759||3736.04||1799.439||4665.168||17627.01||155632.3||4013.385||12677.99||3551.876||11263.23||2||70.6||PD||67.6247||0||1||0||0||0||3||15||17||NA||3
|-
|3||3002||3221.54||7439.645||3194.348||7264.683||876.9414||1364.86||1056.22||1965.206||1275.905||2696.695||1375.725||2966.682||1529.759||3736.04||1799.439||4665.168||17627.01||155632.3||4013.385||12677.99||3551.876||11263.23||2||70.6||PD||67.6247||0||1||0||0||0||2||10||22||NA||48
|-
|3||3002||3221.54||7439.645||3194.348||7264.683||876.9414||1364.86||1056.22||1965.206||1275.905||2696.695||1375.725||2966.682||1529.759||3736.04||1799.439||4665.168||17627.01||155632.3||4013.385||12677.99||3551.876||11263.23||2||70.6||PD||67.6247||0||1||0||0||0||NA||NA||NA||NA||30
|-
|3||3002||3221.54||7439.645||3194.348||7264.683||876.9414||1364.86||1056.22||1965.206||1275.905||2696.695||1375.725||2966.682||1529.759||3736.04||1799.439||4665.168||17627.01||155632.3||4013.385||12677.99||3551.876||11263.23||2||70.6||PD||67.6247||0||1||0||0||0||1||16||20||NA||12
|-
|3||3002||3221.54||7439.645||3194.348||7264.683||876.9414||1364.86||1056.22||1965.206||1275.905||2696.695||1375.725||2966.682||1529.759||3736.04||1799.439||4665.168||17627.01||155632.3||4013.385||12677.99||3551.876||11263.23||2||70.6||PD||67.6247||0||1||0||0||0||3||15||27||0||3
|-
|3||3002||3221.54||7439.645||3194.348||7264.683||876.9414||1364.86||1056.22||1965.206||1275.905||2696.695||1375.725||2966.682||1529.759||3736.04||1799.439||4665.168||17627.01||155632.3||4013.385||12677.99||3551.876||11263.23||2||70.6||PD||67.6247||0||1||0||0||0||4||16||22||0||48
|-
|3||3002||3221.54||7439.645||3194.348||7264.683||876.9414||1364.86||1056.22||1965.206||1275.905||2696.695||1375.725||2966.682||1529.759||3736.04||1799.439||4665.168||17627.01||155632.3||4013.385||12677.99||3551.876||11263.23||2||70.6||PD||67.6247||0||1||0||0||0||8||14||22||0||30
|-
|3||3002||3221.54||7439.645||3194.348||7264.683||876.9414||1364.86||1056.22||1965.206||1275.905||2696.695||1375.725||2966.682||1529.759||3736.04||1799.439||4665.168||17627.01||155632.3||4013.385||12677.99||3551.876||11263.23||2||70.6||PD||67.6247||0||1||0||0||0||4||13||24||1||12
|-
|3||3002||3221.54||7439.645||3194.348||7264.683||876.9414||1364.86||1056.22||1965.206||1275.905||2696.695||1375.725||2966.682||1529.759||3736.04||1799.439||4665.168||17627.01||155632.3||4013.385||12677.99||3551.876||11263.23||2||70.6||PD||67.6247||0||1||0||0||0||4||16||31||4||3
|-
|3||3002||3221.54||7439.645||3194.348||7264.683||876.9414||1364.86||1056.22||1965.206||1275.905||2696.695||1375.725||2966.682||1529.759||3736.04||1799.439||4665.168||17627.01||155632.3||4013.385||12677.99||3551.876||11263.23||2||70.6||PD||67.6247||0||1||0||0||0||6||14||19||4||48
|-
|3||3002||3221.54||7439.645||3194.348||7264.683||876.9414||1364.86||1056.22||1965.206||1275.905||2696.695||1375.725||2966.682||1529.759||3736.04||1799.439||4665.168||17627.01||155632.3||4013.385||12677.99||3551.876||11263.23||2||70.6||PD||67.6247||0||1||0||0||0||5||18||29||3||30
|}
</center>

# install.packages("lavaan")
library(lavaan)
#load data 05_PPMI_top_UPDRS_Integrated_LongFormat1.csv ( dim(myData) 1764 31 )
# setwd("/dir/")
myData <- read.csv("https://umich.instructure.com/files/330397/download?download_frd=1&verifier=3bYRT9FXgBGMCQv8MNxsclWnMgodiJRYo3ODFtDq",header=TRUE)

# dichotomize the "ResearchGroup" variable
myData$\$$ResearchGroup <- ifelse(myData$\$$ResearchGroup == "Control", 1, 0)

# Data elements: Index FID_IID L_cingulate_gyrus_ComputeArea L_cingulate_gyrus_Volume
R_cingulate_gyrus_ComputeArea R_cingulate_gyrus_Volume L_caudate_ComputeArea
L_caudate_Volume R_caudate_ComputeArea R_caudate_Volume
L_putamen_ComputeArea L_putamen_Volume R_putamen_ComputeArea
R_putamen_Volume L_hippocampus_ComputeArea L_hippocampus_Volume R_hippocampus_ComputeArea
R_hippocampus_Volume cerebellum_ComputeArea
cerebellum_Volume L_fusiform_gyrus_ComputeArea L_fusiform_gyrus_Volume R_fusiform_gyrus_ComputeArea
R_fusiform_gyrus_Volume Sex Weight ResearchGroup Age chr12_rs34637584_GT chr17_rs11868035_GT chr17_rs11012_GT chr17_rs393152_GT
chr17_rs12185268_GT UPDRS_Part_I_Summary_Score_Baseline
UPDRS_Part_I_Summary_Score_Month_03 UPDRS_Part_I_Summary_Score_Month_06 UPDRS_Part_I_Summary_Score_Month_09 UPDRS_Part_I_Summary_Score_Month_12 UPDRS_Part_I_Summary_Score_Month_18 UPDRS_Part_I_Summary_Score_Month_24 UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Baseline UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_03 UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_06 UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_09 UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_12 UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_18 UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_24 UPDRS_Part_III_Summary_Score_Baseline UPDRS_Part_III_Summary_Score_Month_03 UPDRS_Part_III_Summary_Score_Month_06 UPDRS_Part_III_Summary_Score_Month_09 UPDRS_Part_III_Summary_Score_Month_12 UPDRS_Part_III_Summary_Score_Month_18 UPDRS_Part_III_Summary_Score_Month_24 X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Baseline X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_06 X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_12 X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_24 X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Baseline X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_06 X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_12 X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_24

====Validation of the measurement model====

myData<-within(myData, {
L_cingulate_gyrus_ComputeArea <- lm(L_cingulate_gyrus_ComputeArea ~ L_cingulate_gyrus_Volume+R_cingulate_gyrus_ComputeArea+R_cingulate_gyrus_Volume+L_caudate_ComputeArea+L_caudate_Volume+R_caudate_ComputeArea+R_caudate_Volume+L_putamen_ComputeArea+L_putamen_Volume+R_putamen_ComputeArea+R_putamen_Volume+L_hippocampus_ComputeArea+L_hippocampus_Volume+R_hippocampus_ComputeArea+R_hippocampus_Volume+cerebellum_ComputeArea+cerebellum_Volume+L_fusiform_gyrus_ComputeArea+L_fusiform_gyrus_Volume+R_fusiform_gyrus_ComputeArea+R_fusiform_gyrus_Volume, data=myData)$\$$residuals
Weight <- lm(Weight ~ Sex+ResearchGroup+Age+chr12_rs34637584_GT+chr17_rs11868035_GT+chr17_rs11012_GT+chr17_rs393152_GT+chr17_rs12185268_GT, data=myData)$\$$residuals
UPDRS_Part_I_Summary_Score_Baseline <- lm(UPDRS_Part_I_Summary_Score_Baseline ~ UPDRS_Part_I_Summary_Score_Month_03+UPDRS_Part_I_Summary_Score_Month_06+UPDRS_Part_I_Summary_Score_Month_09+UPDRS_Part_I_Summary_Score_Month_12+UPDRS_Part_I_Summary_Score_Month_18+UPDRS_Part_I_Summary_Score_Month_24+UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Baseline+UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_03+UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_06+UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_09+UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_12+UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_18+UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_24+UPDRS_Part_III_Summary_Score_Baseline+UPDRS_Part_III_Summary_Score_Month_03+UPDRS_Part_III_Summary_Score_Month_06+UPDRS_Part_III_Summary_Score_Month_09+UPDRS_Part_III_Summary_Score_Month_12+UPDRS_Part_III_Summary_Score_Month_18+UPDRS_Part_III_Summary_Score_Month_24+X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Baseline+X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_06+X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_12+X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_24+X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Baseline+X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_06+X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_12+X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_24, data=myData)$\$$residuals
})
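The residualization idea used above (replacing a variable by the part of it that is not linearly explained by other variables) can be illustrated on simulated data. Everything in the sketch below, including the variable names and the simulated effect sizes, is an assumption made for demonstration and is unrelated to the PPMI data.

```r
set.seed(1)
# Simulated toy data; variable names and effect sizes are illustrative only
n <- 200
sex <- rbinom(n, 1, 0.5)
age <- rnorm(n, mean = 60, sd = 8)
weight <- 60 + 10 * sex + 0.2 * age + rnorm(n, sd = 5)
toy <- data.frame(weight, sex, age)

# Replace weight by the part not linearly explained by sex and age
toy[["weight_resid"]] <- residuals(lm(weight ~ sex + age, data = toy))

# OLS residuals have zero mean and are uncorrelated with the predictors
checks <- c(mean    = mean(toy[["weight_resid"]]),
            cor_sex = cor(toy[["weight_resid"]], toy[["sex"]]),
            cor_age = cor(toy[["weight_resid"]], toy[["age"]]))
print(round(checks, 10))
```

With real data that contain missing values, the residual vector can be shorter than the data frame; fitting with na.action = na.exclude and extracting with residuals() keeps the lengths aligned by padding with NA.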

====Structural Model====

# Next, proceed with the structural model including the residuals from data to account for effects of site.

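Lavaan model specification:

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
!Formula type||Operator||Mnemonic
|-
|latent variable definition||=~||is measured by
|-
|regression||~||is regressed on
|-
|(residual) (co)variance||~~||is correlated with
|-
|intercept||~ 1||intercept
|}
</center>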

For example,
myModel <-
'
# regressions
y1 + y2 ~ f1 + f2 + x1 + x2
f1 ~ f2 + f3
f2 ~ f3 + x1 + x2

# latent variable definitions
f1 =~ y1 + y2 + y3
f2 =~ y4 + y5 + y6
f3 =~ y7 + y8 + y9 + y10

# variances and covariances
y1 ~~ y1
y1 ~~ y2
f1 ~~ f2

# intercepts
y1 ~ 1
f1 ~ 1
'
model1 <-
'
# latent variable definitions - defining how the latent variables are “manifested by” a set of observed
# (or manifest) variables, aka “indicators”
# (1) Measurement Model
Imaging =~ L_cingulate_gyrus_ComputeArea+L_cingulate_gyrus_Volume
DemoGeno =~ Weight+Sex+Age
UPDRS =~ UPDRS_Part_I_Summary_Score_Baseline+UPDRS_Part_I_Summary_Score_Month_03

# (2) Regressions
ResearchGroup ~ Imaging + DemoGeno + UPDRS
'
model2 <-
'
# latent variable definitions - defining how the latent variables are “manifested by” a set of observed
# (or manifest) variables, aka “indicators”
# (1) Measurement Model
Imaging =~ L_cingulate_gyrus_ComputeArea+L_cingulate_gyrus_Volume+R_cingulate_gyrus_ComputeArea+R_cingulate_gyrus_Volume+L_caudate_ComputeArea+L_caudate_Volume+R_caudate_ComputeArea+R_caudate_Volume+L_putamen_ComputeArea+L_putamen_Volume+R_putamen_ComputeArea+R_putamen_Volume+L_hippocampus_ComputeArea+L_hippocampus_Volume+R_hippocampus_ComputeArea+R_hippocampus_Volume+cerebellum_ComputeArea+cerebellum_Volume+L_fusiform_gyrus_ComputeArea+L_fusiform_gyrus_Volume+R_fusiform_gyrus_ComputeArea+R_fusiform_gyrus_Volume
DemoGeno =~ Weight+Sex+Age+chr12_rs34637584_GT+chr17_rs11868035_GT+chr17_rs11012_GT+chr17_rs393152_GT+chr17_rs12185268_GT
UPDRS =~ UPDRS_Part_I_Summary_Score_Baseline+UPDRS_Part_I_Summary_Score_Month_03+UPDRS_Part_I_Summary_Score_Month_06+UPDRS_Part_I_Summary_Score_Month_09+UPDRS_Part_I_Summary_Score_Month_12+UPDRS_Part_I_Summary_Score_Month_18+UPDRS_Part_I_Summary_Score_Month_24+UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Baseline+UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_03+UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_06+UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_09+UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_12+UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_18+UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_24+UPDRS_Part_III_Summary_Score_Baseline+UPDRS_Part_III_Summary_Score_Month_03+UPDRS_Part_III_Summary_Score_Month_06+UPDRS_Part_III_Summary_Score_Month_09+UPDRS_Part_III_Summary_Score_Month_12+UPDRS_Part_III_Summary_Score_Month_18+UPDRS_Part_III_Summary_Score_Month_24+X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Baseline+X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_06+X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_12+X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_24+X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Baseline+X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_06+X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_12+X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_24

# (2) Regressions
# ResearchGroup ~ Imaging + DemoGeno + UPDRS
# transform cat variable to numeric:
# myData$\$$ResearchGroup <- ifelse(myData$\$$ResearchGroup == "Control", 0,
# ifelse(myData$\$$ResearchGroup == "PD", 2, 1))
RG_ranked ~ Imaging + DemoGeno + UPDRS

# (3) Residual Variances
L_insular_cortex_ComputeArea ~~ L_insular_cortex_ComputeArea
L_insular_cortex_Volume ~~ L_insular_cortex_Volume
R_insular_cortex_ComputeArea ~~ R_insular_cortex_ComputeArea
R_insular_cortex_Volume ~~ R_insular_cortex_Volume
L_cingulate_gyrus_ComputeArea ~~ L_cingulate_gyrus_ComputeArea
L_cingulate_gyrus_Volume ~~ L_cingulate_gyrus_Volume
R_cingulate_gyrus_ComputeArea ~~ R_cingulate_gyrus_ComputeArea
R_cingulate_gyrus_Volume ~~ R_cingulate_gyrus_Volume
L_caudate_ComputeArea ~~ L_caudate_ComputeArea
L_caudate_Volume ~~ L_caudate_Volume
R_caudate_ComputeArea ~~ R_caudate_ComputeArea
R_caudate_Volume ~~ R_caudate_Volume
L_putamen_ComputeArea ~~ L_putamen_ComputeArea
L_putamen_Volume ~~ L_putamen_Volume
R_putamen_ComputeArea ~~ R_putamen_ComputeArea
R_putamen_Volume ~~ R_putamen_Volume
L_hippocampus_ComputeArea ~~ L_hippocampus_ComputeArea
L_hippocampus_Volume ~~ L_hippocampus_Volume
R_hippocampus_ComputeArea ~~ R_hippocampus_ComputeArea
R_hippocampus_Volume ~~ R_hippocampus_Volume
cerebellum_ComputeArea ~~ cerebellum_ComputeArea
cerebellum_Volume ~~ cerebellum_Volume
L_fusiform_gyrus_ComputeArea ~~ L_fusiform_gyrus_ComputeArea
L_fusiform_gyrus_Volume ~~ L_fusiform_gyrus_Volume
R_fusiform_gyrus_ComputeArea ~~ R_fusiform_gyrus_ComputeArea
R_fusiform_gyrus_Volume ~~ R_fusiform_gyrus_Volume
R_fusiform_gyrus_ShapeIndex ~~ R_fusiform_gyrus_ShapeIndex
R_fusiform_gyrus_Curvedness ~~ R_fusiform_gyrus_Curvedness
Sex ~~ Sex
Weight ~~ Weight
ResearchGroup ~~ ResearchGroup
VisitID ~~ VisitID
Age ~~ Age
chr12_rs34637584_GT ~~ chr12_rs34637584_GT
chr17_rs11868035_GT ~~ chr17_rs11868035_GT
chr17_rs11012_GT ~~ chr17_rs11012_GT
chr17_rs393152_GT ~~ chr17_rs393152_GT
chr17_rs12185268_GT ~~ chr17_rs12185268_GT
chr17_rs199533_GT ~~ chr17_rs199533_GT
UPDRS_Part_I_Summary_Score_Baseline ~~ UPDRS_Part_I_Summary_Score_Baseline
UPDRS_Part_I_Summary_Score_Month_03 ~~ UPDRS_Part_I_Summary_Score_Month_03
UPDRS_Part_I_Summary_Score_Month_06 ~~ UPDRS_Part_I_Summary_Score_Month_06
UPDRS_Part_I_Summary_Score_Month_09 ~~ UPDRS_Part_I_Summary_Score_Month_09
UPDRS_Part_I_Summary_Score_Month_12 ~~ UPDRS_Part_I_Summary_Score_Month_12
UPDRS_Part_I_Summary_Score_Month_18 ~~ UPDRS_Part_I_Summary_Score_Month_18
UPDRS_Part_I_Summary_Score_Month_24 ~~ UPDRS_Part_I_Summary_Score_Month_24
UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Baseline ~~ UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Baseline
UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_03 ~~ UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_03
UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_06 ~~ UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_06
UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_09 ~~ UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_09
UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_12 ~~ UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_12
UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_18 ~~ UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_18
UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_24 ~~ UPDRS_Part_II_Patient_Questionnaire_Summary_Score_Month_24
UPDRS_Part_III_Summary_Score_Baseline ~~ UPDRS_Part_III_Summary_Score_Baseline
UPDRS_Part_III_Summary_Score_Month_03 ~~ UPDRS_Part_III_Summary_Score_Month_03
UPDRS_Part_III_Summary_Score_Month_06 ~~ UPDRS_Part_III_Summary_Score_Month_06
UPDRS_Part_III_Summary_Score_Month_09 ~~ UPDRS_Part_III_Summary_Score_Month_09
UPDRS_Part_III_Summary_Score_Month_12 ~~ UPDRS_Part_III_Summary_Score_Month_12
UPDRS_Part_III_Summary_Score_Month_18 ~~ UPDRS_Part_III_Summary_Score_Month_18
UPDRS_Part_III_Summary_Score_Month_24 ~~ UPDRS_Part_III_Summary_Score_Month_24
X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Baseline ~~ X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Baseline
X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_06 ~~ X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_06
X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_12 ~~ X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_12
X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_24 ~~ X_Assessment_Non.Motor_Epworth_Sleepiness_Scale_Summary_Score_Month_24
X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Baseline ~~ X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Baseline
X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_06 ~~ X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_06
X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_12 ~~ X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_12
X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_24 ~~ X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Month_24

# (4) Residual Covariances
Sex ~~ Weight
'
# confirmatory factor analysis (CFA)
# The baseline is a null model in which the observed variables are constrained not to covary with any other variables.
# That is, the covariances are fixed to 0 and only individual variances are estimated. This represents
# a “reasonable worst-possible fitting model”, against which the new fitted model is compared
# to calculate appropriate model-quality indices (e.g., CFI).

# standardize all variables to avoid huge variations between variable distributions
library("MASS")
# myData <- read.csv("https://umich.instructure.com/files/330397/download?download_frd=1&verifier=3bYRT9FXgBGMCQv8MNxsclWnMgodiJRYo3ODFtDq",header=TRUE)

summary(myData)
myData2<-scale(myData); summary(myData2)

myDF <- data.frame(myData2)
# myDF3 <- subset(myDF, select=c("L_cingulate_gyrus_ComputeArea", "cerebellum_Volume", "Weight", "Sex", "Age", " UPDRS_part_I", "UPDRS_part_II", "UPDRS_part_III", "ResearchGroup"))

myDF3 <- subset(myDF, select=c("R_insular_cortex_ComputeArea", "R_insular_cortex_Volume", "Sex", "Weight", "ResearchGroup", "Age", "chr12_rs34637584_GT", "chr17_rs11868035_GT", "chr17_rs11012_GT"))
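The standardization step above relies on scale(), which returns a matrix rather than a data frame. The following sketch, using a tiny made-up data frame, shows what the transformation does and why the result is converted back with data.frame() or as.data.frame().

```r
# Toy data frame with columns on very different scales (hypothetical values)
toy <- data.frame(volume = c(11205, 7440, 9800, 12500),
                  age    = c(65.2, 67.6, 59.1, 71.3))

# scale() centers each column to mean 0 and rescales it to standard deviation 1;
# it returns a matrix, so convert back to a data frame for modeling
toy_std <- as.data.frame(scale(toy))

col_means <- colMeans(toy_std)
col_sds   <- apply(toy_std, 2, sd)
print(round(col_means, 10))
print(col_sds)
```

After standardization the variances of all variables are comparable, which tends to help the numerical optimization when indicators are measured in very different units.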

model3 <-
'
# latent variable definitions - defining how the latent variables are “manifested by” a set of observed
# (or manifest) variables, aka “indicators”
# (1) Measurement Model
# Imaging =~ L_cingulate_gyrus_ComputeArea + cerebellum_Volume
Imaging =~ R_insular_cortex_ComputeArea + R_insular_cortex_Volume
DemoGeno =~ Weight+Sex+Age
# UPDRS =~ UPDRS_Part_I_Summary_Score_Baseline+X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Baseline
UPDRS =~ UPDRS_part_I +UPDRS_part_II + UPDRS_part_III
# (2) Regressions
ResearchGroup ~ Imaging + DemoGeno + UPDRS
'

fit3 <- cfa(model3, data= myData2, missing='FIML') # deal with missing values (missing='FIML')
summary(fit3, fit.measures=TRUE)
lavaan (0.5-18) converged normally after 2044 iterations
Number of observations    1764
Number of missing patterns    3
Estimator    ML
Minimum Function Test Statistic    455.923
Degrees of freedom    15
P-value (Chi-square)    0.000
Model test baseline model:
Minimum Function Test Statistic    2625.020
Degrees of freedom    28
P-value    0.000
User model versus baseline model:
Comparative Fit Index (CFI)    0.830
Tucker-Lewis Index (TLI)    0.683
Loglikelihood and Information Criteria:
Loglikelihood user model (H0)    -51499.484
Loglikelihood unrestricted model (H1)    -51271.522
Number of free parameters    29
Akaike (AIC)    103056.967
Bayesian (BIC)    103215.752
Sample-size adjusted Bayesian (BIC)    103123.621
Root Mean Square Error of Approximation:
RMSEA    0.129
90 Percent Confidence Interval    0.119 0.139
P-value RMSEA <= 0.05    0.000
Standardized Root Mean Square Residual:
SRMR    0.062
Parameter estimates:
Information    Observed
Standard Errors    Standard

Estimate Std.err Z-value P(>|z|)

Latent variables:
Imaging =~
R_cnglt_gyr_V    1.000
L_cadt_CmptAr    493.058
DemoGeno =~
Weight    1.000
Sex    24.158
Age    0.094
UPDRS =~
UPDRS_part_I    1.000
UPDRS_part_II    7.389
Regressions:
ResearchGroup ~
Imaging    -0.000
DemoGeno    0.002
UPDRS    -0.323
Covariances:
Imaging ~~
DemoGeno    0.001
UPDRS    0.002
DemoGeno ~~
UPDRS    0.000
Intercepts:
R_cnglt_gyr_V    7895.658
L_cadt_CmptAr    635.570
Weight    82.048
Sex    1.340
Age    61.073
UPDRS_part_I    1.126
UPDRS_part_II    4.905
ResearchGroup    0.290
Imaging    0.000
DemoGeno    0.000
UPDRS    0.000
Variances:
R_cnglt_gyr_V    17070159.189
L_cadt_CmptAr    -536243845.090
Weight    274.912
Sex    96.664
Age    105.347
UPDRS_part_I    2.442
UPDRS_part_II    -0.256
ResearchGroup    0.149
Imaging    2206.397
DemoGeno    -0.165
UPDRS    0.550

====Output====
The lavaan SEM output has 3 parts:
# The first six lines, called the header, contain the following information:
#* the lavaan version number
#* whether lavaan converged normally, and the number of iterations needed
#* the number of observations that were effectively used in the analysis
#* the estimator that was used to obtain the parameter values (here: ML)
#* the model test statistic, the degrees of freedom, and the corresponding p-value
# Next come the fit measures, from the baseline model test through the value of the SRMR.
# The last section contains the parameter estimates. It first reports how the standard errors were obtained (whether the information matrix is expected or observed, and whether the standard errors are standard, robust, or based on the bootstrap). Then, it tabulates all free (and fixed) parameters that were included in the model. Typically, the latent variables are shown first, followed by covariances and (residual) variances. The first column (Estimate) contains the (estimated or fixed) parameter value for each model parameter; the second column (Std.err) contains the standard error for each estimated parameter; the third column (Z-value) contains the Wald statistic (obtained by dividing the parameter value by its standard error); and the last column contains the p-value for testing the null hypothesis that the parameter equals zero in the population.
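
The fit measures in the second part of the output can be recomputed from the test statistics printed above. The following is a minimal sketch in base R, assuming the usual textbook formulas for CFI, TLI, RMSEA, AIC and BIC (some software uses N-1 instead of N in the RMSEA); the variable names are illustrative and the numbers are copied from the summary of fit3.

```r
# Values copied from the summary(fit3) output above
chisq   <- 455.923;  df   <- 15    # user model test statistic and df
chisq_b <- 2625.020; df_b <- 28    # baseline model test statistic and df
n       <- 1764                    # number of observations
loglik  <- -51499.484; k <- 29     # user-model loglikelihood, free parameters

# Comparative Fit Index and Tucker-Lewis Index (user vs. baseline model)
cfi <- 1 - max(chisq - df, 0) / max(chisq_b - df_b, chisq - df, 0)
tli <- (chisq_b / df_b - chisq / df) / (chisq_b / df_b - 1)

# Root Mean Square Error of Approximation (assumed form with N in the denominator)
rmsea <- sqrt(max(chisq - df, 0) / (df * n))

# Information criteria
aic <- -2 * loglik + 2 * k
bic <- -2 * loglik + k * log(n)

round(c(CFI = cfi, TLI = tli, RMSEA = rmsea), 3)
round(c(AIC = aic, BIC = bic), 3)
```

Up to rounding of the printed loglikelihood, these values should agree with the CFI, TLI, RMSEA, AIC and BIC lines of the output.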

Note: You can get this type of error ''“…system is computationally singular: reciprocal condition…”,'' which indicates that the design matrix is not invertible. Thus, it can't be used to develop a regression model. This is due to linearly dependent columns, i.e., strongly correlated variables. Compute the pairwise covariances (or correlations) of your variables to investigate if there are any variables that can potentially be removed. You're looking for correlations whose absolute value is close to 1. We can also automate this variable selection by using a forward stepwise regression.
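
The pairwise-correlation check described in the note can be sketched as follows. This is only an illustration in base R: the helper <code>high_cor_pairs</code>, the cutoff of 0.9, and the simulated data frame <code>toy</code> are assumptions made for the example, not part of lavaan or of the original analysis.

```r
# Hypothetical helper: list all pairs of columns whose absolute
# correlation exceeds `cutoff`
high_cor_pairs <- function(df, cutoff = 0.9) {
  r   <- cor(df, use = "pairwise.complete.obs")
  idx <- which(abs(r) > cutoff & upper.tri(r), arr.ind = TRUE)
  data.frame(var1 = rownames(r)[idx[, "row"]],
             var2 = colnames(r)[idx[, "col"]],
             r    = r[idx],
             stringsAsFactors = FALSE)
}

# Simulated illustration: x2 is almost a linear copy of x1
set.seed(1)
x1  <- rnorm(200)
toy <- data.frame(x1 = x1,
                  x2 = 2 * x1 + rnorm(200, sd = 0.01),
                  x3 = rnorm(200))
high_cor_pairs(toy)
```

Any pair reported by such a check is a candidate for dropping one of its two variables before refitting the model.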

# Graphical fit model visualization
library(semPlot)
semPaths(fit3)

<center>[[Image:SMHS_BigDataBigSci4.png|500px]]</center>

semPaths(fit3, "std", ask = FALSE, as.expression = "edges", mar = c(3, 1, 5, 1))

<center>[[Image:SMHS_BigDataBigSci5.png|500px]]</center>

==See also==
* [[SMHS_BigDataBigSci_SEM_sem_vs_cfa| Next See: Differences and Similarities between '''sem'''() and '''cfa'''() ]]
* [[SMHS_BigDataBigSci_SEM| Back to Structural Equation Modeling (SEM)]]
* [[SMHS_BigDataBigSci_SEM_Ex1| Back to SEM Example 1: School Kids Mental Abilities]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php?title=SMHS_BigDataBigSci_SEM_Ex2}}

SMHS BigDataBigSci

2016-05-23T19:09:25Z

Pineaumi: /* Overview */

==[[SMHS| Scientific Methods for Health Sciences]] - Model-based Analyses ==

Structural Equation Modeling (SEM), Growth Curve Models (GCM), and Generalized Estimating Equation (GEE) Modeling

==Questions ==

*How to represent dependencies in linear models and examine causal effects?
*Is there a way to study population average effects of a covariate against specific individual effects?

==Overview==

SEM allows re-parameterization of random-effects to specify latent variables that may affect measures at different time points using structural equations. SEM shows variables having predictive (possibly causal) effects on other variables (denoted by arrows), where coefficients index the strength and direction of predictive relations. SEM does not offer much more than what classical regression methods do, but it does allow simultaneous estimation of multiple equations modeling complementary relations.

Growth Curve (or latent growth) modeling is a statistical technique employed in SEM for estimating growth trajectories for longitudinal data (over time). It represents repeated measures of dependent variables as functions of time and other covariates. When subjects or units are observed repeatedly over known time points, latent growth curve models reveal the trend of an individual as a function of an underlying growth process, where the growth curve parameters can be estimated for each subject/unit.

GEE is a marginal longitudinal method that directly assesses the mean relations of interest (i.e., how the mean dependent variable changes over time), accounting for covariances among the observations within subjects, and getting a better estimate and valid significance tests of the relations. Thus, GEE estimates two different equations, (1) for the mean relations, and (2) for the covariance structure. An advantage of GEE over random-effect models is that it does not require the dependent variable to be normally distributed. However, a disadvantage of GEE is that it is less flexible and versatile – commonly employed algorithms for it require a small-to-moderate number of time points evenly (or approximately evenly) spaced, and similarly spaced across subjects. Nevertheless, it is a little more flexible than repeated-measure ANOVA because it permits some missing values and has an easy way to test for and model away the specific form of autocorrelation within subjects.

GEE is mostly used when the study is focused on uncovering the population average effect of a covariate rather than the individual-specific effect. The two effects coincide in linear models, but generally differ in non-linear models.

For instance, suppose $Y_{i,j}$ is the $j^{th}$ binary observation of the $i^{th}$ subject and follows the random effects logistic model
$
\log\Bigg(\frac{p_{i,j}}{1-p_{i,j}} \Bigg)=μ+ν_i,
$
where $ν_i \sim N(0,σ^2)$ is a random effect for subject i and $p_{i,j}=P(Y_{i,j}=1|ν_i).$

(1) When using a random effects model on such data, the estimate of μ accounts for the fact that a mean zero normally distributed perturbation was applied to each individual, making it ''individual-specific''.

(2) When using a GEE model on the same data, we estimate the population average log odds,

\begin{equation}
δ=\log\Bigg(\frac{E_ν\Big(\frac{1}{1+e^{-(μ+ν_i)}}\Big)}{1-E_ν\Big(\frac{1}{1+e^{-(μ+ν_i)}}\Big)}
\Bigg),
\end{equation}

in general $μ≠δ$.

If $μ=1$ and $σ^2=1$, then $δ≈.83$.
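
This value can be checked by numerical integration. The sketch below uses only base R (<code>integrate</code>, <code>plogis</code>, <code>dnorm</code>, <code>qlogis</code>); the variable names are illustrative.

```r
m <- 1; s <- 1   # intercept μ and random-effect standard deviation σ

# Population-average probability: expectation over ν ~ N(0, s^2)
# of the conditional probability 1 / (1 + exp(-(m + ν)))
p_avg <- integrate(function(v) plogis(m + v) * dnorm(v, mean = 0, sd = s),
                   lower = -Inf, upper = Inf)$value

delta <- qlogis(p_avg)   # population-average log odds
delta
```

The printed value should be close to 0.83, i.e., smaller than μ = 1.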

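Empirically:

m <- 1; s <- 1                            # intercept μ and random-effect SD σ
v <- rnorm(1000, 0, s)                    # simulated random effects ν_i
v_mean <- mean(1/(1 + exp(-(m + v))))     # average probability over the random effects
d <- log(v_mean/(1 - v_mean)); d          # population-average log odds, roughly 0.83 (varies with the random draw)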

Note that the random effects have mean zero on the transformed (linked) scale, but their effect is not mean zero on the original scale of the data. We can also simulate data from a mixed effects logistic regression model and compare the population level average with the inverse-logit of the intercept to see that they are not equal. This leads to a difference in the interpretation of the coefficients between GEE and random effects models, or SEM.
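
The simulation mentioned in the previous paragraph could look like the sketch below. The design (5000 subjects, 10 observations each), the seed, and all variable names are assumptions chosen for illustration; only base R functions are used.

```r
set.seed(2016)
n_subj <- 5000; n_obs <- 10   # hypothetical design: subjects x repeated measures
mu <- 1; sigma <- 1           # intercept and random-intercept SD

nu <- rnorm(n_subj, 0, sigma)                     # subject-specific random intercepts
p  <- plogis(mu + rep(nu, each = n_obs))          # conditional success probabilities
y  <- rbinom(n_subj * n_obs, size = 1, prob = p)  # simulated binary outcomes

pop_avg   <- mean(y)      # population-level average of Y
indiv_spc <- plogis(mu)   # inverse-logit of the intercept (a subject with ν = 0)

c(population_average      = pop_avg,
  inverse_logit_intercept = indiv_spc,
  population_log_odds     = qlogis(pop_avg))
```

The population average falls below the inverse-logit of the intercept, and its log odds is noticeably smaller than μ = 1.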

That is, there will be a difference between the GEE population average coefficients and the individual specific coefficients (random effects models).

Theoretically, when it can be computed:

the individual-specific log odds is $μ=1$ (in this specific case), but the population average log odds,
$δ=\log\Bigg[\frac{P(Y_{i,j}=1)}{1-P(Y_{i,j}=1)}\Bigg]$ with $P(Y_{i,j}=1)=E_ν\big(P(Y_{i,j}=1|ν_i)\big)$, would be $<1$.<sup>1</sup>
This is related to the fact that a grand-total average need not be equal to an average of partial averages.

The mean of the $i^{th}$ person in the $j^{th}$ observation (e.g., location, time, etc.) can be expressed by:

$E(Y_{ij} | X_{ij},α_j)= g[μ(X_{ij}|β)+U_{ij}(α_j,X_{ij})]$,

where $μ(X_{ij}|β)$ is the average “response” of a person with the same covariates $X_{ij}$, $β$ is a set of fixed effect coefficients, $U_{ij}(α_j,X_{ij})$ is an error term that is a function of the (time, space) random effects, $α_j$, and also a function of the covariates $X_{ij}$, and $g$ is the inverse of the '''link function''', $g^{-1}$, which specifies the regression type -- e.g.,

*'''linear:''' $g^{-1}(u)=u,$

*'''log:''' $g^{-1}(u)=\log(u),$

*'''logistic:''' $g^{-1}(u)=\log(\frac{u}{1-u}),$

with the error term satisfying $E(U_{ij}(α_j,X_{ij})|X_{ij})=0.$

The link function provides the relationship between the linear predictor and the mean of the distribution function. For practical applications there are many commonly used link functions. It makes sense to try to match the domain of the link function to the range of the distribution function's mean.
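
As a small illustration of how a link function and its inverse move between the linear predictor and the mean, the sketch below uses <code>make.link</code> from base R's stats package. The values in <code>eta</code> are arbitrary; <code>linkfun</code> plays the role of $g^{-1}$ in the list above and <code>linkinv</code> the role of $g$.

```r
eta <- c(-2, 0, 1.5)                 # example linear-predictor values

logit <- make.link("logit")          # list with linkfun and linkinv
mu    <- logit$linkinv(eta)          # mean function: maps eta into (0, 1)
logit$linkfun(mu)                    # link function: recovers eta

log_link <- make.link("log")
log_link$linkinv(eta)                # exp(eta), always positive

make.link("identity")$linkinv(eta)   # unchanged, any real value
```

The range of each inverse link matches the range of the mean it is meant to model: (0, 1) for the logit, positive values for the log, and the whole real line for the identity.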

<center>Common distributions with typical uses and canonical link functions</center>
<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
|Distribution ||Support of distribution||Typical uses||Link name||Link function||Mean function
|-
|Normal||real: $(-∞, +∞)$||Linear-response data||Identity||$X\beta=\mu$||$\mu=X\beta$
|-
|Exponential, Gamma||real:$(0, +∞)$||Exponential-response data, scale parameters||Inverse||$X\beta=-\mu^{-1}$||$\mu=-(X\beta)^{-1}$
|-
|Inverse Gaussian||real:$(0, +∞)$|| ||Inverse squared||$X\beta=-\mu^{-2}$||$\mu=(-X\beta)^{-1/2}$
|}
</center>

===Footnotes===

*1 http://www.researchgate.net/publication/41895248

==Model-based Analytics==

===[[SMHS_BigDataBigSci_SEM| Structural Equation Modeling (SEM)]]===

===[[SMHS_BigDataBigSci_GCM| Growth Curve Modeling (GCM)]]===

===[[SMHS_BigDataBigSci_GEE| Generalized Estimating Equation (GEE) Modeling]]===

===[[SMHS_BigDataBigSci_CrossVal|Internal Validation - Statistical n-fold cross-validation]]===

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php?title=SMHS_BigDataBigSci}}

SMHS MethodsHeterogeneity CER

2016-05-23T19:03:37Z

Pineaumi: /* Footnotes */

==[[SMHS_MethodsHeterogeneity| Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research]] - Comparative Effectiveness Research: Case Studies 13 (CER) ==

===Observational Studies: Tips for the CER Practitioners===

*Different study types can offer different understandings; neither should be discounted without closer examination.

*RCTs provide an accurate understanding of the effect of a particular intervention in a well-defined patient group under “controlled” circumstances.

*Observational studies provide an understanding of real-world care and its impact, but can be biased due to uncontrolled factors.

*Observational studies differ in the types of databases used. These databases may lack clinical detail and contain incomplete or inaccurate data.

*Before accepting the findings from an observational study, consider whether confounding factors may have influenced the results.

*In this scenario, subgroup analysis was vital in clarifying both study designs; what is true for the many (e.g., overall, estrogen appeared to be detrimental) may not be true for the few (e.g., that for the younger post-menopausal woman, the benefits were greater and the harms less frequent).

*Carefully examine the generalizability of the study. Do the study’s patients and intervention match those under consideration?

*Observational studies can identify associations but cannot prove cause-and-effect relationships.

===Case-Study 1: The Cetuximab Study<sup>14</sup>===

What was done and what was found?

Cetuximab, an anti-epidermal growth factor receptor (EGFR) agent, has recently been added to the therapeutic armamentarium. Two important RCTs examined its impact in patients with metastatic colorectal cancer (mCRC). In the first one, 56 centers in 11 European countries investigated the outcomes associated with cetuximab therapy in 329 mCRC patients who experienced disease progression either on irinotecan therapy or within 3 months thereafter. The study reported that the group on a combination of irinotecan and cetuximab had a significantly higher rate of overall response to treatment (primary endpoint) than the group on cetuximab alone: 22.9% (95% CI, 17.5-29.1%) vs. 10.8% (95% CI, 5.7-18.1%) (P=0.007), respectively. Similarly, the median time to progression was significantly longer in the combination therapy group (4.1 vs. 1.5 months, P<0.001). As these patients had already progressed on irinotecan prior to the study, any response was viewed as positive. Safety between the two treatment arms was similar: approximately 80% of patients in each arm experienced a rash. Grade 3 or 4 (the more severe) toxic effects on the skin were slightly more frequent in the combination-therapy group compared to cetuximab monotherapy, observed in 9.4% and 5.2% of participants, respectively. Other side effects, such as diarrhea and neutropenia observed in the combination-therapy arm, were considered to be in the range expected for irinotecan alone. Data from this study demonstrated the efficacy and safety of cetuximab and were instrumental in the FDA’s 2004 approval.

A second RCT (2007) examined 572 patients and suggested efficacy of cetuximab in the treatment of mCRC. This study was a randomized, non-blinded, controlled trial that examined cetuximab monotherapy plus best supportive care compared to best supportive care alone in patients who had received and failed prior chemotherapy regimens. It reported that median overall survival (the primary endpoint) was significantly longer in patients receiving cetuximab plus best supportive care compared to best supportive care alone (6.1 vs. 4.6 months, respectively) (hazard ratio for death=0.77; 95% CI: 0.64-0.92, P=0.005). This RCT described a greater incidence of adverse events in the cetuximab plus best supportive care group compared to best supportive care alone including (most significantly) rash, as well as edema, fatigue, nausea and vomiting.

Was this the right answer?

These RCTs had fairly broad enrollment criteria and the cetuximab benefits were modest. Emerging scientific theories raised the possibility that genetically defined population subsets might experience a greater-than-average treatment benefit. One such area of inquiry entailed examining “biomarkers,” or genetic indicators of a patient’s greater response to therapy. Even as the above RCTs were being conducted, data emerged showing the importance of the KRAS gene.

Emerging Data

Based on the emerging biochemical evidence that the epidermal growth factor receptor (EGFR) treatment mechanism (cetuximab) was even more finely detailed than previously understood, the study authors of the 2007 RCT undertook a retrospective subgroup analysis using tumor tissue samples preserved from their initial study. Following laboratory analysis, all viable tissue samples were classified as having a wild-type (non-mutated) or a mutated KRAS gene. Instead of the previous two study arms (cetuximab plus best supportive care vs. best supportive care alone), there were four for this new analysis: each of the two original study arms was further divided by wild-type vs. mutated KRAS status. Laboratory evaluation determined that 40.9% of patients in the cetuximab plus best supportive care group and 42.3% of patients in the best supportive care alone group had a KRAS mutation. The efficacy of cetuximab was found to be significantly correlated with KRAS status: in patients with wild-type (non-mutated) KRAS genes, cetuximab plus best supportive care compared to best supportive care alone improved overall survival (median 9.5 vs. 4.8 months, respectively; hazard ratio for death=0.55; 95% CI, 0.41-0.74, P<0.001), and progression-free survival (median 3.7 vs. 1.9 months, respectively; hazard ratio for progression or death=0.40; 95% CI, 0.30-0.54, P<0.001). Meanwhile, in patients with mutated KRAS tumors, the authors found no significant difference in outcome between cetuximab plus best supportive care vs. best supportive care alone.

What next?

Based on these and similar results from other studies, the FDA narrowed its product labeling in July 2009 to indicate that cetuximab is not recommended for mCRC patients with mutated KRAS tumors. This distinction reduces the relevant population by approximately 40%. Similarly, the American Society of Clinical Oncology released a provisional clinical recommendation that all mCRC patients have their tumors tested for KRAS status before receiving anti-EGFR therapy. The benefits of targeted treatment are many. Patients who previously underwent cetuximab therapy without knowing their genetic predisposition would no longer have to be exposed to the drug’s toxic effects if unnecessary, as the efficacy of cetuximab is markedly higher in the genetically defined appropriate patients. In a less-uncertain environment, clinicians can be more confident in advocating a course of action in their care of patients. And finally, knowledge that targeted therapy is possible suggests the potential for further innovation in treatment options. In fact, research continues to demonstrate options for targeted cetuximab treatment of mCRC at an even finer scale than seen with KRAS; and similar genetic targeting is being investigated, and advocated, in other cancer types.

Lessons Learned From this Case Study

Although RCTs are generally viewed as the gold standard, results of one or even a series of trials may not accurately reflect the benefits experienced by an individual patient. This case-study suggests that cetuximab initially appeared to have rather modest clinical benefits. However, new information that became available and subsequent genetic subgroup assessments led to very different conclusions. Clinicians should be aware that the current knowledge is likely to evolve and any decisions about patient care should be carefully considered with that sense of uncertainty in mind. As in this case study, subgroup analyses (e.g., genetic subtypes) need a theoretical rationale. Ideally, the analyses should be determined at the time of original RCT design and should not just occur as explorations of the subsequent data. When improperly employed, post hoc analyses may lead to incorrect patient care conclusions.

RCTs: Tips for the CER Practitioners

*RCTs can determine whether an intervention can provide benefit in a very controlled environment.

*The controlled nature of an RCT may limit its generalizability to a broader population.

*No results are permanent; advances in scientific knowledge and understanding can influence how we view the effectiveness (or safety) of a therapeutic intervention.

*Targeted therapy illuminated by carefully thought out subgroup analyses can improve the efficacious and safe use of an intervention.

===Case-Study 2: The Rosiglitazone Study<sup>15</sup>===

Meta-analysis

Often the results for the same intervention differ across clinical trials and it may not be clear whether one therapy provides more benefit than another. As CER increases and more studies are conducted, clinicians and policymakers are more likely to encounter this scenario. In a systematic review, a researcher identifies similar studies and displays their results in a table, enabling qualitative comparisons across the studies. With a meta-analysis, the data from included studies are statistically combined into a single “result.” Merging the data from a number of studies increases the effective sample size of the investigation, providing a statistically stronger conclusion about the body of research. By so doing, investigators may detect low frequency events and demonstrate more subtle distinctions between therapeutic alternatives.

When studies have been properly identified and combined, the meta-analysis produces a summary estimate of the findings and a confidence interval that can serve as a benchmark in medical opinion and practice. However, when done incorrectly, the quantitative and statistical analysis can create impressive “numbers” but biased results. The following are important criteria for properly conducted meta-analyses:

1. Carefully defining unbiased inclusion or exclusion criteria for study selection

2. Including only those studies that have similar design elements, such as patient population, drug regimen, outcomes being assessed, and time-frame

3. Applying correct statistical methods to combine and analyze the data

Reporting this information is essential for the reader to determine whether the data were suitable to combine, and if the meta-analysis draws unbiased conclusions. Meta-analyses of randomized clinical trials are considered to be the highest level of medical evidence as they are based upon a synthesis of rigorously controlled trials that systematically reduce bias and confounding. This technique is useful in summarizing available evidence and will likely become more common in the era of publicly funded comparative effectiveness research. The following case study will examine several key principles that will be useful as the reader encounters these publications.

Clinical Application

Heart disease is the leading cause of mortality in the United States, resulting in approximately 20% of all deaths. Diabetics are particularly susceptible to heart disease, with more than 65% of deaths attributable to it. The nonfatal complications of diabetes are wide-ranging and include kidney failure, nerve damage, amputation, stroke and blindness, among other outcomes. In 2007, the total estimated cost of diabetes in the United States was 174 billion US dollars; 116 billion was derived from direct medical expenditures and the rest from the indirect cost of lost productivity due to the disease. With such serious health effects and heavy direct and indirect costs tied to diabetes, proper disease management is critical. Historically, diabetes treatment has focused on strict blood sugar control, assuming that this goal not only targets diabetes but also reduces other serious comorbidities of the disease.

Anti-diabetic agents have long been associated with key questions as to their benefits/risks in the treatment of diabetes. The sulfonylurea tolbutamide, a first generation anti-diabetic drug, was found in a landmark study in the 1970s to significantly increase the cardiovascular (CV) mortality rate compared to patients not on this agent. Further analysis by external parties concluded that the methods employed in this trial were significantly flawed (e.g., use of an “arbitrary” definition of diabetes status, heterogeneous baseline characteristics of the populations studied, and incorrect statistical methods). Since these early studies, CV concerns continue to be an issue with selected oral hypoglycemic agents that have subsequently entered the marketplace.

A class of drugs, thiazolidinedione (TZD), was approved in the late 1990s, as a solution to the problems associated with the older generation of sulfonylureas. Rosiglitazone, a member of the TZD class, was approved by the FDA in 1999 and was widely prescribed for the treatment of type-2 diabetes. A number of RCTs supported the benefit of rosiglitazone as an important new oral antidiabetic agent. However, safety concerns developed as the FDA received reports of adverse cardiac events potentially associated with rosiglitazone. It was in this setting that a meta-analysis by Nissen and Wolski was published in the New England Journal of Medicine in June 2007.

What was done?

Nissen and Wolski conducted a meta-analysis examining the impact of rosiglitazone on cardiac events and mortality compared to alternative therapeutic approaches. The study began with a broad search to locate potential studies for review. The authors screened published phase II, III, and IV trials; the FDA website; and the drug manufacturer’s clinical-trial registry for applicable data relating to rosiglitazone use. When the initial search was complete, the studies were further categorized by pre-stated inclusion criteria. Meta-analysis inclusion criteria were simple: studies had to include rosiglitazone and a randomized comparator group treated with either another drug or placebo, study arms had to show similar length of treatment, and all groups had to have received more than 24 weeks of exposure to the study drugs. The studies had to contain outcome data of interest including the rate of myocardial infarction (MI) or death from all CV causes. Out of 116 studies surveyed by the authors, 42 met their inclusion criteria and were included in the meta-analysis. Of the studies they included, 23 had durations of 26 weeks or less, and only five studies followed patients for more than a year. Until this point, the study’s authors were following a path similar to that of any reviewer interested in CV outcomes, examining the results of these 42 studies and comparing them qualitatively. Quantitatively combining the data, however, required the authors to make choices about the studies they could merge and the statistical methods they should apply for analysis. Those decisions greatly influenced the results that were reported.

What was found?

When the studies were combined, the meta-analysis contained data from 15,565 patients in the rosiglitazone group and 12,282 patients as comparators. Analyzing their data, the authors chose one particular statistical method (the Peto odds ratio method, a fixed-effect statistical approach), which calculates the odds of events occurring where the outcomes of interest are rare and small in number. In comparing rosiglitazone with a “control” group that included other drugs or placebo, the authors reported odds ratios of 1.43 (95% CI, 1.03-1.98; P=0.03) and 1.64 (95% CI, 0.98-2.74; P=0.06) for MI and death from CV causes, respectively. In other words, the odds of an MI or death from a CV cause are higher for rosiglitazone patients than for patients on other therapies or placebo. The authors reported that rosiglitazone was significantly associated with an increase in the risk of MI and had borderline significance in increasing the risk of death from all CV causes. These findings appeared online on the same day that the FDA issued a safety alert regarding rosiglitazone. Discussion of the meta-analysis was immediately featured prominently in the news media. By December 2007, prescription claims for the drug at retail pharmacies had fallen by more than 50%.
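
The reported P-values are consistent with the reported confidence intervals, as the short check below suggests. It is a sketch that assumes the intervals are symmetric on the log odds ratio scale (Wald-type); the helper <code>p_from_ci</code> is hypothetical and is not the method used by the study authors.

```r
# Approximate two-sided p-value recovered from an odds ratio and its CI,
# assuming the interval is symmetric on the log scale
p_from_ci <- function(or, lower, upper, level = 0.95) {
  zcrit <- qnorm(1 - (1 - level) / 2)
  se    <- (log(upper) - log(lower)) / (2 * zcrit)  # SE of the log odds ratio
  z     <- log(or) / se
  2 * pnorm(-abs(z))
}

p_mi <- p_from_ci(1.43, 1.03, 1.98)   # myocardial infarction
p_cv <- p_from_ci(1.64, 0.98, 2.74)   # death from CV causes
round(c(MI = p_mi, CV_death = p_cv), 2)
```

The interval for MI excludes 1 and gives a p-value of about 0.03, while the interval for CV death includes 1 and gives about 0.06, matching the "borderline significance" wording.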

As diabetic patients and their clinicians reacted to the news, a methodologic debate also ensued. This discussion included statistical issues pertaining to the conduct of the analysis, its implications for clinical care, and finally the FDA and drug manufacturer’s roles in overseeing and regulating rosiglitazone. The concern among patients with diabetes regarding treatment continues in the medical community today.

Was this the right answer?

Should the studies have been combined? Commentators faulted the authors for including several studies that were not originally intended to investigate diabetes, and for combining both placebo and drug therapy data into one comparator arm. Some critics noted that despite the stated inclusion criteria, some data were derived from studies where the rosiglitazone arm was allowed a longer follow-up than the comparator arm. By failing to account for this longer follow-up period, commentators felt that the authors may have overestimated the effect of rosiglitazone on CV outcomes. Many reviewers were concerned that this meta-analysis excluded trials in which no patients suffered an MI or died from CV causes – the outcomes of greatest interest. Some reviewers also noted that the exclusion of zero-event trials from the pooled dataset not only gave an incomplete picture of the impact of rosiglitazone but could have increased the odds ratio estimate. In general, the pooled dataset was criticized by many for being a faulty microcosm of the information available regarding rosiglitazone.

It is essential that a meta-analysis be based on similarity in the data sources. If studies differ in important areas such as the patient populations, interventions, or outcomes, combining their data may not be suitable. The researchers accepted studies and populations that were clinically heterogeneous, yet pooled them as if they were not. The study reported that the results were combined from a number of trials that were not initially intended to investigate CV outcomes. Furthermore, the available data did not allow for time-to-event analysis, an essential tool in comparing the impact of alternative treatment options. Reviewers considered the data to be insufficiently homogeneous, and the line of cause and effect to be murkier than the authors described.

Were the statistical methods optimal?

The statistical methods for this meta-analysis also came under significant criticism. The critiques focused on the authors’ use of the Peto method as being an incorrect choice because data were pooled from both small and very large studies, resulting in a potential overestimation of treatment effect. Other reviewers pointed out that the Peto method should not have been used, as a number of the underlying studies did not have patients assigned equally to rosiglitazone and comparator groups. Finally, critics suggested that the heterogeneity of the included studies required an altogether different set of analytic techniques.

Demonstrating the sensitivity of the authors’ initial analysis to the inclusion criteria and statistical tests used, a number of researchers reworked the data from this study. One researcher used the same studies but analyzed the data with a more commonly used statistical method (Mantel-Haenszel), and found no significant increase in the relative risk or common odds ratio with MI or CV death. When the pool of studies was expanded to include those originally eliminated because they had zero CV events, the odds ratios for MI and death from CV causes dropped from 1.43 to 1.26 (95% CI, 0.93-1.72) and from 1.64 to 1.14 (95% CI, 0.74-1.74), respectively. Neither of the recalculated odds ratios was significant for MI or CV death. Finally, several newer long-term studies have been published since the Nissen meta-analysis. Incorporating their results with the meta-analysis data showed that rosiglitazone is associated with an increased risk of MI but not of CV death. Thus, the findings from these meta-analyses varied with the methods employed, the studies included, and the addition of later trials.
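
To make the two pooling methods named above concrete, the sketch below computes a Peto and a Mantel-Haenszel pooled odds ratio in base R using the standard textbook formulas. The trial counts are made up for illustration and are not the rosiglitazone data; the function and variable names are hypothetical.

```r
# Peto pooled odds ratio from per-trial events and group sizes
peto_or <- function(ev_trt, n_trt, ev_ctl, n_ctl) {
  N <- n_trt + n_ctl                 # trial size
  m <- ev_trt + ev_ctl               # total events in the trial
  E <- n_trt * m / N                 # expected treatment-arm events under no effect
  V <- n_trt * n_ctl * m * (N - m) / (N^2 * (N - 1))  # hypergeometric variance
  exp(sum(ev_trt - E) / sum(V))
}

# Mantel-Haenszel pooled odds ratio (no continuity correction)
mh_or <- function(ev_trt, n_trt, ev_ctl, n_ctl) {
  N <- n_trt + n_ctl
  sum(ev_trt * (n_ctl - ev_ctl) / N) / sum((n_trt - ev_trt) * ev_ctl / N)
}

# Made-up counts for four trials; the last trial has zero events in both arms
ev_trt <- c(2, 5, 1, 0); n_trt <- c(200, 1500, 300, 120)
ev_ctl <- c(1, 3, 1, 0); n_ctl <- c(100, 1500, 150, 60)

c(Peto = peto_or(ev_trt, n_trt, ev_ctl, n_ctl),
  MH   = mh_or(ev_trt, n_trt, ev_ctl, n_ctl))

# Dropping the zero-event trial leaves the Peto estimate unchanged
keep <- (ev_trt + ev_ctl) > 0
all.equal(peto_or(ev_trt, n_trt, ev_ctl, n_ctl),
          peto_or(ev_trt[keep], n_trt[keep], ev_ctl[keep], n_ctl[keep]))
```

In both formulas a trial with no events in either arm contributes zero to every sum, so such trials only affect the pooled estimate if some adjustment (for example, a continuity correction) is applied.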

Emerging Data

The controversy surrounding the rosiglitazone meta-analysis authored by Nissen and Wolski forced an unplanned interim analysis of a long-term, randomized trial investigating the CV effects of rosiglitazone among patients with type 2 diabetes. The authors of the RECORD trial noted that even though the follow-up at 3.75 years was shorter than expected, rosiglitazone, when added to standard glucose-lowering therapy, was found to be associated with an increase in the risk of heart failure but was not associated with any increase in death from CV or other causes. Data at the time were found to be insufficient to determine the effect of rosiglitazone on an increase in the risk of MI. The final report of that trial, published in June 2009, confirmed the elevated risk of heart failure in people with type 2 diabetes treated with rosiglitazone in addition to glucose-lowering drugs, but continued to show inconclusive results about the effect of the drug therapy on the risk of MI. Further, the RECORD trial clarified that rosiglitazone does not result in an increased risk of CV morbidity or mortality compared to standard glucose-lowering drugs. Other trials conducted since the publishing of the meta-analysis have corroborated these results, casting further doubt on the findings of the meta-analysis published by Nissen and Wolski.

Now what?

Some sources suggest that the original Nissen meta-analysis delivered more harm than benefit, and that a well-recognized medical journal may have erred in its process of peer review. Despite this criticism, it is important to note that subsequent publications support the risk of adverse CV events associated with rosiglitazone, although rosiglitazone use does not appear to increase deaths. These results and emerging data point to the need for further rigorous research to clarify the benefits and risks of rosiglitazone on a variety of outcomes, and the importance of directing the drug to the population that will maximally benefit from its use.

Lessons Learned From this Case Study

Results from initial randomized trials that seem definitive at one time may not be conclusive, as further trials may emerge to clarify, redirect, or negate previously accepted results. A meta-analysis of those trials can lead to varying results based upon the timing of the analysis and the choices made in its performance.

Meta-Analysis: Tips for CER Practitioners

*The results of a meta-analysis are highly dependent on the studies included (and excluded). Are these criteria properly defined and relevant to the purposes of the meta-analysis? Were the combined studies sufficiently similar? Can results from this cohort be generalized to other populations of interest?

*The statistical methodology can impact study results. Have there been reviews critiquing the methods used in the meta-analysis?

*A variety of statistical tests should be considered, and perhaps reported, in the analysis of results. Do the authors mention their rationale in choosing a statistical method? Do they show the stability of their results across a spectrum of analytical methods?

*Nothing is permanent. Emerging data may change the playing field, and meta-analysis results are only as good as the data and statistics from which they are derived.

===Case-Study 3: The Nurses’ Health Study16===

An observational study

An observational study is a very common type of research design in which the effects of a treatment or condition are studied without formally randomizing patients in an experimental design. Such studies can be done prospectively, wherein data are collected about a group of patients going forward in time; or retrospectively, in which the researcher looks into the past, mining existing databases for data that have already been collected. The latter studies are frequently performed by using an electronic database that contains, for example, administrative, “billing,” or claims data. Less commonly, observational research uses electronic health records, which have greater clinical information that more closely resembles the data collected in an RCT. Observational studies often take place in “real-world” environments, which allow researchers to collect data for a wide array of outcomes. Patients are not randomized in these studies, but the findings can be used to generate hypotheses for investigation in a more constrained experimental setting. Perhaps the best known observational study is the “Framingham study,” which collected demographic and health data for a group of individuals over many years (and continues to do so) and has provided an understanding of the key risk factors for heart disease and stroke.

Observational studies present many advantages to the comparative effectiveness researcher. The study design can provide a unique glimpse of the use of a health care intervention in the “real world,” an essential step in gauging the gap between efficacy (can a treatment work in a controlled setting?) and effectiveness (does the treatment work in a real-life situation?). Furthermore, observational studies can be conducted at low cost, particularly if they involve the secondary analysis of existing data sources. CER often uses administrative databases, which are based upon the billing data submitted by providers during routine care. These databases typically have limited clinical information, may have errors in them, and generally do not undergo auditing.

The uncontrolled nature of observational studies allows them to be subject to bias and confounding. For example, doctors may prescribe a new medication only for the sickest patients. Comparing these outcomes (without careful statistical adjustment) with those from less ill patients receiving alternative treatment may lead to misleading results. Observational studies can identify important associations but cannot prove cause and effect. These studies can generate hypotheses that may require RCTs for fuller demonstration of those relationships. Secondary analysis can also be problematic if researchers overwork datasets by doing multiple exploratory analyses (e.g., data-dredging): the more we look, the more we find, even if those findings are merely statistical aberrations. Unfortunately, the growing need for CER and the wide availability of administrative databases may lead to selection of research of poor quality with inaccurate findings.
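The prescribing example above (a new medication given mainly to the sickest patients) can be illustrated with a small simulation. This is only a sketch on synthetic data, not an analysis from any study discussed here; the variable names (severity, new_drug, outcome) and all effect sizes are hypothetical assumptions. The drug is simulated to have no effect at all, so any apparent effect in the unadjusted comparison is purely confounding.

```r
# Synthetic illustration of confounding by indication (all values hypothetical).
set.seed(1)
n <- 5000
severity <- rnorm(n)                               # illness severity (the confounder)
new_drug <- rbinom(n, 1, plogis(1.5 * severity))   # sicker patients get the new drug more often
# True model: the drug has NO effect; severity alone worsens the outcome score.
outcome  <- 2 * severity + rnorm(n)

# Unadjusted comparison versus a model that adjusts for severity.
naive    <- coef(lm(outcome ~ new_drug))["new_drug"]
adjusted <- coef(lm(outcome ~ new_drug + severity))["new_drug"]

print(c(naive = unname(naive), adjusted = unname(adjusted)))
```

In this setup the unadjusted coefficient suggests the drug is harmful, while the adjusted coefficient is close to zero. In real administrative data the confounder is often unmeasured, which is why such adjustment may not be possible.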

In comparative effectiveness research, observational studies are typically considered to be less conclusive than RCTs and meta-analyses. Nonetheless, they can be useful, especially because they examine typical care. Due to lower cost and improvements in health information, observational studies will become increasingly common. Critical assessment of whether the described results are helpful or biased (based upon how the study was performed) is necessary. This case will illustrate several characteristics of the types of studies that will assist in evaluating newly published work.

Clinical Applications

Cardiovascular diseases (CVD) are the leading cause of death in women older than the age of 50. Epidemiologic evidence suggests that estrogen is a key mediator in the development of CVD. Estrogen is an ovarian hormone whose production decreases as women approach menopause. The steep increase in CVD in women at menopause and older and in women who have had hysterectomies further supports a relationship between estrogen and CVD. Building on this evidence of biologic plausibility, epidemiological and observational studies suggested that estrogen replacement therapy (a form of hormone replacement therapy, or HRT) had positive effects on the risk of CVD in postmenopausal women (albeit with some negative effects in its potential to increase the risk for breast cancer and stroke). Based on these findings, in the 1980s and 1990s HRT was routinely employed to treat menopausal symptoms and serve as prophylaxis against CVD.

What was done?

The Nurses’ Health Study (NHS) began collecting data in 1976. In the study, researchers intended to examine a broad range of health effects in women over a long period of time, and a key goal was to clarify the role of HRT in heart disease. The cohort (i.e., the group being followed) included married registered nurses aged 30-55 in 1976 who lived in the 11 most populous states. To collect data, the researchers mailed the study participants a survey every 2 years that asked questions about topics such as smoking, hormone use, menopausal status, and less frequently, diet. Data were collected for key end points that included MI, coronary-artery bypass grafting or angioplasty, stroke, total CVD mortality, and deaths from all causes.

What was found?

At a 10-year follow-up point, the NHS had a study pool of 48,470 women. The researchers found that estrogen use (alone, without progestin) in postmenopausal women was associated with a reduction in the incidence of CVD as well as in CVD mortality compared to non-users. Later, estrogen-progestin combination therapy was shown to be even more cardioprotective than estrogen monotherapy, and lower doses of estrogen replacement therapy were found to deliver equal cardioprotection and lower the risk for adverse events. NHS researchers were alert to the potential for bias in observational studies. Adjustment for risk factors such as age (a typical practice to eliminate confounding) did not change the reported findings.

Was this the right answer?

The NHS was not unique in reporting the benefits associated with HRT; other observational studies corroborated the NHS findings. A secondary retrospective data analysis of the UK primary care electronic medical record database, for example, also showed the protective effect associated with HRT use. Researchers were aware of the fundamental limitations of observational studies, particularly with regard to selection bias. They and practicing clinicians were also aware of the potential negative health effects of HRT, which had to be constantly weighed against the potential cardioprotective benefits in deciding a patient’s course of treatment. As a large section of the population could experience the health effects of HRT, researchers began planning RCTs to verify the promising observational study results. It was highly anticipated that those RCTs would corroborate the belief that estrogen replacement can reduce CVD risk.

Randomized Controlled Trial: The Women’s Health Initiative

The Women’s Health Initiative (WHI) was a major study established by the National Institutes of Health in 1992 to assess a broad range of health effects in postmenopausal women. The trial was intended to follow these women for 8 years, at a cost of millions of dollars in federal funding. Among its many facets, it included an RCT to confirm the results from the observational studies discussed above. To fully investigate earlier findings, the WHI had two subgroups. One subgroup consisted of women with prior hysterectomies; they received estrogen monotherapy. The second group consisted of women who had not undergone hysterectomy; they received estrogen in combination with progestin. The WHI enrolled 27,347 women in their HRT investigation: 10,739 in the estrogen-alone arm and 16,608 in the estrogen plus progestin arm. Within each arm, women were randomly assigned to receive either HRT or placebo. All women in the trial were postmenopausal and aged 50-79 years; the mean age was 63.6 years (a fact that would be important in later analysis). Some participants had experienced previous CV events. The primary outcome of both subgroups was coronary heart disease (CHD), as described by nonfatal MI or death due to CHD.

The estrogen-progestin arm of the WHI was halted after a mean follow-up of 5.2 years, 3 years earlier than expected, as the HRT users in this arm were found to be at increased risk for CHD compared to those who received placebo. The study also noted elevated rates of breast cancer and stroke, among other poor outcomes. The estrogen-alone arm continued for an average follow-up of 6.8 years before being similarly discontinued ahead of schedule. Although this part of the study did not find an increased risk of CHD, it also did not find any cardioprotective effect. Beyond failing to locate any clear CV benefits, the WHI also found real evidence of harm, including increased risk of blood clots, breast cancer and stroke. Initial WHI publications therefore recommended against HRT being prescribed for the secondary prevention of CVD.

What Next?

Scientists, and the clinicians who relied on their data for guidance in treating patients, were faced with conflicting data: epidemiological and observational studies suggested that HRT was cardioprotective while the higher-quality evidence from RCTs strongly suggested the opposite. Clinicians primarily followed the WHI results, so prescriptions for HRT in postmenopausal women quickly declined. Meanwhile, researchers began to analyze the studies for potential discrepancies, and found that the women being followed in the NHS and the WHI differed in several important characteristics.

First, the WHI population was older than the NHS cohort, and many had entered menopause at least 10 years before they enrolled in the RCT. Thus, the WHI enrollees experienced a long duration from the onset of menopause to the commencement of HRT. At the same time, many in the NHS population were closer to the onset of menopause and were still displaying hormonal symptoms when they began HRT. Second, although the NHS researchers adjusted the data for various confounding effects, their results could still have been subject to bias. In general, the NHS cohort was more highly educated and of a higher socioeconomic status than the WHI participants, and therefore more likely to see a physician regularly. The NHS women were also leaner and generally healthier than their RCT counterparts, and had been selected for their evident lack of pre-existing CV conditions. This selection bias in the NHS enrollment may have led to a “healthy woman” effect that in turn led to an overestimation of the benefits of therapy in the observational study. Third, researchers noted that dosing differences between the two study types may have contributed to the divergent results. The NHS reported beneficial results following low-dose estrogen therapy. The WHI, meanwhile, used a higher estrogen dose, exposing women to a larger dosage of hormones and increasing their risk for adverse events. The increased risk profile of the WHI women (e.g., older, more comorbidities, higher estrogen dose) could have contributed to the evidence of harm seen in the WHI results.

Emerging Data
In addition to identifying the inherent differences between the two study populations, researchers began a secondary analysis of the NHS and WHI trials. NHS researchers reported that women who began HRT close to the onset of menopause had a significantly reduced risk of CHD. In the subgroups of women that were older and had a similar duration after menopause compared with the WHI women, they found no significant relationship between HRT and CHD. Also, the WHI study further stratified these results by age, and found that women who began HRT close to their onset of menopause experienced some cardioprotection, while women who were further from the onset of menopause had a slightly elevated risk for CHD.

Secondary analysis of both studies was therefore necessary to show that age and a short duration from the onset of menopause are crucial to HRT success as a cardioprotective agent. Neither study type provided “truth” or rather, both studies provided “truth” if viewed carefully (e.g., both produced valid and important results). The differences seen in the studies were rooted in the timing of HRT and the populations being studied.

Lessons Learned From this Case Study

Although RCTs are given a higher evidence grade, observational studies provide important clinical insights. In this example, the study populations differed. For policymakers and clinicians, it is crucial to examine whether the CER was based upon patients similar to those being considered. Any study with a dissimilar population may provide non-relevant results. Thus, readers of CER need to carefully examine the generalizability of the findings being reported.

==Appendix==

General Classification and Regression Tree (CART) data analysis steps, as implemented in the R package rpart.

===Growing the Tree===

# To grow a tree, use
rpart(formula, data=, method=,control=), where
formula is in the format outcome ~ predictor1+predictor2+...
data= specifies the data frame
method= "class" for a classification tree, use "anova" for a regression tree
control= optional parameters for controlling tree growth. For example, control=rpart.control(minsplit=30, cp=0.001) requires that the minimum number of observations in a node be 30 before attempting a split and that a split must decrease the overall lack of fit by a factor of 0.001 (cost complexity factor) before being attempted.
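As a concrete instance of the call described above, the following sketch grows a classification tree on the kyphosis data frame that is bundled with rpart. The choice of dataset and predictors is an assumption made for illustration; it is not taken from the case studies on this page.

```r
library(rpart)   # recommended package, included with standard R installations

# kyphosis: example data frame shipped with rpart (81 children after spinal surgery).
fit <- rpart(Kyphosis ~ Age + Number + Start,
             data    = kyphosis,
             method  = "class",                                   # classification tree
             control = rpart.control(minsplit = 30, cp = 0.001))  # settings from the text above

print(fit)   # text listing of the splits
```

Using method = "anova" with a numeric outcome would produce a regression tree instead.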

===Examining Results===

# These functions help with examining the results.
printcp(fit) display complexity parameter (cp) table
plotcp(fit) plot cross-validation results
rsq.rpart(fit) plot approximate R-squared and relative error for different splits (2 plots). Labels are only appropriate for the "anova" method.
print(fit) print results
summary(fit) detailed results including surrogate splits
plot(fit) plot decision tree
text(fit) label the decision tree plot
post(fit, file=) create postscript plot of decision tree
# In trees created by rpart(), move to the LEFT branch when the stated condition is true.
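A short sketch of how these inspection functions might be used together, again assuming the kyphosis example data bundled with rpart (an illustrative choice, not part of the original material):

```r
library(rpart)
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

printcp(fit)    # complexity parameter table, including the xerror column
summary(fit)    # detailed node-by-node results, including surrogate splits

plot(fit, uniform = TRUE, margin = 0.1)   # draw the tree on the current graphics device
text(fit, use.n = TRUE, cex = 0.8)        # label splits; the left branch is where the condition is true
```

The cp table is also stored on the fitted object as fit$cptable, which is convenient when the values are needed programmatically.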

===Pruning Trees===

# In general, trees should be pruned back to avoid overfitting the data. The tree size should minimize the cross-validated error (the xerror column printed by printcp()). Pruning the tree is accomplished by:
prune(fit, cp= )
# Use printcp() to examine the cross-validation error results, select the complexity parameter (CP) associated with the minimum error, and insert it into the prune() function. Automatically selecting the complexity parameter associated with the smallest cross-validated error can be done succinctly by:
fit$\$$cptable[which.min(fit$\$$cptable[,"xerror"]),"CP"]
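Putting the pruning steps together, a minimal sketch (assuming the kyphosis example data from rpart; the seed value is arbitrary and is set only because the cross-validated error depends on random folds):

```r
library(rpart)
set.seed(42)   # xerror is computed from random cross-validation folds
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

# Complexity parameter associated with the smallest cross-validated error.
best_cp <- fit$cptable[which.min(fit$cptable[, "xerror"]), "CP"]

pruned <- prune(fit, cp = best_cp)   # cut the tree back to that complexity
printcp(pruned)
```

Because the folds are random, a different seed may select a different CP value, so it can be worth checking how stable the chosen tree size is.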

===Complete Dataset for N-of-1 Example===
[[SMHS_MethodsHeterogeneity_CER_Nof1|This N-of-1 Dataset]] includes an example.

===Footnotes===

*13 Based on 2009 NPC report, http://www.npcnow.org/publication/demystifying-comparative-effectiveness-research-case-study-learning-guide
*14 http://www.cancer.gov/cancertopics/druginfo/fda-cetuximab
*15 http://www.nejm.org/doi/full/10.1056/NEJMoa072761
*16 http://jech.bmj.com/content/59/9/740.short

===[[SMHS_MethodsHeterogeneity|Back to the Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research section]]===

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php/SMHS_MethodsHeterogeneity_CER}}

SMHS MethodsHeterogeneity CER

2016-05-23T19:02:54Z

Pineaumi: /* Case-Study 3: The Nurses’ Health Study */

==[[SMHS_MethodsHeterogeneity| Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research]] - Comparative Effectiveness Research: Case Studies 13 (CER) ==

===Observational Studies: Tips for the CER Practitioners===

*Different study types can offer different understandings; neither should be discounted without closer examination.

*RCTs provide an accurate understanding of the effect of a particular intervention in a well-defined patient group under “controlled” circumstances.

*Observational studies provide an understanding of real-world care and its impact, but can be biased due to uncontrolled factors.

*Observational studies differ in the types of databases used. These databases may lack clinical detail and contain incomplete or inaccurate data.

*Before accepting the findings from an observational study, consider whether confounding factors may have influenced the results.

*In this scenario, subgroup analysis was vital in clarifying both study designs; what is true for the many (e.g., overall, estrogen appeared to be detrimental) may not be true for the few (e.g., that for the younger post-menopausal woman, the benefits were greater and the harms less frequent).

*Carefully examine the generalizability of the study. Do the study’s patients and intervention match those under consideration?

*Observational studies can identify associations but cannot prove cause-and-effect relationships.

===Case-Study 1: The Cetuximab Study14===

What was done and what was found?

Cetuximab, an anti-epidermal growth factor receptor (EGFR) agent, has recently been added to the therapeutic armamentarium. Two important RCTs examined its impact in patients with metastatic colorectal cancer (mCRC). In the first one, 56 centers in 11 European countries investigated the outcomes associated with cetuximab therapy in 329 mCRC patients who experienced disease progression either on irinotecan therapy or within 3 months thereafter. The study reported that the group on a combination of irinotecan and cetuximab had a significantly higher rate of overall response to treatment (primary endpoint) than the group on cetuximab alone: 22.9% (95% CI, 17.5-29.1%) vs. 10.8% (95% CI, 5.7-18.1%) (P=0.007), respectively. Similarly, the median time to progression was significantly longer in the combination therapy group (4.1 vs. 1.5 months, P<0.001). As these patients had already progressed on irinotecan prior to the study, any response was viewed as positive. Safety between the two treatment arms was similar: approximately 80% of patients in each arm experienced a rash. Grade 3 or 4 (the more severe) toxic effects on the skin were slightly more frequent in the combination-therapy group compared to cetuximab monotherapy, observed in 9.4% and 5.2% of participants, respectively. Other side effects, such as diarrhea and neutropenia observed in the combination-therapy arm, were considered to be in the range expected for irinotecan alone. Data from this study demonstrated the efficacy and safety of cetuximab and were instrumental in the FDA’s 2004 approval.

A second RCT (2007) examined 572 patients and suggested efficacy of cetuximab in the treatment of mCRC. This study was a randomized, non-blinded, controlled trial that examined cetuximab monotherapy plus best supportive care compared to best supportive care alone in patients who had received and failed prior chemotherapy regimens. It reported that median overall survival (the primary endpoint) was significantly higher in patients receiving cetuximab plus best supportive care compared to best supportive care alone (6.1 vs. 4.6 months, respectively) (hazard ratio for death=0.77; 95% CI: 0.64-0.92, P=0.005). This RCT described a greater incidence of adverse events in the cetuximab plus best supportive care group compared to best supportive care alone including (most significantly) rash, as well as edema, fatigue, nausea and vomiting.

Was this the right answer?

These RCTs had fairly broad enrollment criteria and the cetuximab benefits were modest. Emerging scientific theories raised the possibility that genetically defined population subsets might experience a greater-than-average treatment benefit. One such area of inquiry entailed examining “biomarkers,” or genetic indicators of a patient’s greater response to therapy. Even as the above RCTs were being conducted, data emerged showing the importance of the KRAS gene.

Emerging Data

Based on the emerging biochemical evidence that the epidermal growth factor receptor (EGFR) treatment mechanism (cetuximab) was even more finely detailed than previously understood, the study authors of the 2007 RCT undertook a retrospective subgroup analysis using tumor tissue samples preserved from their initial study. Following laboratory analysis, all viable tissue samples were classified as having a wild-type (non-mutated) or a mutated KRAS gene. Instead of the previous two study arms (cetuximab plus best supportive care vs. best supportive care alone), there were 4 for this new analysis: each of the two original study arms was further divided by wild-type vs. mutated KRAS status. Laboratory evaluation determined that 40.9% and 42.3% of all patients in the RCT had a KRAS mutation in the cetuximab plus best supportive care group compared to the best supportive care group alone, respectively. The efficacy of cetuximab was found to be significantly correlated with KRAS status: in patients with wild-type (non-mutated) KRAS genes, cetuximab plus best supportive care compared to best supportive care alone improved overall survival (median 9.5 vs. 4.8 months, respectively; hazard ratio for death=0.55; 95% CI, 0.41-0.74, P<0.001), and progression-free survival (median 3.7 vs. 1.9 months, respectively; hazard ratio for progression or death=0.40; 95% CI, 0.30-0.54, P<0.001). Meanwhile, in patients with mutated KRAS tumors, the authors found no significant difference in outcome between cetuximab plus best supportive care vs. best supportive care alone.
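Readers can sanity-check reported ratios like the hazard ratios above by back-calculating the implied standard error and p-value from the confidence interval. The sketch below assumes a normal (Wald) approximation on the log scale, which is a simplification and not necessarily the method the trial authors used; the helper name ratio_check is hypothetical.

```r
# Back-calculate SE, z statistic, and two-sided p-value from a reported
# ratio (hazard or odds ratio) and its confidence interval, assuming the
# interval is symmetric on the log scale (Wald approximation).
ratio_check <- function(est, lower, upper, level = 0.95) {
  z_crit <- qnorm(1 - (1 - level) / 2)
  se <- (log(upper) - log(lower)) / (2 * z_crit)
  z  <- log(est) / se
  c(se = se, z = z, p = 2 * pnorm(-abs(z)))
}

os  <- ratio_check(0.55, 0.41, 0.74)   # overall survival, wild-type KRAS
pfs <- ratio_check(0.40, 0.30, 0.54)   # progression-free survival, wild-type KRAS
print(rbind(os, pfs))
```

Both implied p-values fall below 0.001, which agrees with the values reported in the text.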

What next?

Based on these and similar results from other studies, the FDA narrowed its product labeling in July 2009 to indicate that cetuximab is not recommended for mCRC patients with mutated KRAS tumors. This distinction reduces the relevant population by approximately 40%. Similarly, the American Society of Clinical Oncology released a provisional clinical recommendation that all mCRC patients have their tumors tested for KRAS status before receiving anti-EGFR therapy. The benefits of targeted treatment are many. Patients who previously underwent cetuximab therapy without knowing their genetic predisposition would no longer have to be exposed to the drug’s toxic effects if unnecessary, as the efficacy of cetuximab is markedly higher in the genetically defined appropriate patients. In a less-uncertain environment, clinicians can be more confident in advocating a course of action in their care of patients. And finally, knowledge that targeted therapy is possible suggests the potential for further innovation in treatment options. In fact, research continues to demonstrate options for targeted cetuximab treatment of mCRC at an even finer scale than seen with KRAS; and similar genetic targeting is being investigated, and advocated, in other cancer types.

Lessons Learned From this Case Study

Although RCTs are generally viewed as the gold standard, results of one or even a series of trials may not accurately reflect the benefits experienced by an individual patient. This case-study suggests that cetuximab initially appeared to have rather modest clinical benefits. However, new information and subsequent genetic subgroup assessments led to very different conclusions. Clinicians should be aware that the current knowledge is likely to evolve and any decisions about patient care should be carefully considered with that sense of uncertainty in mind. As in this case study, subgroup analyses (e.g., genetic subtypes) need a theoretical rationale. Ideally, the analyses should be determined at the time of original RCT design and should not just occur as explorations of the subsequent data. When improperly employed, post hoc analyses may lead to incorrect patient care conclusions.

RCTs: Tips for the CER Practitioners

*RCTs can determine whether an intervention can provide benefit in a very controlled environment.

*The controlled nature of an RCT may limit its generalizability to a broader population.

*No results are permanent; advances in scientific knowledge and understanding can influence how we view the effectiveness (or safety) of a therapeutic intervention.

*Targeted therapy illuminated by carefully thought out subgroup analyses can improve the efficacious and safe use of an intervention.

===Case-Study 2: The Rosiglitazone Study15===

Meta-analysis

Often the results for the same intervention differ across clinical trials and it may not be clear whether one therapy provides more benefit than another. As CER increases and more studies are conducted, clinicians and policymakers are more likely to encounter this scenario. In a systematic review, a researcher identifies similar studies and displays their results in a table, enabling qualitative comparisons across the studies. With a meta-analysis, the data from included studies are statistically combined into a single “result.” Merging the data from a number of studies increases the effective sample size of the investigation, providing a statistically stronger conclusion about the body of research. By so doing, investigators may detect low frequency events and demonstrate more subtle distinctions between therapeutic alternatives.

When studies have been properly identified and combined, the meta-analysis produces a summary estimate of the findings and a confidence interval that can serve as a benchmark in medical opinion and practice. However, when done incorrectly, the quantitative and statistical analysis can create impressive “numbers” but biased results. The following are important criteria for properly conducted meta-analyses:

1. Carefully defining unbiased inclusion or exclusion criteria for study selection

2. Including only those studies that have similar design elements, such as patient population, drug regimen, outcomes being assessed, and time-frame

3. Applying correct statistical methods to combine and analyze the data

Reporting this information is essential for the reader to determine whether the data were suitable to combine, and if the meta-analysis draws unbiased conclusions. Meta-analyses of randomized clinical trials are considered to be the highest level of medical evidence as they are based upon a synthesis of rigorously controlled trials that systematically reduce bias and confounding. This technique is useful in summarizing available evidence and will likely become more common in the era of publicly funded comparative effectiveness research. The following case study will examine several key principles that will be useful as the reader encounters these publications.
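To make the idea of statistically combining studies concrete, the sketch below pools three trials with a fixed-effect, inverse-variance method. The log odds ratios and standard errors are invented for illustration and do not come from any study discussed here; inverse-variance weighting is one common pooling approach among several.

```r
# Hypothetical log odds ratios and standard errors from three trials.
log_or <- c(0.30, 0.10, 0.45)
se     <- c(0.20, 0.15, 0.30)

w         <- 1 / se^2                   # inverse-variance weights: precise trials count more
pooled    <- sum(w * log_or) / sum(w)   # fixed-effect summary estimate (log scale)
pooled_se <- sqrt(1 / sum(w))           # always smaller than any single trial's SE
ci        <- pooled + c(-1, 1) * qnorm(0.975) * pooled_se

print(round(exp(c(OR = pooled, lower = ci[1], upper = ci[2])), 2))
```

The pooled standard error shrinks as trials are added, which is the sense in which merging studies increases the effective sample size. It also shows why the choice of which trials to include directly shapes the summary estimate.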

Clinical Application

Heart disease is the leading cause of mortality in the United States, resulting in approximately 20% of all deaths. Diabetics are particularly susceptible to heart disease, with more than 65% of deaths attributable to it. The nonfatal complications of diabetes are wide-ranging and include kidney failure, nerve damage, amputation, stroke and blindness, among other outcomes. In 2007, the total estimated cost of diabetes in the United States was $174B; $116B was derived from direct medical expenditures and the rest from the indirect cost of lost productivity due to the disease. With such serious health effects and heavy direct and indirect costs tied to diabetes, proper disease management is critical. Historically, diabetes treatment has focused on strict blood sugar control, assuming that this goal not only targets diabetes but also reduces other serious comorbidities of the disease.

Anti-diabetic agents have long been associated with key questions as to their benefits/risks in the treatment of diabetes. The sulfonylurea tolbutamide, a first generation anti-diabetic drug, was found in a landmark study in the 1970s to significantly increase the CV mortality rate compared to patients not on this agent. Further analysis by external parties concluded that the methods employed in this trial were significantly flawed (e.g., use of an “arbitrary” definition of diabetes status, heterogeneous baseline characteristics of the populations studied, and incorrect statistical methods). Since these early studies, CV concerns continue to be an issue with selected oral hypoglycemic agents that have subsequently entered the marketplace.

The thiazolidinedione (TZD) class of drugs was approved in the late 1990s as a solution to the problems associated with the older generation of sulfonylureas. Rosiglitazone, a member of the TZD class, was approved by the FDA in 1999 and was widely prescribed for the treatment of type-2 diabetes. A number of RCTs supported the benefit of rosiglitazone as an important new oral antidiabetic agent. However, safety concerns developed as the FDA received reports of adverse cardiac events potentially associated with rosiglitazone. It was in this setting that a meta-analysis by Nissen and Wolski was published in the New England Journal of Medicine in June 2007.

What was done?

Nissen and Wolski conducted a meta-analysis examining the impact of rosiglitazone on cardiac events and mortality compared to alternative therapeutic approaches. The study began with a broad search to locate potential studies for review. The authors screened published phase II, III, and IV trials; the FDA website; and the drug manufacturer’s clinical-trial registry for applicable data relating to rosiglitazone use. When the initial search was complete, the studies were further categorized by pre-stated inclusion criteria. Meta-analysis inclusion criteria were simple: studies had to include rosiglitazone and a randomized comparator group treated with either another drug or placebo, study arms had to show similar length of treatment, and all groups had to have received more than 24 weeks of exposure to the study drugs. The studies had to contain outcome data of interest including the rate of myocardial infarction (MI) or death from all CV causes. Out of 116 studies surveyed by the authors, 42 met their inclusion criteria and were included in the meta-analysis. Of the studies they included, 23 had durations of 26 weeks or less, and only five studies followed patients for more than a year. Until this point, the study’s authors were following a path similar to that of any reviewer interested in CV outcomes, examining the results of these 42 studies and comparing them qualitatively. Quantitatively combining the data, however, required the authors to make choices about the studies they could merge and the statistical methods they should apply for analysis. Those decisions greatly influenced the results that were reported.

What was found?

When the studies were combined, the meta-analysis contained data from 15,565 patients in the rosiglitazone group and 12,282 patients as comparators. Analyzing their data, the authors chose one particular statistical method (the Peto odds ratio method, a fixed-effect statistical approach), which calculates the odds of events occurring where the outcomes of interest are rare and small in number. In comparing rosiglitazone with a “control” group that included other drugs or placebo, the authors reported odds ratios of 1.43 (95% CI, 1.03-1.98; P=0.03) and 1.64 (95% CI, 0.98-2.74; P=0.06) for MI and death from CV causes, respectively. In other words, the odds of an MI or death from a CV cause are higher for rosiglitazone patients than for patients on other therapies or placebo. The authors reported that rosiglitazone was significantly associated with an increase in the risk of MI and had borderline significance in increasing the risk of death from all CV causes. These findings appeared online on the same day that the FDA issued a safety alert regarding rosiglitazone. Discussion of the meta-analysis was immediately featured prominently in the news media. By December 2007, prescription claims for the drug at retail pharmacies had fallen by more than 50%.
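
To make the Peto calculation concrete, the following is a minimal R sketch of the one-step (Peto) pooled odds ratio. The trial counts are hypothetical and purely illustrative (they are not the rosiglitazone data), and the function name peto_or is an assumption introduced here, not part of any package. For each trial it compares the observed number of events in the treatment arm, O, with the number expected under no treatment effect, E, and weights the difference by the hypergeometric variance, V.

```r
# Hypothetical 2x2 counts for three trials (illustrative only)
events_trt <- c(2, 5, 1)        # events in the treatment arm
n_trt      <- c(200, 1000, 150) # patients in the treatment arm
events_ctl <- c(1, 3, 0)        # events in the comparator arm
n_ctl      <- c(100, 1000, 150) # patients in the comparator arm

# Peto one-step pooled odds ratio with a normal-approximation confidence interval
peto_or <- function(events_trt, n_trt, events_ctl, n_ctl, level = 0.95) {
  N <- n_trt + n_ctl                 # total patients per trial
  m <- events_trt + events_ctl       # total events per trial
  expected <- n_trt * m / N          # E: expected treatment-arm events under no effect
  variance <- n_trt * n_ctl * m * (N - m) / (N^2 * (N - 1))  # V: hypergeometric variance
  log_or <- sum(events_trt - expected) / sum(variance)       # sum(O - E) / sum(V)
  se <- 1 / sqrt(sum(variance))
  z <- qnorm(1 - (1 - level) / 2)
  c(or = exp(log_or), lower = exp(log_or - z * se), upper = exp(log_or + z * se))
}

peto_or(events_trt, n_trt, events_ctl, n_ctl)
```

Note that a trial with no events in either arm has O - E = 0 and V = 0, so it contributes nothing to this estimate whether or not it is formally included.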

As diabetic patients and their clinicians reacted to the news, a methodologic debate also ensued. This discussion included statistical issues pertaining to the conduct of the analysis, its implications for clinical care, and finally the FDA and drug manufacturer’s roles in overseeing and regulating rosiglitazone. The concern among patients with diabetes regarding treatment continues in the medical community today.

Was this the right answer?

Should the studies have been combined? Commentators faulted the authors for including several studies that were not originally intended to investigate diabetes, and for combining both placebo and drug therapy data into one comparator arm. Some critics noted that despite the stated inclusion criteria, some data were derived from studies where the rosiglitazone arm was allowed a longer follow-up than the comparator arm. By failing to account for this longer follow-up period, commentators felt that the authors may have overestimated the effect of rosiglitazone on CV outcomes. Many reviewers were concerned that this meta-analysis excluded trials in which no patients suffered an MI or died from CV causes – the outcomes of greatest interest. Some reviewers also noted that the exclusion of zero-event trials from the pooled dataset not only gave an incomplete picture of the impact of rosiglitazone but could have increased the odds ratio estimate. In general, the pooled dataset was criticized by many for being a faulty microcosm of the information available regarding rosiglitazone.

It is essential that a meta-analysis be based on similarity in the data sources. If studies differ in important areas such as the patient populations, interventions, or outcomes, combining their data may not be suitable. The researchers accepted studies and populations that were clinically heterogeneous, yet pooled them as if they were not. The study reported that the results were combined from a number of trials that were not initially intended to investigate CV outcomes. Furthermore, the available data did not allow for time-to-event analysis, an essential tool in comparing the impact of alternative treatment options. Reviewers considered the data to be insufficiently homogeneous, and the line of cause and effect to be murkier than the authors described.

Were the statistical methods optimal?

The statistical methods for this meta-analysis also came under significant criticism. The critiques focused on the authors’ use of the Peto method as being an incorrect choice because data were pooled from both small and very large studies, resulting in a potential overestimation of treatment effect. Other reviewers pointed out that the Peto method should not have been used, as a number of the underlying studies did not have patients assigned equally to rosiglitazone and comparator groups. Finally, critics suggested that the heterogeneity of the included studies required an altogether different set of analytic techniques.

Demonstrating the sensitivity of the authors’ initial analysis to the inclusion criteria and statistical tests used, a number of researchers reworked the data from this study. One researcher used the same studies but analyzed the data with a more commonly used statistical method (Mantel-Haenszel), and found no significant increase in the relative risk or common odds ratio for MI or CV death. When the pool of studies was expanded to include those originally eliminated because they had zero CV events, the odds ratios for MI and death from CV causes dropped from 1.43 to 1.26 (95% CI, 0.93-1.72) and from 1.64 to 1.14 (95% CI, 0.74-1.74), respectively. Neither of the recalculated odds ratios, for MI or for CV death, was significant. Finally, several newer long-term studies have been published since the Nissen meta-analysis. Incorporating their results with the meta-analysis data showed that rosiglitazone is associated with an increased risk of MI but not of CV death. Thus, the findings from these meta-analyses varied with the methods employed, the studies included, and the addition of later trials.
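
The sensitivity described above can be illustrated with a short R sketch of the Mantel-Haenszel pooled odds ratio. Everything here is an assumption made for illustration: the counts are hypothetical (not the rosiglitazone trials), the function name mh_or is invented for this example, and the choice to add a continuity correction of 0.5 to every cell of any trial that has a zero cell is only one of several conventions for letting zero-event trials contribute.

```r
# Hypothetical trials with at least one event in each arm (illustrative only)
events_trt <- c(4, 6, 3)
n_trt      <- c(200, 1000, 150)
events_ctl <- c(2, 4, 1)
n_ctl      <- c(100, 1000, 150)

# Mantel-Haenszel pooled odds ratio; cc is an optional continuity correction
# added to all four cells of any trial that contains a zero cell
mh_or <- function(events_trt, n_trt, events_ctl, n_ctl, cc = 0) {
  a  <- events_trt          # treatment arm, event
  b  <- n_trt - events_trt  # treatment arm, no event
  c0 <- events_ctl          # comparator arm, event
  d  <- n_ctl - events_ctl  # comparator arm, no event
  zero_cell <- (a == 0) | (b == 0) | (c0 == 0) | (d == 0)
  add <- ifelse(zero_cell, cc, 0)
  a <- a + add; b <- b + add; c0 <- c0 + add; d <- d + add
  N <- a + b + c0 + d
  sum(a * d / N) / sum(b * c0 / N)
}

# Pooled estimate from the trials that observed events
or_base <- mh_or(events_trt, n_trt, events_ctl, n_ctl)

# Append two hypothetical zero-event trials with equal arms
ev_t <- c(events_trt, 0, 0); nt <- c(n_trt, 120, 80)
ev_c <- c(events_ctl, 0, 0); nc <- c(n_ctl, 120, 80)

or_no_cc <- mh_or(ev_t, nt, ev_c, nc)           # zero-event trials add nothing
or_cc    <- mh_or(ev_t, nt, ev_c, nc, cc = 0.5) # zero-event trials now contribute

c(base = or_base, no_correction = or_no_cc, corrected = or_cc)
```

Without a correction, the zero-event trials leave the estimate unchanged. With the correction, in this made-up example they pull the pooled odds ratio toward 1, which mirrors the direction of the change reported when the zero-event trials were added back.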

Emerging Data

The controversy surrounding the rosiglitazone meta-analysis authored by Nissen and Wolski forced an unplanned interim analysis of a long-term, randomized trial investigating the CV effects of rosiglitazone among patients with type 2 diabetes. The authors of the RECORD trial noted that even though the follow-up at 3.75 years was shorter than expected, rosiglitazone, when added to standard glucose-lowering therapy, was found to be associated with an increase in the risk of heart failure but was not associated with any increase in death from CV or other causes. Data at the time were found to be insufficient to determine the effect of rosiglitazone on an increase in the risk of MI. The final report of that trial, published in June 2009, confirmed the elevated risk of heart failure in people with type 2 diabetes treated with rosiglitazone in addition to glucose-lowering drugs, but continued to show inconclusive results about the effect of the drug therapy on the risk of MI. Further, the RECORD trial clarified that rosiglitazone does not result in an increased risk of CV morbidity or mortality compared to standard glucose-lowering drugs. Other trials conducted since the publishing of the meta-analysis have corroborated these results, casting further doubt on the findings of the meta-analysis published by Nissen and Wolski.

Now what?

Some sources suggest that the original Nissen meta-analysis delivered more harm than benefit, and that a well-recognized medical journal may have erred in its process of peer review. Despite this criticism, it is important to note that subsequent publications support the risk of adverse CV events associated with rosiglitazone, although rosiglitazone use does not appear to increase deaths. These results and emerging data point to the need for further rigorous research to clarify the benefits and risks of rosiglitazone on a variety of outcomes, and the importance of directing the drug to the population that will maximally benefit from its use.

Lessons Learned From this Case Study

Results from initial randomized trials that seem definitive at one time may not be conclusive, as further trials may emerge to clarify, redirect, or negate previously accepted results. A meta-analysis of those trials can lead to varying results based upon the timing of the analysis and the choices made in its performance.

Meta-Analysis: Tips for CER Practitioners

*The results of a meta-analysis are highly dependent on the studies included (and excluded). Are these criteria properly defined and relevant to the purposes of the meta-analysis? Were the combined studies sufficiently similar? Can results from this cohort be generalized to other populations of interest?

*The statistical methodology can impact study results. Have there been reviews critiquing the methods used in the meta-analysis?

*A variety of statistical tests should be considered, and perhaps reported, in the analysis of results. Do the authors mention their rationale in choosing a statistical method? Do they show the stability of their results across a spectrum of analytical methods?

*Nothing is permanent. Emerging data may change the playing field, and meta-analysis results are only as good as the data and statistics from which they are derived.

===Case-Study 3: The Nurses’ Health Study16===

An observational study

An observational study is a very common type of research design in which the effects of a treatment or condition are studied without formally randomizing patients in an experimental design. Such studies can be done prospectively, wherein data are collected about a group of patients going forward in time; or retrospectively, in which the researcher looks into the past, mining existing databases for data that have already been collected. The latter studies are frequently performed by using an electronic database that contains, for example, administrative, “billing,” or claims data. Less commonly, observational research uses electronic health records, which have greater clinical information that more closely resembles the data collected in an RCT. Observational studies often take place in “real-world” environments, which allow researchers to collect data for a wide array of outcomes. Patients are not randomized in these studies, but the findings can be used to generate hypotheses for investigation in a more constrained experimental setting. Perhaps the best known observational study is the “Framingham study,” which collected demographic and health data for a group of individuals over many years (and continues to do so) and has provided an understanding of the key risk factors for heart disease and stroke.

Observational studies present many advantages to the comparative effectiveness researcher. The study design can provide a unique glimpse of the use of a health care intervention in the “real world,” an essential step in gauging the gap between efficacy (can a treatment work in a controlled setting?) and effectiveness (does the treatment work in a real-life situation?). Furthermore, observational studies can be conducted at low cost, particularly if they involve the secondary analysis of existing data sources. CER often uses administrative databases, which are based upon the billing data submitted by providers during routine care. These databases typically have limited clinical information, may have errors in them, and generally do not undergo auditing.

The uncontrolled nature of observational studies allows them to be subject to bias and confounding. For example, doctors may prescribe a new medication only for the sickest patients. Comparing these outcomes (without careful statistical adjustment) with those from less ill patients receiving alternative treatment may lead to misleading results. Observational studies can identify important associations but cannot prove cause and effect. These studies can generate hypotheses that may require RCTs for fuller demonstration of those relationships. Secondary analysis can also be problematic if researchers overwork datasets by doing multiple exploratory analyses (e.g., data-dredging): the more we look, the more we find, even if those findings are merely statistical aberrations. Unfortunately, the growing need for CER and the wide availability of administrative databases may lead to selection of research of poor quality with inaccurate findings.

In comparative effectiveness research, observational studies are typically considered to be less conclusive than RCTs and meta-analyses. Nonetheless, they can be useful, especially because they examine typical care. Due to lower cost and improvements in health information, observational studies will become increasingly common. Critical assessment of whether the described results are helpful or biased (based upon how the study was performed) is necessary. This case will illustrate several characteristics of the types of studies that will assist in evaluating newly published work.

Clinical Applications

Cardiovascular diseases (CVD) are the leading cause of death in women older than the age of 50. Epidemiologic evidence suggests that estrogen is a key mediator in the development of CVD. Estrogen is an ovarian hormone whose production decreases as women approach menopause. The steep increase in CVD in women at menopause and older and in women who have had hysterectomies further supports a relationship between estrogen and CVD. Building on this evidence of biologic plausibility, epidemiological and observational studies suggested that estrogen replacement therapy (a form of hormone replacement therapy, or HRT) had positive effects on the risk of CVD in postmenopausal women (albeit with some negative effects in its potential to increase the risk for breast cancer and stroke). Based on these findings, in the 1980s and 1990s HRT was routinely employed to treat menopausal symptoms and serve as prophylaxis against CVD.

What was done?

The Nurses’ Health Study (NHS) began collecting data in 1976. In the study, researchers intended to examine a broad range of health effects in women over a long period of time, and a key goal was to clarify the role of HRT in heart disease. The cohort (i.e., the group being followed) included married registered nurses aged 30-55 in 1976 who lived in the 11 most populous states. To collect data, the researchers mailed the study participants a survey every 2 years that asked questions about topics such as smoking, hormone use, menopausal status, and less frequently, diet. Data were collected for key end points that included MI, coronary-artery bypass grafting or angioplasty, stroke, total CVD mortality, and deaths from all causes.

What was found?

At a 10-year follow-up point, the NHS had a study pool of 48,470 women. The researchers found that estrogen use (alone, without progestin) in postmenopausal women was associated with a reduction in the incidence of CVD as well as in CVD mortality compared to non-users. Later, estrogen-progestin combination therapy was shown to be even more cardioprotective than estrogen monotherapy, and lower doses of estrogen replacement therapy were found to deliver equal cardioprotection and lower the risk for adverse events. NHS researchers were alert to the potential for bias in observational studies. Adjustment for risk factors such as age (a typical practice to eliminate confounding) did not change the reported findings.

Was this the right answer?

The NHS was not unique in reporting the benefits associated with HRT; other observational studies corroborated the NHS findings. A secondary retrospective data analysis of the UK primary care electronic medical record database, for example, also showed the protective effect associated with HRT use. Researchers were aware of the fundamental limitations of observational studies, particularly with regard to selection bias. They and practicing clinicians were also aware of the potential negative health effects of HRT, which had to be constantly weighed against the potential cardioprotective benefits in deciding a patient’s course of treatment. As a large section of the population could experience the health effects of HRT, researchers began planning RCTs to verify the promising observational study results. It was highly anticipated that those RCTs would corroborate the belief that estrogen replacement can reduce CVD risk.

Randomized Controlled Trial: The Women’s Health Initiative

The Women’s Health Initiative (WHI) was a major study established by the National Institutes of Health in 1992 to assess a broad range of health effects in postmenopausal women. The trial was intended to follow these women for 8 years, at a cost of millions of dollars in federal funding. Among its many facets, it included an RCT to confirm the results from the observational studies discussed above. To fully investigate earlier findings, the WHI had two subgroups. One subgroup consisted of women with prior hysterectomies; they received estrogen monotherapy. The second group consisted of women who had not undergone hysterectomy; they received estrogen in combination with progestin. The WHI enrolled 27,347 women in their HRT investigation: 10,739 in the estrogen-alone arm and 16,608 in the estrogen plus progestin arm. Within each arm, women were randomly assigned to receive either HRT or placebo. All women in the trial were postmenopausal and aged 50-79 years; the mean age was 63.6 years (a fact that would be important in later analysis). Some participants had experienced previous CV events. The primary outcome of both subgroups was coronary heart disease (CHD), as described by nonfatal MI or death due to CHD.

The estrogen-progestin arm of the WHI was halted after a mean follow-up of 5.2 years, 3 years earlier than expected, as the HRT users in this arm were found to be at increased risk for CHD compared to those who received placebo. The study also noted elevated rates of breast cancer and stroke, among other poor outcomes. The estrogen-alone arm continued for an average follow-up of 6.8 years before being similarly discontinued ahead of schedule. Although this part of the study did not find an increased risk of CHD, it also did not find any cardioprotective effect. Beyond failing to locate any clear CV benefits, the WHI also found real evidence of harm, including increased risk of blood clots, breast cancer and stroke. Initial WHI publications therefore recommended against HRT being prescribed for the secondary prevention of CVD.

What Next?

Scientists, and the clinicians who relied on their data for guidance in treating patients, were faced with conflicting data: epidemiological and observational studies suggested that HRT was cardioprotective while the higher-quality evidence from RCTs strongly suggested the opposite. Clinicians primarily followed the WHI results, so prescriptions for HRT in postmenopausal women quickly declined. Meanwhile, researchers began to analyze the studies for potential discrepancies, and found that the women being followed in the NHS and the WHI differed in several important characteristics.

First, the WHI population was older than the NHS cohort, and many had entered menopause at least 10 years before they enrolled in the RCT. Thus, the WHI enrollees experienced a long duration from the onset of menopause to the commencement of HRT. At the same time, many in the NHS population were closer to the onset of menopause and were still displaying hormonal symptoms when they began HRT. Second, although the NHS researchers adjusted the data for various confounding effects, their results could still have been subject to bias. In general, the NHS cohort was more highly educated and of a higher socioeconomic status than the WHI participants, and therefore more likely to see a physician regularly. The NHS women were also leaner and generally healthier than their RCT counterparts, and had been selected for their evident lack of pre-existing CV conditions. This selection bias in the NHS enrollment may have led to a “healthy woman” effect that in turn led to an overestimation of the benefits of therapy in the observational study. Third, researchers noted that dosing differences between the two study types may have contributed to the divergent results. The NHS reported beneficial results following low-dose estrogen therapy. The WHI, meanwhile, used a higher estrogen dose, exposing women to a larger dosage of hormones and increasing their risk for adverse events. The increased risk profile of the WHI women (e.g., older, more comorbidities, higher estrogen dose) could have contributed to the evidence of harm seen in the WHI results.

Emerging Data
In addition to identifying the inherent differences between the two study populations, researchers began a secondary analysis of the NHS and WHI trials. NHS researchers reported that women who began HRT close to the onset of menopause had a significantly reduced risk of CHD. In the subgroups of women that were older and had a similar duration after menopause compared with the WHI women, they found no significant relationship between HRT and CHD. Also, the WHI study further stratified these results by age, and found that women who began HRT close to their onset of menopause experienced some cardioprotection, while women who were further from the onset of menopause had a slightly elevated risk for CHD.

Secondary analysis of both studies was therefore necessary to show that age and a short duration from the onset of menopause are crucial to HRT success as a cardioprotective agent. Neither study type provided “truth” or rather, both studies provided “truth” if viewed carefully (e.g., both produced valid and important results). The differences seen in the studies were rooted in the timing of HRT and the populations being studied.

Lessons Learned From this Case Study

Although RCTs are given a higher evidence grade, observational studies provide important clinical insights. In this example, the study populations differed. For policymakers and clinicians, it is crucial to examine whether the CER was based upon patients similar to those being considered. Any study with a dissimilar population may provide non-relevant results. Thus, readers of CER need to carefully examine the generalizability of the findings being reported.

==Appendix==

The general Classification and Regression Tree (CART) data analysis steps below use the R package rpart.

===Growing the Tree===

# To grow a tree, use
rpart(formula, data=, method=,control=), where
formula is in the format outcome ~ predictor1+predictor2+...
data= specifies the data frame
method= "class" for a classification tree, use "anova" for a regression tree
control= optional parameters for controlling tree growth. For example, control=rpart.control(minsplit=30, cp=0.001) requires that the minimum number of observations in a node be 30 before attempting a split and that a split must decrease the overall lack of fit by a factor of 0.001 (cost complexity factor) before being attempted.

===Examining Results===

# These functions help with examining the results.
printcp(fit) display complexity parameter (cp) table
plotcp(fit) plot cross-validation results
rsq.rpart(fit) plot approximate R-squared and relative error for different splits (2 plots). labels are only appropriate for the "anova" method.
print(fit) print results
summary(fit) detailed results including surrogate splits
plot(fit) plot decision tree
text(fit) label the decision tree plot
post(fit, file=) create postscript plot of decision tree
# In trees created by rpart(), move to the LEFT branch when the stated condition is true.

===Pruning Trees===

# In general, trees should be pruned back to avoid overfitting the data. The tree size should minimize the cross-validated error (the xerror column printed by printcp()). Pruning the tree is accomplished by:
prune(fit, cp= )
# Use printcp() to examine the cross-validation error results, select the complexity parameter (CP) associated with the minimum error, and insert it into the prune() function. Selecting the complexity parameter associated with the smallest cross-validated error can be done succinctly by:
fit$\$$cptable[which.min(fit$\$$cptable[,"xerror"]),"CP"]
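
The grow, examine, and prune steps above can be put together in one short R sketch. It assumes the kyphosis data frame that ships with the rpart package as a stand-in dataset; the control settings and the seed are arbitrary choices for illustration, not recommendations.

```r
library(rpart)

# Cross-validation folds are random, so fix the seed for reproducibility
set.seed(1234)

# Grow a classification tree on the kyphosis data bundled with rpart
fit <- rpart(Kyphosis ~ Age + Number + Start,
             data = kyphosis,
             method = "class",
             control = rpart.control(minsplit = 20, cp = 0.001))

# Examine the complexity parameter (cp) table, including the xerror column
printcp(fit)

# Select the CP with the smallest cross-validated error and prune to it
cp_table <- fit$cptable
best_cp  <- cp_table[which.min(cp_table[, "xerror"]), "CP"]
pruned   <- prune(fit, cp = best_cp)

# The pruned tree is never larger than the original tree
print(pruned)
```

Because xerror comes from random cross-validation folds, the selected CP, and therefore the pruned tree, can change with the seed.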

===Complete Dataset for N-of-1 Example===
[[SMHS_MethodsHeterogeneity_CER_Nof1|This N-of-1 Dataset]] includes an example.

===Footnotes===

*13 Based on 2009 NPC report, http://www.npcnow.org/publication/demystifying-comparative-effectiveness-research-case-study-learning-guide
*14 http://www.cancer.gov/cancertopics/druginfo/fda-cetuximab
*15 http://www.nejm.org/doi/full/10.1056/NEJMoa072761

===[[SMHS_MethodsHeterogeneity|Back to the Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research section]]===

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php/SMHS_MethodsHeterogeneity_CER}}

SMHS MethodsHeterogeneity CER

2016-05-23T19:02:19Z

Pineaumi: /* Footnotes */

==[[SMHS_MethodsHeterogeneity| Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research]] - Comparative Effectiveness Research: Case Studies 13 (CER) ==

===Observational Studies: Tips for the CER Practitioners===

*Different study types can offer different understandings; neither should be discounted without closer examination.

*RCTs provide an accurate understanding of the effect of a particular intervention in a well-defined patient group under “controlled” circumstances.

*Observational studies provide an understanding of real-world care and its impact, but can be biased due to uncontrolled factors.

*Observational studies differ in the types of databases used. These databases may lack clinical detail and contain incomplete or inaccurate data.

*Before accepting the findings from an observational study, consider whether confounding factors may have influenced the results.

*In this scenario, subgroup analysis was vital in clarifying both study designs; what is true for the many (e.g., overall, estrogen appeared to be detrimental) may not be true for the few (e.g., that for the younger post-menopausal woman, the benefits were greater and the harms less frequent).

*Carefully examine the generalizability of the study. Do the study’s patients and intervention match those under consideration?

*Observational studies can identify associations but cannot prove cause-and-effect relationships.

===Case-Study 1: The Cetuximab Study14===

What was done and what was found?

Cetuximab, an anti-epidermal growth factor receptor (EGFR) agent, has recently been added to the therapeutic armamentarium. Two important RCTs examined its impact in patients with mCRC (metastatic colorectal cancer). In the first one, 56 centers in 11 European countries investigated the outcomes associated with cetuximab therapy in 329 mCRC patients who experienced disease progression either on irinotecan therapy or within 3 months thereafter. The study reported that the group on a combination of irinotecan and cetuximab had a significantly higher rate of overall response to treatment (primary endpoint) than the group on cetuximab alone: 22.9% (95% CI, 17.5-29.1%) vs. 10.8% (95% CI, 5.7-18.1%) (P=0.007), respectively. Similarly, the median time to progression was significantly longer in the combination therapy group (4.1 vs. 1.5 months, P<0.001). As these patients had already progressed on irinotecan prior to the study, any response was viewed as positive. Safety between the two treatment arms was similar: approximately 80% of patients in each arm experienced a rash. Grade 3 or 4 (the more severe) toxic effects on the skin were slightly more frequent in the combination-therapy group compared to cetuximab monotherapy, observed in 9.4% and 5.2% of participants, respectively. Other side effects, such as diarrhea and neutropenia observed in the combination-therapy arm, were considered to be in the range expected for irinotecan alone. Data from this study demonstrated the efficacy and safety of cetuximab and were instrumental in the FDA’s 2004 approval.

A second RCT (2007) examined 572 patients and suggested efficacy of cetuximab in the treatment of mCRC. This study was a randomized, non-blinded, controlled trial that examined cetuximab monotherapy plus best supportive care compared to best supportive care alone in patients who had received and failed prior chemotherapy regimens. It reported that median overall survival (the primary endpoint) was significantly higher in patients receiving cetuximab plus best supportive care compared to best supportive care alone (6.1 vs. 4.6 months, respectively) (hazard ratio for death=0.77; 95% CI, 0.64-0.92, P=0.005). This RCT described a greater incidence of adverse events in the cetuximab plus best supportive care group compared to best supportive care alone including (most significantly) rash, as well as edema, fatigue, nausea and vomiting.

Was this the right answer?

These RCTs had fairly broad enrollment criteria and the cetuximab benefits were modest. Emerging scientific theories raised the possibility that genetically defined population subsets might experience a greater-than-average treatment benefit. One such area of inquiry entailed examining “biomarkers,” or genetic indicators of a patient’s greater response to therapy. Even as the above RCTs were being conducted, data emerged showing the importance of the KRAS gene.

Emerging Data

Based on the emerging biochemical evidence that the epidermal growth factor receptor (EGFR) treatment mechanism (Cetuximab) was even more finely detailed than previously understood, the study authors of the 2007 RCT undertook a retrospective subgroup analysis using tumor tissue samples preserved from their initial study. Following laboratory analysis, all viable tissue samples were classified as having a wild-type (non-mutated) or a mutated KRAS gene. Instead of the previous two study arms (cetuximab plus best supportive care vs. best supportive care alone), there were 4 for this new analysis: each of the two original study arms was further divided by wild-type vs. mutated KRAS status. Laboratory evaluation determined that 40.9% of patients in the cetuximab plus best supportive care group and 42.3% of patients in the best supportive care alone group had a KRAS mutation. The efficacy of cetuximab was found to be significantly correlated with KRAS status: in patients with wild-type (non-mutated) KRAS genes, cetuximab plus best supportive care compared to best supportive care alone improved overall survival (median 9.5 vs. 4.8 months, respectively; hazard ratio for death=0.55; 95% CI, 0.41-0.74, P<0.001), and progression-free survival (median 3.7 vs. 1.9 months, respectively; hazard ratio for progression or death=0.40; 95% CI, 0.30-0.54, P<0.001). Meanwhile, in patients with mutated KRAS tumors, the authors found no significant difference in outcome between cetuximab plus best supportive care vs. best supportive care alone.

What next?

Based on these and similar results from other studies, the FDA narrowed its product labeling in July 2009 to indicate that cetuximab is not recommended for mCRC patients with mutated KRAS tumors. This distinction reduces the relevant population by approximately 40%. Similarly, the American Society of Clinical Oncology released a provisional clinical recommendation that all mCRC patients have their tumors tested for KRAS status before receiving anti-EGFR therapy. The benefits of targeted treatment are many. Patients who previously underwent cetuximab therapy without knowing their genetic predisposition would no longer have to be exposed to the drug’s toxic effects if unnecessary, as the efficacy of cetuximab is markedly higher in the genetically defined appropriate patients. In a less-uncertain environment, clinicians can be more confident in advocating a course of action in their care of patients. And finally, knowledge that targeted therapy is possible suggests the potential for further innovation in treatment options. In fact, research continues to demonstrate options for targeted cetuximab treatment of mCRC at an even finer scale than seen with KRAS; and similar genetic targeting is being investigated, and advocated, in other cancer types.

Lessons Learned From this Case Study

Although RCTs are generally viewed as the gold standard, results of one or even a series of trials may not accurately reflect the benefits experienced by an individual patient. This case-study suggests that cetuximab initially appeared to have rather modest clinical benefits. However, new information that became available and subsequent genetic subgroup assessments led to very different conclusions. Clinicians should be aware that the current knowledge is likely to evolve and any decisions about patient care should be carefully considered with that sense of uncertainty in mind. As in this case study, subgroup analyses (e.g., genetic subtypes) need a theoretical rationale. Ideally, the analyses should be determined at the time of original RCT design and should not just occur as explorations of the subsequent data. When improperly employed, post hoc analyses may lead to incorrect patient care conclusions.

RCTs: Tips for CER Practitioners

*RCTs can determine whether an intervention can provide benefit in a very controlled environment.

*The controlled nature of an RCT may limit its generalizability to a broader population.

*No results are permanent; advances in scientific knowledge and understanding can influence how we view the effectiveness (or safety) of a therapeutic intervention.

*Targeted therapy illuminated by carefully thought out subgroup analyses can improve the efficacious and safe use of an intervention.

===Case-Study 2: The Rosiglitazone Study15===

Meta-analysis

Often the results for the same intervention differ across clinical trials and it may not be clear whether one therapy provides more benefit than another. As CER increases and more studies are conducted, clinicians and policymakers are more likely to encounter this scenario. In a systematic review, a researcher identifies similar studies and displays their results in a table, enabling qualitative comparisons across the studies. With a meta-analysis, the data from included studies are statistically combined into a single “result.” Merging the data from a number of studies increases the effective sample size of the investigation, providing a statistically stronger conclusion about the body of research. By so doing, investigators may detect low frequency events and demonstrate more subtle distinctions between therapeutic alternatives.

When studies have been properly identified and combined, the meta-analysis produces a summary estimate of the findings and a confidence interval that can serve as a benchmark in medical opinion and practice. However, when done incorrectly, the quantitative and statistical analysis can create impressive “numbers” but biased results. The following are important criteria for properly conducted meta-analyses:

1. Carefully defining unbiased inclusion or exclusion criteria for study selection

2. Including only those studies that have similar design elements, such as patient population, drug regimen, outcomes being assessed, and time-frame

3. Applying correct statistical methods to combine and analyze the data

Reporting this information is essential for the reader to determine whether the data were suitable to combine, and if the meta-analysis draws unbiased conclusions. Meta-analyses of randomized clinical trials are considered to be the highest level of medical evidence as they are based upon a synthesis of rigorously controlled trials that systematically reduce bias and confounding. This technique is useful in summarizing available evidence and will likely become more common in the era of publicly funded comparative effectiveness research. The following case study will examine several key principles that will be useful as the reader encounters these publications.

Clinical Application

Heart disease is the leading cause of mortality in the United States, resulting in approximately 20% of all deaths. Diabetics are particularly susceptible to heart disease, with more than 65% of deaths attributable to it. The nonfatal complications of diabetes are wide-ranging and include kidney failure, nerve damage, amputation, stroke and blindness, among other outcomes. In 2007, the total estimated cost of diabetes in the United States was $174B; $116B was derived from direct medical expenditures and the rest from the indirect cost of lost productivity due to the disease. With such serious health effects and heavy direct and indirect costs tied to diabetes, proper disease management is critical. Historically, diabetes treatment has focused on strict blood sugar control, assuming that this goal not only targets diabetes but also reduces other serious comorbidities of the disease.

Anti-diabetic agents have long been associated with key questions as to their benefits/risks in the treatment of diabetes. The sulfonylurea tolbutamide, a first generation anti-diabetic drug, was found in a landmark study in the 1970s to significantly increase the CV mortality rate compared to patients not on this agent. Further analysis by external parties concluded that the methods employed in this trial were significantly flawed (e.g., use of an “arbitrary” definition of diabetes status, heterogeneous baseline characteristics of the populations studied, and incorrect statistical methods). Since these early studies, CV concerns continue to be an issue with selected oral hypoglycemic agents that have subsequently entered the marketplace.

A class of drugs, thiazolidinedione (TZD), was approved in the late 1990s, as a solution to the problems associated with the older generation of sulfonylureas. Rosiglitazone, a member of the TZD class, was approved by the FDA in 1999 and was widely prescribed for the treatment of type-2 diabetes. A number of RCTs supported the benefit of rosiglitazone as an important new oral antidiabetic agent. However, safety concerns developed as the FDA received reports of adverse cardiac events potentially associated with rosiglitazone. It was in this setting that a meta-analysis by Nissen and Wolski was published in the New England Journal of Medicine in June 2007.

What was done?

Nissen and Wolski conducted a meta-analysis examining the impact of rosiglitazone on cardiac events and mortality compared to alternative therapeutic approaches. The study began with a broad search to locate potential studies for review. The authors screened published phase II, III, and IV trials; the FDA website; and the drug manufacturer’s clinical-trial registry for applicable data relating to rosiglitazone use. When the initial search was complete, the studies were further categorized by pre-stated inclusion criteria. Meta-analysis inclusion criteria were simple: studies had to include rosiglitazone and a randomized comparator group treated with either another drug or placebo, study arms had to show similar length of treatment, and all groups had to have received more than 24 weeks of exposure to the study drugs. The studies had to contain outcome data of interest including the rate of myocardial infarction (MI) or death from all CV causes. Out of 116 studies surveyed by the authors, 42 met their inclusion criteria and were included in the meta-analysis. Of the studies they included, 23 had durations of 26 weeks or less, and only five studies followed patients for more than a year. Until this point, the study’s authors were following a path similar to that of any reviewer interested in CV outcomes, examining the results of these 42 studies and comparing them qualitatively. Quantitatively combining the data, however, required the authors to make choices about the studies they could merge and the statistical methods they should apply for analysis. Those decisions greatly influenced the results that were reported.

What was found?

When the studies were combined, the meta-analysis contained data from 15,565 patients in the rosiglitazone group and 12,282 patients as comparators. Analyzing their data, the authors chose one particular statistical method (the Peto odds ratio method, a fixed-effect statistical approach), which calculates the odds of events occurring where the outcomes of interest are rare and small in number. In comparing rosiglitazone with a “control” group that included other drugs or placebo, the authors reported odds ratios of 1.43 (95% CI, 1.03-1.98; P=0.03) and 1.64 (95% CI, 0.98-2.74; P=0.06) for MI and death from CV causes, respectively. In other words, the odds of an MI or death from a CV cause are higher for rosiglitazone patients than for patients on other therapies or placebo. The authors reported that rosiglitazone was significantly associated with an increase in the risk of MI and had borderline significance in increasing the risk of death from all CV causes. These findings appeared online on the same day that the FDA issued a safety alert regarding rosiglitazone. Discussion of the meta-analysis was immediately featured prominently in the news media. By December 2007, prescription claims for the drug at retail pharmacies had fallen by more than 50%.
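
The pooling step of a Peto-type analysis can be sketched in R. The following is a minimal illustration only: the event counts are hypothetical (they are not the rosiglitazone trial data), and peto_or() is a helper name introduced here, not a function from any package. Each trial contributes an observed-minus-expected event count and a hypergeometric variance; the pooled log odds ratio is the ratio of their sums.

```r
# Hypothetical trial-level counts (illustrative only, not the rosiglitazone data).
trials <- data.frame(
  events_trt = c(5, 2, 0),        # events in the treatment arm
  n_trt      = c(400, 150, 100),  # patients in the treatment arm
  events_ctl = c(3, 1, 0),        # events in the comparator arm
  n_ctl      = c(400, 150, 100)   # patients in the comparator arm
)

# peto_or() is a hypothetical helper implementing the Peto one-step estimator.
peto_or <- function(events_trt, n_trt, events_ctl, n_ctl, level = 0.95) {
  N <- n_trt + n_ctl                      # total patients per trial
  d <- events_trt + events_ctl            # total events per trial
  o_minus_e <- events_trt - n_trt * d / N # observed minus expected events
  v <- n_trt * n_ctl * d * (N - d) / (N^2 * (N - 1))  # hypergeometric variance
  log_or <- sum(o_minus_e) / sum(v)       # pooled log odds ratio
  se <- 1 / sqrt(sum(v))                  # its standard error
  z <- qnorm(1 - (1 - level) / 2)
  c(or = exp(log_or),
    lower = exp(log_or - z * se),
    upper = exp(log_or + z * se))
}

# Pool all three trials, then only the two trials that observed events.
pooled_all     <- with(trials, peto_or(events_trt, n_trt, events_ctl, n_ctl))
pooled_nonzero <- with(trials[1:2, ], peto_or(events_trt, n_trt, events_ctl, n_ctl))
print(pooled_all)
print(pooled_nonzero)
```

In this sketch the two results coincide, because a trial with no events in either arm has zero observed-minus-expected count and zero variance, so it adds nothing to either sum.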

As diabetic patients and their clinicians reacted to the news, a methodologic debate also ensued. This discussion included statistical issues pertaining to the conduct of the analysis, its implications for clinical care, and finally the FDA and drug manufacturer’s roles in overseeing and regulating rosiglitazone. Concern regarding the treatment of patients with diabetes continues in the medical community today.

Was this the right answer?

Should the studies have been combined? Commentators faulted the authors for including several studies that were not originally intended to investigate diabetes, and for combining both placebo and drug therapy data into one comparator arm. Some critics noted that despite the stated inclusion criteria, some data were derived from studies where the rosiglitazone arm was allowed a longer follow-up than the comparator arm. By failing to account for this longer follow-up period, commentators felt that the authors may have overestimated the effect of rosiglitazone on CV outcomes. Many reviewers were concerned that this meta-analysis excluded trials in which no patients suffered an MI or died from CV causes – the outcomes of greatest interest. Some reviewers also noted that the exclusion of zero-event trials from the pooled dataset not only gave an incomplete picture of the impact of rosiglitazone but could have increased the odds ratio estimate. In general, the pooled dataset was criticized by many for being a faulty microcosm of the information available regarding rosiglitazone.

It is essential that a meta-analysis be based on similarity in the data sources. If studies differ in important areas such as the patient populations, interventions, or outcomes, combining their data may not be suitable. The researchers accepted studies and populations that were clinically heterogeneous, yet pooled them as if they were not. The study reported that the results were combined from a number of trials that were not initially intended to investigate CV outcomes. Furthermore, the available data did not allow for time-to-event analysis, an essential tool in comparing the impact of alternative treatment options. Reviewers considered the data to be insufficiently homogeneous, and the line of cause and effect to be murkier than the authors described.

Were the statistical methods optimal?

The statistical methods for this meta-analysis also came under significant criticism. The critiques focused on the authors’ use of the Peto method as being an incorrect choice because data were pooled from both small and very large studies, resulting in a potential overestimation of treatment effect. Other reviewers pointed out that the Peto method should not have been used, as a number of the underlying studies did not have patients assigned equally to rosiglitazone and comparator groups. Finally, critics suggested that the heterogeneity of the included studies required an altogether different set of analytic techniques.

Demonstrating the sensitivity of the authors’ initial analysis to the inclusion criteria and statistical tests used, a number of researchers reworked the data from this study. One researcher used the same studies but analyzed the data with a more commonly used statistical method (Mantel-Haenszel), and found no significant increase in the relative risk or common odds ratio for MI or CV death. When the pool of studies was expanded to include those originally eliminated because they had zero CV events, the odds ratios for MI and death from CV causes dropped from 1.43 to 1.26 (95% CI, 0.93-1.72) and from 1.64 to 1.14 (95% CI, 0.74-1.74), respectively. Neither of the recalculated odds ratios was significant for MI or CV death. Finally, several newer long-term studies have been published since the Nissen meta-analysis. Incorporating their results with the meta-analysis data showed that rosiglitazone is associated with an increased risk of MI but not of CV death. Thus, the findings from these meta-analyses varied with the methods employed, the studies included, and the addition of later trials.
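
How much the handling of zero-event trials can matter is easy to sketch in R. The example below assumes hypothetical counts (not the rosiglitazone data) and a hypothetical helper, mh_or(), that computes the Mantel-Haenszel pooled odds ratio. Adding 0.5 to every cell of a trial with a zero-event arm is one common continuity correction; it is used here purely as an illustrative assumption, not as the method of any of the re-analyses described above.

```r
# Hypothetical counts: two trials with events, two trials with no events at all.
trials <- data.frame(
  events_trt = c(5, 2, 0, 0),
  n_trt      = c(400, 150, 100, 200),
  events_ctl = c(3, 1, 0, 0),
  n_ctl      = c(400, 150, 100, 100)
)

# mh_or() is a hypothetical helper: Mantel-Haenszel pooled odds ratio with an
# optional continuity correction (cc) applied only to trials with a zero-event arm.
mh_or <- function(events_trt, n_trt, events_ctl, n_ctl, cc = 0) {
  adj <- ifelse(events_trt == 0 | events_ctl == 0, cc, 0)
  a <- events_trt + adj            # treated, event
  b <- n_trt - events_trt + adj    # treated, no event
  k <- events_ctl + adj            # comparator, event
  m <- n_ctl - events_ctl + adj    # comparator, no event
  N <- a + b + k + m
  sum(a * m / N) / sum(b * k / N)
}

# With cc = 0, zero-event trials add 0 to both sums, so they are effectively dropped.
or_dropped   <- with(trials, mh_or(events_trt, n_trt, events_ctl, n_ctl, cc = 0))
# With cc = 0.5, the zero-event trials enter the pooled estimate.
or_corrected <- with(trials, mh_or(events_trt, n_trt, events_ctl, n_ctl, cc = 0.5))
print(c(dropped = or_dropped, corrected = or_corrected))
```

With these made-up counts the corrected estimate lies closer to 1 than the uncorrected one, which mirrors the direction of the shift reported when zero-event trials were added back to the pooled dataset.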

Emerging Data

The controversy surrounding the rosiglitazone meta-analysis authored by Nissen and Wolski forced an unplanned interim analysis of a long-term, randomized trial investigating the CV effects of rosiglitazone among patients with type 2 diabetes. The authors of the RECORD trial noted that even though the follow-up at 3.75 years was shorter than expected, rosiglitazone, when added to standard glucose-lowering therapy, was found to be associated with an increase in the risk of heart failure but was not associated with any increase in death from CV or other causes. Data at the time were found to be insufficient to determine the effect of rosiglitazone on an increase in the risk of MI. The final report of that trial, published in June 2009, confirmed the elevated risk of heart failure in people with type 2 diabetes treated with rosiglitazone in addition to glucose-lowering drugs, but continued to show inconclusive results about the effect of the drug therapy on the risk of MI. Further, the RECORD trial clarified that rosiglitazone does not result in an increased risk of CV morbidity or mortality compared to standard glucose-lowering drugs. Other trials conducted since the publishing of the meta-analysis have corroborated these results, casting further doubt on the findings of the meta-analysis published by Nissen and Wolski.

Now what?

Some sources suggest that the original Nissen meta-analysis delivered more harm than benefit, and that a well-recognized medical journal may have erred in its process of peer review. Despite this criticism, it is important to note that subsequent publications support the risk of adverse CV events associated with rosiglitazone, although rosiglitazone use does not appear to increase deaths. These results and emerging data point to the need for further rigorous research to clarify the benefits and risks of rosiglitazone on a variety of outcomes, and the importance of directing the drug to the population that will maximally benefit from its use.

Lessons Learned From this Case Study

Results from initial randomized trials that seem definitive at one time may not be conclusive, as further trials may emerge to clarify, redirect, or negate previously accepted results. A meta-analysis of those trials can lead to varying results based upon the timing of the analysis and the choices made in its performance.

Meta-Analysis: Tips for CER Practitioners

*The results of a meta-analysis are highly dependent on the studies included (and excluded). Are these criteria properly defined and relevant to the purposes of the meta-analysis? Were the combined studies sufficiently similar? Can results from this cohort be generalized to other populations of interest?

*The statistical methodology can impact study results. Have there been reviews critiquing the methods used in the meta-analysis?

*A variety of statistical tests should be considered, and perhaps reported, in the analysis of results. Do the authors mention their rationale in choosing a statistical method? Do they show the stability of their results across a spectrum of analytical methods?

*Nothing is permanent. Emerging data may change the playing field, and meta-analysis results are only as good as the data and statistics from which they are derived.

===Case-Study 3: The Nurses’ Health Study===

An observational study

An observational study is a very common type of research design in which the effects of a treatment or condition are studied without formally randomizing patients in an experimental design. Such studies can be done prospectively, wherein data are collected about a group of patients going forward in time; or retrospectively, in which the researcher looks into the past, mining existing databases for data that have already been collected. Retrospective studies are frequently performed by using an electronic database that contains, for example, administrative, “billing,” or claims data. Less commonly, observational research uses electronic health records, which have greater clinical information that more closely resembles the data collected in an RCT. Observational studies often take place in “real-world” environments, which allow researchers to collect data for a wide array of outcomes. Patients are not randomized in these studies, but the findings can be used to generate hypotheses for investigation in a more constrained experimental setting. Perhaps the best known observational study is the “Framingham study,” which collected demographic and health data for a group of individuals over many years (and continues to do so) and has provided an understanding of the key risk factors for heart disease and stroke.

Observational studies present many advantages to the comparative effectiveness researcher. The study design can provide a unique glimpse of the use of a health care intervention in the “real world,” an essential step in gauging the gap between efficacy (can a treatment work in a controlled setting?) and effectiveness (does the treatment work in a real-life situation?). Furthermore, observational studies can be conducted at low cost, particularly if they involve the secondary analysis of existing data sources. CER often uses administrative databases, which are based upon the billing data submitted by providers during routine care. These databases typically have limited clinical information, may have errors in them, and generally do not undergo auditing.

The uncontrolled nature of observational studies allows them to be subject to bias and confounding. For example, doctors may prescribe a new medication only for the sickest patients. Comparing these outcomes (without careful statistical adjustment) with those from less ill patients receiving alternative treatment may lead to misleading results. Observational studies can identify important associations but cannot prove cause and effect. These studies can generate hypotheses that may require RCTs for fuller demonstration of those relationships. Secondary analysis can also be problematic if researchers overwork datasets by doing multiple exploratory analyses (e.g., data-dredging): the more we look, the more we find, even if those findings are merely statistical aberrations. Unfortunately, the growing need for CER and the wide availability of administrative databases may lead to selection of research of poor quality with inaccurate findings.
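
The "sickest patients get the new drug" scenario can be illustrated with a small simulation in R. This is a sketch under stated assumptions: all data are simulated, the variable names (severity, treated, outcome) are hypothetical, and the treatment is constructed to have no true effect on the outcome. A naive comparison nonetheless suggests harm, while adjusting for the measured severity largely removes the spurious association.

```r
set.seed(42)
n <- 5000

# Simulated disease severity (standardized); higher means sicker.
severity <- rnorm(n)

# Sicker patients are more likely to receive the new drug (confounding by indication).
treated <- rbinom(n, 1, plogis(1.5 * severity))

# The adverse outcome depends on severity only; the true treatment effect is zero.
outcome <- rbinom(n, 1, plogis(-1 + 1.0 * severity))

# Naive comparison ignores severity; adjusted model includes it as a covariate.
naive    <- glm(outcome ~ treated, family = binomial)
adjusted <- glm(outcome ~ treated + severity, family = binomial)

naive_or    <- exp(coef(naive)[["treated"]])
adjusted_or <- exp(coef(adjusted)[["treated"]])
print(c(naive = naive_or, adjusted = adjusted_or))
```

Adjustment works here only because the confounder was measured and correctly modeled; in real administrative data, severity is often unrecorded, which is why such associations cannot by themselves establish cause and effect.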

In comparative effectiveness research, observational studies are typically considered to be less conclusive than RCTs and meta-analyses. Nonetheless, they can be useful, especially because they examine typical care. Due to lower cost and improvements in health information, observational studies will become increasingly common. Critical assessment of whether the described results are helpful or biased (based upon how the study was performed) is necessary. This case will illustrate several characteristics of these types of studies that will assist in evaluating newly published work.

Clinical Applications

Cardiovascular diseases (CVD) are the leading cause of death in women older than the age of 50. Epidemiologic evidence suggests that estrogen is a key mediator in the development of CVD. Estrogen is an ovarian hormone whose production decreases as women approach menopause. The steep increase in CVD in women at menopause and older and in women who have had hysterectomies further supports a relationship between estrogen and CVD. Building on this evidence of biologic plausibility, epidemiological and observational studies suggested that estrogen replacement therapy (a form of hormone replacement therapy, or HRT) had positive effects on the risk of CVD in postmenopausal women (albeit with some negative effects in its potential to increase the risk for breast cancer and stroke). Based on these findings, in the 1980s and 1990s HRT was routinely employed to treat menopausal symptoms and serve as prophylaxis against CVD.

What was done?

The Nurses’ Health Study (NHS) began collecting data in 1976. In the study, researchers intended to examine a broad range of health effects in women over a long period of time, and a key goal was to clarify the role of HRT in heart disease. The cohort (i.e., the group being followed) included married registered nurses aged 30-55 in 1976 who lived in the 11 most populous states. To collect data, the researchers mailed the study participants a survey every 2 years that asked questions about topics such as smoking, hormone use, menopausal status, and less frequently, diet. Data were collected for key end points that included MI, coronary-artery bypass grafting or angioplasty, stroke, total CVD mortality, and deaths from all causes.

What was found?

At a 10-year follow-up point, the NHS had a study pool of 48,470 women. The researchers found that estrogen use (alone, without progestin) in postmenopausal women was associated with a reduction in the incidence of CVD as well as in CVD mortality compared to non-users. Later, estrogen-progestin combination therapy was shown to be even more cardioprotective than estrogen monotherapy, and lower doses of estrogen replacement therapy were found to deliver equal cardioprotection and lower the risk for adverse events. NHS researchers were alert to the potential for bias in observational studies. Adjustment for risk factors such as age (a typical practice to eliminate confounding) did not change the reported findings.

Was this the right answer?

The NHS was not unique in reporting the benefits associated with HRT; other observational studies corroborated the NHS findings. A secondary retrospective data analysis of the UK primary care electronic medical record database, for example, also showed the protective effect associated with HRT use. Researchers were aware of the fundamental limitations of observational studies, particularly with regard to selection bias. They and practicing clinicians were also aware of the potential negative health effects of HRT, which had to be constantly weighed against the potential cardioprotective benefits in deciding a patient’s course of treatment. As a large section of the population could experience the health effects of HRT, researchers began planning RCTs to verify the promising observational study results. It was highly anticipated that those RCTs would corroborate the belief that estrogen replacement can reduce CVD risk.

Randomized Controlled Trial: The Women’s Health Initiative

The Women’s Health Initiative (WHI) was a major study established by the National Institutes of Health in 1992 to assess a broad range of health effects in postmenopausal women. The trial was intended to follow these women for 8 years, at a cost of millions of dollars in federal funding. Among its many facets, it included an RCT to confirm the results from the observational studies discussed above. To fully investigate earlier findings, the WHI had two subgroups. One subgroup consisted of women with prior hysterectomies; they received estrogen monotherapy. The second group consisted of women who had not undergone hysterectomy; they received estrogen in combination with progestin. The WHI enrolled 27,347 women in their HRT investigation: 10,739 in the estrogen-alone arm and 16,608 in the estrogen plus progestin arm. Within each arm, women were randomly assigned to receive either HRT or placebo. All women in the trial were postmenopausal and aged 50-79 years; the mean age was 63.6 years (a fact that would be important in later analysis). Some participants had experienced previous CV events. The primary outcome of both subgroups was coronary heart disease (CHD), as described by nonfatal MI or death due to CHD.

The estrogen-progestin arm of the WHI was halted after a mean follow-up of 5.2 years, 3 years earlier than expected, as the HRT users in this arm were found to be at increased risk for CHD compared to those who received placebo. The study also noted elevated rates of breast cancer and stroke, among other poor outcomes. The estrogen-alone arm continued for an average follow-up of 6.8 years before being similarly discontinued ahead of schedule. Although this part of the study did not find an increased risk of CHD, it also did not find any cardioprotective effect. Beyond failing to locate any clear CV benefits, the WHI also found real evidence of harm, including increased risk of blood clots, breast cancer and stroke. Initial WHI publications therefore recommended against HRT being prescribed for the secondary prevention of CVD.

What Next?

Scientists, and the clinicians who relied on their data for guidance in treating patients, were faced with conflicting data: epidemiological and observational studies suggested that HRT was cardioprotective while the higher-quality evidence from RCTs strongly suggested the opposite. Clinicians primarily followed the WHI results, so prescriptions for HRT in postmenopausal women quickly declined. Meanwhile, researchers began to analyze the studies for potential discrepancies, and found that the women being followed in the NHS and the WHI differed in several important characteristics.

First, the WHI population was older than the NHS cohort, and many had entered menopause at least 10 years before they enrolled in the RCT. Thus, the WHI enrollees experienced a long duration from the onset of menopause to the commencement of HRT. At the same time, many in the NHS population were closer to the onset of menopause and were still displaying hormonal symptoms when they began HRT. Second, although the NHS researchers adjusted the data for various confounding effects, their results could still have been subject to bias. In general, the NHS cohort was more highly educated and of a higher socioeconomic status than the WHI participants, and therefore more likely to see a physician regularly. The NHS women were also leaner and generally healthier than their RCT counterparts, and had been selected for their evident lack of pre-existing CV conditions. This selection bias in the NHS enrollment may have led to a “healthy woman” effect that in turn led to an overestimation of the benefits of therapy in the observational study. Third, researchers noted that dosing differences between the two study types may have contributed to the divergent results. The NHS reported beneficial results following low-dose estrogen therapy. The WHI, meanwhile, used a higher estrogen dose, exposing women to a larger dosage of hormones and increasing their risk for adverse events. The increased risk profile of the WHI women (e.g., older, more comorbidities, higher estrogen dose) could have contributed to the evidence of harm seen in the WHI results.

Emerging Data

In addition to identifying the inherent differences between the two study populations, researchers began a secondary analysis of the NHS and WHI trials. NHS researchers reported that women who began HRT close to the onset of menopause had a significantly reduced risk of CHD. In the subgroups of women who were older and had a similar duration after menopause compared with the WHI women, they found no significant relationship between HRT and CHD. Also, the WHI study further stratified these results by age, and found that women who began HRT close to their onset of menopause experienced some cardioprotection, while women who were further from the onset of menopause had a slightly elevated risk for CHD.

Secondary analysis of both studies was therefore necessary to show that age and a short duration from the onset of menopause are crucial to HRT success as a cardioprotective agent. Neither study type provided “truth” or rather, both studies provided “truth” if viewed carefully (e.g., both produced valid and important results). The differences seen in the studies were rooted in the timing of HRT and the populations being studied.

Lessons Learned From this Case Study

Although RCTs are given a higher evidence grade, observational studies provide important clinical insights. In this example, the study populations differed. For policymakers and clinicians, it is crucial to examine whether the CER was based upon patients similar to those being considered. Any study with a dissimilar population may provide non-relevant results. Thus, readers of CER need to carefully examine the generalizability of the findings being reported.

==Appendix==

General Classification and Regression Tree (CART) data analysis steps, as implemented in the R package rpart.

===Growing the Tree===

# To grow a tree, use
rpart(formula, data=, method=,control=), where
formula is in the format outcome ~ predictor1+predictor2+...
data= specifies the data frame
method= "class" for a classification tree, use "anova" for a regression tree
control= optional parameters for controlling tree growth. For example, control=rpart.control(minsplit=30, cp=0.001) requires that the minimum number of observations in a node be 30 before attempting a split and that a split must decrease the overall lack of fit by a factor of 0.001 (cost complexity factor) before being attempted.
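
As a concrete illustration of the call described above, the following sketch grows a classification tree on the kyphosis data frame that ships with the rpart package. The choice of dataset and the object name fit are assumptions made for this example; the control settings simply reuse the values mentioned above.

```r
library(rpart)

# Grow a classification tree predicting Kyphosis (absent/present)
# from the three predictors available in the bundled kyphosis data.
fit <- rpart(Kyphosis ~ Age + Number + Start,
             data    = kyphosis,
             method  = "class",
             control = rpart.control(minsplit = 30, cp = 0.001))

# Print the fitted splits as text.
print(fit)
```

For a continuous outcome, the same call with method = "anova" would fit a regression tree instead.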

===Examining Results===

# These functions help with examining the results.
printcp(fit) display complexity parameter (cp) table
plotcp(fit) plot cross-validation results
rsq.rpart(fit) plot approximate R-squared and relative error for different splits (2 plots). labels are only appropriate for the "anova" method.
print(fit) print results
summary(fit) detailed results including surrogate splits
plot(fit) plot decision tree
text(fit) label the decision tree plot
post(fit, file=) create postscript plot of decision tree
# In trees created by rpart(), move to the LEFT branch when the stated condition is true.

===Pruning Trees===

# In general, trees should be pruned back to avoid overfitting the data. The tree size should minimize the cross-validated error (the xerror column printed by printcp()). Pruning the tree is accomplished by:
prune(fit, cp= )
# Use printcp( ) to examine the cross-validation error results, select the complexity parameter (CP) associated with the minimum error, and insert it into the prune() function. This (automatically selecting the complexity parameter associated with the smallest cross-validated error) can be done succinctly by:
fit$\$$cptable[which.min(fit$\$$cptable[,"xerror"]),"CP"]
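
Putting the pruning steps together, a minimal end-to-end sketch might look as follows. It again assumes the kyphosis data bundled with rpart; the names best_cp and pruned are introduced here for illustration. Because the cross-validation folds are drawn at random, a seed is set so the selected complexity parameter is reproducible.

```r
library(rpart)

set.seed(2016)  # cross-validation folds are random; fix them for reproducibility

# Grow a deliberately permissive tree.
fit <- rpart(Kyphosis ~ Age + Number + Start,
             data    = kyphosis,
             method  = "class",
             control = rpart.control(minsplit = 30, cp = 0.001))

# Inspect the complexity-parameter table, including the xerror column.
printcp(fit)

# Pick the CP value with the smallest cross-validated error and prune to it.
best_cp <- fit$cptable[which.min(fit$cptable[, "xerror"]), "CP"]
pruned  <- prune(fit, cp = best_cp)
print(pruned)
```

The pruned tree never has more nodes than the original; on a small dataset such as this one, the two may well be identical.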

===Complete Dataset for N-of-1 Example===
[[SMHS_MethodsHeterogeneity_CER_Nof1|This N-of-1 Dataset]] includes an example.

===Footnotes===

*13 Based on 2009 NPC report, http://www.npcnow.org/publication/demystifying-comparative-effectiveness-research-case-study-learning-guide
*14 http://www.cancer.gov/cancertopics/druginfo/fda-cetuximab
*15 http://www.nejm.org/doi/full/10.1056/NEJMoa072761

===[[SMHS_MethodsHeterogeneity|Back to the Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research section]]===

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php/SMHS_MethodsHeterogeneity_CER}}

SMHS MethodsHeterogeneity CER

2016-05-23T19:01:38Z

Pineaumi: /* Case-Study 2: The Rosiglitazone Study */

==[[SMHS_MethodsHeterogeneity| Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research]] - Comparative Effectiveness Research: Case Studies 13 (CER) ==

===Observational Studies: Tips for the CER Practitioners===

*Different study types can offer different understandings; neither should be discounted without closer examination.

*RCTs provide an accurate understanding of the effect of a particular intervention in a well-defined patient group under “controlled” circumstances.

*Observational studies provide an understanding of real-world care and its impact, but can be biased due to uncontrolled factors.

*Observational studies differ in the types of databases used. These databases may lack clinical detail and contain incomplete or inaccurate data.

*Before accepting the findings from an observational study, consider whether confounding factors may have influenced the results.

*In this scenario, subgroup analysis was vital in clarifying both study designs; what is true for the many (e.g., overall, estrogen appeared to be detrimental) may not be true for the few (e.g., that for the younger post-menopausal woman, the benefits were greater and the harms less frequent).

*Carefully examine the generalizability of the study. Do the study’s patients and intervention match those under consideration?

*Observational studies can identify associations but cannot prove cause-and-effect relationships.

===Case-Study 1: The Cetuximab Study<sup>14</sup>===

What was done and what was found?

Cetuximab, an anti-epidermal growth factor receptor (EGFR) agent, has recently been added to the therapeutic armamentarium. Two important RCTs examined its impact in patients with metastatic colorectal cancer (mCRC). In the first one, 56 centers in 11 European countries investigated the outcomes associated with cetuximab therapy in 329 mCRC patients who experienced disease progression either on irinotecan therapy or within 3 months thereafter. The study reported that the group on a combination of irinotecan and cetuximab had a significantly higher rate of overall response to treatment (primary endpoint) than the group on cetuximab alone: 22.9% (95% CI, 17.5-29.1%) vs. 10.8% (95% CI, 5.7-18.1%) (P=0.007), respectively. Similarly, the median time to progression was significantly longer in the combination therapy group (4.1 vs. 1.5 months, P<0.001). As these patients had already progressed on irinotecan prior to the study, any response was viewed as positive. Safety was similar between the two treatment arms: approximately 80% of patients in each arm experienced a rash. Grade 3 or 4 (the more severe) toxic effects on the skin were slightly more frequent in the combination-therapy group than in the cetuximab monotherapy group, observed in 9.4% and 5.2% of participants, respectively. Other side effects, such as diarrhea and neutropenia observed in the combination-therapy arm, were considered to be in the range expected for irinotecan alone. Data from this study demonstrated the efficacy and safety of cetuximab and were instrumental in the FDA’s 2004 approval.

A second RCT (2007) examined 572 patients and suggested efficacy of cetuximab in the treatment of mCRC. This study was a randomized, non-blinded, controlled trial that examined cetuximab monotherapy plus best supportive care compared to best supportive care alone in patients who had received and failed prior chemotherapy regimens. It reported that median overall survival (the primary endpoint) was significantly longer in patients receiving cetuximab plus best supportive care compared to best supportive care alone (6.1 vs. 4.6 months, respectively) (hazard ratio for death=0.77; 95% CI: 0.64-0.92, P=0.005). This RCT described a greater incidence of adverse events in the cetuximab plus best supportive care group compared to best supportive care alone including (most significantly) rash, as well as edema, fatigue, nausea and vomiting.
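The reported hazard ratio, confidence interval, and P value can be checked against each other. The following is a minimal sketch, assuming a normal approximation on the log scale and a 95% interval built with a critical value of 1.96; the function name is illustrative and is not taken from the trial report.

```python
import math

def p_from_ratio_ci(estimate, lower, upper, z_crit=1.96):
    """Approximate z statistic and two-sided p-value for a ratio measure
    (hazard ratio or odds ratio) from its point estimate and 95% CI."""
    # On the log scale the CI is roughly estimate +/- z_crit * SE.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(estimate) / se
    # Two-sided tail area of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Values quoted above: HR = 0.77, 95% CI 0.64-0.92.
z, p = p_from_ratio_ci(0.77, 0.64, 0.92)
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

Under these assumptions the recovered P value is close to the reported P=0.005, which suggests the three published numbers are mutually consistent.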

Was this the right answer?

These RCTs had fairly broad enrollment criteria and the cetuximab benefits were modest. Emerging scientific theories raised the possibility that genetically defined population subsets might experience a greater-than-average treatment benefit. One such area of inquiry entailed examining “biomarkers,” or genetic indicators of a patient’s greater response to therapy. Even as the above RCTs were being conducted, data emerged showing the importance of the KRAS gene.

Emerging Data

Based on the emerging biochemical evidence that the epidermal growth factor receptor (EGFR) treatment mechanism (cetuximab) was even more finely detailed than previously understood, the study authors of the 2007 RCT undertook a retrospective subgroup analysis using tumor tissue samples preserved from their initial study. Following laboratory analysis, all viable tissue samples were classified as having a wild-type (non-mutated) or a mutated KRAS gene. Instead of the previous two study arms (cetuximab plus best supportive care vs. best supportive care alone), there were 4 for this new analysis: each of the two original study arms was further divided by wild-type vs. mutated KRAS status. Laboratory evaluation found a KRAS mutation in 40.9% of patients in the cetuximab plus best supportive care group and in 42.3% of patients in the best supportive care alone group. The efficacy of cetuximab was found to be significantly correlated with KRAS status: in patients with wild-type (non-mutated) KRAS genes, cetuximab plus best supportive care compared to best supportive care alone improved overall survival (median 9.5 vs. 4.8 months, respectively; hazard ratio for death=0.55; 95% CI, 0.41-0.74, P<0.001), and progression-free survival (median 3.7 vs. 1.9 months, respectively; hazard ratio for progression or death=0.40; 95% CI, 0.30-0.54, P<0.001). Meanwhile, in patients with mutated KRAS tumors, the authors found no significant difference in outcome between cetuximab plus best supportive care vs. best supportive care alone.

What next?

Based on these and similar results from other studies, the FDA narrowed its product labeling in July 2009 to indicate that cetuximab is not recommended for mCRC patients with mutated KRAS tumors. This distinction reduces the relevant population by approximately 40%. Similarly, the American Society of Clinical Oncology released a provisional clinical recommendation that all mCRC patients have their tumors tested for KRAS status before receiving anti-EGFR therapy. The benefits of targeted treatment are many. Patients who previously underwent cetuximab therapy without knowing their genetic predisposition would no longer have to be exposed to the drug’s toxic effects if unnecessary, as the efficacy of cetuximab is markedly higher in the genetically defined appropriate patients. In a less-uncertain environment, clinicians can be more confident in advocating a course of action in their care of patients. And finally, knowledge that targeted therapy is possible suggests the potential for further innovation in treatment options. In fact, research continues to demonstrate options for targeted cetuximab treatment of mCRC at an even finer scale than seen with KRAS; and similar genetic targeting is being investigated, and advocated, in other cancer types.

Lessons Learned From this Case Study

Although RCTs are generally viewed as the gold standard, results of one or even a series of trials may not accurately reflect the benefits experienced by an individual patient. This case-study suggests that cetuximab initially appeared to have rather modest clinical benefits. However, new information and subsequent genetic subgroup assessments led to very different conclusions. Clinicians should be aware that the current knowledge is likely to evolve and any decisions about patient care should be carefully considered with that sense of uncertainty in mind. As in this case study, subgroup analyses (e.g., genetic subtypes) need a theoretical rationale. Ideally, the analyses should be determined at the time of original RCT design and should not just occur as explorations of the subsequent data. When improperly employed, post hoc analyses may lead to incorrect patient care conclusions.

RCTs: Tips for CER Practitioners

*RCTs can determine whether an intervention can provide benefit in a very controlled environment.

*The controlled nature of an RCT may limit its generalizability to a broader population.

*No results are permanent; advances in scientific knowledge and understanding can influence how we view the effectiveness (or safety) of a therapeutic intervention.

*Targeted therapy illuminated by carefully thought out subgroup analyses can improve the efficacious and safe use of an intervention.

===Case-Study 2: The Rosiglitazone Study<sup>15</sup>===

Meta-analysis

Often the results for the same intervention differ across clinical trials and it may not be clear whether one therapy provides more benefit than another. As CER increases and more studies are conducted, clinicians and policymakers are more likely to encounter this scenario. In a systematic review, a researcher identifies similar studies and displays their results in a table, enabling qualitative comparisons across the studies. With a meta-analysis, the data from included studies are statistically combined into a single “result.” Merging the data from a number of studies increases the effective sample size of the investigation, providing a statistically stronger conclusion about the body of research. By so doing, investigators may detect low frequency events and demonstrate more subtle distinctions between therapeutic alternatives.
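To make the idea of statistically combining studies concrete, here is a minimal sketch of one standard approach, fixed-effect inverse-variance pooling of log odds ratios. The three studies and their standard errors are hypothetical values chosen for illustration, not data from any trial discussed here.

```python
import math

def fixed_effect_pool(log_effects, std_errors):
    """Inverse-variance fixed-effect pooling of per-study log effects.
    Returns the pooled log effect and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, log_effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Hypothetical per-study odds ratios and standard errors of the log odds ratio.
log_ors = [math.log(1.2), math.log(1.5), math.log(0.9)]
ses = [0.40, 0.50, 0.30]

pooled, pooled_se = fixed_effect_pool(log_ors, ses)
ci_low = math.exp(pooled - 1.96 * pooled_se)
ci_high = math.exp(pooled + 1.96 * pooled_se)
print(f"pooled OR = {math.exp(pooled):.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

The pooled standard error is smaller than that of any single study, which is the sense in which merging studies increases the effective sample size. A fixed-effect model assumes all studies estimate the same underlying effect, an assumption that may not hold for clinically heterogeneous trials.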

When studies have been properly identified and combined, the meta-analysis produces a summary estimate of the findings and a confidence interval that can serve as a benchmark in medical opinion and practice. However, when done incorrectly, the quantitative and statistical analysis can create impressive “numbers” but biased results. The following are important criteria for properly conducted meta-analyses:

1. Carefully defining unbiased inclusion or exclusion criteria for study selection

2. Including only those studies that have similar design elements, such as patient population, drug regimen, outcomes being assessed, and time-frame

3. Applying correct statistical methods to combine and analyze the data

Reporting this information is essential for the reader to determine whether the data were suitable to combine, and if the meta-analysis draws unbiased conclusions. Meta-analyses of randomized clinical trials are considered to be the highest level of medical evidence as they are based upon a synthesis of rigorously controlled trials that systematically reduce bias and confounding. This technique is useful in summarizing available evidence and will likely become more common in the era of publicly funded comparative effectiveness research. The following case study will examine several key principles that will be useful as the reader encounters these publications.

Clinical Application

Heart disease is the leading cause of mortality in the United States, resulting in approximately 20% of all deaths. Diabetics are particularly susceptible to heart disease, with more than 65% of deaths attributable to it. The nonfatal complications of diabetes are wide-ranging and include kidney failure, nerve damage, amputation, stroke and blindness, among other outcomes. In 2007, the total estimated cost of diabetes in the United States was $174B; $116B was derived from direct medical expenditures and the rest from the indirect cost of lost productivity due to the disease. With such serious health effects and heavy direct and indirect costs tied to diabetes, proper disease management is critical. Historically, diabetes treatment has focused on strict blood sugar control, assuming that this goal not only targets diabetes but also reduces other serious comorbidities of the disease.

Anti-diabetic agents have long been associated with key questions as to their benefits/risks in the treatment of diabetes. The sulfonylurea tolbutamide, a first generation anti-diabetic drug, was found in a landmark study in the 1970s to significantly increase the CV mortality rate compared to patients not on this agent. Further analysis by external parties concluded that the methods employed in this trial were significantly flawed (e.g., use of an “arbitrary” definition of diabetes status, heterogeneous baseline characteristics of the populations studied, and incorrect statistical methods). Since these early studies, CV concerns continue to be an issue with selected oral hypoglycemic agents that have subsequently entered the marketplace.

A class of drugs, thiazolidinedione (TZD), was approved in the late 1990s, as a solution to the problems associated with the older generation of sulfonylureas. Rosiglitazone, a member of the TZD class, was approved by the FDA in 1999 and was widely prescribed for the treatment of type-2 diabetes. A number of RCTs supported the benefit of rosiglitazone as an important new oral antidiabetic agent. However, safety concerns developed as the FDA received reports of adverse cardiac events potentially associated with rosiglitazone. It was in this setting that a meta-analysis by Nissen and Wolski was published in the New England Journal of Medicine in June 2007.

What was done?

Nissen and Wolski conducted a meta-analysis examining the impact of rosiglitazone on cardiac events and mortality compared to alternative therapeutic approaches. The study began with a broad search to locate potential studies for review. The authors screened published phase II, III, and IV trials; the FDA website; and the drug manufacturer’s clinical-trial registry for applicable data relating to rosiglitazone use. When the initial search was complete, the studies were further categorized by pre-stated inclusion criteria. Meta-analysis inclusion criteria were simple: studies had to include rosiglitazone and a randomized comparator group treated with either another drug or placebo, study arms had to show similar length of treatment, and all groups had to have received more than 24 weeks of exposure to the study drugs. The studies had to contain outcome data of interest including the rate of myocardial infarction (MI) or death from all CV causes. Out of 116 studies surveyed by the authors, 42 met their inclusion criteria and were included in the meta-analysis. Of the studies they included, 23 had durations of 26 weeks or less, and only five studies followed patients for more than a year. Until this point, the study’s authors were following a path similar to that of any reviewer interested in CV outcomes, examining the results of these 42 studies and comparing them qualitatively. Quantitatively combining the data, however, required the authors to make choices about the studies they could merge and the statistical methods they should apply for analysis. Those decisions greatly influenced the results that were reported.

What was found?

When the studies were combined, the meta-analysis contained data from 15,565 patients in the rosiglitazone group and 12,282 patients as comparators. Analyzing their data, the authors chose one particular statistical method (the Peto odds ratio method, a fixed-effect statistical approach), which is designed for estimating odds ratios when the outcomes of interest are rare. In comparing rosiglitazone with a “control” group that included other drugs or placebo, the authors reported odds ratios of 1.43 (95% CI, 1.03-1.98; P=0.03) and 1.64 (95% CI, 0.98-2.74; P=0.06) for MI and death from CV causes, respectively. In other words, the estimated odds of an MI or death from a CV cause were higher for rosiglitazone patients than for patients on other therapies or placebo. The authors reported that rosiglitazone was significantly associated with an increase in the risk of MI and had borderline significance in increasing the risk of death from all CV causes. These findings appeared online on the same day that the FDA issued a safety alert regarding rosiglitazone. Discussion of the meta-analysis was immediately featured prominently in the news media. By December 2007, prescription claims for the drug at retail pharmacies had fallen by more than 50%.
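For readers unfamiliar with the Peto method, the sketch below shows the usual one-step formula: for each trial, compare the observed events in the treatment arm with the number expected under no effect, then pool across trials. The 2x2 counts are hypothetical and are not the rosiglitazone trial data; the function name is illustrative.

```python
import math

def peto_odds_ratio(tables):
    """Peto one-step pooled odds ratio with a 95% CI.
    tables: list of (events_trt, n_trt, events_ctl, n_ctl)."""
    sum_o_minus_e = 0.0
    sum_v = 0.0
    for a, n1, c, n2 in tables:
        n = n1 + n2          # total patients in the trial
        d = a + c            # total events in the trial
        if d == 0 or d == n:
            continue         # variance is zero: the trial adds no information
        expected = n1 * d / n
        variance = n1 * n2 * d * (n - d) / (n ** 2 * (n - 1))
        sum_o_minus_e += a - expected
        sum_v += variance
    log_or = sum_o_minus_e / sum_v
    se = 1.0 / math.sqrt(sum_v)
    return (math.exp(log_or),
            math.exp(log_or - 1.96 * se),
            math.exp(log_or + 1.96 * se))

# Hypothetical trials with rare events and unequal arm sizes.
trials = [(3, 500, 1, 250), (5, 1000, 4, 1000), (2, 300, 0, 150)]
or_hat, low, high = peto_odds_ratio(trials)
print(f"Peto OR = {or_hat:.2f} (95% CI {low:.2f}-{high:.2f})")
```

Note that a trial with no events in either arm contributes nothing to either sum, so under this formula such trials drop out of the pooled estimate.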

As diabetic patients and their clinicians reacted to the news, a methodologic debate also ensued. This discussion included statistical issues pertaining to the conduct of the analysis, its implications for clinical care, and finally the roles of the FDA and the drug manufacturer in overseeing and regulating rosiglitazone. Concern about treatment among patients with diabetes, and within the medical community, continues today.

Was this the right answer?

Should the studies have been combined? Commentators faulted the authors for including several studies that were not originally intended to investigate diabetes, and for combining both placebo and drug therapy data into one comparator arm. Some critics noted that despite the stated inclusion criteria, some data were derived from studies where the rosiglitazone arm was allowed a longer follow-up than the comparator arm. By failing to account for this longer follow-up period, commentators felt that the authors may have overestimated the effect of rosiglitazone on CV outcomes. Many reviewers were concerned that this meta-analysis excluded trials in which no patients suffered an MI or died from CV causes – the outcomes of greatest interest. Some reviewers also noted that the exclusion of zero-event trials from the pooled dataset not only gave an incomplete picture of the impact of rosiglitazone but could have increased the odds ratio estimate. In general, the pooled dataset was criticized by many for being a faulty microcosm of the information available regarding rosiglitazone.

It is essential that a meta-analysis be based on similarity in the data sources. If studies differ in important areas such as the patient populations, interventions, or outcomes, combining their data may not be suitable. The researchers accepted studies and populations that were clinically heterogeneous, yet pooled them as if they were not. The study reported that the results were combined from a number of trials that were not initially intended to investigate CV outcomes. Furthermore, the available data did not allow for time-to-event analysis, an essential tool in comparing the impact of alternative treatment options. Reviewers considered the data to be insufficiently homogeneous, and the line of cause and effect to be murkier than the authors described.

Were the statistical methods optimal?

The statistical methods for this meta-analysis also came under significant criticism. The critiques focused on the authors’ use of the Peto method as being an incorrect choice because data were pooled from both small and very large studies, resulting in a potential overestimation of treatment effect. Other reviewers pointed out that the Peto method should not have been used, as a number of the underlying studies did not have patients assigned equally to rosiglitazone and comparator groups. Finally, critics suggested that the heterogeneity of the included studies required an altogether different set of analytic techniques.

Demonstrating the sensitivity of the authors’ initial analysis to the inclusion criteria and statistical tests used, a number of researchers reworked the data from this study. One researcher used the same studies but analyzed the data with a more commonly used statistical method (Mantel-Haenszel), and found no significant increase in the relative risk or common odds ratio with MI or CV death. When the pool of studies was expanded to include those originally eliminated because they had zero CV events, the odds ratios for MI and death from CV causes dropped from 1.43 to 1.26 (95% CI, 0.93-1.72) and from 1.64 to 1.14 (95% CI, 0.74-1.74), respectively. Neither of the recalculated odds ratios was significant for MI or CV death. Finally, several newer long-term studies have been published since the Nissen meta-analysis. Incorporating their results with the meta-analysis data showed that rosiglitazone is associated with an increased risk of MI but not of CV death. Thus, the findings from these meta-analyses varied with the methods employed, the studies included, and the addition of later trials.
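The sensitivity to zero-event trials can be illustrated with a small sketch of the Mantel-Haenszel pooled odds ratio. The text does not say exactly how the reanalyses handled zero-event trials; adding 0.5 to every cell of a table that contains a zero is one common convention and is used here purely as an assumption. The trial counts are hypothetical.

```python
def mantel_haenszel_or(tables, correction=0.0):
    """Mantel-Haenszel pooled odds ratio.
    tables: list of (events_trt, n_trt, events_ctl, n_ctl).
    correction: value added to every cell of any table containing a zero
    cell (0.5 is a common convention; 0.0 disables the correction)."""
    numerator = 0.0
    denominator = 0.0
    for a, n1, c, n2 in tables:
        b = n1 - a   # treatment patients without the event
        d = n2 - c   # control patients without the event
        if correction > 0 and min(a, b, c, d) == 0:
            a, b, c, d = (x + correction for x in (a, b, c, d))
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator

# Two hypothetical trials with events plus one zero-event trial.
trials = [(4, 400, 2, 400), (6, 600, 3, 300), (0, 200, 0, 200)]

or_without = mantel_haenszel_or(trials)                  # zero-event trial adds nothing
or_with = mantel_haenszel_or(trials, correction=0.5)     # zero-event trial now counts
print(f"uncorrected: {or_without:.3f}, with correction: {or_with:.3f}")
```

In this toy example, letting the zero-event trial contribute pulls the pooled odds ratio toward 1, the same direction of change described above, although the size of the shift depends entirely on the data and the correction chosen.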

Emerging Data

The controversy surrounding the rosiglitazone meta-analysis authored by Nissen and Wolski forced an unplanned interim analysis of a long-term, randomized trial investigating the CV effects of rosiglitazone among patients with type 2 diabetes. The authors of the RECORD trial noted that even though the follow-up at 3.75 years was shorter than expected, rosiglitazone, when added to standard glucose-lowering therapy, was found to be associated with an increase in the risk of heart failure but was not associated with any increase in death from CV or other causes. Data at the time were found to be insufficient to determine the effect of rosiglitazone on an increase in the risk of MI. The final report of that trial, published in June 2009, confirmed the elevated risk of heart failure in people with type 2 diabetes treated with rosiglitazone in addition to glucose-lowering drugs, but continued to show inconclusive results about the effect of the drug therapy on the risk of MI. Further, the RECORD trial clarified that rosiglitazone does not result in an increased risk of CV morbidity or mortality compared to standard glucose-lowering drugs. Other trials conducted since the publishing of the meta-analysis have corroborated these results, casting further doubt on the findings of the meta-analysis published by Nissen and Wolski.

Now what?

Some sources suggest that the original Nissen meta-analysis delivered more harm than benefit, and that a well-recognized medical journal may have erred in its process of peer review. Despite this criticism, it is important to note that subsequent publications support the risk of adverse CV events associated with rosiglitazone, although rosiglitazone use does not appear to increase deaths. These results and emerging data point to the need for further rigorous research to clarify the benefits and risks of rosiglitazone on a variety of outcomes, and the importance of directing the drug to the population that will maximally benefit from its use.

Lessons Learned From this Case Study

Results from initial randomized trials that seem definitive at one time may not be conclusive, as further trials may emerge to clarify, redirect, or negate previously accepted results. A meta-analysis of those trials can lead to varying results based upon the timing of the analysis and the choices made in its performance.

Meta-Analysis: Tips for CER Practitioners

*The results of a meta-analysis are highly dependent on the studies included (and excluded). Are these criteria properly defined and relevant to the purposes of the meta-analysis? Were the combined studies sufficiently similar? Can results from this cohort be generalized to other populations of interest?

*The statistical methodology can impact study results. Have there been reviews critiquing the methods used in the meta-analysis?

*A variety of statistical tests should be considered, and perhaps reported, in the analysis of results. Do the authors mention their rationale in choosing a statistical method? Do they show the stability of their results across a spectrum of analytical methods?

*Nothing is permanent. Emerging data may change the playing field, and meta-analysis results are only as good as the data and statistics from which they are derived.

===Case-Study 3: The Nurses’ Health Study===

An observational study

An observational study is a very common type of research design in which the effects of a treatment or condition are studied without formally randomizing patients in an experimental design. Such studies can be done prospectively, wherein data are collected about a group of patients going forward in time; or retrospectively, in which the researcher looks into the past, mining existing databases for data that have already been collected. The latter studies are frequently performed by using an electronic database that contains, for example, administrative, “billing,” or claims data. Less commonly, observational research uses electronic health records, which have greater clinical information that more closely resembles the data collected in an RCT. Observational studies often take place in “real-world” environments, which allow researchers to collect data for a wide array of outcomes. Patients are not randomized in these studies, but the findings can be used to generate hypotheses for investigation in a more constrained experimental setting. Perhaps the best known observational study is the “Framingham study,” which collected demographic and health data for a group of individuals over many years (and continues to do so) and has provided an understanding of the key risk factors for heart disease and stroke.

Observational studies present many advantages to the comparative effectiveness researcher. The study design can provide a unique glimpse of the use of a health care intervention in the “real world,” an essential step in gauging the gap between efficacy (can a treatment work in a controlled setting?) and effectiveness (does the treatment work in a real-life situation?). Furthermore, observational studies can be conducted at low cost, particularly if they involve the secondary analysis of existing data sources. CER often uses administrative databases, which are based upon the billing data submitted by providers during routine care. These databases typically have limited clinical information, may have errors in them, and generally do not undergo auditing.

The uncontrolled nature of observational studies allows them to be subject to bias and confounding. For example, doctors may prescribe a new medication only for the sickest patients. Comparing these outcomes (without careful statistical adjustment) with those from less ill patients receiving alternative treatment may lead to misleading results. Observational studies can identify important associations but cannot prove cause and effect. These studies can generate hypotheses that may require RCTs for fuller demonstration of those relationships. Secondary analysis can also be problematic if researchers overwork datasets by doing multiple exploratory analyses (e.g., data-dredging): the more we look, the more we find, even if those findings are merely statistical aberrations. Unfortunately, the growing need for CER and the wide availability of administrative databases may lead to selection of research of poor quality with inaccurate findings.
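The prescribing example above (a new drug given mostly to the sickest patients) can be made concrete with a small stratified calculation. All counts below are invented for illustration; they are arranged so that the crude comparison and the within-stratum comparisons point in opposite directions.

```python
# Hypothetical (deaths, patients) by illness severity and treatment arm.
# The new drug is given mostly to sicker patients, as in the example above.
strata = {
    "sicker":    {"new": (24, 80), "old": (8, 20)},
    "healthier": {"new": (1, 20),  "old": (8, 80)},
}

def crude_risk(strata, arm):
    """Risk of death in one arm, ignoring illness severity."""
    events = sum(s[arm][0] for s in strata.values())
    total = sum(s[arm][1] for s in strata.values())
    return events / total

def stratum_risk(strata, stratum, arm):
    """Risk of death in one arm within a single severity stratum."""
    events, total = strata[stratum][arm]
    return events / total

print("crude:", crude_risk(strata, "new"), "vs", crude_risk(strata, "old"))
for name in strata:
    print(name + ":", stratum_risk(strata, name, "new"),
          "vs", stratum_risk(strata, name, "old"))
```

Here the crude comparison makes the new drug look worse (25% vs. 16% mortality), yet within each severity stratum it looks better (30% vs. 40%, and 5% vs. 10%). Stratifying or otherwise adjusting for severity is what the phrase "careful statistical adjustment" refers to; it helps only for confounders that are actually measured.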

In comparative effectiveness research, observational studies are typically considered to be less conclusive than RCTs and meta-analyses. Nonetheless, they can be useful, especially because they examine typical care. Due to lower cost and improvements in health information, observational studies will become increasingly common. Critical assessment of whether the described results are helpful or biased (based upon how the study was performed) is necessary. This case will illustrate several characteristics of the types of studies that will assist in evaluating newly published work.

Clinical Applications

Cardiovascular diseases (CVD) are the leading cause of death in women older than the age of 50. Epidemiologic evidence suggests that estrogen is a key mediator in the development of CVD. Estrogen is an ovarian hormone whose production decreases as women approach menopause. The steep increase in CVD in women at menopause and older and in women who have had hysterectomies further supports a relationship between estrogen and CVD. Building on this evidence of biologic plausibility, epidemiological and observational studies suggested that estrogen replacement therapy (a form of hormone replacement therapy, or HRT) had positive effects on the risk of CVD in postmenopausal women (albeit with some negative effects in its potential to increase the risk for breast cancer and stroke). Based on these findings, in the 1980s and 1990s HRT was routinely employed to treat menopausal symptoms and serve as prophylaxis against CVD.

What was done?

The Nurses’ Health Study (NHS) began collecting data in 1976. In the study, researchers intended to examine a broad range of health effects in women over a long period of time, and a key goal was to clarify the role of HRT in heart disease. The cohort (i.e., the group being followed) included married registered nurses aged 30-55 in 1976 who lived in the 11 most populous states. To collect data, the researchers mailed the study participants a survey every 2 years that asked questions about topics such as smoking, hormone use, menopausal status, and less frequently, diet. Data were collected for key end points that included MI, coronary-artery bypass grafting or angioplasty, stroke, total CVD mortality, and deaths from all causes.

What was found?

At a 10-year follow-up point, the NHS had a study pool of 48,470 women. The researchers found that estrogen use (alone, without progestin) in postmenopausal women was associated with a reduction in the incidence of CVD as well as in CVD mortality compared to non-users. Later, estrogen-progestin combination therapy was shown to be even more cardioprotective than estrogen monotherapy, and lower doses of estrogen replacement therapy were found to deliver equal cardioprotection and lower the risk for adverse events. NHS researchers were alert to the potential for bias in observational studies. Adjustment for risk factors such as age (a typical practice to eliminate confounding) did not change the reported findings.

Was this the right answer?

The NHS was not unique in reporting the benefits associated with HRT; other observational studies corroborated the NHS findings. A secondary retrospective data analysis of the UK primary care electronic medical record database, for example, also showed the protective effect associated with HRT use. Researchers were aware of the fundamental limitations of observational studies, particularly with regard to selection bias. They and practicing clinicians were also aware of the potential negative health effects of HRT, which had to be constantly weighed against the potential cardioprotective benefits in deciding a patient’s course of treatment. As a large section of the population could experience the health effects of HRT, researchers began planning RCTs to verify the promising observational study results. It was highly anticipated that those RCTs would corroborate the belief that estrogen replacement can reduce CVD risk.

Randomized Controlled Trial: The Women’s Health Initiative

The Women’s Health Initiative (WHI) was a major study established by the National Institutes of Health in 1992 to assess a broad range of health effects in postmenopausal women. The trial was intended to follow these women for 8 years, at a cost of millions of dollars in federal funding. Among its many facets, it included an RCT to confirm the results from the observational studies discussed above. To fully investigate earlier findings, the WHI had two subgroups. One subgroup consisted of women with prior hysterectomies; they received estrogen monotherapy. The second group consisted of women who had not undergone hysterectomy; they received estrogen in combination with progestin. The WHI enrolled 27,347 women in their HRT investigation: 10,739 in the estrogen-alone arm and 16,608 in the estrogen plus progestin arm. Within each arm, women were randomly assigned to receive either HRT or placebo. All women in the trial were postmenopausal and aged 50-79 years; the mean age was 63.6 years (a fact that would be important in later analysis). Some participants had experienced previous CV events. The primary outcome of both subgroups was coronary heart disease (CHD), as described by nonfatal MI or death due to CHD.

The estrogen-progestin arm of the WHI was halted after a mean follow-up of 5.2 years, 3 years earlier than expected, as the HRT users in this arm were found to be at increased risk for CHD compared to those who received placebo. The study also noted elevated rates of breast cancer and stroke, among other poor outcomes. The estrogen-alone arm continued for an average follow-up of 6.8 years before being similarly discontinued ahead of schedule. Although this part of the study did not find an increased risk of CHD, it also did not find any cardioprotective effect. Beyond failing to locate any clear CV benefits, the WHI also found real evidence of harm, including increased risk of blood clots, breast cancer and stroke. Initial WHI publications therefore recommended against HRT being prescribed for the secondary prevention of CVD.

What Next?

Scientists, and the clinicians who relied on their data for guidance in treating patients, were faced with conflicting data: epidemiological and observational studies suggested that HRT was cardioprotective, while the higher-quality evidence from RCTs strongly suggested the opposite. Clinicians primarily followed the WHI results, so prescriptions for HRT in postmenopausal women quickly declined. Meanwhile, researchers began to analyze the studies for potential discrepancies, and found that the women being followed in the NHS and the WHI differed in several important characteristics.

First, the WHI population was older than the NHS cohort, and many had entered menopause at least 10 years before they enrolled in the RCT. Thus, the WHI enrollees experienced a long duration from the onset of menopause to the commencement of HRT. At the same time, many in the NHS population were closer to the onset of menopause and were still displaying hormonal symptoms when they began HRT. Second, although the NHS researchers adjusted the data for various confounding effects, their results could still have been subject to bias. In general, the NHS cohort was more highly educated and of a higher socioeconomic status than the WHI participants, and therefore more likely to see a physician regularly. The NHS women were also leaner and generally healthier than their RCT counterparts, and had been selected for their evident lack of pre-existing CV conditions. This selection bias in the NHS enrollment may have led to a “healthy woman” effect that in turn led to an overestimation of the benefits of therapy in the observational study. Third, researchers noted that dosing differences between the two study types may have contributed to the divergent results. The NHS reported beneficial results following low-dose estrogen therapy. The WHI, meanwhile, used a higher estrogen dose, exposing women to a larger dosage of hormones and increasing their risk for adverse events. The increased risk profile of the WHI women (e.g., older, more comorbidities, higher estrogen dose) could have contributed to the evidence of harm seen in the WHI results.

Emerging Data
In addition to identifying the inherent differences between the two study populations, researchers began a secondary analysis of the NHS and WHI trials. NHS researchers reported that women who began HRT close to the onset of menopause had a significantly reduced risk of CHD. In the subgroups of women who were older and had a similar duration after menopause compared with the WHI women, they found no significant relationship between HRT and CHD. The WHI study also stratified its results by age, and found that women who began HRT close to their onset of menopause experienced some cardioprotection, while women who were further from the onset of menopause had a slightly elevated risk for CHD.

Secondary analysis of both studies was therefore necessary to show that age and a short duration from the onset of menopause are crucial to HRT success as a cardioprotective agent. Neither study type provided “truth” or rather, both studies provided “truth” if viewed carefully (e.g., both produced valid and important results). The differences seen in the studies were rooted in the timing of HRT and the populations being studied.

Lessons Learned From this Case Study

Although RCTs are given a higher evidence grade, observational studies provide important clinical insights. In this example, the study populations differed. For policymakers and clinicians, it is crucial to examine whether the CER was based upon patients similar to those being considered. Any study with a dissimilar population may provide non-relevant results. Thus, readers of CER need to carefully examine the generalizability of the findings being reported.

==Appendix==

General Classification and Regression Tree (CART) data analysis steps, as implemented in the R package rpart.

===Growing the Tree===

To grow a tree, use rpart(formula, data=, method=, control=), where:
* formula is in the format outcome ~ predictor1+predictor2+...
* data= specifies the data frame
* method= is "class" for a classification tree or "anova" for a regression tree
* control= gives optional parameters for controlling tree growth. For example, control=rpart.control(minsplit=30, cp=0.001) requires that the minimum number of observations in a node be 30 before attempting a split and that a split must decrease the overall lack of fit by a factor of 0.001 (cost complexity factor) before being attempted.
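
As a minimal sketch of the call described above, the following grows a classification tree using the control settings from the example. The kyphosis data frame (bundled with rpart) and the choice of predictors are assumptions made purely for illustration; they are not part of the original analysis.

```r
library(rpart)

# Illustrative data: the kyphosis data frame shipped with rpart (an assumption;
# any data frame with a factor outcome would serve equally well).
fit <- rpart(Kyphosis ~ Age + Number + Start,
             data    = kyphosis,
             method  = "class",   # classification tree
             control = rpart.control(minsplit = 30, cp = 0.001))

print(fit)  # text listing of the fitted tree
```

For a continuous outcome, the same call with method = "anova" would fit a regression tree instead.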

===Examining Results===

These functions help with examining the results:
* printcp(fit): display the complexity parameter (cp) table
* plotcp(fit): plot cross-validation results
* rsq.rpart(fit): plot approximate R-squared and relative error for different splits (2 plots); labels are only appropriate for the "anova" method
* print(fit): print results
* summary(fit): detailed results including surrogate splits
* plot(fit): plot the decision tree
* text(fit): label the decision tree plot
* post(fit, file=): create a PostScript plot of the decision tree

In trees created by rpart(), move to the LEFT branch when the stated condition is true.

===Pruning Trees===

In general, trees should be pruned back to avoid overfitting the data. The tree size should minimize the cross-validated error (the xerror column printed by printcp()). Pruning the tree is accomplished by:
prune(fit, cp= )
Use printcp() to examine the cross-validation error results, select the complexity parameter (CP) associated with the minimum error, and insert it into the prune() function. Automatically selecting the complexity parameter associated with the smallest cross-validated error can be done succinctly by:
fit$cptable[which.min(fit$cptable[,"xerror"]),"CP"]
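
The pruning steps above can be sketched end to end as follows. This is only an illustration under stated assumptions: the kyphosis data frame from rpart and the fixed random seed are choices made here, not part of the original text. Because the cross-validation folds are drawn at random, the selected CP value can differ between runs unless a seed is set.

```r
library(rpart)

set.seed(1)  # cross-validation folds are random; fix the seed for repeatability

# Illustrative fit on the kyphosis data bundled with rpart (an assumption).
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

printcp(fit)  # inspect the CP table, including the xerror column

# CP value on the row with the smallest cross-validated error
best_cp <- fit$cptable[which.min(fit$cptable[, "xerror"]), "CP"]

# Prune the tree back to that complexity
pruned <- prune(fit, cp = best_cp)
print(pruned)
```

The pruned tree never has more nodes than the original fit; it may be identical when the full tree already has the smallest cross-validated error.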

===Complete Dataset for N-of-1 Example===
[[SMHS_MethodsHeterogeneity_CER_Nof1|This N-of-1 Dataset]] includes an example.

===Footnotes===

*13 Based on 2009 NPC report, http://www.npcnow.org/publication/demystifying-comparative-effectiveness-research-case-study-learning-guide
*14 http://www.cancer.gov/cancertopics/druginfo/fda-cetuximab

===[[SMHS_MethodsHeterogeneity|Back to the Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research section]]===

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php/SMHS_MethodsHeterogeneity_CER}}

SMHS MethodsHeterogeneity CER

2016-05-23T19:00:26Z

Pineaumi: /* Footnotes */

==[[SMHS_MethodsHeterogeneity| Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research]] - Comparative Effectiveness Research: Case Studies 13 (CER) ==

===Observational Studies: Tips for the CER Practitioners===

*Different study types can offer different understandings; neither should be discounted without closer examination.

*RCTs provide an accurate understanding of the effect of a particular intervention in a well-defined patient group under “controlled” circumstances.

*Observational studies provide an understanding of real-world care and its impact, but can be biased due to uncontrolled factors.

*Observational studies differ in the types of databases used. These databases may lack clinical detail and contain incomplete or inaccurate data.

*Before accepting the findings from an observational study, consider whether confounding factors may have influenced the results.

*In this scenario, subgroup analysis was vital in clarifying both study designs; what is true for the many (e.g., overall, estrogen appeared to be detrimental) may not be true for the few (e.g., that for the younger post-menopausal woman, the benefits were greater and the harms less frequent).

*Carefully examine the generalizability of the study. Do the study’s patients and intervention match those under consideration?

*Observational studies can identify associations but cannot prove cause-and-effect relationships.

===Case-Study 1: The Cetuximab Study14===

What was done and what was found?

Cetuximab, an anti-epidermal growth factor receptor (EGFR) agent, has recently been added to the therapeutic armamentarium. Two important RCTs examined its impact in patients with mCRC (metastatic colorectal cancer). In the first one, 56 centers in 11 European countries investigated the outcomes associated with cetuximab therapy in 329 mCRC patients who experienced disease progression either on irinotecan therapy or within 3 months thereafter. The study reported that the group on a combination of irinotecan and cetuximab had a significantly higher rate of overall response to treatment (primary endpoint) than the group on cetuximab alone: 22.9% (95% CI, 17.5-29.1%) vs. 10.8% (95% CI, 5.7-18.1%) (P=0.007), respectively. Similarly, the median time to progression was significantly longer in the combination therapy group (4.1 vs. 1.5 months, P<0.001). As these patients had already progressed on irinotecan prior to the study, any response was viewed as positive. Safety between the two treatment arms was similar: approximately 80% of patients in each arm experienced a rash. Grade 3 or 4 (the more severe) toxic effects on the skin were slightly more frequent in the combination-therapy group compared to cetuximab monotherapy, observed in 9.4% and 5.2% of participants, respectively. Other side effects, such as diarrhea and neutropenia observed in the combination-therapy arm, were considered to be in the range expected for irinotecan alone. Data from this study demonstrated the efficacy and safety of cetuximab and were instrumental in the FDA’s 2004 approval.

A second RCT (2007) examined 572 patients and suggested efficacy of cetuximab in the treatment of mCRC. This study was a randomized, non-blinded, controlled trial that examined cetuximab monotherapy plus best supportive care compared to best supportive care alone in patients who had received and failed prior chemotherapy regimens. It reported that median overall survival (the primary endpoint) was significantly higher in patients receiving cetuximab plus best supportive care compared to best supportive care alone (6.1 vs. 4.6 months, respectively) (hazard ratio for death=0.77; 95% CI: 0.64-0.92, P=0.005). This RCT described a greater incidence of adverse events in the cetuximab plus best supportive care group compared to best supportive care alone, including (most significantly) rash, as well as edema, fatigue, nausea and vomiting.

Was this the right answer?

These RCTs had fairly broad enrollment criteria and the cetuximab benefits were modest. Emerging scientific theories raised the possibility that genetically defined population subsets might experience a greater-than-average treatment benefit. One such area of inquiry entailed examining “biomarkers,” or genetic indicators of a patient’s greater response to therapy. Even as the above RCTs were being conducted, data emerged showing the importance of the KRAS gene.

Emerging Data

Based on the emerging biochemical evidence that the epidermal growth factor receptor (EGFR) treatment mechanism (Cetuximab) was even more finely detailed than previously understood, the study authors of the 2007 RCT undertook a retrospective subgroup analysis using tumor tissue samples preserved from their initial study. Following laboratory analysis, all viable tissue samples were classified as having a wild-type (non-mutated) or a mutated KRAS gene. Instead of the previous two study arms (cetuximab plus best supportive care vs. best supportive care alone), there were 4 for this new analysis: each of the two original study arms was further divided by wild-type vs. mutated KRAS status. Laboratory evaluation determined that 40.9% of patients in the cetuximab plus best supportive care group and 42.3% of patients in the best supportive care alone group had a KRAS mutation. The efficacy of cetuximab was found to be significantly correlated with KRAS status: in patients with wild-type (non-mutated) KRAS genes, cetuximab plus best supportive care compared to best supportive care alone improved overall survival (median 9.5 vs. 4.8 months, respectively; hazard ratio for death=0.55; 95% CI, 0.41-0.74, P<0.001), and progression-free survival (median 3.7 vs. 1.9 months, respectively; hazard ratio for progression or death=0.40; 95% CI, 0.30-0.54, P<0.001). Meanwhile, in patients with mutated KRAS tumors, the authors found no significant difference in outcome between cetuximab plus best supportive care vs. best supportive care alone.

What next?

Based on these and similar results from other studies, the FDA narrowed its product labeling in July 2009 to indicate that cetuximab is not recommended for mCRC patients with mutated KRAS tumors. This distinction reduces the relevant population by approximately 40%. Similarly, the American Society of Clinical Oncology released a provisional clinical recommendation that all mCRC patients have their tumors tested for KRAS status before receiving anti-EGFR therapy. The benefits of targeted treatment are many. Patients who previously underwent cetuximab therapy without knowing their genetic predisposition would no longer have to be exposed to the drug’s toxic effects if unnecessary, as the efficacy of cetuximab is markedly higher in the genetically defined appropriate patients. In a less-uncertain environment, clinicians can be more confident in advocating a course of action in their care of patients. And finally, knowledge that targeted therapy is possible suggests the potential for further innovation in treatment options. In fact, research continues to demonstrate options for targeted cetuximab treatment of mCRC at an even finer scale than seen with KRAS; and similar genetic targeting is being investigated, and advocated, in other cancer types.

Lessons Learned From this Case Study

Although RCTs are generally viewed as the gold standard, results of one or even a series of trials may not accurately reflect the benefits experienced by an individual patient. This case-study suggests that cetuximab initially appeared to have rather modest clinical benefits. However, new information that became available and subsequent genetic subgroup assessments led to very different conclusions. Clinicians should be aware that the current knowledge is likely to evolve and any decisions about patient care should be carefully considered with that sense of uncertainty in mind. As in this case study, subgroup analyses (e.g., genetic subtypes) need a theoretical rationale. Ideally, the analyses should be determined at the time of original RCT design and should not just occur as explorations of the subsequent data. When improperly employed, post hoc analyses may lead to incorrect patient care conclusions.

RCTs: Tips for the CER Practitioners

*RCTs can determine whether an intervention can provide benefit in a very controlled environment.

*The controlled nature of an RCT may limit its generalizability to a broader population.

*No results are permanent; advances in scientific knowledge and understanding can influence how we view the effectiveness (or safety) of a therapeutic intervention.

*Targeted therapy illuminated by carefully thought out subgroup analyses can improve the efficacious and safe use of an intervention.

===Case-Study 2: The Rosiglitazone Study===

Meta-analysis

Often the results for the same intervention differ across clinical trials and it may not be clear whether one therapy provides more benefit than another. As CER increases and more studies are conducted, clinicians and policymakers are more likely to encounter this scenario. In a systematic review, a researcher identifies similar studies and displays their results in a table, enabling qualitative comparisons across the studies. With a meta-analysis, the data from included studies are statistically combined into a single “result.” Merging the data from a number of studies increases the effective sample size of the investigation, providing a statistically stronger conclusion about the body of research. By so doing, investigators may detect low frequency events and demonstrate more subtle distinctions between therapeutic alternatives.
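
To make the idea of statistically combining studies concrete, the sketch below pools study-level estimates with inverse-variance (fixed-effect) weighting. This is one common pooling approach, shown here only as an illustration; it is not necessarily the method used in any study discussed in this section, and the log odds ratios and standard errors are made-up numbers.

```r
# Hypothetical log odds ratios and standard errors from three studies
# (made-up numbers for illustration only).
log_or <- c(0.40, 0.10, 0.25)
se     <- c(0.30, 0.20, 0.25)

w         <- 1 / se^2                   # inverse-variance weights
pooled    <- sum(w * log_or) / sum(w)   # fixed-effect pooled log odds ratio
pooled_se <- sqrt(1 / sum(w))           # standard error of the pooled estimate

# 95% confidence interval on the odds-ratio scale
ci <- exp(pooled + c(-1, 1) * qnorm(0.975) * pooled_se)
cat("Pooled OR:", round(exp(pooled), 2),
    "95% CI:", round(ci[1], 2), "-", round(ci[2], 2), "\n")
```

The pooled standard error is smaller than that of any single study, which is the sense in which merging studies increases the effective sample size.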

When studies have been properly identified and combined, the meta-analysis produces a summary estimate of the findings and a confidence interval that can serve as a benchmark in medical opinion and practice. However, when done incorrectly, the quantitative and statistical analysis can create impressive “numbers” but biased results. The following are important criteria for properly conducted meta-analyses:

1. Carefully defining unbiased inclusion or exclusion criteria for study selection

2. Including only those studies that have similar design elements, such as patient population, drug regimen, outcomes being assessed, and time-frame

3. Applying correct statistical methods to combine and analyze the data

Reporting this information is essential for the reader to determine whether the data were suitable to combine, and if the meta-analysis draws unbiased conclusions. Meta-analyses of randomized clinical trials are considered to be the highest level of medical evidence as they are based upon a synthesis of rigorously controlled trials that systematically reduce bias and confounding. This technique is useful in summarizing available evidence and will likely become more common in the era of publicly funded comparative effectiveness research. The following case study will examine several key principles that will be useful as the reader encounters these publications.

Clinical Application

Heart disease is the leading cause of mortality in the United States, resulting in approximately 20% of all deaths. Diabetics are particularly susceptible to heart disease, with more than 65% of deaths attributable to it. The nonfatal complications of diabetes are wide-ranging and include kidney failure, nerve damage, amputation, stroke and blindness, among other outcomes. In 2007, the total estimated cost of diabetes in the United States was $174B; $116B was derived from direct medical expenditures and the rest from the indirect cost of lost productivity due to the disease. With such serious health effects and heavy direct and indirect costs tied to diabetes, proper disease management is critical. Historically, diabetes treatment has focused on strict blood sugar control, assuming that this goal not only targets diabetes but also reduces other serious comorbidities of the disease.

Anti-diabetic agents have long been associated with key questions as to their benefits/risks in the treatment of diabetes. The sulfonylurea tolbutamide, a first generation anti-diabetic drug, was found in a landmark study in the 1970s to significantly increase the CV mortality rate compared to patients not on this agent. Further analysis by external parties concluded that the methods employed in this trial were significantly flawed (e.g., use of an “arbitrary” definition of diabetes status, heterogeneous baseline characteristics of the populations studied, and incorrect statistical methods). Since these early studies, CV concerns continue to be an issue with selected oral hypoglycemic agents that have subsequently entered the marketplace.

A class of drugs, thiazolidinedione (TZD), was approved in the late 1990s, as a solution to the problems associated with the older generation of sulfonylureas. Rosiglitazone, a member of the TZD class, was approved by the FDA in 1999 and was widely prescribed for the treatment of type-2 diabetes. A number of RCTs supported the benefit of rosiglitazone as an important new oral antidiabetic agent. However, safety concerns developed as the FDA received reports of adverse cardiac events potentially associated with rosiglitazone. It was in this setting that a meta-analysis by Nissen and Wolski was published in the New England Journal of Medicine in June 2007.

What was done?

Nissen and Wolski conducted a meta-analysis examining the impact of rosiglitazone on cardiac events and mortality compared to alternative therapeutic approaches. The study began with a broad search to locate potential studies for review. The authors screened published phase II, III, and IV trials; the FDA website; and the drug manufacturer’s clinical-trial registry for applicable data relating to rosiglitazone use. When the initial search was complete, the studies were further categorized by pre-stated inclusion criteria. Meta-analysis inclusion criteria were simple: studies had to include rosiglitazone and a randomized comparator group treated with either another drug or placebo, study arms had to show similar length of treatment, and all groups had to have received more than 24 weeks of exposure to the study drugs. The studies had to contain outcome data of interest including the rate of myocardial infarction (MI) or death from all CV causes. Out of 116 studies surveyed by the authors, 42 met their inclusion criteria and were included in the meta-analysis. Of the studies they included, 23 had durations of 26 weeks or less, and only five studies followed patients for more than a year. Until this point, the study’s authors were following a path similar to that of any reviewer interested in CV outcomes, examining the results of these 42 studies and comparing them qualitatively. Quantitatively combining the data, however, required the authors to make choices about the studies they could merge and the statistical methods they should apply for analysis. Those decisions greatly influenced the results that were reported.

What was found?

When the studies were combined, the meta-analysis contained data from 15,565 patients in the rosiglitazone group and 12,282 patients as comparators. Analyzing their data, the authors chose one particular statistical method (the Peto odds ratio method, a fixed-effect statistical approach), which calculates the odds of events occurring where the outcomes of interest are rare and small in number. In comparing rosiglitazone with a “control” group that included other drugs or placebo, the authors reported odds ratios of 1.43 (95% CI, 1.03-1.98; P=0.03) and 1.64 (95% CI, 0.98-2.74; P=0.06) for MI and death from CV causes, respectively. In other words, the odds of an MI or death from a CV cause are higher for rosiglitazone patients than for patients on other therapies or placebo. The authors reported that rosiglitazone was significantly associated with an increase in the risk of MI and had borderline significance in increasing the risk of death from all CV causes. These findings appeared online on the same day that the FDA issued a safety alert regarding rosiglitazone. Discussion of the meta-analysis was immediately featured prominently in the news media. By December 2007, prescription claims for the drug at retail pharmacies had fallen by more than 50%.
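
For readers unfamiliar with the Peto method, the following is a minimal sketch of the standard one-step Peto pooled odds ratio. The function name peto_or and all trial counts are hypothetical and chosen only for illustration; they are not the data analyzed by Nissen and Wolski.

```r
# Peto one-step pooled odds ratio (fixed-effect).
# ev_t, ev_c: events in the treatment and control arms; n_t, n_c: arm sizes.
peto_or <- function(ev_t, n_t, ev_c, n_c) {
  N <- n_t + n_c
  m <- ev_t + ev_c                                # total events per trial
  E <- n_t * m / N                                # expected treatment-arm events
  V <- n_t * n_c * m * (N - m) / (N^2 * (N - 1))  # hypergeometric variance
  exp(sum(ev_t - E) / sum(V))                     # pooled odds ratio
}

# Hypothetical trials (made-up counts); the third trial has zero events.
ev_t <- c(10, 3, 0); n_t <- c(100, 200, 150)
ev_c <- c(5, 1, 0);  n_c <- c(100, 100, 150)

or_all     <- peto_or(ev_t, n_t, ev_c, n_c)
or_no_zero <- peto_or(ev_t[1:2], n_t[1:2], ev_c[1:2], n_c[1:2])
cat("Peto OR, all trials:          ", round(or_all, 3), "\n")
cat("Peto OR, zero-event trial out:", round(or_no_zero, 3), "\n")
```

In this formulation a trial with no events in either arm has an observed-minus-expected difference of zero and a variance of zero, so it contributes nothing to the pooled estimate; the two printed values are therefore identical.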

As diabetic patients and their clinicians reacted to the news, a methodologic debate also ensued. This discussion included statistical issues pertaining to the conduct of the analysis, its implications for clinical care, and finally the FDA and drug manufacturer’s roles in overseeing and regulating rosiglitazone. The concern regarding treatment among patients with diabetes continues in the medical community today.

Was this the right answer?

Should the studies have been combined? Commentators faulted the authors for including several studies that were not originally intended to investigate diabetes, and for combining both placebo and drug therapy data into one comparator arm. Some critics noted that despite the stated inclusion criteria, some data were derived from studies where the rosiglitazone arm was allowed a longer follow-up than the comparator arm. By failing to account for this longer follow-up period, commentators felt that the authors may have overestimated the effect of rosiglitazone on CV outcomes. Many reviewers were concerned that this meta-analysis excluded trials in which no patients suffered an MI or died from CV causes – the outcomes of greatest interest. Some reviewers also noted that the exclusion of zero-event trials from the pooled dataset not only gave an incomplete picture of the impact of rosiglitazone but could have increased the odds ratio estimate. In general, the pooled dataset was criticized by many for being a faulty microcosm of the information available regarding rosiglitazone.

It is essential that a meta-analysis be based on similarity in the data sources. If studies differ in important areas such as the patient populations, interventions, or outcomes, combining their data may not be suitable. The researchers accepted studies and populations that were clinically heterogeneous, yet pooled them as if they were not. The study reported that the results were combined from a number of trials that were not initially intended to investigate CV outcomes. Furthermore, the available data did not allow for time-to-event analysis, an essential tool in comparing the impact of alternative treatment options. Reviewers considered the data to be insufficiently homogeneous, and the line of cause and effect to be murkier than the authors described.

Were the statistical methods optimal?

The statistical methods for this meta-analysis also came under significant criticism. The critiques focused on the authors’ use of the Peto method as being an incorrect choice because data were pooled from both small and very large studies, resulting in a potential overestimation of treatment effect. Other reviewers pointed out that the Peto method should not have been used, as a number of the underlying studies did not have patients assigned equally to rosiglitazone and comparator groups. Finally, critics suggested that the heterogeneity of the included studies required an altogether different set of analytic techniques.

Demonstrating the sensitivity of the authors’ initial analysis to the inclusion criteria and statistical tests used, a number of researchers reworked the data from this study. One researcher used the same studies but analyzed the data with a more commonly used statistical method (Mantel-Haenszel), and found no significant increase in the relative risk or common odds ratio with MI or CV death. When the pool of studies was expanded to include those originally eliminated because they had zero CV events, the odds ratios for MI and death from CV causes dropped from 1.43 to 1.26 (95% CI, 0.93-1.72) and from 1.64 to 1.14 (95% CI, 0.74-1.74), respectively. Neither of the recalculated odds ratios was significant for MI or CV death. Finally, several newer long-term studies have been published since the Nissen meta-analysis. Incorporating their results with the meta-analysis data showed that rosiglitazone is associated with an increased risk of MI but not of CV death. Thus, the findings from these meta-analyses varied with the methods employed, the studies included, and the addition of later trials.
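
The sensitivity described above can be illustrated with a small sketch of the Mantel-Haenszel pooled odds ratio. The function mh_or, its cc argument, and the trial counts are hypothetical. Adding 0.5 to each cell of a zero-event trial is one common continuity-correction convention and is an assumption here; it is not necessarily the approach taken in the published re-analyses.

```r
# Mantel-Haenszel pooled odds ratio across 2x2 tables.
# cc: continuity correction added to every cell of zero-event trials only.
mh_or <- function(ev_t, n_t, ev_c, n_c, cc = 0) {
  zero <- (ev_t + ev_c) == 0
  add  <- ifelse(zero, cc, 0)
  a  <- ev_t + add;  b <- n_t - ev_t + add   # treatment: events, non-events
  c0 <- ev_c + add;  d <- n_c - ev_c + add   # control:   events, non-events
  N  <- a + b + c0 + d
  sum(a * d / N) / sum(b * c0 / N)
}

# Hypothetical trials (made-up counts); the third trial has zero events.
ev_t <- c(10, 3, 0); n_t <- c(100, 200, 150)
ev_c <- c(5, 1, 0);  n_c <- c(100, 100, 150)

or_plain <- mh_or(ev_t, n_t, ev_c, n_c)            # zero-event trial adds nothing
or_cc    <- mh_or(ev_t, n_t, ev_c, n_c, cc = 0.5)  # zero-event trial now counts
cat("MH OR, no correction: ", round(or_plain, 3), "\n")
cat("MH OR, 0.5 correction:", round(or_cc, 3), "\n")
```

With these made-up counts, bringing the zero-event trial into the calculation pulls the pooled odds ratio toward 1, which mirrors the direction of the change reported when zero-event trials were added back to the rosiglitazone dataset.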

Emerging Data

The controversy surrounding the rosiglitazone meta-analysis authored by Nissen and Wolski forced an unplanned interim analysis of a long-term, randomized trial investigating the CV effects of rosiglitazone among patients with type 2 diabetes. The authors of the RECORD trial noted that even though the follow-up at 3.75 years was shorter than expected, rosiglitazone, when added to standard glucose-lowering therapy, was found to be associated with an increase in the risk of heart failure but was not associated with any increase in death from CV or other causes. Data at the time were found to be insufficient to determine the effect of rosiglitazone on the risk of MI. The final report of that trial, published in June 2009, confirmed the elevated risk of heart failure in people with type 2 diabetes treated with rosiglitazone in addition to glucose-lowering drugs, but continued to show inconclusive results about the effect of the drug therapy on the risk of MI. Further, the RECORD trial clarified that rosiglitazone does not result in an increased risk of CV morbidity or mortality compared to standard glucose-lowering drugs. Other trials conducted since the publishing of the meta-analysis have corroborated these results, casting further doubt on the findings of the meta-analysis published by Nissen and Wolski.

Now what?

Some sources suggest that the original Nissen meta-analysis delivered more harm than benefit, and that a well-recognized medical journal may have erred in its process of peer review. Despite this criticism, it is important to note that subsequent publications support the risk of adverse CV events associated with rosiglitazone, although rosiglitazone use does not appear to increase deaths. These results and emerging data point to the need for further rigorous research to clarify the benefits and risks of rosiglitazone on a variety of outcomes, and the importance of directing the drug to the population that will maximally benefit from its use.

Lessons Learned From this Case Study

Results from initial randomized trials that seem definitive at one time may not be conclusive, as further trials may emerge to clarify, redirect, or negate previously accepted results. A meta-analysis of those trials can lead to varying results based upon the timing of the analysis and the choices made in its performance.

Meta-Analysis: Tips for CER Practitioners

*The results of a meta-analysis are highly dependent on the studies included (and excluded). Are these criteria properly defined and relevant to the purposes of the meta-analysis? Were the combined studies sufficiently similar? Can results from this cohort be generalized to other populations of interest?

*The statistical methodology can impact study results. Have there been reviews critiquing the methods used in the meta-analysis?

*A variety of statistical tests should be considered, and perhaps reported, in the analysis of results. Do the authors mention their rationale in choosing a statistical method? Do they show the stability of their results across a spectrum of analytical methods?

*Nothing is permanent. Emerging data may change the playing field, and meta-analysis results are only as good as the data and statistics from which they are derived.

===Case-Study 3: The Nurses’ Health Study===

An observational study

An observational study is a very common type of research design in which the effects of a treatment or condition are studied without formally randomizing patients in an experimental design. Such studies can be done prospectively, wherein data are collected about a group of patients going forward in time; or retrospectively, in which the researcher looks into the past, mining existing databases for data that have already been collected. Retrospective studies are frequently performed by using an electronic database that contains, for example, administrative, “billing,” or claims data. Less commonly, observational research uses electronic health records, which have greater clinical information that more closely resembles the data collected in an RCT. Observational studies often take place in “real-world” environments, which allow researchers to collect data for a wide array of outcomes. Patients are not randomized in these studies, but the findings can be used to generate hypotheses for investigation in a more constrained experimental setting. Perhaps the best known observational study is the “Framingham study,” which collected demographic and health data for a group of individuals over many years (and continues to do so) and has provided an understanding of the key risk factors for heart disease and stroke.

Observational studies present many advantages to the comparative effectiveness researcher. The study design can provide a unique glimpse of the use of a health care intervention in the “real world,” an essential step in gauging the gap between efficacy (can a treatment work in a controlled setting?) and effectiveness (does the treatment work in a real-life situation?). Furthermore, observational studies can be conducted at low cost, particularly if they involve the secondary analysis of existing data sources. CER often uses administrative databases, which are based upon the billing data submitted by providers during routine care. These databases typically have limited clinical information, may have errors in them, and generally do not undergo auditing.

The uncontrolled nature of observational studies allows them to be subject to bias and confounding. For example, doctors may prescribe a new medication only for the sickest patients. Comparing these outcomes (without careful statistical adjustment) with those from less ill patients receiving alternative treatment may lead to misleading results. Observational studies can identify important associations but cannot prove cause and effect. These studies can generate hypotheses that may require RCTs for fuller demonstration of those relationships. Secondary analysis can also be problematic if researchers overwork datasets by doing multiple exploratory analyses (e.g., data-dredging): the more we look, the more we find, even if those findings are merely statistical aberrations. Unfortunately, the growing need for CER and the wide availability of administrative databases may lead to selection of research of poor quality with inaccurate findings.
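The data-dredging risk described above can be made concrete with a small calculation. The sketch below computes the chance of at least one spurious "significant" finding when several true null hypotheses are each tested at the same level. The independence of the tests and the 0.05 threshold are illustrative assumptions, not values taken from the studies discussed here.

```r
# Probability of at least one false positive among k independent tests of
# true null hypotheses, each run at significance level alpha.
# Both independence and alpha = 0.05 are illustrative assumptions.
fwer <- function(k, alpha = 0.05) 1 - (1 - alpha)^k

k <- c(1, 5, 10, 20, 50)
data.frame(tests = k, prob_any_false_positive = round(fwer(k), 3))
```

The probability grows quickly with the number of exploratory analyses, which is why findings from repeated secondary analyses of the same dataset call for adjustment or independent confirmation.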

In comparative effectiveness research, observational studies are typically considered to be less conclusive than RCTs and meta-analyses. Nonetheless, they can be useful, especially because they examine typical care. Due to lower cost and improvements in health information, observational studies will become increasingly common. Critical assessment of whether the described results are helpful or biased (based upon how the study was performed) is necessary. This case will illustrate several characteristics of the types of studies that will assist in evaluating newly published work.

Clinical Applications

Cardiovascular diseases (CVD) are the leading cause of death in women older than the age of 50. Epidemiologic evidence suggests that estrogen is a key mediator in the development of CVD. Estrogen is an ovarian hormone whose production decreases as women approach menopause. The steep increase in CVD in women at menopause and older and in women who have had hysterectomies further supports a relationship between estrogen and CVD. Building on this evidence of biologic plausibility, epidemiological and observational studies suggested that estrogen replacement therapy (a form of hormone replacement therapy, or HRT) had positive effects on the risk of CVD in postmenopausal women (albeit with some negative effects in its potential to increase the risk for breast cancer and stroke). Based on these findings, in the 1980s and 1990s HRT was routinely employed to treat menopausal symptoms and serve as prophylaxis against CVD.

What was done?

The Nurses’ Health Study (NHS) began collecting data in 1976. In the study, researchers intended to examine a broad range of health effects in women over a long period of time, and a key goal was to clarify the role of HRT in heart disease. The cohort (i.e., the group being followed) included married registered nurses aged 30-55 in 1976 who lived in the 11 most populous states. To collect data, the researchers mailed the study participants a survey every 2 years that asked questions about topics such as smoking, hormone use, menopausal status, and less frequently, diet. Data were collected for key end points that included MI, coronary-artery bypass grafting or angioplasty, stroke, total CVD mortality, and deaths from all causes.

What was found?

At a 10-year follow-up point, the NHS had a study pool of 48,470 women. The researchers found that estrogen use (alone, without progestin) in postmenopausal women was associated with a reduction in the incidence of CVD as well as in CVD mortality compared to non-users. Later, estrogen-progestin combination therapy was shown to be even more cardioprotective than estrogen monotherapy, and lower doses of estrogen replacement therapy were found to deliver equal cardioprotection and lower the risk for adverse events. NHS researchers were alert to the potential for bias in observational studies. Adjustment for risk factors such as age (a typical practice to eliminate confounding) did not change the reported findings.

Was this the right answer?

The NHS was not unique in reporting the benefits associated with HRT; other observational studies corroborated the NHS findings. A secondary retrospective data analysis of the UK primary care electronic medical record database, for example, also showed the protective effect associated with HRT use. Researchers were aware of the fundamental limitations of observational studies, particularly with regard to selection bias. They and practicing clinicians were also aware of the potential negative health effects of HRT, which had to be constantly weighed against the potential cardioprotective benefits in deciding a patient’s course of treatment. As a large section of the population could experience the health effects of HRT, researchers began planning RCTs to verify the promising observational study results. It was highly anticipated that those RCTs would corroborate the belief that estrogen replacement can reduce CVD risk.

Randomized Controlled Trial: The Women’s Health Initiative

The Women’s Health Initiative (WHI) was a major study established by the National Institutes of Health in 1992 to assess a broad range of health effects in postmenopausal women. The trial was intended to follow these women for 8 years, at a cost of millions of dollars in federal funding. Among its many facets, it included an RCT to confirm the results from the observational studies discussed above. To fully investigate earlier findings, the WHI had two subgroups. One subgroup consisted of women with prior hysterectomies; they received estrogen monotherapy. The second group consisted of women who had not undergone hysterectomy; they received estrogen in combination with progestin. The WHI enrolled 27,347 women in their HRT investigation: 10,739 in the estrogen-alone arm and 16,608 in the estrogen plus progestin arm. Within each arm, women were randomly assigned to receive either HRT or placebo. All women in the trial were postmenopausal and aged 50-79 years; the mean age was 63.6 years (a fact that would be important in later analysis). Some participants had experienced previous CV events. The primary outcome of both subgroups was coronary heart disease (CHD), as described by nonfatal MI or death due to CHD.

The estrogen-progestin arm of the WHI was halted after a mean follow-up of 5.2 years, 3 years earlier than expected, as the HRT users in this arm were found to be at increased risk for CHD compared to those who received placebo. The study also noted elevated rates of breast cancer and stroke, among other poor outcomes. The estrogen-alone arm continued for an average follow-up of 6.8 years before being similarly discontinued ahead of schedule. Although this part of the study did not find an increased risk of CHD, it also did not find any cardioprotective effect. Beyond failing to locate any clear CV benefits, the WHI also found real evidence of harm, including increased risk of blood clots, breast cancer and stroke. Initial WHI publications therefore recommended against HRT being prescribed for the secondary prevention of CVD.

What Next?

Scientists, and the clinicians who relied on their data for guidance in treating patients, were faced with conflicting data: epidemiological and observational studies suggested that HRT was cardioprotective while the higher-quality evidence from RCTs strongly suggested the opposite. Clinicians primarily followed the WHI results, so prescriptions for HRT in postmenopausal women quickly declined. Meanwhile, researchers began to analyze the studies for potential discrepancies, and found that the women being followed in the NHS and the WHI differed in several important characteristics.

First, the WHI population was older than the NHS cohort, and many had entered menopause at least 10 years before they enrolled in the RCT. Thus, the WHI enrollees experienced a long duration from the onset of menopause to the commencement of HRT. At the same time, many in the NHS population were closer to the onset of menopause and were still displaying hormonal symptoms when they began HRT. Second, although the NHS researchers adjusted the data for various confounding effects, their results could still have been subject to bias. In general, the NHS cohort was more highly educated and of a higher socioeconomic status than the WHI participants, and therefore more likely to see a physician regularly. The NHS women were also leaner and generally healthier than their RCT counterparts, and had been selected for their evident lack of pre-existing CV conditions. This selection bias in the NHS enrollment may have led to a “healthy woman” effect that in turn led to an overestimation of the benefits of therapy in the observational study. Third, researchers noted that dosing differences between the two study types may have contributed to the divergent results. The NHS reported beneficial results following low-dose estrogen therapy. The WHI, meanwhile, used a higher estrogen dose, exposing women to a larger dosage of hormones and increasing their risk for adverse events. The increased risk profile of the WHI women (e.g., older, more comorbidities, higher estrogen dose) could have contributed to the evidence of harm seen in the WHI results.

Emerging Data
In addition to identifying the inherent differences between the two study populations, researchers began a secondary analysis of the NHS and WHI trials. NHS researchers reported that women who began HRT close to the onset of menopause had a significantly reduced risk of CHD. In the subgroups of women that were older and had a similar duration after menopause compared with the WHI women, they found no significant relationship between HRT and CHD. Also, the WHI study further stratified these results by age, and found that women who began HRT close to their onset of menopause experienced some cardioprotection, while women who were further from the onset of menopause had a slightly elevated risk for CHD.

Secondary analysis of both studies was therefore necessary to show that age and a short duration from the onset of menopause are crucial to HRT success as a cardioprotective agent. Neither study type provided “truth” or rather, both studies provided “truth” if viewed carefully (e.g., both produced valid and important results). The differences seen in the studies were rooted in the timing of HRT and the populations being studied.

Lessons Learned From This Case Study

Although RCTs are given a higher evidence grade, observational studies provide important clinical insights. In this example, the study populations differed. For policymakers and clinicians, it is crucial to examine whether the CER was based upon patients similar to those being considered. Any study with a dissimilar population may provide non-relevant results. Thus, readers of CER need to carefully examine the generalizability of the findings being reported.

==Appendix==

The general Classification and Regression Tree (CART) data analysis steps below use the R package rpart.

===Growing the Tree===

# To grow a tree, use
rpart(formula, data=, method=,control=), where
formula is in the format outcome ~ predictor1+predictor2+...
data= specifies the data frame
method= "class" for a classification tree, use "anova" for a regression tree
control= optional parameters for controlling tree growth. For example, control=rpart.control(minsplit=30, cp=0.001) requires that the minimum number of observations in a node be 30 before attempting a split and that a split must decrease the overall lack of fit by a factor of 0.001 (cost complexity factor) before being attempted.
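As a minimal sketch of the call described above, the following example grows a classification tree. It assumes the kyphosis example data frame that ships with rpart; the choice of dataset and predictors is for illustration only and is not part of the case studies.

```r
library(rpart)

# kyphosis is an example data frame bundled with rpart:
# Kyphosis is a factor outcome; Age, Number and Start are predictors.
fit <- rpart(Kyphosis ~ Age + Number + Start,
             data    = kyphosis,
             method  = "class",   # classification tree
             control = rpart.control(minsplit = 30, cp = 0.001))

print(fit)
```

For a continuous outcome, the same call with method = "anova" grows a regression tree instead.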

===Examining Results===

# These functions help with examining the results.
printcp(fit) display complexity parameter (cp) table
plotcp(fit) plot cross-validation results
rsq.rpart(fit) plot approximate R-squared and relative error for different splits (2 plots). Labels are only appropriate for the "anova" method.
print(fit) print results
summary(fit) detailed results including surrogate splits
plot(fit) plot decision tree
text(fit) label the decision tree plot
post(fit, file=) create postscript plot of decision tree
# In trees created by rpart(), move to the LEFT branch when the stated condition is true.
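The inspection functions listed above might be used together as in the sketch below. It again assumes the bundled kyphosis data; the seed and the plotting arguments (uniform, margin, use.n, cex) are illustrative choices.

```r
library(rpart)

set.seed(1)  # the cross-validation folds used for xerror are random
fit <- rpart(Kyphosis ~ Age + Number + Start,
             data = kyphosis, method = "class")

printcp(fit)                             # complexity parameter (cp) table
plotcp(fit)                              # cross-validated error versus cp
summary(fit)                             # node details, including surrogate splits
plot(fit, uniform = TRUE, margin = 0.1)  # draw the tree skeleton
text(fit, use.n = TRUE, cex = 0.8)       # add split labels and node counts
```

rsq.rpart() is omitted here because it applies to regression ("anova") trees, not to this classification fit.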

===Pruning Trees===

#In general, trees should be pruned back to avoid overfitting the data. The tree size should minimize the cross-validated error (the xerror column printed by printcp()). Pruning the tree is accomplished by:
prune(fit, cp= )
# use printcp( ) to examine the cross-validation error results, select the complexity parameter (CP) associated with minimum error, and insert it into the prune() function. This (automatically selecting the complexity parameter associated with the smallest cross-validated error) can be done succinctly by:
fit$\$$cptable[which.min(fit$\$$cptable[,"xerror"]),"CP"]
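Putting the pruning steps together, a minimal sketch might look as follows. It assumes the bundled kyphosis data and a deliberately small cp so that the initial tree is large enough to prune; both are illustrative choices.

```r
library(rpart)

set.seed(1)  # cross-validated errors depend on random fold assignment
fit <- rpart(Kyphosis ~ Age + Number + Start,
             data = kyphosis, method = "class",
             control = rpart.control(cp = 0.001))

# Complexity parameter with the smallest cross-validated error (xerror)
best_cp <- fit$cptable[which.min(fit$cptable[, "xerror"]), "CP"]

# Prune the tree back to that complexity and inspect the result
pruned <- prune(fit, cp = best_cp)
printcp(pruned)
```

Because the cross-validation is random, the selected cp can change from run to run unless a seed is set.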

===Complete Dataset for N-of-1 Example===
[[SMHS_MethodsHeterogeneity_CER_Nof1|This N-of-1 Dataset]] includes an example.

===Footnotes===

*13 Based on 2009 NPC report, http://www.npcnow.org/publication/demystifying-comparative-effectiveness-research-case-study-learning-guide
*14 http://www.cancer.gov/cancertopics/druginfo/fda-cetuximab

===[[SMHS_MethodsHeterogeneity|Back to the Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research section]]===

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php/SMHS_MethodsHeterogeneity_CER}}

SMHS MethodsHeterogeneity CER

2016-05-23T18:58:39Z

Pineaumi: /* Compete Dataset for N-of-1 Example */

==[[SMHS_MethodsHeterogeneity| Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research]] - Comparative Effectiveness Research: Case Studies 13 (CER) ==

===Observational Studies: Tips for the CER Practitioners===

*Different study types can offer different understandings; neither should be discounted without closer examination.

*RCTs provide an accurate understanding of the effect of a particular intervention in a well-defined patient group under “controlled” circumstances.

*Observational studies provide an understanding of real-world care and its impact, but can be biased due to uncontrolled factors.

*Observational studies differ in the types of databases used. These databases may lack clinical detail and contain incomplete or inaccurate data.

*Before accepting the findings from an observational study, consider whether confounding factors may have influenced the results.

*In this scenario, subgroup analysis was vital in clarifying both study designs; what is true for the many (e.g., overall, estrogen appeared to be detrimental) may not be true for the few (e.g., that for the younger post-menopausal woman, the benefits were greater and the harms less frequent).

*Carefully examine the generalizability of the study. Do the study’s patients and intervention match those under consideration?

*Observational studies can identify associations but cannot prove cause-and-effect relationships.

===Case-Study 1: The Cetuximab Study14===

What was done and what was found?

Cetuximab, an anti-epidermal growth factor receptor (EGFR) agent, has recently been added to the therapeutic armamentarium. Two important RCTs examined its impact in patients with mCRC (metastatic colorectal cancer). In the first one, 56 centers in 11 European countries investigated the outcomes associated with cetuximab therapy in 329 mCRC patients who experienced disease progression either on irinotecan therapy or within 3 months thereafter. The study reported that the group on a combination of irinotecan and cetuximab had a significantly higher rate of overall response to treatment (primary endpoint) than the group on cetuximab alone: 22.9% (95% CI, 17.5-29.1%) vs. 10.8% (95% CI, 5.7-18.1%) (P=0.007), respectively. Similarly, the median time to progression was significantly longer in the combination therapy group (4.1 vs. 1.5 months, P<0.001). As these patients had already progressed on irinotecan prior to the study, any response was viewed as positive. Safety between the two treatment arms was similar: approximately 80% of patients in each arm experienced a rash. Grade 3 or 4 (the more severe) toxic effects on the skin were slightly more frequent in the combination-therapy group compared to cetuximab monotherapy, observed in 9.4% and 5.2% of participants, respectively. Other side effects, such as diarrhea and neutropenia observed in the combination-therapy arm, were considered to be in the range expected for irinotecan alone. Data from this study demonstrated the efficacy and safety of cetuximab and were instrumental in the FDA’s 2004 approval.

A second RCT (2007) examined 572 patients and suggested efficacy of cetuximab in the treatment of mCRC. This study was a randomized, non-blinded, controlled trial that examined cetuximab monotherapy plus best supportive care compared to best supportive care alone in patients who had received and failed prior chemotherapy regimens. It reported that median overall survival (the primary endpoint) was significantly higher in patients receiving cetuximab plus best supportive care compared to best supportive care alone (6.1 vs. 4.6 months, respectively) (hazard ratio for death=0.77; 95% CI: 0.64-0.92, P=0.005). This RCT described a greater incidence of adverse events in the cetuximab plus best supportive care group compared to best supportive care alone including (most significantly) rash, as well as edema, fatigue, nausea and vomiting.

Was this the right answer?

These RCTs had fairly broad enrollment criteria and the cetuximab benefits were modest. Emerging scientific theories raised the possibility that genetically defined population subsets might experience a greater-than-average treatment benefit. One such area of inquiry entailed examining “biomarkers,” or genetic indicators of a patient’s greater response to therapy. Even as the above RCTs were being conducted, data emerged showing the importance of the KRAS gene.

Emerging Data

Based on the emerging biochemical evidence that the epidermal growth factor receptor (EGFR) treatment mechanism (cetuximab) was even more finely detailed than previously understood, the study authors of the 2007 RCT undertook a retrospective subgroup analysis using tumor tissue samples preserved from their initial study. Following laboratory analysis, all viable tissue samples were classified as having a wild-type (non-mutated) or a mutated KRAS gene. Instead of the previous two study arms (cetuximab plus best supportive care vs. best supportive care alone), there were 4 for this new analysis: each of the two original study arms was further divided by wild-type vs. mutated KRAS status. Laboratory evaluation determined that 40.9% of patients in the cetuximab plus best supportive care group and 42.3% of patients in the best supportive care alone group had a KRAS mutation. The efficacy of cetuximab was found to be significantly correlated with KRAS status: in patients with wild-type (non-mutated) KRAS genes, cetuximab plus best supportive care compared to best supportive care alone improved overall survival (median 9.5 vs. 4.8 months, respectively; hazard ratio for death=0.55; 95% CI, 0.41-0.74, P<0.001), and progression-free survival (median 3.7 vs. 1.9 months, respectively; hazard ratio for progression or death=0.40; 95% CI, 0.30-0.54, P<0.001). Meanwhile, in patients with mutated KRAS tumors, the authors found no significant difference in outcome between cetuximab plus best supportive care vs. best supportive care alone.

What next?

Based on these and similar results from other studies, the FDA narrowed its product labeling in July 2009 to indicate that cetuximab is not recommended for mCRC patients with mutated KRAS tumors. This distinction reduces the relevant population by approximately 40%. Similarly, the American Society of Clinical Oncology released a provisional clinical recommendation that all mCRC patients have their tumors tested for KRAS status before receiving anti-EGFR therapy. The benefits of targeted treatment are many. Patients who previously underwent cetuximab therapy without knowing their genetic predisposition would no longer have to be exposed to the drug’s toxic effects if unnecessary, as the efficacy of cetuximab is markedly higher in the genetically defined appropriate patients. In a less-uncertain environment, clinicians can be more confident in advocating a course of action in their care of patients. And finally, knowledge that targeted therapy is possible suggests the potential for further innovation in treatment options. In fact, research continues to demonstrate options for targeted cetuximab treatment of mCRC at an even finer scale than seen with KRAS; and similar genetic targeting is being investigated, and advocated, in other cancer types.

Lessons Learned From This Case Study

Although RCTs are generally viewed as the gold standard, results of one or even a series of trials may not accurately reflect the benefits experienced by an individual patient. This case-study suggests that cetuximab initially appeared to have rather modest clinical benefits. However, new information that became available and subsequent genetic subgroup assessments led to very different conclusions. Clinicians should be aware that the current knowledge is likely to evolve and any decisions about patient care should be carefully considered with that sense of uncertainty in mind. As in this case study, subgroup analyses (e.g., genetic subtypes) need a theoretical rationale. Ideally, the analyses should be determined at the time of original RCT design and should not just occur as explorations of the subsequent data. When improperly employed, post hoc analyses may lead to incorrect patient care conclusions.

RCTs Tips for the CER Practitioners

*RCTs can determine whether an intervention can provide benefit in a very controlled environment.

*The controlled nature of an RCT may limit its generalizability to a broader population.

*No results are permanent; advances in scientific knowledge and understanding can influence how we view the effectiveness (or safety) of a therapeutic intervention.

*Targeted therapy illuminated by carefully thought out subgroup analyses can improve the efficacious and safe use of an intervention.

===Case-Study 2: The Rosiglitazone Study===

Meta-analysis

Often the results for the same intervention differ across clinical trials and it may not be clear whether one therapy provides more benefit than another. As CER increases and more studies are conducted, clinicians and policymakers are more likely to encounter this scenario. In a systematic review, a researcher identifies similar studies and displays their results in a table, enabling qualitative comparisons across the studies. With a meta-analysis, the data from included studies are statistically combined into a single “result.” Merging the data from a number of studies increases the effective sample size of the investigation, providing a statistically stronger conclusion about the body of research. By so doing, investigators may detect low frequency events and demonstrate more subtle distinctions between therapeutic alternatives.

When studies have been properly identified and combined, the meta-analysis produces a summary estimate of the findings and a confidence interval that can serve as a benchmark in medical opinion and practice. However, when done incorrectly, the quantitative and statistical analysis can create impressive “numbers” but biased results. The following are important criteria for properly conducted meta-analyses:

1. Carefully defining unbiased inclusion or exclusion criteria for study selection

2. Including only those studies that have similar design elements, such as patient population, drug regimen, outcomes being assessed, and time-frame

3. Applying correct statistical methods to combine and analyze the data

Reporting this information is essential for the reader to determine whether the data were suitable to combine, and if the meta-analysis draws unbiased conclusions. Meta-analyses of randomized clinical trials are considered to be the highest level of medical evidence as they are based upon a synthesis of rigorously controlled trials that systematically reduce bias and confounding. This technique is useful in summarizing available evidence and will likely become more common in the era of publicly funded comparative effectiveness research. The following case study will examine several key principles that will be useful as the reader encounters these publications.

Clinical Application

Heart disease is the leading cause of mortality in the United States, resulting in approximately 20% of all deaths. Diabetics are particularly susceptible to heart disease, with more than 65% of deaths attributable to it. The nonfatal complications of diabetes are wide-ranging and include kidney failure, nerve damage, amputation, stroke and blindness, among other outcomes. In 2007, the total estimated cost of diabetes in the United States was $174B; $116B was derived from direct medical expenditures and the rest from the indirect cost of lost productivity due to the disease. With such serious health effects and heavy direct and indirect costs tied to diabetes, proper disease management is critical. Historically, diabetes treatment has focused on strict blood sugar control, assuming that this goal not only targets diabetes but also reduces other serious comorbidities of the disease.

Anti-diabetic agents have long been associated with key questions as to their benefits/risks in the treatment of diabetes. The sulfonylurea tolbutamide, a first generation anti-diabetic drug, was found in a landmark study in the 1970s to significantly increase the CV mortality rate compared to patients not on this agent. Further analysis by external parties concluded that the methods employed in this trial were significantly flawed (e.g., use of an “arbitrary” definition of diabetes status, heterogeneous baseline characteristics of the populations studied, and incorrect statistical methods). Since these early studies, CV concerns continue to be an issue with selected oral hypoglycemic agents that have subsequently entered the marketplace.

A class of drugs, thiazolidinedione (TZD), was approved in the late 1990s, as a solution to the problems associated with the older generation of sulfonylureas. Rosiglitazone, a member of the TZD class, was approved by the FDA in 1999 and was widely prescribed for the treatment of type-2 diabetes. A number of RCTs supported the benefit of rosiglitazone as an important new oral antidiabetic agent. However, safety concerns developed as the FDA received reports of adverse cardiac events potentially associated with rosiglitazone. It was in this setting that a meta-analysis by Nissen and Wolski was published in the New England Journal of Medicine in June 2007.

What was done?

Nissen and Wolski conducted a meta-analysis examining the impact of rosiglitazone on cardiac events and mortality compared to alternative therapeutic approaches. The study began with a broad search to locate potential studies for review. The authors screened published phase II, III, and IV trials; the FDA website; and the drug manufacturer’s clinical-trial registry for applicable data relating to rosiglitazone use. When the initial search was complete, the studies were further categorized by pre-stated inclusion criteria. Meta-analysis inclusion criteria were simple: studies had to include rosiglitazone and a randomized comparator group treated with either another drug or placebo, study arms had to show similar length of treatment, and all groups had to have received more than 24 weeks of exposure to the study drugs. The studies had to contain outcome data of interest including the rate of myocardial infarction (MI) or death from all CV causes. Out of 116 studies surveyed by the authors, 42 met their inclusion criteria and were included in the meta-analysis. Of the studies they included, 23 had durations of 26 weeks or less, and only five studies followed patients for more than a year. Until this point, the study’s authors were following a path similar to that of any reviewer interested in CV outcomes, examining the results of these 42 studies and comparing them qualitatively. Quantitatively combining the data, however, required the authors to make choices about the studies they could merge and the statistical methods they should apply for analysis. Those decisions greatly influenced the results that were reported.

What was found?

When the studies were combined, the meta-analysis contained data from 15,565 patients in the rosiglitazone group and 12,282 patients as comparators. Analyzing their data, the authors chose one particular statistical method (the Peto odds ratio method, a fixed-effect statistical approach), which calculates the odds of events occurring where the outcomes of interest are rare and small in number. In comparing rosiglitazone with a “control” group that included other drugs or placebo, the authors reported odds ratios of 1.43 (95% CI, 1.03-1.98; P=0.03) and 1.64 (95% CI, 0.98-2.74; P=0.06) for MI and death from CV causes, respectively. In other words, the odds of an MI or death from a CV cause are higher for rosiglitazone patients than for patients on other therapies or placebo. The authors reported that rosiglitazone was significantly associated with an increase in the risk of MI and had borderline significance in increasing the risk of death from all CV causes. These findings appeared online on the same day that the FDA issued a safety alert regarding rosiglitazone. Discussion of the meta-analysis was immediately featured prominently in the news media. By December 2007, prescription claims for the drug at retail pharmacies had fallen by more than 50%.
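For readers unfamiliar with the Peto approach named above, the sketch below shows the standard one-step Peto pooled odds ratio in base R. The trial counts and the column names are hypothetical and chosen only for illustration; they are not the Nissen and Wolski data, and this is not a reconstruction of their analysis.

```r
# Hypothetical 2x2 summaries, one row per trial (NOT the rosiglitazone data).
trials <- data.frame(
  events_trt = c(2, 1, 5, 3),
  n_trt      = c(200, 150, 1000, 400),
  events_ctl = c(1, 0, 3, 1),
  n_ctl      = c(100, 150, 1000, 200)
)

# One-step Peto pooled odds ratio with an approximate 95% confidence interval.
peto_or <- function(d) {
  N <- d$n_trt + d$n_ctl            # total patients in each trial
  m <- d$events_trt + d$events_ctl  # total events in each trial
  E <- d$n_trt * m / N              # expected treated-arm events under no effect
  V <- d$n_trt * d$n_ctl * m * (N - m) / (N^2 * (N - 1))  # hypergeometric variance
  log_or <- sum(d$events_trt - E) / sum(V)
  se     <- 1 / sqrt(sum(V))
  c(or    = exp(log_or),
    lower = exp(log_or - 1.96 * se),
    upper = exp(log_or + 1.96 * se))
}

peto_or(trials)
```

Note that a trial with no events in either arm has E = 0 and V = 0, so it adds nothing to either sum. Under this formula, zero-event trials cannot influence the pooled estimate.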

As diabetic patients and their clinicians reacted to the news, a methodologic debate also ensued. This discussion included statistical issues pertaining to the conduct of the analysis, its implications for clinical care, and finally the FDA and drug manufacturer’s roles in overseeing and regulating rosiglitazone. The concern among patients with diabetes regarding treatment continues in the medical community today.

Was this the right answer?

Should the studies have been combined? Commentators faulted the authors for including several studies that were not originally intended to investigate diabetes, and for combining both placebo and drug therapy data into one comparator arm. Some critics noted that despite the stated inclusion criteria, some data were derived from studies where the rosiglitazone arm was allowed a longer follow-up than the comparator arm. By failing to account for this longer follow-up period, commentators felt that the authors may have overestimated the effect of rosiglitazone on CV outcomes. Many reviewers were concerned that this meta-analysis excluded trials in which no patients suffered an MI or died from CV causes – the outcomes of greatest interest. Some reviewers also noted that the exclusion of zero-event trials from the pooled dataset not only gave an incomplete picture of the impact of rosiglitazone but could have increased the odds ratio estimate. In general, the pooled dataset was criticized by many for being a faulty microcosm of the information available regarding rosiglitazone.

It is essential that a meta-analysis be based on similarity in the data sources. If studies differ in important areas such as the patient populations, interventions, or outcomes, combining their data may not be suitable. The researchers accepted studies and populations that were clinically heterogeneous, yet pooled them as if they were not. The study reported that the results were combined from a number of trials that were not initially intended to investigate CV outcomes. Furthermore, the available data did not allow for time-to-event analysis, an essential tool in comparing the impact of alternative treatment options. Reviewers considered the data to be insufficiently homogeneous, and the line of cause and effect to be murkier than the authors described.

Were the statistical methods optimal?

The statistical methods for this meta-analysis also came under significant criticism. The critiques focused on the authors’ use of the Peto method as being an incorrect choice because data were pooled from both small and very large studies, resulting in a potential overestimation of treatment effect. Other reviewers pointed out that the Peto method should not have been used, as a number of the underlying studies did not have patients assigned equally to rosiglitazone and comparator groups. Finally, critics suggested that the heterogeneity of the included studies required an altogether different set of analytic techniques.

Demonstrating the sensitivity of the authors’ initial analysis to the inclusion criteria and statistical tests used, a number of researchers reworked the data from this study. One researcher used the same studies but analyzed the data with a more commonly used statistical method (Mantel-Haenszel), and found no significant increase in the relative risk or common odds ratio with MI or CV death. When the pool of studies was expanded to include those originally eliminated because they had zero CV events, the odds ratios for MI and death from CV causes dropped from 1.43 to 1.26 (95% CI, 0.93-1.72) and from 1.64 to 1.14 (95% CI, 0.74-1.74), respectively. Neither of the recalculated odds ratios was significant for MI or CV death. Finally, several newer long-term studies have been published since the Nissen meta-analysis. Incorporating their results with the meta-analysis data showed that rosiglitazone is associated with an increased risk of MI but not of CV death. Thus, the findings from these meta-analyses varied with the methods employed, the studies included, and the addition of later trials.
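The sensitivity to the pooling method and to zero-event trials can be illustrated with a minimal Python sketch. The 2x2 counts are hypothetical (they are not the rosiglitazone data), the function names are illustrative, and the 0.5 continuity correction is one common convention assumed here; the text does not state which correction the re-analyses used.

```python
import math

# Hypothetical trials (NOT the rosiglitazone data):
# (events_treated, n_treated, events_control, n_control)
TRIALS = [
    (6, 400, 3, 400),
    (12, 1000, 8, 1000),
    (0, 300, 0, 300),   # zero-event trial
    (0, 250, 0, 250),   # zero-event trial
]


def mantel_haenszel_or(trials):
    """Mantel-Haenszel pooled odds ratio across 2x2 tables."""
    num = den = 0.0
    for a, n1, c, n0 in trials:
        b, d, n = n1 - a, n0 - c, n1 + n0
        num += a * d / n
        den += b * c / n
    return num / den


def peto_or(trials):
    """Peto one-step pooled odds ratio: exp(sum(O - E) / sum(V))."""
    sum_o_minus_e = sum_v = 0.0
    for a, n1, c, n0 in trials:
        n, m = n1 + n0, a + c            # total patients, total events
        expected = n1 * m / n            # expected treated-arm events under no effect
        variance = n1 * n0 * m * (n - m) / (n ** 2 * (n - 1))
        sum_o_minus_e += a - expected
        sum_v += variance
    return math.exp(sum_o_minus_e / sum_v)


def continuity_corrected(trials, k=0.5):
    """Add k to every cell of any trial with a zero-event arm (assumed convention)."""
    out = []
    for a, n1, c, n0 in trials:
        if a == 0 or c == 0:
            out.append((a + k, n1 + 2 * k, c + k, n0 + 2 * k))
        else:
            out.append((a, n1, c, n0))
    return out


if __name__ == "__main__":
    print("Mantel-Haenszel OR:", round(mantel_haenszel_or(TRIALS), 3))
    print("Peto OR:", round(peto_or(TRIALS), 3))
    print("Mantel-Haenszel OR with 0.5 correction:",
          round(mantel_haenszel_or(continuity_corrected(TRIALS)), 3))
```

In this toy example the uncorrected estimates ignore the zero-event trials, because such trials add nothing to either sum. Once a correction gives them weight, the pooled odds ratio moves toward 1, which is the direction of change the re-analyses reported.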

Emerging Data

The controversy surrounding the rosiglitazone meta-analysis authored by Nissen and Wolski forced an unplanned interim analysis of a long-term, randomized trial investigating the CV effects of rosiglitazone among patients with type 2 diabetes. The authors of the RECORD trial noted that even though the follow-up at 3.75 years was shorter than expected, rosiglitazone, when added to standard glucose-lowering therapy, was found to be associated with an increase in the risk of heart failure but was not associated with any increase in death from CV or other causes. Data at the time were found to be insufficient to determine the effect of rosiglitazone on an increase in the risk of MI. The final report of that trial, published in June 2009, confirmed the elevated risk of heart failure in people with type 2 diabetes treated with rosiglitazone in addition to glucose-lowering drugs, but continued to show inconclusive results about the effect of the drug therapy on the risk of MI. Further, the RECORD trial clarified that rosiglitazone does not result in an increased risk of CV morbidity or mortality compared to standard glucose-lowering drugs. Other trials conducted since the publishing of the meta-analysis have corroborated these results, casting further doubt on the findings of the meta-analysis published by Nissen and Wolski.

Now what?

Some sources suggest that the original Nissen meta-analysis delivered more harm than benefit, and that a well-recognized medical journal may have erred in its process of peer review. Despite this criticism, it is important to note that subsequent publications support the risk of adverse CV events associated with rosiglitazone, although rosiglitazone use does not appear to increase deaths. These results and emerging data point to the need for further rigorous research to clarify the benefits and risks of rosiglitazone on a variety of outcomes, and the importance of directing the drug to the population that will maximally benefit from its use.

Lessons Learned From this Case Study

Results from initial randomized trials that seem definitive at one time may not be conclusive, as further trials may emerge to clarify, redirect, or negate previously accepted results. A meta-analysis of those trials can lead to varying results based upon the timing of the analysis and the choices made in its performance.

Meta-Analysis: Tips for CER Practitioners

*The results of a meta-analysis are highly dependent on the studies included (and excluded). Are these criteria properly defined and relevant to the purposes of the meta-analysis? Were the combined studies sufficiently similar? Can results from this cohort be generalized to other populations of interest?

*The statistical methodology can impact study results. Have there been reviews critiquing the methods used in the meta-analysis?

*A variety of statistical tests should be considered, and perhaps reported, in the analysis of results. Do the authors mention their rationale in choosing a statistical method? Do they show the stability of their results across a spectrum of analytical methods?

*Nothing is permanent. Emerging data may change the playing field, and meta-analysis results are only as good as the data and statistics from which they are derived.

===Case-Study 3: The Nurses’ Health Study===

An observational study

An observational study is a very common type of research design in which the effects of a treatment or condition are studied without formally randomizing patients in an experimental design. Such studies can be done prospectively, wherein data are collected about a group of patients going forward in time; or retrospectively, in which the researcher looks into the past, mining existing databases for data that have already been collected. Retrospective studies are frequently performed by using an electronic database that contains, for example, administrative, “billing,” or claims data. Less commonly, observational research uses electronic health records, which have greater clinical information that more closely resembles the data collected in an RCT. Observational studies often take place in “real-world” environments, which allow researchers to collect data for a wide array of outcomes. Patients are not randomized in these studies, but the findings can be used to generate hypotheses for investigation in a more constrained experimental setting. Perhaps the best-known observational study is the “Framingham study,” which collected demographic and health data for a group of individuals over many years (and continues to do so) and has provided an understanding of the key risk factors for heart disease and stroke.

Observational studies present many advantages to the comparative effectiveness researcher. The study design can provide a unique glimpse of the use of a health care intervention in the “real world,” an essential step in gauging the gap between efficacy (can a treatment work in a controlled setting?) and effectiveness (does the treatment work in a real-life situation?). Furthermore, observational studies can be conducted at low cost, particularly if they involve the secondary analysis of existing data sources. CER often uses administrative databases, which are based upon the billing data submitted by providers during routine care. These databases typically have limited clinical information, may have errors in them, and generally do not undergo auditing.

The uncontrolled nature of observational studies allows them to be subject to bias and confounding. For example, doctors may prescribe a new medication only for the sickest patients. Comparing these outcomes (without careful statistical adjustment) with those from less ill patients receiving alternative treatment may lead to misleading results. Observational studies can identify important associations but cannot prove cause and effect. These studies can generate hypotheses that may require RCTs for fuller demonstration of those relationships. Secondary analysis can also be problematic if researchers overwork datasets by doing multiple exploratory analyses (e.g., data-dredging): the more we look, the more we find, even if those findings are merely statistical aberrations. Unfortunately, the growing need for CER and the wide availability of administrative databases may lead to selection of research of poor quality with inaccurate findings.
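The prescribing example above (a new drug given mostly to the sickest patients) can be made concrete with a small Python sketch. All counts are hypothetical and chosen only for illustration; the function names are likewise illustrative. It compares a crude comparison with a severity-stratified one and with a directly standardized risk.

```python
# Hypothetical counts (illustrative only): stratum -> {drug: (deaths, patients)}
COHORT = {
    "severe": {"new": (270, 900), "old": (35, 100)},
    "mild":   {"new": (5, 100),   "old": (63, 900)},
}


def crude_risk(cohort, drug):
    """Risk of death ignoring severity (what a naive comparison reports)."""
    deaths = sum(stratum[drug][0] for stratum in cohort.values())
    patients = sum(stratum[drug][1] for stratum in cohort.values())
    return deaths / patients


def stratum_risks(cohort, drug):
    """Risk of death within each severity stratum."""
    return {name: s[drug][0] / s[drug][1] for name, s in cohort.items()}


def standardized_risk(cohort, drug):
    """Direct standardization: weight each stratum risk by that stratum's share of all patients."""
    total = sum(sum(n for _, n in s.values()) for s in cohort.values())
    risk = 0.0
    for s in cohort.values():
        stratum_size = sum(n for _, n in s.values())
        deaths, patients = s[drug]
        risk += (deaths / patients) * stratum_size / total
    return risk


if __name__ == "__main__":
    for drug in ("new", "old"):
        print(drug, "crude:", round(crude_risk(COHORT, drug), 3),
              "by stratum:", stratum_risks(COHORT, drug),
              "standardized:", round(standardized_risk(COHORT, drug), 3))
```

With these made-up numbers the new drug looks worse in the crude comparison but better within each severity stratum. Adjustment of this kind can only correct for confounders that were measured, which is why such findings remain hypotheses rather than proof of cause and effect.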

In comparative effectiveness research, observational studies are typically considered to be less conclusive than RCTs and meta-analyses. Nonetheless, they can be useful, especially because they examine typical care. Due to lower cost and improvements in health information, observational studies will become increasingly common. Critical assessment of whether the described results are helpful or biased (based upon how the study was performed) is necessary. This case will illustrate several characteristics of the types of studies that will assist in evaluating newly published work.

Clinical Applications

Cardiovascular diseases (CVD) are the leading cause of death in women older than the age of 50. Epidemiologic evidence suggests that estrogen is a key mediator in the development of CVD. Estrogen is an ovarian hormone whose production decreases as women approach menopause. The steep increase in CVD in women at menopause and older and in women who have had hysterectomies further supports a relationship between estrogen and CVD. Building on this evidence of biologic plausibility, epidemiological and observational studies suggested that estrogen replacement therapy (a form of hormone replacement therapy, or HRT) had positive effects on the risk of CVD in postmenopausal women (albeit with some negative effects in its potential to increase the risk for breast cancer and stroke). Based on these findings, in the 1980s and 1990s HRT was routinely employed to treat menopausal symptoms and serve as prophylaxis against CVD.

What was done?

The Nurses’ Health Study (NHS) began collecting data in 1976. In the study, researchers intended to examine a broad range of health effects in women over a long period of time, and a key goal was to clarify the role of HRT in heart disease. The cohort (i.e., the group being followed) included married registered nurses aged 30-55 in 1976 who lived in the 11 most populous states. To collect data, the researchers mailed the study participants a survey every 2 years that asked questions about topics such as smoking, hormone use, menopausal status, and less frequently, diet. Data were collected for key end points that included MI, coronary-artery bypass grafting or angioplasty, stroke, total CVD mortality, and deaths from all causes.

What was found?

At a 10-year follow-up point, the NHS had a study pool of 48,470 women. The researchers found that estrogen use (alone, without progestin) in postmenopausal women was associated with a reduction in the incidence of CVD as well as in CVD mortality compared to non-users. Later, estrogen-progestin combination therapy was shown to be even more cardioprotective than estrogen monotherapy, and lower doses of estrogen replacement therapy were found to deliver equal cardioprotection and lower the risk for adverse events. NHS researchers were alert to the potential for bias in observational studies. Adjustment for risk factors such as age (a typical practice to eliminate confounding) did not change the reported findings.

Was this the right answer?

The NHS was not unique in reporting the benefits associated with HRT; other observational studies corroborated the NHS findings. A secondary retrospective data analysis of the UK primary care electronic medical record database, for example, also showed the protective effect associated with HRT use. Researchers were aware of the fundamental limitations of observational studies, particularly with regard to selection bias. They and practicing clinicians were also aware of the potential negative health effects of HRT, which had to be constantly weighed against the potential cardioprotective benefits in deciding a patient’s course of treatment. As a large section of the population could experience the health effects of HRT, researchers began planning RCTs to verify the promising observational study results. It was highly anticipated that those RCTs would corroborate the belief that estrogen replacement can reduce CVD risk.

Randomized Controlled Trial: The Women’s Health Initiative

The Women’s Health Initiative (WHI) was a major study established by the National Institutes of Health in 1992 to assess a broad range of health effects in postmenopausal women. The trial was intended to follow these women for 8 years, at a cost of millions of dollars in federal funding. Among its many facets, it included an RCT to confirm the results from the observational studies discussed above. To fully investigate earlier findings, the WHI had two subgroups. One subgroup consisted of women with prior hysterectomies; they received estrogen monotherapy. The second group consisted of women who had not undergone hysterectomy; they received estrogen in combination with progestin. The WHI enrolled 27,347 women in their HRT investigation: 10,739 in the estrogen-alone arm and 16,608 in the estrogen plus progestin arm. Within each arm, women were randomly assigned to receive either HRT or placebo. All women in the trial were postmenopausal and aged 50-79 years; the mean age was 63.6 years (a fact that would be important in later analysis). Some participants had experienced previous CV events. The primary outcome of both subgroups was coronary heart disease (CHD), as described by nonfatal MI or death due to CHD.

The estrogen-progestin arm of the WHI was halted after a mean follow-up of 5.2 years, 3 years earlier than expected, as the HRT users in this arm were found to be at increased risk for CHD compared to those who received placebo. The study also noted elevated rates of breast cancer and stroke, among other poor outcomes. The estrogen-alone arm continued for an average follow-up of 6.8 years before being similarly discontinued ahead of schedule. Although this part of the study did not find an increased risk of CHD, it also did not find any cardioprotective effect. Beyond failing to locate any clear CV benefits, the WHI also found real evidence of harm, including increased risk of blood clots, breast cancer and stroke. Initial WHI publications therefore recommended against HRT being prescribed for the secondary prevention of CVD.

What Next?

Scientists, and the clinicians who relied on their data for guidance in treating patients, were faced with conflicting data: epidemiological and observational studies suggested that HRT was cardioprotective while the higher-quality evidence from RCTs strongly suggested the opposite. Clinicians primarily followed the WHI results, so prescriptions for HRT in postmenopausal women quickly declined. Meanwhile, researchers began to analyze the studies for potential discrepancies, and found that the women being followed in the NHS and the WHI differed in several important characteristics.

First, the WHI population was older than the NHS cohort, and many had entered menopause at least 10 years before they enrolled in the RCT. Thus, the WHI enrollees experienced a long duration from the onset of menopause to the commencement of HRT. At the same time, many in the NHS population were closer to the onset of menopause and were still displaying hormonal symptoms when they began HRT. Second, although the NHS researchers adjusted the data for various confounding effects, their results could still have been subject to bias. In general, the NHS cohort was more highly educated and of a higher socioeconomic status than the WHI participants, and therefore more likely to see a physician regularly. The NHS women were also leaner and generally healthier than their RCT counterparts, and had been selected for their evident lack of pre-existing CV conditions. This selection bias in the NHS enrollment may have led to a “healthy woman” effect that in turn led to an overestimation of the benefits of therapy in the observational study. Third, researchers noted that dosing differences between the two study types may have contributed to the divergent results. The NHS reported beneficial results following low-dose estrogen therapy. The WHI, meanwhile, used a higher estrogen dose, exposing women to a larger dosage of hormones and increasing their risk for adverse events. The increased risk profile of the WHI women (e.g., older, more comorbidities, higher estrogen dose) could have contributed to the evidence of harm seen in the WHI results.

Emerging Data

In addition to identifying the inherent differences between the two study populations, researchers began a secondary analysis of the NHS and WHI trials. NHS researchers reported that women who began HRT close to the onset of menopause had a significantly reduced risk of CHD. In the subgroups of women who were older and had a similar duration after menopause compared with the WHI women, they found no significant relationship between HRT and CHD. Also, the WHI study further stratified these results by age, and found that women who began HRT close to their onset of menopause experienced some cardioprotection, while women who were further from the onset of menopause had a slightly elevated risk for CHD.

Secondary analysis of both studies was therefore necessary to show that age and a short duration from the onset of menopause are crucial to HRT success as a cardioprotective agent. Neither study type provided “truth”; or rather, both studies provided “truth” if viewed carefully (i.e., both produced valid and important results). The differences seen in the studies were rooted in the timing of HRT and the populations being studied.

Lessons Learned From this Case Study

Although RCTs are given a higher evidence grade, observational studies provide important clinical insights. In this example, the study populations differed. For policymakers and clinicians, it is crucial to examine whether the CER was based upon patients similar to those being considered. Any study with a dissimilar population may provide non-relevant results. Thus, readers of CER need to carefully examine the generalizability of the findings being reported.

==Appendix==

General Classification and Regression Tree (CART) data analysis steps, as implemented in the R package rpart.

===Growing the Tree===

To grow a tree, use rpart(formula, data=, method=, control=), where:
* formula is in the format outcome ~ predictor1+predictor2+...
* data= specifies the data frame
* method= is "class" for a classification tree or "anova" for a regression tree
* control= supplies optional parameters for controlling tree growth. For example, control=rpart.control(minsplit=30, cp=0.001) requires that a node contain at least 30 observations before a split is attempted, and that a split decrease the overall lack of fit by a factor of 0.001 (the cost complexity factor) before being attempted.

===Examining Results===

These functions help with examining the results:
* printcp(fit): display the complexity parameter (cp) table
* plotcp(fit): plot cross-validation results
* rsq.rpart(fit): plot approximate R-squared and relative error for different splits (2 plots); the labels are only appropriate for the "anova" method
* print(fit): print results
* summary(fit): detailed results, including surrogate splits
* plot(fit): plot the decision tree
* text(fit): label the decision tree plot
* post(fit, file=): create a PostScript plot of the decision tree

In trees created by rpart(), move to the LEFT branch when the stated condition is true.

===Pruning Trees===

In general, trees should be pruned back to avoid overfitting the data. The tree size should minimize the cross-validated error (the xerror column printed by printcp()). Pruning the tree is accomplished by:

prune(fit, cp= )

Use printcp( ) to examine the cross-validation error results, select the complexity parameter (CP) associated with the minimum error, and insert it into the prune() function. Selecting the complexity parameter associated with the smallest cross-validated error can be done succinctly by:

fit$\$$cptable[which.min(fit$\$$cptable[,"xerror"]),"CP"]

===Complete Dataset for N-of-1 Example===
[[SMHS_MethodsHeterogeneity_CER_Nof1|This N-of-1 Dataset]] includes an example.

===Footnotes===

*13 Based on 2009 NPC report, www.npcnow.org/publication/demystifying-comparative-effectiveness-research-case-study-learning-guide
*14 http://www.cancer.gov/cancertopics/druginfo/fda-cetuximab

===[[SMHS_MethodsHeterogeneity|Back to the Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research section]]===

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php/SMHS_MethodsHeterogeneity_CER}}

SMHS MethodsHeterogeneity CER

2016-05-23T18:57:45Z

Pineaumi: /* Case-Study 1: The Cetuximab Study */

==[[SMHS_MethodsHeterogeneity| Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research]] - Comparative Effectiveness Research: Case Studies 13 (CER) ==

===Observational Studies: Tips for the CER Practitioners===

*Different study types can offer different understandings; neither should be discounted without closer examination.

*RCTs provide an accurate understanding of the effect of a particular intervention in a well-defined patient group under “controlled” circumstances.

*Observational studies provide an understanding of real-world care and its impact, but can be biased due to uncontrolled factors.

*Observational studies differ in the types of databases used. These databases may lack clinical detail and contain incomplete or inaccurate data.

*Before accepting the findings from an observational study, consider whether confounding factors may have influenced the results.

*In this scenario, subgroup analysis was vital in clarifying both study designs; what is true for the many (e.g., overall, estrogen appeared to be detrimental) may not be true for the few (e.g., that for the younger post-menopausal woman, the benefits were greater and the harms less frequent).

*Carefully examine the generalizability of the study. Do the study’s patients and intervention match those under consideration?

*Observational studies can identify associations but cannot prove cause-and-effect relationships.

===Case-Study 1: The Cetuximab Study14===

What was done and what was found?

Cetuximab, an anti-epidermal growth factor receptor (EGFR) agent, has recently been added to the therapeutic armamentarium. Two important RCTs examined its impact in patients with metastatic colorectal cancer (mCRC). In the first one, 56 centers in 11 European countries investigated the outcomes associated with cetuximab therapy in 329 mCRC patients who experienced disease progression either on irinotecan therapy or within 3 months thereafter. The study reported that the group on a combination of irinotecan and cetuximab had a significantly higher rate of overall response to treatment (primary endpoint) than the group on cetuximab alone: 22.9% (95% CI, 17.5-29.1%) vs. 10.8% (95% CI, 5.7-18.1%) (P=0.007), respectively. Similarly, the median time to progression was significantly longer in the combination therapy group (4.1 vs. 1.5 months, P<0.001). As these patients had already progressed on irinotecan prior to the study, any response was viewed as positive. Safety between the two treatment arms was similar: approximately 80% of patients in each arm experienced a rash. Grade 3 or 4 (the more severe) toxic effects on the skin were slightly more frequent in the combination-therapy group compared to cetuximab monotherapy, observed in 9.4% and 5.2% of participants, respectively. Other side effects, such as diarrhea and neutropenia observed in the combination-therapy arm, were considered to be in the range expected for irinotecan alone. Data from this study demonstrated the efficacy and safety of cetuximab and were instrumental in the FDA’s 2004 approval.

A second RCT (2007) examined 572 patients and suggested efficacy of cetuximab in the treatment of mCRC. This study was a randomized, non-blinded, controlled trial that examined cetuximab monotherapy plus best supportive care compared to best supportive care alone in patients who had received and failed prior chemotherapy regimens. It reported that median overall survival (the primary endpoint) was significantly higher in patients receiving cetuximab plus best supportive care compared to best supportive care alone (6.1 vs. 4.6 months, respectively) (hazard ratio for death=0.77; 95% CI: 0.64-0.92, P=0.005). This RCT described a greater incidence of adverse events in the cetuximab plus best supportive care group compared to best supportive care alone including (most significantly) rash, as well as edema, fatigue, nausea and vomiting.
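A reader can check that a reported hazard ratio, confidence interval, and P value are mutually consistent. The following Python sketch does this under a normal approximation on the log scale. That approximation is an assumption of this sketch and not the trial's own test, and the function name is illustrative.

```python
import math


def p_value_from_ratio_ci(ratio, lower, upper, z_crit=1.959964):
    """Two-sided p-value implied by a ratio estimate and its 95% CI
    (normal approximation on the log scale)."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)  # SE of log(ratio)
    z = math.log(ratio) / se
    return math.erfc(abs(z) / math.sqrt(2))                  # equals 2 * (1 - Phi(|z|))


if __name__ == "__main__":
    # Values quoted in the text: hazard ratio for death 0.77, 95% CI 0.64-0.92
    p = p_value_from_ratio_ci(0.77, 0.64, 0.92)
    print(f"approximate two-sided p = {p:.4f}")
```

For the values quoted above, the approximation gives a p-value of roughly 0.005, in line with the reported figure. Small differences are expected because published intervals are rounded.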

Was this the right answer?

These RCTs had fairly broad enrollment criteria and the cetuximab benefits were modest. Emerging scientific theories raised the possibility that genetically defined population subsets might experience a greater-than-average treatment benefit. One such area of inquiry entailed examining “biomarkers,” or genetic indicators of a patient’s greater response to therapy. Even as the above RCTs were being conducted, data emerged showing the importance of the KRAS gene.

Emerging Data

Based on the emerging biochemical evidence that the epidermal growth factor receptor (EGFR) treatment mechanism (Cetuximab) was even more finely detailed than previously understood, the study authors of the 2007 RCT undertook a retrospective subgroup analysis using tumor tissue samples preserved from their initial study. Following laboratory analysis, all viable tissue samples were classified as having a wild-type (non-mutated) or a mutated KRAS gene. Instead of the previous two study arms (cetuximab plus best supportive care vs. best supportive care alone), there were 4 for this new analysis: each of the two original study arms was further divided by wild-type vs. mutated KRAS status. Laboratory evaluation determined that 40.9% of patients in the cetuximab plus best supportive care group and 42.3% of patients in the best supportive care alone group had a KRAS mutation. The efficacy of cetuximab was found to be significantly correlated with KRAS status: in patients with wild-type (non-mutated) KRAS genes, cetuximab plus best supportive care compared to best supportive care alone improved overall survival (median 9.5 vs. 4.8 months, respectively; hazard ratio for death=0.55; 95% CI, 0.41-0.74, P<0.001), and progression-free survival (median 3.7 vs. 1.9 months, respectively; hazard ratio for progression or death=0.40; 95% CI, 0.30-0.54, P<0.001). Meanwhile, in patients with mutated KRAS tumors, the authors found no significant difference in outcome between cetuximab plus best supportive care vs. best supportive care alone.

What next?

Based on these and similar results from other studies, the FDA narrowed its product labeling in July 2009 to indicate that cetuximab is not recommended for mCRC patients with mutated KRAS tumors. This distinction reduces the relevant population by approximately 40%. Similarly, the American Society of Clinical Oncology released a provisional clinical recommendation that all mCRC patients have their tumors tested for KRAS status before receiving anti-EGFR therapy. The benefits of targeted treatment are many. Patients who previously underwent cetuximab therapy without knowing their genetic predisposition would no longer have to be exposed to the drug’s toxic effects if unnecessary, as the efficacy of cetuximab is markedly higher in the genetically defined appropriate patients. In a less-uncertain environment, clinicians can be more confident in advocating a course of action in their care of patients. And finally, knowledge that targeted therapy is possible suggests the potential for further innovation in treatment options. In fact, research continues to demonstrate options for targeted cetuximab treatment of mCRC at an even finer scale than seen with KRAS; and similar genetic targeting is being investigated, and advocated, in other cancer types.

Lessons Learned From this Case Study

Although RCTs are generally viewed as the gold standard, results of one or even a series of trials may not accurately reflect the benefits experienced by an individual patient. This case study suggests that cetuximab initially appeared to have rather modest clinical benefits. However, new information that became available and subsequent genetic subgroup assessments led to very different conclusions. Clinicians should be aware that the current knowledge is likely to evolve and any decisions about patient care should be carefully considered with that sense of uncertainty in mind. As in this case study, subgroup analyses (e.g., genetic subtypes) need a theoretical rationale. Ideally, the analyses should be determined at the time of original RCT design and should not just occur as explorations of the subsequent data. When improperly employed, post hoc analyses may lead to incorrect patient care conclusions.

RCTs: Tips for the CER Practitioners

*RCTs can determine whether an intervention can provide benefit in a very controlled environment.

*The controlled nature of an RCT may limit its generalizability to a broader population.

*No results are permanent; advances in scientific knowledge and understanding can influence how we view the effectiveness (or safety) of a therapeutic intervention.

*Targeted therapy illuminated by carefully thought out subgroup analyses can improve the efficacious and safe use of an intervention.

===Case-Study 2: The Rosiglitazone Study===

Meta-analysis

Often the results for the same intervention differ across clinical trials and it may not be clear whether one therapy provides more benefit than another. As CER increases and more studies are conducted, clinicians and policymakers are more likely to encounter this scenario. In a systematic review, a researcher identifies similar studies and displays their results in a table, enabling qualitative comparisons across the studies. With a meta-analysis, the data from included studies are statistically combined into a single “result.” Merging the data from a number of studies increases the effective sample size of the investigation, providing a statistically stronger conclusion about the body of research. By so doing, investigators may detect low frequency events and demonstrate more subtle distinctions between therapeutic alternatives.

When studies have been properly identified and combined, the meta-analysis produces a summary estimate of the findings and a confidence interval that can serve as a benchmark in medical opinion and practice. However, when done incorrectly, the quantitative and statistical analysis can create impressive “numbers” but biased results. The following are important criteria for properly conducted meta-analyses:

1. Carefully defining unbiased inclusion or exclusion criteria for study selection

2. Including only those studies that have similar design elements, such as patient population, drug regimen, outcomes being assessed, and time-frame

3. Applying correct statistical methods to combine and analyze the data

Reporting this information is essential for the reader to determine whether the data were suitable to combine, and if the meta-analysis draws unbiased conclusions. Meta-analyses of randomized clinical trials are considered to be the highest level of medical evidence as they are based upon a synthesis of rigorously controlled trials that systematically reduce bias and confounding. This technique is useful in summarizing available evidence and will likely become more common in the era of publicly funded comparative effectiveness research. The following case study will examine several key principles that will be useful as the reader encounters these publications.
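The idea that pooling increases the effective sample size, and that combined studies should be similar, can be sketched in Python with a fixed-effect inverse-variance analysis plus a basic heterogeneity check. The study estimates are hypothetical and the function names are illustrative; fixed-effect pooling is only one of several possible methods and is assumed here for simplicity.

```python
import math

# Hypothetical study results (illustrative only): (log odds ratio, standard error)
STUDIES = [(0.30, 0.25), (0.10, 0.20), (0.45, 0.30), (0.20, 0.15)]


def fixed_effect_pool(studies, z_crit=1.959964):
    """Inverse-variance fixed-effect pooling; returns (estimate, se, (ci_low, ci_high))."""
    weights = [1.0 / se ** 2 for _, se in studies]
    pooled = sum(w * y for w, (y, _) in zip(weights, studies)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return pooled, se, (pooled - z_crit * se, pooled + z_crit * se)


def heterogeneity(studies):
    """Cochran's Q and the I^2 statistic (percent of variation attributed to heterogeneity)."""
    pooled, _, _ = fixed_effect_pool(studies)
    q = sum((y - pooled) ** 2 / se ** 2 for y, se in studies)
    df = len(studies) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2


if __name__ == "__main__":
    pooled, se, (low, high) = fixed_effect_pool(STUDIES)
    q, i2 = heterogeneity(STUDIES)
    print(f"pooled OR = {math.exp(pooled):.2f} "
          f"(95% CI {math.exp(low):.2f}-{math.exp(high):.2f})")
    print(f"Cochran's Q = {q:.2f}, I^2 = {i2:.0f}%")
```

The pooled standard error is smaller than that of any single study, which is the statistical gain from combining data. A large Q or I² would signal that the studies may be too dissimilar for a single summary estimate to be meaningful.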

Clinical Application

Heart disease is the leading cause of mortality in the United States, resulting in approximately 20% of all deaths. Diabetics are particularly susceptible to heart disease, with more than 65% of deaths attributable to it. The nonfatal complications of diabetes are wide-ranging and include kidney failure, nerve damage, amputation, stroke and blindness, among other outcomes. In 2007, the total estimated cost of diabetes in the United States was $174B; $116B was derived from direct medical expenditures and the rest from the indirect cost of lost productivity due to the disease. With such serious health effects and heavy direct and indirect costs tied to diabetes, proper disease management is critical. Historically, diabetes treatment has focused on strict blood sugar control, assuming that this goal not only targets diabetes but also reduces other serious comorbidities of the disease.

Anti-diabetic agents have long been associated with key questions as to their benefits/risks in the treatment of diabetes. The sulfonylurea tolbutamide, a first generation anti-diabetic drug, was found in a landmark study in the 1970s to significantly increase the CV mortality rate compared to patients not on this agent. Further analysis by external parties concluded that the methods employed in this trial were significantly flawed (e.g., use of an “arbitrary” definition of diabetes status, heterogeneous baseline characteristics of the populations studied, and incorrect statistical methods). Since these early studies, CV concerns continue to be an issue with selected oral hypoglycemic agents that have subsequently entered the marketplace.

A class of drugs, thiazolidinedione (TZD), was approved in the late 1990s, as a solution to the problems associated with the older generation of sulfonylureas. Rosiglitazone, a member of the TZD class, was approved by the FDA in 1999 and was widely prescribed for the treatment of type-2 diabetes. A number of RCTs supported the benefit of rosiglitazone as an important new oral antidiabetic agent. However, safety concerns developed as the FDA received reports of adverse cardiac events potentially associated with rosiglitazone. It was in this setting that a meta-analysis by Nissen and Wolski was published in the New England Journal of Medicine in June 2007.

What was done?

Nissen and Wolski conducted a meta-analysis examining the impact of rosiglitazone on cardiac events and mortality compared to alternative therapeutic approaches. The study began with a broad search to locate potential studies for review. The authors screened published phase II, III, and IV trials; the FDA website; and the drug manufacturer’s clinical-trial registry for applicable data relating to rosiglitazone use. When the initial search was complete, the studies were further categorized by pre-stated inclusion criteria. Meta-analysis inclusion criteria were simple: studies had to include rosiglitazone and a randomized comparator group treated with either another drug or placebo, study arms had to show similar length of treatment, and all groups had to have received more than 24 weeks of exposure to the study drugs. The studies had to contain outcome data of interest including the rate of myocardial infarction (MI) or death from all CV causes. Out of 116 studies surveyed by the authors, 42 met their inclusion criteria and were included in the meta-analysis. Of the studies they included, 23 had durations of 26 weeks or less, and only five studies followed patients for more than a year. Until this point, the study’s authors were following a path similar to that of any reviewer interested in CV outcomes, examining the results of these 42 studies and comparing them qualitatively. Quantitatively combining the data, however, required the authors to make choices about the studies they could merge and the statistical methods they should apply for analysis. Those decisions greatly influenced the results that were reported.

What was found?

When the studies were combined, the meta-analysis contained data from 15,565 patients in the rosiglitazone group and 12,282 patients as comparators. Analyzing their data, the authors chose one particular statistical method (the Peto odds ratio method, a fixed-effect statistical approach), which calculates the odds of events occurring where the outcomes of interest are rare and small in number. In comparing rosiglitazone with a “control” group that included other drugs or placebo, the authors reported odds ratios of 1.43 (95% CI, 1.03-1.98; P=0.03) and 1.64 (95% CI, 0.98-2.74; P=0.06) for MI and death from CV causes, respectively. In other words, the odds of an MI or death from a CV cause are higher for rosiglitazone patients than for patients on other therapies or placebo. The authors reported that rosiglitazone was significantly associated with an increase in the risk of MI and had borderline significance in increasing the risk of death from all CV causes. These findings appeared online on the same day that the FDA issued a safety alert regarding rosiglitazone. Discussion of the meta-analysis was immediately featured prominently in the news media. By December 2007, prescription claims for the drug at retail pharmacies had fallen by more than 50%.
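To make the pooling step concrete, the following is a minimal sketch of the Peto one-step (fixed-effect) odds ratio in Python. The trial counts are hypothetical and are not taken from the Nissen and Wolski dataset; the function name `peto_odds_ratio` and the tuple layout are assumptions made for illustration only. The sketch also shows that, under this formula, a trial with no events in either arm contributes nothing to the pooled estimate.

```python
import math


def peto_odds_ratio(trials, z_crit=1.96):
    """Pool 2x2 tables with the Peto one-step (fixed-effect) method.

    Each trial is (events_treated, n_treated, events_control, n_control).
    Returns (pooled_odds_ratio, ci_lower, ci_upper).
    """
    sum_o_minus_e = 0.0  # sum of observed minus expected events (treated arm)
    sum_v = 0.0          # sum of hypergeometric variances
    for events_t, n_t, events_c, n_c in trials:
        n = n_t + n_c
        m = events_t + events_c                 # total events in the trial
        expected_t = n_t * m / n                # expected treated-arm events under no effect
        variance = n_t * n_c * m * (n - m) / (n ** 2 * (n - 1))
        sum_o_minus_e += events_t - expected_t
        sum_v += variance
    if sum_v == 0:
        raise ValueError("no trial contains an event; the Peto estimate is undefined")
    log_or = sum_o_minus_e / sum_v
    se = 1.0 / math.sqrt(sum_v)
    return (
        math.exp(log_or),
        math.exp(log_or - z_crit * se),
        math.exp(log_or + z_crit * se),
    )


# Hypothetical trials, for illustration only.
trials_with_events = [(2, 100, 1, 100), (5, 500, 3, 250)]
zero_event_trials = [(0, 50, 0, 50)]

pooled = peto_odds_ratio(trials_with_events)
pooled_all = peto_odds_ratio(trials_with_events + zero_event_trials)
print("Peto OR and 95% CI:", pooled)
print("Unchanged by the zero-event trial:", pooled == pooled_all)
```

Because each trial is weighted only through its observed-minus-expected events and its variance, trials without events carry zero weight in this approach.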

As diabetic patients and their clinicians reacted to the news, a methodologic debate also ensued. This discussion included statistical issues pertaining to the conduct of the analysis, its implications for clinical care, and finally the FDA and drug manufacturer’s roles in overseeing and regulating rosiglitazone. Concern regarding the treatment of patients with diabetes continues in the medical community today.

Was this the right answer?

Should the studies have been combined? Commentators faulted the authors for including several studies that were not originally intended to investigate diabetes, and for combining both placebo and drug therapy data into one comparator arm. Some critics noted that despite the stated inclusion criteria, some data were derived from studies where the rosiglitazone arm was allowed a longer follow-up than the comparator arm. By failing to account for this longer follow-up period, commentators felt that the authors may have overestimated the effect of rosiglitazone on CV outcomes. Many reviewers were concerned that this meta-analysis excluded trials in which no patients suffered an MI or died from CV causes – the outcomes of greatest interest. Some reviewers also noted that the exclusion of zero-event trials from the pooled dataset not only gave an incomplete picture of the impact of rosiglitazone but could have increased the odds ratio estimate. In general, the pooled dataset was criticized by many for being a faulty microcosm of the information available regarding rosiglitazone.

It is essential that a meta-analysis be based on similarity in the data sources. If studies differ in important areas such as the patient populations, interventions, or outcomes, combining their data may not be suitable. The researchers accepted studies and populations that were clinically heterogeneous, yet pooled them as if they were not. The study reported that the results were combined from a number of trials that were not initially intended to investigate CV outcomes. Furthermore, the available data did not allow for time-to-event analysis, an essential tool in comparing the impact of alternative treatment options. Reviewers considered the data to be insufficiently homogeneous, and the line of cause and effect to be murkier than the authors described.

Were the statistical methods optimal?

The statistical methods for this meta-analysis also came under significant criticism. The critiques focused on the authors’ use of the Peto method as being an incorrect choice because data were pooled from both small and very large studies, resulting in a potential overestimation of treatment effect. Other reviewers pointed out that the Peto method should not have been used, as a number of the underlying studies did not have patients assigned equally to rosiglitazone and comparator groups. Finally, critics suggested that the heterogeneity of the included studies required an altogether different set of analytic techniques.

Demonstrating the sensitivity of the authors’ initial analysis to the inclusion criteria and statistical tests used, a number of researchers reworked the data from this study. One researcher used the same studies but analyzed the data with a more commonly used statistical method (Mantel-Haenszel), and found no significant increase in the relative risk or common odds ratio for MI or CV death. When the pool of studies was expanded to include those originally eliminated because they had zero CV events, the odds ratios for MI and death from CV causes dropped from 1.43 to 1.26 (95% CI, 0.93-1.72) and from 1.64 to 1.14 (95% CI, 0.74-1.74), respectively. Neither of the recalculated odds ratios was significant for MI or CV death. Finally, several newer long-term studies have been published since the Nissen meta-analysis. Incorporating their results with the meta-analysis data showed that rosiglitazone is associated with an increased risk of MI but not of CV death. Thus, the findings from these meta-analyses varied with the methods employed, the studies included, and the addition of later trials.
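The sensitivity described above can be sketched with a Mantel-Haenszel pooled odds ratio. The counts below are hypothetical, and the 0.5 continuity correction applied to zero-event trials is one common convention assumed here for illustration; it is not necessarily the approach taken in the published reanalyses. The function name `mantel_haenszel_or` is likewise an assumption.

```python
def mantel_haenszel_or(trials, zero_cell_correction=0.0):
    """Mantel-Haenszel pooled odds ratio across 2x2 tables.

    Each trial is (events_treated, n_treated, events_control, n_control).
    The correction is added to every cell of a trial only when at least
    one arm of that trial has zero events.
    """
    numerator = 0.0
    denominator = 0.0
    for events_t, n_t, events_c, n_c in trials:
        a, b = events_t, n_t - events_t   # treated: events, non-events
        c, d = events_c, n_c - events_c   # control: events, non-events
        if zero_cell_correction and (events_t == 0 or events_c == 0):
            a, b, c, d = (x + zero_cell_correction for x in (a, b, c, d))
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# Hypothetical trials, for illustration only.
trials_with_events = [(6, 200, 3, 200), (9, 300, 5, 300)]
zero_event_trials = [(0, 150, 0, 150), (0, 100, 0, 100)]

excluded = mantel_haenszel_or(trials_with_events)
included = mantel_haenszel_or(
    trials_with_events + zero_event_trials, zero_cell_correction=0.5
)
print("Zero-event trials excluded:", round(excluded, 3))
print("Zero-event trials included with 0.5 correction:", round(included, 3))
```

In this sketch the corrected zero-event trials each behave like a small trial with an odds ratio of 1, so including them pulls the pooled estimate toward the null. This is the direction of change reported when the zero-event trials were added back.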

Emerging Data

The controversy surrounding the rosiglitazone meta-analysis authored by Nissen and Wolski forced an unplanned interim analysis of a long-term, randomized trial investigating the CV effects of rosiglitazone among patients with type 2 diabetes. The authors of the RECORD trial noted that even though the follow-up at 3.75 years was shorter than expected, rosiglitazone, when added to standard glucose-lowering therapy, was found to be associated with an increase in the risk of heart failure but was not associated with any increase in death from CV or other causes. Data at the time were found to be insufficient to determine the effect of rosiglitazone on an increase in the risk of MI. The final report of that trial, published in June 2009, confirmed the elevated risk of heart failure in people with type 2 diabetes treated with rosiglitazone in addition to glucose-lowering drugs, but continued to show inconclusive results about the effect of the drug therapy on the risk of MI. Further, the RECORD trial clarified that rosiglitazone does not result in an increased risk of CV morbidity or mortality compared to standard glucose-lowering drugs. Other trials conducted since the publishing of the meta-analysis have corroborated these results, casting further doubt on the findings of the meta-analysis published by Nissen and Wolski.

Now what?

Some sources suggest that the original Nissen meta-analysis delivered more harm than benefit, and that a well-recognized medical journal may have erred in its process of peer review. Despite this criticism, it is important to note that subsequent publications support the risk of adverse CV events associated with rosiglitazone, although rosiglitazone use does not appear to increase deaths. These results and emerging data point to the need for further rigorous research to clarify the benefits and risks of rosiglitazone on a variety of outcomes, and the importance of directing the drug to the population that will maximally benefit from its use.

Lessons Learned From this Case Study

Results from initial randomized trials that seem definitive at one time may not be conclusive, as further trials may emerge to clarify, redirect, or negate previously accepted results. A meta-analysis of those trials can lead to varying results based upon the timing of the analysis and the choices made in its performance.

Meta-Analysis: Tips for CER Practitioners

*The results of a meta-analysis are highly dependent on the studies included (and excluded). Are these criteria properly defined and relevant to the purposes of the meta-analysis? Were the combined studies sufficiently similar? Can results from this cohort be generalized to other populations of interest?

*The statistical methodology can impact study results. Have there been reviews critiquing the methods used in the meta-analysis?

*A variety of statistical tests should be considered, and perhaps reported, in the analysis of results. Do the authors mention their rationale in choosing a statistical method? Do they show the stability of their results across a spectrum of analytical methods?

*Nothing is permanent. Emerging data may change the playing field, and meta-analysis results are only as good as the data and statistics from which they are derived.

===Case-Study 3: The Nurses’ Health Study===

An observational study

An observational study is a very common type of research design in which the effects of a treatment or condition are studied without formally randomizing patients in an experimental design. Such studies can be done prospectively, wherein data are collected about a group of patients going forward in time; or retrospectively, in which the researcher looks into the past, mining existing databases for data that have already been collected. The latter studies are frequently performed by using an electronic database that contains, for example, administrative, “billing,” or claims data. Less commonly, observational research uses electronic health records, which have greater clinical information that more closely resembles the data collected in an RCT. Observational studies often take place in “real-world” environments, which allow researchers to collect data for a wide array of outcomes. Patients are not randomized in these studies, but the findings can be used to generate hypotheses for investigation in a more constrained experimental setting. Perhaps the best known observational study is the “Framingham study,” which collected demographic and health data for a group of individuals over many years (and continues to do so) and has provided an understanding of the key risk factors for heart disease and stroke.

Observational studies present many advantages to the comparative effectiveness researcher. The study design can provide a unique glimpse of the use of a health care intervention in the “real world,” an essential step in gauging the gap between efficacy (can a treatment work in a controlled setting?) and effectiveness (does the treatment work in a real-life situation?). Furthermore, observational studies can be conducted at low cost, particularly if they involve the secondary analysis of existing data sources. CER often uses administrative databases, which are based upon the billing data submitted by providers during routine care. These databases typically have limited clinical information, may have errors in them, and generally do not undergo auditing.

The uncontrolled nature of observational studies allows them to be subject to bias and confounding. For example, doctors may prescribe a new medication only for the sickest patients. Comparing these outcomes (without careful statistical adjustment) with those from less ill patients receiving alternative treatment may lead to misleading results. Observational studies can identify important associations but cannot prove cause and effect. These studies can generate hypotheses that may require RCTs for fuller demonstration of those relationships. Secondary analysis can also be problematic if researchers overwork datasets by doing multiple exploratory analyses (e.g., data-dredging): the more we look, the more we find, even if those findings are merely statistical aberrations. Unfortunately, the growing need for CER and the wide availability of administrative databases may lead to selection of research of poor quality with inaccurate findings.
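The prescribing example above (a new medication given mainly to the sickest patients) can be illustrated with a small worked calculation. All counts below are hypothetical and chosen only to show how a crude comparison can point in the opposite direction from the severity-stratified comparisons; the names `strata`, `stratum_risk_ratios`, and `crude_risk_ratio` are assumptions for this sketch.

```python
def risk(deaths, patients):
    """Proportion of patients who died."""
    return deaths / patients


# Hypothetical counts: stratum -> {treatment: (deaths, patients)}.
# The new drug is given mostly to severely ill patients.
strata = {
    "severe": {"new": (24, 80), "old": (8, 20)},
    "mild": {"new": (1, 20), "old": (8, 80)},
}


def stratum_risk_ratios(strata):
    """Risk ratio (new vs. old) within each severity stratum."""
    return {
        name: risk(*arms["new"]) / risk(*arms["old"])
        for name, arms in strata.items()
    }


def crude_risk_ratio(strata):
    """Risk ratio (new vs. old) after ignoring severity and pooling all patients."""
    totals = {"new": [0, 0], "old": [0, 0]}
    for arms in strata.values():
        for arm, (deaths, patients) in arms.items():
            totals[arm][0] += deaths
            totals[arm][1] += patients
    return risk(*totals["new"]) / risk(*totals["old"])


by_stratum = stratum_risk_ratios(strata)
crude = crude_risk_ratio(strata)
print("Risk ratio within strata:", by_stratum)
print("Crude risk ratio:", crude)
```

In this constructed example the new drug has a risk ratio below 1 in both the severe and the mild stratum, yet the crude comparison suggests harm, because most patients on the new drug were severely ill to begin with.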

In comparative effectiveness research, observational studies are typically considered to be less conclusive than RCTs and meta-analyses. Nonetheless, they can be useful, especially because they examine typical care. Due to lower cost and improvements in health information, observational studies will become increasingly common. Critical assessment of whether the described results are helpful or biased (based upon how the study was performed) is necessary. This case will illustrate several characteristics of the types of studies that will assist in evaluating newly published work.

Clinical Applications

Cardiovascular diseases (CVD) are the leading cause of death in women older than the age of 50. Epidemiologic evidence suggests that estrogen is a key mediator in the development of CVD. Estrogen is an ovarian hormone whose production decreases as women approach menopause. The steep increase in CVD in women at menopause and older and in women who have had hysterectomies further supports a relationship between estrogen and CVD. Building on this evidence of biologic plausibility, epidemiological and observational studies suggested that estrogen replacement therapy (a form of hormone replacement therapy, or HRT) had positive effects on the risk of CVD in postmenopausal women (albeit with some negative effects in its potential to increase the risk for breast cancer and stroke). Based on these findings, in the 1980s and 1990s HRT was routinely employed to treat menopausal symptoms and serve as prophylaxis against CVD.

What was done?

The Nurses’ Health Study (NHS) began collecting data in 1976. In the study, researchers intended to examine a broad range of health effects in women over a long period of time, and a key goal was to clarify the role of HRT in heart disease. The cohort (i.e., the group being followed) included married registered nurses aged 30-55 in 1976 who lived in the 11 most populous states. To collect data, the researchers mailed the study participants a survey every 2 years that asked questions about topics such as smoking, hormone use, menopausal status, and less frequently, diet. Data were collected for key end points that included MI, coronary-artery bypass grafting or angioplasty, stroke, total CVD mortality, and deaths from all causes.

What was found?

At a 10-year follow-up point, the NHS had a study pool of 48,470 women. The researchers found that estrogen use (alone, without progestin) in postmenopausal women was associated with a reduction in the incidence of CVD as well as in CVD mortality compared to non-users. Later, estrogen-progestin combination therapy was shown to be even more cardioprotective than estrogen monotherapy, and lower doses of estrogen replacement therapy were found to deliver equal cardioprotection and lower the risk for adverse events. NHS researchers were alert to the potential for bias in observational studies. Adjustment for risk factors such as age (a typical practice to eliminate confounding) did not change the reported findings.

Was this the right answer?

The NHS was not unique in reporting the benefits associated with HRT; other observational studies corroborated the NHS findings. A secondary retrospective data analysis of the UK primary care electronic medical record database, for example, also showed the protective effect associated with HRT use. Researchers were aware of the fundamental limitations of observational studies, particularly with regard to selection bias. They and practicing clinicians were also aware of the potential negative health effects of HRT, which had to be constantly weighed against the potential cardioprotective benefits in deciding a patient’s course of treatment. As a large section of the population could experience the health effects of HRT, researchers began planning RCTs to verify the promising observational study results. It was highly anticipated that those RCTs would corroborate the belief that estrogen replacement can reduce CVD risk.

Randomized Controlled Trial: The Women’s Health Initiative

The Women’s Health Initiative (WHI) was a major study established by the National Institutes of Health in 1992 to assess a broad range of health effects in postmenopausal women. The trial was intended to follow these women for 8 years, at a cost of millions of dollars in federal funding. Among its many facets, it included an RCT to confirm the results from the observational studies discussed above. To fully investigate earlier findings, the WHI had two subgroups. One subgroup consisted of women with prior hysterectomies; they received estrogen monotherapy. The second group consisted of women who had not undergone hysterectomy; they received estrogen in combination with progestin. The WHI enrolled 27,347 women in their HRT investigation: 10,739 in the estrogen-alone arm and 16,608 in the estrogen plus progestin arm. Within each arm, women were randomly assigned to receive either HRT or placebo. All women in the trial were postmenopausal and aged 50-79 years; the mean age was 63.6 years (a fact that would be important in later analysis). Some participants had experienced previous CV events. The primary outcome of both subgroups was coronary heart disease (CHD), as described by nonfatal MI or death due to CHD.

The estrogen-progestin arm of the WHI was halted after a mean follow-up of 5.2 years, 3 years earlier than expected, as the HRT users in this arm were found to be at increased risk for CHD compared to those who received placebo. The study also noted elevated rates of breast cancer and stroke, among other poor outcomes. The estrogen-alone arm continued for an average follow-up of 6.8 years before being similarly discontinued ahead of schedule. Although this part of the study did not find an increased risk of CHD, it also did not find any cardioprotective effect. Beyond failing to locate any clear CV benefits, the WHI also found real evidence of harm, including increased risk of blood clots, breast cancer and stroke. Initial WHI publications therefore recommended against HRT being prescribed for the secondary prevention of CVD.

What Next?

Scientists, and the clinicians who relied on their data for guidance in treating patients, were faced with conflicting data: epidemiological and observational studies suggested that HRT was cardioprotective while the higher-quality evidence from RCTs strongly suggested the opposite. Clinicians primarily followed the WHI results, so prescriptions for HRT in postmenopausal women quickly declined. Meanwhile, researchers began to analyze the studies for potential discrepancies, and found that the women being followed in the NHS and the WHI differed in several important characteristics.

First, the WHI population was older than the NHS cohort, and many had entered menopause at least 10 years before they enrolled in the RCT. Thus, the WHI enrollees experienced a long duration from the onset of menopause to the commencement of HRT. At the same time, many in the NHS population were closer to the onset of menopause and were still displaying hormonal symptoms when they began HRT. Second, although the NHS researchers adjusted the data for various confounding effects, their results could still have been subject to bias. In general, the NHS cohort was more highly educated and of a higher socioeconomic status than the WHI participants, and therefore more likely to see a physician regularly. The NHS women were also leaner and generally healthier than their RCT counterparts, and had been selected for their evident lack of pre-existing CV conditions. This selection bias in the NHS enrollment may have led to a “healthy woman” effect that in turn led to an overestimation of the benefits of therapy in the observational study. Third, researchers noted that dosing differences between the two study types may have contributed to the divergent results. The NHS reported beneficial results following low-dose estrogen therapy. The WHI, meanwhile, used a higher estrogen dose, exposing women to a larger dosage of hormones and increasing their risk for adverse events. The increased risk profile of the WHI women (e.g., older, more comorbidities, higher estrogen dose) could have contributed to the evidence of harm seen in the WHI results.

Emerging Data

In addition to identifying the inherent differences between the two study populations, researchers began a secondary analysis of the NHS and WHI trials. NHS researchers reported that women who began HRT close to the onset of menopause had a significantly reduced risk of CHD. In the subgroups of women that were older and had a similar duration after menopause compared with the WHI women, they found no significant relationship between HRT and CHD. Also, the WHI study further stratified these results by age, and found that women who began HRT close to their onset of menopause experienced some cardioprotection, while women who were further from the onset of menopause had a slightly elevated risk for CHD.

Secondary analysis of both studies was therefore necessary to show that age and a short duration from the onset of menopause are crucial to HRT success as a cardioprotective agent. Neither study type provided “truth” or rather, both studies provided “truth” if viewed carefully (e.g., both produced valid and important results). The differences seen in the studies were rooted in the timing of HRT and the populations being studied.

Lessons Learned From this Case Study

Although RCTs are given a higher evidence grade, observational studies provide important clinical insights. In this example, the study populations differed. For policymakers and clinicians, it is crucial to examine whether the CER was based upon patients similar to those being considered. Any study with a dissimilar population may provide non-relevant results. Thus, readers of CER need to carefully examine the generalizability of the findings being reported.

==Appendix==

General Classification and Regression Tree (CART) data analysis steps, as implemented in the R package rpart.

===Growing the Tree===

# To grow a tree, use
rpart(formula, data=, method=,control=), where
formula is in the format outcome ~ predictor1+predictor2+...
data= specifies the data frame
method= "class" for a classification tree, use "anova" for a regression tree
control= optional parameters for controlling tree growth. For example, control=rpart.control(minsplit=30, cp=0.001) requires that the minimum number of observations in a node be 30 before attempting a split and that a split must decrease the overall lack of fit by a factor of 0.001 (cost complexity factor) before being attempted.

===Examining Results===

# These functions help with examining the results.
printcp(fit) display complexity parameter (cp) table
plotcp(fit) plot cross-validation results
rsq.rpart(fit) plot approximate R-squared and relative error for different splits (2 plots). labels are only appropriate for the "anova" method.
print(fit) print results
summary(fit) detailed results including surrogate splits
plot(fit) plot decision tree
text(fit) label the decision tree plot
post(fit, file=) create postscript plot of decision tree
# In trees created by rpart(), move to the LEFT branch when the stated condition is true.

===Pruning Trees===

# In general, trees should be pruned back to avoid overfitting the data. The tree size should minimize the cross-validated error (the xerror column printed by printcp()). Pruning the tree is accomplished by:
prune(fit, cp= )
# Use printcp() to examine the cross-validation error results, select the complexity parameter (CP) associated with the minimum error, and insert it into the prune() function. Selecting the complexity parameter associated with the smallest cross-validated error can be done automatically and succinctly by:
fit$cptable[which.min(fit$cptable[,"xerror"]),"CP"]

===Complete Dataset for N-of-1 Example===
[[SMHS_MethodsHeterogeneity_CER_Nof1|This N-of-1 Dataset]] includes an example.

===[[SMHS_MethodsHeterogeneity|Back to the Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research section]]===

<hr>
* SOCR Home page: http://www.socr.umich.edu

SMHS MethodsHeterogeneity CER

2016-05-23T18:57:20Z

Pineaumi: /* Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research - Comparative Effectiveness Research (CER) */

==[[SMHS_MethodsHeterogeneity| Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research]] - Comparative Effectiveness Research: Case Studies 13 (CER) ==

===Observational Studies: Tips for the CER Practitioners===

*Different study types can offer different understandings; neither should be discounted without closer examination.

*RCTs provide an accurate understanding of the effect of a particular intervention in a well-defined patient group under “controlled” circumstances.

*Observational studies provide an understanding of real-world care and its impact, but can be biased due to uncontrolled factors.

*Observational studies differ in the types of databases used. These databases may lack clinical detail and contain incomplete or inaccurate data.

*Before accepting the findings from an observational study, consider whether confounding factors may have influenced the results.

*In this scenario, subgroup analysis was vital in clarifying both study designs; what is true for the many (e.g., overall, estrogen appeared to be detrimental) may not be true for the few (e.g., that for the younger post-menopausal woman, the benefits were greater and the harms less frequent).

*Carefully examine the generalizability of the study. Do the study’s patients and intervention match those under consideration?

*Observational studies can identify associations but cannot prove cause-and-effect relationships.

===Case-Study 1: The Cetuximab Study===

What was done and what was found?

Cetuximab, an anti-epidermal growth factor receptor (EGFR) agent, has recently been added to the therapeutic armamentarium. Two important RCTs examined its impact in patients with metastatic colorectal cancer (mCRC). In the first one, 56 centers in 11 European countries investigated the outcomes associated with cetuximab therapy in 329 mCRC patients who experienced disease progression either on irinotecan therapy or within 3 months thereafter. The study reported that the group on a combination of irinotecan and cetuximab had a significantly higher rate of overall response to treatment (primary endpoint) than the group on cetuximab alone: 22.9% (95% CI, 17.5-29.1%) vs. 10.8% (95% CI, 5.7-18.1%) (P=0.007), respectively. Similarly, the median time to progression was significantly longer in the combination therapy group (4.1 vs. 1.5 months, P<0.001). As these patients had already progressed on irinotecan prior to the study, any response was viewed as positive. Safety between the two treatment arms was similar: approximately 80% of patients in each arm experienced a rash. Grade 3 or 4 (the more severe) toxic effects on the skin were slightly more frequent in the combination-therapy group compared to cetuximab monotherapy, observed in 9.4% and 5.2% of participants, respectively. Other side effects, such as diarrhea and neutropenia observed in the combination-therapy arm, were considered to be in the range expected for irinotecan alone. Data from this study demonstrated the efficacy and safety of cetuximab and were instrumental in the FDA’s 2004 approval.

A second RCT (2007) examined 572 patients and suggested efficacy of cetuximab in the treatment of mCRC. This study was a randomized, non-blinded, controlled trial that examined cetuximab monotherapy plus best supportive care compared to best supportive care alone in patients who had received and failed prior chemotherapy regimens. It reported that median overall survival (the primary endpoint) was significantly higher in patients receiving cetuximab plus best supportive care compared to best supportive care alone (6.1 vs. 4.6 months, respectively) (hazard ratio for death=0.77; 95% CI, 0.64-0.92, P=0.005). This RCT described a greater incidence of adverse events in the cetuximab plus best supportive care group compared to best supportive care alone including (most significantly) rash, as well as edema, fatigue, nausea and vomiting.
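When reading results such as the hazard ratio above, it can help to check that the reported confidence interval and P value are mutually consistent. The following is a minimal sketch of that check, assuming the interval is a Wald-type interval that is symmetric on the log scale and that the P value comes from a two-sided normal test; the function name `p_value_from_ratio_ci` is hypothetical. Because the published figures are rounded, the result is only approximate.

```python
import math


def p_value_from_ratio_ci(estimate, lower, upper, z_crit=1.959964):
    """Recover the standard error, z statistic, and two-sided P value of a
    ratio measure (hazard ratio or odds ratio) from its 95% CI.

    Assumes the interval is symmetric on the log scale (Wald-type).
    """
    log_estimate = math.log(estimate)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_estimate / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Hazard ratio for death reported for the 2007 trial: 0.77 (95% CI, 0.64-0.92).
se, z, p = p_value_from_ratio_ci(0.77, 0.64, 0.92)
print(f"SE(log HR) = {se:.4f}, z = {z:.2f}, two-sided P = {p:.4f}")
```

Under these assumptions the recovered P value rounds to the reported P=0.005, so the interval and the P value agree.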

Was this the right answer?

These RCTs had fairly broad enrollment criteria and the cetuximab benefits were modest. Emerging scientific theories raised the possibility that genetically defined population subsets might experience a greater-than-average treatment benefit. One such area of inquiry entailed examining “biomarkers,” or genetic indicators of a patient’s greater response to therapy. Even as the above RCTs were being conducted, data emerged showing the importance of the KRAS gene.

Emerging Data

Based on the emerging biochemical evidence that the epidermal growth factor receptor (EGFR) treatment mechanism (cetuximab) was even more finely detailed than previously understood, the authors of the 2007 RCT undertook a retrospective subgroup analysis using tumor tissue samples preserved from their initial study. Following laboratory analysis, all viable tissue samples were classified as having a wild-type (non-mutated) or a mutated KRAS gene. Instead of the previous two study arms (cetuximab plus best supportive care vs. best supportive care alone), there were four for this new analysis: each of the two original study arms was further divided by wild-type vs. mutated KRAS status. Laboratory evaluation determined that 40.9% of patients in the cetuximab plus best supportive care group and 42.3% of patients in the best supportive care alone group had a KRAS mutation. The efficacy of cetuximab was found to be significantly correlated with KRAS status: in patients with wild-type (non-mutated) KRAS genes, cetuximab plus best supportive care compared to best supportive care alone improved overall survival (median 9.5 vs. 4.8 months, respectively; hazard ratio for death=0.55; 95% CI, 0.41-0.74, P<0.001), and progression-free survival (median 3.7 vs. 1.9 months, respectively; hazard ratio for progression or death=0.40; 95% CI, 0.30-0.54, P<0.001). Meanwhile, in patients with mutated KRAS tumors, the authors found no significant difference in outcome between cetuximab plus best supportive care vs. best supportive care alone.

What next?

Based on these and similar results from other studies, the FDA narrowed its product labeling in July 2009 to indicate that cetuximab is not recommended for mCRC patients with mutated KRAS tumors. This distinction reduces the relevant population by approximately 40%. Similarly, the American Society of Clinical Oncology released a provisional clinical recommendation that all mCRC patients have their tumors tested for KRAS status before receiving anti-EGFR therapy. The benefits of targeted treatment are many. Patients who would previously have received cetuximab without knowing their KRAS status no longer need to be exposed to the drug’s toxic effects unnecessarily, and the efficacy of cetuximab is markedly higher in the genetically defined appropriate patients. In a less uncertain environment, clinicians can be more confident in advocating a course of action in their care of patients. Finally, knowledge that targeted therapy is possible suggests the potential for further innovation in treatment options. In fact, research continues to demonstrate options for targeted cetuximab treatment of mCRC at an even finer scale than seen with KRAS, and similar genetic targeting is being investigated, and advocated, in other cancer types.

Lessons Learned From this Case Study

Although RCTs are generally viewed as the gold standard, results of one or even a series of trials may not accurately reflect the benefits experienced by an individual patient. In this case study, cetuximab initially appeared to have rather modest clinical benefits. However, new information and subsequent genetic subgroup assessments led to very different conclusions. Clinicians should be aware that current knowledge is likely to evolve, and any decisions about patient care should be carefully considered with that uncertainty in mind. As in this case study, subgroup analyses (e.g., genetic subtypes) need a theoretical rationale. Ideally, the analyses should be specified at the time of the original RCT design and should not just occur as explorations of the subsequent data. When improperly employed, post hoc analyses may lead to incorrect patient care conclusions.

RCTs: Tips for CER Practitioners

*RCTs can determine whether an intervention can provide benefit in a very controlled environment.

*The controlled nature of an RCT may limit its generalizability to a broader population.

*No results are permanent; advances in scientific knowledge and understanding can influence how we view the effectiveness (or safety) of a therapeutic intervention.

*Targeted therapy illuminated by carefully thought out subgroup analyses can improve the efficacious and safe use of an intervention.

===Case-Study 2: The Rosiglitazone Study===

Meta-analysis

Often the results for the same intervention differ across clinical trials and it may not be clear whether one therapy provides more benefit than another. As CER increases and more studies are conducted, clinicians and policymakers are more likely to encounter this scenario. In a systematic review, a researcher identifies similar studies and displays their results in a table, enabling qualitative comparisons across the studies. With a meta-analysis, the data from included studies are statistically combined into a single “result.” Merging the data from a number of studies increases the effective sample size of the investigation, providing a statistically stronger conclusion about the body of research. By so doing, investigators may detect low frequency events and demonstrate more subtle distinctions between therapeutic alternatives.

When studies have been properly identified and combined, the meta-analysis produces a summary estimate of the findings and a confidence interval that can serve as a benchmark in medical opinion and practice. However, when done incorrectly, the quantitative and statistical analysis can create impressive “numbers” but biased results. The following are important criteria for properly conducted meta-analyses:

1. Carefully defining unbiased inclusion or exclusion criteria for study selection

2. Including only those studies that have similar design elements, such as patient population, drug regimen, outcomes being assessed, and time-frame

3. Applying correct statistical methods to combine and analyze the data

Reporting this information is essential for the reader to determine whether the data were suitable to combine, and if the meta-analysis draws unbiased conclusions. Meta-analyses of randomized clinical trials are considered to be the highest level of medical evidence as they are based upon a synthesis of rigorously controlled trials that systematically reduce bias and confounding. This technique is useful in summarizing available evidence and will likely become more common in the era of publicly funded comparative effectiveness research. The following case study will examine several key principles that will be useful as the reader encounters these publications.

Clinical Application

Heart disease is the leading cause of mortality in the United States, resulting in approximately 20% of all deaths. Diabetics are particularly susceptible to heart disease, with more than 65% of deaths attributable to it. The nonfatal complications of diabetes are wide-ranging and include kidney failure, nerve damage, amputation, stroke and blindness, among other outcomes. In 2007, the total estimated cost of diabetes in the United States was $174B; $116B was derived from direct medical expenditures and the rest from the indirect cost of lost productivity due to the disease. With such serious health effects and heavy direct and indirect costs tied to diabetes, proper disease management is critical. Historically, diabetes treatment has focused on strict blood sugar control, assuming that this goal not only targets diabetes but also reduces other serious comorbidities of the disease.

Anti-diabetic agents have long been associated with key questions as to their benefits/risks in the treatment of diabetes. The sulfonylurea tolbutamide, a first generation anti-diabetic drug, was found in a landmark study in the 1970s to significantly increase the CV mortality rate compared to patients not on this agent. Further analysis by external parties concluded that the methods employed in this trial were significantly flawed (e.g., use of an “arbitrary” definition of diabetes status, heterogeneous baseline characteristics of the populations studied, and incorrect statistical methods). Since these early studies, CV concerns continue to be an issue with selected oral hypoglycemic agents that have subsequently entered the marketplace.

A class of drugs, the thiazolidinediones (TZDs), was approved in the late 1990s as a solution to the problems associated with the older generation of sulfonylureas. Rosiglitazone, a member of the TZD class, was approved by the FDA in 1999 and was widely prescribed for the treatment of type-2 diabetes. A number of RCTs supported the benefit of rosiglitazone as an important new oral antidiabetic agent. However, safety concerns developed as the FDA received reports of adverse cardiac events potentially associated with rosiglitazone. It was in this setting that a meta-analysis by Nissen and Wolski was published in the New England Journal of Medicine in June 2007.

What was done?

Nissen and Wolski conducted a meta-analysis examining the impact of rosiglitazone on cardiac events and mortality compared to alternative therapeutic approaches. The study began with a broad search to locate potential studies for review. The authors screened published phase II, III, and IV trials; the FDA website; and the drug manufacturer’s clinical-trial registry for applicable data relating to rosiglitazone use. When the initial search was complete, the studies were further categorized by pre-stated inclusion criteria. Meta-analysis inclusion criteria were simple: studies had to include rosiglitazone and a randomized comparator group treated with either another drug or placebo, study arms had to show similar length of treatment, and all groups had to have received more than 24 weeks of exposure to the study drugs. The studies had to contain outcome data of interest including the rate of myocardial infarction (MI) or death from all CV causes. Out of 116 studies surveyed by the authors, 42 met their inclusion criteria and were included in the meta-analysis. Of the studies they included, 23 had durations of 26 weeks or less, and only five studies followed patients for more than a year. Until this point, the study’s authors were following a path similar to that of any reviewer interested in CV outcomes, examining the results of these 42 studies and comparing them qualitatively. Quantitatively combining the data, however, required the authors to make choices about the studies they could merge and the statistical methods they should apply for analysis. Those decisions greatly influenced the results that were reported.

What was found?

When the studies were combined, the meta-analysis contained data from 15,565 patients in the rosiglitazone group and 12,282 patients as comparators. Analyzing their data, the authors chose one particular statistical method (the Peto odds ratio method, a fixed-effect statistical approach), which calculates the odds of events occurring where the outcomes of interest are rare and small in number. In comparing rosiglitazone with a “control” group that included other drugs or placebo, the authors reported odds ratios of 1.43 (95% CI, 1.03-1.98; P=0.03) and 1.64 (95% CI, 0.98-2.74; P=0.06) for MI and death from CV causes, respectively. In other words, the odds of an MI or death from a CV cause are higher for rosiglitazone patients than for patients on other therapies or placebo. The authors reported that rosiglitazone was significantly associated with an increase in the risk of MI and had borderline significance in increasing the risk of death from all CV causes. These findings appeared online on the same day that the FDA issued a safety alert regarding rosiglitazone. Discussion of the meta-analysis was immediately featured prominently in the news media. By December 2007, prescription claims for the drug at retail pharmacies had fallen by more than 50%.
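
To make the pooling step concrete, the following is a minimal R sketch of a one-step Peto pooled odds ratio. The trial counts, the data frame <code>trials</code>, and the helper <code>peto_or()</code> are hypothetical and chosen only for illustration; they are not the Nissen and Wolski data, and this is not a reconstruction of their analysis.

```r
# Hypothetical 2x2 counts for three illustrative trials
# (made-up numbers, NOT the data from the rosiglitazone meta-analysis).
trials <- data.frame(
  events_trt = c(2, 5, 1),       # events in the treatment arm
  n_trt      = c(200, 1500, 350),
  events_ctl = c(1, 3, 0),       # events in the comparator arm
  n_ctl      = c(100, 1500, 175)
)

# One-step Peto pooled odds ratio with a 95% confidence interval.
peto_or <- function(d) {
  N <- d$n_trt + d$n_ctl                 # total patients per trial
  m <- d$events_trt + d$events_ctl       # total events per trial
  O <- d$events_trt                      # observed events, treatment arm
  E <- d$n_trt * m / N                   # expected events under no effect
  V <- d$n_trt * d$n_ctl * m * (N - m) / (N^2 * (N - 1))  # hypergeometric variance
  log_or <- sum(O - E) / sum(V)
  se     <- 1 / sqrt(sum(V))
  c(or    = exp(log_or),
    lower = exp(log_or - 1.96 * se),
    upper = exp(log_or + 1.96 * se))
}

res <- peto_or(trials)
print(round(res, 2))

# A trial with zero events in both arms has O - E = 0 and V = 0,
# so it leaves the Peto estimate unchanged.
zero_trial <- data.frame(events_trt = 0, n_trt = 100, events_ctl = 0, n_ctl = 100)
res_with_zero <- peto_or(rbind(trials, zero_trial))
```

In this formulation, trials without any events carry no weight, which is one reason the handling of zero-event trials matters for the pooled estimate.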

As diabetic patients and their clinicians reacted to the news, a methodologic debate also ensued. This discussion included statistical issues pertaining to the conduct of the analysis, its implications for clinical care, and finally the FDA and drug manufacturer’s roles in overseeing and regulating rosiglitazone. Concern about treatment among patients with diabetes and in the medical community continues today.

Was this the right answer?

Should the studies have been combined? Commentators faulted the authors for including several studies that were not originally intended to investigate diabetes, and for combining both placebo and drug therapy data into one comparator arm. Some critics noted that despite the stated inclusion criteria, some data were derived from studies where the rosiglitazone arm was allowed a longer follow-up than the comparator arm. By failing to account for this longer follow-up period, commentators felt that the authors may have overestimated the effect of rosiglitazone on CV outcomes. Many reviewers were concerned that this meta-analysis excluded trials in which no patients suffered an MI or died from CV causes – the outcomes of greatest interest. Some reviewers also noted that the exclusion of zero-event trials from the pooled dataset not only gave an incomplete picture of the impact of rosiglitazone but could have increased the odds ratio estimate. In general, the pooled dataset was criticized by many for being a faulty microcosm of the information available regarding rosiglitazone.

It is essential that a meta-analysis be based on similarity in the data sources. If studies differ in important areas such as the patient populations, interventions, or outcomes, combining their data may not be suitable. The researchers accepted studies and populations that were clinically heterogeneous, yet pooled them as if they were not. The study reported that the results were combined from a number of trials that were not initially intended to investigate CV outcomes. Furthermore, the available data did not allow for time-to-event analysis, an essential tool in comparing the impact of alternative treatment options. Reviewers considered the data to be insufficiently homogeneous, and the line of cause and effect to be murkier than the authors described.

Were the statistical methods optimal?

The statistical methods for this meta-analysis also came under significant criticism. The critiques focused on the authors’ use of the Peto method as being an incorrect choice because data were pooled from both small and very large studies, resulting in a potential overestimation of treatment effect. Other reviewers pointed out that the Peto method should not have been used, as a number of the underlying studies did not have patients assigned equally to rosiglitazone and comparator groups. Finally, critics suggested that the heterogeneity of the included studies required an altogether different set of analytic techniques.

Demonstrating the sensitivity of the authors’ initial analysis to the inclusion criteria and statistical tests used, a number of researchers reworked the data from this study. One researcher used the same studies but analyzed the data with a more commonly used statistical method (Mantel-Haenszel), and found no significant increase in the relative risk or common odds ratio with MI or CV death. When the pool of studies was expanded to include those originally eliminated because they had zero CV events, the odds ratios for MI and death from CV causes dropped from 1.43 to 1.26 (95% CI, 0.93-1.72) and from 1.64 to 1.14 (95% CI, 0.74-1.74), respectively. Neither of the recalculated odds ratios was significant. Finally, several newer long-term studies have been published since the Nissen meta-analysis. Incorporating their results with the meta-analysis data showed that rosiglitazone is associated with an increased risk of MI but not of CV death. Thus, the findings from these meta-analyses varied with the methods employed, the studies included, and the addition of later trials.
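
The sensitivity to zero-event trials can be illustrated with a small R sketch of a Mantel-Haenszel pooled odds ratio. Everything here is an assumption made for illustration: the counts are made up, <code>mh_or()</code> is a hypothetical helper, and adding 0.5 to each cell of a zero-event trial is just one common continuity correction, not necessarily the one used in the re-analyses described above.

```r
# Hypothetical trials with at least one event (made-up numbers).
with_events <- data.frame(
  events_trt = c(2, 5, 1),
  n_trt      = c(200, 1500, 350),
  events_ctl = c(1, 3, 0),
  n_ctl      = c(100, 1500, 175)
)
# A hypothetical trial with no events in either arm.
zero_trial <- data.frame(events_trt = 0, n_trt = 250, events_ctl = 0, n_ctl = 250)
all_trials <- rbind(with_events, zero_trial)

# Mantel-Haenszel pooled odds ratio. `correction` is added to every cell
# of trials that have zero events in both arms (illustrative assumption).
mh_or <- function(d, correction = 0) {
  ev_t <- d$events_trt                  # events, treatment arm
  no_t <- d$n_trt - d$events_trt        # non-events, treatment arm
  ev_c <- d$events_ctl                  # events, comparator arm
  no_c <- d$n_ctl - d$events_ctl        # non-events, comparator arm
  add  <- ifelse((ev_t + ev_c) == 0, correction, 0)
  ev_t <- ev_t + add; no_t <- no_t + add
  ev_c <- ev_c + add; no_c <- no_c + add
  N <- ev_t + no_t + ev_c + no_c
  sum(ev_t * no_c / N) / sum(no_t * ev_c / N)
}

or_events      <- mh_or(with_events)                  # zero-event trial excluded
or_uncorrected <- mh_or(all_trials)                   # included, no correction
or_corrected   <- mh_or(all_trials, correction = 0.5) # included, with correction
print(c(excluded = or_events, uncorrected = or_uncorrected, corrected = or_corrected))
```

Without a correction, the zero-event trial contributes nothing to either sum. With the correction and equal arm sizes, it enters with an odds ratio of 1 and pulls the pooled estimate toward 1.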

Emerging Data

The controversy surrounding the rosiglitazone meta-analysis authored by Nissen and Wolski forced an unplanned interim analysis of a long-term, randomized trial investigating the CV effects of rosiglitazone among patients with type 2 diabetes. The authors of the RECORD trial noted that even though the follow-up at 3.75 years was shorter than expected, rosiglitazone, when added to standard glucose-lowering therapy, was found to be associated with an increase in the risk of heart failure but was not associated with any increase in death from CV or other causes. Data at the time were found to be insufficient to determine the effect of rosiglitazone on the risk of MI. The final report of that trial, published in June 2009, confirmed the elevated risk of heart failure in people with type 2 diabetes treated with rosiglitazone in addition to glucose-lowering drugs, but continued to show inconclusive results about the effect of the drug therapy on the risk of MI. Further, the RECORD trial clarified that rosiglitazone does not result in an increased risk of CV morbidity or mortality compared to standard glucose-lowering drugs. Other trials conducted since the publication of the meta-analysis have corroborated these results, casting further doubt on the findings of the meta-analysis published by Nissen and Wolski.

Now what?

Some sources suggest that the original Nissen meta-analysis delivered more harm than benefit, and that a well-recognized medical journal may have erred in its process of peer review. Despite this criticism, it is important to note that subsequent publications support the risk of adverse CV events associated with rosiglitazone, although rosiglitazone use does not appear to increase deaths. These results and emerging data point to the need for further rigorous research to clarify the benefits and risks of rosiglitazone on a variety of outcomes, and the importance of directing the drug to the population that will maximally benefit from its use.

Lessons Learned From this Case Study

Results from initial randomized trials that seem definitive at one time may not be conclusive, as further trials may emerge to clarify, redirect, or negate previously accepted results. A meta-analysis of those trials can lead to varying results based upon the timing of the analysis and the choices made in its performance.

Meta-Analysis: Tips for CER Practitioners

*The results of a meta-analysis are highly dependent on the studies included (and excluded). Are these criteria properly defined and relevant to the purposes of the meta-analysis? Were the combined studies sufficiently similar? Can results from this cohort be generalized to other populations of interest?

*The statistical methodology can impact study results. Have there been reviews critiquing the methods used in the meta-analysis?

*A variety of statistical tests should be considered, and perhaps reported, in the analysis of results. Do the authors mention their rationale in choosing a statistical method? Do they show the stability of their results across a spectrum of analytical methods?

*Nothing is permanent. Emerging data may change the playing field, and meta-analysis results are only as good as the data and statistics from which they are derived.

===Case-Study 3: The Nurses’ Health Study===

An observational study

An observational study is a very common type of research design in which the effects of a treatment or condition are studied without formally randomizing patients in an experimental design. Such studies can be done prospectively, wherein data are collected about a group of patients going forward in time; or retrospectively, in which the researcher looks into the past, mining existing databases for data that have already been collected. Retrospective studies are frequently performed by using an electronic database that contains, for example, administrative, “billing,” or claims data. Less commonly, observational research uses electronic health records, which have greater clinical information that more closely resembles the data collected in an RCT. Observational studies often take place in “real-world” environments, which allow researchers to collect data for a wide array of outcomes. Patients are not randomized in these studies, but the findings can be used to generate hypotheses for investigation in a more constrained experimental setting. Perhaps the best known observational study is the “Framingham study,” which collected demographic and health data for a group of individuals over many years (and continues to do so) and has provided an understanding of the key risk factors for heart disease and stroke.

Observational studies present many advantages to the comparative effectiveness researcher. The study design can provide a unique glimpse of the use of a health care intervention in the “real world,” an essential step in gauging the gap between efficacy (can a treatment work in a controlled setting?) and effectiveness (does the treatment work in a real-life situation?). Furthermore, observational studies can be conducted at low cost, particularly if they involve the secondary analysis of existing data sources. CER often uses administrative databases, which are based upon the billing data submitted by providers during routine care. These databases typically have limited clinical information, may have errors in them, and generally do not undergo auditing.

The uncontrolled nature of observational studies allows them to be subject to bias and confounding. For example, doctors may prescribe a new medication only for the sickest patients. Comparing these outcomes (without careful statistical adjustment) with those from less ill patients receiving alternative treatment may lead to misleading results. Observational studies can identify important associations but cannot prove cause and effect. These studies can generate hypotheses that may require RCTs for fuller demonstration of those relationships. Secondary analysis can also be problematic if researchers overwork datasets by doing multiple exploratory analyses (e.g., data-dredging): the more we look, the more we find, even if those findings are merely statistical aberrations. Unfortunately, the growing need for CER and the wide availability of administrative databases may lead to selection of research of poor quality with inaccurate findings.

In comparative effectiveness research, observational studies are typically considered to be less conclusive than RCTs and meta-analyses. Nonetheless, they can be useful, especially because they examine typical care. Due to lower cost and improvements in health information, observational studies will become increasingly common. Critical assessment of whether the described results are helpful or biased (based upon how the study was performed) is necessary. This case will illustrate several characteristics of the types of studies that will assist in evaluating newly published work.

Clinical Applications

Cardiovascular diseases (CVD) are the leading cause of death in women older than the age of 50. Epidemiologic evidence suggests that estrogen is a key mediator in the development of CVD. Estrogen is an ovarian hormone whose production decreases as women approach menopause. The steep increase in CVD in women at menopause and older and in women who have had hysterectomies further supports a relationship between estrogen and CVD. Building on this evidence of biologic plausibility, epidemiological and observational studies suggested that estrogen replacement therapy (a form of hormone replacement therapy, or HRT) had favorable effects on the risk of CVD in postmenopausal women (albeit with some negative effects, namely a potential to increase the risk for breast cancer and stroke). Based on these findings, in the 1980s and 1990s HRT was routinely employed to treat menopausal symptoms and serve as prophylaxis against CVD.

What was done?

The Nurses’ Health Study (NHS) began collecting data in 1976. In the study, researchers intended to examine a broad range of health effects in women over a long period of time, and a key goal was to clarify the role of HRT in heart disease. The cohort (i.e., the group being followed) included married registered nurses aged 30-55 in 1976 who lived in the 11 most populous states. To collect data, the researchers mailed the study participants a survey every 2 years that asked questions about topics such as smoking, hormone use, menopausal status, and less frequently, diet. Data were collected for key end points that included MI, coronary-artery bypass grafting or angioplasty, stroke, total CVD mortality, and deaths from all causes.

What was found?

At a 10-year follow-up point, the NHS had a study pool of 48,470 women. The researchers found that estrogen use (alone, without progestin) in postmenopausal women was associated with a reduction in the incidence of CVD as well as in CVD mortality compared to non-users. Later, estrogen-progestin combination therapy was shown to be even more cardioprotective than estrogen monotherapy, and lower doses of estrogen replacement therapy were found to deliver equal cardioprotection and lower the risk for adverse events. NHS researchers were alert to the potential for bias in observational studies. Adjustment for risk factors such as age (a typical practice to eliminate confounding) did not change the reported findings.

Was this the right answer?

The NHS was not unique in reporting the benefits associated with HRT; other observational studies corroborated the NHS findings. A secondary retrospective data analysis of the UK primary care electronic medical record database, for example, also showed the protective effect associated with HRT use. Researchers were aware of the fundamental limitations of observational studies, particularly with regard to selection bias. They and practicing clinicians were also aware of the potential negative health effects of HRT, which had to be constantly weighed against the potential cardioprotective benefits in deciding a patient’s course of treatment. As a large section of the population could experience the health effects of HRT, researchers began planning RCTs to verify the promising observational study results. It was highly anticipated that those RCTs would corroborate the belief that estrogen replacement can reduce CVD risk.

Randomized Controlled Trial: The Women’s Health Initiative

The Women’s Health Initiative (WHI) was a major study established by the National Institutes of Health in 1992 to assess a broad range of health effects in postmenopausal women. The trial was intended to follow these women for 8 years, at a cost of millions of dollars in federal funding. Among its many facets, it included an RCT to confirm the results from the observational studies discussed above. To fully investigate earlier findings, the WHI had two subgroups. One subgroup consisted of women with prior hysterectomies; they received estrogen monotherapy. The second subgroup consisted of women who had not undergone hysterectomy; they received estrogen in combination with progestin. The WHI enrolled 27,347 women in their HRT investigation: 10,739 in the estrogen-alone arm and 16,608 in the estrogen plus progestin arm. Within each arm, women were randomly assigned to receive either HRT or placebo. All women in the trial were postmenopausal and aged 50-79 years; the mean age was 63.6 years (a fact that would be important in later analysis). Some participants had experienced previous CV events. The primary outcome of both subgroups was coronary heart disease (CHD), as described by nonfatal MI or death due to CHD.

The estrogen-progestin arm of the WHI was halted after a mean follow-up of 5.2 years, 3 years earlier than expected, as the HRT users in this arm were found to be at increased risk for CHD compared to those who received placebo. The study also noted elevated rates of breast cancer and stroke, among other poor outcomes. The estrogen-alone arm continued for an average follow-up of 6.8 years before being similarly discontinued ahead of schedule. Although this part of the study did not find an increased risk of CHD, it also did not find any cardioprotective effect. Beyond failing to locate any clear CV benefits, the WHI also found real evidence of harm, including increased risk of blood clots, breast cancer and stroke. Initial WHI publications therefore recommended against HRT being prescribed for the secondary prevention of CVD.

What Next?

Scientists and the clinicians who relied on their data for guidance in treating patients were faced with conflicting data: epidemiological and observational studies suggested that HRT was cardioprotective while the higher-quality evidence from RCTs strongly suggested the opposite. Clinicians primarily followed the WHI results, so prescriptions for HRT in postmenopausal women quickly declined. Meanwhile, researchers began to analyze the studies for potential discrepancies, and found that the women being followed in the NHS and the WHI differed in several important characteristics.

First, the WHI population was older than the NHS cohort, and many had entered menopause at least 10 years before they enrolled in the RCT. Thus, the WHI enrollees experienced a long duration from the onset of menopause to the commencement of HRT. At the same time, many in the NHS population were closer to the onset of menopause and were still displaying hormonal symptoms when they began HRT. Second, although the NHS researchers adjusted the data for various confounding effects, their results could still have been subject to bias. In general, the NHS cohort was more highly educated and of a higher socioeconomic status than the WHI participants, and therefore more likely to see a physician regularly. The NHS women were also leaner and generally healthier than their RCT counterparts, and had been selected for their evident lack of pre-existing CV conditions. This selection bias in the NHS enrollment may have led to a “healthy woman” effect that in turn led to an overestimation of the benefits of therapy in the observational study. Third, researchers noted that dosing differences between the two study types may have contributed to the divergent results. The NHS reported beneficial results following low-dose estrogen therapy. The WHI, meanwhile, used a higher estrogen dose, exposing women to a larger dosage of hormones and increasing their risk for adverse events. The increased risk profile of the WHI women (e.g., older, more comorbidities, higher estrogen dose) could have contributed to the evidence of harm seen in the WHI results.

Emerging Data

In addition to identifying the inherent differences between the two study populations, researchers began a secondary analysis of the NHS and WHI trials. NHS researchers reported that women who began HRT close to the onset of menopause had a significantly reduced risk of CHD. In the subgroups of women who were older and had a similar duration after menopause compared with the WHI women, they found no significant relationship between HRT and CHD. The WHI study also stratified its results by age, and found that women who began HRT close to their onset of menopause experienced some cardioprotection, while women who were further from the onset of menopause had a slightly elevated risk for CHD.

Secondary analysis of both studies was therefore necessary to show that age and a short duration from the onset of menopause are crucial to HRT success as a cardioprotective agent. Neither study type provided “truth” or rather, both studies provided “truth” if viewed carefully (e.g., both produced valid and important results). The differences seen in the studies were rooted in the timing of HRT and the populations being studied.

Lessons Learned From this Case Study

Although RCTs are given a higher evidence grade, observational studies provide important clinical insights. In this example, the study populations differed. For policymakers and clinicians, it is crucial to examine whether the CER was based upon patients similar to those being considered. Any study with a dissimilar population may provide non-relevant results. Thus, readers of CER need to carefully examine the generalizability of the findings being reported.

==Appendix==

General Classification and Regression Tree (CART) data analysis steps, using the R package rpart.

===Growing the Tree===

# To grow a tree, use
rpart(formula, data=, method=,control=), where
formula is in the format outcome ~ predictor1+predictor2+...
data= specifies the data frame
method= "class" for a classification tree, use "anova" for a regression tree
control= optional parameters for controlling tree growth. For example, control=rpart.control(minsplit=30, cp=0.001) requires that the minimum number of observations in a node be 30 before attempting a split and that a split must decrease the overall lack of fit by a factor of 0.001 (cost complexity factor) before being attempted.
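
As a minimal sketch of the call described above, the following grows a classification tree on the <code>kyphosis</code> data frame that ships with rpart. The choice of dataset and predictors is an assumption made purely for illustration; the control values are the ones quoted in the text.

```r
library(rpart)

# Grow a classification tree on rpart's bundled kyphosis data
# (dataset chosen only for illustration).
fit <- rpart(
  Kyphosis ~ Age + Number + Start,   # outcome ~ predictor1 + predictor2 + ...
  data    = kyphosis,                # data frame holding the variables
  method  = "class",                 # "class" = classification, "anova" = regression
  control = rpart.control(minsplit = 30, cp = 0.001)  # tree-growth settings
)

print(fit)
```

For a regression tree, the same call would use a numeric outcome and <code>method = "anova"</code>.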

===Examining Results===

# These functions help with examining the results.
printcp(fit) display complexity parameter (cp) table
plotcp(fit) plot cross-validation results
rsq.rpart(fit) plot approximate R-squared and relative error for different splits (2 plots). Labels are only appropriate for the "anova" method.
print(fit) print results
summary(fit) detailed results including surrogate splits
plot(fit) plot decision tree
text(fit) label the decision tree plot
post(fit, file=) create postscript plot of decision tree
# In trees created by rpart(), move to the LEFT branch when the stated condition is true.
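
The inspection functions listed above can be tried on a small fitted tree. This sketch again assumes the bundled <code>kyphosis</code> data for illustration, and it guards the plotting calls with <code>interactive()</code> so that the script also runs without a graphics device.

```r
library(rpart)

# Fit a small classification tree to inspect (illustrative dataset).
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

printcp(fit)   # complexity parameter (cp) table
print(fit)     # text listing of the splits
summary(fit)   # detailed results, including surrogate splits

# Plots are only drawn in an interactive session.
if (interactive()) {
  plotcp(fit)                              # cross-validation results
  plot(fit, uniform = TRUE, margin = 0.1)  # draw the tree
  text(fit, use.n = TRUE)                  # label nodes with class counts
}
```

The cp table printed by <code>printcp()</code> is also available programmatically as <code>fit$cptable</code>.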

===Pruning Trees===

# In general, trees should be pruned back to avoid overfitting the data. The tree size should minimize the cross-validated error (the xerror column printed by printcp()). Pruning the tree is accomplished by:
prune(fit, cp= )
# Use printcp( ) to examine the cross-validation error results, select the complexity parameter (CP) associated with the minimum error, and insert it into the prune() function. Selecting the complexity parameter associated with the smallest cross-validated error can be done succinctly by:
fit$cptable[which.min(fit$cptable[,"xerror"]),"CP"]
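
The pruning steps can be sketched end to end as follows. This is an illustrative example only: it assumes the kyphosis data set bundled with rpart, and the names best_cp and pruned are hypothetical.

```r
# Minimal sketch: choose the CP with the smallest cross-validated error, then prune.
# Assumption: the kyphosis data bundled with rpart stands in for real study data.
library(rpart)

set.seed(1)  # the cross-validation folds are random, so fix the seed for repeatability
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

printcp(fit)  # CP table, including the xerror column

# Row of the CP table with the minimum cross-validated error
best_cp <- fit$cptable[which.min(fit$cptable[, "xerror"]), "CP"]

pruned <- prune(fit, cp = best_cp)
print(pruned)
```

Because xerror comes from random cross-validation folds, the selected CP can change from run to run unless the seed is fixed.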

===Complete Dataset for N-of-1 Example===
[[SMHS_MethodsHeterogeneity_CER_Nof1|This N-of-1 Dataset]] includes an example.

===[[SMHS_MethodsHeterogeneity|Back to the Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research section]]===

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php/SMHS_MethodsHeterogeneity_CER}}

SMHS MethodsHeterogeneity CER

2016-05-23T18:56:34Z

Pineaumi: /* Observational Studies: Tips for the CER Practitioners */

==[[SMHS_MethodsHeterogeneity| Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research]] - Comparative Effectiveness Research (CER) ==

===Overview===

===Observational Studies: Tips for the CER Practitioners===

*Different study types can offer different understandings; none should be discounted without closer examination.

*RCTs provide an accurate understanding of the effect of a particular intervention in a well-defined patient group under “controlled” circumstances.

*Observational studies provide an understanding of real-world care and its impact, but can be biased due to uncontrolled factors.

*Observational studies differ in the types of databases used. These databases may lack clinical detail and contain incomplete or inaccurate data.

*Before accepting the findings from an observational study, consider whether confounding factors may have influenced the results.

*In this scenario, subgroup analysis was vital in clarifying both study designs; what is true for the many (e.g., overall, estrogen appeared to be detrimental) may not be true for the few (e.g., that for the younger post-menopausal woman, the benefits were greater and the harms less frequent).

*Carefully examine the generalizability of the study. Do the study’s patients and intervention match those under consideration?

*Observational studies can identify associations but cannot prove cause-and-effect relationships.

===Case-Study 1: The Cetuximab Study===

What was done and what was found?

Cetuximab, an anti-epidermal growth factor receptor (EGFR) agent, has recently been added to the therapeutic armamentarium. Two important RCTs examined its impact in patients with mCRC (metastatic colorectal cancer). In the first one, 56 centers in 11 European countries investigated the outcomes associated with cetuximab therapy in 329 mCRC patients who experienced disease progression either on irinotecan therapy or within 3 months thereafter. The study reported that the group on a combination of irinotecan and cetuximab had a significantly higher rate of overall response to treatment (primary endpoint) than the group on cetuximab alone: 22.9% (95% CI, 17.5-29.1%) vs. 10.8% (95% CI, 5.7-18.1%) (P=0.007), respectively. Similarly, the median time to progression was significantly longer in the combination therapy group (4.1 vs. 1.5 months, P<0.001). As these patients had already progressed on irinotecan prior to the study, any response was viewed as positive. Safety between the two treatment arms was similar: approximately 80% of patients in each arm experienced a rash. Grade 3 or 4 (the more severe) toxic effects on the skin were slightly more frequent in the combination-therapy group compared to cetuximab monotherapy, observed in 9.4% and 5.2% of participants, respectively. Other side effects, such as diarrhea and neutropenia observed in the combination-therapy arm, were considered to be in the range expected for irinotecan alone. Data from this study demonstrated the efficacy and safety of cetuximab and were instrumental in the FDA’s 2004 approval.

A second RCT (2007) examined 572 patients and suggested efficacy of cetuximab in the treatment of mCRC. This study was a randomized, non-blinded, controlled trial that examined cetuximab monotherapy plus best supportive care compared to best supportive care alone in patients who had received and failed prior chemotherapy regimens. It reported that median overall survival (the primary endpoint) was significantly higher in patients receiving cetuximab plus best supportive care compared to best supportive care alone (6.1 vs. 4.6 months, respectively) (hazard ratio for death=0.77; 95% CI: 0.64-0.92, P=0.005). This RCT described a greater incidence of adverse events in the cetuximab plus best supportive care group compared to best supportive care alone including (most significantly) rash, as well as edema, fatigue, nausea and vomiting.

Was this the right answer?

These RCTs had fairly broad enrollment criteria and the cetuximab benefits were modest. Emerging scientific theories raised the possibility that genetically defined population subsets might experience a greater-than-average treatment benefit. One such area of inquiry entailed examining “biomarkers,” or genetic indicators of a patient’s greater response to therapy. Even as the above RCTs were being conducted, data emerged showing the importance of the KRAS gene.

Emerging Data

Based on the emerging biochemical evidence that the epidermal growth factor receptor (EGFR) treatment mechanism (cetuximab) was even more finely detailed than previously understood, the study authors of the 2007 RCT undertook a retrospective subgroup analysis using tumor tissue samples preserved from their initial study. Following laboratory analysis, all viable tissue samples were classified as having a wild-type (non-mutated) or a mutated KRAS gene. Instead of the previous two study arms (cetuximab plus best supportive care vs. best supportive care alone), there were 4 for this new analysis: each of the two original study arms was further divided by wild-type vs. mutated KRAS status. Laboratory evaluation determined that 40.9% of patients in the cetuximab plus best supportive care group and 42.3% of patients in the best supportive care alone group had a KRAS mutation. The efficacy of cetuximab was found to be significantly correlated with KRAS status: in patients with wild-type (non-mutated) KRAS genes, cetuximab plus best supportive care compared to best supportive care alone improved overall survival (median 9.5 vs. 4.8 months, respectively; hazard ratio for death=0.55; 95% CI, 0.41-0.74, P<0.001), and progression-free survival (median 3.7 vs. 1.9 months, respectively; hazard ratio for progression or death=0.40; 95% CI, 0.30-0.54, P<0.001). Meanwhile, in patients with mutated KRAS tumors, the authors found no significant difference in outcome between cetuximab plus best supportive care vs. best supportive care alone.

What next?

Based on these and similar results from other studies, the FDA narrowed its product labeling in July 2009 to indicate that cetuximab is not recommended for mCRC patients with mutated KRAS tumors. This distinction reduces the relevant population by approximately 40%. Similarly, the American Society of Clinical Oncology released a provisional clinical recommendation that all mCRC patients have their tumors tested for KRAS status before receiving anti-EGFR therapy. The benefits of targeted treatment are many. Patients who previously underwent cetuximab therapy without knowing their genetic predisposition would no longer have to be exposed to the drug’s toxic effects if unnecessary, as the efficacy of cetuximab is markedly higher in the genetically defined appropriate patients. In a less-uncertain environment, clinicians can be more confident in advocating a course of action in their care of patients. And finally, knowledge that targeted therapy is possible suggests the potential for further innovation in treatment options. In fact, research continues to demonstrate options for targeted cetuximab treatment of mCRC at an even finer scale than seen with KRAS; and similar genetic targeting is being investigated, and advocated, in other cancer types.

Lessons Learned From this Case Study

Although RCTs are generally viewed as the gold standard, results of one or even a series of trials may not accurately reflect the benefits experienced by an individual patient. This case-study suggests that cetuximab initially appeared to have rather modest clinical benefits. However, new information that became available and subsequent genetic subgroup assessments led to very different conclusions. Clinicians should be aware that the current knowledge is likely to evolve and any decisions about patient care should be carefully considered with that sense of uncertainty in mind. As in this case study, subgroup analyses (e.g., genetic subtypes) need a theoretical rationale. Ideally, the analyses should be determined at the time of original RCT design and should not just occur as explorations of the subsequent data. When improperly employed, post hoc analyses may lead to incorrect patient care conclusions.

RCTs: Tips for the CER Practitioners

*RCTs can determine whether an intervention can provide benefit in a very controlled environment.

*The controlled nature of an RCT may limit its generalizability to a broader population.

*No results are permanent; advances in scientific knowledge and understanding can influence how we view the effectiveness (or safety) of a therapeutic intervention.

*Targeted therapy illuminated by carefully thought out subgroup analyses can improve the efficacious and safe use of an intervention.

===Case-Study 2: The Rosiglitazone Study===

Meta-analysis

Often the results for the same intervention differ across clinical trials and it may not be clear whether one therapy provides more benefit than another. As CER increases and more studies are conducted, clinicians and policymakers are more likely to encounter this scenario. In a systematic review, a researcher identifies similar studies and displays their results in a table, enabling qualitative comparisons across the studies. With a meta-analysis, the data from included studies are statistically combined into a single “result.” Merging the data from a number of studies increases the effective sample size of the investigation, providing a statistically stronger conclusion about the body of research. By so doing, investigators may detect low frequency events and demonstrate more subtle distinctions between therapeutic alternatives.

When studies have been properly identified and combined, the meta-analysis produces a summary estimate of the findings and a confidence interval that can serve as a benchmark in medical opinion and practice. However, when done incorrectly, the quantitative and statistical analysis can create impressive “numbers” but biased results. The following are important criteria for properly conducted meta-analyses:

1. Carefully defining unbiased inclusion or exclusion criteria for study selection

2. Including only those studies that have similar design elements, such as patient population, drug regimen, outcomes being assessed, and time-frame

3. Applying correct statistical methods to combine and analyze the data

Reporting this information is essential for the reader to determine whether the data were suitable to combine, and if the meta-analysis draws unbiased conclusions. Meta-analyses of randomized clinical trials are considered to be the highest level of medical evidence as they are based upon a synthesis of rigorously controlled trials that systematically reduce bias and confounding. This technique is useful in summarizing available evidence and will likely become more common in the era of publicly funded comparative effectiveness research. The following case study will examine several key principles that will be useful as the reader encounters these publications.

Clinical Application

Heart disease is the leading cause of mortality in the United States, resulting in approximately 20% of all deaths. Diabetics are particularly susceptible to heart disease, with more than 65% of deaths attributable to it. The nonfatal complications of diabetes are wide-ranging and include kidney failure, nerve damage, amputation, stroke and blindness, among other outcomes. In 2007, the total estimated cost of diabetes in the United States was $174B; $116B was derived from direct medical expenditures and the rest from the indirect cost of lost productivity due to the disease. With such serious health effects and heavy direct and indirect costs tied to diabetes, proper disease management is critical. Historically, diabetes treatment has focused on strict blood sugar control, assuming that this goal not only targets diabetes but also reduces other serious comorbidities of the disease.

Anti-diabetic agents have long been associated with key questions as to their benefits/risks in the treatment of diabetes. The sulfonylurea tolbutamide, a first generation anti-diabetic drug, was found in a landmark study in the 1970s to significantly increase the CV mortality rate compared to patients not on this agent. Further analysis by external parties concluded that the methods employed in this trial were significantly flawed (e.g., use of an “arbitrary” definition of diabetes status, heterogeneous baseline characteristics of the populations studied, and incorrect statistical methods). Since these early studies, CV concerns continue to be an issue with selected oral hypoglycemic agents that have subsequently entered the marketplace.

A class of drugs, thiazolidinedione (TZD), was approved in the late 1990s, as a solution to the problems associated with the older generation of sulfonylureas. Rosiglitazone, a member of the TZD class, was approved by the FDA in 1999 and was widely prescribed for the treatment of type-2 diabetes. A number of RCTs supported the benefit of rosiglitazone as an important new oral antidiabetic agent. However, safety concerns developed as the FDA received reports of adverse cardiac events potentially associated with rosiglitazone. It was in this setting that a meta-analysis by Nissen and Wolski was published in the New England Journal of Medicine in June 2007.

What was done?

Nissen and Wolski conducted a meta-analysis examining the impact of rosiglitazone on cardiac events and mortality compared to alternative therapeutic approaches. The study began with a broad search to locate potential studies for review. The authors screened published phase II, III, and IV trials; the FDA website; and the drug manufacturer’s clinical-trial registry for applicable data relating to rosiglitazone use. When the initial search was complete, the studies were further categorized by pre-stated inclusion criteria. Meta-analysis inclusion criteria were simple: studies had to include rosiglitazone and a randomized comparator group treated with either another drug or placebo, study arms had to show similar length of treatment, and all groups had to have received more than 24 weeks of exposure to the study drugs. The studies had to contain outcome data of interest including the rate of myocardial infarction (MI) or death from all CV causes. Out of 116 studies surveyed by the authors, 42 met their inclusion criteria and were included in the meta-analysis. Of the studies they included, 23 had durations of 26 weeks or less, and only five studies followed patients for more than a year. Until this point, the study’s authors were following a path similar to that of any reviewer interested in CV outcomes, examining the results of these 42 studies and comparing them qualitatively. Quantitatively combining the data, however, required the authors to make choices about the studies they could merge and the statistical methods they should apply for analysis. Those decisions greatly influenced the results that were reported.

What was found?

When the studies were combined, the meta-analysis contained data from 15,565 patients in the rosiglitazone group and 12,282 patients as comparators. Analyzing their data, the authors chose one particular statistical method (the Peto odds ratio method, a fixed-effect statistical approach), which calculates the odds of events occurring where the outcomes of interest are rare and small in number. In comparing rosiglitazone with a “control” group that included other drugs or placebo, the authors reported odds ratios of 1.43 (95% CI, 1.03-1.98; P=0.03) and 1.64 (95% CI, 0.98-2.74; P=0.06) for MI and death from CV causes, respectively. In other words, the odds of an MI or death from a CV cause are higher for rosiglitazone patients than for patients on other therapies or placebo. The authors reported that rosiglitazone was significantly associated with an increase in the risk of MI and had borderline significance in increasing the risk of death from all CV causes. These findings appeared online on the same day that the FDA issued a safety alert regarding rosiglitazone. Discussion of the meta-analysis was immediately featured prominently in the news media. By December 2007, prescription claims for the drug at retail pharmacies had fallen by more than 50%.
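
To make the Peto method concrete, here is a minimal R sketch of the standard one-step Peto pooled odds ratio with a 95% confidence interval. The trial counts are hypothetical and invented purely for illustration; they are not the rosiglitazone data, and the function name peto_or is an assumption of this sketch.

```r
# Peto one-step pooled odds ratio (fixed-effect) with a 95% confidence interval.
# All trial counts below are HYPOTHETICAL, invented for illustration only.
peto_or <- function(ev_t, n_t, ev_c, n_c) {
  N <- n_t + n_c                                   # patients per trial
  m <- ev_t + ev_c                                 # events per trial (both arms)
  E <- n_t * m / N                                 # expected treated-arm events under no effect
  V <- n_t * n_c * m * (N - m) / (N^2 * (N - 1))   # hypergeometric variance of treated-arm events
  log_or <- sum(ev_t - E) / sum(V)                 # pooled log odds ratio
  se     <- 1 / sqrt(sum(V))
  z      <- qnorm(0.975)
  c(OR = exp(log_or), lower = exp(log_or - z * se), upper = exp(log_or + z * se))
}

ev_t <- c(6, 4, 1); n_t <- c(200, 150, 100)   # treated arm: events, patients
ev_c <- c(3, 2, 1); n_c <- c(200, 150, 100)   # comparator arm: events, patients

res <- peto_or(ev_t, n_t, ev_c, n_c)
print(round(res, 2))
```

In this formula a trial with no events in either arm has an observed-minus-expected difference of zero and a variance of zero, so it adds nothing to the pooled estimate.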

As diabetic patients and their clinicians reacted to the news, a methodologic debate also ensued. This discussion included statistical issues pertaining to the conduct of the analysis, its implications for clinical care, and finally the FDA and drug manufacturer’s roles in overseeing and regulating rosiglitazone. The concern among patients with diabetes regarding treatment continues in the medical community today.

Was this the right answer?

Should the studies have been combined? Commentators faulted the authors for including several studies that were not originally intended to investigate diabetes, and for combining both placebo and drug therapy data into one comparator arm. Some critics noted that despite the stated inclusion criteria, some data were derived from studies where the rosiglitazone arm was allowed a longer follow-up than the comparator arm. By failing to account for this longer follow-up period, commentators felt that the authors may have overestimated the effect of rosiglitazone on CV outcomes. Many reviewers were concerned that this meta-analysis excluded trials in which no patients suffered an MI or died from CV causes – the outcomes of greatest interest. Some reviewers also noted that the exclusion of zero-event trials from the pooled dataset not only gave an incomplete picture of the impact of rosiglitazone but could have increased the odds ratio estimate. In general, the pooled dataset was criticized by many for being a faulty microcosm of the information available regarding rosiglitazone.

It is essential that a meta-analysis be based on similarity in the data sources. If studies differ in important areas such as the patient populations, interventions, or outcomes, combining their data may not be suitable. The researchers accepted studies and populations that were clinically heterogeneous, yet pooled them as if they were not. The study reported that the results were combined from a number of trials that were not initially intended to investigate CV outcomes. Furthermore, the available data did not allow for time-to-event analysis, an essential tool in comparing the impact of alternative treatment options. Reviewers considered the data to be insufficiently homogeneous, and the line of cause and effect to be murkier than the authors described.

Were the statistical methods optimal?

The statistical methods for this meta-analysis also came under significant criticism. The critiques focused on the authors’ use of the Peto method as being an incorrect choice because data were pooled from both small and very large studies, resulting in a potential overestimation of treatment effect. Other reviewers pointed out that the Peto method should not have been used, as a number of the underlying studies did not have patients assigned equally to rosiglitazone and comparator groups. Finally, critics suggested that the heterogeneity of the included studies required an altogether different set of analytic techniques.

Demonstrating the sensitivity of the authors’ initial analysis to the inclusion criteria and statistical tests used, a number of researchers reworked the data from this study. One researcher used the same studies but analyzed the data with a more commonly used statistical method (Mantel-Haenszel), and found no significant increase in the relative risk or common odds ratio with MI or CV death. When the pool of studies was expanded to include those originally eliminated because they had zero CV events, the odds ratios for MI and death from CV causes dropped from 1.43 to 1.26 (95% CI, 0.93-1.72) and from 1.64 to 1.14 (95% CI, 0.74-1.74), respectively. Neither of the recalculated odds ratios was significant for MI or CV death. Finally, several newer long-term studies have been published since the Nissen meta-analysis. Incorporating their results with the meta-analysis data showed that rosiglitazone is associated with an increased risk of MI but not of CV death. Thus, the findings from these meta-analyses varied with the methods employed, the studies included, and the addition of later trials.
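
The sensitivity of a pooled odds ratio to zero-event trials can be illustrated with a small R sketch of the Mantel-Haenszel estimator. The trial counts are hypothetical, the function name mh_or is invented for this sketch, and the continuity correction of 0.5 per cell is a common convention assumed here; the source does not state which correction the reanalyses used.

```r
# Mantel-Haenszel pooled odds ratio with an optional continuity correction
# applied only to trials that have no events in either arm.
# All trial counts below are HYPOTHETICAL, invented for illustration only.
mh_or <- function(ev_t, n_t, ev_c, n_c, cc = 0) {
  zero <- (ev_t + ev_c) == 0             # trials with zero events in both arms
  a  <- ev_t + cc * zero                 # treated, event
  b  <- (n_t - ev_t) + cc * zero         # treated, no event
  c_ <- ev_c + cc * zero                 # comparator, event
  d  <- (n_c - ev_c) + cc * zero         # comparator, no event
  N  <- a + b + c_ + d
  sum(a * d / N) / sum(b * c_ / N)
}

ev_t <- c(6, 4, 0); n_t <- c(200, 150, 100)   # third trial has no events at all
ev_c <- c(3, 2, 0); n_c <- c(200, 150, 100)

or_raw <- mh_or(ev_t, n_t, ev_c, n_c)            # zero-event trial contributes nothing
or_cc  <- mh_or(ev_t, n_t, ev_c, n_c, cc = 0.5)  # zero-event trial enters via the correction

print(c(without_correction = or_raw, with_correction = or_cc))
```

With equal arm sizes, as assumed in this toy example, bringing the zero-event trial in through the correction pulls the pooled estimate toward 1; with unequal arms the direction and size of the shift depend on the allocation ratio.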

Emerging Data

The controversy surrounding the rosiglitazone meta-analysis authored by Nissen and Wolski forced an unplanned interim analysis of a long-term, randomized trial investigating the CV effects of rosiglitazone among patients with type 2 diabetes. The authors of the RECORD trial noted that even though the follow-up at 3.75 years was shorter than expected, rosiglitazone, when added to standard glucose-lowering therapy, was found to be associated with an increase in the risk of heart failure but was not associated with any increase in death from CV or other causes. Data at the time were found to be insufficient to determine the effect of rosiglitazone on an increase in the risk of MI. The final report of that trial, published in June 2009, confirmed the elevated risk of heart failure in people with type 2 diabetes treated with rosiglitazone in addition to glucose-lowering drugs, but continued to show inconclusive results about the effect of the drug therapy on the risk of MI. Further, the RECORD trial clarified that rosiglitazone does not result in an increased risk of CV morbidity or mortality compared to standard glucose-lowering drugs. Other trials conducted since the publishing of the meta-analysis have corroborated these results, casting further doubt on the findings of the meta-analysis published by Nissen and Wolski.

Now what?

Some sources suggest that the original Nissen meta-analysis delivered more harm than benefit, and that a well-recognized medical journal may have erred in its process of peer review. Despite this criticism, it is important to note that subsequent publications support the risk of adverse CV events associated with rosiglitazone, although rosiglitazone use does not appear to increase deaths. These results and emerging data point to the need for further rigorous research to clarify the benefits and risks of rosiglitazone on a variety of outcomes, and the importance of directing the drug to the population that will maximally benefit from its use.

Lessons Learned From this Case Study

Results from initial randomized trials that seem definitive at one time may not be conclusive, as further trials may emerge to clarify, redirect, or negate previously accepted results. A meta-analysis of those trials can lead to varying results based upon the timing of the analysis and the choices made in its performance.

Meta-Analysis: Tips for CER Practitioners

*The results of a meta-analysis are highly dependent on the studies included (and excluded). Are these criteria properly defined and relevant to the purposes of the meta-analysis? Were the combined studies sufficiently similar? Can results from this cohort be generalized to other populations of interest?

*The statistical methodology can impact study results. Have there been reviews critiquing the methods used in the meta-analysis?

*A variety of statistical tests should be considered, and perhaps reported, in the analysis of results. Do the authors mention their rationale in choosing a statistical method? Do they show the stability of their results across a spectrum of analytical methods?

*Nothing is permanent. Emerging data may change the playing field, and meta-analysis results are only as good as the data and statistics from which they are derived.

===Case-Study 3: The Nurses’ Health Study===

An observational study

An observational study is a very common type of research design in which the effects of a treatment or condition are studied without formally randomizing patients in an experimental design. Such studies can be done prospectively, wherein data are collected about a group of patients going forward in time; or retrospectively, in which the researcher looks into the past, mining existing databases for data that have already been collected. The latter studies are frequently performed by using an electronic database that contains, for example, administrative, “billing,” or claims data. Less commonly, observational research uses electronic health records, which have greater clinical information that more closely resembles the data collected in an RCT. Observational studies often take place in “real-world” environments, which allow researchers to collect data for a wide array of outcomes. Patients are not randomized in these studies, but the findings can be used to generate hypotheses for investigation in a more constrained experimental setting. Perhaps the best known observational study is the “Framingham study,” which collected demographic and health data for a group of individuals over many years (and continues to do so) and has provided an understanding of the key risk factors for heart disease and stroke.

Observational studies present many advantages to the comparative effectiveness researcher. The study design can provide a unique glimpse of the use of a health care intervention in the “real world,” an essential step in gauging the gap between efficacy (can a treatment work in a controlled setting?) and effectiveness (does the treatment work in a real-life situation?). Furthermore, observational studies can be conducted at low cost, particularly if they involve the secondary analysis of existing data sources. CER often uses administrative databases, which are based upon the billing data submitted by providers during routine care. These databases typically have limited clinical information, may have errors in them, and generally do not undergo auditing.

The uncontrolled nature of observational studies allows them to be subject to bias and confounding. For example, doctors may prescribe a new medication only for the sickest patients. Comparing these outcomes (without careful statistical adjustment) with those from less ill patients receiving alternative treatment may lead to misleading results. Observational studies can identify important associations but cannot prove cause and effect. These studies can generate hypotheses that may require RCTs for fuller demonstration of those relationships. Secondary analysis can also be problematic if researchers overwork datasets by doing multiple exploratory analyses (e.g., data-dredging): the more we look, the more we find, even if those findings are merely statistical aberrations. Unfortunately, the growing need for CER and the wide availability of administrative databases may lead to selection of research of poor quality with inaccurate findings.
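
The prescribing example above (a new medication given mainly to the sickest patients) can be illustrated with a small simulation. Everything here is a hypothetical assumption made for illustration: the sample size, the probabilities, and the variable names. By construction the treatment has no effect on the outcome.

```r
# Simulated confounding by indication: sicker patients are more likely to be
# treated AND more likely to die, while the treatment itself does nothing.
# All numbers are HYPOTHETICAL, chosen only to illustrate the mechanism.
set.seed(42)
n <- 10000

severity <- rbinom(n, 1, 0.5)                                 # 1 = sicker patient
treated  <- rbinom(n, 1, ifelse(severity == 1, 0.8, 0.2))     # sicker patients get the new drug more often
death    <- rbinom(n, 1, ifelse(severity == 1, 0.30, 0.10))   # outcome depends on severity only

# Crude comparison ignores severity; adjusted model includes it as a covariate
crude    <- glm(death ~ treated,            family = binomial)
adjusted <- glm(death ~ treated + severity, family = binomial)

crude_or <- unname(exp(coef(crude)["treated"]))
adj_or   <- unname(exp(coef(adjusted)["treated"]))

print(c(crude_OR = crude_or, adjusted_OR = adj_or))
```

In this setup the crude odds ratio should suggest harm while the severity-adjusted odds ratio should sit near 1. In real data the adjustment only works for confounders that were actually measured, which is the core limitation of observational designs.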

In comparative effectiveness research, observational studies are typically considered to be less conclusive than RCTs and meta-analyses. Nonetheless, they can be useful, especially because they examine typical care. Due to lower cost and improvements in health information, observational studies will become increasingly common. Critical assessment of whether the described results are helpful or biased (based upon how the study was performed) is necessary. This case will illustrate several characteristics of the types of studies that will assist in evaluating newly published work.

Clinical Applications

Cardiovascular diseases (CVD) are the leading cause of death in women older than the age of 50. Epidemiologic evidence suggests that estrogen is a key mediator in the development of CVD. Estrogen is an ovarian hormone whose production decreases as women approach menopause. The steep increase in CVD in women at menopause and older and in women who have had hysterectomies further supports a relationship between estrogen and CVD. Building on this evidence of biologic plausibility, epidemiological and observational studies suggested that estrogen replacement therapy (a form of hormone replacement therapy, or HRT) had positive effects on the risk of CVD in postmenopausal women (albeit with some negative effects in its potential to increase the risk for breast cancer and stroke). Based on these findings, in the 1980s and 1990s HRT was routinely employed to treat menopausal symptoms and serve as prophylaxis against CVD.

What was done?

The Nurses’ Health Study (NHS) began collecting data in 1976. In the study, researchers intended to examine a broad range of health effects in women over a long period of time, and a key goal was to clarify the role of HRT in heart disease. The cohort (i.e., the group being followed) included married registered nurses aged 30-55 in 1976 who lived in the 11 most populous states. To collect data, the researchers mailed the study participants a survey every 2 years that asked questions about topics such as smoking, hormone use, menopausal status, and less frequently, diet. Data were collected for key end points that included MI, coronary-artery bypass grafting or angioplasty, stroke, total CVD mortality, and deaths from all causes.

What was found?

At a 10-year follow-up point, the NHS had a study pool of 48,470 women. The researchers found that estrogen use (alone, without progestin) in postmenopausal women was associated with a reduction in the incidence of CVD as well as in CVD mortality compared to non-users. Later, estrogen-progestin combination therapy was shown to be even more cardioprotective than estrogen monotherapy, and lower doses of estrogen replacement therapy were found to deliver equal cardioprotection and lower the risk for adverse events. NHS researchers were alert to the potential for bias in observational studies. Adjustment for risk factors such as age (a typical practice to eliminate confounding) did not change the reported findings.

Was this the right answer?

The NHS was not unique in reporting the benefits associated with HRT; other observational studies corroborated the NHS findings. A secondary retrospective data analysis of the UK primary care electronic medical record database, for example, also showed the protective effect associated with HRT use. Researchers were aware of the fundamental limitations of observational studies, particularly with regard to selection bias. They and practicing clinicians were also aware of the potential negative health effects of HRT, which had to be constantly weighed against the potential cardioprotective benefits in deciding a patient’s course of treatment. As a large section of the population could experience the health effects of HRT, researchers began planning RCTs to verify the promising observational study results. It was highly anticipated that those RCTs would corroborate the belief that estrogen replacement can reduce CVD risk.

Randomized Controlled Trial: The Women’s Health Initiative

The Women’s Health Initiative (WHI) was a major study established by the National Institutes of Health in 1992 to assess a broad range of health effects in postmenopausal women. The trial was intended to follow these women for 8 years, at a cost of millions of dollars in federal funding. Among its many facets, it included an RCT to confirm the results from the observational studies discussed above. To fully investigate earlier findings, the WHI had two subgroups. One subgroup consisted of women with prior hysterectomies; they received estrogen monotherapy. The second group consisted of women who had not undergone hysterectomy; they received estrogen in combination with progestin. The WHI enrolled 27,347 women in their HRT investigation: 10,739 in the estrogen-alone arm and 16,608 in the estrogen plus progestin arm. Within each arm, women were randomly assigned to receive either HRT or placebo. All women in the trial were postmenopausal and aged 50-79 years; the mean age was 63.6 years (a fact that would be important in later analysis). Some participants had experienced previous CV events. The primary outcome of both subgroups was coronary heart disease (CHD), defined as nonfatal MI or death due to CHD.

The estrogen-progestin arm of the WHI was halted after a mean follow-up of 5.2 years, 3 years earlier than expected, as the HRT users in this arm were found to be at increased risk for CHD compared to those who received placebo. The study also noted elevated rates of breast cancer and stroke, among other poor outcomes. The estrogen-alone arm continued for an average follow-up of 6.8 years before being similarly discontinued ahead of schedule. Although this part of the study did not find an increased risk of CHD, it also did not find any cardioprotective effect. Beyond failing to locate any clear CV benefits, the WHI also found real evidence of harm, including increased risk of blood clots, breast cancer and stroke. Initial WHI publications therefore recommended against HRT being prescribed for the secondary prevention of CVD.

What Next?

Scientists, and the clinicians who relied on their data for guidance in treating patients, were faced with conflicting data: epidemiological and observational studies suggested that HRT was cardioprotective, while the higher-quality evidence from RCTs strongly suggested the opposite. Clinicians primarily followed the WHI results, so prescriptions for HRT in postmenopausal women quickly declined. Meanwhile, researchers began to analyze the studies for potential discrepancies, and found that the women being followed in the NHS and the WHI differed in several important characteristics.

First, the WHI population was older than the NHS cohort, and many had entered menopause at least 10 years before they enrolled in the RCT. Thus, the WHI enrollees experienced a long duration from the onset of menopause to the commencement of HRT. In contrast, many in the NHS population were closer to the onset of menopause and were still displaying hormonal symptoms when they began HRT. Second, although the NHS researchers adjusted the data for various confounding effects, their results could still have been subject to bias. In general, the NHS cohort was more highly educated and of a higher socioeconomic status than the WHI participants, and therefore more likely to see a physician regularly. The NHS women were also leaner and generally healthier than their RCT counterparts, and had been selected for their evident lack of pre-existing CV conditions. This selection bias in the NHS enrollment may have led to a “healthy woman” effect that in turn led to an overestimation of the benefits of therapy in the observational study. Third, researchers noted that dosing differences between the two study types may have contributed to the divergent results. The NHS reported beneficial results following low-dose estrogen therapy. The WHI, meanwhile, used a higher estrogen dose, exposing women to a larger dosage of hormones and increasing their risk for adverse events. The increased risk profile of the WHI women (e.g., older, more comorbidities, higher estrogen dose) could have contributed to the evidence of harm seen in the WHI results.

Emerging Data
In addition to identifying the inherent differences between the two study populations, researchers began a secondary analysis of the NHS and WHI trials. NHS researchers reported that women who began HRT close to the onset of menopause had a significantly reduced risk of CHD. In the subgroups of women that were older and had a similar duration after menopause compared with the WHI women, they found no significant relationship between HRT and CHD. Also, the WHI study further stratified these results by age, and found that women who began HRT close to their onset of menopause experienced some cardioprotection, while women who were further from the onset of menopause had a slightly elevated risk for CHD.

Secondary analysis of both studies was therefore necessary to show that age and a short duration from the onset of menopause are crucial to HRT success as a cardioprotective agent. Neither study type provided “truth” or rather, both studies provided “truth” if viewed carefully (e.g., both produced valid and important results). The differences seen in the studies were rooted in the timing of HRT and the populations being studied.

Lessons Learned From This Case Study

Although RCTs are given a higher evidence grade, observational studies provide important clinical insights. In this example, the study populations differed. For policymakers and clinicians, it is crucial to examine whether the CER was based upon patients similar to those being considered. Any study with a dissimilar population may provide non-relevant results. Thus, readers of CER need to carefully examine the generalizability of the findings being reported.

==Appendix==

General Classification and Regression Tree (CART) data analysis steps using the R package rpart.

===Growing the Tree===

# To grow a tree, use
rpart(formula, data=, method=,control=), where
formula is in the format outcome ~ predictor1+predictor2+...
data= specifies the data frame
method= "class" for a classification tree, use "anova" for a regression tree
control= optional parameters for controlling tree growth. For example, control=rpart.control(minsplit=30, cp=0.001) requires that the minimum number of observations in a node be 30 before attempting a split and that a split must decrease the overall lack of fit by a factor of 0.001 (cost complexity factor) before being attempted.

===Examining Results===

# These functions help with examining the results.
printcp(fit) display complexity parameter (cp) table
plotcp(fit) plot cross-validation results
rsq.rpart(fit) plot approximate R-squared and relative error for different splits (2 plots). labels are only appropriate for the "anova" method.
print(fit) print results
summary(fit) detailed results including surrogate splits
plot(fit) plot decision tree
text(fit) label the decision tree plot
post(fit, file=) create postscript plot of decision tree
# In trees created by rpart(), move to the LEFT branch when the stated condition is true.

===Pruning Trees===

#In general, trees should be pruned back to avoid overfitting the data. The tree size should minimize the cross-validated error – xerror column printed by printcp(). Pruning the tree is accomplished by:
prune(fit, cp= )
# use printcp( ) to examine the cross-validation error results, select the complexity parameter (CP) associated with the minimum error, and insert it into the prune() function. The complexity parameter associated with the smallest cross-validated error can be selected automatically by:
fit$\$$cptable[which.min(fit$\$$cptable[,"xerror"]),"CP"]
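
The grow, examine, and prune steps above can be sketched end-to-end as follows. This is only a minimal illustration: it assumes the kyphosis data frame that ships with rpart, and the seed, the control settings, and the object names (fit, cp.table, best.cp, fit.pruned) are illustrative choices rather than part of the original analysis.

```r
library(rpart)

set.seed(1234)  # cross-validation folds are random; fix the seed for reproducibility

# Grow a classification tree (kyphosis is an example data set shipped with rpart)
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class",
             control = rpart.control(minsplit = 10, cp = 0.001))

# Complexity parameter table: columns CP, nsplit, rel error, xerror, xstd
cp.table <- fit[["cptable"]]

# Complexity parameter associated with the smallest cross-validated error
best.cp <- cp.table[which.min(cp.table[, "xerror"]), "CP"]

# Prune the tree back to that complexity and inspect the result
fit.pruned <- prune(fit, cp = best.cp)
print(fit.pruned)
```

The pruned tree can never have more nodes than the original one; how much it shrinks depends on the data and on the random cross-validation folds.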

===Complete Dataset for N-of-1 Example===
[[SMHS_MethodsHeterogeneity_CER_Nof1|This N-of-1 Dataset]] includes an example.

===[[SMHS_MethodsHeterogeneity|Back to the Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research section]]===

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php/SMHS_MethodsHeterogeneity_CER}}

SMHS MethodsHeterogeneity MetaAnalysis

2016-05-23T18:55:03Z

Pineaumi: /* Nonparametric Regression Methods */

==[[SMHS_MethodsHeterogeneity| Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research]] - Meta-Analyses ==

==Meta-analysis==

===Overview===

Meta-analysis is an approach for combining treatment effects across trials or studies into an aggregated treatment effect with higher statistical power than that of any individual trial. It may detect HTE by testing for differences in treatment effects across similar RCTs. It requires that the individual treatment effects are similar to ensure pooling is meaningful. In the presence of large clinical or methodological differences between the trials, it may be best to avoid meta-analysis. The presence of HTE across studies in a meta-analysis may be due to differences in the design or execution of the individual trials (e.g., randomization methods, patient selection criteria). Cochran's Q is a method for detecting heterogeneity; it is computed as the weighted sum of squared differences between each study's treatment effect and the pooled effect across the studies. It is a barometer of inter-trial differences impacting the observed study results. A possible source of error in a meta-analysis is publication bias. Trial size may introduce publication bias since larger trials are more likely to be published. Language and accessibility represent other potential confounding factors. When the heterogeneity is not due to poor study design, it may be useful for optimizing the treatment benefits for different cohorts of participants.

Cochran's Q statistic is the weighted sum of squares on a standardized scale<sup>8</sup>. The corresponding P value indicates the strength of the evidence for the presence of heterogeneity. This test may have low power to detect heterogeneity, particularly when the number of studies is small, so a value of 0.10 is suggested as the cut-off for significance (Higgins et al., 2003). Conversely, the Q statistic may have too much power as a test of heterogeneity when the number of studies is large.
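
The definition of Cochran's Q above can be sketched directly in base R. The effect sizes and variances below are hypothetical values chosen only for illustration, the weights are assumed to be inverse variances, and the ''I''<sup>2</sup> formula is the one usually attributed to Higgins et al. (2003).

```r
# Hypothetical log odds ratios and their variances from k = 5 studies
theta <- c(0.10, 0.30, 0.35, 0.65, 0.45)
v     <- c(0.03, 0.03, 0.05, 0.01, 0.05)

w <- 1 / v                                   # inverse-variance weights
theta.pooled <- sum(w * theta) / sum(w)      # fixed-effect pooled estimate

Q  <- sum(w * (theta - theta.pooled)^2)      # Cochran's Q
df <- length(theta) - 1                      # k - 1 degrees of freedom
p.value <- pchisq(Q, df = df, lower.tail = FALSE)

I2 <- max(0, (Q - df) / Q)                   # I^2 as a proportion between 0 and 1

round(c(Q = Q, df = df, p = p.value, I2 = I2), 4)
```

Under the null hypothesis of no heterogeneity, Q is compared with a chi-squared distribution on k − 1 degrees of freedom, which is what the p-value in the sketch does.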

===Simulation Example 1===

# Install and Load library
install.packages("meta")
library(meta)

# Set number of studies
n.studies = 15

# number of treatments: case1, case2, control
n.trt = 3

# number of outcomes
n.event = 2

# simulate the (balanced) number of cases (case1 and case2) and controls in each study
ctl.group = rbinom(n = n.studies, size = 200, prob = 0.3)
case1.group = rbinom(n = n.studies, size = 200, prob = 0.3)
case2.group = rbinom(n = n.studies, size = 200, prob = 0.3)

# Simulate the number of outcome events (e.g., deaths) and no events in the control group
event.ctl.group = rbinom(n = n.studies, size = ctl.group, prob = rep(0.1, length(ctl.group)))
noevent.ctl.group = ctl.group - event.ctl.group

# Simulate the number of events and no events in the case1 group
event.case1.group = rbinom(n = n.studies, size = case1.group, prob = rep(0.5, length(case1.group)))
noevent.case1.group = case1.group - event.case1.group

# Simulate the number of events and no events in the case2 group
event.case2.group = rbinom(n = n.studies, size = case2.group, prob = rep(0.6, length(case2.group)))
noevent.case2.group = case2.group - event.case2.group

# Run the univariate meta-analysis using metabin(), Meta-analysis of binary outcome data –
# Calculation of fixed and random effects estimates (risk ratio, odds ratio, risk difference or arcsine
# difference) for meta-analyses with binary outcome data. Mantel-Haenszel (MH),
# inverse variance and Peto method are available for pooling.

# method = A character string indicating which method is to be used for pooling of studies,
# one of "MH", "Inverse", or "Peto"
# sm = A character string indicating which summary measure ("OR", "RR", or "RD" = risk difference) is to be
# used for pooling of studies

# Control vs. Case1, n.e and n.c are numbers in experimental and control groups
meta.ctr_case1 <- metabin(event.e = event.case1.group, n.e = case1.group, event.c = event.ctl.group,
n.c = ctl.group, method = "MH", sm = "OR")
# in this case we use the Odds Ratio (OR), comparing the odds of the event (e.g., death) in the experimental and control groups
forest(meta.ctr_case1)

<center>[[Image:SMHS_Methods8.png|500px]] </center>

# Control vs. Case2
meta.ctr_case2 <- metabin(event.e = event.case2.group, n.e = case2.group, event.c = event.ctl.group,
n.c = ctl.group, method = "MH", sm = "OR")
forest(meta.ctr_case2)

<center>[[Image:SMHS_Methods9.png|500px]] </center>

# Case1 vs. Case2
meta.case1_case2 <- metabin(event.e = event.case1.group, n.e = case1.group, event.c = event.case2.group,
n.c = case2.group, method = "MH", sm = "OR")
forest(meta.case1_case2)
summary(meta.case1_case2)

Test of heterogeneity:
Q d.f. p-value
11.99 14 0.6071

<center>[[Image:SMHS_Methods10.png|500px]] </center>

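The summary reports Cochran's Q = 11.99 on 14 degrees of freedom (p-value = 0.6071), and the forest plot reports the ''I''<sup>2</sup> statistic. These results provide no evidence to reject the null hypothesis of no between-study heterogeneity, so the fixed-effects model may be used.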

==Series of “N of 1” trials==

This technique combines (a “series of”) n-of-1 trial data to identify HTE. An n-of-1 trial is a repeated crossover trial for a single patient, which randomly assigns the patient to one treatment vs. another for a given time period, after which the patient is re-randomized to treatment for the next time period, usually repeated for 4-6 time periods. Such trials are most feasibly done in chronic conditions, where little or no washout period is needed between treatments and treatment effects are identifiable in the short-term, such as pain or reliable surrogate markers. Combining data from identical n-of-1 trials across a set of patients enables statistical analysis that controls for patient fixed or random effects, covariates, centers, or sequence effects (see the figure below). These combined trials are often analyzed within a Bayesian context using shrinkage estimators that combine individual and group mean treatment effects to create a “posterior” individual mean treatment effect estimate, which is a form of inverse variance-weighted average of the individual and group effects. Such trials are typically more expensive than standard RCTs on a per-patient basis; however, they require much smaller sample sizes, often fewer than 100 patients (due to the efficient individual-as-own-control design), and create individual treatment effect estimates that are not possible in a non-crossover design<sup>9</sup>. For the individual patient, the treatment effect can be re-estimated after each time period, and the trial stopped at any point when the more effective treatment is identified with reasonable statistical certainty.
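
The shrinkage idea described above can be sketched in base R. This is only an illustration under a simple normal-normal model: the per-patient effects, their sampling variances, and the between-patient variance tau2 are hypothetical values, and the group mean is plugged in as if it were known rather than given its own prior.

```r
# Hypothetical per-patient treatment effect estimates and their sampling variances
d    <- c(2.0, -0.5, 4.5, 1.0, 3.0)    # individual mean treatment effects
s2   <- c(1.0,  0.5, 4.0, 0.8, 2.0)    # within-patient sampling variances
tau2 <- 1.5                            # assumed between-patient variance

# Group mean effect: precision-weighted average under a random-effects model
w.group <- 1 / (s2 + tau2)
mu <- sum(w.group * d) / sum(w.group)

# "Posterior" individual effects: inverse variance-weighted average of the
# patient's own estimate (precision 1/s2) and the group mean (precision 1/tau2)
w.ind  <- (1 / s2) / (1 / s2 + 1 / tau2)
d.post <- w.ind * d + (1 - w.ind) * mu

data.frame(observed = d, weight = round(w.ind, 3), posterior = round(d.post, 3))
```

Patients whose own estimates are noisy (large sampling variance) receive a small weight and are pulled strongly toward the group mean, while precisely estimated patients stay close to their observed effects.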

====Example====

A study involving 8 participants collected data across 30 days, in which 15 treatment days and 15 control days were randomly assigned within each participant<sup>10</sup>. The treatment is represented as a binary variable (control day=0; treatment day=1). The outcome variable represents the response to the intervention within each of the 8 participants. The study employed fixed-effects modeling, creating N − 1 dummy-coded variables representing the N=8 participants, where the last (i=8) participant serves as the reference (i.e., as the model intercept). Each dummy-coded variable thus represents the difference between participant i and the 8th participant, so all other participants' values are relative to the values of the 8th (reference) subject. The overall differences across participants in fixed effects can be evaluated with multiple degree-of-freedom F-tests.
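
The dummy-coding scheme described above can be sketched on simulated data. The sketch does not use the study data: the data frame sim, the baseline levels, the treatment effect of 25, and the noise level are all hypothetical, chosen only to mirror the design (8 participants, 30 days, 15 treatment days each).

```r
set.seed(2016)

n.id <- 8; n.day <- 30
sim <- data.frame(ID  = factor(rep(1:n.id, each = n.day)),
                  Day = rep(1:n.day, times = n.id))

# 15 treatment and 15 control days randomly assigned within each participant
sim[["Tx"]] <- as.vector(replicate(n.id, sample(rep(c(0, 1), each = n.day / 2))))

# Simulated outcome: participant-specific baseline + common treatment effect + noise
baseline <- rnorm(n.id, mean = 50, sd = 20)
sim[["PhyAct"]] <- baseline[as.integer(sim[["ID"]])] + 25 * sim[["Tx"]] +
  rnorm(nrow(sim), sd = 30)

# Use the last (8th) participant as the reference level: N - 1 = 7 dummy variables
sim[["ID"]] <- relevel(sim[["ID"]], ref = "8")

fit.main <- lm(PhyAct ~ Tx + ID, data = sim)   # participant fixed effects
fit.int  <- lm(PhyAct ~ Tx * ID, data = sim)   # treatment effect varies by participant

anova(fit.main)            # 7-df F-test for differences among participants
anova(fit.main, fit.int)   # 7-df F-test for the Tx-by-ID interaction (HTE)
```

With 240 observations and 16 estimated coefficients, the interaction model leaves 224 residual degrees of freedom, which agrees with the denominator degrees of freedom in the Type 3 tests reported later in this section.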

<center>[[Image:SMHS_Methods11.png|500px]] </center>

<center>

{| class="wikitable" style="text-align:center; " border="1"
|-
|ID||Day||Tx||SelfEff||SelfEff25||WPSS||SocSuppt||PMss||PMss3||PhyAct
|-
|1||1||1||33||8||0.97||5.00||4.03||1.03||53
|-
|1||2||1||33||8||-0.17||3.87||4.03||1.03||73
|-
|1||3||0||33||8||0.81||4.84||4.03||1.03||23
|-
|1||4||0||33||8||-0.41||3.62||4.03||1.03||36
|-
|...||...||...||...||...||...||...||...||...||...

|}
</center> Complete data is available in the Appendix.

<center>Data Summary

{| class="wikitable" style="text-align:center; " border="1"
|-
|Intercept||Constant
|-
|Physical Activity||PhyAct
|-
|Intervention||Tx
|-
|WP Social Support||WPSS
|-
|PM Social Support (1-3)||PMss3
|-
|Self Efficacy||SelfEff25

|}
</center>

rm(list=ls())
Nof1 <-read.table("https://umich.instructure.com/files/330385/download?download_frd=1&verifier=DwJUGSd6t24dvK7uYmzA2aDyzlmsohyaK6P7jK0Q", sep=",", header = TRUE) # 02_Nof1_Data.csv
attach(Nof1)
head(Nof1)

<center>

{| class="wikitable" style="text-align:center; " border="1"
|-
| ||ID||Day||Tx||SelfEff||SelfEff25||WPSS||SocSuppt||PMss||PMss3||PhyAct
|-
|1||1||1||1||33||8||0.97||5.00||4.03||1.03||53
|-
|2||1||2||1||33||8||-0.17||3.87||4.03||1.03||73
|-
|3||1||3||0||33||8||0.81||4.84||4.03||1.03||23
|-
|4||1||4||0||33||8||-0.41||3.62||4.03||1.03||36
|-
|5||1||5||1||33||8||0.59||4.62||4.03||1.03||21
|-
|6||1||6||1||33||8||-1.16||2.87||4.03||1.03||0

|}
</center>

df.1 = data.frame(PhyAct, Tx, WPSS, PMss3, SelfEff, SelfEff25, Day, ID)

# install.packages("lme4")
library("lme4")

# Mixed-effects model with random intercepts for Day and ID
lm.1 <- lmer(PhyAct ~ Tx + SelfEff + Tx*SelfEff + (1|Day) + (1|ID) , data= df.1)
summary(lm.1)

Linear mixed model fit by REML ['lmerMod']
Formula: PhyAct ~ Tx + SelfEff + Tx * SelfEff + (1 | Day) + (1 | ID)
Data: df.1

REML criterion at convergence: 8820

<center> Scaled Residuals
{| class="wikitable" style="text-align:center; " border="1"
|-
| Min||1Q||Median||3Q||Max
|-
|-2.7012||-0.6833||-0.0333||0.6542||3.9612
|}
</center>

<center> Random Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| Groups ||Name||Variance ||Std.Dev.
|-
| Day||(Intercept) ||0.0 || 0.00
|-

|ID|| (Intercept)||601.5||24.53
|-

|Residual|| ||969.0 ||31.13
|}
Number of obs: 900, groups: Day, 30; ID, 30
</center>

<center> Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t value
|-
|(Intercept)||38.3772||14.4738||2.651
|-
|Tx||4.0283||6.3745||0.632
|-
|SelfEff||0.5818||0.5942||0.979
|-
|Tx:SelfEff||0.9702||0.2617||3.708
|}
</center>

<center> Correlation of Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||(Intr)||Tx ||SlfEff
|-
| Tx|| -0.220|| ||
|-
| SelfEff||-0.946 ||0.208 ||
|-
| Tx:SelfEff ||0.208 ||-0.946 ||-0.220
|}
</center>

# Model: PhyAct = Tx + WPSS + PMss3 + Tx*WPSS + Tx*PMss3 + SelfEff25 + Tx*SelfEff25 + ε
lm.2 = lm(PhyAct ~ Tx + WPSS + PMss3 + Tx*WPSS + Tx*PMss3 + SelfEff25 + Tx*SelfEff25, df.1)
summary(lm.2)

Call:
lm(formula = PhyAct ~ Tx + WPSS + PMss3 + Tx * WPSS + Tx * PMss3 +
SelfEff25 + Tx * SelfEff25, data = df.1)

<center> Residuals
{| class="wikitable" style="text-align:center; " border="1"
|-
|Min||1Q||Median||3Q||Max
|-
| -102.39||-28.24||-1.47||25.16||122.41

|}
</center>

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t value||$Pr(>|t|)$
|-
|(Intercept)||52.0067||1.8080||28.764||< 2e-16 ***
|-
|Tx||27.7366||2.5569||10.848||< 2e-16 ***
|-
|WPSS||1.9631||2.4272||0.809||0.418853
|-
|PMss3||13.5110||2.7853||4.851||1.45e-06 ***
|-
|SelfEff25||0.6289||0.2205||2.852||0.004439 **
|-
|Tx:WPSS||9.9114||3.4320||2.888||0.003971 **
|-
|Tx:PMss3||8.8422||3.9390||2.245||0.025025 *
|-
|Tx:SelfEff25||1.0460||0.3118||3.354||0.000829 ***

|}
</center>

[Using SAS (StudyI_Analyses.sas, StudyIIab_Analyses.sas)]

<center> Type 3 Tests of Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
|Effect||Num DF||Den DF||F Value||$Pr>F$
|-
|Tx||1||224||67.46||<.0001
|-
|ID||7||224||25.95||<.0001
|-
|Tx*ID||7||224||2.92||0.0060
|}
</center>

==Quantile Treatment Effect (QTE)==

QTE employs quantile regression estimation (QRE) to examine the central tendency and statistical dispersion of the treatment effect in a population. These may not be revealed by the conventional mean estimation in RCTs. For instance, patients with different comorbidity scores may respond differently to a treatment. Quantile regression has the ability to reveal HTE according to the ranking of patients’ comorbidity scores or some other relevant covariate by which patients may be ranked. Therefore, in an attempt to inform patient-centered care, quantile regression provides more information on the distribution of the treatment effect than typical conditional mean treatment effect estimation. QTE characterizes the heterogeneous treatment effect on individuals and groups across various positions in the distributions of different outcomes of interest. This unique feature has brought quantile regression analysis substantial attention, and it has been employed across a wide range of applications, particularly when evaluating the economic effects of welfare reform.

One caveat of applying QRE in clinical trials for examining HTE is that the QTE doesn’t demonstrate the treatment effect for a given patient. Instead, it focuses on the treatment effect among subjects within the qth quantile, such as those who are exactly at the top 10th percent in terms of an outcome (e.g., blood pressure or a depression score) or of some covariate of interest (e.g., comorbidity score). It is not uncommon for the qth quantiles to consist of two different sets of patients before and after the treatment. For this reason, we have to assume that these two groups of patients are homogeneous if they are in the same quantile.

Income-Food Expenditure Example: Let’s examine the Engel data (N=235) on the relationship between food expenditure (foodexp) and household income (income)<sup>11</sup>. We can plot the data and then explore the superposition of the six fitted quantile regression lines.

install.packages("quantreg")
library(quantreg)
data(engel)
attach(engel)

<center>head(engel)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Income||Foodexp
|-
|1||420.1577||255.8394
|-
|2||541.4117||310.9587
|-
|3||901.1575||485.6800
|-
|4||639.0802||402.9974
|-
|5||750.8756||495.5608
|-
|6||945.7989||633.7978

|}
</center>

<center>summary(engel)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Income||Foodexp
|-
|Min||377.1||242.3
|-
|1st Qu.||638.9||429.7
|-
|Median||884.0||582.5
|-
|Mean||982.5||624.2
|-
|3rd Qu.||1164.0||743.9
|-
|Max||4957.8||2032.7

|}
</center>

Note: If $Y$ is a real-valued random variable with cumulative distribution function $F_Y(y)=P(Y\le y)$, then the $\tau$-quantile of $Y$ is given by

<center> $Q_Y(\tau)=F_Y^{-1}(\tau)=\inf\{ y: F_Y(y)\ge\tau\}$ </center>

where $0\le\tau\le 1$.
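
The quantile definition above can be checked numerically in base R. The sample y below is hypothetical; the sketch compares the infimum definition applied to the empirical CDF, the built-in quantile() function with type = 1 (the inverse of the empirical CDF), and the minimizer of the check (pinball) loss that quantile regression optimizes, searched only over the sample points.

```r
y    <- c(3.1, 7.4, 1.2, 9.8, 5.5, 2.6, 8.3, 4.0, 6.7, 0.9, 10.5)  # hypothetical sample, n = 11
taus <- c(0.10, 0.25, 0.50, 0.90)

# (a) Direct use of the definition: inf{ y : F(y) >= tau } with the empirical CDF
emp.quantile <- function(y, tau) {
  ys <- sort(y)
  Fn <- ecdf(y)
  ys[which(Fn(ys) >= tau)[1]]
}
q.def <- sapply(taus, function(tau) emp.quantile(y, tau))

# (b) Built-in equivalent: type = 1 is the inverse of the empirical CDF
q.builtin <- quantile(y, probs = taus, type = 1, names = FALSE)

# (c) Sample point minimizing the check loss: sum of (y - q) * (tau - 1{y < q})
check.loss <- function(q, y, tau) sum((y - q) * (tau - (y < q)))
q.check <- sapply(taus, function(tau) {
  losses <- sapply(y, check.loss, y = y, tau = tau)
  y[which.min(losses)]
})

cbind(tau = taus, definition = q.def, builtin = q.builtin, check.loss = q.check)
```

The three columns agree for this sample. The sample size and quantile levels were chosen so that the check-loss minimizer is unique; when n times τ is an integer, a whole interval of values minimizes the loss.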

<center>[[Image:SMHS_Methods12.png|500px]] </center>

# (1) Graphics
plot(income, foodexp, cex=.25, type="n", xlab="Household Income", ylab="Food Expenditure")
points(income, foodexp, cex=.5, col="blue")

# tau - the quantile(s) to be estimated, in the range from 0 to 1. An object "rq.process" and an object "rqs"
# are returned containing the matrix of coefficient estimates at the specified quantiles.
abline( rq(foodexp ~ income, tau=.5), col="blue") # Quantile Regression Model

abline( lm(foodexp ~ income), lty=2, lwd=3, col="red") # linear model
taus <- c(0.05, 0.1, 0.25, 0.75, 0.90, 0.95)
colors <- rainbow(length(taus))

models <- vector(mode = "list", length = length(taus)) # define a vector of models to store QR for diff taus
model.names <- vector(mode = "list", length = length(taus)) # define a vector model names

for( i in 1:length(taus)){
models[[i]] <- rq(foodexp ~ income, tau=taus[i])
var <- taus[i]
model.names[[i]] <- paste("Model [", i , "]: tau=", var)
abline( models[[i]], lwd=2, col= colors[[i]])
}
legend(3000, 1100, model.names, col= colors, pch= taus, bty='n', cex=.75)

<center>[[Image:SMHS_Methods13.png|500px]] </center>

# (2) Inference about quantile regression coefficients. As an alternative to the rank-inversion confidence intervals, we can obtain a table of coefficients, standard errors, t-statistics, and p-values using the summary function:

summary(models[[3]], se = "nid")

Call: rq(formula = foodexp ~ income, tau = taus[i])

tau: [1] 0.25

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Value||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||95.48354||21.39237||4.46344||0.00001
|-
|Income||0.47410||0.02906||16.31729||0.00000

|}
</center>

# Alternatively, summary.rq() can be called directly; specifying se = "boot" instead of se = "nid" would compute bootstrapped standard errors.
summary.rq(models[[3]], se = "nid")

Call: rq(formula = foodexp ~ income, tau = taus[i])
tau: [1] 0.25

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Value||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||95.48354||21.39237||4.46344||0.00001
|-
|Income||0.47410||0.02906||16.31729||0.00000

|}
</center>

==Nonparametric Regression Methods ==

Nonparametric regression enables dealing with HTE in RCTs. Different nonparametric methods, such as kernel smoothing methods and series methods, can be used to generate test statistics for examining the presence of HTE. A kernel method is a weighting scheme based on a kernel function (e.g., uniform, Gaussian). When evaluating the treatment effect of a patient in RCTs, the kernel method assigns larger weights to those observations with similar covariates. This is done because it is assumed that patients with similar covariates provide more relevant data on predicted treatment response. When participants have different backgrounds (e.g., demographic, clinical), kernel smoothing methods still utilize information from highly divergent participants when estimating a particular subject’s treatment effect, but assign lower weights to very different subjects. Kernel methods require choosing a set of smoothing parameters (bandwidths) to group patients according to their relative degree of similarity. A drawback is that the corresponding proposed test statistics may be sensitive to the chosen bandwidths, which complicates the interpretation of the results. Series methods use approximating functions (splines or power series of the explanatory variables) to construct test statistics. Compared to kernel smoothing methods, series methods normally have the advantage of computational convenience; however, the precision of test statistics depends on the number of terms selected in the series.
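
The kernel weighting scheme and its sensitivity to the bandwidth can be sketched in base R. This is a hand-written Nadaraya-Watson (local-constant) estimator with a Gaussian kernel applied to simulated data; the function nw.estimate, the simulated mean function, and the two bandwidths are illustrative assumptions, and the sketch estimates a regression function rather than a treatment effect.

```r
set.seed(42)

# Simulated data: a covariate x and an outcome y with a nonlinear mean function
n <- 200
x <- runif(n, 0, 10)
y <- sin(x) + rnorm(n, sd = 0.3)

# Nadaraya-Watson (local-constant) estimator with a Gaussian kernel
nw.estimate <- function(x0, x, y, h) {
  sapply(x0, function(pt) {
    w <- dnorm((x - pt) / h)     # larger weights for observations close to pt
    sum(w * y) / sum(w)          # kernel-weighted average of the outcomes
  })
}

grid <- seq(0.5, 9.5, by = 0.5)
fit.narrow <- nw.estimate(grid, x, y, h = 0.3)   # small bandwidth: follows local structure
fit.wide   <- nw.estimate(grid, x, y, h = 5)     # large bandwidth: close to the global mean

# Root-mean-squared error against the true mean function on the grid
c(narrow = sqrt(mean((fit.narrow - sin(grid))^2)),
  wide   = sqrt(mean((fit.wide   - sin(grid))^2)))
```

As the bandwidth grows, all observations receive nearly equal weight and the estimate approaches the overall mean of the outcome, which is why conclusions can depend on the bandwidth chosen.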

Canadian Wage Data Example: Nonparametric regression extends the classical parametric regression (e.g., lm, lmer) involving one continuous dependent variable, y, and (1 or more) continuous explanatory variable(s), x. Let’s start with a popular parametric model of a wage equation that we can extend to a fully nonparametric regression model. First, we will compare and contrast the parametric and nonparametric approach towards univariate regression and then proceed to multivariate regression.

Let’s use the Canadian cross-section wage data (cps71), a random sample taken from the 1971 Canadian Census for male individuals having common education (High-School). The data include N=205 observations on 2 variables: the logarithm of the individual’s wage (logwage) and their age (age). The classical wage equation model includes a quadratic term in age.

# install.packages("np")
library("np")
data("cps71")

# (1) Linear Model -> R2 = 0.2308
model.lin <- lm( logwage ~ age + I(age^2), data = cps71)
summary(model.lin)

Call:
lm(formula = logwage ~ age + I(age^2), data = cps71)

Residuals:
Min 1Q Median 3Q Max
-2.4041 -0.1711 0.0884 0.3182 1.3940

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||10.0419773||0.4559986||22.022||< 2e-16 ***
|-
|Age||0.1731310||0.0238317|| 7.265||7.96e-12 ***
|-
|I(age^2)||-0.0019771||0.0002898||-6.822||1.02e-10 ***

|}
</center>

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.5608 on 202 degrees of freedom
Multiple R-squared: 0.2308, Adjusted R-squared: 0.2232
F-statistic: 30.3 on 2 and 202 DF, p-value: 3.103e-12

# (2) Next, we consider the local linear nonparametric method employing cross-validated
# bandwidth selection and estimation in one step. Start with computing the least-squares
# cross-validated bandwidths for the local constant estimator (default).
# Note that R2 = 0.3108675
bandwidth <- npregbw(formula= logwage ~ age, data = cps71)
model.np <- npreg(bandwidth, regtype = "ll", bwmethod = "cv.aic", gradients = TRUE, data = cps71)
summary(model.np)

Regression Data: 205 training points, in 1 variable(s) age
Bandwidth(s): 1.892157
Kernel Regression Estimator: Local-Constant
Bandwidth Type: Fixed
Residual standard error: 0.5307943
R-squared: 0.3108675
Continuous Kernel Type: Second-Order Gaussian
No. Continuous Explanatory Vars.: 1

# NP model significance may be tested by
npsigtest(model.np)

Kernel Regression Significance Test
Type I Test with IID Bootstrap (399 replications, Pivot=TRUE, joint=FALSE)
Explanatory variables tested for significance: age (1)

age
Bandwidth(s): 1.892157

Individual Significance Tests
P Value:
age < 2.22e-16 ***

# So, as was the case for the linear parametric model, Age is significant in the local linear NP-model

# (3) Graphical comparison of parametric and nonparametric models.
plot(cps71$\$$age, cps71$\$$logwage, xlab = "age", ylab = "log(wage)", cex=.1)
lines(cps71$\$$age, fitted(model.lin), lty = 2, col = "red")
lines(cps71$\$$age, fitted(model.np), lty = 1, col = "blue")
legend("topright", c("Data", "Linear", "Non-linear"), col=c("Black", "Red", "Blue"), pch = c(1, 1, 1), bty='n', cex=.75)

<center>[[Image:SMHS_Methods14.png|500px]] </center>

# some additional plots presenting the parametric (quadratic, dashed line) and the nonparametric estimates
# (solid line) of the regression function for the cps71 data.
plot(model.np, plot.errors.method = "asymptotic")
plot(model.np, gradients = TRUE)
lines(cps71$\$$age, coef(model.lin)[2]+2*cps71$\$$age*coef(model.lin)[3], lty = 2, col = "red")
plot(model.np, gradients = TRUE, plot.errors.method = "asymptotic")

# (4) Using the Lin and NL models, with the bandwidths obtained above, to generate
# predictions from the estimated models. We need to create a set of explanatory
# variables for which to generate predictions. These can be part of the original dataset or be
# outside its scope. Typically, we don’t have the outcome for the evaluation data and need only
# provide the explanatory variables for which predicted values are generated by the models.
# Occasionally, splitting the dataset into two independent samples (training/testing), allows estimation
# of a model on one sample, and evaluation of its performance on another.

cps.eval.data <- data.frame(age = seq(10,70, by=10)) # simulate some explanatory X values (ages)
pred.lin <- predict(model.lin, newdata = cps.eval.data) # Linear Prediction of log(Wage)
pred.np <- predict(model.np, newdata = cps.eval.data) # non-Linear Prediction of log(Wage)
plot(pred.lin, pred.np)
abline(lm(pred.np ~ pred.lin))

<center>[[Image:SMHS_Methods15.png|500px]] </center>

.
.
.

==Predictive risk models ==

Predictive risk models represent a class of methods for identifying potential for HTE when the individual patient risk for disease-related events at baseline depends on observed factors. For instance, common measures are disease staging criteria, such as those used in COPD or heart failure, Framingham risk scores for cardiovascular event risk, or genetic variations, e.g., HER2 for breast cancer. Initial predictive risk modeling, also known as risk function estimation, is often performed without accounting for treatment effects. Least squares or Cox proportional hazards regression methods are appropriate in many cases and provide relatively more interpretable risk functions, but rely on linearity assumptions and may not provide optimal predictive metrics. Partial least squares is an extension of least squares methods that can reduce the dimensionality of the predictor space by interposing latent variables, predicted by linear combinations of observable characteristics, as the intermediate predictors of one or more outcomes. Recursive partitioning (e.g., random forests), support vector machines, and neural networks represent more recent methods that may offer better predictive power than linear methods. Risk function estimation can range from highly exploratory analyses to near meta-analytic model validation, and may be useful at any stage of product development.

HIV Example: The “hmohiv” dataset represents a study of HIV positive patients examining whether there was a difference in survival times of HIV positive patients between a cohort using intravenous drugs (drug=1) and a cohort not using the IV drug (drug=0). The hmohiv data includes the following variables:

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
|ID||Time||Age||Drug||Censor||Entdate||Enddate
|-
|1||5||46||0||1||5/15/1990||10/14/1990
|-
|2||6||35||1||0||9/19/1989||3/20/1990
|-
|3||8||30||1||1||4/21/1991||12/20/1991
|-
|4||3||30||1||1||1/3/1991||4/4/1991
|-
|5||22||36||0||1||9/18/1989||7/19/1991
|-
|6||1||32||1||0||3/18/1991||4/17/1991
|-
|...||...||...||...||...||...||...

|}
</center>

#cleaning up environment
rm(list=ls())

# load survival library
library(survival)

# load hmohiv data
hmohiv<-read.table("http://www.ats.ucla.edu/stat/r/examples/asa/hmohiv.csv", sep=",", header = TRUE)
attach(hmohiv)

# Fit Cox proportional hazards regression model
cox.model <- coxph( Surv(time, censor) ~ drug, method="breslow")

# construct a frame of the 2 cohorts IV_drug and no-IV-drug
# (it must exist before it is passed to survfit)
drug.new<-data.frame(drug=c(0,1))

# estimate the survival curves for the two cohorts
fit.1 <- survfit(cox.model, newdata=drug.new)

# plot results
plot(fit.1, xlab="Survival Time (Months)", ylab="Survival Probability")
points(fit.1$\$$time, fit.1$\$$surv[,1], pch=1)
points(fit.1$\$$time, fit.1$\$$surv[,2], pch=2)
legend(40, .8, c("Drug Absent", "Drug Present"), pch=c(1,2))

<center>[[Image:SMHS_Methods16.png|500px]] </center>

# to inspect the resulting Cox Proportional Hazard Model
cox.model
Call:
coxph(formula = Surv(time, censor) ~ drug, method = "breslow")

coef exp(coef) se(coef) z p
drug 0.779 2.18 0.242 3.22 0.0013

Likelihood ratio test=10.2 on 1 df, p=0.00141 n= 100, number of events= 80

===Footnotes===

*8 http://onlinelibrary.wiley.com/enhanced/doi/10.1002/jrsm.54
*9 http://effectivehealthcare.ahrq.gov/search-for-guides-reviews-and-reports/?pageaction=displayproduct&productID=1857
*10 http://jpepsy.oxfordjournals.org/content/39/2/138.full#sec-14
*11 http://www.ers.usda.gov/media/200576/err32c_1_.pdf

==[[SMHS_MethodsHeterogeneity_CER|Next see: Comparative Effectiveness Research (CER)]]==

*[[SMHS_MethodsHeterogeneity|Back to the Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research section]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php/SMHS_MethodsHeterogeneity_MetaAnalysis}}

SMHS MethodsHeterogeneity MetaAnalysis

2016-05-23T18:53:40Z

Pineaumi: /* Nonparametric Regression Methods */

==[[SMHS_MethodsHeterogeneity| Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research]] - Meta-Analyses ==

==Meta-analysis==

===Overview===

Meta-analysis is an approach for combining treatment effects across trials or studies into an aggregated treatment effect with higher statistical power than any individual trial provides. It may detect HTE by testing for differences in treatment effects across similar RCTs. It requires that the individual treatment effects are similar enough for pooling to be meaningful. In the presence of large clinical or methodological differences between the trials, it may be best to avoid meta-analysis. The presence of HTE across studies in a meta-analysis may be due to differences in the design or execution of the individual trials (e.g., randomization methods, patient selection criteria). Cochran's Q is a method for detecting heterogeneity; it is computed as the weighted sum of squared differences between each study's treatment effect and the pooled effect across the studies. It is a barometer of inter-trial differences impacting the observed study results. A possible source of error in a meta-analysis is publication bias. Trial size may introduce publication bias since larger trials are more likely to be published. Language and accessibility represent other potential confounding factors. When the heterogeneity is not due to poor study design, it may be useful for optimizing the treatment benefits for different cohorts of participants.

Cochran's Q statistic is the weighted sum of squares on a standardized scale<sup>8</sup>. The corresponding p-value indicates the strength of the evidence for the presence of heterogeneity. Because this test may have low power to detect heterogeneity when the number of studies is small, a cut-off of 0.10 (rather than 0.05) is often suggested for significance (Higgins et al., 2003). Conversely, the Q statistic may have too much power as a test of heterogeneity when the number of studies is large.
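As a minimal sketch of the computation described above, the following R code computes Cochran's Q with inverse-variance weights, its chi-square p-value on k − 1 degrees of freedom, and the ''I''<sup>2</sup> statistic. The effect estimates and standard errors are hypothetical values chosen for illustration, and the helper name cochran.q is an assumption, not part of any package.

```r
# Hypothetical per-study effect estimates (e.g., log odds ratios) and standard errors
theta <- c(0.10, 0.30, 0.35, 0.65, 0.45)
se    <- c(0.20, 0.15, 0.25, 0.30, 0.10)

# Hypothetical helper: Cochran's Q, its p-value, and I^2
cochran.q <- function(theta, se) {
  w      <- 1 / se^2                      # inverse-variance weights
  pooled <- sum(w * theta) / sum(w)       # fixed-effect pooled estimate
  Q      <- sum(w * (theta - pooled)^2)   # weighted sum of squared deviations
  df     <- length(theta) - 1             # k - 1 degrees of freedom
  p      <- pchisq(Q, df = df, lower.tail = FALSE)
  I2     <- if (Q > 0) max(0, (Q - df) / Q) else 0   # share of variation beyond chance
  list(pooled = pooled, Q = Q, df = df, p.value = p, I2 = I2)
}

res <- cochran.q(theta, se)
print(res)
```

A small p-value (e.g., below the 0.10 cut-off) would suggest heterogeneity; ''I''<sup>2</sup> is truncated at zero whenever Q does not exceed its degrees of freedom.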

===Simulation Example 1===

# Install and Load library
install.packages("meta")
library(meta)

# Set number of studies
n.studies = 15

# number of treatments: case1, case2, control
n.trt = 3

# number of outcomes
n.event = 2

# simulate the (balanced) number of cases (case1 and case2) and controls in each study
ctl.group = rbinom(n = n.studies, size = 200, prob = 0.3)
case1.group = rbinom(n = n.studies, size = 200, prob = 0.3)
case2.group = rbinom(n = n.studies, size = 200, prob = 0.3)

# Simulate the number of outcome events (e.g., deaths) and no events in the control group
event.ctl.group = rbinom(n = n.studies, size = ctl.group, prob = rep(0.1, length(ctl.group)))
noevent.ctl.group = ctl.group - event.ctl.group

# Simulate the number of events and no events in the case1 group
event.case1.group = rbinom(n = n.studies, size = case1.group, prob = rep(0.5, length(case1.group)))
noevent.case1.group = case1.group - event.case1.group

# Simulate the number of events and no events in the case2 group
event.case2.group = rbinom(n = n.studies, size = case2.group, prob = rep(0.6, length(case2.group)))
noevent.case2.group = case2.group - event.case2.group

# Run the univariate meta-analysis using metabin(), Meta-analysis of binary outcome data –
# Calculation of fixed and random effects estimates (risk ratio, odds ratio, risk difference or arcsine
# difference) for meta-analyses with binary outcome data. Mantel-Haenszel (MH),
# inverse variance and Peto method are available for pooling.

# method = A character string indicating which method is to be used for pooling of studies,
# one of "MH", "Inverse", or "Peto"
# sm = A character string indicating which summary measure ("OR", "RR", "RD"=risk difference) is to be
# used for pooling of studies

# Control vs. Case1, n.e and n.c are numbers in experimental and control groups
meta.ctr_case1 <- metabin(event.e = event.case1.group, n.e = case1.group, event.c = event.ctl.group,
n.c = ctl.group, method = "MH", sm = "OR")
# in this case we use Odds Ratio, of the odds of death in the experimental and control studies
forest(meta.ctr_case1)

<center>[[Image:SMHS_Methods8.png|500px]] </center>

# Control vs. Case2
meta.ctr_case2 <- metabin(event.e = event.case2.group, n.e = case2.group, event.c = event.ctl.group,
n.c = ctl.group, method = "MH", sm = "OR")
forest(meta.ctr_case2)

<center>[[Image:SMHS_Methods9.png|500px]] </center>

# Case1 vs. Case2
meta.case1_case2 <- metabin(event.e = event.case1.group, n.e = case1.group, event.c = event.case2.group,
n.c = case2.group, method = "MH", sm = "OR")
forest(meta.case1_case2)
summary(meta.case1_case2)

Test of heterogeneity:
Q d.f. p-value
11.99 14 0.6071

<center>[[Image:SMHS_Methods10.png|500px]] </center>

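The heterogeneity test reported above (Q = 11.99, d.f. = 14, p = 0.6071), together with the ''I''<sup>2</sup> statistic shown on the forest plot, provides no evidence to reject the null hypothesis of no between-study heterogeneity, so the fixed effects model may be used.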

==Series of “N of 1” trials==

This technique combines data from a series of n-of-1 trials to identify HTE. An n-of-1 trial is a repeated crossover trial for a single patient, which randomly assigns the patient to one treatment vs. another for a given time period, after which the patient is re-randomized to treatment for the next time period, usually repeated for 4-6 time periods. Such trials are most feasible in chronic conditions, where little or no washout period is needed between treatments and treatment effects are identifiable in the short term, such as pain or reliable surrogate markers. Combining data from identical n-of-1 trials across a set of patients enables statistical analyses that control for patient fixed or random effects, covariates, centers, or sequence effects (see the figure below). These combined trials are often analyzed within a Bayesian context using shrinkage estimators that combine individual and group mean treatment effects to create a “posterior” individual mean treatment effect estimate, which is a form of inverse variance-weighted average of the individual and group effects. Such trials are typically more expensive than standard RCTs on a per-patient basis; however, they require much smaller sample sizes, often fewer than 100 patients (due to the efficient individual-as-own-control design), and create individual treatment effect estimates that are not possible in a non-crossover design<sup>9</sup>. For the individual patient, the treatment effect can be re-estimated after each time period, and the trial stopped at any point when the more effective treatment is identified with reasonable statistical certainty.
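The shrinkage idea described above can be sketched in a few lines of R. This is an illustration under a simple normal-normal assumption, not the analysis used in any particular study: the individual estimates, their sampling variances, and the between-patient variance tau2 are all hypothetical values, and tau2 is treated as known.

```r
# Hypothetical individual treatment-effect estimates from 5 n-of-1 trials
y    <- c(2.0, 8.5, 5.0, -1.0, 6.5)   # individual mean treatment effects
s2   <- c(4.0, 1.0, 2.0, 9.0, 1.5)    # sampling variances of those estimates
tau2 <- 6.0                            # assumed (known) between-patient variance

# Group mean: precision-weighted average across patients
w.group <- 1 / (s2 + tau2)
mu <- sum(w.group * y) / sum(w.group)

# Posterior individual effects: inverse variance-weighted average of the
# individual estimate and the group mean
w.ind <- 1 / s2      # precision of each individual estimate
w.pop <- 1 / tau2    # precision of the group-level distribution
posterior <- (w.ind * y + w.pop * mu) / (w.ind + w.pop)

shrinkage <- w.pop / (w.ind + w.pop)   # weight placed on the group mean
print(data.frame(y, s2, posterior, shrinkage))
```

Patients whose individual estimates are noisy (large sampling variance) are pulled more strongly toward the group mean, while precisely estimated individual effects change little.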

====Example====

A study involving 8 participants collected data across 30 days, in which 15 treatment days and 15 control days were randomly assigned within each participant<sup>10</sup>. The treatment is represented as a binary variable (control day=0; treatment day=1). The outcome variable represents the response to the intervention within each of the 8 participants. The study employed fixed-effects modeling, creating N − 1 dummy-coded variables to represent the N=8 participants, where the last (i=8) participant serves as the reference (i.e., as the model intercept). Each dummy-coded variable thus represents the difference between participant i and the 8th participant, so all other participants' values are relative to those of the 8th (reference) subject. The overall differences across participants in fixed effects can be evaluated with multiple degree-of-freedom F-tests.
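The dummy coding with participant 8 as the reference can be illustrated on simulated data. The sketch below does not use the study data; the data frame sim, the outcome y, and all numeric settings are hypothetical, and only base R functions (relevel, model.matrix, lm, anova) are used.

```r
set.seed(1)
n.id <- 8; n.days <- 30

# Hypothetical design: 8 participants, 30 days each (Day varies fastest)
sim <- expand.grid(Day = 1:n.days, ID = 1:n.id)

# Randomly assign 15 treatment and 15 control days within each participant
sim[["Tx"]] <- unlist(lapply(1:n.id, function(i) sample(rep(c(0, 1), each = n.days / 2))))

# Hypothetical outcome: participant-specific baseline plus a common treatment effect
baseline <- rnorm(n.id, mean = 50, sd = 10)
sim[["y"]] <- baseline[sim[["ID"]]] + 5 * sim[["Tx"]] + rnorm(nrow(sim), sd = 8)

# Make participant 8 the reference level, giving N - 1 = 7 dummy-coded variables
sim[["ID"]] <- relevel(factor(sim[["ID"]]), ref = "8")
X <- model.matrix(~ Tx + ID, data = sim)
print(colnames(X))

# Multiple degree-of-freedom F-test for the Tx-by-participant interaction
fit.main <- lm(y ~ Tx + ID, data = sim)
fit.int  <- lm(y ~ Tx * ID, data = sim)
print(anova(fit.main, fit.int))
```

With 8 × 30 = 240 observations and 16 fixed-effect parameters, the interaction model leaves 224 residual degrees of freedom, and the interaction test has 7 numerator degrees of freedom.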

<center>[[Image:SMHS_Methods11.png|500px]] </center>

<center>

{| class="wikitable" style="text-align:center; " border="1"
|-
|ID||Day||Tx||SelfEff||SelfEff25||WPSS||SocSuppt||PMss||PMss3||PhyAct
|-
|1||1||1||33||8||0.97||5.00||4.03||1.03||53
|-
|1||2||1||33||8||-0.17||3.87||4.03||1.03||73
|-
|1||3||0||33||8||0.81||4.84||4.03||1.03||23
|-
|1||4||0||33||8||-0.41||3.62||4.03||1.03||36
|-
|...||...||...||...||...||...||...||...||...||...

|}
</center> Complete data is available in the Appendix.

<center>Data Summary

{| class="wikitable" style="text-align:center; " border="1"
|-
|Intercept||Constant
|-
|Physical Activity||PhyAct
|-
|Intervention||Tx
|-
|WP Social Support||WPSS
|-
|PM Social Support (1-3)||PMss3
|-
|Self Efficacy||SelfEff25

|}
</center>

rm(list=ls())
Nof1 <-read.table("https://umich.instructure.com/files/330385/download?download_frd=1&verifier=DwJUGSd6t24dvK7uYmzA2aDyzlmsohyaK6P7jK0Q", sep=",", header = TRUE) # 02_Nof1_Data.csv
attach(Nof1)
head(Nof1)

<center>

{| class="wikitable" style="text-align:center; " border="1"
|-
| ||ID||Day||Tx||SelfEff||SelfEff25||WPSS||SocSuppt||PMss||PMss3||PhyAct
|-
|1||1||1||1||33||8||0.97||5.00||4.03||1.03||53
|-
|2||1||2||1||33||8||-0.17||3.87||4.03||1.03||73
|-
|3||1||3||0||33||8||0.81||4.84||4.03||1.03||23
|-
|4||1||4||0||33||8||-0.41||3.62||4.03||1.03||36
|-
|5||1||5||1||33||8||0.59||4.62||4.03||1.03||21
|-
|6||1||6||1||33||8||-1.16||2.87||4.03||1.03||0

|}
</center>

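# include every variable used by the models below
df.1 = data.frame(ID, Day, PhyAct, Tx, WPSS, PMss3, SelfEff, SelfEff25)

library("lme4") # provides lmer(); run install.packages("lme4") first if needed

# mixed model with random intercepts for Day and for participant (ID)
lm.1 = model.lmer <- lmer(PhyAct ~ Tx + SelfEff + Tx*SelfEff + (1|Day) + (1|ID) , data= df.1)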
summary(lm.1)

Linear mixed model fit by REML ['lmerMod']
Formula: PhyAct ~ Tx + SelfEff + Tx * SelfEff + (1 | Day) + (1 | ID)
Data: df.1

REML criterion at convergence: 8820

<center> Scaled Residuals
{| class="wikitable" style="text-align:center; " border="1"
|-
| Min||1Q||Median||3Q||Max
|-
|-2.7012||-0.6833||-0.0333||0.6542||3.9612
|}
</center>

<center> Random Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| Groups ||Name||Variance ||Std.Dev.
|-
| Day||(Intercept) ||0.0 || 0.00
|-

|ID|| (Intercept)||601.5||24.53
|-

|Residual|| ||969.0 ||31.13
|}
Number of obs: 900, groups: Day, 30; ID, 30
</center>

<center> Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t value
|-
|(Intercept)||38.3772||14.4738||2.651
|-
|Tx||4.0283||6.3745||0.632
|-
|SelfEff||0.5818||0.5942||0.979
|-
|Tx:SelfEff||0.9702||0.2617||3.708
|}
</center>

<center> Correlation of Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||(Intr)||Tx ||SlfEff
|-
| Tx|| -0.220|| ||
|-
| SelfEff||-0.946 ||0.208 ||
|-
| Tx:SelfEff ||0.208 ||-0.946 ||-0.220
|}
</center>

# Model: PhyAct = Tx + WPSS + PMss3 + Tx*WPSS + Tx*PMss3 + SelfEff25 + Tx*SelfEff25 + ε
lm.2 = lm(PhyAct ~ Tx + WPSS + PMss3 + Tx*WPSS + Tx*PMss3 + SelfEff25 + Tx*SelfEff25, df.1)
summary(lm.2)

Call:
lm(formula = PhyAct ~ Tx + WPSS + PMss3 + Tx * WPSS + Tx * PMss3 +
SelfEff25 + Tx * SelfEff25, data = df.1)

<center> Residuals
{| class="wikitable" style="text-align:center; " border="1"
|-
|Min||1Q||Median||3Q||Max
|-
| -102.39||-28.24||-1.47||25.16||122.41

|}
</center>

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t value||$Pr(>|t|)$
|-
|(Intercept)||52.0067||1.8080||28.764||< 2e-16 ***
|-
|Tx||27.7366||2.5569||10.848||< 2e-16 ***
|-
|WPSS||1.9631||2.4272||0.809||0.418853
|-
|PMss3||13.5110||2.7853||4.851||1.45e-06 ***
|-
|SelfEff25||0.6289||0.2205||2.852||0.004439 **
|-
|Tx:WPSS||9.9114||3.4320||2.888||0.003971 **
|-
|Tx:PMss3||8.8422||3.9390||2.245||0.025025 *
|-
|Tx:SelfEff25||1.0460||0.3118||3.354||0.000829 ***

|}
</center>

[Using SAS (StudyI_Analyses.sas, StudyIIab_Analyses.sas)]

<center> Type 3 Tests of Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
|Effect||Num DF||Den DF||F Value||$Pr>F$
|-
|Tx||1||224||67.46||<.0001
|-
|ID||7||224||25.95||<.0001
|-
|Tx*ID||7||224||2.92||0.0060
|}
</center>

==Quantile Treatment Effect (QTE)==

QTE employs quantile regression estimation (QRE) to examine the central tendency and statistical dispersion of the treatment effect in a population. These may not be revealed by the conventional mean estimation in RCTs. For instance, patients with different comorbidity scores may respond differently to a treatment. Quantile regression has the ability to reveal HTE according to the ranking of patients’ comorbidity scores or some other relevant covariate by which patients may be ranked. Therefore, in an attempt to inform patient-centered care, quantile regression provides more information on the distribution of the treatment effect than typical conditional mean treatment effect estimation. QTE characterizes the heterogeneous treatment effect on individuals and groups across various positions in the distributions of different outcomes of interest. This unique feature has brought quantile regression analysis substantial attention, and it has been employed across a wide range of applications, particularly when evaluating the economic effects of welfare reform.

One caveat of applying QRE in clinical trials for examining HTE is that the QTE does not describe the treatment effect for a given patient. Instead, it describes the treatment effect among subjects at the qth quantile of the outcome distribution, such as those at the 90th percentile of blood pressure or of a depression score, possibly conditional on a covariate of interest such as a comorbidity score. The subjects at the qth quantile before the treatment and those at the qth quantile after the treatment may be two different sets of patients. For this reason, we have to assume that patients occupying the same quantile are homogeneous.

Income-Food Expenditure Example: Let’s examine the Engel data (N=235) on the relationship between food expenditure (foodexp) and household income (income)<sup>11</sup>. We can plot the data and then explore the superposition of the six fitted quantile regression lines.

install.packages("quantreg")
library(quantreg)
data(engel)
attach(engel)

<center>head(engel)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Income||Foodexp
|-
|1||420.1577||255.8394
|-
|2||541.4117||310.9587
|-
|3||901.1575||485.6800
|-
|4||639.0802||402.9974
|-
|5||750.8756||495.5608
|-
|6||945.7989||633.7978

|}
</center>

<center>summary(engel)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Income||Foodexp
|-
|Min||377.1||242.3
|-
|1st Qu.||638.9||429.7
|-
|Median||884.0||582.5
|-
|Mean||982.5||624.2
|-
|3rd Qu.||1164.0||743.9
|-
|Max||4957.8||2032.7

|}
</center>

Note: If Y is a real-valued random variable with cumulative distribution function $F_Y(y)=P(Y\le y)$, then the τ-quantile of Y is given by

<center> $Q_Y(\tau)=F_Y^{-1}(\tau)=\inf\{ y: F_Y(y)\ge\tau\}$, </center>

where $0\le\tau\le 1$.
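The quantile definition above can be checked numerically by replacing the distribution function with the empirical CDF of a sample. In this minimal sketch the sample y and the helper name emp.quantile are hypothetical; ecdf() and quantile() are base R functions, and type = 1 selects R's inverse-empirical-CDF quantile definition.

```r
# Empirical tau-quantile as the generalized inverse of the empirical CDF
emp.quantile <- function(y, tau) {
  y.sorted <- sort(y)
  Fn <- ecdf(y)                               # empirical CDF of the sample
  y.sorted[which(Fn(y.sorted) >= tau)[1]]     # smallest y with Fn(y) >= tau
}

y <- c(7, 1, 5, 3, 9, 2, 8, 4, 6, 10)         # hypothetical sample

print(emp.quantile(y, 0.25))
print(quantile(y, probs = 0.25, type = 1))    # built-in inverse-ECDF quantile
```

Quantile regression generalizes this idea by modeling the τ-quantile of the outcome as a function of covariates rather than as a single number.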

<center>[[Image:SMHS_Methods12.png|500px]] </center>

# (1) Graphics
plot(income, foodexp, cex=.25, type="n", xlab="Household Income", ylab="Food Expenditure")
points(income, foodexp, cex=.5, col="blue")

# tau - the quantile(s) to be estimated, in the range from 0 to 1. An object "rq.process" and an object "rqs"
# are returned containing the matrix of coefficient estimates at the specified quantiles.
abline( rq(foodexp ~ income, tau=.5), col="blue") # Quantile Regression Model

abline( lm(foodexp ~ income), lty=2, lwd=3, col="red") # linear model
taus <- c(0.05, 0.1, 0.25, 0.75, 0.90, 0.95)
colors <- rainbow(length(taus))

models <- vector(mode = "list", length = length(taus)) # define a vector of models to store QR for diff taus
model.names <- vector(mode = "list", length = length(taus)) # define a vector model names

for( i in 1:length(taus)){
models[[i]] <- rq(foodexp ~ income, tau=taus[i])
var <- taus[i]
model.names[[i]] <- paste("Model [", i , "]: tau=", var)
abline( models[[i]], lwd=2, col= colors[[i]])
}
legend(3000, 1100, unlist(model.names), col= colors, lty=1, lwd=2, bty='n', cex=.75) # line legend matching the fitted quantile lines

<center>[[Image:SMHS_Methods13.png|500px]] </center>

# (2) Inference about quantile regression coefficients. As an alternative to the rank-inversion confidence intervals, we can obtain a table of coefficients, standard errors, t-statistics, and p-values using the summary function:

summary(models[[3]], se = "nid")

Call: rq(formula = foodexp ~ income, tau = taus[i])

tau: [1] 0.25

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Value||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||95.48354||21.39237||4.46344||0.00001
|-
|Income||0.47410||0.02906||16.31729||0.00000

|}
</center>

# Alternatively, summary.rq can be called directly; to compute bootstrapped standard errors, use se = "boot" instead of se = "nid".
summary.rq(models[[3]], se = "nid")

Call: rq(formula = foodexp ~ income, tau = taus[i])
tau: [1] 0.25

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Value||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||95.48354||21.39237||4.46344||0.00001
|-
|Income||0.47410||0.02906||16.31729||0.00000

|}
</center>

==Nonparametric Regression Methods ==

Nonparametric regression offers another way of dealing with HTE in RCTs. Different nonparametric methods, such as kernel smoothing methods and series methods, can be used to generate test statistics for examining the presence of HTE. A kernel method is a weighting scheme based on a kernel function (e.g., uniform, Gaussian). When evaluating the treatment effect of a patient in an RCT, the kernel method assigns larger weights to those observations with similar covariates, because patients with similar covariates are assumed to provide more relevant data on the predicted treatment response. Kernel smoothing methods may still utilize information from participants with highly divergent backgrounds (e.g., demographic, clinical) when estimating a particular subject’s treatment effect, but lower weights are assigned to very different subjects. Kernel methods require choosing a set of smoothing parameters (bandwidths) that group patients according to their relative degree of similarity. A drawback is that the corresponding test statistics may be sensitive to the chosen bandwidths, which complicates the interpretation of the results. Series methods use approximating functions (splines or power series of the explanatory variables) to construct test statistics. Compared to kernel smoothing methods, series methods normally have the advantage of computational convenience; however, the precision of the test statistics depends on the number of terms selected in the series.
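To make the kernel weighting scheme concrete, here is a minimal sketch of a local constant (Nadaraya-Watson) estimator with a Gaussian kernel, written in base R. The function name nw.estimate, the simulated data, and the two bandwidths are hypothetical choices for illustration; this is not the estimator implementation of any package.

```r
# Local constant (Nadaraya-Watson) estimator with a Gaussian kernel
nw.estimate <- function(x0, x, y, h) {
  sapply(x0, function(pt) {
    w <- dnorm((x - pt) / h)   # larger weights for observations with covariates near pt
    sum(w * y) / sum(w)        # kernel-weighted average of the outcomes
  })
}

set.seed(123)
x <- sort(runif(200, 20, 65))                                    # hypothetical covariate (e.g., age)
y <- 13 + 0.05 * x - 0.001 * (x - 45)^2 + rnorm(200, sd = 0.3)   # hypothetical outcome

grid <- seq(25, 60, by = 5)
fit.narrow <- nw.estimate(grid, x, y, h = 1)    # small bandwidth: follows local structure
fit.wide   <- nw.estimate(grid, x, y, h = 20)   # large bandwidth: heavy smoothing
print(cbind(grid, fit.narrow, fit.wide))
```

Comparing the two columns of fitted values shows how the estimates depend on the bandwidth h, which is the sensitivity noted in the paragraph above.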

Canadian Wage Data Example: Nonparametric regression extends the classical parametric regression (e.g., lm, lmer) involving one continuous dependent variable, y, and (1 or more) continuous explanatory variable(s), x. Let’s start with a popular parametric model of a wage equation that we can extend to a fully nonparametric regression model. First, we will compare and contrast the parametric and nonparametric approach towards univariate regression and then proceed to multivariate regression.

Let’s use the Canadian cross-section wage data (cps71), a random sample taken from the 1971 Canadian Census for male individuals having common education (High-School). It includes N=205 observations of 2 variables: the logarithm of the individual’s wage (logwage) and their age (age). The classical wage equation model includes a quadratic term of age.

# install.packages("np")
library("np")
data("cps71")

# (1) Linear Model -> R2 = 0.2308
model.lin <- lm( logwage ~ age + I(age^2), data = cps71)
summary(model.lin)

Call:
lm(formula = logwage ~ age + I(age^2), data = cps71)

Residuals:
Min 1Q Median 3Q Max
-2.4041 -0.1711 0.0884 0.3182 1.3940

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||10.0419773||0.4559986||22.022||< 2e-16 ***
|-
|Age||0.1731310||0.0238317|| 7.265||7.96e-12 ***
|-
|I(age^2)||-0.0019771||0.0002898||-6.822||1.02e-10 ***

|}
</center>

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.5608 on 202 degrees of freedom
Multiple R-squared: 0.2308, Adjusted R-squared: 0.2232
F-statistic: 30.3 on 2 and 202 DF, p-value: 3.103e-12

# (2) Next, we consider a nonparametric kernel regression with cross-validated bandwidth selection.
# npregbw() computes the least-squares cross-validated bandwidth for the local constant
# estimator (default), and the summary below accordingly reports a Local-Constant fit.
# To obtain a local linear fit instead, pass regtype = "ll" and bwmethod = "cv.aic" to npregbw().
# Note that R2 = 0.3108675
bandwidth <- npregbw(formula= logwage ~ age, data = cps71)
model.np <- npreg(bandwidth, regtype = "ll", bwmethod = "cv.aic", gradients = TRUE, data = cps71)
summary(model.np)

Regression Data: 205 training points, in 1 variable(s) age
Bandwidth(s): 1.892157
Kernel Regression Estimator: Local-Constant
Bandwidth Type: Fixed
Residual standard error: 0.5307943
R-squared: 0.3108675
Continuous Kernel Type: Second-Order Gaussian
No. Continuous Explanatory Vars.: 1

# NP model significance may be tested by
npsigtest(model.np)

Kernel Regression Significance Test
Type I Test with IID Bootstrap (399 replications, Pivot=TRUE, joint=FALSE)
Explanatory variables tested for significance: age (1)

age
Bandwidth(s): 1.892157

Individual Significance Tests
P Value:
age < 2.22e-16 ***

# So, as was the case for the linear parametric model, age is significant in the nonparametric (kernel regression) model

# (3) Graphical comparison of parametric and nonparametric models.
plot(cps71$\$$age, cps71$\$$logwage, xlab = "age", ylab = "log(wage)", cex=.1)
lines(cps71$\$$age, fitted(model.lin), lty = 2, col = "red")
lines(cps71$\$$age, fitted(model.np), lty = 1, col = "blue")
legend("topright", c("Data", "Linear", "Non-linear"), col=c("black", "red", "blue"), pch = c(1, NA, NA), lty = c(NA, 2, 1), bty='n', cex=.75)

<center>[[Image:SMHS_Methods14.png|500px]] </center>

# some additional plots presenting the parametric (quadratic, dashed line) and the nonparametric estimates
# (solid line) of the regression function for the cps71 data.
plot(model.np, plot.errors.method = "asymptotic")
plot(model.np, gradients = TRUE)
lines(cps71$\$$age, coef(model.lin)[2]+2*cps71$\$$age*coef(model.lin)[3], lty = 2, col = "red")
plot(model.np, gradients = TRUE, plot.errors.method = "asymptotic")

# (4) Use the parametric (Lin) and nonparametric (NP) models to generate predictions. Having obtained
# appropriate bandwidths and estimated a nonparametric model, we need to create a set of explanatory
# variables for which to generate predictions. These can be part of the original dataset or be
# outside its scope. Typically, we don’t have the outcome for the evaluation data and need only
# provide the explanatory variables for which predicted values are generated by the models.
# Occasionally, splitting the dataset into two independent samples (training/testing) allows estimation
# of a model on one sample and evaluation of its performance on another.

cps.eval.data <- data.frame(age = seq(10,70, by=10)) # simulate some explanatory X values (ages)
pred.lin <- predict(model.lin, newdata = cps.eval.data) # Linear Prediction of log(Wage)
pred.np <- predict(model.np, newdata = cps.eval.data) # non-Linear Prediction of log(Wage)
plot(pred.lin, pred.np)
abline(lm(pred.np ~ pred.lin))

<center>[[Image:SMHS_Methods15.png|500px]] </center>

.
.
.

==Predictive risk models ==

Predictive risk models represent a class of methods for identifying potential HTE when the individual patient's baseline risk of disease-related events depends on observed factors. Common risk measures include disease staging criteria (such as those used in COPD or heart failure), Framingham risk scores for cardiovascular event risk, and genetic variations (e.g., HER2 for breast cancer). Initial predictive risk modeling, also known as risk function estimation, is often performed without accounting for treatment effects. Least squares or Cox proportional hazards regression methods are appropriate in many cases and provide relatively more interpretable risk functions, but they rely on linearity assumptions and may not provide optimal predictive metrics. Partial least squares is an extension of least squares methods that can reduce the dimensionality of the predictor space by interposing latent variables, predicted by linear combinations of observable characteristics, as the intermediate predictors of one or more outcomes. Recursive partitioning methods (such as random forests), support vector machines, and neural networks are more recent methods that may offer better predictive power than linear methods. Risk function estimation can range from highly exploratory analyses to near meta-analytic model validation, and may be useful at any stage of product development.

HIV Example: The “hmohiv” dataset represents a study of HIV positive patients examining whether there was a difference in survival times of HIV positive patients between a cohort using intravenous drugs (drug=1) and a cohort not using the IV drug (drug=0). The hmohiv data includes the following variables:

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
|ID||Time||Age||Drug||Censor||Entdate||Enddate
|-
|1||5||46||0||1||5/15/1990||10/14/1990
|-
|2||6||35||1||0||9/19/1989||3/20/1990
|-
|3||8||30||1||1||4/21/1991||12/20/1991
|-
|4||3||30||1||1||1/3/1991||4/4/1991
|-
|5||22||36||0||1||9/18/1989||7/19/1991
|-
|6||1||32||1||0||3/18/1991||4/17/1991
|-
|...||...||...||...||...||...||...

|}
</center>

#cleaning up environment
rm(list=ls())

# load survival library
library(survival)

# load hmohiv data
hmohiv<-read.table("http://www.ats.ucla.edu/stat/r/examples/asa/hmohiv.csv", sep=",", header = TRUE)
attach(hmohiv)

# Fit Cox proportional hazards regression model
cox.model <- coxph( Surv(time, censor) ~ drug, method="breslow")

# construct a frame of the 2 cohorts IV_drug and no-IV-drug
# (it must exist before it is passed to survfit)
drug.new<-data.frame(drug=c(0,1))

# estimate the survival curves for the two cohorts
fit.1 <- survfit(cox.model, newdata=drug.new)

# plot results
plot(fit.1, xlab="Survival Time (Months)", ylab="Survival Probability")
points(fit.1$\$$time, fit.1$\$$surv[,1], pch=1)
points(fit.1$\$$time, fit.1$\$$surv[,2], pch=2)
legend(40, .8, c("Drug Absent", "Drug Present"), pch=c(1,2))

<center>[[Image:SMHS_Methods16.png|500px]] </center>

# to inspect the resulting Cox Proportional Hazard Model
cox.model
Call:
coxph(formula = Surv(time, censor) ~ drug, method = "breslow")

coef exp(coef) se(coef) z p
drug 0.779 2.18 0.242 3.22 0.0013

Likelihood ratio test=10.2 on 1 df, p=0.00141 n= 100, number of events= 80
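The printed coefficient can be translated into a hazard ratio with an approximate confidence interval. The sketch below uses only the rounded coef and se(coef) values from the output above and assumes a Wald-type (normal approximation) 95% interval; the variable names are hypothetical.

```r
# Rounded values copied from the coxph output above
coef.drug <- 0.779
se.drug   <- 0.242

hr <- exp(coef.drug)                                        # hazard ratio, exp(coef)
ci <- exp(coef.drug + c(-1, 1) * qnorm(0.975) * se.drug)    # approximate 95% Wald interval
z  <- coef.drug / se.drug                                   # Wald z statistic
p  <- 2 * pnorm(-abs(z))                                    # two-sided p-value

print(round(c(HR = hr, lower = ci[1], upper = ci[2], z = z, p = p), 4))
```

Under this model, the estimated hazard for the IV drug cohort (drug=1) is about 2.18 times that of the cohort not using IV drugs (drug=0), in agreement with the exp(coef) column of the output.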

===Footnotes===

*8 http://onlinelibrary.wiley.com/enhanced/doi/10.1002/jrsm.54
*9 http://effectivehealthcare.ahrq.gov/search-for-guides-reviews-and-reports/?pageaction=displayproduct&productID=1857
*10 http://jpepsy.oxfordjournals.org/content/39/2/138.full#sec-14
*11 http://www.ers.usda.gov/media/200576/err32c_1_.pdf

==[[SMHS_MethodsHeterogeneity_CER|Next see: Comparative Effectiveness Research (CER)]]==

*[[SMHS_MethodsHeterogeneity|Back to the Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research section]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php/SMHS_MethodsHeterogeneity_MetaAnalysis}}

SMHS MethodsHeterogeneity MetaAnalysis

2016-05-23T18:52:37Z

Pineaumi: /* Quantile Treatment Effect (QTE) */

==[[SMHS_MethodsHeterogeneity| Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research]] - Meta-Analyses ==

==Meta-analysis==

===Overview===

Meta-analysis is an approach for combining treatment effects across trials or studies into an aggregated treatment effect with higher statistical power than any individual trial provides. It may detect HTE by testing for differences in treatment effects across similar RCTs. It requires that the individual treatment effects are similar enough for pooling to be meaningful. In the presence of large clinical or methodological differences between the trials, it may be best to avoid meta-analysis. The presence of HTE across studies in a meta-analysis may be due to differences in the design or execution of the individual trials (e.g., randomization methods, patient selection criteria). Cochran's Q is a method for detecting heterogeneity; it is computed as the weighted sum of squared differences between each study's treatment effect and the pooled effect across the studies. It is a barometer of inter-trial differences impacting the observed study results. A possible source of error in a meta-analysis is publication bias. Trial size may introduce publication bias since larger trials are more likely to be published. Language and accessibility represent other potential confounding factors. When the heterogeneity is not due to poor study design, it may be useful for optimizing the treatment benefits for different cohorts of participants.

Cochran's Q statistic is the weighted sum of squares on a standardized scale<sup>8</sup>. The corresponding p-value indicates the strength of the evidence for the presence of heterogeneity. Because this test may have low power to detect heterogeneity when the number of studies is small, a cut-off of 0.10 (rather than 0.05) is often suggested for significance (Higgins et al., 2003). Conversely, the Q statistic may have too much power as a test of heterogeneity when the number of studies is large.

===Simulation Example 1===

# Install and Load library
install.packages("meta")
library(meta)

# Set number of studies
n.studies = 15

# number of treatments: case1, case2, control
n.trt = 3

# number of outcomes
n.event = 2

# simulate the (balanced) number of cases (case1 and case2) and controls in each study
ctl.group = rbinom(n = n.studies, size = 200, prob = 0.3)
case1.group = rbinom(n = n.studies, size = 200, prob = 0.3)
case2.group = rbinom(n = n.studies, size = 200, prob = 0.3)

# Simulate the number of outcome events (e.g., deaths) and no events in the control group
event.ctl.group = rbinom(n = n.studies, size = ctl.group, prob = rep(0.1, length(ctl.group)))
noevent.ctl.group = ctl.group - event.ctl.group

# Simulate the number of events and no events in the case1 group
event.case1.group = rbinom(n = n.studies, size = case1.group, prob = rep(0.5, length(case1.group)))
noevent.case1.group = case1.group - event.case1.group

# Simulate the number of events and no events in the case2 group
event.case2.group = rbinom(n = n.studies, size = case2.group, prob = rep(0.6, length(case2.group)))
noevent.case2.group = case2.group - event.case2.group

# Run the univariate meta-analysis using metabin(), Meta-analysis of binary outcome data –
# Calculation of fixed and random effects estimates (risk ratio, odds ratio, risk difference or arcsine
# difference) for meta-analyses with binary outcome data. Mantel-Haenszel (MH),
# inverse variance and Peto method are available for pooling.

# method = a character string indicating which method is used for pooling of studies,
# e.g., "MH", "Inverse", or "Peto"
# sm = a character string indicating which summary measure ("OR", "RR", or "RD" = risk difference) is
# used for pooling of studies

# Control vs. Case1, n.e and n.c are numbers in experimental and control groups
meta.ctr_case1 <- metabin(event.e = event.case1.group, n.e = case1.group, event.c = event.ctl.group,
n.c = ctl.group, method = "MH", sm = "OR")
# here the summary measure is the odds ratio: the odds of the event (e.g., death) in the experimental group relative to the control group
forest(meta.ctr_case1)

<center>[[Image:SMHS_Methods8.png|500px]] </center>

# Control vs. Case2
meta.ctr_case2 <- metabin(event.e = event.case2.group, n.e = case2.group, event.c = event.ctl.group,
n.c = ctl.group, method = "MH", sm = "OR")
forest(meta.ctr_case2)

<center>[[Image:SMHS_Methods9.png|500px]] </center>

# Case1 vs. Case2
meta.case1_case2 <- metabin(event.e = event.case1.group, n.e = case1.group, event.c = event.case2.group,
n.c = case2.group, method = "MH", sm = "OR")
forest(meta.case1_case2)
summary(meta.case1_case2)

Test of heterogeneity:
Q d.f. p-value
11.99 14 0.6071

<center>[[Image:SMHS_Methods10.png|500px]] </center>

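In the Case1 vs. Case2 comparison, Q = 11.99 on 14 degrees of freedom (p = 0.6071), so there is no evidence to reject the null hypothesis of no between-study heterogeneity, and the fixed-effects model is appropriate. Because Q is smaller than its degrees of freedom, the corresponding ''I''<sup>2</sup> statistic is 0%.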

==Series of “N of 1” trials==

This technique combines data from a series of n-of-1 trials to identify HTE. An n-of-1 trial is a repeated crossover trial for a single patient: the patient is randomly assigned to one treatment vs. another for a given time period, and is then re-randomized to treatment for the next time period, usually for 4-6 time periods in total. Such trials are most feasible in chronic conditions, where little or no washout period is needed between treatments and treatment effects are identifiable in the short term, such as pain or reliable surrogate markers. Combining data from identical n-of-1 trials across a set of patients enables statistical analysis that controls for patient fixed or random effects, covariates, centers, or sequence effects (see Figure below). These combined trials are often analyzed within a Bayesian context using shrinkage estimators that combine individual and group mean treatment effects to create a “posterior” individual mean treatment effect estimate, which is a form of inverse variance-weighted average of the individual and group effects. Such trials are typically more expensive than standard RCTs on a per-patient basis. However, they require much smaller sample sizes, often fewer than 100 patients (due to the efficient individual-as-own-control design), and yield individual treatment effect estimates that are not possible in a non-crossover design<sup>9</sup>. For the individual patient, the treatment effect can be re-estimated after each time period, and the trial can be stopped at any point when the more effective treatment is identified with reasonable statistical certainty.
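The shrinkage idea can be illustrated with a small base-R sketch of a normal-normal posterior mean. All numbers are invented, and the group mean (<code>mu</code>) and between-patient variance (<code>tau2</code>) are simply assumed here rather than estimated, so this is only a sketch of the weighting scheme, not a full Bayesian analysis.

```r
# Hypothetical individual treatment-effect estimates and their sampling variances
y      <- c(12, 30, 5, 22, 18, 40, 9, 25)     # per-patient estimated effects
sigma2 <- c(36, 25, 49, 16, 30, 64, 20, 28)   # per-patient sampling variances

# Assumed (not estimated) group-level mean and between-patient variance
mu   <- weighted.mean(y, 1 / sigma2)
tau2 <- 50

# Posterior mean: inverse variance-weighted average of the individual and group effects
B <- (1 / sigma2) / (1 / sigma2 + 1 / tau2)   # weight placed on the individual estimate
theta.post <- B * y + (1 - B) * mu

cbind(y, weight.individual = round(B, 2), theta.post = round(theta.post, 2))
```

Patients whose own estimates are noisier (larger <code>sigma2</code>) receive a smaller weight and are pulled more strongly toward the group mean.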

====Example====

A study involving 8 participants collected data across 30 days, with 15 treatment days and 15 control days randomly assigned within each participant<sup>10</sup>. The treatment is represented as a binary variable (control day=0; treatment day=1). The outcome variable represents the response to the intervention within each of the 8 participants. The study employed fixed-effects modeling, creating N − 1 dummy-coded variables to represent the N=8 participants, where the last (i=8) participant serves as the reference (i.e., is absorbed into the model intercept). Each dummy-coded variable therefore represents the difference between participant i and the 8th participant, so all other participants' values are relative to those of the 8th (reference) subject. The overall differences across participants in fixed effects can be evaluated with multiple degree-of-freedom F-tests.
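The dummy-coding scheme can be sketched in base R on simulated stand-in data (not the study data). The data frame <code>sim</code>, its effect sizes, and the noise level are assumptions made only for illustration.

```r
set.seed(1)
# Simulated stand-in data: 8 participants x 30 days, 15 treatment and 15 control days each
sim <- data.frame(
  ID = factor(rep(1:8, each = 30)),
  Tx = as.vector(replicate(8, sample(rep(0:1, each = 15))))
)
sim$PhyAct <- 50 + 5 * as.integer(sim$ID) + 20 * sim$Tx + rnorm(nrow(sim), sd = 25)

# Use the 8th participant as the reference level: R then creates N - 1 = 7 dummy variables
sim$ID <- relevel(sim$ID, ref = "8")
head(model.matrix(~ ID, data = sim), 3)

# Fixed-effects model with participant-specific treatment effects
fit <- lm(PhyAct ~ Tx * ID, data = sim)
anova(fit)   # multiple degree-of-freedom F-tests: ID (7 df) and Tx:ID (7 df)
```

With 240 observations and 16 estimated coefficients, the residual degrees of freedom are 224, and the <code>ID</code> and <code>Tx:ID</code> rows of the ANOVA table each carry 7 degrees of freedom.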

<center>[[Image:SMHS_Methods11.png|500px]] </center>

<center>

{| class="wikitable" style="text-align:center; " border="1"
|-
|ID||Day||Tx||SelfEff||SelfEff25||WPSS||SocSuppt||PMss||PMss3||PhyAct
|-
|1||1||1||33||8||0.97||5.00||4.03||1.03||53
|-
|1||2||1||33||8||-0.17||3.87||4.03||1.03||73
|-
|1||3||0||33||8||0.81||4.84||4.03||1.03||23
|-
|1||4||0||33||8||-0.41||3.62||4.03||1.03||36
|-
|...||...||...||...||...||...||...||...||...||...

|}
</center> Complete data is available in the Appendix.

<center>Data Summary

{| class="wikitable" style="text-align:center; " border="1"
|-
|Intercept||Constant
|-
|Physical Activity||PhyAct
|-
|Intervention||Tx
|-
|WP Social Support||WPSS
|-
|PM Social Support (1-3)||PMss3
|-
|Self Efficacy||SelfEff25

|}
</center>

rm(list=ls())
Nof1 <-read.table("https://umich.instructure.com/files/330385/download?download_frd=1&verifier=DwJUGSd6t24dvK7uYmzA2aDyzlmsohyaK6P7jK0Q", sep=",", header = TRUE) # 02_Nof1_Data.csv
attach(Nof1)
head(Nof1)

<center>

{| class="wikitable" style="text-align:center; " border="1"
|-
| ||ID||Day||Tx||SelfEff||SelfEff25||WPSS||SocSuppt||PMss||PMss3||PhyAct
|-
|1||1||1||1||33||8||0.97||5.00||4.03||1.03||53
|-
|2||1||2||1||33||8||-0.17||3.87||4.03||1.03||73
|-
|3||1||3||0||33||8||0.81||4.84||4.03||1.03||23
|-
|4||1||4||0||33||8||-0.41||3.62||4.03||1.03||36
|-
|5||1||5||1||33||8||0.59||4.62||4.03||1.03||21
|-
|6||1||6||1||33||8||-1.16||2.87||4.03||1.03||0

|}
</center>

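# include every variable used by the models below (ID, Day and SelfEff are needed by lmer)
df.1 = data.frame(ID, Day, PhyAct, Tx, SelfEff, WPSS, PMss3, SelfEff25)

library("lme4")

# mixed model with random intercepts for Day and ID
lm.1 = model.lmer <- lmer(PhyAct ~ Tx + SelfEff + Tx*SelfEff + (1|Day) + (1|ID), data = df.1)
summary(lm.1)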

Linear mixed model fit by REML ['lmerMod']
Formula: PhyAct ~ Tx + SelfEff + Tx * SelfEff + (1 | Day) + (1 | ID)
Data: df.1

REML criterion at convergence: 8820

<center> Scaled Residuals
{| class="wikitable" style="text-align:center; " border="1"
|-
| Min||1Q||Median||3Q||Max
|-
|-2.7012||-0.6833||-0.0333||0.6542||3.9612
|}
</center>

<center> Random Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| Groups ||Name||Variance ||Std.Dev.
|-
| Day||(Intercept) ||0.0 || 0.00
|-

|ID|| (Intercept)||601.5||24.53
|-

|Residual|| ||969.0 ||31.13
|}
Number of obs: 900, groups: Day, 30; ID, 30
</center>

<center> Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t value
|-
|(Intercept)||38.3772||14.4738||2.651
|-
|Tx||4.0283||6.3745||0.632
|-
|SelfEff||0.5818||0.5942||0.979
|-
|Tx:SelfEff||0.9702||0.2617||3.708
|}
</center>

<center> Correlation of Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||(Intr)||Tx ||SlfEff
|-
| Tx|| -0.220|| ||
|-
| SelfEff||-0.946 ||0.208 ||
|-
| Tx:SelfEff ||0.208 ||-0.946 ||-0.220
|}
</center>

# Model: PhyAct = Tx + WPSS + PMss3 + Tx*WPSS + Tx*PMss3 + SelfEff25 + Tx*SelfEff25 + ε
lm.2 = lm(PhyAct ~ Tx + WPSS + PMss3 + Tx*WPSS + Tx*PMss3 + SelfEff25 + Tx*SelfEff25, df.1)
summary(lm.2)

Call:
lm(formula = PhyAct ~ Tx + WPSS + PMss3 + Tx * WPSS + Tx * PMss3 +
SelfEff25 + Tx * SelfEff25, data = df.1)

<center> Residuals
{| class="wikitable" style="text-align:center; " border="1"
|-
|Min||1Q||Median||3Q||Max
|-
| -102.39||-28.24||-1.47||25.16||122.41

|}
</center>

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t value||$Pr(>|t|)$
|-
|(Intercept)||52.0067||1.8080||28.764||< 2e-16 ***
|-
|Tx||27.7366||2.5569||10.848||< 2e-16 ***
|-
|WPSS||1.9631||2.4272||0.809||0.418853
|-
|PMss3||13.5110||2.7853||4.851||1.45e-06 ***
|-
|SelfEff25||0.6289||0.2205||2.852||0.004439 **
|-
|Tx:WPSS||9.9114||3.4320||2.888||0.003971 **
|-
|Tx:PMss3||8.8422||3.9390||2.245||0.025025 *
|-
|Tx:SelfEff25||1.0460||0.3118||3.354||0.000829 ***

|}
</center>

[Using SAS (StudyI_Analyses.sas, StudyIIab_Analyses.sas)]

<center> Type 3 Tests of Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
|Effect||Num DF||Den DF||F Value||$Pr>F$
|-
|Tx||1||224||67.46||<.0001
|-
|ID||7||224||25.95||<.0001
|-
|Tx*ID||7||224||2.92||0.0060
|}
</center>

==Quantile Treatment Effect (QTE)==

QTE employs quantile regression estimation (QRE) to examine the central tendency and statistical dispersion of the treatment effect in a population. These may not be revealed by the conventional mean estimation in RCTs. For instance, patients with different comorbidity scores may respond differently to a treatment. Quantile regression can reveal HTE according to the ranking of patients’ comorbidity scores or some other relevant covariate by which patients may be ranked. Therefore, in an attempt to inform patient-centered care, quantile regression provides more information on the distribution of the treatment effect than typical conditional mean treatment effect estimation. QTE characterizes the heterogeneous treatment effect on individuals and groups across various positions in the distributions of different outcomes of interest. This feature has drawn substantial attention to quantile regression analysis, which has been employed across a wide range of applications, particularly in evaluating the economic effects of welfare reform.

One caveat of applying QRE in clinical trials for examining HTE is that the QTE doesn’t demonstrate the treatment effect for a given patient. Instead, it focuses on the treatment effect among subjects within the qth quantile, such as those who are exactly at the top 10th percentile of blood pressure, a depression score, or some other covariate of interest (for example, a comorbidity score). It is not uncommon for the qth quantiles before and after the treatment to comprise two different sets of patients. For this reason, we have to assume that patients who fall in the same quantile of the two groups are homogeneous.
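A small base-R simulation can make this caveat concrete. The outcome distributions below are invented for illustration; the QTE is computed as the difference between the marginal quantiles of the treated and control outcomes, which compares distributions rather than individual patients.

```r
set.seed(4)
n <- 500
# Simulated outcomes: treatment shifts the mean by 2 and also widens the spread
y.control <- rnorm(n, mean = 10, sd = 2)
y.treated <- rnorm(n, mean = 12, sd = 4)

taus <- c(0.10, 0.25, 0.50, 0.75, 0.90)
qte <- quantile(y.treated, taus) - quantile(y.control, taus)   # difference of marginal quantiles
ate <- mean(y.treated) - mean(y.control)                        # conventional mean effect

round(rbind(QTE = qte, ATE = ate), 2)
```

In this setup the mean effect is a single number, whereas the quantile effects grow from the lower to the upper tail; neither quantity identifies how any one simulated subject would have responded.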

Income-Food Expenditure Example: Let’s examine the Engel data (N=235) on the relationship between food expenditure (foodexp) and household income (income)<sup>11</sup>. We can plot the data and then explore the superposition of the six fitted quantile regression lines.

install.packages("quantreg")
library(quantreg)
data(engel)
attach(engel)

<center>head(engel)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Income||Foodexp
|-
|1||420.1577||255.8394
|-
|2||541.4117||310.9587
|-
|3||901.1575||485.6800
|-
|4||639.0802||402.9974
|-
|5||750.8756||495.5608
|-
|6||945.7989||633.7978

|}
</center>

<center>summary(engel)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Income||Foodexp
|-
|Min||377.1||242.3
|-
|1st Qu.||638.9||429.7
|-
|Median||884.0||582.5
|-
|Mean||982.5||624.2
|-
|3rd Qu.||1164.0||743.9
|-
|Max||4957.8||2032.7

|}
</center>

Note: If $Y$ is a real-valued random variable with cumulative distribution function $F_Y(y)=P(Y\le y)$, then the $\tau$-quantile of $Y$ is given by

<center> $Q_Y(\tau)=F_Y^{-1}(\tau)=\inf\{y: F_Y(y)\ge\tau\}$, </center>

where $0\le\tau\le 1$.
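The definition can be checked numerically with the empirical distribution function of a sample. The sketch below uses a simulated sample and a helper <code>Q()</code> introduced here for illustration; it compares the definition with the <code>type = 1</code> option of base R's <code>quantile()</code>, which inverts the empirical CDF.

```r
set.seed(2)
y <- rnorm(20)                 # hypothetical sample
F.hat <- ecdf(y)               # empirical CDF, F_Y(y) = P(Y <= y)

# tau-quantile as the smallest observed y with F_Y(y) >= tau
Q <- function(tau) min(y[F.hat(y) >= tau])

taus <- c(0.25, 0.5, 0.75)
cbind(tau = taus,
      by.definition = sapply(taus, Q),
      quantile.type1 = quantile(y, probs = taus, type = 1))
```

For an empirical CDF the infimum is attained at an observed value, so restricting the search to the sample points is sufficient.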

<center>[[Image:SMHS_Methods12.png|500px]] </center>

# (1) Graphics
plot(income, foodexp, cex=.25, type="n", xlab="Household Income", ylab="Food Expenditure")
points(income, foodexp, cex=.5, col="blue")

# tau - the quantile(s) to be estimated, in the range from 0 to 1. An object "rq.process" and an object "rqs"
# are returned containing the matrix of coefficient estimates at the specified quantiles.
abline( rq(foodexp ~ income, tau=.5), col="blue") # Quantile Regression Model

abline( lm(foodexp ~ income), lty=2, lwd=3, col="red") # linear model
taus <- c(0.05, 0.1, 0.25, 0.75, 0.90, 0.95)
colors <- rainbow(length(taus))

models <- vector(mode = "list", length = length(taus)) # list storing the QR fits for the different taus
model.names <- character(length(taus)) # character vector of model names (legend labels)

for (i in seq_along(taus)) {
models[[i]] <- rq(foodexp ~ income, tau=taus[i])
model.names[i] <- paste("Model [", i , "]: tau=", taus[i])
abline( models[[i]], lwd=2, col= colors[i])
}
# label each fitted line by its color (line type) rather than by a plotting symbol
legend(3000, 1100, model.names, col= colors, lty=1, lwd=2, bty='n', cex=.75)

<center>[[Image:SMHS_Methods13.png|500px]] </center>

# (2) Inference about quantile regression coefficients. As an alternative to the rank-inversion confidence intervals, we can obtain a table of coefficients, standard errors, t-statistics, and p-values using the summary function:

summary(models[[3]], se = "nid")

Call: rq(formula = foodexp ~ income, tau = taus[i])

tau: [1] 0.25

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Value||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||95.48354||21.39237||4.46344||0.00001
|-
|Income||0.47410||0.02906||16.31729||0.00000

|}
</center>

# Alternatively, summary.rq can be called directly. With se = "nid" it reproduces the table above;
# passing se = "boot" instead would compute bootstrapped standard errors.
summary.rq(models[[3]], se = "nid")

Call: rq(formula = foodexp ~ income, tau = taus[i])
tau: [1] 0.25

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Value||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||95.48354||21.39237||4.46344||0.00001
|-
|Income||0.47410||0.02906||16.31729||0.00000

|}
</center>

==Nonparametric Regression Methods ==

Nonparametric regression enables dealing with HTE in RCTs. Different nonparametric methods, such as kernel smoothing methods and series methods, can be used to generate test statistics for examining the presence of HTE. A kernel method is a weighting scheme based on a kernel function (e.g., uniform, Gaussian). When evaluating the treatment effect of a patient in RCTs, the kernel method assigns larger weights to observations with similar covariates, because patients with similar covariates are assumed to provide more relevant data on the predicted treatment response. When participants have different backgrounds (e.g., demographic, clinical), kernel smoothing methods still utilize information from highly divergent participants when estimating a particular subject’s treatment effect, but assign lower weights to very different subjects. Kernel methods require choosing a set of smoothing parameters (bandwidths) to group patients according to their relative degree of similarity. A drawback is that the corresponding test statistics may be sensitive to the chosen bandwidths, which complicates the interpretation of the results. Series methods use approximating functions (splines or power series of the explanatory variables) to construct test statistics. Compared to kernel smoothing methods, series methods normally have the advantage of computational convenience; however, the precision of the test statistics depends on the number of terms selected in the series.
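The kernel weighting scheme and its sensitivity to the bandwidth can be sketched with a hand-written local-constant (Nadaraya-Watson) smoother in base R. The data are simulated, and the helper <code>nw()</code> and the two bandwidth values are assumptions chosen only to show the contrast; this is not the estimator of any specific package.

```r
set.seed(3)
# Simulated stand-in data: one covariate x and a nonlinear response y
x <- sort(runif(200, 0, 10))
y <- sin(x) + rnorm(200, sd = 0.3)

# Local-constant (Nadaraya-Watson) estimate at x0 with a Gaussian kernel and bandwidth h
nw <- function(x0, x, y, h) {
  w <- dnorm((x - x0) / h)      # larger weights for observations with similar covariates
  sum(w * y) / sum(w)
}

grid <- seq(0, 10, by = 0.1)
fit.narrow <- sapply(grid, nw, x = x, y = y, h = 0.3)
fit.wide   <- sapply(grid, nw, x = x, y = y, h = 3)    # a much larger bandwidth

plot(x, y, cex = 0.4, col = "grey40")
lines(grid, fit.narrow, col = "blue", lwd = 2)
lines(grid, fit.wide, col = "red", lwd = 2, lty = 2)
```

The two curves differ markedly even though they use the same data and kernel, which illustrates why results based on kernel smoothing should be reported together with the bandwidth selection method.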

Canadian Wage Data Example: Nonparametric regression extends the classical parametric regression (e.g., lm, lmer) involving one continuous dependent variable, y, and (1 or more) continuous explanatory variable(s), x. Let’s start with a popular parametric model of a wage equation that we can extend to a fully nonparametric regression model. First, we will compare and contrast the parametric and nonparametric approach towards univariate regression and then proceed to multivariate regression.

Let’s use the Canadian cross-section wage data (cps71), a random sample taken from the 1971 Canadian Census for male individuals having common education (High-School). It contains N=205 observations on 2 variables: the logarithm of the individual’s wage (logwage) and their age (age). The classical wage equation model includes a quadratic term of age.

# install.packages("np")
library("np")
data("cps71")

# (1) Linear Model -> R2 = 0.2308
model.lin <- lm( logwage ~ age + I(age^2), data = cps71)
summary(model.lin)

Call:
lm(formula = logwage ~ age + I(age^2), data = cps71)

Residuals:
Min 1Q Median 3Q Max
-2.4041 -0.1711 0.0884 0.3182 1.3940

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||10.0419773||0.4559986||22.022||< 2e-16 ***
|-
|Age||0.1731310||0.0238317|| 7.265||7.96e-12 ***
|-
|I(age^2)||-0.0019771||0.0002898||-6.822||1.02e-10 ***

|}
</center>

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.5608 on 202 degrees of freedom
Multiple R-squared: 0.2308, Adjusted R-squared: 0.2232
F-statistic: 30.3 on 2 and 202 DF, p-value: 3.103e-12

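# (2) Next, we consider a kernel (nonparametric) regression with cross-validated bandwidth
# selection. Start by computing the least-squares cross-validated bandwidth for the
# local constant estimator (the npregbw default). Because npreg() is given this bandwidth
# object, the resulting fit is local constant (see the summary below); for a local linear fit,
# regtype = "ll" and bwmethod = "cv.aic" would need to be passed to npregbw() instead.
# Note that R2 = 0.3108675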
bandwidth <- npregbw(formula= logwage ~ age, data = cps71)
model.np <- npreg(bandwidth, regtype = "ll", bwmethod = "cv.aic", gradients = TRUE, data = cps71)
summary(model.np)

Regression Data: 205 training points, in 1 variable(s) age
Bandwidth(s): 1.892157
Kernel Regression Estimator: Local-Constant
Bandwidth Type: Fixed
Residual standard error: 0.5307943
R-squared: 0.3108675
Continuous Kernel Type: Second-Order Gaussian
No. Continuous Explanatory Vars.: 1

# NP model significance may be tested by
npsigtest(model.np)

Kernel Regression Significance Test
Type I Test with IID Bootstrap (399 replications, Pivot=TRUE, joint=FALSE)
Explanatory variables tested for significance: age (1)

age
Bandwidth(s): 1.892157

Individual Significance Tests
P Value:
age < 2.22e-16 ***

# So, as was the case for the linear parametric model, age is significant in the nonparametric (kernel) model

# (3) Graphical comparison of parametric and nonparametric models.
plot(cps71$age, cps71$logwage, xlab = "age", ylab = "log(wage)", cex=.1)
lines(cps71$age, fitted(model.lin), lty = 2, col = "red")
lines(cps71$age, fitted(model.np), lty = 1, col = "blue")
legend("topright", c("Data", "Parametric (quadratic)", "Nonparametric"), col=c("black", "red", "blue"), pch = c(1, NA, NA), lty = c(NA, 2, 1), bty='n', cex=.75)

<center>[[Image:SMHS_Methods14.png|500px]] </center>

# some additional plots presenting the parametric (quadratic, dashed line) and the nonparametric estimates
# (solid line) of the regression function for the cps71 data.
plot(model.np, plot.errors.method = "asymptotic")
plot(model.np, gradients = TRUE)
lines(cps71$age, coef(model.lin)[2]+2*cps71$age*coef(model.lin)[3], lty = 2, col = "red")
plot(model.np, gradients = TRUE, plot.errors.method = "asymptotic")

# (4) Use the parametric and nonparametric models to generate predictions, based on the selected
# bandwidths and the estimated nonparametric model. We need to create a set of explanatory
# variables for which to generate predictions. These can be part of the original dataset or be
# outside its scope. Typically, we don’t have the outcome for the evaluation data and need only
# provide the explanatory variables for which predicted values are generated by the models.
# Occasionally, splitting the dataset into two independent samples (training/testing) allows estimation
# of a model on one sample and evaluation of its performance on the other.

cps.eval.data <- data.frame(age = seq(10,70, by=10)) # simulate some explanatory X values (ages)
pred.lin <- predict(model.lin, newdata = cps.eval.data) # Linear Prediction of log(Wage)
pred.np <- predict(model.np, newdata = cps.eval.data) # non-Linear Prediction of log(Wage)
plot(pred.lin, pred.np)
abline(lm(pred.np ~ pred.lin))

<center>[[Image:SMHS_Methods15.png|500px]] </center>

.
.
.

==Predictive risk models ==

Predictive risk models represent a class of methods for identifying potential for HTE when the individual patient risk for disease-related events at baseline depends on observed factors. For instance, common measures are disease staging criteria, such as those used in COPD or heart failure, Framingham risk scores for cardiovascular event risk, or genetic variations, e.g., HER2 for breast cancer. Initial predictive risk modeling, also known as risk function estimation, is often performed without accounting for treatment effects. Least squares or Cox proportional hazards regression methods are appropriate in many cases and provide relatively more interpretable risk functions, but they rely on linearity assumptions and may not provide optimal predictive metrics. Partial least squares is an extension of least squares methods that can reduce the dimensionality of the predictor space by interposing latent variables, predicted by linear combinations of observable characteristics, as the intermediate predictors of one or more outcomes. Recursive partitioning methods (e.g., random forests), support vector machines, and neural networks are later methods that often achieve better predictive power than linear methods. Risk function estimation can range from highly exploratory analyses to near meta-analytic model validation, and may be useful at any stage of product development.

HIV Example: The “hmohiv” dataset represents a study of HIV positive patients examining whether there was a difference in survival times of HIV positive patients between a cohort using intravenous drugs (drug=1) and a cohort not using the IV drug (drug=0). The hmohiv data includes the following variables:

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
|ID||Time||Age||Drug||Censor||Entdate||Enddate
|-
|1||5||46||0||1||5/15/1990||10/14/1990
|-
|2||6||35||1||0||9/19/1989||3/20/1990
|-
|3||8||30||1||1||4/21/1991||12/20/1991
|-
|4||3||30||1||1||1/3/1991||4/4/1991
|-
|5||22||36||0||1||9/18/1989||7/19/1991
|-
|6||1||32||1||0||3/18/1991||4/17/1991
|-
|...||...||...||...||...||...||...

|}
</center>

#cleaning up environment
rm(list=ls())

# load survival library
library(survival)

# load hmohiv data
hmohiv<-read.table("http://www.ats.ucla.edu/stat/r/examples/asa/hmohiv.csv", sep=",", header = TRUE)
attach(hmohiv)

# Fit Cox proportional hazards regression model
cox.model <- coxph( Surv(time, censor) ~ drug, method="breslow")

# construct a frame of the 2 cohorts IV_drug and no-IV-drug (it must exist before survfit is called)
drug.new<-data.frame(drug=c(0,1))
fit.1 <- survfit(cox.model, newdata=drug.new)

# plot results
plot(fit.1, xlab="Survival Time (Months)", ylab="Survival Probability")
points(fit.1$time, fit.1$surv[,1], pch=1)
points(fit.1$time, fit.1$surv[,2], pch=2)
legend(40, .8, c("Drug Absent", "Drug Present"), pch=c(1,2))

<center>[[Image:SMHS_Methods16.png|500px]] </center>

# to inspect the resulting Cox Proportional Hazard Model
cox.model
Call:
coxph(formula = Surv(time, censor) ~ drug, method = "breslow")

coef exp(coef) se(coef) z p
drug 0.779 2.18 0.242 3.22 0.0013

Likelihood ratio test=10.2 on 1 df, p=0.00141 n= 100, number of events= 80

===Footnotes===

*8 http://onlinelibrary.wiley.com/enhanced/doi/10.1002/jrsm.54
*9 http://effectivehealthcare.ahrq.gov/search-for-guides-reviews-and-reports/?pageaction=displayproduct&productID=1857
*10 http://jpepsy.oxfordjournals.org/content/39/2/138.full#sec-14
*11 http://www.ers.usda.gov/media/200576/err32c_1_.pdf

==[[SMHS_MethodsHeterogeneity_CER|Next see: Comparative Effectiveness Research (CER)]]==

*[[SMHS_MethodsHeterogeneity|Back to the Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research section]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php/SMHS_MethodsHeterogeneity_MetaAnalysis}}

SMHS MethodsHeterogeneity MetaAnalysis

2016-05-23T18:52:00Z

Pineaumi: /* Footnotes */

==[[SMHS_MethodsHeterogeneity| Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research]] - Meta-Analyses ==

==Meta-analysis==

===Overview===

Meta-analysis is an approach to combine treatment effects across trials or studies into an aggregated treatment effect with higher statistical power than that of any individual trial. It may detect HTE by testing for differences in treatment effects across similar RCTs. Pooling is meaningful only when the individual treatment effects are similar; in the presence of large clinical or methodological differences between the trials, it may be better to avoid meta-analysis. The presence of HTE across studies in a meta-analysis may be due to differences in the design or execution of the individual trials (e.g., randomization methods, patient selection criteria). Cochran's Q is a method for detecting heterogeneity, computed as the weighted sum of squared differences between each study's treatment effect and the pooled effect across the studies. It is a barometer of inter-trial differences impacting the observed study results. A possible source of error in a meta-analysis is publication bias. Trial size may introduce publication bias since larger trials are more likely to be published. Language and accessibility represent other potential confounding factors. When the heterogeneity is not due to poor study design, it may be useful for optimizing the treatment benefits for different cohorts of participants.

Cochran's Q statistic is the weighted sum of squares on a standardized scale<sup>8</sup>. The corresponding P value indicates the strength of the evidence for the presence of heterogeneity. This test may have low power to detect heterogeneity when the number of studies is small, so a value of 0.10 is suggested as the cut-off for significance (Higgins et al., 2003). Conversely, the Q statistic may have too much power as a test of heterogeneity when the number of studies is large.

===Simulation Example 1===

# Install and Load library
install.packages("meta")
library(meta)

# Set number of studies
n.studies = 15

# number of treatments: case1, case2, control
n.trt = 3

# number of outcomes
n.event = 2

# simulate the (balanced) number of cases (case1 and case2) and controls in each study
ctl.group = rbinom(n = n.studies, size = 200, prob = 0.3)
case1.group = rbinom(n = n.studies, size = 200, prob = 0.3)
case2.group = rbinom(n = n.studies, size = 200, prob = 0.3)

# Simulate the number of outcome events (e.g., deaths) and no events in the control group
event.ctl.group = rbinom(n = n.studies, size = ctl.group, prob = rep(0.1, length(ctl.group)))
noevent.ctl.group = ctl.group - event.ctl.group

# Simulate the number of events and no events in the case1 group
event.case1.group = rbinom(n = n.studies, size = case1.group, prob = rep(0.5, length(case1.group)))
noevent.case1.group = case1.group - event.case1.group

# Simulate the number of events and no events in the case2 group
event.case2.group = rbinom(n = n.studies, size = case2.group, prob = rep(0.6, length(case2.group)))
noevent.case2.group = case2.group - event.case2.group

# Run the univariate meta-analysis using metabin(), Meta-analysis of binary outcome data –
# Calculation of fixed and random effects estimates (risk ratio, odds ratio, risk difference or arcsine
# difference) for meta-analyses with binary outcome data. Mantel-Haenszel (MH),
# inverse variance and Peto method are available for pooling.

# method = a character string indicating which method is used for pooling of studies,
# e.g., "MH", "Inverse", or "Peto"
# sm = a character string indicating which summary measure ("OR", "RR", or "RD" = risk difference) is
# used for pooling of studies

# Control vs. Case1, n.e and n.c are numbers in experimental and control groups
meta.ctr_case1 <- metabin(event.e = event.case1.group, n.e = case1.group, event.c = event.ctl.group,
n.c = ctl.group, method = "MH", sm = "OR")
# here the summary measure is the odds ratio: the odds of the event (e.g., death) in the experimental group relative to the control group
forest(meta.ctr_case1)

<center>[[Image:SMHS_Methods8.png|500px]] </center>

# Control vs. Case2
meta.ctr_case2 <- metabin(event.e = event.case2.group, n.e = case2.group, event.c = event.ctl.group,
n.c = ctl.group, method = "MH", sm = "OR")
forest(meta.ctr_case2)

<center>[[Image:SMHS_Methods9.png|500px]] </center>

# Case1 vs. Case2
meta.case1_case2 <- metabin(event.e = event.case1.group, n.e = case1.group, event.c = event.case2.group,
n.c = case2.group, method = "MH", sm = "OR")
forest(meta.case1_case2)
summary(meta.case1_case2)

Test of heterogeneity:
Q d.f. p-value
11.99 14 0.6071

<center>[[Image:SMHS_Methods10.png|500px]] </center>

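In the Case1 vs. Case2 comparison, Q = 11.99 on 14 degrees of freedom (p = 0.6071), so there is no evidence to reject the null hypothesis of no between-study heterogeneity, and the fixed-effects model is appropriate. Because Q is smaller than its degrees of freedom, the corresponding ''I''<sup>2</sup> statistic is 0%.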

==Series of “N of 1” trials==

This technique combines data from a series of n-of-1 trials to identify HTE. An n-of-1 trial is a repeated crossover trial for a single patient: the patient is randomly assigned to one treatment vs. another for a given time period, and is then re-randomized to treatment for the next time period, usually for 4-6 time periods in total. Such trials are most feasible in chronic conditions, where little or no washout period is needed between treatments and treatment effects are identifiable in the short term, such as pain or reliable surrogate markers. Combining data from identical n-of-1 trials across a set of patients enables statistical analysis that controls for patient fixed or random effects, covariates, centers, or sequence effects (see Figure below). These combined trials are often analyzed within a Bayesian context using shrinkage estimators that combine individual and group mean treatment effects to create a “posterior” individual mean treatment effect estimate, which is a form of inverse variance-weighted average of the individual and group effects. Such trials are typically more expensive than standard RCTs on a per-patient basis. However, they require much smaller sample sizes, often fewer than 100 patients (due to the efficient individual-as-own-control design), and yield individual treatment effect estimates that are not possible in a non-crossover design<sup>9</sup>. For the individual patient, the treatment effect can be re-estimated after each time period, and the trial can be stopped at any point when the more effective treatment is identified with reasonable statistical certainty.

====Example====

A study involving 8 participants collected data across 30 days, with 15 treatment days and 15 control days randomly assigned within each participant<sup>10</sup>. The treatment is represented as a binary variable (control day=0; treatment day=1). The outcome variable represents the response to the intervention within each of the 8 participants. The study employed fixed-effects modeling, creating N − 1 dummy-coded variables to represent the N=8 participants, where the last (i=8) participant serves as the reference (i.e., is absorbed into the model intercept). Each dummy-coded variable therefore represents the difference between participant i and the 8th participant, so all other participants' values are relative to those of the 8th (reference) subject. The overall differences across participants in fixed effects can be evaluated with multiple degree-of-freedom F-tests.

<center>[[Image:SMHS_Methods11.png|500px]] </center>

<center>

{| class="wikitable" style="text-align:center; " border="1"
|-
|ID||Day||Tx||SelfEff||SelfEff25||WPSS||SocSuppt||PMss||PMss3||PhyAct
|-
|1||1||1||33||8||0.97||5.00||4.03||1.03||53
|-
|1||2||1||33||8||-0.17||3.87||4.03||1.03||73
|-
|1||3||0||33||8||0.81||4.84||4.03||1.03||23
|-
|1||4||0||33||8||-0.41||3.62||4.03||1.03||36
|-
|...||...||...||...||...||...||...||...||...||...

|}
</center> Complete data is available in the Appendix.

<center>Data Summary

{| class="wikitable" style="text-align:center; " border="1"
|-
|Intercept||Constant
|-
|Physical Activity||PhyAct
|-
|Intervention||Tx
|-
|WP Social Support||WPSS
|-
|PM Social Support (1-3)||PMss3
|-
|Self Efficacy||SelfEff25

|}
</center>

rm(list=ls())
Nof1 <-read.table("https://umich.instructure.com/files/330385/download?download_frd=1&verifier=DwJUGSd6t24dvK7uYmzA2aDyzlmsohyaK6P7jK0Q", sep=",", header = TRUE) # 02_Nof1_Data.csv
attach(Nof1)
head(Nof1)

<center>

{| class="wikitable" style="text-align:center; " border="1"
|-
| ||ID||Day||Tx||SelfEff||SelfEff25||WPSS||SocSuppt||PMss||PMss3||PhyAct
|-
|1||1||1||1||33||8||0.97||5.00||4.03||1.03||53
|-
|2||1||2||1||33||8||-0.17||3.87||4.03||1.03||73
|-
|3||1||3||0||33||8||0.81||4.84||4.03||1.03||23
|-
|4||1||4||0||33||8||-0.41||3.62||4.03||1.03||36
|-
|5||1||5||1||33||8||0.59||4.62||4.03||1.03||21
|-
|6||1||6||1||33||8||-1.16||2.87||4.03||1.03||0

|}
</center>

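# keep every variable used by the models below (including SelfEff, Day and ID)
df.1 = data.frame(PhyAct, Tx, WPSS, PMss3, SelfEff25, SelfEff, Day, ID)

library("lme4") # provides lmer(); run install.packages("lme4") first if needed

# linear mixed model with random intercepts for Day and ID
lm.1 = model.lmer <- lmer(PhyAct ~ Tx + SelfEff + Tx*SelfEff + (1|Day) + (1|ID) , data= df.1)
summary(lm.1)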

Linear mixed model fit by REML ['lmerMod']
Formula: PhyAct ~ Tx + SelfEff + Tx * SelfEff + (1 | Day) + (1 | ID)
Data: df.1

REML criterion at convergence: 8820

<center> Scaled Residuals
{| class="wikitable" style="text-align:center; " border="1"
|-
| Min||1Q||Median||3Q||Max
|-
|-2.7012||-0.6833||-0.0333||0.6542||3.9612
|}
</center>

<center> Random Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| Groups ||Name||Variance ||Std.Dev.
|-
| Day||(Intercept) ||0.0 || 0.00
|-

|ID|| (Intercept)||601.5||24.53
|-

|Residual|| ||969.0 ||31.13
|}
Number of obs: 900, groups: Day, 30; ID, 30
</center>

<center> Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t value
|-
|(Intercept)||38.3772||14.4738||2.651
|-
|Tx||4.0283||6.3745||0.632
|-
|SelfEff||0.5818||0.5942||0.979
|-
|Tx:SelfEff||0.9702||0.2617||3.708
|}
</center>

<center> Correlation of Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||(Intr)||Tx ||SlfEff
|-
| Tx|| -0.220|| ||
|-
| SelfEff||-0.946 ||0.208 ||
|-
| Tx:SelfEff ||0.208 ||-0.946 ||-0.220
|}
</center>

# Model: PhyAct = Tx + WPSS + PMss3 + Tx*WPSS + Tx*PMss3 + SelfEff25 + Tx*SelfEff25 + ε
lm.2 = lm(PhyAct ~ Tx + WPSS + PMss3 + Tx*WPSS + Tx*PMss3 + SelfEff25 + Tx*SelfEff25, df.1)
summary(lm.2)

Call:
lm(formula = PhyAct ~ Tx + WPSS + PMss3 + Tx * WPSS + Tx * PMss3 +
SelfEff25 + Tx * SelfEff25, data = df.1)

<center> Residuals
{| class="wikitable" style="text-align:center; " border="1"
|-
|Min||1Q||Median||3Q||Max
|-
| -102.39||-28.24||-1.47||25.16||122.41

|}
</center>

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t value||$Pr(>|t|)$
|-
|(Intercept)||52.0067||1.8080||28.764||< 2e-16 ***
|-
|Tx||27.7366||2.5569||10.848||< 2e-16 ***
|-
|WPSS||1.9631||2.4272||0.809||0.418853
|-
|PMss3||13.5110||2.7853||4.851||1.45e-06 ***
|-
|SelfEff25||0.6289||0.2205||2.852||0.004439 **
|-
|Tx:WPSS||9.9114||3.4320||2.888||0.003971 **
|-
|Tx:PMss3||8.8422||3.9390||2.245||0.025025 *
|-
|Tx:SelfEff25||1.0460||0.3118||3.354||0.000829 ***

|}
</center>

[Using SAS (StudyI_Analyses.sas, StudyIIab_Analyses.sas)]

<center> Type 3 Tests of Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
|Effect||Num DF||Den DF||F Value||$Pr>F$
|-
|Tx||1||224||67.46||<.0001
|-
|ID||7||224||25.95||<.0001
|-
|Tx*ID||7||224||2.92||0.0060
|}
</center>

==Quantile Treatment Effect (QTE)==

QTE employs quantile regression estimation (QRE) to examine the central tendency and statistical dispersion of the treatment effect in a population, which may not be revealed by the conventional mean estimation in RCTs. For instance, patients with different comorbidity scores may respond differently to a treatment. Quantile regression can reveal HTE according to the ranking of patients’ comorbidity scores, or of some other relevant covariate by which patients may be ranked. Therefore, to inform patient-centered care, quantile regression provides more information on the distribution of the treatment effect than the typical conditional mean treatment effect estimation. QTE characterizes the heterogeneous treatment effect on individuals and groups across various positions in the distributions of different outcomes of interest. This feature has drawn substantial attention to quantile regression analysis, which has been employed across a wide range of applications, particularly in evaluating the economic effects of welfare reform.

One caveat of applying QRE in clinical trials to examine HTE is that the QTE does not give the treatment effect for a given patient. Instead, it describes the treatment effect among subjects at the qth quantile, such as those exactly at the top 10th percentile of blood pressure or of a depression score, for a given value of some covariate of interest (e.g., comorbidity score). It is not uncommon for the subjects at the qth quantile to be two different sets of patients before and after the treatment. For this reason, we have to assume that the two groups of patients are homogeneous whenever they fall in the same quantile.

Income-Food Expenditure Example: Let’s examine the Engel data (N=235) on the relationship between food expenditure (foodexp) and household income (income)<sup>11</sup>. We can plot the data and then explore the superposition of the fitted quantile regression lines (the median and six other quantiles).

install.packages("quantreg")
library(quantreg)
data(engel)
attach(engel)

<center>head(engel)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Income||Foodexp
|-
|1||420.1577||255.8394
|-
|2||541.4117||310.9587
|-
|3||901.1575||485.6800
|-
|4||639.0802||402.9974
|-
|5||750.8756||495.5608
|-
|6||945.7989||633.7978

|}
</center>

<center>summary(engel)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Income||Foodexp
|-
|Min||377.1||242.3
|-
|1st Qu.||638.9||429.7
|-
|Median||884.0||582.5
|-
|Mean||982.5||624.2
|-
|3rd Qu.||1164.0||743.9
|-
|Max||4957.8||2032.7

|}
</center>

Note: If $Y$ is a real-valued random variable with cumulative distribution function $F_Y(y)=P(Y\le y)$, then the $\tau$-quantile of $Y$ is given by

<center> $Q_Y(\tau)=F_Y^{-1}(\tau)=\inf\{ y: F_Y(y)\ge\tau\}$ </center>

where $0\le\tau\le 1$.
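As a quick check of the quantile definition, the following base-R sketch computes a sample quantile in three ways on a small hypothetical vector: directly from the empirical CDF, with <code>quantile(..., type = 1)</code> (the inverse empirical CDF), and by minimizing the check (pinball) loss that quantile regression is built on. The data and the choice of τ are illustrative assumptions.

```r
# Hypothetical sample (any numeric vector would do)
y   <- c(3.1, 0.4, 2.2, 5.9, 1.7, 4.3, 2.8, 0.9, 3.6, 2.0)
tau <- 0.25

# (a) Definition: smallest y whose empirical CDF value reaches tau
Fy    <- ecdf(y)
ys    <- sort(y)
q.def <- min(ys[Fy(ys) >= tau])

# (b) Base R's inverse-ECDF quantile (type = 1) implements the same definition
q.r <- quantile(y, probs = tau, type = 1, names = FALSE)

# (c) The same value minimizes the check (pinball) loss over the sample points
check.loss <- function(q) sum((y - q) * (tau - (y < q)))
q.loss <- ys[which.min(sapply(ys, check.loss))]

c(definition = q.def, quantile.type1 = q.r, check.loss = q.loss)
```

All three approaches return the third-smallest observation here; quantile regression replaces the constant <code>q</code> with a linear function of the covariates and minimizes the same loss.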

<center>[[Image:SMHS_Methods12.png|500px]] </center>

# (1) Graphics
plot(income, foodexp, cex=.25, type="n", xlab="Household Income", ylab="Food Expenditure")
points(income, foodexp, cex=.5, col="blue")

# tau - the quantile(s) to be estimated, in the range from 0 to 1. For a single tau, rq() returns an object of class "rq";
# for a vector of taus it returns an object of class "rqs" containing the matrix of coefficient estimates at the specified quantiles.
abline( rq(foodexp ~ income, tau=.5), col="blue") # Quantile Regression Model

abline( lm(foodexp ~ income), lty=2, lwd=3, col="red") # linear model
taus <- c(0.05, 0.1, 0.25, 0.75, 0.90, 0.95)
colors <- rainbow(length(taus))

models <- vector(mode = "list", length = length(taus)) # define a vector of models to store QR for diff taus
model.names <- vector(mode = "list", length = length(taus)) # define a vector model names

for( i in 1:length(taus)){
models[[i]] <- rq(foodexp ~ income, tau=taus[i])
var <- taus[i]
model.names[[i]] <- paste("Model [", i , "]: tau=", var)
abline( models[[i]], lwd=2, col= colors[[i]])
}
legend(3000, 1100, unlist(model.names), col= colors, lty=1, lwd=2, bty='n', cex=.75)

<center>[[Image:SMHS_Methods13.png|500px]] </center>

# (2) Inference about quantile regression coefficients. As an alternative to the rank-inversion confidence intervals, we can obtain a table of coefficients, standard errors, t-statistics, and p-values using the summary function:

summary(models[[3]], se = "nid")

Call: rq(formula = foodexp ~ income, tau = taus[i])

tau: [1] 0.25

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Value||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||95.48354||21.39237||4.46344||0.00001
|-
|Income||0.47410||0.02906||16.31729||0.00000

|}
</center>

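# Alternatively, we can call summary.rq() directly. With se = "nid" it reproduces the table above; se = "boot" would instead compute bootstrapped standard errors (which vary from run to run).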
summary.rq(models[[3]], se = "nid")

Call: rq(formula = foodexp ~ income, tau = taus[i])
tau: [1] 0.25

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Value||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||95.48354||21.39237||4.46344||0.00001
|-
|Income||0.47410||0.02906||16.31729||0.00000

|}
</center>

==Nonparametric Regression Methods ==

Nonparametric regression offers another way of dealing with HTE in RCTs. Different nonparametric methods, such as kernel smoothing methods and series methods, can be used to generate test statistics for examining the presence of HTE. A kernel method is a weighting scheme based on a kernel function (e.g., uniform, Gaussian). When evaluating the treatment effect for a patient in an RCT, the kernel method assigns larger weights to observations with similar covariates, because patients with similar covariates are assumed to provide more relevant data on the predicted treatment response. When participants have different backgrounds (e.g., demographic, clinical), kernel smoothing methods still use information from highly divergent participants when estimating a particular subject’s treatment effect, but assign lower weights to very different subjects. Kernel methods require choosing a set of smoothing parameters (bandwidths) to group patients according to their relative degree of similarity. A drawback is that the proposed test statistics may be sensitive to the chosen bandwidths, which complicates the interpretation of the results. Series methods use approximating functions (splines or power series of the explanatory variables) to construct test statistics. Compared to kernel smoothing methods, series methods normally have the advantage of computational convenience; however, the precision of the test statistics depends on the number of terms selected in the series.
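The kernel weighting scheme and its sensitivity to the bandwidth can be illustrated with a minimal base-R sketch of a Nadaraya-Watson (local-constant) estimator with a Gaussian kernel. The helper <code>nw()</code>, the simulated data, and the two bandwidths are illustrative assumptions and do not reproduce the test statistics discussed above.

```r
set.seed(2)                                   # arbitrary seed
x <- sort(runif(200, 0, 10))                  # simulated covariate
y <- sin(x) + rnorm(200, sd = 0.3)            # simulated outcome

# Nadaraya-Watson (local-constant) estimator with a Gaussian kernel:
# a weighted mean of y, with weights decaying in |x_i - x0| / h
nw <- function(x0, x, y, h) {
  sapply(x0, function(z) {
    w <- dnorm((x - z) / h)
    sum(w * y) / sum(w)
  })
}

grid    <- seq(0.5, 9.5, by = 0.5)
fit.h02 <- nw(grid, x, y, h = 0.2)            # small bandwidth: follows the data closely
fit.h30 <- nw(grid, x, y, h = 3.0)            # large bandwidth: heavily smoothed

# mean squared distance from the true curve sin(x) on the grid
c(h0.2 = mean((fit.h02 - sin(grid))^2), h3.0 = mean((fit.h30 - sin(grid))^2))
```

The large bandwidth averages over observations that are far from the evaluation point and flattens the curve, which is why data-driven bandwidth selection (e.g., cross-validation) is typically used in practice.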

Canadian Wage Data Example: Nonparametric regression extends the classical parametric regression (e.g., lm, lmer) involving one continuous dependent variable, y, and (1 or more) continuous explanatory variable(s), x. Let’s start with a popular parametric model of a wage equation that we can extend to a fully nonparametric regression model. First, we will compare and contrast the parametric and nonparametric approach towards univariate regression and then proceed to multivariate regression.

Let’s use the Canadian cross-section wage data (cps71), a random sample taken from the 1971 Canadian Census for male individuals having common education (High-School). The data include N=205 observations on 2 variables: the logarithm of the individual’s wage (logwage) and their age (age). The classical wage equation model includes a quadratic term in age.

# install.packages("np")
library("np")
data("cps71")

# (1) Linear Model -> R2 = 0.2308
model.lin <- lm( logwage ~ age + I(age^2), data = cps71)
summary(model.lin)

Call:
lm(formula = logwage ~ age + I(age^2), data = cps71)

Residuals:
Min 1Q Median 3Q Max
-2.4041 -0.1711 0.0884 0.3182 1.3940

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||10.0419773||0.4559986||22.022||< 2e-16 ***
|-
|Age||0.1731310||0.0238317|| 7.265||7.96e-12 ***
|-
|I(age^2)||-0.0019771||0.0002898||-6.822||1.02e-10 ***

|}
</center>

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.5608 on 202 degrees of freedom
Multiple R-squared: 0.2308, Adjusted R-squared: 0.2232
F-statistic: 30.3 on 2 and 202 DF, p-value: 3.103e-12

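# (2) Next, we consider a nonparametric kernel regression with cross-validated bandwidth selection.
# npregbw() computes the least-squares cross-validated bandwidth for the local-constant
# estimator (its defaults). The summary below reports a Local-Constant estimator, i.e., the
# regtype = "ll" argument passed to npreg() alongside the bandwidth object did not take effect;
# to fit a local-linear model, pass regtype = "ll" to npregbw() instead.
# Note that R2 = 0.3108675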
bandwidth <- npregbw(formula= logwage ~ age, data = cps71)
model.np <- npreg(bandwidth, regtype = "ll", bwmethod = "cv.aic", gradients = TRUE, data = cps71)
summary(model.np)

Regression Data: 205 training points, in 1 variable(s) age
Bandwidth(s): 1.892157
Kernel Regression Estimator: Local-Constant
Bandwidth Type: Fixed
Residual standard error: 0.5307943
R-squared: 0.3108675
Continuous Kernel Type: Second-Order Gaussian
No. Continuous Explanatory Vars.: 1

# NP model significance may be tested by
npsigtest(model.np)

Kernel Regression Significance Test
Type I Test with IID Bootstrap (399 replications, Pivot=TRUE, joint=FALSE)
Explanatory variables tested for significance: age (1)

age
Bandwidth(s): 1.892157

Individual Significance Tests
P Value:
age < 2.22e-16 ***

# So, as was the case for the linear parametric model, age is significant in the nonparametric kernel model

# (3) Graphical comparison of parametric and nonparametric models.
plot(cps71$\$$age, cps71$\$$logwage, xlab = "age", ylab = "log(wage)", cex=.1)
lines(cps71$\$$age, fitted(model.lin), lty = 2, col = "red")
lines(cps71$\$$age, fitted(model.np), lty = 1, col = "blue")
legend("topright", c("Data", "Linear", "Non-linear"), col=c("black", "red", "blue"), pch = c(1, NA, NA), lty = c(NA, 2, 1), bty='n', cex=.75)

<center>[[Image:SMHS_Methods14.png|500px]] </center>

# some additional plots presenting the parametric (quadratic, dashed line) and the nonparametric estimates
# (solid line) of the regression function for the cps71 data.
plot(model.np, plot.errors.method = "asymptotic")
plot(model.np, gradients = TRUE)
lines(cps71$\$$age, coef(model.lin)[2]+2*cps71$\$$age*coef(model.lin)[3], lty = 2, col = "red")
plot(model.np, gradients = TRUE, plot.errors.method = "asymptotic")

# (4) using the Lin and NL models to generate predictions based on the obtained appropriate
# bandwidths and estimated a nonparametric model. We need to create a set of explanatory
# variables for which to generate predictions. These can be part of the original dataset or be
# outside its scope. Typically, we don’t have the outcome for the evaluation data and need only
# provide the explanatory variables for which predicted values are generated by the models.
# Occasionally, splitting the dataset into two independent samples (training/testing), allows estimation
# of a model on one sample, and evaluation of its performance on another.

cps.eval.data <- data.frame(age = seq(10,70, by=10)) # simulate some explanatory X values (ages)
pred.lin <- predict(model.lin, newdata = cps.eval.data) # Linear Prediction of log(Wage)
pred.np <- predict(model.np, newdata = cps.eval.data) # non-Linear Prediction of log(Wage)
plot(pred.lin, pred.np)
abline(lm(pred.np ~ pred.lin))

<center>[[Image:SMHS_Methods15.png|500px]] </center>

.
.
.

==Predictive risk models ==

Predictive risk models represent a class of methods for identifying the potential for HTE when the individual patient's baseline risk for disease-related events depends on observed factors. Common measures include disease staging criteria, such as those used in COPD or heart failure, Framingham risk scores for cardiovascular event risk, and genetic variations, e.g., HER2 for breast cancer. Initial predictive risk modeling, aka risk function estimation, is often performed without accounting for treatment effects. Least squares or Cox proportional hazards regression methods are appropriate in many cases and provide relatively more interpretable risk functions, but they rely on linearity assumptions and may not provide optimal predictive metrics. Partial least squares is an extension of least squares methods that can reduce the dimensionality of the predictor space by interposing latent variables, predicted by linear combinations of observable characteristics, as the intermediate predictors of one or more outcomes. More recent methods, such as recursive partitioning (e.g., random forests), support vector machines, and neural networks, may offer better predictive power than linear methods. Risk function estimation can range from highly exploratory analyses to near meta-analytic model validation, and may be useful at any stage of product development.

HIV Example: The “hmohiv” dataset represents a study of HIV-positive patients examining whether survival times differed between a cohort using intravenous drugs (drug=1) and a cohort not using IV drugs (drug=0). The hmohiv data include the following variables:

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
|ID||Time||Age||Drug||Censor||Entdate||Enddate
|-
|1||5||46||0||1||5/15/1990||10/14/1990
|-
|2||6||35||1||0||9/19/1989||3/20/1990
|-
|3||8||30||1||1||4/21/1991||12/20/1991
|-
|4||3||30||1||1||1/3/1991||4/4/1991
|-
|5||22||36||0||1||9/18/1989||7/19/1991
|-
|6||1||32||1||0||3/18/1991||4/17/1991
|-
|...||...||...||...||...||...||...

|}
</center>

#cleaning up environment
rm(list=ls())

# load survival library
library(survival)

# load hmohiv data
hmohiv<-read.table("http://www.ats.ucla.edu/stat/r/examples/asa/hmohiv.csv", sep=",", header = TRUE)
attach(hmohiv)

# construct a frame of the 2 cohorts IV_drug and no-IV-drug (needed by survfit below)
drug.new<-data.frame(drug=c(0,1))

# Fit Cox proportional hazards regression model
cox.model <- coxph( Surv(time, censor) ~ drug, method="breslow")
fit.1 <- survfit(cox.model, newdata=drug.new)

# plot results
plot(fit.1, xlab="Survival Time (Months)", ylab="Survival Probability")
points(fit.1$\$$time, fit.1$\$$surv[,1], pch=1)
points(fit.1$\$$time, fit.1$\$$surv[,2], pch=2)
legend(40, .8, c("Drug Absent", "Drug Present"), pch=c(1,2))

<center>[[Image:SMHS_Methods16.png|500px]] </center>

# to inspect the resulting Cox Proportional Hazard Model
cox.model
Call:
coxph(formula = Surv(time, censor) ~ drug, method = "breslow")

coef exp(coef) se(coef) z p
drug 0.779 2.18 0.242 3.22 0.0013

Likelihood ratio test=10.2 on 1 df, p=0.00141 n= 100, number of events= 80
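The printed Cox model output can be read off with a few lines of base-R arithmetic. The sketch below only reuses the coefficient and standard error shown above; the Wald confidence interval is an added illustration and is not part of the original output.

```r
# Values copied from the printed coxph() output above
coef.drug <- 0.779
se.drug   <- 0.242

hr   <- exp(coef.drug)                               # hazard ratio, exp(coef)
z    <- coef.drug / se.drug                          # Wald z statistic
p    <- 2 * pnorm(-abs(z))                           # two-sided p-value
ci95 <- exp(coef.drug + c(-1, 1) * qnorm(0.975) * se.drug)  # Wald 95% CI for the HR

round(c(HR = hr, z = z, p = p, lower = ci95[1], upper = ci95[2]), 4)
```

The hazard ratio of about 2.18 indicates that, under the proportional hazards assumption, the estimated hazard of the event in the IV-drug cohort is roughly twice that of the cohort not using IV drugs.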

===Footnotes===

*8 http://onlinelibrary.wiley.com/enhanced/doi/10.1002/jrsm.54
*9 http://effectivehealthcare.ahrq.gov/search-for-guides-reviews-and-reports/?pageaction=displayproduct&productID=1857
*10 http://jpepsy.oxfordjournals.org/content/39/2/138.full#sec-14
*11 http://www.ers.usda.gov/media/200576/err32c_1_.pdf

==[[SMHS_MethodsHeterogeneity_CER|Next see: Comparative Effectiveness Research (CER)]]==

*[[SMHS_MethodsHeterogeneity|Back to the Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research section]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php/SMHS_MethodsHeterogeneity_MetaAnalysis}}

SMHS MethodsHeterogeneity MetaAnalysis

2016-05-23T18:51:35Z

Pineaumi: /* Quantile Treatment Effect (QTE) */

==[[SMHS_MethodsHeterogeneity| Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research]] - Meta-Analyses ==

==Meta-analysis==

===Overview===

Meta-analysis is an approach to combine treatment effects across trials or studies into an aggregated treatment effect with higher statistical power than observed in each individual trial. It may detect HTE by testing for differences in treatment effects across similar RCTs. It requires that the individual treatment effects are similar to ensure that pooling is meaningful. In the presence of large clinical or methodological differences between the trials, it may be best to avoid meta-analyses. The presence of HTE across studies in a meta-analysis may be due to differences in the design or execution of the individual trials (e.g., randomization methods, patient selection criteria). Cochran's Q is a method for detecting heterogeneity, computed as the weighted sum of squared differences between each study's treatment effect and the pooled effect across the studies. It is a barometer of inter-trial differences impacting the observed study result. A possible source of error in a meta-analysis is publication bias. Trial size may introduce publication bias since larger trials are more likely to be published. Language and accessibility represent other potential confounding factors. When the heterogeneity is not due to poor study design, it may be useful for optimizing the treatment benefits for different cohorts of participants.

Cochran's Q statistic is the weighted sum of squares on a standardized scale<sup>8</sup>. The corresponding P value indicates the strength of the evidence for the presence of heterogeneity. Because this test may have low power to detect heterogeneity (e.g., when the number of studies is small), a value of 0.10 is suggested as the cut-off for significance (Higgins et al., 2003). Conversely, the Q statistic may have too much power as a test of heterogeneity when the number of studies is large.
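A minimal base-R sketch of the computation is given below. The study-level effects (<code>theta</code>) and standard errors (<code>se</code>) are hypothetical, and inverse-variance weights are assumed; the ''I''<sup>2</sup> line uses the usual (Q − df)/Q form, truncated at zero.

```r
# Hypothetical study-level inputs: log odds ratios and their standard errors
theta <- c(0.10, 0.30, 0.35, 0.65, 0.45)
se    <- c(0.20, 0.25, 0.15, 0.30, 0.20)

w      <- 1 / se^2                                  # inverse-variance weights
pooled <- sum(w * theta) / sum(w)                   # fixed-effect pooled estimate

Q   <- sum(w * (theta - pooled)^2)                  # Cochran's Q
df  <- length(theta) - 1
p   <- pchisq(Q, df = df, lower.tail = FALSE)       # Q ~ chi-square(df) under homogeneity
I2  <- max(0, (Q - df) / Q)                         # I^2 as a proportion, truncated at 0

round(c(pooled = pooled, Q = Q, df = df, p = p, I2 = I2), 4)
```

A small p-value (e.g., below the 0.10 cut-off mentioned above) would suggest heterogeneity across the studies, whereas a Q close to or below its degrees of freedom gives an ''I''<sup>2</sup> of zero.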

===Simulation Example 1===

# Install and Load library
install.packages("meta")
library(meta)

# Set number of studies
n.studies = 15

# number of treatments: case1, case2, control
n.trt = 3

# number of outcomes
n.event = 2

# simulate the (balanced) number of cases (case1 and case2) and controls in each study
ctl.group = rbinom(n = n.studies, size = 200, prob = 0.3)
case1.group = rbinom(n = n.studies, size = 200, prob = 0.3)
case2.group = rbinom(n = n.studies, size = 200, prob = 0.3)

# Simulate the number of outcome events (e.g., deaths) and no events in the control group
event.ctl.group = rbinom(n = n.studies, size = ctl.group, prob = rep(0.1, length(ctl.group)))
noevent.ctl.group = ctl.group - event.ctl.group

# Simulate the number of events and no events in the case1 group
event.case1.group = rbinom(n = n.studies, size = case1.group, prob = rep(0.5, length(case1.group)))
noevent.case1.group = case1.group - event.case1.group

# Simulate the number of events and no events in the case2 group
event.case2.group = rbinom(n = n.studies, size = case2.group, prob = rep(0.6, length(case2.group)))
noevent.case2.group = case2.group - event.case2.group

# Run the univariate meta-analysis using metabin(), Meta-analysis of binary outcome data –
# Calculation of fixed and random effects estimates (risk ratio, odds ratio, risk difference or arcsine
# difference) for meta-analyses with binary outcome data. Mantel-Haenszel (MH),
# inverse variance and Peto method are available for pooling.

# method = A character string indicating which method is to be used for pooling of studies,
# e.g., one of "MH", "Inverse", or "Peto"
# sm = A character string indicating which summary measure ("OR", "RR", or "RD" = risk difference) is to be
# used for pooling of studies

# Control vs. Case1, n.e and n.c are numbers in experimental and control groups
meta.ctr_case1 <- metabin(event.e = event.case1.group, n.e = case1.group, event.c = event.ctl.group,
n.c = ctl.group, method = "MH", sm = "OR")
# in this case we use Odds Ratio, of the odds of death in the experimental and control studies
forest(meta.ctr_case1)

<center>[[Image:SMHS_Methods8.png|500px]] </center>

# Control vs. Case2
meta.ctr_case2 <- metabin(event.e = event.case2.group, n.e = case2.group, event.c = event.ctl.group,
n.c = ctl.group, method = "MH", sm = "OR")
forest(meta.ctr_case2)

<center>[[Image:SMHS_Methods9.png|500px]] </center>

# Case1 vs. Case2
meta.case1_case2 <- metabin(event.e = event.case1.group, n.e = case1.group, event.c = event.case2.group,
n.c = case2.group, method = "MH", sm = "OR")
forest(meta.case1_case2)
summary(meta.case1_case2)

Test of heterogeneity:
Q d.f. p-value
11.99 14 0.6071

<center>[[Image:SMHS_Methods10.png|500px]] </center>

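The reported test of heterogeneity (Q = 11.99, d.f. = 14, p-value = 0.6071) provides no evidence against the null hypothesis of no study heterogeneity, so the fixed-effects model may be used. The ''I''<sup>2</sup> statistic shown in the forest plot quantifies the same between-study heterogeneity.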

==Series of “N of 1” trials==

This technique combines data from a series of n-of-1 trials to identify HTE. An n-of-1 trial is a repeated crossover trial for a single patient: the patient is randomly assigned to one of two treatments for a given time period and is then re-randomized for the next period, with the cycle usually repeated for 4-6 time periods. Such trials are most feasible in chronic conditions where little or no washout period is needed between treatments and where treatment effects are identifiable in the short term, e.g., through pain or reliable surrogate markers. Combining data from identical n-of-1 trials across a set of patients enables statistical analyses that control for patient fixed or random effects, covariates, centers, or sequence effects (see the figure below). These combined trials are often analyzed within a Bayesian framework using shrinkage estimators, which combine the individual and group mean treatment effects into a “posterior” individual mean treatment effect estimate, a form of inverse variance-weighted average of the individual and group effects. Such trials are typically more expensive than standard RCTs on a per-patient basis. However, they require much smaller sample sizes, often fewer than 100 patients (due to the efficient individual-as-own-control design), and they yield individual treatment effect estimates that are not possible in a non-crossover design<sup>9</sup>. For the individual patient, the treatment effect can be re-estimated after each time period, and the trial can be stopped as soon as the more effective treatment is identified with reasonable statistical certainty.

====Example====

A study involving 8 participants collected data across 30 days, with 15 treatment days and 15 control days randomly assigned within each participant<sup>10</sup>. The treatment is represented as a binary variable (control day=0; treatment day=1). The outcome variable represents the response to the intervention within each of the 8 participants. The study employed fixed-effects modeling: N − 1 dummy-coded variables represent the N=8 participants, where the last (i=8) participant serves as the reference (i.e., is absorbed into the model intercept). Each dummy-coded variable thus represents the difference between participant i and the 8th participant, so all other participants' values are relative to those of the 8th (reference) subject. The overall differences across participants in fixed effects can be evaluated with multiple degree-of-freedom F-tests.

<center>[[Image:SMHS_Methods11.png|500px]] </center>

<center>

{| class="wikitable" style="text-align:center; " border="1"
|-
|ID||Day||Tx||SelfEff||SelfEff25||WPSS||SocSuppt||PMss||PMss3||PhyAct
|-
|1||1||1||33||8||0.97||5.00||4.03||1.03||53
|-
|1||2||1||33||8||-0.17||3.87||4.03||1.03||73
|-
|1||3||0||33||8||0.81||4.84||4.03||1.03||23
|-
|1||4||0||33||8||-0.41||3.62||4.03||1.03||36
|-
|...||...||...||...||...||...||...||...||...||...

|}
</center> Complete data is available in the Appendix.

<center>Data Summary

{| class="wikitable" style="text-align:center; " border="1"
|-
|Intercept||Constant
|-
|Physical Activity||PhyAct
|-
|Intervention||Tx
|-
|WP Social Support||WPSS
|-
|PM Social Support (1-3)||PMss3
|-
|Self Efficacy||SelfEff25

|}
</center>

rm(list=ls())
Nof1 <-read.table("https://umich.instructure.com/files/330385/download?download_frd=1&verifier=DwJUGSd6t24dvK7uYmzA2aDyzlmsohyaK6P7jK0Q", sep=",", header = TRUE) # 02_Nof1_Data.csv
attach(Nof1)
head(Nof1)

<center>

{| class="wikitable" style="text-align:center; " border="1"
|-
| ||ID||Day||Tx||SelfEff||SelfEff25||WPSS||SocSuppt||PMss||PMss3||PhyAct
|-
|1||1||1||1||33||8||0.97||5.00||4.03||1.03||53
|-
|2||1||2||1||33||8||-0.17||3.87||4.03||1.03||73
|-
|3||1||3||0||33||8||0.81||4.84||4.03||1.03||23
|-
|4||1||4||0||33||8||-0.41||3.62||4.03||1.03||36
|-
|5||1||5||1||33||8||0.59||4.62||4.03||1.03||21
|-
|6||1||6||1||33||8||-1.16||2.87||4.03||1.03||0

|}
</center>

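# keep every variable used by the models below (including SelfEff, Day and ID)
df.1 = data.frame(PhyAct, Tx, WPSS, PMss3, SelfEff25, SelfEff, Day, ID)

library("lme4") # provides lmer(); run install.packages("lme4") first if needed

# linear mixed model with random intercepts for Day and ID
lm.1 = model.lmer <- lmer(PhyAct ~ Tx + SelfEff + Tx*SelfEff + (1|Day) + (1|ID) , data= df.1)
summary(lm.1)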

Linear mixed model fit by REML ['lmerMod']
Formula: PhyAct ~ Tx + SelfEff + Tx * SelfEff + (1 | Day) + (1 | ID)
Data: df.1

REML criterion at convergence: 8820

<center> Scaled Residuals
{| class="wikitable" style="text-align:center; " border="1"
|-
| Min||1Q||Median||3Q||Max
|-
|-2.7012||-0.6833||-0.0333||0.6542||3.9612
|}
</center>

<center> Random Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| Groups ||Name||Variance ||Std.Dev.
|-
| Day||(Intercept) ||0.0 || 0.00
|-

|ID|| (Intercept)||601.5||24.53
|-

|Residual|| ||969.0 ||31.13
|}
Number of obs: 900, groups: Day, 30; ID, 30
</center>

<center> Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t value
|-
|(Intercept)||38.3772||14.4738||2.651
|-
|Tx||4.0283||6.3745||0.632
|-
|SelfEff||0.5818||0.5942||0.979
|-
|Tx:SelfEff||0.9702||0.2617||3.708
|}
</center>

<center> Correlation of Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||(Intr)||Tx ||SlfEff
|-
| Tx|| -0.220|| ||
|-
| SelfEff||-0.946 ||0.208 ||
|-
| Tx:SelfEff ||0.208 ||-0.946 ||-0.220
|}
</center>

# Model: PhyAct = Tx + WPSS + PMss3 + Tx*WPSS + Tx*PMss3 + SelfEff25 + Tx*SelfEff25 + ε
lm.2 = lm(PhyAct ~ Tx + WPSS + PMss3 + Tx*WPSS + Tx*PMss3 + SelfEff25 + Tx*SelfEff25, df.1)
summary(lm.2)

Call:
lm(formula = PhyAct ~ Tx + WPSS + PMss3 + Tx * WPSS + Tx * PMss3 +
SelfEff25 + Tx * SelfEff25, data = df.1)

<center> Residuals
{| class="wikitable" style="text-align:center; " border="1"
|-
|Min||1Q||Median||3Q||Max
|-
| -102.39||-28.24||-1.47||25.16||122.41

|}
</center>

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t value||$Pr(>|t|)$
|-
|(Intercept)||52.0067||1.8080||28.764||< 2e-16 ***
|-
|Tx||27.7366||2.5569||10.848||< 2e-16 ***
|-
|WPSS||1.9631||2.4272||0.809||0.418853
|-
|PMss3||13.5110||2.7853||4.851||1.45e-06 ***
|-
|SelfEff25||0.6289||0.2205||2.852||0.004439 **
|-
|Tx:WPSS||9.9114||3.4320||2.888||0.003971 **
|-
|Tx:PMss3||8.8422||3.9390||2.245||0.025025 *
|-
|Tx:SelfEff25||1.0460||0.3118||3.354||0.000829 ***

|}
</center>

[Using SAS (StudyI_Analyses.sas, StudyIIab_Analyses.sas)]

<center> Type 3 Tests of Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
|Effect||Num DF||Den DF||F Value||$Pr>F$
|-
|Tx||1||224||67.46||<.0001
|-
|ID||7||224||25.95||<.0001
|-
|Tx*ID||7||224||2.92||0.0060
|}
</center>

==Quantile Treatment Effect (QTE)==

QTE employs quantile regression estimation (QRE) to examine the central tendency and statistical dispersion of the treatment effect in a population, which may not be revealed by the conventional mean estimation in RCTs. For instance, patients with different comorbidity scores may respond differently to a treatment. Quantile regression can reveal HTE according to the ranking of patients’ comorbidity scores, or of some other relevant covariate by which patients may be ranked. Therefore, to inform patient-centered care, quantile regression provides more information on the distribution of the treatment effect than the typical conditional mean treatment effect estimation. QTE characterizes the heterogeneous treatment effect on individuals and groups across various positions in the distributions of different outcomes of interest. This feature has drawn substantial attention to quantile regression analysis, which has been employed across a wide range of applications, particularly in evaluating the economic effects of welfare reform.

One caveat of applying QRE in clinical trials to examine HTE is that the QTE does not give the treatment effect for a given patient. Instead, it describes the treatment effect among subjects at the qth quantile, such as those exactly at the top 10th percentile of blood pressure or of a depression score, for a given value of some covariate of interest (e.g., comorbidity score). It is not uncommon for the subjects at the qth quantile to be two different sets of patients before and after the treatment. For this reason, we have to assume that the two groups of patients are homogeneous whenever they fall in the same quantile.

Income-Food Expenditure Example: Let’s examine the Engel data (N=235) on the relationship between food expenditure (foodexp) and household income (income)<sup>11</sup>. We can plot the data and then explore the superposition of the six fitted quantile regression lines.

install.packages("quantreg")
library(quantreg)
data(engel)
attach(engel)

<center>head(engel)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Income||Foodexp
|-
|1||420.1577||255.8394
|-
|2||541.4117||310.9587
|-
|3||901.1575||485.6800
|-
|4||639.0802||402.9974
|-
|5||750.8756||495.5608
|-
|6||945.7989||633.7978

|}
</center>

<center>summary(engel)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Income||Foodexp
|-
|Min||377.1||242.3
|-
|1st Qu.||638.9||429.7
|-
|Median||884.0||582.5
|-
|Mean||982.5||624.2
|-
|3rd Qu.||1164.0||743.9
|-
|Max||4957.8||2032.7

|}
</center>

Note: If Y is a real-valued random variable with cumulative distribution function $F_Y(y)=P(Y\le y)$, then the τ-quantile of Y is given by

<center> $Q_Y(\tau)=F_Y^{-1}(\tau)=\inf\{ y: F_Y(y)\ge\tau \}$ </center>

where 0≤τ≤1.
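
The quantile definition can be checked numerically. The following is a minimal sketch in R on a simulated sample; the helper emp.quantile and the simulated data are illustrative assumptions and are not part of the Engel example. It computes the τ-quantile as the smallest observed value whose empirical CDF is at least τ, and compares it with quantile() using type = 1, which R documents as the inverse of the empirical distribution function.

```r
# Empirical tau-quantile as the generalized inverse of the empirical CDF:
# Q(tau) = inf{ y : F_n(y) >= tau }
emp.quantile <- function(y, tau) {
  y.sorted <- sort(y)
  Fn <- ecdf(y)                              # empirical CDF of the sample
  y.sorted[which(Fn(y.sorted) >= tau)[1]]    # smallest y with F_n(y) >= tau
}

set.seed(1)
y <- rexp(235)                               # simulated right-skewed sample (illustrative only)
taus <- c(0.05, 0.25, 0.5, 0.75, 0.95)

q.manual <- sapply(taus, function(t) emp.quantile(y, t))
q.type1  <- quantile(y, probs = taus, type = 1, names = FALSE)

# the two columns should agree for these values of tau
print(cbind(tau = taus, manual = q.manual, type1 = q.type1))
```

Note that the default quantile() type in R interpolates between order statistics, so it will generally differ slightly from this inverse-CDF definition.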

<center>[[Image:SMHS_Methods12.png|500px]] </center>

# (1) Graphics
plot(income, foodexp, cex=.25, type="n", xlab="Household Income", ylab="Food Expenditure")
points(income, foodexp, cex=.5, col="blue")

# tau - the quantile(s) to be estimated, strictly between 0 and 1. A single tau returns an object of class "rq";
# a vector of taus returns an object of class "rqs" containing the matrix of coefficient estimates at the specified quantiles.
abline( rq(foodexp ~ income, tau=.5), col="blue") # Quantile Regression Model

abline( lm(foodexp ~ income), lty=2, lwd=3, col="red") # linear model
taus <- c(0.05, 0.1, 0.25, 0.75, 0.90, 0.95)
colors <- rainbow(length(taus))

models <- vector(mode = "list", length = length(taus)) # define a vector of models to store QR for diff taus
model.names <- vector(mode = "list", length = length(taus)) # define a vector model names

for( i in 1:length(taus)){
models[[i]] <- rq(foodexp ~ income, tau=taus[i])
var <- taus[i]
model.names[[i]] <- paste("Model [", i , "]: tau=", var)
abline( models[[i]], lwd=2, col= colors[[i]])
}
legend(3000, 1100, model.names, col= colors, pch= taus, bty='n', cex=.75)

<center>[[Image:SMHS_Methods13.png|500px]] </center>

# (2) Inference about quantile regression coefficients. As an alternative to the rank-inversion confidence intervals, we can obtain a table of coefficients, standard errors, t-statistics, and p-values using the summary function:

summary(models[[3]], se = "nid")

Call: rq(formula = foodexp ~ income, tau = taus[i])

tau: [1] 0.25

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Value||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||95.48354||21.39237||4.46344||0.00001
|-
|Income||0.47410||0.02906||16.31729||0.00000

|}
</center>

# Alternatively, we can call summary.rq directly; with se = "nid" it reproduces the table above, whereas se = "boot" would compute bootstrapped standard errors.
summary.rq(models[[3]], se = "nid")

Call: rq(formula = foodexp ~ income, tau = taus[i])
tau: [1] 0.25

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Value||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||95.48354||21.39237||4.46344||0.00001
|-
|Income||0.47410||0.02906||16.31729||0.00000

|}
</center>

==Nonparametric Regression Methods ==

Nonparametric regression enables dealing with HTE in RCTs. Different nonparametric methods, such as kernel smoothing methods and series methods, can be used to generate test statistics for examining the presence of HTE. A kernel method is a weighting scheme based on a kernel function (e.g. uniform, Gaussian). When evaluating the treatment effect of a patient in RCTs, the kernel method assigns larger weights to those observations with similar covariates. This is done because it is assumed that patients with similar covariates provide more relevant data on predicted treatment response. Examining participants that have different backgrounds (e.g., demographic, clinical), kernel smoothing methods utilize information from highly divergent participants when estimating a particular subject’s treatment effect. Lower weights are assigned to very different subjects and the kernel methods require choosing a set of smoothing parameters to group patients according to their relative degree of similarities. A drawback is that the corresponding proposed test statistics may be sensitive to the chosen bandwidths, which inhibits the interpretation of the results. Series methods use approximating functions (splines or power series of the explanatory variables) to construct test statistics. Compared to kernel smoothing methods, series methods normally have the advantage of computational convenience; however, the precision of test statistics depends on the number of terms selected in the series.
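
The kernel weighting idea can be sketched in a few lines of base R. This is a hedged illustration on simulated data, not the method used in the example that follows: the function nw.estimate, the Gaussian kernel, and the bandwidth values are all assumptions chosen for demonstration. It implements a local constant (Nadaraya-Watson) estimate, where observations with covariate values close to the target point receive larger weights, and shows how the fit changes with the bandwidth.

```r
# Local constant (Nadaraya-Watson) estimate at x0 with a Gaussian kernel
nw.estimate <- function(x0, x, y, h) {
  w <- dnorm((x - x0) / h)    # kernel weights: larger for observations near x0
  sum(w * y) / sum(w)         # weighted average of the outcomes
}

set.seed(2)
x <- runif(200, 20, 65)                       # simulated covariate (e.g., age)
y <- sin(x / 10) + rnorm(200, sd = 0.2)       # simulated outcome
grid <- seq(25, 60, by = 5)                   # points at which to estimate

fit.narrow <- sapply(grid, nw.estimate, x = x, y = y, h = 0.5)  # small bandwidth: wiggly fit
fit.wide   <- sapply(grid, nw.estimate, x = x, y = y, h = 10)   # large bandwidth: smooth fit

print(cbind(grid, fit.narrow, fit.wide))
```

Comparing the two columns illustrates the bandwidth sensitivity mentioned above: as the bandwidth grows, every observation receives nearly the same weight and the estimate approaches the overall sample mean.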

Canadian Wage Data Example: Nonparametric regression extends the classical parametric regression (e.g., lm, lmer) involving one continuous dependent variable, y, and (1 or more) continuous explanatory variable(s), x. Let’s start with a popular parametric model of a wage equation that we can extend to a fully nonparametric regression model. First, we will compare and contrast the parametric and nonparametric approach towards univariate regression and then proceed to multivariate regression.

Let’s use the Canadian cross-section wage data (cps71), a random sample taken from the 1971 Canadian Census for male individuals having common education (High-School). The data contain N=205 observations on 2 variables: the logarithm of the individual’s wage (logwage) and their age (age). The classical wage equation model includes a quadratic term of age.

# install.packages("np")
library("np")
data("cps71")

# (1) Linear Model -> R2 = 0.2308
model.lin <- lm( logwage ~ age + I(age^2), data = cps71)
summary(model.lin)

Call:
lm(formula = logwage ~ age + I(age^2), data = cps71)

Residuals:
Min 1Q Median 3Q Max
-2.4041 -0.1711 0.0884 0.3182 1.3940

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||10.0419773||0.4559986||22.022||< 2e-16 ***
|-
|Age||0.1731310||0.0238317|| 7.265||7.96e-12 ***
|-
|I(age^2)||-0.0019771||0.0002898||-6.822||1.02e-10 ***

|}
</center>

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.5608 on 202 degrees of freedom
Multiple R-squared: 0.2308, Adjusted R-squared: 0.2232
F-statistic: 30.3 on 2 and 202 DF, p-value: 3.103e-12

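# (2) Next, we consider a nonparametric kernel regression employing cross-validated bandwidth selection.
# npregbw() computes the least-squares cross-validated bandwidth for the local constant estimator (default).
# The bandwidth object already fixes the estimator type, so the regtype and bwmethod arguments passed to
# npreg() below do not appear to take effect: the summary reports a Local-Constant fit. To obtain a
# local linear fit, these arguments would have to be supplied to npregbw() instead.
# Note that R2 = 0.3108675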
bandwidth <- npregbw(formula= logwage ~ age, data = cps71)
model.np <- npreg(bandwidth, regtype = "ll", bwmethod = "cv.aic", gradients = TRUE, data = cps71)
summary(model.np)

Regression Data: 205 training points, in 1 variable(s) age
Bandwidth(s): 1.892157
Kernel Regression Estimator: Local-Constant
Bandwidth Type: Fixed
Residual standard error: 0.5307943
R-squared: 0.3108675
Continuous Kernel Type: Second-Order Gaussian
No. Continuous Explanatory Vars.: 1

# NP model significance may be tested by
npsigtest(model.np)

Kernel Regression Significance Test
Type I Test with IID Bootstrap (399 replications, Pivot=TRUE, joint=FALSE)
Explanatory variables tested for significance: age (1)

age
Bandwidth(s): 1.892157

Individual Significance Tests
P Value:
age < 2.22e-16 ***

# So, as was the case for the linear parametric model, age is significant in the nonparametric (local constant) model

# (3) Graphical comparison of parametric and nonparametric models.
plot(cps71$\$$age, cps71$\$$logwage, xlab = "age", ylab = "log(wage)", cex=.1)
lines(cps71$\$$age, fitted(model.lin), lty = 2, col = "red")
lines(cps71$\$$age, fitted(model.np), lty = 1, col = "blue")
legend("topright", c("Data", "Linear", "Non-linear"), col=c("Black", "Red", "Blue"), pch = c(1, 1, 1), bty='n', cex=.75)

<center>[[Image:SMHS_Methods14.png|500px]] </center>

# some additional plots presenting the parametric (quadratic, dashed line) and the nonparametric estimates
# (solid line) of the regression function for the cps71 data.
plot(model.np, plot.errors.method = "asymptotic")
plot(model.np, gradients = TRUE)
lines(cps71$\$$age, coef(model.lin)[2]+2*cps71$\$$age*coef(model.lin)[3], lty = 2, col = "red")
plot(model.np, gradients = TRUE, plot.errors.method = "asymptotic")

# (4) Using the linear and nonparametric models to generate predictions, based on the bandwidths
# obtained above and the estimated nonparametric model. We need to create a set of explanatory
# variables for which to generate predictions. These can be part of the original dataset or be
# outside its scope. Typically, we don’t have the outcome for the evaluation data and need only
# provide the explanatory variables for which predicted values are generated by the models.
# Occasionally, splitting the dataset into two independent samples (training/testing), allows estimation
# of a model on one sample, and evaluation of its performance on another.

cps.eval.data <- data.frame(age = seq(10,70, by=10)) # simulate some explanatory X values (ages)
pred.lin <- predict(model.lin, newdata = cps.eval.data) # Linear Prediction of log(Wage)
pred.np <- predict(model.np, newdata = cps.eval.data) # non-Linear Prediction of log(Wage)
plot(pred.lin, pred.np)
abline(lm(pred.np ~ pred.lin))

<center>[[Image:SMHS_Methods15.png|500px]] </center>

.
.
.

==Predictive risk models ==

Predictive risk models represent a class of methods for identifying potential for HTE when the individual patient risk for disease-related events at baseline depends on observed factors. For instance, common measures are disease staging criteria, such as those used in COPD or heart failure, Framingham risk scores for cardiovascular event risk, or genetic variations, e.g., HER2 for breast cancer. Initial predictive risk modeling, aka risk function estimation, is often performed without accounting for treatment effects. Least squares or Cox proportional hazards regression methods are appropriate in many cases and provide relatively more interpretable risk functions, but they rely on linearity assumptions and may not provide optimal predictive metrics. Partial least squares is an extension of least squares methods that can reduce the dimensionality of the predictor space by interposing latent variables, predicted by linear combinations of observable characteristics, as the intermediate predictors of one or more outcomes. Recursive partitioning (e.g., random forests), support vector machines, and neural networks represent later methods, often with better predictive power than linear methods. Risk function estimation can range from highly exploratory analyses to near meta-analytic model validation, and may be useful at any stage of product development.

HIV Example: The “hmohiv” dataset represents a study of HIV positive patients examining whether there was a difference in survival times of HIV positive patients between a cohort using intravenous drugs (drug=1) and a cohort not using the IV drug (drug=0). The hmohiv data includes the following variables:

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
|ID||Time||Age||Drug||Censor||Entdate||Enddate
|-
|1||5||46||0||1||5/15/1990||10/14/1990
|-
|2||6||35||1||0||9/19/1989||3/20/1990
|-
|3||8||30||1||1||4/21/1991||12/20/1991
|-
|4||3||30||1||1||1/3/1991||4/4/1991
|-
|5||22||36||0||1||9/18/1989||7/19/1991
|-
|6||1||32||1||0||3/18/1991||4/17/1991
|-
|...||...||...||...||...||...||...

|}
</center>

#cleaning up environment
rm(list=ls())

# load survival library
library(survival)

# load hmohiv data
hmohiv<-read.table("http://www.ats.ucla.edu/stat/r/examples/asa/hmohiv.csv", sep=",", header = TRUE)
attach(hmohiv)

# construct a frame of the 2 cohorts IV_drug and no-IV-drug (must exist before calling survfit)
drug.new<-data.frame(drug=c(0,1))

# Fit Cox proportional hazards regression model and estimate the survival curves of the 2 cohorts
cox.model <- coxph( Surv(time, censor) ~ drug, method="breslow")
fit.1 <- survfit(cox.model, newdata=drug.new)

# plot results
plot(fit.1, xlab="Survival Time (Months)", ylab="Survival Probability")
points(fit.1$\$$time, fit.1$\$$surv[,1], pch=1)
points(fit.1$\$$time, fit.1$\$$surv[,2], pch=2)
legend(40, .8, c("Drug Absent", "Drug Present"), pch=c(1,2))

<center>[[Image:SMHS_Methods16.png|500px]] </center>

# to inspect the resulting Cox Proportional Hazard Model
cox.model
Call:
coxph(formula = Surv(time, censor) ~ drug, method = "breslow")

coef exp(coef) se(coef) z p
drug 0.779 2.18 0.242 3.22 0.0013

Likelihood ratio test=10.2 on 1 df, p=0.00141 n= 100, number of events= 80

===Footnotes===

*8 http://onlinelibrary.wiley.com/enhanced/doi/10.1002/jrsm.54
*9 http://effectivehealthcare.ahrq.gov/search-for-guides-reviews-and-reports/?pageaction=displayproduct&productID=1857
*10 http://jpepsy.oxfordjournals.org/content/39/2/138.full#sec-14

==[[SMHS_MethodsHeterogeneity_CER|Next see: Comparative Effectiveness Research (CER)]]==

*[[SMHS_MethodsHeterogeneity|Back to the Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research section]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php/SMHS_MethodsHeterogeneity_MetaAnalysis}}

SMHS MethodsHeterogeneity MetaAnalysis

2016-05-23T18:48:36Z

Pineaumi: /* Footnotes */

==[[SMHS_MethodsHeterogeneity| Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research]] - Meta-Analyses ==

==Meta-analysis==

===Overview===

Meta-analysis is an approach for combining treatment effects across trials or studies into an aggregated treatment effect with higher statistical power than that of any individual trial. It may detect HTE by testing for differences in treatment effects across similar RCTs. It requires that the individual treatment effects are similar, to ensure that pooling is meaningful. In the presence of large clinical or methodological differences between the trials, it may be best to avoid meta-analysis. The presence of HTE across studies in a meta-analysis may be due to differences in the design or execution of the individual trials (e.g., randomization methods, patient selection criteria). Cochran's Q is a method for detecting heterogeneity; it is computed as the weighted sum of squared differences between each study's treatment effect and the pooled effect across the studies. It is a barometer of inter-trial differences impacting the observed study results. A possible source of error in a meta-analysis is publication bias. Trial size may introduce publication bias since larger trials are more likely to be published. Language and accessibility represent other potential confounding factors. When the heterogeneity is not due to poor study design, it may be useful for optimizing the treatment benefits for different cohorts of participants.

Cochran's Q statistic is the weighted sum of squares on a standardized scale<sup>8</sup>. The corresponding P value indicates the strength of the evidence for the presence of heterogeneity. This test may sometimes have low power to detect heterogeneity, and a value of 0.10 is suggested as the cut-off for significance (Higgins et al., 2003). Conversely, the Q statistic may have too much power as a test of heterogeneity when the number of studies is large.
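
To make the Q statistic concrete, the following is a minimal sketch in base R that computes it by hand. The study effects and standard errors are hypothetical values chosen only for illustration, and the inverse-variance weighting and the ''I''<sup>2</sup> formula, max(0, (Q − df)/Q), are stated here as assumptions about the usual fixed-effect setup rather than as the exact computation performed by any particular package.

```r
# Cochran's Q and I^2 from per-study effect estimates (hypothetical values)
theta <- c(0.30, 0.45, 0.10, 0.52, 0.38)    # assumed study effects (e.g., log odds ratios)
se    <- c(0.12, 0.20, 0.15, 0.25, 0.10)    # assumed standard errors of the effects

w <- 1 / se^2                                # inverse-variance weights
theta.pooled <- sum(w * theta) / sum(w)      # fixed-effect pooled estimate
Q  <- sum(w * (theta - theta.pooled)^2)      # weighted sum of squared deviations
df <- length(theta) - 1
p.value <- pchisq(Q, df = df, lower.tail = FALSE)  # Q is approx. chi-square(df) under homogeneity
I2 <- max(0, (Q - df) / Q)                   # proportion of variability beyond chance

print(c(Q = Q, df = df, p.value = p.value, I2 = I2))
```

A small p-value (e.g., below the 0.10 cut-off) would suggest heterogeneity across the studies, whereas a Q value below its degrees of freedom yields ''I''<sup>2</sup> = 0.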

===Simulation Example 1===

# Install and Load library
install.packages("meta")
library(meta)

# Set number of studies
n.studies = 15

# number of treatments: case1, case2, control
n.trt = 3

# number of outcomes
n.event = 2

# simulate the (balanced) number of cases (case1 and case2) and controls in each study
ctl.group = rbinom(n = n.studies, size = 200, prob = 0.3)
case1.group = rbinom(n = n.studies, size = 200, prob = 0.3)
case2.group = rbinom(n = n.studies, size = 200, prob = 0.3)

# Simulate the number of outcome events (e.g., deaths) and no events in the control group
event.ctl.group = rbinom(n = n.studies, size = ctl.group, prob = rep(0.1, length(ctl.group)))
noevent.ctl.group = ctl.group - event.ctl.group

# Simulate the number of events and no events in the case1 group
event.case1.group = rbinom(n = n.studies, size = case1.group, prob = rep(0.5, length(case1.group)))
noevent.case1.group = case1.group - event.case1.group

# Simulate the number of events and no events in the case2 group
event.case2.group = rbinom(n = n.studies, size = case2.group, prob = rep(0.6, length(case2.group)))
noevent.case2.group = case2.group - event.case2.group

# Run the univariate meta-analysis using metabin(), Meta-analysis of binary outcome data –
# Calculation of fixed and random effects estimates (risk ratio, odds ratio, risk difference or arcsine
# difference) for meta-analyses with binary outcome data. Mantel-Haenszel (MH),
# inverse variance and Peto method are available for pooling.

# method = A character string indicating which method is to be used for pooling of studies.
# one of "MH" , "Inverse" , or "Peto"
# sm = A character string indicating which summary measure ("OR", "RR", "RD"=risk difference) is to be
# used for pooling of studies

# Control vs. Case1, n.e and n.c are numbers in experimental and control groups
meta.ctr_case1 <- metabin(event.e = event.case1.group, n.e = case1.group, event.c = event.ctl.group,
n.c = ctl.group, method = "MH", sm = "OR")
# in this case we use Odds Ratio, of the odds of death in the experimental and control studies
forest(meta.ctr_case1)

<center>[[Image:SMHS_Methods8.png|500px]] </center>

# Control vs. Case2
meta.ctr_case2 <- metabin(event.e = event.case2.group, n.e = case2.group, event.c = event.ctl.group,
n.c = ctl.group, method = "MH", sm = "OR")
forest(meta.ctr_case2)

<center>[[Image:SMHS_Methods9.png|500px]] </center>

# Case1 vs. Case2
meta.case1_case2 <- metabin(event.e = event.case1.group, n.e = case1.group, event.c = event.case2.group,
n.c = case2.group, method = "MH", sm = "OR")
forest(meta.case1_case2)
summary(meta.case1_case2)

Test of heterogeneity:
Q d.f. p-value
11.99 14 0.6071

<center>[[Image:SMHS_Methods10.png|500px]] </center>

The test of heterogeneity (Q = 11.99 on 14 degrees of freedom, p = 0.6071) and the ''I''<sup>2</sup> statistic shown in the forest plot provide no evidence to reject the null hypothesis of no study heterogeneity, so the fixed effects model should be used.

==Series of “N of 1” trials==

This technique combines data from a series of n-of-1 trials to identify HTE. An n-of-1 trial is a repeated crossover trial for a single patient, which randomly assigns the patient to one treatment vs. another for a given time period, after which the patient is re-randomized to treatment for the next time period; this is usually repeated for 4-6 time periods. Such trials are most feasible in chronic conditions, where little or no washout period is needed between treatments and treatment effects are identifiable in the short term, e.g., through pain or reliable surrogate markers. Combining data from identical n-of-1 trials across a set of patients enables statistical analysis controlling for patient fixed or random effects, covariates, centers, or sequence effects, see Figure below. These combined trials are often analyzed within a Bayesian context using shrinkage estimators that combine individual and group mean treatment effects to create a “posterior” individual mean treatment effect estimate, which is a form of inverse variance-weighted average of the individual and group effects. Such trials are typically more expensive than standard RCTs on a per-patient basis; however, they require much smaller sample sizes, often fewer than 100 patients (due to the efficient individual-as-own-control design), and create individual treatment effect estimates that are not possible in a non-crossover design<sup>9</sup>. For the individual patient, the treatment effect can be re-estimated after each time period, and the trial can be stopped at any point when the more effective treatment is identified with reasonable statistical certainty.
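
The shrinkage idea can be illustrated with a short base R sketch. It assumes a simple normal-normal model with known variances; the per-patient effects, their sampling variances, and the between-patient variance tau2 are hypothetical numbers, and a full Bayesian analysis would estimate these quantities rather than fix them.

```r
# Shrinkage of individual n-of-1 effects toward the group mean (illustrative values)
ind.effect <- c(4.0, 9.5, -1.0, 6.2)   # assumed per-patient mean treatment effects
ind.var    <- c(4.0, 1.0, 9.0, 2.5)    # assumed sampling variances of those estimates
tau2       <- 3.0                      # assumed between-patient variance of the true effects

# precision-weighted group mean treatment effect
w.group    <- 1 / (ind.var + tau2)
group.mean <- sum(w.group * ind.effect) / sum(w.group)

# "posterior" individual effect: inverse variance-weighted average
# of the individual estimate and the group mean
w.ind     <- (1 / ind.var) / (1 / ind.var + 1 / tau2)
posterior <- w.ind * ind.effect + (1 - w.ind) * group.mean

print(cbind(ind.effect, weight.on.individual = w.ind, posterior))
```

Patients whose individual estimates are noisier (larger sampling variance) receive a smaller weight on their own data, so their posterior estimates are pulled more strongly toward the group mean.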

====Example====

A study involving 8 participants collected data across 30 days, in which 15 treatment days and 15 control days were randomly assigned within each participant<sup>10</sup>. The treatment is represented as a binary variable (control day=0; treatment day=1). The outcome variable represents the response to the intervention within each of the 8 participants. The study employed fixed-effects modeling, creating N − 1 dummy-coded variables representing the N=8 participants, where the last (i=8) participant serves as the reference (i.e., as the model intercept). So, each dummy-coded variable represents the difference between a participant (i) and the 8th participant. Thus, all other patients' values are relative to the values of the 8th (reference) subject. The overall differences across participants in fixed effects can be evaluated with multiple degree-of-freedom F-tests.
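
The dummy coding described here can be sketched in base R as follows. The data are simulated (the outcome y and its effect sizes are assumptions, not the study data), and the sequential F-tests produced by anova() are shown only to illustrate the degrees of freedom; they are not the same as Type 3 tests.

```r
# Fixed-effects coding for N = 8 participants observed over 30 days (simulated data)
set.seed(3)
n.id <- 8; n.day <- 30
dat <- data.frame(ID  = factor(rep(1:n.id, each = n.day)),
                  Day = rep(1:n.day, times = n.id))
dat$ID <- relevel(dat$ID, ref = "8")          # participant 8 is the reference level

# 15 treatment and 15 control days randomly assigned within each participant
dat$Tx <- unlist(lapply(1:n.id, function(i) sample(rep(0:1, each = n.day / 2))))
dat$y  <- 50 + 20 * dat$Tx + rnorm(nrow(dat), sd = 30)   # simulated outcome

X <- model.matrix(~ ID, data = dat)           # intercept + N - 1 dummy-coded columns
fit <- lm(y ~ Tx * ID, data = dat)            # treatment, participant, and their interaction
aov.tab <- anova(fit)                         # multiple degree-of-freedom F-tests

print(colnames(X))
print(aov.tab)
```

With 8 participants and 240 observations, the ID and Tx:ID terms each use 7 degrees of freedom and 224 remain for the residual, which matches the denominator degrees of freedom in the Type 3 table of fixed effects reported in this example.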

<center>[[Image:SMHS_Methods11.png|500px]] </center>

<center>

{| class="wikitable" style="text-align:center; " border="1"
|-
|ID||Day||Tx||SelfEff||SelfEff25||WPSS||SocSuppt||PMss||PMss3||PhyAct
|-
|1||1||1||33||8||0.97||5.00||4.03||1.03||53
|-
|1||2||1||33||8||-0.17||3.87||4.03||1.03||73
|-
|1||3||0||33||8||0.81||4.84||4.03||1.03||23
|-
|1||4||0||33||8||-0.41||3.62||4.03||1.03||36
|-
|...||...||...||...||...||...||...||...||...||...

|}
</center> Complete data is available in the Appendix.

<center>Data Summary

{| class="wikitable" style="text-align:center; " border="1"
|-
|Intercept||Constant
|-
|Physical Activity||PhyAct
|-
|Intervention||Tx
|-
|WP Social Support||WPSS
|-
|PM Social Support (1-3)||PMss3
|-
|Self Efficacy||SelfEff25

|}
</center>

rm(list=ls())
Nof1 <-read.table("https://umich.instructure.com/files/330385/download?download_frd=1&verifier=DwJUGSd6t24dvK7uYmzA2aDyzlmsohyaK6P7jK0Q", sep=",", header = TRUE) # 02_Nof1_Data.csv
attach(Nof1)
head(Nof1)

<center>

{| class="wikitable" style="text-align:center; " border="1"
|-
| ||ID||Day||Tx||SelfEff||SelfEff25||WPSS||SocSuppt||PMss||PMss3||PhyAct
|-
|1||1||1||1||33||8||0.97||5.00||4.03||1.03||53
|-
|2||1||2||1||33||8||-0.17||3.87||4.03||1.03||73
|-
|3||1||3||0||33||8||0.81||4.84||4.03||1.03||23
|-
|4||1||4||0||33||8||-0.41||3.62||4.03||1.03||36
|-
|5||1||5||1||33||8||0.59||4.62||4.03||1.03||21
|-
|6||1||6||1||33||8||-1.16||2.87||4.03||1.03||0

|}
</center>

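# collect the model variables (including ID, Day and SelfEff, which the mixed model below uses)
df.1 = data.frame(ID, Day, PhyAct, Tx, WPSS, PMss3, SelfEff, SelfEff25)

library("lme4") # provides lmer()

lm.1 = model.lmer <- lmer(PhyAct ~ Tx + SelfEff + Tx*SelfEff + (1|Day) + (1|ID) , data= df.1)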
summary(lm.1)

Linear mixed model fit by REML ['lmerMod']
Formula: PhyAct ~ Tx + SelfEff + Tx * SelfEff + (1 | Day) + (1 | ID)
Data: df.1

REML criterion at convergence: 8820

<center> Scaled Residuals
{| class="wikitable" style="text-align:center; " border="1"
|-
| Min||1Q||Median||3Q||Max
|-
|-2.7012||-0.6833||-0.0333||0.6542||3.9612
|}
</center>

<center> Random Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| Groups ||Name||Variance ||Std.Dev.
|-
| Day||(Intercept) ||0.0 || 0.00
|-

|ID|| (Intercept)||601.5||24.53
|-

|Residual|| ||969.0 ||31.13
|}
Number of obs: 900, groups: Day, 30; ID, 30
</center>

<center> Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t value
|-
|(Intercept)||38.3772||14.4738||2.651
|-
|Tx||4.0283||6.3745||0.632
|-
|SelfEff||0.5818||0.5942||0.979
|-
|Tx:SelfEff||0.9702||0.2617||3.708
|}
</center>

<center> Correlation of Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||(Intr)||Tx ||SlfEff
|-
| Tx|| -0.220|| ||
|-
| SelfEff||-0.946 ||0.208 ||
|-
| Tx:SelfEff ||0.208 ||-0.946 ||-0.220
|}
</center>

# Model: PhyAct = Tx + WPSS + PMss3 + Tx*WPSS + Tx*PMss3 + SelfEff25 + Tx*SelfEff25 + ε
lm.2 = lm(PhyAct ~ Tx + WPSS + PMss3 + Tx*WPSS + Tx*PMss3 + SelfEff25 + Tx*SelfEff25, df.1)
summary(lm.2)

Call:
lm(formula = PhyAct ~ Tx + WPSS + PMss3 + Tx * WPSS + Tx * PMss3 +
SelfEff25 + Tx * SelfEff25, data = df.1)

<center> Residuals
{| class="wikitable" style="text-align:center; " border="1"
|-
|Min||1Q||Median||3Q||Max
|-
| -102.39||-28.24||-1.47||25.16||122.41

|}
</center>

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t value||$Pr(>|t|)$
|-
|(Intercept)||52.0067||1.8080||28.764||< 2e-16 ***
|-
|Tx||27.7366||2.5569||10.848||< 2e-16 ***
|-
|WPSS||1.9631||2.4272||0.809||0.418853
|-
|PMss3||13.5110||2.7853||4.851||1.45e-06 ***
|-
|SelfEff25||0.6289||0.2205||2.852||0.004439 **
|-
|Tx:WPSS||9.9114||3.4320||2.888||0.003971 **
|-
|Tx:PMss3||8.8422||3.9390||2.245||0.025025 *
|-
|Tx:SelfEff25||1.0460||0.3118||3.354||0.000829 ***

|}
</center>

[Using SAS (StudyI_Analyses.sas, StudyIIab_Analyses.sas)]

<center> Type 3 Tests of Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
|Effect||Num DF||Den DF||F Value||$Pr>F$
|-
|Tx||1||224||67.46||<.0001
|-
|ID||7||224||25.95||<.0001
|-
|Tx*ID||7||224||2.92||0.0060
|}
</center>

==Quantile Treatment Effect (QTE)==

QTE employs quantile regression estimation (QRE) to examine the central tendency and statistical dispersion of the treatment effect in a population. These may not be revealed by the conventional mean estimation in RCTs. For instance, patients with different comorbidity scores may respond differently to a treatment. Quantile regression can reveal HTE according to the ranking of patients’ comorbidity scores, or of some other relevant covariate by which patients may be ranked. Therefore, in an attempt to inform patient-centered care, quantile regression provides more information on the distribution of the treatment effect than the typical conditional mean treatment effect estimation. QTE characterizes the heterogeneous treatment effect on individuals and groups across various positions in the distributions of different outcomes of interest. This feature has drawn substantial attention to quantile regression analysis, which has been employed across a wide range of applications, particularly when evaluating the economic effects of welfare reform.

One caveat of applying QRE in clinical trials for examining HTE is that the QTE doesn’t demonstrate the treatment effect for a given patient. Instead, it focuses on the treatment effect among subjects at the qth quantile, such as those who are exactly at the top 10th percentile in terms of blood pressure or a depression score, given some covariate of interest, for example, a comorbidity score. It is not uncommon for the subjects at the qth quantile before and after the treatment to be two different sets of patients. For this reason, we have to assume that these two groups of patients are homogeneous if they are in the same quantile.
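
A small simulation may help to show what a QTE captures that a mean comparison does not. This is a hedged sketch in base R with made-up data: the two arms, their distributions, and the choice of quantiles are assumptions for illustration only. The treatment here shifts the mean and also widens the spread of the outcome, so the effect differs across quantiles.

```r
# Quantile treatment effects on simulated two-arm data (illustrative only)
set.seed(4)
n <- 500
control <- rnorm(n, mean = 0, sd = 1)
treated <- rnorm(n, mean = 1, sd = 2)    # treatment shifts the mean and widens the spread

taus <- c(0.1, 0.25, 0.5, 0.75, 0.9)
# QTE at tau: difference between the tau-quantiles of the two outcome distributions
qte <- quantile(treated, taus, names = FALSE) - quantile(control, taus, names = FALSE)
ate <- mean(treated) - mean(control)     # conventional mean treatment effect

print(cbind(tau = taus, QTE = qte))
print(ate)
```

The mean effect summarizes the comparison with a single number, while the QTE values vary from the lower to the upper quantiles. As noted above, these are differences between distributions, not effects for identifiable individual patients.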

Income-Food Expenditure Example: Let’s examine the Engel data (N=235) on the relationship between food expenditure (foodexp) and household income (income). We can plot the data and then explore the superposition of the six fitted quantile regression lines.

install.packages("quantreg")
library(quantreg)
data(engel)
attach(engel)

<center>head(engel)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Income||Foodexp
|-
|1||420.1577||255.8394
|-
|2||541.4117||310.9587
|-
|3||901.1575||485.6800
|-
|4||639.0802||402.9974
|-
|5||750.8756||495.5608
|-
|6||945.7989||633.7978

|}
</center>

<center>summary(engel)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Income||Foodexp
|-
|Min||377.1||242.3
|-
|1st Qu.||638.9||429.7
|-
|Median||884.0||582.5
|-
|Mean||982.5||624.2
|-
|3rd Qu.||1164.0||743.9
|-
|Max||4957.8||2032.7

|}
</center>

Note: If Y is a real-valued random variable with cumulative distribution function $F_Y(y)=P(Y\le y)$, then the τ-quantile of Y is given by

<center> $Q_Y(\tau)=F_Y^{-1}(\tau)=\inf\{ y: F_Y(y)\ge\tau \}$ </center>

where 0≤τ≤1.

<center>[[Image:SMHS_Methods12.png|500px]] </center>

# (1) Graphics
plot(income, foodexp, cex=.25, type="n", xlab="Household Income", ylab="Food Expenditure")
points(income, foodexp, cex=.5, col="blue")

# tau - the quantile(s) to be estimated, strictly between 0 and 1. A single tau returns an object of class "rq";
# a vector of taus returns an object of class "rqs" containing the matrix of coefficient estimates at the specified quantiles.
abline( rq(foodexp ~ income, tau=.5), col="blue") # Quantile Regression Model

abline( lm(foodexp ~ income), lty=2, lwd=3, col="red") # linear model
taus <- c(0.05, 0.1, 0.25, 0.75, 0.90, 0.95)
colors <- rainbow(length(taus))

models <- vector(mode = "list", length = length(taus)) # define a vector of models to store QR for diff taus
model.names <- vector(mode = "list", length = length(taus)) # define a vector model names

for( i in 1:length(taus)){
models[[i]] <- rq(foodexp ~ income, tau=taus[i])
var <- taus[i]
model.names[[i]] <- paste("Model [", i , "]: tau=", var)
abline( models[[i]], lwd=2, col= colors[[i]])
}
legend(3000, 1100, model.names, col= colors, pch= taus, bty='n', cex=.75)

<center>[[Image:SMHS_Methods13.png|500px]] </center>

# (2) Inference about quantile regression coefficients. As an alternative to the rank-inversion confidence intervals, we can obtain a table of coefficients, standard errors, t-statistics, and p-values using the summary function:

summary(models[[3]], se = "nid")

Call: rq(formula = foodexp ~ income, tau = taus[i])

tau: [1] 0.25

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Value||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||95.48354||21.39237||4.46344||0.00001
|-
|Income||0.47410||0.02906||16.31729||0.00000

|}
</center>

# Alternatively, we can call summary.rq directly; with se = "nid" it reproduces the table above, whereas se = "boot" would compute bootstrapped standard errors.
summary.rq(models[[3]], se = "nid")

Call: rq(formula = foodexp ~ income, tau = taus[i])
tau: [1] 0.25

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Value||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||95.48354||21.39237||4.46344||0.00001
|-
|Income||0.47410||0.02906||16.31729||0.00000

|}
</center>

==Nonparametric Regression Methods ==

Nonparametric regression enables dealing with HTE in RCTs. Different nonparametric methods, such as kernel smoothing methods and series methods, can be used to generate test statistics for examining the presence of HTE. A kernel method is a weighting scheme based on a kernel function (e.g. uniform, Gaussian). When evaluating the treatment effect of a patient in RCTs, the kernel method assigns larger weights to those observations with similar covariates. This is done because it is assumed that patients with similar covariates provide more relevant data on predicted treatment response. Examining participants that have different backgrounds (e.g., demographic, clinical), kernel smoothing methods utilize information from highly divergent participants when estimating a particular subject’s treatment effect. Lower weights are assigned to very different subjects and the kernel methods require choosing a set of smoothing parameters to group patients according to their relative degree of similarities. A drawback is that the corresponding proposed test statistics may be sensitive to the chosen bandwidths, which inhibits the interpretation of the results. Series methods use approximating functions (splines or power series of the explanatory variables) to construct test statistics. Compared to kernel smoothing methods, series methods normally have the advantage of computational convenience; however, the precision of test statistics depends on the number of terms selected in the series.
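
The dependence of series methods on the number of terms can be illustrated with a brief base R sketch. The data are simulated and the polynomial orders considered (1 to 5) are an arbitrary assumption; splines or other bases could be used in the same way.

```r
# Series (polynomial) regression: the fit depends on the number of terms in the series
set.seed(5)
x <- runif(200, 20, 65)                       # simulated covariate
y <- sin(x / 10) + rnorm(200, sd = 0.2)       # simulated outcome

orders <- 1:5
rss <- sapply(orders, function(k) {
  fit <- lm(y ~ poly(x, degree = k))          # orthogonal polynomial series of order k
  sum(resid(fit)^2)                           # residual sum of squares
})
aic <- sapply(orders, function(k) AIC(lm(y ~ poly(x, degree = k))))

print(cbind(order = orders, RSS = rss, AIC = aic))
```

The residual sum of squares can only decrease as terms are added, so a criterion that penalizes complexity (AIC is used here as one possible choice) is needed to select the number of terms.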

Canadian Wage Data Example: Nonparametric regression extends the classical parametric regression (e.g., lm, lmer) involving one continuous dependent variable, y, and (1 or more) continuous explanatory variable(s), x. Let’s start with a popular parametric model of a wage equation that we can extend to a fully nonparametric regression model. First, we will compare and contrast the parametric and nonparametric approach towards univariate regression and then proceed to multivariate regression.

Let’s use the Canadian cross-section wage data (cps71), a random sample taken from the 1971 Canadian Census for male individuals having common education (High-School). The data contain N=205 observations on 2 variables: the logarithm of the individual’s wage (logwage) and their age (age). The classical wage equation model includes a quadratic term of age.

# install.packages("np")
library("np")
data("cps71")

# (1) Linear Model -> R2 = 0.2308
model.lin <- lm( logwage ~ age + I(age^2), data = cps71)
summary(model.lin)

Call:
lm(formula = logwage ~ age + I(age^2), data = cps71)

Residuals:
Min 1Q Median 3Q Max
-2.4041 -0.1711 0.0884 0.3182 1.3940

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||10.0419773||0.4559986||22.022||< 2e-16 ***
|-
|Age||0.1731310||0.0238317|| 7.265||7.96e-12 ***
|-
|I(age^2)||-0.0019771||0.0002898||-6.822||1.02e-10 ***

|}
</center>

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.5608 on 202 degrees of freedom
Multiple R-squared: 0.2308, Adjusted R-squared: 0.2232
F-statistic: 30.3 on 2 and 202 DF, p-value: 3.103e-12

# (2) Next, we consider a nonparametric kernel regression employing cross-validated
# bandwidth selection. Start with computing the least-squares cross-validated bandwidth
# for the local constant estimator (the npregbw default). The summary below reports a
# Local-Constant fit, so the estimator type appears to be taken from the bandwidth object.
# Note that R2 = 0.3108675
bandwidth <- npregbw(formula= logwage ~ age, data = cps71)
model.np <- npreg(bandwidth, regtype = "ll", bwmethod = "cv.aic", gradients = TRUE, data = cps71)
summary(model.np)

Regression Data: 205 training points, in 1 variable(s) age
Bandwidth(s): 1.892157
Kernel Regression Estimator: Local-Constant
Bandwidth Type: Fixed
Residual standard error: 0.5307943
R-squared: 0.3108675
Continuous Kernel Type: Second-Order Gaussian
No. Continuous Explanatory Vars.: 1

# NP model significance may be tested by
npsigtest(model.np)

Kernel Regression Significance Test
Type I Test with IID Bootstrap (399 replications, Pivot=TRUE, joint=FALSE)
Explanatory variables tested for significance: age (1)

age
Bandwidth(s): 1.892157

Individual Significance Tests
P Value:
age < 2.22e-16 ***

# So, as was the case for the linear parametric model, age is significant in the nonparametric (kernel regression) model

# (3) Graphical comparison of parametric and nonparametric models.
plot(cps71$age, cps71$logwage, xlab = "age", ylab = "log(wage)", cex=.1)
lines(cps71$age, fitted(model.lin), lty = 2, col = "red")
lines(cps71$age, fitted(model.np), lty = 1, col = "blue")
legend("topright", c("Data", "Linear", "Non-linear"), col=c("Black", "Red", "Blue"), pch = c(1, 1, 1), bty='n', cex=.75)

<center>[[Image:SMHS_Methods14.png|500px]] </center>

# some additional plots presenting the parametric (quadratic, dashed line) and the nonparametric estimates
# (solid line) of the regression function for the cps71 data.
plot(model.np, plot.errors.method = "asymptotic")
plot(model.np, gradients = TRUE)
lines(cps71$age, coef(model.lin)[2]+2*cps71$age*coef(model.lin)[3], lty = 2, col = "red")
plot(model.np, gradients = TRUE, plot.errors.method = "asymptotic")

# (4) Use the linear and nonparametric models to generate predictions, based on the bandwidths
# obtained above and the estimated nonparametric model. We need to create a set of explanatory
# variables for which to generate predictions. These can be part of the original dataset or be
# outside its scope. Typically, we don’t have the outcome for the evaluation data and need only
# provide the explanatory variables for which predicted values are generated by the models.
# Occasionally, splitting the dataset into two independent samples (training/testing) allows estimation
# of a model on one sample and evaluation of its performance on another.

cps.eval.data <- data.frame(age = seq(10,70, by=10)) # simulate some explanatory X values (ages)
pred.lin <- predict(model.lin, newdata = cps.eval.data) # Linear Prediction of log(Wage)
pred.np <- predict(model.np, newdata = cps.eval.data) # non-Linear Prediction of log(Wage)
plot(pred.lin, pred.np)
abline(lm(pred.np ~ pred.lin))

<center>[[Image:SMHS_Methods15.png|500px]] </center>

.
.
.

==Predictive risk models ==

Predictive risk models represent a class of methods for identifying potential HTE when the individual patient's baseline risk for disease-related events depends on observed factors. Common risk measures include disease staging criteria (such as those used in COPD or heart failure), Framingham risk scores for cardiovascular event risk, and genetic variations (e.g., HER2 for breast cancer). Initial predictive risk modeling, also known as risk function estimation, is often performed without accounting for treatment effects. Least squares or Cox proportional hazards regression methods are appropriate in many cases and provide relatively interpretable risk functions, but they rely on linearity assumptions and may not provide optimal predictive metrics. Partial least squares is an extension of least squares methods that can reduce the dimensionality of the predictor space by interposing latent variables, predicted by linear combinations of observable characteristics, as the intermediate predictors of one or more outcomes. Recursive partitioning methods (such as random forests), support vector machines, and neural networks are more recent methods that may offer better predictive power than linear methods. Risk function estimation can range from highly exploratory analyses to near meta-analytic model validation, and may be useful at any stage of product development.

HIV Example: The “hmohiv” dataset represents a study of HIV positive patients examining whether there was a difference in survival times of HIV positive patients between a cohort using intravenous drugs (drug=1) and a cohort not using the IV drug (drug=0). The hmohiv data includes the following variables:

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
|ID||Time||Age||Drug||Censor||Entdate||Enddate
|-
|1||5||46||0||1||5/15/1990||10/14/1990
|-
|2||6||35||1||0||9/19/1989||3/20/1990
|-
|3||8||30||1||1||4/21/1991||12/20/1991
|-
|4||3||30||1||1||1/3/1991||4/4/1991
|-
|5||22||36||0||1||9/18/1989||7/19/1991
|-
|6||1||32||1||0||3/18/1991||4/17/1991
|-
|...||...||...||...||...||...||...

|}
</center>

#cleaning up environment
rm(list=ls())

# load survival library
library(survival)

# load hmohiv data
hmohiv<-read.table("http://www.ats.ucla.edu/stat/r/examples/asa/hmohiv.csv", sep=",", header = TRUE)
attach(hmohiv)

# construct a frame of the 2 cohorts IV_drug and no-IV-drug
# (it must exist before it is passed to survfit below)
drug.new<-data.frame(drug=c(0,1))

# Fit Cox proportional hazards regression model
cox.model <- coxph( Surv(time, censor) ~ drug, method="breslow")
fit.1 <- survfit(cox.model, newdata=drug.new)

# plot results
plot(fit.1, xlab="Survival Time (Months)", ylab="Survival Probability")
points(fit.1$time, fit.1$surv[,1], pch=1)
points(fit.1$time, fit.1$surv[,2], pch=2)
legend(40, .8, c("Drug Absent", "Drug Present"), pch=c(1,2))

<center>[[Image:SMHS_Methods16.png|500px]] </center>

# to inspect the resulting Cox Proportional Hazard Model
cox.model
Call:
coxph(formula = Surv(time, censor) ~ drug, method = "breslow")

coef exp(coef) se(coef) z p
drug 0.779 2.18 0.242 3.22 0.0013

Likelihood ratio test=10.2 on 1 df, p=0.00141 n= 100, number of events= 80

===Footnotes===

*8 http://onlinelibrary.wiley.com/enhanced/doi/10.1002/jrsm.54
*9 http://effectivehealthcare.ahrq.gov/search-for-guides-reviews-and-reports/?pageaction=displayproduct&productID=1857
*10 http://jpepsy.oxfordjournals.org/content/39/2/138.full#sec-14

==[[SMHS_MethodsHeterogeneity_CER|Next see: Comparative Effectiveness Research (CER)]]==

*[[SMHS_MethodsHeterogeneity|Back to the Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research section]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php/SMHS_MethodsHeterogeneity_MetaAnalysis}}

SMHS MethodsHeterogeneity MetaAnalysis

2016-05-23T18:48:14Z

Pineaumi: /* Footnotes */

==[[SMHS_MethodsHeterogeneity| Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research]] - Meta-Analyses ==

==Meta-analysis==

===Overview===

Meta-analysis is an approach to combine treatment effects across trials or studies into an aggregated treatment effect with higher statistical power than observed in each individual trial. It may detect HTE by testing for differences in treatment effects across similar RCTs. It requires that the individual treatment effects are similar to ensure pooling is meaningful. In the presence of large clinical or methodological differences between the trials, it may be better to avoid meta-analysis altogether. The presence of HTE across studies in a meta-analysis may be due to differences in the design or execution of the individual trials (e.g., randomization methods, patient selection criteria). Cochran's Q is a method for detecting heterogeneity, computed as the weighted sum of squared differences between each study's treatment effect and the pooled effect across the studies. It is a barometer of inter-trial differences impacting the observed study results. A possible source of error in a meta-analysis is publication bias. Trial size may introduce publication bias since larger trials are more likely to be published. Language and accessibility represent other potential confounding factors. When the heterogeneity is not due to poor study design, it may be useful for optimizing the treatment benefits for different cohorts of participants.

Cochran's Q statistic is the weighted sum of squares on a standardized scale<sup>8</sup>. The corresponding P value indicates the strength of the evidence for the presence of heterogeneity. Because this test may have low power to detect heterogeneity, particularly when only a few studies are available, a value of 0.10 is suggested as the cut-off for significance (Higgins et al., 2003). Conversely, the Q statistic may have too much power as a test of heterogeneity when the number of studies is large.
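As a rough illustration of the computation described above, the base-R sketch below derives the inverse-variance pooled estimate, Cochran's Q, its chi-square P value, and the related ''I''<sup>2</sup> = max(0, (Q − df)/Q) statistic. The study effects (<code>yi</code>, log odds ratios) and standard errors (<code>sei</code>) are hypothetical values chosen only for illustration; they do not come from any real trial.

```r
# Hypothetical study-level effects (log odds ratios) and their standard errors
yi  <- c(0.10, 0.30, 0.35, 0.65, 0.45)
sei <- c(0.20, 0.25, 0.15, 0.30, 0.20)

wi     <- 1 / sei^2                      # inverse-variance weights
pooled <- sum(wi * yi) / sum(wi)         # fixed-effect pooled estimate
Q      <- sum(wi * (yi - pooled)^2)      # Cochran's Q: weighted sum of squared deviations
df     <- length(yi) - 1                 # degrees of freedom = number of studies - 1
p.val  <- pchisq(Q, df = df, lower.tail = FALSE)   # P value of the heterogeneity test
I2     <- max(0, (Q - df) / Q)           # share of variation attributed to heterogeneity

round(c(pooled = pooled, Q = Q, df = df, p = p.val, I2 = I2), 4)
```

Under the null hypothesis of homogeneity, Q is compared against a chi-square distribution with (number of studies − 1) degrees of freedom, which is what <code>pchisq</code> does here.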

===Simulation Example 1===

# Install and Load library
install.packages("meta")
library(meta)

# Set number of studies
n.studies = 15

# number of treatments: case1, case2, control
n.trt = 3

# number of outcomes
n.event = 2

# simulate the (balanced) number of cases (case1 and case2) and controls in each study
ctl.group = rbinom(n = n.studies, size = 200, prob = 0.3)
case1.group = rbinom(n = n.studies, size = 200, prob = 0.3)
case2.group = rbinom(n = n.studies, size = 200, prob = 0.3)

# Simulate the number of outcome events (e.g., deaths) and no events in the control group
event.ctl.group = rbinom(n = n.studies, size = ctl.group, prob = rep(0.1, length(ctl.group)))
noevent.ctl.group = ctl.group - event.ctl.group

# Simulate the number of events and no events in the case1 group
event.case1.group = rbinom(n = n.studies, size = case1.group, prob = rep(0.5, length(case1.group)))
noevent.case1.group = case1.group - event.case1.group

# Simulate the number of events and no events in the case2 group
event.case2.group = rbinom(n = n.studies, size = case2.group, prob = rep(0.6, length(case2.group)))
noevent.case2.group = case2.group - event.case2.group

# Run the univariate meta-analysis using metabin(), Meta-analysis of binary outcome data –
# Calculation of fixed and random effects estimates (risk ratio, odds ratio, risk difference or arcsine
# difference) for meta-analyses with binary outcome data. Mantel-Haenszel (MH),
# inverse variance and Peto method are available for pooling.

# method = A character string indicating which method is to be used for pooling of studies,
# one of "MH", "Inverse", or "Peto"
# sm = A character string indicating which summary measure ("OR", "RR", or "RD" = risk difference) is to be
# used for pooling of studies

# Control vs. Case1, n.e and n.c are numbers in experimental and control groups
meta.ctr_case1 <- metabin(event.e = event.case1.group, n.e = case1.group, event.c = event.ctl.group,
n.c = ctl.group, method = "MH", sm = "OR")
# in this case we use Odds Ratio, of the odds of death in the experimental and control studies
forest(meta.ctr_case1)

<center>[[Image:SMHS_Methods8.png|500px]] </center>

# Control vs. Case2
meta.ctr_case2 <- metabin(event.e = event.case2.group, n.e = case2.group, event.c = event.ctl.group,
n.c = ctl.group, method = "MH", sm = "OR")
forest(meta.ctr_case2)

<center>[[Image:SMHS_Methods9.png|500px]] </center>

# Case1 vs. Case2
meta.case1_case2 <- metabin(event.e = event.case1.group, n.e = case1.group, event.c = event.case2.group,
n.c = case2.group, method = "MH", sm = "OR")
forest(meta.case1_case2)
summary(meta.case1_case2)

Test of heterogeneity:
Q d.f. p-value
11.99 14 0.6071

<center>[[Image:SMHS_Methods10.png|500px]] </center>

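The heterogeneity test (Q = 11.99 on 14 degrees of freedom, p-value = 0.6071) provides no evidence to reject the null hypothesis of no between-study heterogeneity. Since Q is below its degrees of freedom, the ''I''<sup>2</sup> statistic reported in the forest plot is expected to be zero, and the fixed effects model may be used.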

==Series of “N of 1” trials==

This technique combines data from a series of n-of-1 trials to identify HTE. An n-of-1 trial is a repeated crossover trial for a single patient, which randomly assigns the patient to one treatment vs. another for a given time period, after which the patient is re-randomized to treatment for the next time period, usually repeated for 4-6 time periods. Such trials are most feasible in chronic conditions, where little or no washout period is needed between treatments and treatment effects are identifiable in the short term, using outcomes such as pain or reliable surrogate markers. Combining data from identical n-of-1 trials across a set of patients enables statistical analyses that control for patient fixed or random effects, covariates, centers, or sequence effects (see the figure below). These combined trials are often analyzed within a Bayesian context using shrinkage estimators that combine individual and group mean treatment effects to create a “posterior” individual mean treatment effect estimate, which is a form of inverse variance-weighted average of the individual and group effects. Such trials are typically more expensive than standard RCTs on a per-patient basis; however, they require much smaller sample sizes, often fewer than 100 patients (due to the efficient individual-as-own-control design), and yield individual treatment effect estimates that are not possible in a non-crossover design<sup>9</sup>. For the individual patient, the treatment effect can be re-estimated after each time period, and the trial stopped at any point when the more effective treatment is identified with reasonable statistical certainty.
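The shrinkage idea can be sketched in a few lines of base R. This is a minimal illustration under a simple normal-normal model, assuming the group mean and between-patient variance are known; all numbers and variable names (<code>indiv.est</code>, <code>indiv.se</code>, <code>group.mean</code>, <code>group.var</code>) are hypothetical and not taken from the cited study.

```r
# Hypothetical individual treatment-effect estimates from four n-of-1 trials
indiv.est  <- c(2.0, -0.5, 1.2, 3.1)     # per-patient mean treatment effects
indiv.se   <- c(0.8, 1.0, 0.5, 1.5)      # their standard errors
group.mean <- 1.0                        # assumed population (group) mean effect
group.var  <- 1.0                        # assumed between-patient variance

w.indiv <- 1 / indiv.se^2                # precision of each individual estimate
w.group <- 1 / group.var                 # precision of the group-level information

# Posterior mean: inverse variance-weighted average of individual and group effects
posterior.mean <- (w.indiv * indiv.est + w.group * group.mean) / (w.indiv + w.group)
shrinkage      <- w.group / (w.indiv + w.group)   # weight placed on the group mean

round(cbind(indiv.est, indiv.se, posterior.mean, shrinkage), 3)
```

Patients whose individual estimates are noisier (larger standard errors) are pulled more strongly toward the group mean, which is the behavior the paragraph above describes.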

====Example====

A study involving 8 participants collected data across 30 days, in which 15 treatment days and 15 control days were randomly assigned within each participant<sup>10</sup>. The treatment is represented as a binary variable (control day=0; treatment day=1). The outcome variable represents the response to the intervention within each of the 8 participants. The study employed fixed-effects modeling, creating N − 1 dummy-coded variables representing the N=8 participants, where the last (i=8) participant serves as the reference (i.e., as the model intercept). Each dummy-coded variable thus represents the difference between participant i and the 8th participant, so all other participants' values are relative to those of the 8th (reference) subject. The overall differences across participants in fixed effects can be evaluated with multiple degree-of-freedom F-tests.
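The dummy-coding scheme described above can be sketched in base R. The data frame <code>sim</code> below is simulated purely to mimic the design (8 participants, 30 days, 15 treatment days each); its effect sizes are arbitrary assumptions and it is not the study data.

```r
set.seed(1)
# Simulated stand-in for the design: 8 participants x 30 days,
# with 15 treatment and 15 control days randomly ordered within each participant
sim <- data.frame(ID = factor(rep(1:8, each = 30)),
                  Tx = as.vector(replicate(8, sample(rep(0:1, each = 15)))))
sim$y <- 50 + 5 * as.integer(sim$ID) + 10 * sim$Tx + rnorm(nrow(sim), sd = 10)

# Make participant 8 the reference level, which yields N - 1 = 7 dummy variables
sim$ID <- relevel(sim$ID, ref = "8")
X <- model.matrix(~ Tx + ID, data = sim)
colnames(X)                              # intercept, Tx, and dummies ID1 ... ID7

fit.main <- lm(y ~ Tx + ID, data = sim)  # participant fixed effects only
fit.int  <- lm(y ~ Tx * ID, data = sim)  # adds the Tx-by-participant interaction
anova(fit.main, fit.int)                 # 7-df F-test of treatment-effect heterogeneity
```

The final <code>anova</code> call is one example of a multiple degree-of-freedom F-test: it asks whether the treatment effect differs across participants.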

<center>[[Image:SMHS_Methods11.png|500px]] </center>

<center>

{| class="wikitable" style="text-align:center; " border="1"
|-
|ID||Day||Tx||SelfEff||SelfEff25||WPSS||SocSuppt||PMss||PMss3||PhyAct
|-
|1||1||1||33||8||0.97||5.00||4.03||1.03||53
|-
|1||2||1||33||8||-0.17||3.87||4.03||1.03||73
|-
|1||3||0||33||8||0.81||4.84||4.03||1.03||23
|-
|1||4||0||33||8||-0.41||3.62||4.03||1.03||36
|-
|...||...||...||...||...||...||...||...||...||...

|}
</center> Complete data is available in the Appendix.

<center>Data Summary

{| class="wikitable" style="text-align:center; " border="1"
|-
|Intercept||Constant
|-
|Physical Activity||PhyAct
|-
|Intervention||Tx
|-
|WP Social Support||WPSS
|-
|PM Social Support (1-3)||PMss3
|-
|Self Efficacy||SelfEff25

|}
</center>

rm(list=ls())
Nof1 <-read.table("https://umich.instructure.com/files/330385/download?download_frd=1&verifier=DwJUGSd6t24dvK7uYmzA2aDyzlmsohyaK6P7jK0Q", sep=",", header = TRUE) # 02_Nof1_Data.csv
attach(Nof1)
head(Nof1)

<center>

{| class="wikitable" style="text-align:center; " border="1"
|-
| ||ID||Day||Tx||SelfEff||SelfEff25||WPSS||SocSuppt||PMss||PMss3||PhyAct
|-
|1||1||1||1||33||8||0.97||5.00||4.03||1.03||53
|-
|2||1||2||1||33||8||-0.17||3.87||4.03||1.03||73
|-
|3||1||3||0||33||8||0.81||4.84||4.03||1.03||23
|-
|4||1||4||0||33||8||-0.41||3.62||4.03||1.03||36
|-
|5||1||5||1||33||8||0.59||4.62||4.03||1.03||21
|-
|6||1||6||1||33||8||-1.16||2.87||4.03||1.03||0

|}
</center>

df.1 = data.frame(ID, Day, PhyAct, Tx, WPSS, PMss3, SelfEff, SelfEff25)

# install.packages("lme4")
library("lme4")

lm.1 <- lmer(PhyAct ~ Tx + SelfEff + Tx*SelfEff + (1|Day) + (1|ID) , data= df.1)
summary(lm.1)

Linear mixed model fit by REML ['lmerMod']
Formula: PhyAct ~ Tx + SelfEff + Tx * SelfEff + (1 | Day) + (1 | ID)
Data: df.1

REML criterion at convergence: 8820

<center> Scaled Residuals
{| class="wikitable" style="text-align:center; " border="1"
|-
| Min||1Q||Median||3Q||Max
|-
|-2.7012||-0.6833||-0.0333||0.6542||3.9612
|}
</center>

<center> Random Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| Groups ||Name||Variance ||Std.Dev.
|-
| Day||(Intercept) ||0.0 || 0.00
|-

|ID|| (Intercept)||601.5||24.53
|-

|Residual|| ||969.0 ||31.13
|}
Number of obs: 900, groups: Day, 30; ID, 30
</center>

<center> Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t value
|-
|(Intercept)||38.3772||14.4738||2.651
|-
|Tx||4.0283||6.3745||0.632
|-
|SelfEff||0.5818||0.5942||0.979
|-
|Tx:SelfEff||0.9702||0.2617||3.708
|}
</center>

<center> Correlation of Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||(Intr)||Tx ||SlfEff
|-
| Tx|| -0.220|| ||
|-
| SelfEff||-0.946 ||0.208 ||
|-
| Tx:SelfEff ||0.208 ||-0.946 ||-0.220
|}
</center>

# Model: PhyAct = Tx + WPSS + PMss3 + Tx*WPSS + Tx*PMss3 + SelfEff25 + Tx*SelfEff25 + ε
lm.2 = lm(PhyAct ~ Tx + WPSS + PMss3 + Tx*WPSS + Tx*PMss3 + SelfEff25 + Tx*SelfEff25, df.1)
summary(lm.2)

Call:
lm(formula = PhyAct ~ Tx + WPSS + PMss3 + Tx * WPSS + Tx * PMss3 +
SelfEff25 + Tx * SelfEff25, data = df.1)

<center> Residuals
{| class="wikitable" style="text-align:center; " border="1"
|-
|Min||1Q||Median||3Q||Max
|-
| -102.39||-28.24||-1.47||25.16||122.41

|}
</center>

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t value||$Pr(>|t|)$
|-
|(Intercept)||52.0067||1.8080||28.764||< 2e-16 ***
|-
|Tx||27.7366||2.5569||10.848||< 2e-16 ***
|-
|WPSS||1.9631||2.4272||0.809||0.418853
|-
|PMss3||13.5110||2.7853||4.851||1.45e-06 ***
|-
|SelfEff25||0.6289||0.2205||2.852||0.004439 **
|-
|Tx:WPSS||9.9114||3.4320||2.888||0.003971 **
|-
|Tx:PMss3||8.8422||3.9390||2.245||0.025025 *
|-
|Tx:SelfEff25||1.0460||0.3118||3.354||0.000829 ***

|}
</center>

[Using SAS (StudyI_Analyses.sas, StudyIIab_Analyses.sas)]

<center> Type 3 Tests of Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
|Effect||Num DF||Den DF||F Value||$Pr>F$
|-
|Tx||1||224||67.46||<.0001
|-
|ID||7||224||25.95||<.0001
|-
|Tx*ID||7||224||2.92||0.0060
|}
</center>

==Quantile Treatment Effect (QTE)==

QTE employs quantile regression estimation (QRE) to examine the central tendency and statistical dispersion of the treatment effect in a population. These features may not be revealed by the conventional mean estimation in RCTs. For instance, patients with different comorbidity scores may respond differently to a treatment. Quantile regression has the ability to reveal HTE according to the ranking of patients’ comorbidity scores or some other relevant covariate by which patients may be ranked. Therefore, in an attempt to inform patient-centered care, quantile regression provides more information on the distribution of the treatment effect than typical conditional mean treatment effect estimation. QTE characterizes the heterogeneous treatment effect on individuals and groups across various positions in the distributions of different outcomes of interest. This feature has brought quantile regression analysis substantial attention, and it has been employed across a wide range of applications, particularly in evaluating the economic effects of welfare reform.
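Quantile regression estimates are obtained by minimizing the asymmetric "check" (pinball) loss rather than squared error. The base-R sketch below illustrates this for the simplest case, an intercept-only model, where the minimizer is a sample quantile; the helper <code>check.loss</code> and the toy vector <code>y</code> are hypothetical illustrations, not part of the quantreg package.

```r
# Check (pinball) loss: residuals above/below the fit are weighted by tau and 1 - tau
check.loss <- function(u, tau) u * (tau - (u < 0))

y   <- c(3, 1, 4, 1, 5, 9, 2, 6)         # toy outcome values
tau <- 0.3

# Total loss for each candidate constant; the minimizer is a tau-quantile of y
candidates <- sort(unique(y))
total.loss <- sapply(candidates, function(q) sum(check.loss(y - q, tau)))
candidates[which.min(total.loss)]

# Compare with the inverse-ECDF sample quantile from base R
unname(quantile(y, tau, type = 1))
```

With tau = 0.5 the loss reduces to half the absolute error, so the same argument recovers the median.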

One caveat of applying QRE in clinical trials for examining HTE is that the QTE does not identify the treatment effect for a given patient. Instead, it focuses on the treatment effect among subjects at the qth quantile, such as those who are exactly at the top 10th percentile in terms of blood pressure or a depression score, or of some covariate of interest such as a comorbidity score. It is not uncommon for the subjects at the qth quantile to be two different sets of patients before and after the treatment. For this reason, we have to assume that these two groups of patients are homogeneous if they are in the same quantile.
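To make the distinction between a mean effect and quantile effects concrete, the sketch below computes an unconditional QTE as the difference between treated and control outcome quantiles. The data are simulated under the assumption that treatment both shifts and widens the outcome distribution; the variable names are hypothetical.

```r
set.seed(42)
# Simulated outcomes: treatment shifts the mean and increases the spread
y.control <- rnorm(500, mean = 100, sd = 10)
y.treated <- rnorm(500, mean = 105, sd = 20)

taus <- c(0.10, 0.25, 0.50, 0.75, 0.90)
qte  <- quantile(y.treated, taus) - quantile(y.control, taus)  # QTE at each tau
ate  <- mean(y.treated) - mean(y.control)                      # mean treatment effect

round(qte, 2)
round(ate, 2)
```

In this setup the single mean effect hides the fact that the effect is negative in the lower tail and positive in the upper tail. As noted above, these are differences between quantiles of two groups, not effects for identifiable patients.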

Income-Food Expenditure Example: Let’s examine the Engel data (N=235) on the relationship between food expenditure (foodexp) and household income (income). We can plot the data and then explore the superposition of the six fitted quantile regression lines.

install.packages("quantreg")
library(quantreg)
data(engel)
attach(engel)

<center>head(engel)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Income||Foodexp
|-
|1||420.1577||255.8394
|-
|2||541.4117||310.9587
|-
|3||901.1575||485.6800
|-
|4||639.0802||402.9974
|-
|5||750.8756||495.5608
|-
|6||945.7989||633.7978

|}
</center>

<center>summary(engel)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Income||Foodexp
|-
|Min||377.1||242.3
|-
|1st Qu.||638.9||429.7
|-
|Median||884.0||582.5
|-
|Mean||982.5||624.2
|-
|3rd Qu.||1164.0||743.9
|-
|Max||4957.8||2032.7

|}
</center>

Note: If Y is a real-valued random variable with cumulative distribution function $F_Y(y)=P(Y\leq y)$, then the τ-quantile of Y is given by

<center> $Q_Y(\tau)=F_Y^{-1}(\tau)=\inf\{ y: F_Y(y)\geq\tau\}$, </center>

where $0\leq\tau\leq 1$.
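The definition above can be checked numerically by replacing the distribution function with the empirical CDF. The helper <code>emp.quantile</code> below is a hypothetical illustration written for this note; base R's <code>quantile(..., type = 1)</code> implements the same inverse-ECDF definition.

```r
# Empirical tau-quantile: the smallest observed y with F_n(y) >= tau
emp.quantile <- function(y, tau) {
  ys <- sort(y)
  Fn <- ecdf(y)                          # empirical CDF, F_n(y) = P(Y <= y)
  ys[which(Fn(ys) >= tau)[1]]            # inf{ y : F_n(y) >= tau }
}

y <- c(3, 1, 4, 1, 5, 9, 2, 6)
emp.quantile(y, 0.5)                     # 3
unname(quantile(y, 0.5, type = 1))       # 3, same definition in base R
median(y)                                # 3.5, since median() averages the two middle values
```

The difference between the last two lines is a reminder that software may use interpolating quantile definitions by default.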

<center>[[Image:SMHS_Methods12.png|500px]] </center>

# (1) Graphics
plot(income, foodexp, cex=.25, type="n", xlab="Household Income", ylab="Food Expenditure")
points(income, foodexp, cex=.5, col="blue")

# tau - the quantile(s) to be estimated, in the range from 0 to 1. An object "rq.process" and an object "rqs"
# are returned containing the matrix of coefficient estimates at the specified quantiles.
abline( rq(foodexp ~ income, tau=.5), col="blue") # Quantile Regression Model

abline( lm(foodexp ~ income), lty=2, lwd=3, col="red") # linear model
taus <- c(0.05, 0.1, 0.25, 0.75, 0.90, 0.95)
colors <- rainbow(length(taus))

models <- vector(mode = "list", length = length(taus)) # define a vector of models to store QR for diff taus
model.names <- vector(mode = "list", length = length(taus)) # define a vector model names

for( i in 1:length(taus)){
models[[i]] <- rq(foodexp ~ income, tau=taus[i])
var <- taus[i]
model.names[[i]] <- paste("Model [", i , "]: tau=", var)
abline( models[[i]], lwd=2, col= colors[[i]])
}
legend(3000, 1100, unlist(model.names), col= colors, lty=1, lwd=2, bty='n', cex=.75)

<center>[[Image:SMHS_Methods13.png|500px]] </center>

# (2) Inference about quantile regression coefficients. As an alternative to the rank-inversion confidence intervals, we can obtain a table of coefficients, standard errors, t-statistics, and p-values using the summary function:

summary(models[[3]], se = "nid")

Call: rq(formula = foodexp ~ income, tau = taus[i])

tau: [1] 0.25

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Value||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||95.48354||21.39237||4.46344||0.00001
|-
|Income||0.47410||0.02906||16.31729||0.00000

|}
</center>

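# Equivalently, summary.rq can be called directly. Note that se = "nid" is used again here, so the
# output matches the table above; bootstrapped standard errors would instead require se = "boot".
summary.rq(models[[3]], se = "nid")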

Call: rq(formula = foodexp ~ income, tau = taus[i])
tau: [1] 0.25

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Value||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||95.48354||21.39237||4.46344||0.00001
|-
|Income||0.47410||0.02906||16.31729||0.00000

|}
</center>

==Nonparametric Regression Methods ==

Nonparametric regression offers a flexible way to handle HTE in RCTs. Different nonparametric methods, such as kernel smoothing methods and series methods, can be used to generate test statistics for examining the presence of HTE. A kernel method is a weighting scheme based on a kernel function (e.g., uniform, Gaussian). When estimating the treatment effect for a given patient in an RCT, the kernel method assigns larger weights to observations with similar covariates, on the assumption that patients with similar covariates provide more relevant information about the predicted treatment response. Kernel smoothing can therefore draw on participants with different backgrounds (e.g., demographic, clinical) when estimating a particular subject’s treatment effect, while assigning lower weights to very different subjects. Kernel methods require choosing a set of smoothing parameters (bandwidths) that determine how patients are grouped according to their relative degree of similarity. A drawback is that the resulting test statistics may be sensitive to the chosen bandwidths, which complicates the interpretation of the results. Series methods use approximating functions (splines or power series of the explanatory variables) to construct test statistics. Compared to kernel smoothing methods, series methods normally have the advantage of computational convenience; however, the precision of the test statistics depends on the number of terms selected in the series.
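The kernel weighting scheme can be sketched with a local-constant (Nadaraya-Watson) estimator written in base R. The function <code>nw.estimate</code>, the simulated data, and the two bandwidths are assumptions made for illustration; this is not the np package implementation used in the example that follows.

```r
# Local-constant (Nadaraya-Watson) estimator with a Gaussian kernel
nw.estimate <- function(x0, x, y, h) {
  w <- dnorm((x - x0) / h)               # larger weights for observations near x0
  sum(w * y) / sum(w)                    # kernel-weighted average of the outcomes
}

set.seed(7)
x <- runif(200, 20, 65)                               # simulated covariate (e.g., age)
y <- 13 + sin((x - 20) / 15) + rnorm(200, sd = 0.3)   # simulated nonlinear response

grid <- seq(25, 60, by = 5)
fit.narrow <- sapply(grid, nw.estimate, x = x, y = y, h = 1)    # small bandwidth
fit.wide   <- sapply(grid, nw.estimate, x = x, y = y, h = 20)   # large bandwidth
round(cbind(grid, fit.narrow, fit.wide), 3)
```

Comparing the two fitted columns shows the bandwidth sensitivity mentioned above: the wide bandwidth averages over very dissimilar subjects and flattens the estimated curve.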

Canadian Wage Data Example: Nonparametric regression extends the classical parametric regression (e.g., lm, lmer) involving one continuous dependent variable, y, and (1 or more) continuous explanatory variable(s), x. Let’s start with a popular parametric model of a wage equation that we can extend to a fully nonparametric regression model. First, we will compare and contrast the parametric and nonparametric approach towards univariate regression and then proceed to multivariate regression.

Let’s use the Canadian cross-section wage data (cps71) consisting of a random sample taken from the 1971 Canadian Census for male individuals having common education (High-School). The data include N=205 observations on 2 variables: the logarithm of the individual’s wage (logwage) and their age (age). The classical wage equation model includes a quadratic term of age.

# install.packages("np")
library("np")
data("cps71")

# (1) Linear Model -> R2 = 0.2308
model.lin <- lm( logwage ~ age + I(age^2), data = cps71)
summary(model.lin)

Call:
lm(formula = logwage ~ age + I(age^2), data = cps71)

Residuals:
Min 1Q Median 3Q Max
-2.4041 -0.1711 0.0884 0.3182 1.3940

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||10.0419773||0.4559986||22.022||< 2e-16 ***
|-
|Age||0.1731310||0.0238317|| 7.265||7.96e-12 ***
|-
|I(age^2)||-0.0019771||0.0002898||-6.822||1.02e-10 ***

|}
</center>

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.5608 on 202 degrees of freedom
Multiple R-squared: 0.2308, Adjusted R-squared: 0.2232
F-statistic: 30.3 on 2 and 202 DF, p-value: 3.103e-12

# (2) Next, we consider a nonparametric kernel regression employing cross-validated
# bandwidth selection. Start with computing the least-squares cross-validated bandwidth
# for the local constant estimator (the npregbw default). The summary below reports a
# Local-Constant fit, so the estimator type appears to be taken from the bandwidth object.
# Note that R2 = 0.3108675
bandwidth <- npregbw(formula= logwage ~ age, data = cps71)
model.np <- npreg(bandwidth, regtype = "ll", bwmethod = "cv.aic", gradients = TRUE, data = cps71)
summary(model.np)

Regression Data: 205 training points, in 1 variable(s) age
Bandwidth(s): 1.892157
Kernel Regression Estimator: Local-Constant
Bandwidth Type: Fixed
Residual standard error: 0.5307943
R-squared: 0.3108675
Continuous Kernel Type: Second-Order Gaussian
No. Continuous Explanatory Vars.: 1

# NP model significance may be tested by
npsigtest(model.np)

Kernel Regression Significance Test
Type I Test with IID Bootstrap (399 replications, Pivot=TRUE, joint=FALSE)
Explanatory variables tested for significance: age (1)

age
Bandwidth(s): 1.892157

Individual Significance Tests
P Value:
age < 2.22e-16 ***

# So, as was the case for the linear parametric model, age is significant in the nonparametric (kernel regression) model

# (3) Graphical comparison of parametric and nonparametric models.
plot(cps71$age, cps71$logwage, xlab = "age", ylab = "log(wage)", cex=.1)
lines(cps71$age, fitted(model.lin), lty = 2, col = "red")
lines(cps71$age, fitted(model.np), lty = 1, col = "blue")
legend("topright", c("Data", "Linear", "Non-linear"), col=c("Black", "Red", "Blue"), pch = c(1, 1, 1), bty='n', cex=.75)

<center>[[Image:SMHS_Methods14.png|500px]] </center>

# some additional plots presenting the parametric (quadratic, dashed line) and the nonparametric estimates
# (solid line) of the regression function for the cps71 data.
plot(model.np, plot.errors.method = "asymptotic")
plot(model.np, gradients = TRUE)
lines(cps71$age, coef(model.lin)[2]+2*cps71$age*coef(model.lin)[3], lty = 2, col = "red")
plot(model.np, gradients = TRUE, plot.errors.method = "asymptotic")

# (4) Use the linear and nonparametric models to generate predictions, based on the bandwidths
# obtained above and the estimated nonparametric model. We need to create a set of explanatory
# variables for which to generate predictions. These can be part of the original dataset or be
# outside its scope. Typically, we don’t have the outcome for the evaluation data and need only
# provide the explanatory variables for which predicted values are generated by the models.
# Occasionally, splitting the dataset into two independent samples (training/testing) allows estimation
# of a model on one sample and evaluation of its performance on another.

cps.eval.data <- data.frame(age = seq(10,70, by=10)) # simulate some explanatory X values (ages)
pred.lin <- predict(model.lin, newdata = cps.eval.data) # Linear Prediction of log(Wage)
pred.np <- predict(model.np, newdata = cps.eval.data) # non-Linear Prediction of log(Wage)
plot(pred.lin, pred.np)
abline(lm(pred.np ~ pred.lin))

<center>[[Image:SMHS_Methods15.png|500px]] </center>

.
.
.

==Predictive risk models ==

Predictive risk models represent a class of methods for identifying potential HTE when the individual patient's baseline risk for disease-related events depends on observed factors. Common risk measures include disease staging criteria (such as those used in COPD or heart failure), Framingham risk scores for cardiovascular event risk, and genetic variations (e.g., HER2 for breast cancer). Initial predictive risk modeling, also known as risk function estimation, is often performed without accounting for treatment effects. Least squares or Cox proportional hazards regression methods are appropriate in many cases and provide relatively interpretable risk functions, but they rely on linearity assumptions and may not provide optimal predictive metrics. Partial least squares is an extension of least squares methods that can reduce the dimensionality of the predictor space by interposing latent variables, predicted by linear combinations of observable characteristics, as the intermediate predictors of one or more outcomes. Recursive partitioning methods (such as random forests), support vector machines, and neural networks are more recent methods that may offer better predictive power than linear methods. Risk function estimation can range from highly exploratory analyses to near meta-analytic model validation, and may be useful at any stage of product development.

HIV Example: The “hmohiv” dataset represents a study of HIV positive patients examining whether there was a difference in survival times of HIV positive patients between a cohort using intravenous drugs (drug=1) and a cohort not using the IV drug (drug=0). The hmohiv data includes the following variables:

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
|ID||Time||Age||Drug||Censor||Entdate||Enddate
|-
|1||5||46||0||1||5/15/1990||10/14/1990
|-
|2||6||35||1||0||9/19/1989||3/20/1990
|-
|3||8||30||1||1||4/21/1991||12/20/1991
|-
|4||3||30||1||1||1/3/1991||4/4/1991
|-
|5||22||36||0||1||9/18/1989||7/19/1991
|-
|6||1||32||1||0||3/18/1991||4/17/1991
|-
|...||...||...||...||...||...||...

|}
</center>

#cleaning up environment
rm(list=ls())

# load survival library
library(survival)

# load hmohiv data
hmohiv<-read.table("http://www.ats.ucla.edu/stat/r/examples/asa/hmohiv.csv", sep=",", header = TRUE)
attach(hmohiv)

# construct a data frame of the 2 cohorts: no-IV-drug (drug=0) and IV-drug (drug=1)
drug.new<-data.frame(drug=c(0,1))

# Fit Cox proportional hazards regression model
cox.model <- coxph( Surv(time, censor) ~ drug, method="breslow")
# estimate the survival curves of the 2 cohorts (drug.new must be defined before this call)
fit.1 <- survfit(cox.model, newdata=drug.new)

# plot results
plot(fit.1, xlab="Survival Time (Months)", ylab="Survival Probability")
points(fit.1$\$$time, fit.1$\$$surv[,1], pch=1)
points(fit.1$\$$time, fit.1$\$$surv[,2], pch=2)
legend(40, .8, c("Drug Absent", "Drug Present"), pch=c(1,2))

<center>[[Image:SMHS_Methods16.png|500px]] </center>

# to inspect the resulting Cox Proportional Hazards Model
cox.model
Call:
coxph(formula = Surv(time, censor) ~ drug, method = "breslow")

coef exp(coef) se(coef) z p
drug 0.779 2.18 0.242 3.22 0.0013

Likelihood ratio test=10.2 on 1 df, p=0.00141 n= 100, number of events= 80

===Footnotes===

8 http://onlinelibrary.wiley.com/enhanced/doi/10.1002/jrsm.54
9 http://effectivehealthcare.ahrq.gov/search-for-guides-reviews-and-reports/?pageaction=displayproduct&productID=1857
10 http://jpepsy.oxfordjournals.org/content/39/2/138.full#sec-14

==[[SMHS_MethodsHeterogeneity_CER|Next see: Comparative Effectiveness Research (CER)]]==

*[[SMHS_MethodsHeterogeneity|Back to the Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research section]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php/SMHS_MethodsHeterogeneity_MetaAnalysis}}

SMHS MethodsHeterogeneity MetaAnalysis

2016-05-23T18:47:39Z

Pineaumi: /* Series of “N of 1” trials */

==[[SMHS_MethodsHeterogeneity| Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research]] - Meta-Analyses ==

==Meta-analysis==

===Overview===

Meta-analysis is an approach to combine treatment effects across trials or studies into an aggregated treatment effect with higher statistical power than that of any individual trial. It may detect HTE by testing for differences in treatment effects across similar RCTs. It requires that the individual treatment effects are similar enough to ensure pooling is meaningful. In the presence of large clinical or methodological differences between the trials, it may be best to avoid meta-analyses. The presence of HTE across studies in a meta-analysis may be due to differences in the design or execution of the individual trials (e.g., randomization methods, patient selection criteria). Cochran's Q is a method for detecting heterogeneity, which is computed as the weighted sum of squared differences between each study's treatment effect and the pooled effect across the studies. It is a barometer of inter-trial differences impacting the observed study results. A possible source of error in a meta-analysis is publication bias. Trial size may introduce publication bias since larger trials are more likely to be published. Language and accessibility represent other potential confounding factors. When the heterogeneity is not due to poor study design, it may be useful to optimize the treatment benefits for different cohorts of participants.

Cochran's Q statistic is the weighted sum of squares on a standardized scale<sup>8</sup>. The corresponding p-value indicates the strength of the evidence for the presence of heterogeneity. This test may have low power to detect heterogeneity (particularly when few studies are included), so a cut-off of 0.10 is suggested for significance (Higgins et al., 2003). Conversely, the Q statistic may have too much power as a test of heterogeneity when the number of studies is large.
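The computation of Cochran's Q described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the study-level log odds ratios and standard errors are made-up values, inverse-variance weights are assumed, and the ''I''2 line uses the usual definition max(0, (Q − df)/Q).

```r
# Hypothetical study-level log odds ratios and standard errors (illustrative values only)
log.or <- c(0.10, 0.30, 0.35, 0.65, 0.45)
se     <- c(0.20, 0.25, 0.15, 0.30, 0.22)

w      <- 1 / se^2                               # inverse-variance weights (assumed weighting scheme)
pooled <- sum(w * log.or) / sum(w)               # fixed-effect pooled estimate
Q      <- sum(w * (log.or - pooled)^2)           # Cochran's Q: weighted sum of squared deviations
df     <- length(log.or) - 1                     # degrees of freedom = number of studies - 1
p.val  <- pchisq(Q, df = df, lower.tail = FALSE) # p-value from the chi-square reference distribution
I2     <- max(0, (Q - df) / Q) * 100             # I^2 in percent

round(c(pooled = pooled, Q = Q, df = df, p.value = p.val, I2 = I2), 4)
```

A small p-value (for example below the 0.10 cut-off mentioned above) would suggest heterogeneity, whereas Q close to or below its degrees of freedom gives ''I''2 near zero.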

===Simulation Example 1===

# Install and Load library
install.packages("meta")
library(meta)

# Set number of studies
n.studies = 15

# number of treatments: case1, case2, control
n.trt = 3

# number of outcomes
n.event = 2

# simulate the (balanced) number of cases (case1 and case2) and controls in each study
ctl.group = rbinom(n = n.studies, size = 200, prob = 0.3)
case1.group = rbinom(n = n.studies, size = 200, prob = 0.3)
case2.group = rbinom(n = n.studies, size = 200, prob = 0.3)

# Simulate the number of outcome events (e.g., deaths) and no events in the control group
event.ctl.group = rbinom(n = n.studies, size = ctl.group, prob = rep(0.1, length(ctl.group)))
noevent.ctl.group = ctl.group - event.ctl.group

# Simulate the number of events and no events in the case1 group
event.case1.group = rbinom(n = n.studies, size = case1.group, prob = rep(0.5, length(case1.group)))
noevent.case1.group = case1.group - event.case1.group

# Simulate the number of events and no events in the case2 group
event.case2.group = rbinom(n = n.studies, size = case2.group, prob = rep(0.6, length(case2.group)))
noevent.case2.group = case2.group - event.case2.group

# Run the univariate meta-analysis using metabin(), Meta-analysis of binary outcome data –
# Calculation of fixed and random effects estimates (risk ratio, odds ratio, risk difference or arcsine
# difference) for meta-analyses with binary outcome data. Mantel-Haenszel (MH),
# inverse variance and Peto method are available for pooling.

# method = A character string indicating which method is to be used for pooling of studies,
# one of "MH", "Inverse", or "Peto"
# sm = A character string indicating which summary measure ("OR", "RR", or "RD" = risk difference) is to be
# used for pooling of studies

# Control vs. Case1, n.e and n.c are numbers in experimental and control groups
meta.ctr_case1 <- metabin(event.e = event.case1.group, n.e = case1.group, event.c = event.ctl.group,
n.c = ctl.group, method = "MH", sm = "OR")
# in this case we use Odds Ratio, of the odds of death in the experimental and control studies
forest(meta.ctr_case1)

<center>[[Image:SMHS_Methods8.png|500px]] </center>

# Control vs. Case2
meta.ctr_case2 <- metabin(event.e = event.case2.group, n.e = case2.group, event.c = event.ctl.group,
n.c = ctl.group, method = "MH", sm = "OR")
forest(meta.ctr_case2)

<center>[[Image:SMHS_Methods9.png|500px]] </center>

# Case1 vs. Case2
meta.case1_case2 <- metabin(event.e = event.case1.group, n.e = case1.group, event.c = event.case2.group,
n.c = case2.group, method = "MH", sm = "OR")
forest(meta.case1_case2)
summary(meta.case1_case2)

Test of heterogeneity:
Q d.f. p-value
11.99 14 0.6071

<center>[[Image:SMHS_Methods10.png|500px]] </center>

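The forest plot reports the ''I''<sup>2</sup> statistic and the test of heterogeneity. For the Case1 vs. Case2 comparison, Q = 11.99 on 14 degrees of freedom (p = 0.6071), so there is no evidence to reject the null hypothesis of no between-study heterogeneity, and the fixed effects model may be used.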

==Series of “N of 1” trials==

This technique combines (a “series of”) n-of-1 trial data to identify HTE. An n-of-1 trial is a repeated crossover trial for a single patient, which randomly assigns the patient to one treatment vs. another for a given time period, after which the patient is re-randomized to treatment for the next time period; this is usually repeated for 4-6 time periods. Such trials are most feasible in chronic conditions, where little or no washout period is needed between treatments and treatment effects are identifiable in the short term, for example using pain or reliable surrogate markers. Combining data from identical n-of-1 trials across a set of patients enables statistical analysis controlling for patient fixed or random effects, covariates, centers, or sequence effects (see the figure below). These combined trials are often analyzed within a Bayesian context using shrinkage estimators that combine individual and group mean treatment effects to create a “posterior” individual mean treatment effect estimate, which is a form of inverse variance-weighted average of the individual and group effects. Such trials are typically more expensive than standard RCTs on a per-patient basis; however, they require much smaller sample sizes, often fewer than 100 patients (due to the efficient individual-as-own-control design), and create individual treatment effect estimates that are not possible in a non-crossover design<sup>9</sup>. For the individual patient, the treatment effect can be re-estimated after each time period, and the trial can be stopped at any point when the more effective treatment is identified with reasonable statistical certainty.
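The shrinkage idea mentioned above can be illustrated with a minimal base R sketch. It assumes a simple normal-normal model in which the between-patient variance is treated as known; the individual effect estimates, their sampling variances, and the value of tau2 are all hypothetical.

```r
# Hypothetical individual treatment-effect estimates and their sampling variances
ind.effect <- c(2.0, -0.5, 4.0, 1.0)
ind.var    <- c(1.0, 0.5, 2.0, 0.8)

# Assumed (known) between-patient variance; in practice it would be estimated from the data
tau2 <- 1.5

# Group mean effect, weighting each patient by 1 / (sampling variance + between-patient variance)
w.group    <- 1 / (ind.var + tau2)
group.mean <- sum(w.group * ind.effect) / sum(w.group)

# "Posterior" individual effect: inverse variance-weighted average of individual and group effects
w.ind     <- 1 / ind.var
w.prior   <- 1 / tau2
posterior <- (w.ind * ind.effect + w.prior * group.mean) / (w.ind + w.prior)

# shrinkage = weight given to the group mean for each patient
cbind(ind.effect, posterior, shrinkage = w.prior / (w.ind + w.prior))
```

Patients whose individual estimates are noisier (larger sampling variance) are pulled more strongly toward the group mean.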

====Example====

A study involving 8 participants collected data across 30 days, in which 15 treatment days and 15 control days were randomly assigned within each participant<sup>10</sup>. The treatment is represented as a binary variable (control day=0; treatment day=1). The outcome variable represents the response to the intervention within each of the 8 participants. The study employed fixed-effects modeling, creating N − 1 dummy-coded variables representing the N=8 participants, where the last (i=8) participant serves as the reference (i.e., as the model intercept). So, each dummy-coded variable represents the difference between participant i and the 8th participant, and all other participants' values are relative to the values of the 8th (reference) subject. The overall differences across participants in fixed effects can be evaluated with multiple degree-of-freedom F-tests.
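The dummy-coding scheme described above can be sketched with simulated data. This is not the study data: the outcome, effect sizes, and seed below are invented for illustration, and only the design (8 participants, 30 days, 15 treatment and 15 control days each, participant 8 as reference) follows the text.

```r
set.seed(1)                                   # arbitrary seed for reproducibility
n.id <- 8; n.day <- 30

# 15 treatment days and 15 control days randomly ordered within each participant
sim <- data.frame(ID = factor(rep(1:n.id, each = n.day)),
                  Tx = as.vector(replicate(n.id, sample(rep(0:1, each = n.day / 2)))))

# Simulated outcome: participant-specific baseline + common treatment effect + noise
baseline <- rnorm(n.id, mean = 50, sd = 10)
sim[["y"]] <- baseline[as.integer(sim[["ID"]])] + 5 * sim[["Tx"]] + rnorm(nrow(sim), sd = 8)

# Participant 8 becomes the reference level, so lm() creates 7 dummy variables contrasted against it
sim[["ID"]] <- relevel(sim[["ID"]], ref = "8")

fit.main <- lm(y ~ Tx + ID, data = sim)       # fixed participant effects
fit.int  <- lm(y ~ Tx * ID, data = sim)       # adds participant-specific treatment effects
anova(fit.main, fit.int)                      # 7 degree-of-freedom F-test of the Tx-by-ID interaction
```

The interaction model uses 16 coefficients on 240 observations, which leaves 224 residual degrees of freedom.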

<center>[[Image:SMHS_Methods11.png|500px]] </center>

<center>

{| class="wikitable" style="text-align:center; " border="1"
|-
|ID||Day||Tx||SelfEff||SelfEff25||WPSS||SocSuppt||PMss||PMss3||PhyAct
|-
|1||1||1||33||8||0.97||5.00||4.03||1.03||53
|-
|1||2||1||33||8||-0.17||3.87||4.03||1.03||73
|-
|1||3||0||33||8||0.81||4.84||4.03||1.03||23
|-
|1||4||0||33||8||-0.41||3.62||4.03||1.03||36
|-
|...||...||...||...||...||...||...||...||...||...

|}
</center> Complete data is available in the Appendix.

<center>Data Summary

{| class="wikitable" style="text-align:center; " border="1"
|-
|Intercept||Constant
|-
|Physical Activity||PhyAct
|-
|Intervention||Tx
|-
|WP Social Support||WPSS
|-
|PM Social Support (1-3)||PMss3
|-
|Self Efficacy||SelfEff25

|}
</center>

rm(list=ls())
Nof1 <-read.table("https://umich.instructure.com/files/330385/download?download_frd=1&verifier=DwJUGSd6t24dvK7uYmzA2aDyzlmsohyaK6P7jK0Q", sep=",", header = TRUE) # 02_Nof1_Data.csv
attach(Nof1)
head(Nof1)

<center>

{| class="wikitable" style="text-align:center; " border="1"
|-
| ||ID||Day||Tx||SelfEff||SelfEff25||WPSS||SocSuppt||PMss||PMss3||PhyAct
|-
|1||1||1||1||33||8||0.97||5.00||4.03||1.03||53
|-
|2||1||2||1||33||8||-0.17||3.87||4.03||1.03||73
|-
|3||1||3||0||33||8||0.81||4.84||4.03||1.03||23
|-
|4||1||4||0||33||8||-0.41||3.62||4.03||1.03||36
|-
|5||1||5||1||33||8||0.59||4.62||4.03||1.03||21
|-
|6||1||6||1||33||8||-1.16||2.87||4.03||1.03||0

|}
</center>

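df.1 = data.frame(ID, Day, PhyAct, Tx, WPSS, PMss3, SelfEff, SelfEff25) # include every variable used in the models below

# install.packages("lme4")
library("lme4") # provides lmer()

# linear mixed model with random intercepts for Day and ID
lm.1 <- lmer(PhyAct ~ Tx + SelfEff + Tx*SelfEff + (1|Day) + (1|ID) , data= df.1)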
summary(lm.1)

Linear mixed model fit by REML ['lmerMod']
Formula: PhyAct ~ Tx + SelfEff + Tx * SelfEff + (1 | Day) + (1 | ID)
Data: df.1

REML criterion at convergence: 8820

<center> Scaled Residuals
{| class="wikitable" style="text-align:center; " border="1"
|-
| Min||1Q||Median||3Q||Max
|-
|-2.7012||-0.6833||-0.0333||0.6542||3.9612
|}
</center>

<center> Random Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| Groups ||Name||Variance ||Std.Dev.
|-
| Day||(Intercept) ||0.0 || 0.00
|-

|ID|| (Intercept)||601.5||24.53
|-

|Residual|| ||969.0 ||31.13
|}
Number of obs: 900, groups: Day, 30; ID, 30
</center>

<center> Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t value
|-
|(Intercept)||38.3772||14.4738||2.651
|-
|Tx||4.0283||6.3745||0.632
|-
|SelfEff||0.5818||0.5942||0.979
|-
|Tx:SelfEff||0.9702||0.2617||3.708
|}
</center>

<center> Correlation of Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||(Intr)||Tx ||SlfEff
|-
| Tx|| -0.220|| ||
|-
| SelfEff||-0.946 ||0.208 ||
|-
| Tx:SelfEff ||0.208 ||-0.946 ||-0.220
|}
</center>

# Model: PhyAct = Tx + WPSS + PMss3 + Tx*WPSS + Tx*PMss3 + SelfEff25 + Tx*SelfEff25 + ε
lm.2 = lm(PhyAct ~ Tx + WPSS + PMss3 + Tx*WPSS + Tx*PMss3 + SelfEff25 + Tx*SelfEff25, df.1)
summary(lm.2)

Call:
lm(formula = PhyAct ~ Tx + WPSS + PMss3 + Tx * WPSS + Tx * PMss3 +
SelfEff25 + Tx * SelfEff25, data = df.1)

<center> Residuals
{| class="wikitable" style="text-align:center; " border="1"
|-
|Min||1Q||Median||3Q||Max
|-
| -102.39||-28.24||-1.47||25.16||122.41

|}
</center>

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t value||$Pr(>|t|)$
|-
|(Intercept)||52.0067||1.8080||28.764||< 2e-16 ***
|-
|Tx||27.7366||2.5569||10.848||< 2e-16 ***
|-
|WPSS||1.9631||2.4272||0.809||0.418853
|-
|PMss3||13.5110||2.7853||4.851||1.45e-06 ***
|-
|SelfEff25||0.6289||0.2205||2.852||0.004439 **
|-
|Tx:WPSS||9.9114||3.4320||2.888||0.003971 **
|-
|Tx:PMss3||8.8422||3.9390||2.245||0.025025 *
|-
|Tx:SelfEff25||1.0460||0.3118||3.354||0.000829 ***

|}
</center>

[Using SAS (StudyI_Analyses.sas, StudyIIab_Analyses.sas)]

<center> Type 3 Tests of Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
|Effect||Num DF||Den DF||F Value||$Pr>F$
|-
|Tx||1||224||67.46||<.0001
|-
|ID||7||224||25.95||<.0001
|-
|Tx*ID||7||224||2.92||0.0060
|}
</center>

==Quantile Treatment Effect (QTE)==

QTE employs quantile regression estimation (QRE) to examine the central tendency and statistical dispersion of the treatment effect in a population. These may not be revealed by the conventional mean estimation in RCTs. For instance, patients with different comorbidity scores may respond differently to a treatment. Quantile regression has the ability to reveal HTE according to the ranking of patients’ comorbidity scores or some other relevant covariate by which patients may be ranked. Therefore, in an attempt to inform patient-centered care, quantile regression provides more information on the distribution of the treatment effect than typical conditional mean treatment effect estimation. QTE characterizes the heterogeneous treatment effect on individuals and groups across various positions in the distributions of different outcomes of interest. This unique feature has brought quantile regression analysis substantial attention, and it has been employed across a wide range of applications, particularly when evaluating the economic effects of welfare reform.

One caveat of applying QRE in clinical trials for examining HTE is that the QTE does not describe the treatment effect for a given patient. Instead, it focuses on the treatment effect among subjects within the qth quantile of some outcome or covariate of interest, such as those exactly at the top 10th percentile of blood pressure, a depression score, or a comorbidity score. It is not uncommon for the qth quantiles to comprise two different sets of patients before and after the treatment. For this reason, we have to assume that these two groups of patients are homogeneous when they fall in the same quantile.
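A small simulation can illustrate why quantile effects may reveal what the mean effect hides. The sketch below is hypothetical: both arms are simulated with the same mean but different spread, and the quantile treatment effects are computed simply as differences between the arm-specific sample quantiles (no covariates).

```r
set.seed(3)                                   # arbitrary seed
control <- rnorm(5000, mean = 0, sd = 1)      # hypothetical control-arm outcomes
treated <- rnorm(5000, mean = 0, sd = 2)      # hypothetical treated-arm outcomes: same mean, larger spread

taus <- c(0.10, 0.25, 0.50, 0.75, 0.90)
qte  <- quantile(treated, taus) - quantile(control, taus)  # unconditional quantile treatment effects
ate  <- mean(treated) - mean(control)                      # average treatment effect

round(c(ATE = ate, qte), 3)
```

In this setup the average effect is close to zero while the effects in the lower and upper tails have opposite signs. As noted above, these are statements about quantiles of the two distributions, not about what happens to any particular patient.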

Income-Food Expenditure Example: Let’s examine the Engel data (N=235) on the relationship between food expenditure (foodexp) and household income (income). We can plot the data and then explore the superposition of the six fitted quantile regression lines.

install.packages("quantreg")
library(quantreg)
data(engel)
attach(engel)

<center>head(engel)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Income||Foodexp
|-
|1||420.1577||255.8394
|-
|2||541.4117||310.9587
|-
|3||901.1575||485.6800
|-
|4||639.0802||402.9974
|-
|5||750.8756||495.5608
|-
|6||945.7989||633.7978

|}
</center>

<center>summary(engel)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Income||Foodexp
|-
|Min||377.1||242.3
|-
|1st Qu.||638.9||429.7
|-
|Median||884.0||582.5
|-
|Mean||982.5||624.2
|-
|3rd Qu.||1164.0||743.9
|-
|Max||4957.8||2032.7

|}
</center>

Note: If Y is a real-valued random variable with cumulative distribution function $F_Y(y)=P(Y\le y)$, then the τ-quantile of Y is given by

<center> $Q_Y(\tau)=F_Y^{-1}(\tau)=\inf\{ y: F_Y(y)\ge\tau\}$ </center>

where $0\le\tau\le 1$.
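The quantile definition above can be checked numerically on a small sample. In this sketch the data vector is arbitrary, the empirical CDF stands in for the distribution function, and the result is compared with quantile(type = 1), which inverts the empirical CDF.

```r
y   <- c(3, 1, 4, 1, 5, 9, 2, 6)          # small illustrative sample
tau <- 0.5

Fy <- ecdf(y)                              # empirical cumulative distribution function
ys <- sort(unique(y))
q.inf <- min(ys[Fy(ys) >= tau])            # smallest y whose CDF value is at least tau

# quantile(type = 1) is the inverse of the empirical CDF
q.type1 <- unname(quantile(y, probs = tau, type = 1))
c(q.inf = q.inf, q.type1 = q.type1)
```

Note that the default quantile() type in R interpolates between order statistics, so it need not coincide with this definition on small samples.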

<center>[[Image:SMHS_Methods12.png|500px]] </center>

# (1) Graphics
plot(income, foodexp, cex=.25, type="n", xlab="Household Income", ylab="Food Expenditure")
points(income, foodexp, cex=.5, col="blue")

# tau - the quantile(s) to be estimated, in the range from 0 to 1. An object "rq.process" and an object "rqs"
# are returned containing the matrix of coefficient estimates at the specified quantiles.
abline( rq(foodexp ~ income, tau=.5), col="blue") # Quantile Regression Model

abline( lm(foodexp ~ income), lty=2, lwd=3, col="red") # linear model
taus <- c(0.05, 0.1, 0.25, 0.75, 0.90, 0.95)
colors <- rainbow(length(taus))

models <- vector(mode = "list", length = length(taus)) # define a vector of models to store QR for diff taus
model.names <- vector(mode = "list", length = length(taus)) # define a vector model names

for( i in 1:length(taus)){
models[[i]] <- rq(foodexp ~ income, tau=taus[i])
var <- taus[i]
model.names[[i]] <- paste("Model [", i , "]: tau=", var)
abline( models[[i]], lwd=2, col= colors[[i]])
}
legend(3000, 1100, unlist(model.names), col= colors, lty=1, lwd=2, bty='n', cex=.75) # line legend matching the fitted quantile lines

<center>[[Image:SMHS_Methods13.png|500px]] </center>

# (2) Inference about quantile regression coefficients. As an alternative to the rank-inversion confidence intervals, we can obtain a table of coefficients, standard errors, t-statistics, and p-values using the summary function:

summary(models[[3]], se = "nid")

Call: rq(formula = foodexp ~ income, tau = taus[i])

tau: [1] 0.25

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Value||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||95.48354||21.39237||4.46344||0.00001
|-
|Income||0.47410||0.02906||16.31729||0.00000

|}
</center>

# Equivalently, we can call summary.rq() directly; specifying se = "boot" instead would compute bootstrapped standard errors.
summary.rq(models[[3]], se = "nid")

Call: rq(formula = foodexp ~ income, tau = taus[i])
tau: [1] 0.25

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Value||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||95.48354||21.39237||4.46344||0.00001
|-
|Income||0.47410||0.02906||16.31729||0.00000

|}
</center>

==Nonparametric Regression Methods ==

Nonparametric regression enables dealing with HTE in RCTs. Different nonparametric methods, such as kernel smoothing methods and series methods, can be used to generate test statistics for examining the presence of HTE. A kernel method is a weighting scheme based on a kernel function (e.g. uniform, Gaussian). When evaluating the treatment effect of a patient in RCTs, the kernel method assigns larger weights to those observations with similar covariates. This is done because it is assumed that patients with similar covariates provide more relevant data on predicted treatment response. Examining participants that have different backgrounds (e.g., demographic, clinical), kernel smoothing methods utilize information from highly divergent participants when estimating a particular subject’s treatment effect. Lower weights are assigned to very different subjects and the kernel methods require choosing a set of smoothing parameters to group patients according to their relative degree of similarities. A drawback is that the corresponding proposed test statistics may be sensitive to the chosen bandwidths, which inhibits the interpretation of the results. Series methods use approximating functions (splines or power series of the explanatory variables) to construct test statistics. Compared to kernel smoothing methods, series methods normally have the advantage of computational convenience; however, the precision of test statistics depends on the number of terms selected in the series.
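The kernel weighting scheme described above can be sketched with a local-constant (Nadaraya-Watson) estimator written in base R. The data are simulated, the helper nw.estimate() is a hypothetical name introduced only for this illustration, and the two bandwidths are arbitrary choices meant to show the sensitivity to the smoothing parameter.

```r
# Local-constant (Nadaraya-Watson) estimate at x0 using a Gaussian kernel with bandwidth h
nw.estimate <- function(x0, x, y, h) {
  w <- dnorm((x - x0) / h)     # larger weights for observations whose covariate is close to x0
  sum(w * y) / sum(w)          # kernel-weighted average of the outcomes
}

set.seed(2)                                        # arbitrary seed
x <- runif(200, min = 20, max = 65)                # hypothetical covariate (e.g., age)
y <- sin(x / 10) + rnorm(200, sd = 0.3)            # hypothetical nonlinear response

grid <- seq(25, 60, by = 5)
# The same evaluation points estimated under a narrow and a wide bandwidth
fit.narrow <- sapply(grid, nw.estimate, x = x, y = y, h = 1)
fit.wide   <- sapply(grid, nw.estimate, x = x, y = y, h = 20)
round(cbind(grid, fit.narrow, fit.wide), 3)
```

The narrow bandwidth follows local structure closely, while the wide bandwidth averages over dissimilar observations and flattens the curve, which is why bandwidth selection (e.g., by cross-validation) matters.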

Canadian Wage Data Example: Nonparametric regression extends the classical parametric regression (e.g., lm, lmer) involving one continuous dependent variable, y, and (1 or more) continuous explanatory variable(s), x. Let’s start with a popular parametric model of a wage equation that we can extend to a fully nonparametric regression model. First, we will compare and contrast the parametric and nonparametric approach towards univariate regression and then proceed to multivariate regression.

Let’s use the Canadian cross-section wage data (cps71) consisting of a random sample taken from the 1971 Canadian Census for male individuals having common education (High-School). The data include N=205 observations on 2 variables: the logarithm of the individual’s wage (logwage) and their age (age). The classical wage equation model includes a quadratic term of age.

# install.packages("np")
library("np")
data("cps71")

# (1) Linear Model -> R2 = 0.2308
model.lin <- lm( logwage ~ age + I(age^2), data = cps71)
summary(model.lin)

Call:
lm(formula = logwage ~ age + I(age^2), data = cps71)

Residuals:
Min 1Q Median 3Q Max
-2.4041 -0.1711 0.0884 0.3182 1.3940

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||10.0419773||0.4559986||22.022||< 2e-16 ***
|-
|Age||0.1731310||0.0238317|| 7.265||7.96e-12 ***
|-
|I(age^2)||-0.0019771||0.0002898||-6.822||1.02e-10 ***

|}
</center>

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.5608 on 202 degrees of freedom
Multiple R-squared: 0.2308, Adjusted R-squared: 0.2232
F-statistic: 30.3 on 2 and 202 DF, p-value: 3.103e-12

# (2) Next, we consider the local linear nonparametric method employing cross-validated
# bandwidth selection and estimation in one step. Start with computing the least-squares
# cross-validated bandwidths for the local constant estimator (default).
# Note that R2 = 0.3108675
bandwidth <- npregbw(formula= logwage ~ age, data = cps71)
model.np <- npreg(bandwidth, regtype = "ll", bwmethod = "cv.aic", gradients = TRUE, data = cps71)
summary(model.np)

Regression Data: 205 training points, in 1 variable(s) age
Bandwidth(s): 1.892157
Kernel Regression Estimator: Local-Constant
Bandwidth Type: Fixed
Residual standard error: 0.5307943
R-squared: 0.3108675
Continuous Kernel Type: Second-Order Gaussian
No. Continuous Explanatory Vars.: 1

# NP model significance may be tested by
npsigtest(model.np)

Kernel Regression Significance Test
Type I Test with IID Bootstrap (399 replications, Pivot=TRUE, joint=FALSE)
Explanatory variables tested for significance: age (1)

age
Bandwidth(s): 1.892157

Individual Significance Tests
P Value:
age < 2.22e-16 ***

# So, as was the case for the parametric model, age is significant in the NP-model (reported above as a Local-Constant estimator)

# (3) Graphical comparison of parametric and nonparametric models.
plot(cps71$\$$age, cps71$\$$logwage, xlab = "age", ylab = "log(wage)", cex=.1)
lines(cps71$\$$age, fitted(model.lin), lty = 2, col = "red")
lines(cps71$\$$age, fitted(model.np), lty = 1, col = "blue")
legend("topright", c("Data", "Linear", "Non-linear"), col=c("Black", "Red", "Blue"), pch = c(1, 1, 1), bty='n', cex=.75)

<center>[[Image:SMHS_Methods14.png|500px]] </center>

# some additional plots presenting the parametric (quadratic, dashed line) and the nonparametric estimates
# (solid line) of the regression function for the cps71 data.
plot(model.np, plot.errors.method = "asymptotic")
plot(model.np, gradients = TRUE)
lines(cps71$\$$age, coef(model.lin)[2]+2*cps71$\$$age*coef(model.lin)[3], lty = 2, col = "red")
plot(model.np, gradients = TRUE, plot.errors.method = "asymptotic")

# (4) Use the linear (Lin) and nonparametric (NP) models to generate predictions, based on the
# selected bandwidths and the estimated nonparametric model. We need to create a set of explanatory
# variables for which to generate predictions. These can be part of the original dataset or be
# outside its scope. Typically, we don’t have the outcome for the evaluation data and need only
# provide the explanatory variables for which predicted values are generated by the models.
# Occasionally, splitting the dataset into two independent samples (training/testing), allows estimation
# of a model on one sample, and evaluation of its performance on another.

cps.eval.data <- data.frame(age = seq(10,70, by=10)) # simulate some explanatory X values (ages)
pred.lin <- predict(model.lin, newdata = cps.eval.data) # Linear Prediction of log(Wage)
pred.np <- predict(model.np, newdata = cps.eval.data) # non-Linear Prediction of log(Wage)
plot(pred.lin, pred.np)
abline(lm(pred.np ~ pred.lin))

<center>[[Image:SMHS_Methods15.png|500px]] </center>

.
.
.

==Predictive risk models ==

Predictive risk models represent a class of methods for identifying potential for HTE when the individual patient risk for disease-related events at baseline depends on observed factors. For instance, common measures are disease staging criteria, such as those used in COPD or heart failure, Framingham risk scores for cardiovascular event risk, or genetic variations, e.g., HER2 for breast cancer. Initial predictive risk modeling, also known as risk function estimation, is often performed without accounting for treatment effects. Least squares or Cox proportional hazards regression methods are appropriate in many cases and provide relatively more interpretable risk functions, but they rely on linearity assumptions and may not provide optimal predictive metrics. Partial least squares is an extension of least squares methods that can reduce the dimensionality of the predictor space by interposing latent variables, predicted by linear combinations of observable characteristics, as the intermediate predictors of one or more outcomes. Recursive partitioning methods (such as random forests), support vector machines, and neural networks are more recent methods that may offer better predictive power than linear methods. Risk function estimation can range from highly exploratory analyses to near meta-analytic model validation, and may be useful at any stage of product development.

HIV Example: The “hmohiv” dataset represents a study of HIV positive patients examining whether there was a difference in survival times of HIV positive patients between a cohort using intravenous drugs (drug=1) and a cohort not using the IV drug (drug=0). The hmohiv data includes the following variables:

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
|ID||Time||Age||Drug||Censor||Entdate||Enddate
|-
|1||5||46||0||1||5/15/1990||10/14/1990
|-
|2||6||35||1||0||9/19/1989||3/20/1990
|-
|3||8||30||1||1||4/21/1991||12/20/1991
|-
|4||3||30||1||1||1/3/1991||4/4/1991
|-
|5||22||36||0||1||9/18/1989||7/19/1991
|-
|6||1||32||1||0||3/18/1991||4/17/1991
|-
|...||...||...||...||...||...||...

|}
</center>

#cleaning up environment
rm(list=ls())

# load survival library
library(survival)

# load hmohiv data
hmohiv<-read.table("http://www.ats.ucla.edu/stat/r/examples/asa/hmohiv.csv", sep=",", header = TRUE)
attach(hmohiv)

# construct a data frame of the 2 cohorts: no-IV-drug (drug=0) and IV-drug (drug=1)
drug.new<-data.frame(drug=c(0,1))

# Fit Cox proportional hazards regression model
cox.model <- coxph( Surv(time, censor) ~ drug, method="breslow")
# estimate the survival curves of the 2 cohorts (drug.new must be defined before this call)
fit.1 <- survfit(cox.model, newdata=drug.new)

# plot results
plot(fit.1, xlab="Survival Time (Months)", ylab="Survival Probability")
points(fit.1$\$$time, fit.1$\$$surv[,1], pch=1)
points(fit.1$\$$time, fit.1$\$$surv[,2], pch=2)
legend(40, .8, c("Drug Absent", "Drug Present"), pch=c(1,2))

<center>[[Image:SMHS_Methods16.png|500px]] </center>

# to inspect the resulting Cox Proportional Hazards Model
cox.model
Call:
coxph(formula = Surv(time, censor) ~ drug, method = "breslow")

coef exp(coef) se(coef) z p
drug 0.779 2.18 0.242 3.22 0.0013

Likelihood ratio test=10.2 on 1 df, p=0.00141 n= 100, number of events= 80
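The reported coefficient can be translated into a hazard ratio with a confidence interval by hand. The sketch below only re-uses the two numbers printed in the output above (coef and se(coef)); the Wald-type 95% interval is an assumption about how one might summarize them, not part of the original output.

```r
# Values copied from the coxph() output above
coef.drug <- 0.779
se.drug   <- 0.242

hr    <- exp(coef.drug)                                       # hazard ratio, exp(coef)
ci.95 <- exp(coef.drug + c(-1, 1) * qnorm(0.975) * se.drug)   # Wald 95% confidence interval (assumed summary)
z     <- coef.drug / se.drug                                  # Wald z statistic
p     <- 2 * pnorm(-abs(z))                                   # two-sided p-value

round(c(HR = hr, lower = ci.95[1], upper = ci.95[2], z = z, p = p), 4)
```

A hazard ratio of about 2.18 means that, under the proportional hazards assumption, the estimated hazard of death in the IV drug cohort is roughly twice that of the cohort not using IV drugs.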

===Footnotes===

8 http://onlinelibrary.wiley.com/enhanced/doi/10.1002/jrsm.54

==[[SMHS_MethodsHeterogeneity_CER|Next see: Comparative Effectiveness Research (CER)]]==

*[[SMHS_MethodsHeterogeneity|Back to the Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research section]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php/SMHS_MethodsHeterogeneity_MetaAnalysis}}

SMHS MethodsHeterogeneity MetaAnalysis

2016-05-23T18:47:08Z

Pineaumi: /* Example */

==[[SMHS_MethodsHeterogeneity| Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research]] - Meta-Analyses ==

==Meta-analysis==

===Overview===

Meta-analysis is an approach to combine treatment effects across trials or studies into an aggregated treatment effect with higher statistical power than that of any individual trial. It may detect HTE by testing for differences in treatment effects across similar RCTs. It requires that the individual treatment effects are similar enough to ensure pooling is meaningful. In the presence of large clinical or methodological differences between the trials, it may be best to avoid meta-analyses. The presence of HTE across studies in a meta-analysis may be due to differences in the design or execution of the individual trials (e.g., randomization methods, patient selection criteria). Cochran's Q is a method for detecting heterogeneity, which is computed as the weighted sum of squared differences between each study's treatment effect and the pooled effect across the studies. It is a barometer of inter-trial differences impacting the observed study results. A possible source of error in a meta-analysis is publication bias. Trial size may introduce publication bias since larger trials are more likely to be published. Language and accessibility represent other potential confounding factors. When the heterogeneity is not due to poor study design, it may be useful to optimize the treatment benefits for different cohorts of participants.

Cochran's Q statistic is the weighted sum of squares on a standardized scale<sup>8</sup>. The corresponding p-value indicates the strength of the evidence for the presence of heterogeneity. This test may have low power to detect heterogeneity (particularly when few studies are included), so a cut-off of 0.10 is suggested for significance (Higgins et al., 2003). Conversely, the Q statistic may have too much power as a test of heterogeneity when the number of studies is large.

===Simulation Example 1===

# Install and Load library
install.packages("meta")
library(meta)

# Set number of studies
n.studies = 15

# number of treatments: case1, case2, control
n.trt = 3

# number of outcomes
n.event = 2

# simulate the (balanced) number of cases (case1 and case2) and controls in each study
ctl.group = rbinom(n = n.studies, size = 200, prob = 0.3)
case1.group = rbinom(n = n.studies, size = 200, prob = 0.3)
case2.group = rbinom(n = n.studies, size = 200, prob = 0.3)

# Simulate the number of outcome events (e.g., deaths) and no events in the control group
event.ctl.group = rbinom(n = n.studies, size = ctl.group, prob = rep(0.1, length(ctl.group)))
noevent.ctl.group = ctl.group - event.ctl.group

# Simulate the number of events and no events in the case1 group
event.case1.group = rbinom(n = n.studies, size = case1.group, prob = rep(0.5, length(case1.group)))
noevent.case1.group = case1.group - event.case1.group

# Simulate the number of events and no events in the case2 group
event.case2.group = rbinom(n = n.studies, size = case2.group, prob = rep(0.6, length(case2.group)))
noevent.case2.group = case2.group - event.case2.group

# Run the univariate meta-analysis using metabin(): meta-analysis of binary outcome data.
# It calculates fixed and random effects estimates (risk ratio, odds ratio, risk difference or arcsine
# difference) for meta-analyses with binary outcome data. Mantel-Haenszel (MH),
# inverse variance and Peto methods are available for pooling.

# method = a character string indicating which method is used for pooling of studies,
# one of "MH", "Inverse", or "Peto"
# sm = a character string indicating which summary measure ("OR", "RR", "RD" = risk difference) is
# used for pooling of studies

# Control vs. Case1, n.e and n.c are numbers in experimental and control groups
meta.ctr_case1 <- metabin(event.e = event.case1.group, n.e = case1.group, event.c = event.ctl.group,
n.c = ctl.group, method = "MH", sm = "OR")
# here we use the odds ratio (OR), comparing the odds of death in the experimental and control groups
forest(meta.ctr_case1)

<center>[[Image:SMHS_Methods8.png|500px]] </center>

# Control vs. Case2
meta.ctr_case2 <- metabin(event.e = event.case2.group, n.e = case2.group, event.c = event.ctl.group,
n.c = ctl.group, method = "MH", sm = "OR")
forest(meta.ctr_case2)

<center>[[Image:SMHS_Methods9.png|500px]] </center>

# Case1 vs. Case2
meta.case1_case2 <- metabin(event.e = event.case1.group, n.e = case1.group, event.c = event.case2.group,
n.c = case2.group, method = "MH", sm = "OR")
forest(meta.case1_case2)
summary(meta.case1_case2)

Test of heterogeneity:
Q d.f. p-value
11.99 14 0.6071

<center>[[Image:SMHS_Methods10.png|500px]] </center>

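With ''Q'' = 11.99 on 14 degrees of freedom (''p'' = 0.6071), the test of heterogeneity provides no evidence against the null hypothesis of no between-study heterogeneity, so the fixed-effects model is appropriate. Because ''Q'' is smaller than its degrees of freedom, the corresponding ''I''<sup>2</sup> statistic, max(0, (''Q'' − d.f.)/''Q''), equals 0%.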
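As a quick check of the reported heterogeneity test, this short sketch recomputes the p-value and the ''I''<sup>2</sup> statistic from the printed values of Q and its degrees of freedom. It assumes the usual chi-square reference distribution for Q; because the printed Q is rounded to two decimals, the recomputed p-value is only approximate.

```r
Q  <- 11.99   # Cochran's Q as printed in the summary output
df <- 14      # 15 studies - 1

p.val <- pchisq(Q, df = df, lower.tail = FALSE)  # upper-tail chi-square probability
I2    <- max(0, (Q - df) / Q)                    # truncated at 0 whenever Q < df

c(p.value = round(p.val, 3), I2.percent = 100 * I2)
```

A large p-value together with ''I''<sup>2</sup> = 0 is what one would expect here, since the three groups were simulated with a common event probability across all studies.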

==Series of “N of 1” trials==

This technique combines (a “series of”) n-of-1 trial data to identify HTE. An n-of-1 trial is a repeated crossover trial for a single patient, which randomly assigns the patient to one treatment vs. another for a given time period, after which the patient is re-randomized to treatment for the next time period; this is usually repeated for 4-6 time periods. Such trials are most feasible in chronic conditions, where little or no washout period is needed between treatments and treatment effects are identifiable in the short term, e.g., via pain or reliable surrogate markers. Combining data from identical n-of-1 trials across a set of patients enables statistical analyses that control for patient fixed or random effects, covariates, centers, or sequence effects (see the figure below). These combined trials are often analyzed within a Bayesian context using shrinkage estimators that combine individual and group mean treatment effects to create a “posterior” individual mean treatment effect estimate, which is a form of inverse variance-weighted average of the individual and group effects. Such trials are typically more expensive than standard RCTs on a per-patient basis; however, they require much smaller sample sizes, often fewer than 100 patients (due to the efficient individual-as-own-control design), and yield individual treatment effect estimates that are not possible in a non-crossover design. For the individual patient, the treatment effect can be re-estimated after each time period, and the trial stopped at any point when the more effective treatment is identified with reasonable statistical certainty.
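The shrinkage idea can be sketched in a few lines of base R. This is an empirical-Bayes approximation under stated assumptions, not a full Bayesian analysis: the data are simulated, the between-patient variance <code>tau2</code> is estimated by a simple method of moments, and all variable names are hypothetical.

```r
set.seed(1)
n.patients  <- 8
n.periods   <- 6
true.effect <- rnorm(n.patients, mean = 5, sd = 2)   # hypothetical individual effects

# Per-period treatment-minus-control differences (rows = patients, columns = periods)
d <- matrix(rnorm(n.patients * n.periods,
                  mean = rep(true.effect, n.periods), sd = 4),
            nrow = n.patients)

ind.mean <- rowMeans(d)                    # individual mean treatment effects
ind.var  <- apply(d, 1, var) / n.periods   # sampling variance of each individual mean
grp.mean <- mean(ind.mean)                 # group mean treatment effect

# Method-of-moments estimate of the between-patient variance (floored at 0)
tau2 <- max(var(ind.mean) - mean(ind.var), 0)

# Inverse-variance weighting: B is the weight placed on the group mean
B      <- ind.var / (ind.var + tau2)
shrunk <- B * grp.mean + (1 - B) * ind.mean

round(cbind(individual = ind.mean, weight.on.group = B, shrunken = shrunk), 2)
```

Patients whose individual estimates are noisy (large <code>ind.var</code>) are pulled more strongly toward the group mean, whereas precisely estimated individual effects stay close to their own mean.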

====Example====

A study involving 8 participants collected data across 30 days, in which 15 treatment days and 15 control days were randomly assigned within each participant. The treatment is represented as a binary variable (control day=0; treatment day=1). The outcome variable represents the response to the intervention within each of the 8 participants. The study employed fixed-effects modeling with N − 1 = 7 dummy-coded variables representing the N=8 participants, where the last (i=8) participant serves as the reference (i.e., is absorbed into the model intercept). Each dummy-coded variable thus represents the difference between participant i and the 8th participant, so all other participants' values are relative to the values of the 8th (reference) subject. The overall differences across participants in fixed effects can be evaluated with multiple degree-of-freedom F-tests.
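The dummy-coding scheme can be illustrated with simulated data. The sketch below is not the study's analysis: the data frame <code>sim</code>, the outcome <code>y</code>, and all effect sizes are hypothetical, and <code>anova()</code> reports sequential (Type I) F-tests, which need not coincide with other test types in unbalanced designs.

```r
set.seed(2)
n.id  <- 8
n.day <- 30
sim <- data.frame(ID  = rep(1:n.id, each = n.day),
                  Day = rep(1:n.day, times = n.id))

# 15 treatment and 15 control days randomly assigned within each participant
sim$Tx <- unlist(lapply(1:n.id, function(i) sample(rep(0:1, each = n.day / 2))))

# Hypothetical participant-specific baselines and treatment effects
id.shift  <- rnorm(n.id, sd = 10)
id.effect <- rnorm(n.id, mean = 20, sd = 8)
sim$y <- 50 + id.shift[sim$ID] + id.effect[sim$ID] * sim$Tx +
         rnorm(nrow(sim), sd = 15)

# Participant 8 becomes the reference level, so R creates 7 dummy variables
sim$ID <- relevel(factor(sim$ID), ref = "8")

fit     <- lm(y ~ Tx * ID, data = sim)
aov.tab <- anova(fit)   # F-tests for Tx (1 df), ID (7 df) and Tx:ID (7 df)
print(aov.tab)
```

With 8 participants observed on 30 days each, the model has 16 coefficients and 240 − 16 = 224 residual degrees of freedom; the 7-degree-of-freedom tests for ID and Tx:ID are the multiple degree-of-freedom F-tests mentioned above.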

<center>[[Image:SMHS_Methods11.png|500px]] </center>

<center>

{| class="wikitable" style="text-align:center; " border="1"
|-
|ID||Day||Tx||SelfEff||SelfEff25||WPSS||SocSuppt||PMss||PMss3||PhyAct
|-
|1||1||1||33||8||0.97||5.00||4.03||1.03||53
|-
|1||2||1||33||8||-0.17||3.87||4.03||1.03||73
|-
|1||3||0||33||8||0.81||4.84||4.03||1.03||23
|-
|1||4||0||33||8||-0.41||3.62||4.03||1.03||36
|-
|...||...||...||...||...||...||...||...||...||...

|}
</center> Complete data is available in the Appendix.

<center>Data Summary

{| class="wikitable" style="text-align:center; " border="1"
|-
|Intercept||Constant
|-
|Physical Activity||PhyAct
|-
|Intervention||Tx
|-
|WP Social Support||WPSS
|-
|PM Social Support (1-3)||PMss3
|-
|Self Efficacy||SelfEff25

|}
</center>

rm(list=ls())
Nof1 <-read.table("https://umich.instructure.com/files/330385/download?download_frd=1&verifier=DwJUGSd6t24dvK7uYmzA2aDyzlmsohyaK6P7jK0Q", sep=",", header = TRUE) # 02_Nof1_Data.csv
attach(Nof1)
head(Nof1)

<center>

{| class="wikitable" style="text-align:center; " border="1"
|-
| ||ID||Day||Tx||SelfEff||SelfEff25||WPSS||SocSuppt||PMss||PMss3||PhyAct
|-
|1||1||1||1||33||8||0.97||5.00||4.03||1.03||53
|-
|2||1||2||1||33||8||-0.17||3.87||4.03||1.03||73
|-
|3||1||3||0||33||8||0.81||4.84||4.03||1.03||23
|-
|4||1||4||0||33||8||-0.41||3.62||4.03||1.03||36
|-
|5||1||5||1||33||8||0.59||4.62||4.03||1.03||21
|-
|6||1||6||1||33||8||-1.16||2.87||4.03||1.03||0

|}
</center>

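# collect all variables used by the models below into one data frame
df.1 = data.frame(ID, Day, PhyAct, Tx, WPSS, PMss3, SelfEff, SelfEff25)

library("lme4") # provides lmer(); run install.packages("lme4") first if needed

# linear mixed model with random intercepts for Day and ID
lm.1 <- lmer(PhyAct ~ Tx + SelfEff + Tx*SelfEff + (1|Day) + (1|ID), data = df.1)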
summary(lm.1)

Linear mixed model fit by REML ['lmerMod']
Formula: PhyAct ~ Tx + SelfEff + Tx * SelfEff + (1 | Day) + (1 | ID)
Data: df.1

REML criterion at convergence: 8820

<center> Scaled Residuals
{| class="wikitable" style="text-align:center; " border="1"
|-
| Min||1Q||Median||3Q||Max
|-
|-2.7012||-0.6833||-0.0333||0.6542||3.9612
|}
</center>

<center> Random Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| Groups ||Name||Variance ||Std.Dev.
|-
| Day||(Intercept) ||0.0 || 0.00
|-

|ID|| (Intercept)||601.5||24.53
|-

|Residual|| ||969.0 ||31.13
|}
Number of obs: 900, groups: Day, 30; ID, 30
</center>

<center> Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t value
|-
|(Intercept)||38.3772||14.4738||2.651
|-
|Tx||4.0283||6.3745||0.632
|-
|SelfEff||0.5818||0.5942||0.979
|-
|Tx:SelfEff||0.9702||0.2617||3.708
|}
</center>

<center> Correlation of Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||(Intr)||Tx ||SlfEff
|-
| Tx|| -0.220|| ||
|-
| SelfEff||-0.946 ||0.208 ||
|-
| Tx:SelfEff ||0.208 ||-0.946 ||-0.220
|}
</center>

# Model: PhyAct = Tx + WPSS + PMss3 + Tx*WPSS + Tx*PMss3 + SelfEff25 + Tx*SelfEff25 + ε
lm.2 = lm(PhyAct ~ Tx + WPSS + PMss3 + Tx*WPSS + Tx*PMss3 + SelfEff25 + Tx*SelfEff25, df.1)
summary(lm.2)

Call:
lm(formula = PhyAct ~ Tx + WPSS + PMss3 + Tx * WPSS + Tx * PMss3 +
SelfEff25 + Tx * SelfEff25, data = df.1)

<center> Residuals
{| class="wikitable" style="text-align:center; " border="1"
|-
|Min||1Q||Median||3Q||Max
|-
| -102.39||-28.24||-1.47||25.16||122.41

|}
</center>

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t value||$Pr(>|t|)$
|-
|(Intercept)||52.0067||1.8080||28.764||< 2e-16 ***
|-
|Tx||27.7366||2.5569||10.848||< 2e-16 ***
|-
|WPSS||1.9631||2.4272||0.809||0.418853
|-
|PMss3||13.5110||2.7853||4.851||1.45e-06 ***
|-
|SelfEff25||0.6289||0.2205||2.852||0.004439 **
|-
|Tx:WPSS||9.9114||3.4320||2.888||0.003971 **
|-
|Tx:PMss3||8.8422||3.9390||2.245||0.025025 *
|-
|Tx:SelfEff25||1.0460||0.3118||3.354||0.000829 ***

|}
</center>

[Using SAS (StudyI_Analyses.sas, StudyIIab_Analyses.sas)]

<center> Type 3 Tests of Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
|Effect||Num DF||Den DF||F Value||$Pr>F$
|-
|Tx||1||224||67.46||<.0001
|-
|ID||7||224||25.95||<.0001
|-
|Tx*ID||7||224||2.92||0.0060
|}
</center>

==Quantile Treatment Effect (QTE)==

QTE employs quantile regression estimation (QRE) to examine the central tendency and statistical dispersion of the treatment effect in a population, which may not be revealed by the conventional mean estimation in RCTs. For instance, patients with different comorbidity scores may respond differently to a treatment. Quantile regression can reveal HTE according to the ranking of patients’ comorbidity scores, or of some other relevant covariate by which patients may be ranked. Therefore, in an attempt to inform patient-centered care, quantile regression provides more information on the distribution of the treatment effect than typical conditional mean treatment effect estimation. QTE characterizes the heterogeneous treatment effect on individuals and groups across various positions in the distributions of different outcomes of interest. This feature has brought quantile regression analysis substantial attention, and it has been employed across a wide range of applications, particularly when evaluating the economic effects of welfare reform.

One caveat of applying QRE in clinical trials for examining HTE is that the QTE does not describe the treatment effect for a given patient. Instead, it focuses on the treatment effect among subjects within the qth quantile, such as those who are exactly at the top 10th percentile in terms of blood pressure or a depression score, for some covariate of interest (for example, comorbidity score). It is not uncommon for the qth quantile to correspond to two different sets of patients before and after the treatment. For this reason, we have to assume that these two groups of patients are homogeneous if they are in the same quantile.

Income-Food Expenditure Example: Let’s examine the Engel data (N=235) on the relationship between food expenditure (foodexp) and household income (income). We can plot the data and then explore the superposition of the six fitted quantile regression lines.

install.packages("quantreg")
library(quantreg)
data(engel)
attach(engel)

<center>head(engel)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Income||Foodexp
|-
|1||420.1577||255.8394
|-
|2||541.4117||310.9587
|-
|3||901.1575||485.6800
|-
|4||639.0802||402.9974
|-
|5||750.8756||495.5608
|-
|6||945.7989||633.7978

|}
</center>

<center>summary(engel)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Income||Foodexp
|-
|Min||377.1||242.3
|-
|1st Qu.||638.9||429.7
|-
|Median||884.0||582.5
|-
|Mean||982.5||624.2
|-
|3rd Qu.||1164.0||743.9
|-
|Max||4957.8||2032.7

|}
</center>

Note: If $Y$ is a real-valued random variable with cumulative distribution function $F_Y(y)=P(Y\leq y)$, then the $\tau$-quantile of $Y$ is given by

<center> $Q_Y(\tau)=F_Y^{-1}(\tau)=\inf\{y: F_Y(y)\geq\tau\}$ </center>

where $0\leq\tau\leq 1$.
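The quantile definition can be verified numerically. The base-R sketch below uses a simulated, hypothetical sample and compares three routes to the same sample τ-quantile: the infimum definition applied to the empirical CDF, the built-in <code>quantile()</code> with the inverse-ECDF rule, and minimization of the check (pinball) loss that underlies quantile regression.

```r
set.seed(3)
y   <- rexp(201)   # hypothetical skewed outcome
tau <- 0.25

# (a) Definition: the smallest y whose empirical CDF value is at least tau
Fn    <- ecdf(y)
ys    <- sort(y)
q.def <- ys[which(Fn(ys) >= tau)[1]]

# (b) The same value from quantile() using the inverse-ECDF rule (type = 1)
q.inv <- quantile(y, probs = tau, type = 1, names = FALSE)

# (c) Quantile-regression view: minimize the check loss over candidate values
check.loss <- function(u, tau) u * (tau - (u < 0))
obj   <- sapply(ys, function(q) sum(check.loss(y - q, tau)))
q.min <- ys[which.min(obj)]

c(definition = q.def, inverse.ecdf = q.inv, check.loss = q.min)
```

Route (c) is the intercept-only special case of quantile regression: adding covariates to the check-loss minimization yields conditional quantiles such as those fitted by <code>rq()</code>.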

<center>[[Image:SMHS_Methods12.png|500px]] </center>

# (1) Graphics
plot(income, foodexp, cex=.25, type="n", xlab="Household Income", ylab="Food Expenditure")
points(income, foodexp, cex=.5, col="blue")

# tau - the quantile(s) to be estimated, in the range from 0 to 1. An object "rq.process" and an object "rqs"
# are returned containing the matrix of coefficient estimates at the specified quantiles.
abline( rq(foodexp ~ income, tau=.5), col="blue") # Quantile Regression Model

abline( lm(foodexp ~ income), lty=2, lwd=3, col="red") # linear model
taus <- c(0.05, 0.1, 0.25, 0.75, 0.90, 0.95)
colors <- rainbow(length(taus))

models <- vector(mode = "list", length = length(taus)) # define a vector of models to store QR for diff taus
model.names <- vector(mode = "list", length = length(taus)) # define a vector model names

for (i in seq_along(taus)) {
models[[i]] <- rq(foodexp ~ income, tau=taus[i])
model.names[[i]] <- paste("Model [", i , "]: tau=", taus[i])
abline( models[[i]], lwd=2, col= colors[[i]])
}
# the legend uses line samples (lty/lwd), since pch expects plotting-symbol codes rather than tau values
legend(3000, 1100, unlist(model.names), col= colors, lty=1, lwd=2, bty='n', cex=.75)

<center>[[Image:SMHS_Methods13.png|500px]] </center>

# (2) Inference about quantile regression coefficients. As an alternative to the rank-inversion confidence intervals, we can obtain a table of coefficients, standard errors, t-statistics, and p-values using the summary function:

summary(models[[3]], se = "nid")

Call: rq(formula = foodexp ~ income, tau = taus[i])

tau: [1] 0.25

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Value||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||95.48354||21.39237||4.46344||0.00001
|-
|Income||0.47410||0.02906||16.31729||0.00000

|}
</center>

# Equivalently, summary.rq() can be called directly; se = "nid" reproduces the table above,
# whereas se = "boot" would compute bootstrapped standard errors instead.
summary.rq(models[[3]], se = "nid")

Call: rq(formula = foodexp ~ income, tau = taus[i])
tau: [1] 0.25

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Value||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||95.48354||21.39237||4.46344||0.00001
|-
|Income||0.47410||0.02906||16.31729||0.00000

|}
</center>

==Nonparametric Regression Methods ==

Nonparametric regression offers another way of dealing with HTE in RCTs. Different nonparametric methods, such as kernel smoothing methods and series methods, can be used to generate test statistics for examining the presence of HTE. A kernel method is a weighting scheme based on a kernel function (e.g., uniform, Gaussian). When evaluating the treatment effect for a patient in an RCT, the kernel method assigns larger weights to observations with similar covariates, on the assumption that patients with similar covariates provide more relevant data on the predicted treatment response. Participants with very different backgrounds (e.g., demographic, clinical) still contribute information to the estimate of a particular subject’s treatment effect, but they receive lower weights. Kernel methods require choosing a set of smoothing parameters (bandwidths) that group patients according to their relative degree of similarity. A drawback is that the corresponding test statistics may be sensitive to the chosen bandwidths, which complicates the interpretation of the results. Series methods use approximating functions (splines or power series of the explanatory variables) to construct test statistics. Compared to kernel smoothing methods, series methods normally have the advantage of computational convenience; however, the precision of the test statistics depends on the number of terms selected in the series.
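To make the kernel weighting scheme concrete, here is a minimal base-R sketch of a Nadaraya-Watson (local-constant) estimator with a Gaussian kernel, applied to simulated data. The helper <code>nw()</code>, the simulated curve, and the two bandwidths are hypothetical choices for illustration; the snippet shows a regression fit and its sensitivity to the bandwidth, not a formal HTE test.

```r
set.seed(4)
x <- sort(runif(200, 0, 10))
y <- sin(x) + rnorm(200, sd = 0.3)   # hypothetical nonlinear signal plus noise

# Nadaraya-Watson estimator: a kernel-weighted average of y around each point z
nw <- function(x0, x, y, h) {
  sapply(x0, function(z) {
    w <- dnorm((x - z) / h)   # larger weights for observations close to z
    sum(w * y) / sum(w)
  })
}

grid       <- seq(0.5, 9.5, by = 0.5)
fit.narrow <- nw(grid, x, y, h = 0.3)   # small bandwidth: local, flexible fit
fit.wide   <- nw(grid, x, y, h = 3)     # large bandwidth: heavily smoothed fit

# Mean squared error against the known simulated curve
mse <- c(narrow = mean((fit.narrow - sin(grid))^2),
         wide   = mean((fit.wide   - sin(grid))^2))
round(mse, 4)
```

The large difference between the two errors illustrates why results based on kernel methods can depend strongly on the chosen bandwidth, and why data-driven bandwidth selection (e.g., cross-validation) is commonly used.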

Canadian Wage Data Example: Nonparametric regression extends the classical parametric regression (e.g., lm, lmer) involving one continuous dependent variable, y, and one or more continuous explanatory variables, x. Let’s start with a popular parametric model of a wage equation that we can extend to a fully nonparametric regression model. First, we will compare and contrast the parametric and nonparametric approaches to univariate regression and then proceed to multivariate regression.

Let’s use the Canadian cross-section wage data (cps71), a random sample taken from the 1971 Canadian Census for male individuals having common education (high school). It contains N=205 observations on 2 variables: the logarithm of the individual’s wage (logwage) and their age (age). The classical wage equation model includes a quadratic term in age.

# install.packages("np")
library("np")
data("cps71")

# (1) Linear Model -> R2 = 0.2308
model.lin <- lm( logwage ~ age + I(age^2), data = cps71)
summary(model.lin)

Call:
lm(formula = logwage ~ age + I(age^2), data = cps71)

Residuals:
Min 1Q Median 3Q Max
-2.4041 -0.1711 0.0884 0.3182 1.3940

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||10.0419773||0.4559986||22.022||< 2e-16 ***
|-
|Age||0.1731310||0.0238317|| 7.265||7.96e-12 ***
|-
|I(age^2)||-0.0019771||0.0002898||-6.822||1.02e-10 ***

|}
</center>

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.5608 on 202 degrees of freedom
Multiple R-squared: 0.2308, Adjusted R-squared: 0.2232
F-statistic: 30.3 on 2 and 202 DF, p-value: 3.103e-12

# (2) Next, we consider a nonparametric kernel regression with cross-validated bandwidth selection.
# Start by computing the least-squares cross-validated bandwidth for the local constant
# estimator (the npregbw default), then fit the model using that bandwidth object.
# A local linear fit would instead use npregbw(..., regtype = "ll", bwmethod = "cv.aic").
# Note that R2 = 0.3108675
bandwidth <- npregbw(formula= logwage ~ age, data = cps71)
model.np <- npreg(bws = bandwidth, gradients = TRUE, data = cps71)
summary(model.np)

Regression Data: 205 training points, in 1 variable(s) age
Bandwidth(s): 1.892157
Kernel Regression Estimator: Local-Constant
Bandwidth Type: Fixed
Residual standard error: 0.5307943
R-squared: 0.3108675
Continuous Kernel Type: Second-Order Gaussian
No. Continuous Explanatory Vars.: 1

# NP model significance may be tested by
npsigtest(model.np)

Kernel Regression Significance Test
Type I Test with IID Bootstrap (399 replications, Pivot=TRUE, joint=FALSE)
Explanatory variables tested for significance: age (1)

age
Bandwidth(s): 1.892157

Individual Significance Tests
P Value:
age < 2.22e-16 ***

# So, as was the case for the linear parametric model, age is significant in the nonparametric kernel model

# (3) Graphical comparison of parametric and nonparametric models.
plot(cps71$\$$age, cps71$\$$logwage, xlab = "age", ylab = "log(wage)", cex=.1)
lines(cps71$\$$age, fitted(model.lin), lty = 2, col = "red")
lines(cps71$\$$age, fitted(model.np), lty = 1, col = "blue")
legend("topright", c("Data", "Linear", "Non-linear"), col=c("Black", "Red", "Blue"), pch = c(1, 1, 1), bty='n', cex=.75)

<center>[[Image:SMHS_Methods14.png|500px]] </center>

# some additional plots presenting the parametric (quadratic, dashed line) and the nonparametric estimates
# (solid line) of the regression function for the cps71 data.
plot(model.np, plot.errors.method = "asymptotic")
plot(model.np, gradients = TRUE)
lines(cps71$\$$age, coef(model.lin)[2]+2*cps71$\$$age*coef(model.lin)[3], lty = 2, col = "red")
plot(model.np, gradients = TRUE, plot.errors.method = "asymptotic")

# (4) using the Lin and NL models to generate predictions based on the obtained appropriate
# bandwidths and estimated a nonparametric model. We need to create a set of explanatory
# variables for which to generate predictions. These can be part of the original dataset or be
# outside its scope. Typically, we don’t have the outcome for the evaluation data and need only
# provide the explanatory variables for which predicted values are generated by the models.
# Occasionally, splitting the dataset into two independent samples (training/testing), allows estimation
# of a model on one sample, and evaluation of its performance on another.

cps.eval.data <- data.frame(age = seq(10,70, by=10)) # simulate some explanatory X values (ages)
pred.lin <- predict(model.lin, newdata = cps.eval.data) # Linear Prediction of log(Wage)
pred.np <- predict(model.np, newdata = cps.eval.data) # non-Linear Prediction of log(Wage)
plot(pred.lin, pred.np)
abline(lm(pred.np ~ pred.lin))

<center>[[Image:SMHS_Methods15.png|500px]] </center>

.
.
.

==Predictive risk models ==

Predictive risk models represent a class of methods for identifying potential HTE when the individual patient's baseline risk for disease-related events depends on observed factors. For instance, common measures are disease staging criteria, such as those used in COPD or heart failure, Framingham risk scores for cardiovascular event risk, or genetic variations, e.g., HER2 for breast cancer. Initial predictive risk modeling, also known as risk function estimation, is often performed without accounting for treatment effects. Least squares or Cox proportional hazards regression methods are appropriate in many cases and provide relatively more interpretable risk functions, but rely on linearity assumptions and may not provide optimal predictive metrics. Partial least squares is an extension of least squares methods that can reduce the dimensionality of the predictor space by interposing latent variables, predicted by linear combinations of observable characteristics, as the intermediate predictors of one or more outcomes. Recursive partitioning methods (e.g., random forests), support vector machines, and neural networks represent more recent methods that may offer better predictive power than linear methods. Risk function estimation can range from highly exploratory analyses to near meta-analytic model validation, and may be useful at any stage of product development.
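One common way to use a risk function for exploring HTE is to estimate baseline risk without treatment terms and then compare treatment effects across risk strata. The base-R sketch below illustrates this under stated assumptions: the data are simulated, a logistic regression stands in for the risk model, and the variables <code>age</code>, <code>stage</code>, <code>trt</code>, and <code>event</code> are hypothetical.

```r
set.seed(5)
n     <- 2000
age   <- rnorm(n, mean = 60, sd = 10)
stage <- rbinom(n, 1, 0.4)    # hypothetical disease-stage indicator
trt   <- rbinom(n, 1, 0.5)    # randomized treatment assignment
lp    <- -4 + 0.05 * age + 1.0 * stage             # baseline log-odds of an event
event <- rbinom(n, 1, plogis(lp - 0.5 * trt))      # treatment lowers the log-odds
dat   <- data.frame(age, stage, trt, event)

# Step 1: estimate the risk function without any treatment terms
risk.fit <- glm(event ~ age + stage, family = binomial, data = dat)
dat$risk <- predict(risk.fit, type = "response")

# Step 2: stratify patients into tertiles of predicted baseline risk
dat$stratum <- cut(dat$risk, breaks = quantile(dat$risk, 0:3 / 3),
                   include.lowest = TRUE, labels = c("low", "mid", "high"))

# Step 3: estimate the treatment effect (risk difference) within each stratum
rd <- sapply(split(dat, dat$stratum), function(s)
  mean(s$event[s$trt == 1]) - mean(s$event[s$trt == 0]))
round(rd, 3)
```

Even when the treatment effect is constant on the log-odds scale, as simulated here, the absolute risk difference can vary with baseline risk, which is one reason risk stratification is used to look for clinically relevant HTE.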

HIV Example: The “hmohiv” dataset represents a study of HIV positive patients examining whether there was a difference in survival times of HIV positive patients between a cohort using intravenous drugs (drug=1) and a cohort not using the IV drug (drug=0). The hmohiv data includes the following variables:

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
|ID||Time||Age||Drug||Censor||Entdate||Enddate
|-
|1||5||46||0||1||5/15/1990||10/14/1990
|-
|2||6||35||1||0||9/19/1989||3/20/1990
|-
|3||8||30||1||1||4/21/1991||12/20/1991
|-
|4||3||30||1||1||1/3/1991||4/4/1991
|-
|5||22||36||0||1||9/18/1989||7/19/1991
|-
|6||1||32||1||0||3/18/1991||4/17/1991
|-
|...||...||...||...||...||...||...

|}
</center>

#cleaning up environment
rm(list=ls())

# load survival library
library(survival)

# load hmohiv data
hmohiv<-read.table("http://www.ats.ucla.edu/stat/r/examples/asa/hmohiv.csv", sep=",", header = TRUE)
attach(hmohiv)

# Fit Cox proportional hazards regression model
cox.model <- coxph( Surv(time, censor) ~ drug, method="breslow")

# construct a frame of the 2 cohorts IV_drug and no-IV-drug (must exist before calling survfit)
drug.new<-data.frame(drug=c(0,1))
fit.1 <- survfit(cox.model, newdata=drug.new)

# plot results
plot(fit.1, xlab="Survival Time (Months)", ylab="Survival Probability")
points(fit.1$\$$time, fit.1$\$$surv[,1], pch=1)
points(fit.1$\$$time, fit.1$\$$surv[,2], pch=2)
legend(40, .8, c("Drug Absent", "Drug Present"), pch=c(1,2))

<center>[[Image:SMHS_Methods16.png|500px]] </center>

# to inspect the resulting Cox Proportional Hazard Model
cox.model
Call:
coxph(formula = Surv(time, censor) ~ drug, method = "breslow")

coef exp(coef) se(coef) z p
drug 0.779 2.18 0.242 3.22 0.0013

Likelihood ratio test=10.2 on 1 df, p=0.00141 n= 100, number of events= 80
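The printed coefficient can be translated into a hazard ratio with a confidence interval by hand. The sketch below uses only the rounded values shown in the output above and assumes a standard Wald (normal-approximation) interval, so the results are approximate.

```r
coef.drug <- 0.779   # log hazard ratio for drug, as printed above
se.drug   <- 0.242   # its standard error, as printed above

hr <- exp(coef.drug)                                       # hazard ratio
ci <- exp(coef.drug + c(-1, 1) * qnorm(0.975) * se.drug)   # 95% Wald interval
z  <- coef.drug / se.drug                                  # Wald z statistic
p  <- 2 * pnorm(-abs(z))                                   # two-sided p-value

round(c(HR = hr, lower = ci[1], upper = ci[2], z = z, p = p), 4)
```

A hazard ratio of about 2.18 means that, at any given time, the estimated hazard of the event in the IV-drug cohort is roughly twice that in the cohort not using IV drugs; the interval excludes 1, consistent with the small p-value.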

===Footnotes===

8 http://onlinelibrary.wiley.com/enhanced/doi/10.1002/jrsm.54

==[[SMHS_MethodsHeterogeneity_CER|Next see: Comparative Effectiveness Research (CER)]]==

*[[SMHS_MethodsHeterogeneity|Back to the Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research section]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php/SMHS_MethodsHeterogeneity_MetaAnalysis}}

SMHS MethodsHeterogeneity MetaAnalysis

2016-05-23T18:45:31Z

Pineaumi: /* Simulation Example 1 */

==[[SMHS_MethodsHeterogeneity| Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research]] - Meta-Analyses ==

==Meta-analysis==

===Overview===

Meta-analysis combines treatment effects across trials or studies into an aggregated treatment effect with higher statistical power than that of any individual trial. It may detect HTE by testing for differences in treatment effects across similar RCTs. Pooling is meaningful only when the individual treatment effects are similar; in the presence of large clinical or methodological differences between the trials, it may be best to avoid meta-analysis. The presence of HTE across studies in a meta-analysis may be due to differences in the design or execution of the individual trials (e.g., randomization methods, patient selection criteria). Cochran's Q is a method for detecting heterogeneity, computed as the weighted sum of squared differences between each study's treatment effect and the pooled effect across the studies. It is a barometer of inter-trial differences impacting the observed study results. A possible source of error in a meta-analysis is publication bias. Trial size may introduce publication bias since larger trials are more likely to be published. Language and accessibility represent other potential confounding factors. When the heterogeneity is not due to poor study design, it may be useful for optimizing the treatment benefits for different cohorts of participants.

Cochran's Q statistic is the weighted sum of squares on a standardized scale<sup>8</sup>. The corresponding p-value indicates the strength of the evidence for the presence of heterogeneity. This test may have low power to detect heterogeneity, particularly when few studies are pooled, so a cut-off of 0.10 is suggested for significance (Higgins et al., 2003). Conversely, the Q statistic may have too much power as a test of heterogeneity when the number of studies is large.

===Simulation Example 1===

# Install and Load library
install.packages("meta")
library(meta)

# Set number of studies
n.studies = 15

# number of treatments: case1, case2, control
n.trt = 3

# number of outcomes
n.event = 2

# simulate the (balanced) number of cases (case1 and case2) and controls in each study
ctl.group = rbinom(n = n.studies, size = 200, prob = 0.3)
case1.group = rbinom(n = n.studies, size = 200, prob = 0.3)
case2.group = rbinom(n = n.studies, size = 200, prob = 0.3)

# Simulate the number of outcome events (e.g., deaths) and no events in the control group
event.ctl.group = rbinom(n = n.studies, size = ctl.group, prob = rep(0.1, length(ctl.group)))
noevent.ctl.group = ctl.group - event.ctl.group

# Simulate the number of events and no events in the case1 group
event.case1.group = rbinom(n = n.studies, size = case1.group, prob = rep(0.5, length(case1.group)))
noevent.case1.group = case1.group - event.case1.group

# Simulate the number of events and no events in the case2 group
event.case2.group = rbinom(n = n.studies, size = case2.group, prob = rep(0.6, length(case2.group)))
noevent.case2.group = case2.group - event.case2.group

# Run the univariate meta-analysis using metabin(): meta-analysis of binary outcome data.
# It calculates fixed and random effects estimates (risk ratio, odds ratio, risk difference or arcsine
# difference) for meta-analyses with binary outcome data. Mantel-Haenszel (MH),
# inverse variance and Peto methods are available for pooling.

# method = a character string indicating which method is used for pooling of studies,
# one of "MH", "Inverse", or "Peto"
# sm = a character string indicating which summary measure ("OR", "RR", "RD" = risk difference) is
# used for pooling of studies

# Control vs. Case1, n.e and n.c are numbers in experimental and control groups
meta.ctr_case1 <- metabin(event.e = event.case1.group, n.e = case1.group, event.c = event.ctl.group,
n.c = ctl.group, method = "MH", sm = "OR")
# here we use the odds ratio (OR), comparing the odds of death in the experimental and control groups
forest(meta.ctr_case1)

<center>[[Image:SMHS_Methods8.png|500px]] </center>

# Control vs. Case2
meta.ctr_case2 <- metabin(event.e = event.case2.group, n.e = case2.group, event.c = event.ctl.group,
n.c = ctl.group, method = "MH", sm = "OR")
forest(meta.ctr_case2)

<center>[[Image:SMHS_Methods9.png|500px]] </center>

# Case1 vs. Case2
meta.case1_case2 <- metabin(event.e = event.case1.group, n.e = case1.group, event.c = event.case2.group,
n.c = case2.group, method = "MH", sm = "OR")
forest(meta.case1_case2)
summary(meta.case1_case2)

Test of heterogeneity:
Q d.f. p-value
11.99 14 0.6071

<center>[[Image:SMHS_Methods10.png|500px]] </center>

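With ''Q'' = 11.99 on 14 degrees of freedom (''p'' = 0.6071), the test of heterogeneity provides no evidence against the null hypothesis of no between-study heterogeneity, so the fixed-effects model is appropriate. Because ''Q'' is smaller than its degrees of freedom, the corresponding ''I''<sup>2</sup> statistic, max(0, (''Q'' − d.f.)/''Q''), equals 0%.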

==Series of “N of 1” trials==

This technique combines (a “series of”) n-of-1 trial data to identify HTE. An n-of-1 trial is a repeated crossover trial for a single patient, which randomly assigns the patient to one treatment vs. another for a given time period, after which the patient is re-randomized to treatment for the next time period; this is usually repeated for 4-6 time periods. Such trials are most feasible in chronic conditions, where little or no washout period is needed between treatments and treatment effects are identifiable in the short term, e.g., via pain or reliable surrogate markers. Combining data from identical n-of-1 trials across a set of patients enables statistical analyses that control for patient fixed or random effects, covariates, centers, or sequence effects (see the figure below). These combined trials are often analyzed within a Bayesian context using shrinkage estimators that combine individual and group mean treatment effects to create a “posterior” individual mean treatment effect estimate, which is a form of inverse variance-weighted average of the individual and group effects. Such trials are typically more expensive than standard RCTs on a per-patient basis; however, they require much smaller sample sizes, often fewer than 100 patients (due to the efficient individual-as-own-control design), and yield individual treatment effect estimates that are not possible in a non-crossover design. For the individual patient, the treatment effect can be re-estimated after each time period, and the trial stopped at any point when the more effective treatment is identified with reasonable statistical certainty.

====Example====

A study involving 8 participants collected data across 30 days, in which 15 treatment days and 15 control days were randomly assigned within each participant. The treatment is represented as a binary variable (control day=0; treatment day=1). The outcome variable represents the response to the intervention within each of the 8 participants. The study employed fixed-effects modeling with N − 1 = 7 dummy-coded variables representing the N=8 participants, where the last (i=8) participant serves as the reference (i.e., is absorbed into the model intercept). Each dummy-coded variable thus represents the difference between participant i and the 8th participant, so all other participants' values are relative to the values of the 8th (reference) subject. The overall differences across participants in fixed effects can be evaluated with multiple degree-of-freedom F-tests.

<center>[[Image:SMHS_Methods11.png|500px]] </center>

<center>

{| class="wikitable" style="text-align:center; " border="1"
|-
|ID||Day||Tx||SelfEff||SelfEff25||WPSS||SocSuppt||PMss||PMss3||PhyAct
|-
|1||1||1||33||8||0.97||5.00||4.03||1.03||53
|-
|1||2||1||33||8||-0.17||3.87||4.03||1.03||73
|-
|1||3||0||33||8||0.81||4.84||4.03||1.03||23
|-
|1||4||0||33||8||-0.41||3.62||4.03||1.03||36
|-
|...||...||...||...||...||...||...||...||...||...

|}
</center> Complete data is available in the Appendix.

<center>Data Summary

{| class="wikitable" style="text-align:center; " border="1"
|-
|Intercept||Constant
|-
|Physical Activity||PhyAct
|-
|Intervention||Tx
|-
|WP Social Support||WPSS
|-
|PM Social Support (1-3)||PMss3
|-
|Self Efficacy||SelfEff25

|}
</center>

rm(list=ls())
Nof1 <-read.table("https://umich.instructure.com/files/330385/download?download_frd=1&verifier=DwJUGSd6t24dvK7uYmzA2aDyzlmsohyaK6P7jK0Q", sep=",", header = TRUE) # 02_Nof1_Data.csv
attach(Nof1)
head(Nof1)

<center>

{| class="wikitable" style="text-align:center; " border="1"
|-
| ||ID||Day||Tx||SelfEff||SelfEff25||WPSS||SocSuppt||PMss||PMss3||PhyAct
|-
|1||1||1||1||33||8||0.97||5.00||4.03||1.03||53
|-
|2||1||2||1||33||8||-0.17||3.87||4.03||1.03||73
|-
|3||1||3||0||33||8||0.81||4.84||4.03||1.03||23
|-
|4||1||4||0||33||8||-0.41||3.62||4.03||1.03||36
|-
|5||1||5||1||33||8||0.59||4.62||4.03||1.03||21
|-
|6||1||6||1||33||8||-1.16||2.87||4.03||1.03||0

|}
</center>

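# include every variable used by the models below (ID, Day and SelfEff were missing)
df.1 = data.frame(ID, Day, PhyAct, Tx, WPSS, PMss3, SelfEff, SelfEff25)

# install.packages("lme4")
library("lme4")

# Mixed model: fixed effects for Tx, SelfEff and their interaction; random intercepts for Day and ID
lm.1 <- lmer(PhyAct ~ Tx + SelfEff + Tx*SelfEff + (1|Day) + (1|ID) , data= df.1)
summary(lm.1)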

Linear mixed model fit by REML ['lmerMod']
Formula: PhyAct ~ Tx + SelfEff + Tx * SelfEff + (1 | Day) + (1 | ID)
Data: df.1

REML criterion at convergence: 8820

<center> Scaled Residuals
{| class="wikitable" style="text-align:center; " border="1"
|-
| Min||1Q||Median||3Q||Max
|-
|-2.7012||-0.6833||-0.0333||0.6542||3.9612
|}
</center>

<center> Random Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| Groups ||Name||Variance ||Std.Dev.
|-
| Day||(Intercept) ||0.0 || 0.00
|-

|ID|| (Intercept)||601.5||24.53
|-

|Residual|| ||969.0 ||31.13
|}
Number of obs: 900, groups: Day, 30; ID, 30
</center>

<center> Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t value
|-
|(Intercept)||38.3772||14.4738||2.651
|-
|Tx||4.0283||6.3745||0.632
|-
|SelfEff||0.5818||0.5942||0.979
|-
|Tx:SelfEff||0.9702||0.2617||3.708
|}
</center>

<center> Correlation of Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||(Intr)||Tx ||SlfEff
|-
| Tx|| -0.220|| ||
|-
| SelfEff||-0.946 ||0.208 ||
|-
| Tx:SelfEff ||0.208 ||-0.946 ||-0.220
|}
</center>

# Model: PhyAct = Tx + WPSS + PMss3 + Tx*WPSS + Tx*PMss3 + SelfEff25 + Tx*SelfEff25 + ε
lm.2 = lm(PhyAct ~ Tx + WPSS + PMss3 + Tx*WPSS + Tx*PMss3 + SelfEff25 + Tx*SelfEff25, df.1)
summary(lm.2)

Call:
lm(formula = PhyAct ~ Tx + WPSS + PMss3 + Tx * WPSS + Tx * PMss3 +
SelfEff25 + Tx * SelfEff25, data = df.1)

<center> Residuals
{| class="wikitable" style="text-align:center; " border="1"
|-
|Min||1Q||Median||3Q||Max
|-
| -102.39||-28.24||-1.47||25.16||122.41

|}
</center>

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t value||$Pr(>|t|)$
|-
|(Intercept)||52.0067||1.8080||28.764||< 2e-16 ***
|-
|Tx||27.7366||2.5569||10.848||< 2e-16 ***
|-
|WPSS||1.9631||2.4272||0.809||0.418853
|-
|PMss3||13.5110||2.7853||4.851||1.45e-06 ***
|-
|SelfEff25||0.6289||0.2205||2.852||0.004439 **
|-
|Tx:WPSS||9.9114||3.4320||2.888||0.003971 **
|-
|Tx:PMss3||8.8422||3.9390||2.245||0.025025 *
|-
|Tx:SelfEff25||1.0460||0.3118||3.354||0.000829 ***

|}
</center>

[Using SAS (StudyI_Analyses.sas, StudyIIab_Analyses.sas)]

<center> Type 3 Tests of Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
|Effect||Num DF||Den DF||F Value||$Pr>F$
|-
|Tx||1||224||67.46||<.0001
|-
|ID||7||224||25.95||<.0001
|-
|Tx*ID||7||224||2.92||0.0060
|}
</center>

==Quantile Treatment Effect (QTE)==

QTE employs quantile regression estimation (QRE) to examine the central tendency and statistical dispersion of the treatment effect in a population. These may not be revealed by the conventional mean estimation in RCTs. For instance, patients with different comorbidity scores may respond differently to a treatment. Quantile regression has the ability to reveal HTE according to the ranking of patients’ comorbidity scores or some other relevant covariate by which patients may be ranked. Therefore, in an attempt to inform patient-centered care, quantile regression provides more information on the distribution of the treatment effect than typical conditional mean treatment effect estimation. QTE characterizes the heterogeneous treatment effect on individuals and groups across various positions in the distributions of different outcomes of interest. This unique feature has brought quantile regression analysis substantial attention, and the method has been employed across a wide range of applications, particularly when evaluating the economic effects of welfare reform.

One caveat of applying QRE in clinical trials for examining HTE is that the QTE doesn’t demonstrate the treatment effect for a given patient. Instead, it focuses on the treatment effect among subjects within the qth quantile, such as those who are exactly at the top 10th percent in terms of blood pressure or a depression score for some covariate of interest, for example, comorbidity score. It is not uncommon for the qth quantiles to be two different sets of patients before and after the treatment. For this reason, we have to assume that these two groups of patients are homogeneous if they were in the same quantiles.

Income-Food Expenditure Example: Let’s examine the Engel data (N=235) on the relationship between food expenditure (foodexp) and household income (income). We can plot the data and then explore the superposition of the six fitted quantile regression lines.

install.packages("quantreg")
library(quantreg)
data(engel)
attach(engel)

<center>head(engel)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Income||Foodexp
|-
|1||420.1577||255.8394
|-
|2||541.4117||310.9587
|-
|3||901.1575||485.6800
|-
|4||639.0802||402.9974
|-
|5||750.8756||495.5608
|-
|6||945.7989||633.7978

|}
</center>

<center>summary(engel)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Income||Foodexp
|-
|Min||377.1||242.3
|-
|1st Qu.||638.9||429.7
|-
|Median||884.0||582.5
|-
|Mean||982.5||624.2
|-
|3rd Qu.||1164.0||743.9
|-
|Max||4957.8||2032.7

|}
</center>

Note: If $Y$ is a real-valued random variable with cumulative distribution function $F_Y(y)=P(Y\le y)$, then the $\tau$-quantile of $Y$ is given by

<center> $Q_Y(\tau)=F_Y^{-1}(\tau)=\inf\{y: F_Y(y)\ge\tau\}$ </center>

where $0\le\tau\le 1$.
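
The quantile definition can be checked numerically. The following is a minimal sketch in base R, assuming a small made-up sample; the helper name emp_quantile is hypothetical and is not part of quantreg. It evaluates the infimum directly on the empirical CDF and compares the result with quantile(..., type = 1), which uses the same inverse-ECDF rule.

```r
# Empirical tau-quantile computed directly from the definition
# Q_Y(tau) = inf{ y : F_Y(y) >= tau }, with F_Y replaced by the empirical CDF.
# emp_quantile is a hypothetical helper written for this illustration.
emp_quantile <- function(y, tau) {
  stopifnot(tau > 0, tau <= 1)
  Fy <- ecdf(y)                 # empirical CDF of the sample
  ys <- sort(unique(y))         # candidate values in increasing order
  ys[which(Fy(ys) >= tau)[1]]   # smallest y whose CDF value reaches tau
}

y <- c(2, 7, 1, 9, 4, 4, 6, 3, 8, 5)   # illustrative sample (made-up values)
q25 <- emp_quantile(y, 0.25)
q50 <- emp_quantile(y, 0.50)
c(q25 = q25, q50 = q50)

# quantile(..., type = 1) applies the same inverse-ECDF definition
quantile(y, probs = c(0.25, 0.50), type = 1)
```

Note that R's default quantile() (type = 7) interpolates between order statistics, so it can differ slightly from this definition on small samples.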

<center>[[Image:SMHS_Methods12.png|500px]] </center>

# (1) Graphics
plot(income, foodexp, cex=.25, type="n", xlab="Household Income", ylab="Food Expenditure")
points(income, foodexp, cex=.5, col="blue")

# tau - the quantile(s) to be estimated, in the range from 0 to 1. An object "rq.process" and an object "rqs"
# are returned containing the matrix of coefficient estimates at the specified quantiles.
abline( rq(foodexp ~ income, tau=.5), col="blue") # Quantile Regression Model

abline( lm(foodexp ~ income), lty=2, lwd=3, col="red") # linear model
taus <- c(0.05, 0.1, 0.25, 0.75, 0.90, 0.95)
colors <- rainbow(length(taus))

models <- vector(mode = "list", length = length(taus)) # define a vector of models to store QR for diff taus
model.names <- vector(mode = "list", length = length(taus)) # define a vector model names

for( i in 1:length(taus)){
models[[i]] <- rq(foodexp ~ income, tau=taus[i])
var <- taus[i]
model.names[[i]] <- paste("Model [", i , "]: tau=", var)
abline( models[[i]], lwd=2, col= colors[[i]])
}
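# the legend shows line colors (lty/lwd) rather than point symbols, since the models are drawn as lines
legend(3000, 1100, unlist(model.names), col= colors, lty=1, lwd=2, bty='n', cex=.75)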

<center>[[Image:SMHS_Methods13.png|500px]] </center>

# (2) Inference about quantile regression coefficients. As an alternative to the rank-inversion confidence intervals, we can obtain a table of coefficients, standard errors, t-statistics, and p-values using the summary function:

summary(models[[3]], se = "nid")

Call: rq(formula = foodexp ~ income, tau = taus[i])

tau: [1] 0.25

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Value||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||95.48354||21.39237||4.46344||0.00001
|-
|Income||0.47410||0.02906||16.31729||0.00000

|}
</center>

# Equivalently, we can call summary.rq directly; setting se = "boot" instead of "nid" would report bootstrapped standard errors.
summary.rq(models[[3]], se = "nid")

Call: rq(formula = foodexp ~ income, tau = taus[i])
tau: [1] 0.25

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Value||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||95.48354||21.39237||4.46344||0.00001
|-
|Income||0.47410||0.02906||16.31729||0.00000

|}
</center>

==Nonparametric Regression Methods ==

Nonparametric regression enables dealing with HTE in RCTs. Different nonparametric methods, such as kernel smoothing methods and series methods, can be used to generate test statistics for examining the presence of HTE. A kernel method is a weighting scheme based on a kernel function (e.g. uniform, Gaussian). When evaluating the treatment effect of a patient in RCTs, the kernel method assigns larger weights to those observations with similar covariates. This is done because it is assumed that patients with similar covariates provide more relevant data on predicted treatment response. Examining participants that have different backgrounds (e.g., demographic, clinical), kernel smoothing methods utilize information from highly divergent participants when estimating a particular subject’s treatment effect. Lower weights are assigned to very different subjects and the kernel methods require choosing a set of smoothing parameters to group patients according to their relative degree of similarities. A drawback is that the corresponding proposed test statistics may be sensitive to the chosen bandwidths, which inhibits the interpretation of the results. Series methods use approximating functions (splines or power series of the explanatory variables) to construct test statistics. Compared to kernel smoothing methods, series methods normally have the advantage of computational convenience; however, the precision of test statistics depends on the number of terms selected in the series.
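
The kernel weighting scheme and its sensitivity to the bandwidth can be illustrated with a minimal sketch. It assumes a Gaussian kernel, a single covariate, and simulated data; the function name nw_estimate is hypothetical and this is not the estimator code used by the np package.

```r
# Nadaraya-Watson (local-constant) kernel regression with a Gaussian kernel.
# nw_estimate is a hypothetical helper; x, y and the bandwidths are simulated/assumed.
nw_estimate <- function(x0, x, y, h) {
  sapply(x0, function(pt) {
    w <- dnorm((x - pt) / h)   # larger weights for observations with x close to pt
    sum(w * y) / sum(w)        # kernel-weighted average of the responses
  })
}

set.seed(1)
x <- runif(200, 0, 10)                  # simulated covariate
y <- sin(x) + rnorm(200, sd = 0.3)      # simulated response with a nonlinear mean
grid <- seq(1, 9, by = 0.5)             # evaluation points

fit.narrow <- nw_estimate(grid, x, y, h = 0.3)  # small bandwidth: follows local structure
fit.wide   <- nw_estimate(grid, x, y, h = 3)    # large bandwidth: over-smoothed

# Compare both fits against the true mean function sin(x)
round(cbind(grid, truth = sin(grid), fit.narrow, fit.wide), 2)
```

The two fits differ only in the bandwidth h, which is why test statistics built on such estimates can depend strongly on how the bandwidth is chosen.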

Canadian Wage Data Example: Nonparametric regression extends the classical parametric regression (e.g., lm, lmer) involving one continuous dependent variable, y, and (1 or more) continuous explanatory variable(s), x. Let’s start with a popular parametric model of a wage equation that we can extend to a fully nonparametric regression model. First, we will compare and contrast the parametric and nonparametric approach towards univariate regression and then proceed to multivariate regression.

Let’s use the Canadian cross-section wage data (cps71), a random sample taken from the 1971 Canadian Census for male individuals having common education (High-School). The data contain N=205 observations on 2 variables: the logarithm of the individual’s wage (logwage) and their age (age). The classical wage equation model includes a quadratic term of age.

# install.packages("np")
library("np")
data("cps71")

# (1) Linear Model -> R2 = 0.2308
model.lin <- lm( logwage ~ age + I(age^2), data = cps71)
summary(model.lin)

Call:
lm(formula = logwage ~ age + I(age^2), data = cps71)

Residuals:
Min 1Q Median 3Q Max
-2.4041 -0.1711 0.0884 0.3182 1.3940

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||10.0419773||0.4559986||22.022||< 2e-16 ***
|-
|age||0.1731310||0.0238317|| 7.265||7.96e-12 ***
|-
|I(age^2)||-0.0019771||0.0002898||-6.822||1.02e-10 ***

|}
</center>

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.5608 on 202 degrees of freedom
Multiple R-squared: 0.2308, Adjusted R-squared: 0.2232
F-statistic: 30.3 on 2 and 202 DF, p-value: 3.103e-12

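# (2) Next, we consider a nonparametric kernel regression with cross-validated bandwidth selection.
# npregbw() computes the least-squares cross-validated bandwidth for the local constant
# estimator (its default, regtype = "lc"). npreg() then fits the model using that bandwidth object,
# which is why the summary below reports a Local-Constant estimator.
# For a local linear fit, regtype = "ll" would need to be passed to npregbw().
# Note that R2 = 0.3108675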
bandwidth <- npregbw(formula= logwage ~ age, data = cps71)
model.np <- npreg(bandwidth, regtype = "ll", bwmethod = "cv.aic", gradients = TRUE, data = cps71)
summary(model.np)

Regression Data: 205 training points, in 1 variable(s) age
Bandwidth(s): 1.892157
Kernel Regression Estimator: Local-Constant
Bandwidth Type: Fixed
Residual standard error: 0.5307943
R-squared: 0.3108675
Continuous Kernel Type: Second-Order Gaussian
No. Continuous Explanatory Vars.: 1

# NP model significance may be tested by
npsigtest(model.np)

Kernel Regression Significance Test
Type I Test with IID Bootstrap (399 replications, Pivot=TRUE, joint=FALSE)
Explanatory variables tested for significance: age (1)

age
Bandwidth(s): 1.892157

Individual Significance Tests
P Value:
age < 2.22e-16 ***

# So, as was the case for the (quadratic) parametric model, age is significant in the nonparametric kernel model

# (3) Graphical comparison of parametric and nonparametric models.
plot(cps71$\$$age, cps71$\$$logwage, xlab = "age", ylab = "log(wage)", cex=.1)
lines(cps71$\$$age, fitted(model.lin), lty = 2, col = "red")
lines(cps71$\$$age, fitted(model.np), lty = 1, col = "blue")
legend("topright", c("Data", "Linear", "Non-linear"), col=c("Black", "Red", "Blue"), pch = c(1, 1, 1), bty='n', cex=.75)

<center>[[Image:SMHS_Methods14.png|500px]] </center>

# some additional plots presenting the parametric (quadratic, dashed line) and the nonparametric estimates
# (solid line) of the regression function for the cps71 data.
plot(model.np, plot.errors.method = "asymptotic")
plot(model.np, gradients = TRUE)
lines(cps71$\$$age, coef(model.lin)[2]+2*cps71$\$$age*coef(model.lin)[3], lty = 2, col = "red")
plot(model.np, gradients = TRUE, plot.errors.method = "asymptotic")

# (4) using the Lin and NL models to generate predictions based on the obtained appropriate
# bandwidths and estimated a nonparametric model. We need to create a set of explanatory
# variables for which to generate predictions. These can be part of the original dataset or be
# outside its scope. Typically, we don’t have the outcome for the evaluation data and need only
# provide the explanatory variables for which predicted values are generated by the models.
# Occasionally, splitting the dataset into two independent samples (training/testing), allows estimation
# of a model on one sample, and evaluation of its performance on another.

cps.eval.data <- data.frame(age = seq(10,70, by=10)) # simulate some explanatory X values (ages)
pred.lin <- predict(model.lin, newdata = cps.eval.data) # Linear Prediction of log(Wage)
pred.np <- predict(model.np, newdata = cps.eval.data) # non-Linear Prediction of log(Wage)
plot(pred.lin, pred.np)
abline(lm(pred.np ~ pred.lin))

<center>[[Image:SMHS_Methods15.png|500px]] </center>

.
.
.

==Predictive risk models ==

Predictive risk models represent a class of methods for identifying potential for HTE when the individual patient risk for disease-related events at baseline depends on observed factors. For instance, common measures are disease staging criteria, such as those used in COPD or heart failure, Framingham risk scores for cardiovascular event risk, or genetic variations, e.g., HER2 for breast cancer. Initial predictive risk modeling, also known as risk function estimation, is often performed without accounting for treatment effects. Least squares or Cox proportional hazards regression methods are appropriate in many cases and provide relatively more interpretable risk functions, but rely on linearity assumptions and may not provide optimal predictive metrics. Partial least squares is an extension of least squares methods that can reduce the dimensionality of the predictor space by interposing latent variables, predicted by linear combinations of observable characteristics, as the intermediate predictors of one or more outcomes. Recursive partitioning methods (such as random forests), support vector machines, and neural networks represent more recent methods that may offer better predictive power than linear methods. Risk function estimation can range from highly exploratory analyses to near meta-analytic model validation, and may be useful at any stage of product development.

HIV Example: The “hmohiv” dataset represents a study of HIV positive patients examining whether there was a difference in survival times of HIV positive patients between a cohort using intravenous drugs (drug=1) and a cohort not using the IV drug (drug=0). The hmohiv data includes the following variables:

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
|ID||Time||Age||Drug||Censor||Entdate||Enddate
|-
|1||5||46||0||1||5/15/1990||10/14/1990
|-
|2||6||35||1||0||9/19/1989||3/20/1990
|-
|3||8||30||1||1||4/21/1991||12/20/1991
|-
|4||3||30||1||1||1/3/1991||4/4/1991
|-
|5||22||36||0||1||9/18/1989||7/19/1991
|-
|6||1||32||1||0||3/18/1991||4/17/1991
|-
|...||...||...||...||...||...||...

|}
</center>

#cleaning up environment
rm(list=ls())

# load survival library
library(survival)

# load hmohiv data
hmohiv<-read.table("http://www.ats.ucla.edu/stat/r/examples/asa/hmohiv.csv", sep=",", header = TRUE)
attach(hmohiv)

# construct a frame of the 2 cohorts IV_drug and no-IV-drug (must exist before calling survfit)
drug.new<-data.frame(drug=c(0,1))

# Fit Cox proportional hazards regression model
cox.model <- coxph( Surv(time, censor) ~ drug, method="breslow")
fit.1 <- survfit(cox.model, newdata=drug.new)

# plot results
plot(fit.1, xlab="Survival Time (Months)", ylab="Survival Probability")
points(fit.1$\$$time, fit.1$\$$surv[,1], pch=1)
points(fit.1$\$$time, fit.1$\$$surv[,2], pch=2)
legend(40, .8, c("Drug Absent", "Drug Present"), pch=c(1,2))

<center>[[Image:SMHS_Methods16.png|500px]] </center>

# to inspect the resulting Cox Proportional Hazard Model
cox.model
Call:
coxph(formula = Surv(time, censor) ~ drug, method = "breslow")

coef exp(coef) se(coef) z p
drug 0.779 2.18 0.242 3.22 0.0013

Likelihood ratio test=10.2 on 1 df, p=0.00141 n= 100, number of events= 80

===Footnotes===

8 http://onlinelibrary.wiley.com/enhanced/doi/10.1002/jrsm.54

==[[SMHS_MethodsHeterogeneity_CER|Next see: Comparative Effectiveness Research (CER)]]==

*[[SMHS_MethodsHeterogeneity|Back to the Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research section]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php/SMHS_MethodsHeterogeneity_MetaAnalysis}}

SMHS MethodsHeterogeneity MetaAnalysis

2016-05-23T18:44:03Z

Pineaumi: /* Predictive risk models */

==[[SMHS_MethodsHeterogeneity| Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research]] - Meta-Analyses ==

==Meta-analysis==

===Overview===

Meta-analysis is an approach to combine treatment effects across trials or studies into an aggregated treatment effect with higher statistical power than observed in each individual trial. It may detect HTE by testing for differences in treatment effects across similar RCTs. It requires that the individual treatment effects are similar to ensure pooling is meaningful. In the presence of large clinical or methodological differences between the trials, it may be best to avoid meta-analyses. The presence of HTE across studies in a meta-analysis may be due to differences in the design or execution of the individual trials (e.g., randomization methods, patient selection criteria). Cochran's Q is a method for detecting heterogeneity, computed as the weighted sum of squared differences between each study's treatment effect and the pooled effect across the studies. It is a barometer of inter-trial differences impacting the observed study result. A possible source of error in a meta-analysis is publication bias. Trial size may introduce publication bias since larger trials are more likely to be published. Language and accessibility represent other potential confounding factors. When the heterogeneity is not due to poor study design, it may be useful to optimize the treatment benefits for different cohorts of participants.

Cochran's Q statistic is the weighted sum of squares on a standardized scale<sup>8</sup>. The corresponding P value indicates the strength of the evidence for the presence of heterogeneity. Because this test may sometimes have low power to detect heterogeneity, a value of 0.10 is suggested as the cut-off for significance (Higgins et al., 2003). Conversely, the Q statistic may have too much power as a test of heterogeneity when the number of studies is large.
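
As a minimal sketch of how Q can be computed by hand, the base R code below uses inverse-variance weights and a fixed-effect pooled estimate, and also reports the I<sup>2</sup> statistic of Higgins et al. (2003). The function name cochran_q and the five effect sizes and variances are hypothetical values chosen for illustration; they do not come from the simulation in this section.

```r
# Cochran's Q and I^2 from per-study effect estimates and their variances.
# cochran_q is a hypothetical helper; theta and v below are made-up inputs.
cochran_q <- function(theta, v) {
  w <- 1 / v                                  # inverse-variance weights
  theta.pooled <- sum(w * theta) / sum(w)     # fixed-effect pooled estimate
  Q <- sum(w * (theta - theta.pooled)^2)      # weighted sum of squared deviations
  df <- length(theta) - 1
  p <- pchisq(Q, df = df, lower.tail = FALSE) # Q ~ chi-square(df) under homogeneity
  I2 <- max(0, (Q - df) / Q)                  # I^2 = (Q - df)/Q, truncated at 0
  list(pooled = theta.pooled, Q = Q, df = df, p.value = p, I2 = I2)
}

# Hypothetical log odds ratios and their variances for 5 studies
theta <- c(0.10, 0.30, 0.35, 0.65, 0.45)
v     <- c(0.03, 0.03, 0.05, 0.01, 0.05)
res <- cochran_q(theta, v)
res
```

Comparing res$p.value with the 0.10 cut-off gives the heterogeneity decision described above; when Q does not exceed its degrees of freedom, I<sup>2</sup> is reported as 0.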

===Simulation Example 1===

# Install and Load library
install.packages("meta")
library(meta)

# Set number of studies
n.studies = 15

# number of treatments: case1, case2, control
n.trt = 3

# number of outcomes
n.event = 2

# simulate the (balanced) number of cases (case1 and case2) and controls in each study
ctl.group = rbinom(n = n.studies, size = 200, prob = 0.3)
case1.group = rbinom(n = n.studies, size = 200, prob = 0.3)
case2.group = rbinom(n = n.studies, size = 200, prob = 0.3)

# Simulate the number of outcome events (e.g., deaths) and no events in the control group
event.ctl.group = rbinom(n = n.studies, size = ctl.group, prob = rep(0.1, length(ctl.group)))
noevent.ctl.group = ctl.group - event.ctl.group

# Simulate the number of events and no events in the case1 group
event.case1.group = rbinom(n = n.studies, size = case1.group, prob = rep(0.5, length(case1.group)))
noevent.case1.group = case1.group - event.case1.group

# Simulate the number of events and no events in the case2 group
event.case2.group = rbinom(n = n.studies, size = case2.group, prob = rep(0.6, length(case2.group)))
noevent.case2.group = case2.group - event.case2.group

# Run the univariate meta-analysis using metabin(), Meta-analysis of binary outcome data –
# Calculation of fixed and random effects estimates (risk ratio, odds ratio, risk difference or arcsine
# difference) for meta-analyses with binary outcome data. Mantel-Haenszel (MH),
# inverse variance and Peto method are available for pooling.

# method = A character string indicating which method is to be used for pooling of studies,
# one of "MH", "Inverse", or "Peto"
# sm = A character string indicating which summary measure ("OR", "RR", "RD" = risk difference) is to be
# used for pooling of studies

# Control vs. Case1, n.e and n.c are numbers in experimental and control groups
meta.ctr_case1 <- metabin(event.e = event.case1.group, n.e = case1.group, event.c = event.ctl.group,
n.c = ctl.group, method = "MH", sm = "OR")
# in this case we use Odds Ratio, of the odds of death in the experimental and control studies
forest(meta.ctr_case1)

<center>[[Image:SMHS_Methods8.png|500px]] </center>

# Control vs. Case2
meta.ctr_case2 <- metabin(event.e = event.case2.group, n.e = case2.group, event.c = event.ctl.group,
n.c = ctl.group, method = "MH", sm = "OR")
forest(meta.ctr_case2)

<center>[[Image:SMHS_Methods9.png|500px]] </center>

# Case1 vs. Case2
meta.case1_case2 <- metabin(event.e = event.case1.group, n.e = case1.group, event.c = event.case2.group,
n.c = case2.group, method = "MH", sm = "OR")
forest(meta.case1_case2)
summary(meta.case1_case2)

Test of heterogeneity:
Q d.f. p-value
11.99 14 0.6071

<center>[[Image:SMHS_Methods10.png|500px]] </center>

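For the case1 vs. case2 comparison, the test of heterogeneity (Q = 11.99 on 14 degrees of freedom, p = 0.6071) provides no evidence to reject the null hypothesis of no between-study heterogeneity, so the fixed-effects model is appropriate. Since Q is smaller than its degrees of freedom, the corresponding I<sup>2</sup> statistic is 0%.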

==Series of “N of 1” trials==

This technique combines data from a series of n-of-1 trials to identify HTE. An n-of-1 trial is a repeated crossover trial for a single patient, which randomly assigns the patient to one treatment vs. another for a given time period, after which the patient is re-randomized to treatment for the next time period; this is usually repeated for 4-6 time periods. Such trials are most feasible in chronic conditions, where little or no washout period is needed between treatments and treatment effects are identifiable in the short term, such as pain or reliable surrogate markers. Combining data from identical n-of-1 trials across a set of patients enables statistical analysis that controls for patient fixed or random effects, covariates, centers, or sequence effects (see the figure below). These combined trials are often analyzed within a Bayesian context using shrinkage estimators that combine individual and group mean treatment effects to create a “posterior” individual mean treatment effect estimate, which is a form of inverse variance-weighted average of the individual and group effects. Such trials are typically more expensive than standard RCTs on a per-patient basis. However, they require much smaller sample sizes, often fewer than 100 patients (due to the efficient individual-as-own-control design), and create individual treatment effect estimates that are not possible in a non-crossover design. For the individual patient, the treatment effect can be re-estimated after each time period, and the trial stopped at any point when the more effective treatment is identified with reasonable statistical certainty.
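
The inverse variance-weighted shrinkage idea can be sketched in a few lines of base R. This is only an illustration under a simple normal-normal model with a known group mean and between-patient variance; the function name shrink and all numeric inputs are assumptions, not values from the study described in this section.

```r
# Precision-weighted (shrinkage) estimate of each patient's treatment effect:
# a weighted average of the individual estimate and the group mean, with weights
# given by the inverse variances. shrink is a hypothetical helper.
shrink <- function(y.i, se.i, mu, tau2) {
  w.ind   <- 1 / se.i^2    # precision of the individual estimate
  w.group <- 1 / tau2      # precision of the group-level distribution
  (w.ind * y.i + w.group * mu) / (w.ind + w.group)
}

# Made-up individual effect estimates and standard errors for 4 patients
y.i  <- c(12, 3, 8, -1)
se.i <- c(2, 6, 3, 5)
mu   <- 6      # assumed group mean treatment effect
tau2 <- 9      # assumed between-patient variance of the true effects

post <- shrink(y.i, se.i, mu, tau2)
cbind(raw = y.i, se = se.i, shrunk = round(post, 2))
```

Patients whose individual estimates are noisy (large standard errors) are pulled more strongly toward the group mean, while precisely estimated individual effects stay close to their raw values.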

====Example====

A study involving 8 participants collected data across 30 days, with 15 treatment days and 15 control days randomly assigned within each participant. The treatment effect is represented as a binary variable (control day=0; treatment day=1). The outcome variable represents the response to the intervention within each of the 8 participants. The study employed fixed-effects modeling, creating N − 1 dummy-coded variables to represent the N=8 participants, where the last (i=8) participant serves as the reference (i.e., as the model intercept). Each dummy-coded variable therefore represents the difference between participant i and the 8th participant, so all other participants' values are relative to those of the 8th (reference) subject. The overall differences across participants in fixed effects can be evaluated with multiple degree-of-freedom F-tests.
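
The dummy coding with the 8th participant as reference can be sketched on simulated data. Everything below is an assumption made for illustration: the data frame sim, the baseline values, the treatment effect of 20, and the noise level are made up, and only the design (8 participants, 30 days, 15 treatment days each) follows the description above.

```r
set.seed(2016)
n.id <- 8; n.day <- 30
sim <- data.frame(
  ID  = factor(rep(1:n.id, each = n.day)),
  Day = rep(1:n.day, times = n.id)
)
# Randomly assign 15 treatment and 15 control days within each participant
sim$Tx <- unlist(lapply(1:n.id, function(i) sample(rep(c(0, 1), each = n.day / 2))))
# Made-up outcome: participant-specific baseline plus a common treatment effect
baseline <- seq(30, 65, length.out = n.id)
sim$PhyAct <- baseline[as.integer(sim$ID)] + 20 * sim$Tx + rnorm(nrow(sim), sd = 15)

# Use participant 8 as the reference level, giving N - 1 = 7 dummy variables
sim$ID <- relevel(sim$ID, ref = "8")
colnames(model.matrix(~ ID, data = sim))

fit <- lm(PhyAct ~ Tx * ID, data = sim)
anova(fit)   # multiple degree-of-freedom F-tests for Tx, ID and Tx:ID
```

With 240 observations and 16 estimated coefficients, the residual degrees of freedom are 224, and the ID and Tx:ID terms each carry 7 degrees of freedom. Note that anova() reports sequential (Type I) tests, which need not coincide with Type 3 tests from other software.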

<center>[[Image:SMHS_Methods11.png|500px]] </center>

<center>

{| class="wikitable" style="text-align:center; " border="1"
|-
|ID||Day||Tx||SelfEff||SelfEff25||WPSS||SocSuppt||PMss||PMss3||PhyAct
|-
|1||1||1||33||8||0.97||5.00||4.03||1.03||53
|-
|1||2||1||33||8||-0.17||3.87||4.03||1.03||73
|-
|1||3||0||33||8||0.81||4.84||4.03||1.03||23
|-
|1||4||0||33||8||-0.41||3.62||4.03||1.03||36
|-
|...||...||...||...||...||...||...||...||...||...

|}
</center> Complete data is available in the Appendix.

<center>Data Summary

{| class="wikitable" style="text-align:center; " border="1"
|-
|Intercept||Constant
|-
|Physical Activity||PhyAct
|-
|Intervention||Tx
|-
|WP Social Support||WPSS
|-
|PM Social Support (1-3)||PMss3
|-
|Self Efficacy||SelfEff25

|}
</center>

rm(list=ls())
Nof1 <-read.table("https://umich.instructure.com/files/330385/download?download_frd=1&verifier=DwJUGSd6t24dvK7uYmzA2aDyzlmsohyaK6P7jK0Q", sep=",", header = TRUE) # 02_Nof1_Data.csv
attach(Nof1)
head(Nof1)

<center>

{| class="wikitable" style="text-align:center; " border="1"
|-
| ||ID||Day||Tx||SelfEff||SelfEff25||WPSS||SocSuppt||PMss||PMss3||PhyAct
|-
|1||1||1||1||33||8||0.97||5.00||4.03||1.03||53
|-
|2||1||2||1||33||8||-0.17||3.87||4.03||1.03||73
|-
|3||1||3||0||33||8||0.81||4.84||4.03||1.03||23
|-
|4||1||4||0||33||8||-0.41||3.62||4.03||1.03||36
|-
|5||1||5||1||33||8||0.59||4.62||4.03||1.03||21
|-
|6||1||6||1||33||8||-1.16||2.87||4.03||1.03||0

|}
</center>

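# include every variable used by the models below (ID, Day and SelfEff were missing)
df.1 = data.frame(ID, Day, PhyAct, Tx, WPSS, PMss3, SelfEff, SelfEff25)

# install.packages("lme4")
library("lme4")

# Mixed model: fixed effects for Tx, SelfEff and their interaction; random intercepts for Day and ID
lm.1 <- lmer(PhyAct ~ Tx + SelfEff + Tx*SelfEff + (1|Day) + (1|ID) , data= df.1)
summary(lm.1)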

Linear mixed model fit by REML ['lmerMod']
Formula: PhyAct ~ Tx + SelfEff + Tx * SelfEff + (1 | Day) + (1 | ID)
Data: df.1

REML criterion at convergence: 8820

<center> Scaled Residuals
{| class="wikitable" style="text-align:center; " border="1"
|-
| Min||1Q||Median||3Q||Max
|-
|-2.7012||-0.6833||-0.0333||0.6542||3.9612
|}
</center>

<center> Random Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| Groups ||Name||Variance ||Std.Dev.
|-
| Day||(Intercept) ||0.0 || 0.00
|-

|ID|| (Intercept)||601.5||24.53
|-

|Residual|| ||969.0 ||31.13
|}
Number of obs: 900, groups: Day, 30; ID, 30
</center>

<center> Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t value
|-
|(Intercept)||38.3772||14.4738||2.651
|-
|Tx||4.0283||6.3745||0.632
|-
|SelfEff||0.5818||0.5942||0.979
|-
|Tx:SelfEff||0.9702||0.2617||3.708
|}
</center>

<center> Correlation of Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||(Intr)||Tx ||SlfEff
|-
| Tx|| -0.220|| ||
|-
| SelfEff||-0.946 ||0.208 ||
|-
| Tx:SelfEff ||0.208 ||-0.946 ||-0.220
|}
</center>

# Model: PhyAct = Tx + WPSS + PMss3 + Tx*WPSS + Tx*PMss3 + SelfEff25 + Tx*SelfEff25 + ε
lm.2 = lm(PhyAct ~ Tx + WPSS + PMss3 + Tx*WPSS + Tx*PMss3 + SelfEff25 + Tx*SelfEff25, df.1)
summary(lm.2)

Call:
lm(formula = PhyAct ~ Tx + WPSS + PMss3 + Tx * WPSS + Tx * PMss3 +
SelfEff25 + Tx * SelfEff25, data = df.1)

<center> Residuals
{| class="wikitable" style="text-align:center; " border="1"
|-
|Min||1Q||Median||3Q||Max
|-
| -102.39||-28.24||-1.47||25.16||122.41

|}
</center>

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t value||$Pr(>|t|)$
|-
|(Intercept)||52.0067||1.8080||28.764||< 2e-16 ***
|-
|Tx||27.7366||2.5569||10.848||< 2e-16 ***
|-
|WPSS||1.9631||2.4272||0.809||0.418853
|-
|PMss3||13.5110||2.7853||4.851||1.45e-06 ***
|-
|SelfEff25||0.6289||0.2205||2.852||0.004439 **
|-
|Tx:WPSS||9.9114||3.4320||2.888||0.003971 **
|-
|Tx:PMss3||8.8422||3.9390||2.245||0.025025 *
|-
|Tx:SelfEff25||1.0460||0.3118||3.354||0.000829 ***

|}
</center>

[Using SAS (StudyI_Analyses.sas, StudyIIab_Analyses.sas)]

<center> Type 3 Tests of Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
|Effect||Num DF||Den DF||F Value||$Pr>F$
|-
|Tx||1||224||67.46||<.0001
|-
|ID||7||224||25.95||<.0001
|-
|Tx*ID||7||224||2.92||0.0060
|}
</center>

==Quantile Treatment Effect (QTE)==

QTE employs quantile regression estimation (QRE) to examine the central tendency and statistical dispersion of the treatment effect in a population. These may not be revealed by the conventional mean estimation in RCTs. For instance, patients with different comorbidity scores may respond differently to a treatment. Quantile regression has the ability to reveal HTE according to the ranking of patients’ comorbidity scores or some other relevant covariate by which patients may be ranked. Therefore, in an attempt to inform patient-centered care, quantile regression provides more information on the distribution of the treatment effect than typical conditional mean treatment effect estimation. QTE characterizes the heterogeneous treatment effect on individuals and groups across various positions in the distributions of different outcomes of interest. This unique feature has brought quantile regression analysis substantial attention, and the method has been employed across a wide range of applications, particularly when evaluating the economic effects of welfare reform.

One caveat of applying QRE in clinical trials for examining HTE is that the QTE doesn’t demonstrate the treatment effect for a given patient. Instead, it focuses on the treatment effect among subjects within the qth quantile, such as those who are exactly at the top 10th percent in terms of blood pressure or a depression score for some covariate of interest, for example, comorbidity score. It is not uncommon for the qth quantiles to be two different sets of patients before and after the treatment. For this reason, we have to assume that these two groups of patients are homogeneous if they were in the same quantiles.
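
To make the distinction between a quantile treatment effect and a mean treatment effect concrete, here is a minimal sketch on simulated two-arm data. The outcome distributions, sample size, and quantile levels are assumptions chosen for illustration. Under randomization, the QTE at level τ is taken as the difference between the τ-quantiles of the treated and control outcome distributions.

```r
set.seed(42)
n <- 500
control <- rnorm(n, mean = 50, sd = 10)   # simulated control-arm outcomes
treated <- rnorm(n, mean = 55, sd = 20)   # simulated treated-arm outcomes (shifted, more dispersed)

taus <- c(0.10, 0.25, 0.50, 0.75, 0.90)
# QTE at each tau: difference between arm-specific outcome quantiles
qte <- quantile(treated, taus) - quantile(control, taus)
# Conventional mean treatment effect for comparison
ate <- mean(treated) - mean(control)

round(rbind(qte, ate = ate), 2)
```

Here the treatment changes the spread of the outcome as well as its center, so the estimated effect is negative in the lower tail and large in the upper tail, a pattern that the single mean difference cannot show. As noted above, these quantile contrasts compare distributions and do not identify the effect for any given patient.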

Income-Food Expenditure Example: Let’s examine the Engel data (N=235) on the relationship between food expenditure (foodexp) and household income (income). We can plot the data and then explore the superposition of the six fitted quantile regression lines.

install.packages("quantreg")
library(quantreg)
data(engel)
attach(engel)

<center>head(engel)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Income||Foodexp
|-
|1||420.1577||255.8394
|-
|2||541.4117||310.9587
|-
|3||901.1575||485.6800
|-
|4||639.0802||402.9974
|-
|5||750.8756||495.5608
|-
|6||945.7989||633.7978

|}
</center>

<center>summary(engel)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Income||Foodexp
|-
|Min||377.1||242.3
|-
|1st Qu.||638.9||429.7
|-
|Median||884.0||582.5
|-
|Mean||982.5||624.2
|-
|3rd Qu.||1164.0||743.9
|-
|Max||4957.8||2032.7

|}
</center>

Note: If $Y$ is a real-valued random variable with cumulative distribution function $F_Y(y)=P(Y\le y)$, then the $\tau$-quantile of $Y$ is given by

<center> $Q_Y(\tau)=F_Y^{-1}(\tau)=\inf\{y: F_Y(y)\ge\tau\}$ </center>

where $0\le\tau\le 1$.

<center>[[Image:SMHS_Methods12.png|500px]] </center>

# (1) Graphics
plot(income, foodexp, cex=.25, type="n", xlab="Household Income", ylab="Food Expenditure")
points(income, foodexp, cex=.5, col="blue")

# tau - the quantile(s) to be estimated, in the range from 0 to 1. An object "rq.process" and an object "rqs"
# are returned containing the matrix of coefficient estimates at the specified quantiles.
abline( rq(foodexp ~ income, tau=.5), col="blue") # Quantile Regression Model

abline( lm(foodexp ~ income), lty=2, lwd=3, col="red") # linear model
taus <- c(0.05, 0.1, 0.25, 0.75, 0.90, 0.95)
colors <- rainbow(length(taus))

models <- vector(mode = "list", length = length(taus)) # define a vector of models to store QR for diff taus
model.names <- vector(mode = "list", length = length(taus)) # define a vector model names

for( i in 1:length(taus)){
models[[i]] <- rq(foodexp ~ income, tau=taus[i])
var <- taus[i]
model.names[[i]] <- paste("Model [", i , "]: tau=", var)
abline( models[[i]], lwd=2, col= colors[[i]])
}
legend(3000, 1100, unlist(model.names), col= colors, lty=1, lwd=2, bty='n', cex=.75)

<center>[[Image:SMHS_Methods13.png|500px]] </center>

# (2) Inference about quantile regression coefficients. As an alternative to the rank-inversion confidence intervals, we can obtain a table of coefficients, standard errors, t-statistics, and p-values using the summary function:

summary(models[[3]], se = "nid")

Call: rq(formula = foodexp ~ income, tau = taus[i])

tau: [1] 0.25

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Value||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||95.48354||21.39237||4.46344||0.00001
|-
|Income||0.47410||0.02906||16.31729||0.00000

|}
</center>

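# Alternatively, summary.rq can be called directly. Note that se = "nid" is used again here, so the table below repeats the "nid" results; bootstrapped standard errors would require se = "boot".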
summary.rq(models[[3]], se = "nid")

Call: rq(formula = foodexp ~ income, tau = taus[i])
tau: [1] 0.25

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Value||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||95.48354||21.39237||4.46344||0.00001
|-
|Income||0.47410||0.02906||16.31729||0.00000

|}
</center>

==Nonparametric Regression Methods ==

Nonparametric regression provides another way of dealing with HTE in RCTs. Different nonparametric methods, such as kernel smoothing methods and series methods, can be used to generate test statistics for examining the presence of HTE. A kernel method is a weighting scheme based on a kernel function (e.g., uniform, Gaussian). When evaluating the treatment effect of a patient in RCTs, the kernel method assigns larger weights to those observations with similar covariates. This is done because it is assumed that patients with similar covariates provide more relevant data on predicted treatment response. Because participants have different backgrounds (e.g., demographic, clinical), kernel smoothing methods still utilize information from highly divergent participants when estimating a particular subject’s treatment effect, but assign lower weights to very different subjects. Kernel methods require choosing a set of smoothing parameters (bandwidths) to group patients according to their relative degree of similarity. A drawback is that the corresponding test statistics may be sensitive to the chosen bandwidths, which hampers the interpretation of the results. Series methods use approximating functions (splines or power series of the explanatory variables) to construct test statistics. Compared to kernel smoothing methods, series methods normally have the advantage of computational convenience; however, the precision of the test statistics depends on the number of terms selected in the series.
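The kernel weighting scheme and its sensitivity to the bandwidth can be sketched in a few lines of base R. The simulated covariate, the sine-shaped outcome model, the helper nw.estimate, and the two bandwidths are all assumptions made for illustration only; the estimator shown is the simple Nadaraya-Watson (local-constant) form with a Gaussian kernel.

```r
set.seed(2)
n <- 200
x <- runif(n, 0, 10)                       # hypothetical covariate (e.g., a comorbidity score)
y <- sin(x) + rnorm(n, sd = 0.3)           # assumed outcome model, for illustration only

# Nadaraya-Watson (local-constant) estimate at x0 with a Gaussian kernel
nw.estimate <- function(x0, x, y, h) {
  w <- dnorm((x - x0) / h)                 # larger weight for observations near x0
  w <- w / sum(w)                          # normalize the weights to sum to 1
  sum(w * y)
}

x0 <- 5
est.small <- nw.estimate(x0, x, y, h = 0.3)   # narrow bandwidth: local, more variable
est.large <- nw.estimate(x0, x, y, h = 5)     # wide bandwidth: smooth, more biased
c(truth = sin(x0), h0.3 = est.small, h5 = est.large)
```

With the wide bandwidth, very dissimilar observations receive almost the same weight as similar ones, so the estimate is pulled toward the overall average. This is one way to see why results may depend strongly on the chosen bandwidth.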

Canadian Wage Data Example: Nonparametric regression extends the classical parametric regression (e.g., lm, lmer) involving one continuous dependent variable, y, and (1 or more) continuous explanatory variable(s), x. Let’s start with a popular parametric model of a wage equation that we can extend to a fully nonparametric regression model. First, we will compare and contrast the parametric and nonparametric approach towards univariate regression and then proceed to multivariate regression.

Let’s use the Canadian cross-section wage data (cps71) consisting of a random sample taken from the 1971 Canadian Census for male individuals having common education (High-School). The dataset has N=205 observations and 2 variables: the logarithm of the individual’s wage (logwage) and their age (age). The classical wage equation model includes a quadratic term of age.

# install.packages("np")
library("np")
data("cps71")

# (1) Linear Model -> R2 = 0.2308
model.lin <- lm( logwage ~ age + I(age^2), data = cps71)
summary(model.lin)

Call:
lm(formula = logwage ~ age + I(age^2), data = cps71)

Residuals:
Min 1Q Median 3Q Max
-2.4041 -0.1711 0.0884 0.3182 1.3940

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||10.0419773||0.4559986||22.022||< 2e-16 ***
|-
|Age||0.1731310||0.0238317|| 7.265||7.96e-12 ***
|-
|I(age^2)||-0.0019771||0.0002898||-6.822||1.02e-10 ***

|}
</center>

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.5608 on 202 degrees of freedom
Multiple R-squared: 0.2308, Adjusted R-squared: 0.2232
F-statistic: 30.3 on 2 and 202 DF, p-value: 3.103e-12

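# (2) Next, we consider a nonparametric kernel regression employing cross-validated
# bandwidth selection. Start with computing the least-squares cross-validated bandwidths
# for the local constant estimator (the npregbw default). The summary below reports a
# Local-Constant estimator: the fit appears to follow the bandwidth object rather than the
# regtype = "ll" argument given to npreg, so regtype = "ll" would have to be passed to
# npregbw to obtain a local linear fit.
# Note that R2 = 0.3108675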
bandwidth <- npregbw(formula= logwage ~ age, data = cps71)
model.np <- npreg(bandwidth, regtype = "ll", bwmethod = "cv.aic", gradients = TRUE, data = cps71)
summary(model.np)

Regression Data: 205 training points, in 1 variable(s) age
Bandwidth(s): 1.892157
Kernel Regression Estimator: Local-Constant
Bandwidth Type: Fixed
Residual standard error: 0.5307943
R-squared: 0.3108675
Continuous Kernel Type: Second-Order Gaussian
No. Continuous Explanatory Vars.: 1

# NP model significance may be tested by
npsigtest(model.np)

Kernel Regression Significance Test
Type I Test with IID Bootstrap (399 replications, Pivot=TRUE, joint=FALSE)
Explanatory variables tested for significance: age (1)

age
Bandwidth(s): 1.892157

Individual Significance Tests
P Value:
age < 2.22e-16 ***

# So, as was the case for the linear parametric model, Age is significant in the nonparametric kernel model

# (3) Graphical comparison of parametric and nonparametric models.
plot(cps71$\$$age, cps71$\$$logwage, xlab = "age", ylab = "log(wage)", cex=.1)
lines(cps71$\$$age, fitted(model.lin), lty = 2, col = "red")
lines(cps71$\$$age, fitted(model.np), lty = 1, col = "blue")
legend("topright", c("Data", "Linear", "Non-linear"), col=c("Black", "Red", "Blue"), pch = c(1, 1, 1), bty='n', cex=.75)

<center>[[Image:SMHS_Methods14.png|500px]] </center>

# some additional plots presenting the parametric (quadratic, dashed line) and the nonparametric estimates
# (solid line) of the regression function for the cps71 data.
plot(model.np, plot.errors.method = "asymptotic")
plot(model.np, gradients = TRUE)
lines(cps71$\$$age, coef(model.lin)[2]+2*cps71$\$$age*coef(model.lin)[3], lty = 2, col = "red")
plot(model.np, gradients = TRUE, plot.errors.method = "asymptotic")

# (4) Using the linear (Lin) and nonparametric (NL) models to generate predictions, based on the
# selected bandwidths and the estimated nonparametric model. We need to create a set of explanatory
# variables for which to generate predictions. These can be part of the original dataset or be
# outside its scope. Typically, we don’t have the outcome for the evaluation data and need only
# provide the explanatory variables for which predicted values are generated by the models.
# Occasionally, splitting the dataset into two independent samples (training/testing), allows estimation
# of a model on one sample, and evaluation of its performance on another.

cps.eval.data <- data.frame(age = seq(10,70, by=10)) # simulate some explanatory X values (ages)
pred.lin <- predict(model.lin, newdata = cps.eval.data) # Linear Prediction of log(Wage)
pred.np <- predict(model.np, newdata = cps.eval.data) # non-Linear Prediction of log(Wage)
plot(pred.lin, pred.np)
abline(lm(pred.np ~ pred.lin))

<center>[[Image:SMHS_Methods15.png|500px]] </center>
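The training/testing split mentioned in step (4) can be sketched as follows. To keep the example self-contained, it uses simulated data rather than cps71; the data frame sim, the assumed concave age profile, and the 70/30 split are hypothetical choices made only for illustration.

```r
set.seed(3)
n <- 205
sim <- data.frame(age = round(runif(n, 21, 65)))      # hypothetical ages
# assumed concave age profile with Gaussian noise
sim$logwage <- 10 + 0.17 * sim$age - 0.002 * sim$age^2 + rnorm(n, sd = 0.5)

train.idx <- sample(n, size = round(0.7 * n))         # about 70% of rows for training
train <- sim[train.idx, ]
test  <- sim[-train.idx, ]

fit.train <- lm(logwage ~ age + I(age^2), data = train)   # estimate on the training sample
pred.test <- predict(fit.train, newdata = test)            # predict the held-out sample
rmse.test <- sqrt(mean((test$logwage - pred.test)^2))      # out-of-sample error
rmse.test
```

The same pattern applies to a nonparametric fit: estimate the model on the training rows and pass the testing rows as newdata to predict().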

.
.
.

==Predictive risk models ==

Predictive risk models represent a class of methods for identifying potential for HTE when the individual patient risk for disease-related events at baseline depends on observed factors. For instance, common measures are disease staging criteria, such as those used in COPD or heart failure, Framingham risk scores for cardiovascular event risk, or genetic variations, e.g., HER2 for breast cancer. Initial predictive risk modeling, aka risk function estimation, is often performed without accounting for treatment effects. Least squares or Cox proportional hazards regression methods are appropriate in many cases and provide relatively more interpretable risk functions, but rely on linearity assumptions and may not provide optimal predictive metrics. Partial least squares is an extension of least squares methods that can reduce the dimensionality of the predictor space by interposing latent variables, predicted by linear combinations of observable characteristics, as the intermediate predictors of one or more outcomes. Recursive partitioning (e.g., random forests), support vector machines, and neural networks represent more recent methods that may have better predictive power than linear methods. Risk function estimation can range from highly exploratory analyses to near meta-analytic model validation, and may be useful at any stage of product development.
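A risk function estimated without a treatment term can be turned into a baseline risk score for each patient. The sketch below is illustrative only: it uses the lung dataset that ships with the survival package, and the choice of predictors (age and sex) and of risk tertiles are assumptions, not part of the analyses described on this page.

```r
library(survival)   # coxph(), Surv(), and the lung example data

# Risk function estimated without a treatment term (assumed predictors: age, sex)
risk.model <- coxph(Surv(time, status) ~ age + sex, data = survival::lung)

# Relative risk score exp(linear predictor) for each patient
risk.score <- predict(risk.model, type = "risk")

# Group patients into baseline-risk tertiles for later HTE exploration
risk.group <- cut(risk.score,
                  breaks = quantile(risk.score, probs = c(0, 1/3, 2/3, 1)),
                  labels = c("low", "medium", "high"), include.lowest = TRUE)
table(risk.group)
```

Treatment effects could then be compared across the risk groups, which is one simple way of looking for HTE along the baseline-risk dimension.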

HIV Example: The “hmohiv” dataset represents a study of HIV positive patients examining whether there was a difference in survival times of HIV positive patients between a cohort using intravenous drugs (drug=1) and a cohort not using the IV drug (drug=0). The hmohiv data includes the following variables:

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
|ID||Time||Age||Drug||Censor||Entdate||Enddate
|-
|1||5||46||0||1||5/15/1990||10/14/1990
|-
|2||6||35||1||0||9/19/1989||3/20/1990
|-
|3||8||30||1||1||4/21/1991||12/20/1991
|-
|4||3||30||1||1||1/3/1991||4/4/1991
|-
|5||22||36||0||1||9/18/1989||7/19/1991
|-
|6||1||32||1||0||3/18/1991||4/17/1991
|-
|...||...||...||...||...||...||...

|}
</center>

#cleaning up environment
rm(list=ls())

# load survival library
library(survival)

# load hmohiv data
hmohiv<-read.table("http://www.ats.ucla.edu/stat/r/examples/asa/hmohiv.csv", sep=",", header = TRUE)
attach(hmohiv)

# construct a frame of the 2 cohorts IV_drug and no-IV-drug (needed by survfit below)
drug.new<-data.frame(drug=c(0,1))

# Fit Cox proportional hazards regression model
cox.model <- coxph( Surv(time, censor) ~ drug, method="breslow")
fit.1 <- survfit(cox.model, newdata=drug.new)

# plot results
plot(fit.1, xlab="Survival Time (Months)", ylab="Survival Probability")
points(fit.1$\$$time, fit.1$\$$surv[,1], pch=1)
points(fit.1$\$$time, fit.1$\$$surv[,2], pch=2)
legend(40, .8, c("Drug Absent", "Drug Present"), pch=c(1,2))

<center>[[Image:SMHS_Methods16.png|500px]] </center>

# to inspect the resulting Cox Proportional Hazard Model
cox.model
Call:
coxph(formula = Surv(time, censor) ~ drug, method = "breslow")

coef exp(coef) se(coef) z p
drug 0.779 2.18 0.242 3.22 0.0013

Likelihood ratio test=10.2 on 1 df, p=0.00141 n= 100, number of events= 80

===Footnotes===

8 http://onlinelibrary.wiley.com/enhanced/doi/10.1002/jrsm.54

==[[SMHS_MethodsHeterogeneity_CER|Next see: Comparative Effectiveness Research (CER)]]==

*[[SMHS_MethodsHeterogeneity|Back to the Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research section]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php/SMHS_MethodsHeterogeneity_MetaAnalysis}}

SMHS MethodsHeterogeneity MetaAnalysis

2016-05-23T18:43:21Z

Pineaumi: /* Meta-analysis */

==[[SMHS_MethodsHeterogeneity| Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research]] - Meta-Analyses ==

==Meta-analysis==

===Overview===

Meta-analysis is an approach to combine treatment effects across trials or studies into an aggregated treatment effect with higher statistical power than observed in each individual trial. It may detect HTE by testing for differences in treatment effects across similar RCTs. It requires that the individual treatment effects are similar to ensure pooling is meaningful. In the presence of large clinical or methodological differences between the trials, it may be best to avoid meta-analyses. The presence of HTE across studies in a meta-analysis may be due to differences in the design or execution of the individual trials (e.g., randomization methods, patient selection criteria). Cochran's Q is a method for detection of heterogeneity, which is computed as the weighted sum of squared differences between each study's treatment effect and the pooled effect across the studies. It is a barometer of inter-trial differences impacting the observed study result. A possible source of error in a meta-analysis is publication bias. Trial size may introduce publication bias since larger trials are more likely to be published. Language and accessibility represent other potential confounding factors. When the heterogeneity is not due to poor study design, it may be useful to optimize the treatment benefits for different cohorts of participants.

Cochran's Q statistic is the weighted sum of squares on a standardized scale<sup>8</sup>. The corresponding P value indicates the strength of the evidence of presence of heterogeneity. This test may sometimes have low power to detect heterogeneity, and it is suggested to use a value of 0.10 as a cut-off for significance (Higgins et al., 2003). Conversely, the Q statistic may have too much power as a test of heterogeneity when the number of studies is large.
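The computation of Cochran's Q described above can be sketched directly in base R. The study-level effects and variances below are hypothetical numbers chosen only for illustration. The weights are the usual inverse-variance weights, and the I<sup>2</sup> summary is the one commonly attributed to Higgins et al. (2003).

```r
# Hypothetical study-level effects (e.g., log odds ratios) and their variances
theta <- c(0.10, 0.30, 0.35, 0.65, 0.45)
v     <- c(0.03, 0.03, 0.05, 0.01, 0.05)

w      <- 1 / v                                     # inverse-variance weights
pooled <- sum(w * theta) / sum(w)                   # fixed-effect pooled estimate
Q      <- sum(w * (theta - pooled)^2)               # Cochran's Q
dof    <- length(theta) - 1                         # degrees of freedom
p.het  <- pchisq(Q, df = dof, lower.tail = FALSE)   # heterogeneity p-value
I2     <- max(0, (Q - dof) / Q)                     # proportion of variation due to heterogeneity
c(Q = Q, df = dof, p = p.het, I2 = I2)
```

Under the null hypothesis of no heterogeneity, Q is compared against a chi-square distribution with (number of studies − 1) degrees of freedom, which is what the p-value above does.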

===Simulation Example 1===

# Install and Load library
install.packages("meta")
library(meta)

# Set number of studies
n.studies = 15

# number of treatments: case1, case2, control
n.trt = 3

# number of outcomes
n.event = 2

# simulate the (balanced) number of cases (case1 and case2) and controls in each study
ctl.group = rbinom(n = n.studies, size = 200, prob = 0.3)
case1.group = rbinom(n = n.studies, size = 200, prob = 0.3)
case2.group = rbinom(n = n.studies, size = 200, prob = 0.3)

# Simulate the number of outcome events (e.g., deaths) and no events in the control group
event.ctl.group = rbinom(n = n.studies, size = ctl.group, prob = rep(0.1, length(ctl.group)))
noevent.ctl.group = ctl.group - event.ctl.group

# Simulate the number of events and no events in the case1 group
event.case1.group = rbinom(n = n.studies, size = case1.group, prob = rep(0.5, length(case1.group)))
noevent.case1.group = case1.group - event.case1.group

# Simulate the number of events and no events in the case2 group
event.case2.group = rbinom(n = n.studies, size = case2.group, prob = rep(0.6, length(case2.group)))
noevent.case2.group = case2.group - event.case2.group

# Run the univariate meta-analysis using metabin(), Meta-analysis of binary outcome data –
# Calculation of fixed and random effects estimates (risk ratio, odds ratio, risk difference or arcsine
# difference) for meta-analyses with binary outcome data. Mantel-Haenszel (MH),
# inverse variance and Peto method are available for pooling.

# method = A character string indicating which method is to be used for pooling of studies,
# one of "MH" , "Inverse" , or "Peto"
# sm = A character string indicating which summary measure (“OR”, "RR" "RD"=risk difference) is to be
# used for pooling of studies

# Control vs. Case1, n.e and n.c are numbers in experimental and control groups
meta.ctr_case1 <- metabin(event.e = event.case1.group, n.e = case1.group, event.c = event.ctl.group,
n.c = ctl.group, method = "MH", sm = "OR")
# in this case we use Odds Ratio, of the odds of death in the experimental and control studies
forest(meta.ctr_case1)

<center>[[Image:SMHS_Methods8.png|500px]] </center>

# Control vs. Case2
meta.ctr_case2 <- metabin(event.e = event.case2.group, n.e = case2.group, event.c = event.ctl.group,
n.c = ctl.group, method = "MH", sm = "OR")
forest(meta.ctr_case2)

<center>[[Image:SMHS_Methods9.png|500px]] </center>

# Case1 vs. Case2
meta.case1_case2 <- metabin(event.e = event.case1.group, n.e = case1.group, event.c = event.case2.group,
n.c = case2.group, method = "MH", sm = "OR")
forest(meta.case1_case2)
summary(meta.case1_case2)

Test of heterogeneity:
Q d.f. p-value
11.99 14 0.6071

<center>[[Image:SMHS_Methods10.png|500px]] </center>

The forest plot reports the I<sup>2</sup> statistic. Here, the test of heterogeneity (Q = 11.99, d.f. = 14, p-value = 0.6071) provides no evidence to reject the null hypothesis of no between-study heterogeneity, so the fixed-effects model may be used.

==Series of “N of 1” trials==

This technique combines (a “series of”) n-of-1 trial data to identify HTE. An n-of-1 trial is a repeated crossover trial for a single patient, which randomly assigns the patient to one treatment vs. another for a given time period, after which the patient is re-randomized to treatment for the next time period, usually repeated for 4-6 time periods. Such trials are most feasibly done in chronic conditions, where little or no washout period is needed between treatments and treatment effects are identifiable in the short-term, such as pain or reliable surrogate markers. Combining data from identical n-of-1 trials across a set of patients enables statistical analysis controlling for patient fixed or random effects, covariates, centers, or sequence effects, see Figure below. These combined trials are often analyzed within a Bayesian context using shrinkage estimators that combine individual and group mean treatment effects to create a “posterior” individual mean treatment effect estimate, which is a form of inverse variance-weighted average of the individual and group effects. Such trials are typically more expensive than standard RCTs on a per-patient basis. However, they require much smaller sample sizes, often fewer than 100 patients (due to the efficient individual-as-own-control design), and create individual treatment effect estimates that are not possible in a non-crossover design. For the individual patient, the treatment effect can be re-estimated after each time period, and the trial stopped at any point when the more effective treatment is identified with reasonable statistical certainty.
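The inverse variance-weighted combination of individual and group effects can be sketched under a simple normal-normal model. All numbers below are hypothetical, the between-patient variance tau2 is assumed known, and the group mean is a simple plug-in estimate; a full Bayesian analysis would estimate these quantities jointly.

```r
# Hypothetical individual treatment-effect estimates and their sampling variances
d.i  <- c(2.0, 5.5, -1.0, 3.0)     # per-patient mean difference (treatment - control)
s2.i <- c(1.0, 4.0, 2.0, 0.5)      # sampling variance of each d.i
mu   <- mean(d.i)                  # group mean effect (simple plug-in)
tau2 <- 2.0                        # assumed between-patient variance

# Precision-weighted posterior mean for each patient (normal-normal model)
w.ind     <- (1 / s2.i) / (1 / s2.i + 1 / tau2)    # weight on the individual estimate
posterior <- w.ind * d.i + (1 - w.ind) * mu        # shrinks d.i toward the group mean
cbind(individual = d.i, weight = round(w.ind, 2), posterior = round(posterior, 2))
```

Patients whose individual estimates are noisy (large sampling variance) get a small weight and are shrunk more strongly toward the group mean, while precisely estimated individual effects are left almost unchanged.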

====Example====

A study involving 8 participants collected data across 30 days, in which 15 treatment days and 15 control days were randomly assigned within each participant. The treatment effect is represented as a binary variable (control day=0; treatment day=1). The outcome variable represents the response to the intervention within each of the 8 participants. The study employed fixed-effects modeling, creating N − 1 dummy-coded variables representing the N=8 participants, where the last (i=8) participant serves as the reference (i.e., as the model intercept). So, each dummy-coded variable represents the difference between a participant (i) and the 8th participant. Thus, all other patients' values will be relative to the values of the 8th (reference) subject. The overall differences across participants in fixed effects can be evaluated with multiple degree-of-freedom F-tests.
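The dummy-coded fixed-effects setup can be sketched with simulated data of the same shape (8 participants, 30 days, 15 treatment days each). The data frame sim, the participant-specific baselines and effects, and the noise level are hypothetical; only the design mirrors the description above. Note that anova() on an lm fit reports sequential F-tests, which need not coincide with Type 3 tests from other software.

```r
set.seed(4)
n.id <- 8; n.day <- 30
sim <- expand.grid(Day = 1:n.day, ID = 1:n.id)      # rows are grouped by participant
# Randomly assign 15 treatment and 15 control days within each participant
sim$Tx <- unlist(lapply(1:n.id, function(i) sample(rep(c(0, 1), each = 15))))
# Assumed data-generating model: participant-specific baselines and treatment effects
base   <- seq(30, 65, length.out = n.id)
effect <- seq(5, 40, length.out = n.id)
sim$PhyAct <- base[sim$ID] + effect[sim$ID] * sim$Tx + rnorm(nrow(sim), sd = 10)

# Participant 8 as the reference level, giving N - 1 = 7 dummy variables
sim$ID <- relevel(factor(sim$ID), ref = "8")
fit.fe <- lm(PhyAct ~ Tx * ID, data = sim)
anova(fit.fe)          # multiple degree-of-freedom F-tests for ID and Tx:ID
```

With 240 observations and 16 estimated coefficients, the residual degrees of freedom are 224, and the ID and Tx:ID terms each carry 7 degrees of freedom.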

<center>[[Image:SMHS_Methods11.png|500px]] </center>

<center>

{| class="wikitable" style="text-align:center; " border="1"
|-
|ID||Day||Tx||SelfEff||SelfEff25||WPSS||SocSuppt||PMss||PMss3||PhyAct
|-
|1||1||1||33||8||0.97||5.00||4.03||1.03||53
|-
|1||2||1||33||8||-0.17||3.87||4.03||1.03||73
|-
|1||3||0||33||8||0.81||4.84||4.03||1.03||23
|-
|1||4||0||33||8||-0.41||3.62||4.03||1.03||36
|-
|...||...||...||...||...||...||...||...||...||...

|}
</center> Complete data is available in the Appendix.

<center>Data Summary

{| class="wikitable" style="text-align:center; " border="1"
|-
|Intercept||Constant
|-
|Physical Activity||PhyAct
|-
|Intervention||Tx
|-
|WP Social Support||WPSS
|-
|PM Social Support (1-3)||PMss3
|-
|Self Efficacy||SelfEff25

|}
</center>

rm(list=ls())
Nof1 <-read.table("https://umich.instructure.com/files/330385/download?download_frd=1&verifier=DwJUGSd6t24dvK7uYmzA2aDyzlmsohyaK6P7jK0Q", sep=",", header = TRUE) # 02_Nof1_Data.csv
attach(Nof1)
head(Nof1)

<center>

{| class="wikitable" style="text-align:center; " border="1"
|-
| ||ID||Day||Tx||SelfEff||SelfEff25||WPSS||SocSuppt||PMss||PMss3||PhyAct
|-
|1||1||1||1||33||8||0.97||5.00||4.03||1.03||53
|-
|2||1||2||1||33||8||-0.17||3.87||4.03||1.03||73
|-
|3||1||3||0||33||8||0.81||4.84||4.03||1.03||23
|-
|4||1||4||0||33||8||-0.41||3.62||4.03||1.03||36
|-
|5||1||5||1||33||8||0.59||4.62||4.03||1.03||21
|-
|6||1||6||1||33||8||-1.16||2.87||4.03||1.03||0

|}
</center>

df.1 = data.frame(PhyAct, Tx, WPSS, PMss3, SelfEff25)

library("lme4") # provides lmer(); run install.packages("lme4") first if needed

lm.1 = model.lmer <- lmer(PhyAct ~ Tx + SelfEff + Tx*SelfEff + (1|Day) + (1|ID) , data= df.1)
summary(lm.1)

Linear mixed model fit by REML ['lmerMod']
Formula: PhyAct ~ Tx + SelfEff + Tx * SelfEff + (1 | Day) + (1 | ID)
Data: df.1

REML criterion at convergence: 8820

<center> Scaled Residuals
{| class="wikitable" style="text-align:center; " border="1"
|-
| Min||1Q||Median||3Q||Max
|-
|-2.7012||-0.6833||-0.0333||0.6542||3.9612
|}
</center>

<center> Random Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| Groups ||Name||Variance ||Std.Dev.
|-
| Day||(Intercept) ||0.0 || 0.00
|-

|ID|| (Intercept)||601.5||24.53
|-

|Residual|| ||969.0 ||31.13
|}
Number of obs: 900, groups: Day, 30; ID, 30
</center>

<center> Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t value
|-
|(Intercept)||38.3772||14.4738||2.651
|-
|Tx||4.0283||6.3745||0.632
|-
|SelfEff||0.5818||0.5942||0.979
|-
|Tx:SelfEff||0.9702||0.2617||3.708
|}
</center>

<center> Correlation of Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||(Intr)||Tx ||SlfEff
|-
| Tx|| -0.220|| ||
|-
| SelfEff||-0.946 ||0.208 ||
|-
| Tx:SelfEff ||0.208 ||-0.946 ||-0.220
|}
</center>

# Model: PhyAct = Tx + WPSS + PMss3 + Tx*WPSS + Tx*PMss3 + SelfEff25 + Tx*SelfEff25 + ε
lm.2 = lm(PhyAct ~ Tx + WPSS + PMss3 + Tx*WPSS + Tx*PMss3 + SelfEff25 + Tx*SelfEff25, df.1)
summary(lm.2)

Call:
lm(formula = PhyAct ~ Tx + WPSS + PMss3 + Tx * WPSS + Tx * PMss3 +
SelfEff25 + Tx * SelfEff25, data = df.1)

<center> Residuals
{| class="wikitable" style="text-align:center; " border="1"
|-
|Min||1Q||Median||3Q||Max
|-
| -102.39||-28.24||-1.47||25.16||122.41

|}
</center>

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t value||$Pr(>|t|)$
|-
|(Intercept)||52.0067||1.8080||28.764||< 2e-16 ***
|-
|Tx||27.7366||2.5569||10.848||< 2e-16 ***
|-
|WPSS||1.9631||2.4272||0.809||0.418853
|-
|PMss3||13.5110||2.7853||4.851||1.45e-06 ***
|-
|SelfEff25||0.6289||0.2205||2.852||0.004439 **
|-
|Tx:WPSS||9.9114||3.4320||2.888||0.003971 **
|-
|Tx:PMss3||8.8422||3.9390||2.245||0.025025 *
|-
|Tx:SelfEff25||1.0460||0.3118||3.354||0.000829 ***

|}
</center>

[Using SAS (StudyI_Analyses.sas, StudyIIab_Analyses.sas)]

<center> Type 3 Tests of Fixed Effects
{| class="wikitable" style="text-align:center; " border="1"
|-
|Effect||Num DF||Den DF||F Value||$Pr>F$
|-
|Tx||1||224||67.46||<.0001
|-
|ID||7||224||25.95||<.0001
|-
|Tx*ID||7||224||2.92||0.0060
|}
</center>

==Quantile Treatment Effect (QTE)==

QTE employs quantile regression estimation (QRE) to examine the central tendency and statistical dispersion of the treatment effect in a population. These features may not be revealed by the conventional mean estimation in RCTs. For instance, patients with different comorbidity scores may respond differently to a treatment. Quantile regression can reveal HTE according to the ranking of patients’ comorbidity scores, or of some other relevant covariate by which patients may be ranked. Therefore, in an attempt to inform patient-centered care, quantile regression provides more information on the distribution of the treatment effect than the typical conditional mean treatment effect estimation. QTE characterizes the heterogeneous treatment effect on individuals and groups across various positions in the distributions of different outcomes of interest. This feature has attracted substantial attention to quantile regression analysis, which has been employed across a wide range of applications, particularly when evaluating the economic effects of welfare reform.

One caveat of applying QRE in clinical trials for examining HTE is that the QTE doesn’t demonstrate the treatment effect for a given patient. Instead, it focuses on the treatment effect among subjects within the qth quantile of an outcome (e.g., blood pressure or a depression score) or of some covariate of interest (e.g., comorbidity score), such as those who are exactly at the top 10th percentile. It is not uncommon for the qth quantiles to comprise two different sets of patients before and after the treatment. For this reason, we have to assume that these two groups of patients are homogeneous if they are in the same quantiles.

Income-Food Expenditure Example: Let’s examine the Engel data (N=235) on the relationship between food expenditure (foodexp) and household income (income). We can plot the data and then explore the superposition of the six fitted quantile regression lines.

install.packages("quantreg")
library(quantreg)
data(engel)
attach(engel)

<center>head(engel)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Income||Foodexp
|-
|1||420.1577||255.8394
|-
|2||541.4117||310.9587
|-
|3||901.1575||485.6800
|-
|4||639.0802||402.9974
|-
|5||750.8756||495.5608
|-
|6||945.7989||633.7978

|}
</center>

<center>summary(engel)
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Income||Foodexp
|-
|Min||377.1||242.3
|-
|1st Qu.||638.9||429.7
|-
|Median||884.0||582.5
|-
|Mean||982.5||624.2
|-
|3rd Qu.||1164.0||743.9
|-
|Max||4957.8||2032.7

|}
</center>

Note: If $Y$ is a real-valued random variable with cumulative distribution function $F_Y(y)=P(Y\le y)$, then the $\tau$-quantile of $Y$ is given by

<center> $Q_Y(\tau)=F_Y^{-1}(\tau)=\inf\{ y: F_Y(y)\ge\tau\}$ </center>

where $0\le\tau\le 1$.

<center>[[Image:SMHS_Methods12.png|500px]] </center>

# (1) Graphics
plot(income, foodexp, cex=.25, type="n", xlab="Household Income", ylab="Food Expenditure")
points(income, foodexp, cex=.5, col="blue")

# tau - the quantile(s) to be estimated, in the range from 0 to 1. An object "rq.process" and an object "rqs"
# are returned containing the matrix of coefficient estimates at the specified quantiles.
abline( rq(foodexp ~ income, tau=.5), col="blue") # Quantile Regression Model

abline( lm(foodexp ~ income), lty=2, lwd=3, col="red") # linear model
taus <- c(0.05, 0.1, 0.25, 0.75, 0.90, 0.95)
colors <- rainbow(length(taus))

models <- vector(mode = "list", length = length(taus)) # define a vector of models to store QR for diff taus
model.names <- vector(mode = "list", length = length(taus)) # define a vector model names

for( i in 1:length(taus)){
models[[i]] <- rq(foodexp ~ income, tau=taus[i])
var <- taus[i]
model.names[[i]] <- paste("Model [", i , "]: tau=", var)
abline( models[[i]], lwd=2, col= colors[[i]])
}
legend(3000, 1100, unlist(model.names), col= colors, lty=1, lwd=2, bty='n', cex=.75)

<center>[[Image:SMHS_Methods13.png|500px]] </center>

# (2) Inference about quantile regression coefficients. As an alternative to the rank-inversion confidence intervals, we can obtain a table of coefficients, standard errors, t-statistics, and p-values using the summary function:

summary(models[[3]], se = "nid")

Call: rq(formula = foodexp ~ income, tau = taus[i])

tau: [1] 0.25

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Value||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||95.48354||21.39237||4.46344||0.00001
|-
|Income||0.47410||0.02906||16.31729||0.00000

|}
</center>

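# Alternatively, summary.rq can be called directly. Note that se = "nid" is used again here, so the table below repeats the "nid" results; bootstrapped standard errors would require se = "boot".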
summary.rq(models[[3]], se = "nid")

Call: rq(formula = foodexp ~ income, tau = taus[i])
tau: [1] 0.25

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
|||Value||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||95.48354||21.39237||4.46344||0.00001
|-
|Income||0.47410||0.02906||16.31729||0.00000

|}
</center>

==Nonparametric Regression Methods ==

Nonparametric regression provides another way of dealing with HTE in RCTs. Different nonparametric methods, such as kernel smoothing methods and series methods, can be used to generate test statistics for examining the presence of HTE. A kernel method is a weighting scheme based on a kernel function (e.g., uniform, Gaussian). When evaluating the treatment effect of a patient in RCTs, the kernel method assigns larger weights to those observations with similar covariates. This is done because it is assumed that patients with similar covariates provide more relevant data on predicted treatment response. Because participants have different backgrounds (e.g., demographic, clinical), kernel smoothing methods still utilize information from highly divergent participants when estimating a particular subject’s treatment effect, but assign lower weights to very different subjects. Kernel methods require choosing a set of smoothing parameters (bandwidths) to group patients according to their relative degree of similarity. A drawback is that the corresponding test statistics may be sensitive to the chosen bandwidths, which hampers the interpretation of the results. Series methods use approximating functions (splines or power series of the explanatory variables) to construct test statistics. Compared to kernel smoothing methods, series methods normally have the advantage of computational convenience; however, the precision of the test statistics depends on the number of terms selected in the series.
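The dependence of a series method on the number of terms can be sketched with power-series fits of increasing order. The simulated covariate, the sine-shaped outcome model, and the chosen polynomial degrees are assumptions made for illustration only; splines could be used in the same way.

```r
set.seed(5)
n <- 200
x <- runif(n, 0, 10)                      # hypothetical covariate
y <- sin(x) + rnorm(n, sd = 0.3)          # assumed outcome model, for illustration only

# Power-series (orthogonal polynomial) fits with an increasing number of terms
fit.2 <- lm(y ~ poly(x, 2))
fit.5 <- lm(y ~ poly(x, 5))
fit.9 <- lm(y ~ poly(x, 9))

# Residual standard error of each fit, and F-tests of the extra series terms
sapply(list(k2 = fit.2, k5 = fit.5, k9 = fit.9), sigma)
anova(fit.2, fit.5, fit.9)
```

Each fit is a single least-squares problem, which reflects the computational convenience of series methods, while the F-tests show how conclusions can change with the number of terms retained.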

Canadian Wage Data Example: Nonparametric regression extends the classical parametric regression (e.g., lm, lmer) involving one continuous dependent variable, y, and (1 or more) continuous explanatory variable(s), x. Let’s start with a popular parametric model of a wage equation that we can extend to a fully nonparametric regression model. First, we will compare and contrast the parametric and nonparametric approach towards univariate regression and then proceed to multivariate regression.

Let’s use the Canadian cross-section wage data (cps71) consisting of a random sample taken from the 1971 Canadian Census for male individuals having common education (High-School). The dataset has N=205 observations and 2 variables: the logarithm of the individual’s wage (logwage) and their age (age). The classical wage equation model includes a quadratic term of age.

# install.packages("np")
library("np")
data("cps71")

# (1) Linear Model -> R2 = 0.2308
model.lin <- lm( logwage ~ age + I(age^2), data = cps71)
summary(model.lin)

Call:
lm(formula = logwage ~ age + I(age^2), data = cps71)

Residuals:
Min 1Q Median 3Q Max
-2.4041 -0.1711 0.0884 0.3182 1.3940

<center>Coefficients
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||Estimate||Std. Error||t Value||$Pr(>|t|)$
|-
|(Intercept)||10.0419773||0.4559986||22.022||< 2e-16 ***
|-
|Age||0.1731310||0.0238317|| 7.265||7.96e-12 ***
|-
|I(age^2)||-0.0019771||0.0002898||-6.822||1.02e-10 ***

|}
</center>

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.5608 on 202 degrees of freedom
Multiple R-squared: 0.2308, Adjusted R-squared: 0.2232
F-statistic: 30.3 on 2 and 202 DF, p-value: 3.103e-12

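# (2) Next, we consider a nonparametric kernel regression employing cross-validated
# bandwidth selection. Start with computing the least-squares cross-validated bandwidths
# for the local constant estimator (the npregbw default). The summary below reports a
# Local-Constant estimator: the fit appears to follow the bandwidth object rather than the
# regtype = "ll" argument given to npreg, so regtype = "ll" would have to be passed to
# npregbw to obtain a local linear fit.
# Note that R2 = 0.3108675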
bandwidth <- npregbw(formula= logwage ~ age, data = cps71)
model.np <- npreg(bandwidth, regtype = "ll", bwmethod = "cv.aic", gradients = TRUE, data = cps71)
summary(model.np)

Regression Data: 205 training points, in 1 variable(s) age
Bandwidth(s): 1.892157
Kernel Regression Estimator: Local-Constant
Bandwidth Type: Fixed
Residual standard error: 0.5307943
R-squared: 0.3108675
Continuous Kernel Type: Second-Order Gaussian
No. Continuous Explanatory Vars.: 1

# NP model significance may be tested by
npsigtest(model.np)

Kernel Regression Significance Test
Type I Test with IID Bootstrap (399 replications, Pivot=TRUE, joint=FALSE)
Explanatory variables tested for significance: age (1)

age
Bandwidth(s): 1.892157

Individual Significance Tests
P Value:
age < 2.22e-16 ***

# So, as was the case for the parametric (quadratic) model, age is significant in the nonparametric model

# (3) Graphical comparison of parametric and nonparametric models.
plot(cps71$\$$age, cps71$\$$logwage, xlab = "age", ylab = "log(wage)", cex=.1)
lines(cps71$\$$age, fitted(model.lin), lty = 2, col = "red")
lines(cps71$\$$age, fitted(model.np), lty = 1, col = "blue")
legend("topright", c("Data", "Parametric (quadratic)", "Nonparametric"), col=c("black", "red", "blue"), pch = c(1, NA, NA), lty = c(NA, 2, 1), bty='n', cex=.75)

<center>[[Image:SMHS_Methods14.png|500px]] </center>

# some additional plots presenting the parametric (quadratic, dashed line) and the nonparametric estimates
# (solid line) of the regression function for the cps71 data.
plot(model.np, plot.errors.method = "asymptotic")
plot(model.np, gradients = TRUE)
lines(cps71$\$$age, coef(model.lin)[2]+2*cps71$\$$age*coef(model.lin)[3], lty = 2, col = "red")
plot(model.np, gradients = TRUE, plot.errors.method = "asymptotic")

# (4) Use the parametric and nonparametric models to generate predictions. We need to create a set of
# explanatory variables for which to generate predictions. These can be part of the original dataset or be
# outside its scope. Typically, we don’t have the outcome for the evaluation data and need only
# provide the explanatory variables for which predicted values are generated by the models.
# Occasionally, splitting the dataset into two independent samples (training/testing) allows estimation
# of a model on one sample, and evaluation of its performance on another.

cps.eval.data <- data.frame(age = seq(10,70, by=10)) # simulate some explanatory X values (ages)
pred.lin <- predict(model.lin, newdata = cps.eval.data) # parametric (quadratic) prediction of log(wage)
pred.np <- predict(model.np, newdata = cps.eval.data) # nonparametric prediction of log(wage)
plot(pred.lin, pred.np)
abline(lm(pred.np ~ pred.lin))

<center>[[Image:SMHS_Methods15.png|500px]] </center>
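
The training/testing idea mentioned in step (4) can be sketched as follows. To keep the example self-contained it uses simulated data in place of cps71; the data-generating coefficients, the 70/30 split and all variable names are illustrative assumptions.

```r
set.seed(2)
n   <- 200
age <- runif(n, 21, 65)
logwage <- 10 + 0.17 * age - 0.002 * age^2 + rnorm(n, sd = 0.5)  # simulated, not cps71
dat <- data.frame(age = age, logwage = logwage)

train_id <- sample(n, size = round(0.7 * n))   # indices of the training sample
train <- dat[train_id, ]
test  <- dat[-train_id, ]

fit  <- lm(logwage ~ age + I(age^2), data = train)   # estimate on the training sample
pred <- predict(fit, newdata = test)                 # predict the held-out sample

rmse <- sqrt(mean((test[["logwage"]] - pred)^2))     # out-of-sample prediction error
print(rmse)
```

The same held-out sample could be passed to any competing model, so that the models are compared on data that were not used for estimation.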

.
.
.

==Predictive risk models ==

Predictive risk models represent a class of methods for identifying potential for HTE when the individual patient risk for disease-related events at baseline depends on observed factors. For instance, common measures are disease staging criteria, such as those used in COPD or heart failure, Framingham risk scores for cardiovascular event risk, or genetic variations, e.g., HER2 for breast cancer. Initial predictive risk modeling, also known as risk function estimation, is often performed without accounting for treatment effects. Least squares or Cox proportional hazards regression methods are appropriate in many cases and provide relatively more interpretable risk functions, but rely on linearity assumptions and may not provide optimal predictive metrics. Partial least squares is an extension of least squares methods that can reduce the dimensionality of the predictor space by interposing latent variables, predicted by linear combinations of observable characteristics, as the intermediate predictors of one or more outcomes. Recursive partitioning methods (such as random forests), support vector machines, and neural networks are more recent methods that often provide better predictive power than linear methods. Risk function estimation can range from highly exploratory analyses to near meta-analytic model validation, and may be useful at any stage of product development.

HIV Example: The “hmohiv” dataset represents a study of HIV positive patients examining whether there was a difference in survival times of HIV positive patients between a cohort using intravenous drugs (drug=1) and a cohort not using the IV drug (drug=0). The hmohiv data includes the following variables:

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
|ID||Time||Age||Drug||Censor||Entdate||Enddate
|-
|1||5||46||0||1||5/15/1990||10/14/1990
|-
|2||6||35||1||0||9/19/1989||3/20/1990
|-
|3||8||30||1||1||4/21/1991||12/20/1991
|-
|4||3||30||1||1||1/3/1991||4/4/1991
|-
|5||22||36||0||1||9/18/1989||7/19/1991
|-
|6||1||32||1||0||3/18/1991||4/17/1991
|-
|...||...||...||...||...||...||...

|}
</center>

#cleaning up environment
rm(list=ls())

# load survival library
library(survival)

# load hmohiv data
hmohiv<-read.table("http://www.ats.ucla.edu/stat/r/examples/asa/hmohiv.csv", sep=",", header = TRUE)
attach(hmohiv)

# Fit Cox proportional hazards regression model
cox.model <- coxph( Surv(time, censor) ~ drug, method="breslow")

# construct a data frame of the 2 cohorts (IV drug and no IV drug); it must exist before calling survfit
drug.new<-data.frame(drug=c(0,1))
fit.1 <- survfit(cox.model, newdata=drug.new)

# plot results
plot(fit.1, xlab="Survival Time (Months)", ylab="Survival Probability")
points(fit.1$\$$time, fit.1$\$$surv[,1], pch=1)
points(fit.1$\$$time, fit.1$\$$surv[,2], pch=2)
legend(40, .8, c("Drug Absent", "Drug Present"), pch=c(1,2))

<center>[[Image:SMHS_Methods16.png|500px]] </center>

# to inspect the resulting Cox Proportional Hazard Model
cox.model
Call:
coxph(formula = Surv(time, censor) ~ drug, method = "breslow")

coef exp(coef) se(coef) z p
drug 0.779 2.18 0.242 3.22 0.0013

Likelihood ratio test=10.2 on 1 df, p=0.00141 n= 100, number of events= 80
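
The reported coefficient can be turned into a hazard ratio with an approximate 95% Wald confidence interval. The sketch below only re-uses the rounded values printed above (coef = 0.779, se = 0.242), so it agrees with the printed output up to rounding; the variable names are illustrative.

```r
coef_drug <- 0.779   # log hazard ratio for drug, copied from the output above
se_drug   <- 0.242   # its standard error, copied from the output above

hr <- exp(coef_drug)                                       # hazard ratio
ci <- exp(coef_drug + c(-1, 1) * qnorm(0.975) * se_drug)   # approximate 95% Wald interval
z  <- coef_drug / se_drug                                  # Wald z statistic
p  <- 2 * pnorm(-abs(z))                                   # two-sided p-value

print(round(c(HR = hr, lower = ci[1], upper = ci[2], z = z, p = p), 4))
```

A hazard ratio of about 2.18 suggests that, under the proportional hazards assumption, the estimated hazard in the drug = 1 cohort is roughly twice that in the drug = 0 cohort.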

.
.
.

==[[SMHS_MethodsHeterogeneity_CER|Next see: Comparative Effectiveness Research (CER)]]==

*[[SMHS_MethodsHeterogeneity|Back to the Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research section]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php/SMHS_MethodsHeterogeneity_MetaAnalysis}}

SMHS MethodsHeterogeneity HTE

2016-05-23T18:42:37Z

Pineaumi: /* Latent growth and growth mixture modeling (LGM/GMM) */

==[[SMHS_MethodsHeterogeneity| Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research]] - Methods and Approaches for HTE Analytics ==

===Overview===

Recursive partitioning is a data mining technique for exploring structure and patterns in complex data. It facilitates the visualization of decision rules for predicting categorical (classification tree) or continuous (regression tree) outcome variables. The R rpart package<sup>1</sup> provides the tools for Classification and Regression Tree (CART) modeling; conditional inference trees and random forests are available in separate packages (e.g., party and randomForest). Additional resources include An Introduction to Recursive Partitioning Using the RPART Routines<sup>2</sup>. The Appendix includes a description of the main CART analysis steps.

install.packages("rpart")
library("rpart")

===CART===
Classification and Regression Tree (CART) is a decision-tree based technique that considers how variation observed in a given response variable (continuous or categorical) can be understood through a systematic deconstruction of the overall study population into subgroups, using explanatory variables of interest. For HTE analysis, CART is best suited for early-stage, exploratory analyses. Its relative simplicity can be powerful in identifying basic relationships between variables of interest, and thus identify potential subgroups for more advanced analyses. The key to CART is its ‘systematic’ approach to the development of the subgroups, which are constructed sequentially through repeated, binary splits of the population of interest, one explanatory variable at a time. In other words, each ‘parent’ group is divided into two ‘child’ groups, with the objective of creating increasingly homogeneous subgroups. The process is repeated and the subgroups are then further split, until no additional variables are available for further subgroup development. The resulting tree structure is oftentimes overgrown, but additional techniques are used to ‘trim’ the tree to a point at which its predictive power is balanced against issues of over-fitting. Because the CART approach does not make assumptions regarding the distribution of the dependent variable, it can be used in situations where other multivariate modeling techniques often used for exploratory predictive risk modeling would not be appropriate – namely in situations where data are not normally distributed.

CART analyses are useful in situations where there is some evidence to suggest that HTE exists, but the subgroups defining the heterogeneous response are not well understood. CART allows for an exploration of response in a myriad of complex subpopulations, and more recently developed ensemble methods (such as Bayesian Additive Regression Trees) allow for more robust analyses through the combination of multiple CART analyses.
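
To illustrate how a single binary split may be chosen, here is a rough base-R sketch that scans candidate cut points of one numeric variable and picks the one minimizing the weighted Gini impurity of the two child groups. The helper names gini and best_split and the toy data are assumptions made for illustration; the rpart implementation is considerably more elaborate (e.g., surrogate splits and cost-complexity pruning).

```r
# Gini impurity of a vector of class labels
gini <- function(labels) {
  p <- table(labels) / length(labels)
  1 - sum(p^2)
}

# Scan midpoints between sorted unique values of x and return the best cut point
best_split <- function(x, labels) {
  v    <- sort(unique(x))
  cuts <- (v[-1] + v[-length(v)]) / 2
  impurity <- sapply(cuts, function(cut) {
    left  <- labels[x <  cut]    # first child group
    right <- labels[x >= cut]    # second child group
    (length(left) * gini(left) + length(right) * gini(right)) / length(labels)
  })
  list(cut = cuts[which.min(impurity)], impurity = min(impurity))
}

# Toy data: the two classes separate cleanly between x = 3 and x = 10
x      <- c(1, 2, 3, 10, 11, 12)
labels <- c("a", "a", "a", "b", "b", "b")
split  <- best_split(x, labels)
print(split)
```

Applying the same search recursively within each child group, over all explanatory variables, yields the sequence of parent/child splits described above.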

====Example Fifth Dutch growth study====

# Let’s use the Fifth Dutch growth study (2009) fdgs<sup>3</sup>. Is it true that “the world’s tallest nation has stopped growing taller: the height of Dutch children from 1955 to 2009”<sup>4</sup>?

#install.packages("mice")
library("mice")
?fdgs
head(fdgs)

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||ID ||Reg ||Age ||Sex ||HGT ||WGT ||HGT.Z ||WGT.Z
|-
|1 ||100001||West||13.09514||boy||175.5||75.0||1.751||2.410
|-
|2 ||100003||West||13.81793 ||boy||148.4||40.0||2.292||1.494
|-
|3 ||100004||West||13.97125||boy||159.9||46.5||0.743||0.783
|-
|4 ||100005||West||13.98220 ||girl||159.7||46.5 ||0.743 ||0.783
|-
|5||100006||West||13.52225||girl||160.3||47.8||0.414||0.355
|-
|6||100018||East||10.21492||boy||157.8||39.7||2.025||0.823
|}
</center>

summary(fdgs)

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
|ID ||Reg ||Age ||Sex ||HGT
|-
|Min.:100001||North:732||Min.:0.008214||boy:4829||Min.:46.0
|-
|1st Qu.:106353||East:2528||1st Qu.:1.618754||girl:5201||1st Qu.:83.8
|-
|Median:203855||South:2931||Median:8.084873|| ||Median:131.5
|-
|Mean:180091||West:2578||Mean:8.157936|| ||Mean:123.9
|-
|3rd Qu.:210591||City:1261||3rd Qu.:13.547570|| ||3rd Qu.:162.3
|-
|Max.:401955|| ||Max.:21.993155|| ||Max.:208.0
|-
| || || || ||NA's: 23
|}
</center>

====(1) Classification Tree====

Let's use the data frame '''fdgs''' to predict Region, from Age, Height, and Weight.
# grow tree
fit.1 <- rpart(reg ~ age + hgt + wgt, method="class", data= fdgs[,-1])

printcp(fit.1) # display the results
plotcp(fit.1) # visualize cross-validation results
summary(fit.1) # detailed summary of splits

# plot tree
par(oma=c(0,0,2,0))
plot(fit.1, uniform=TRUE, margin=0.3, main="Classification Tree for Region (FDGS Data)")
text(fit.1, use.n=TRUE, all=TRUE, cex=1.0)

<center>[[Image:SMHS_Methods2.png|500px]] </center>

# create a better plot of the classification tree
post(fit.1, title = "Classification Tree for Region (FDGS Data)", file = "")

<center>[[Image:SMHS_Methods3.png|500px]] </center>

====(2) Pruning the tree====

pruned.fit.1<- prune(fit.1, cp= fit.1$\$$cptable[which.min(fit.1$\$$cptable[,"xerror"]),"CP"])

# plot the pruned tree
plot(pruned.fit.1, uniform=TRUE, main="Pruned Classification Tree for Region (FDGS Data)")
text(pruned.fit.1, use.n=TRUE, all=TRUE, cex=1.0)
post(pruned.fit.1, title = "Pruned Classification Tree for Region (FDGS Data)")

Not much change, as the initial tree is not complex!

====(3) Random Forests ====
Random forests may improve predictive accuracy by growing a large number of trees, each fit to a bootstrap sample of the cases and considering a random subset of the variables at each split. Each case is classified by every tree in this "forest", and the final predicted outcome combines the results across all of the trees (an average in regression, a majority vote in classification). See the randomForest package<sup>5</sup>.

library(randomForest)
fit.2 <- randomForest(reg ~ age + hgt + wgt, na.action = na.omit, data= fdgs[,-1]) # reg is a factor, so a classification forest is grown
print(fit.2) # view results
importance(fit.2) # importance of each predictor

Note on missing values/incomplete data: If the data have missing values, we have 3 choices:

1. Use a different tool (rpart handles missing values well)

2. Impute the missing values

3. For a small number of missing cases, we can use na.action = na.omit
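
Options 2 and 3 can be sketched in base R on a small made-up data frame. The values are invented for illustration, and median imputation is only one simple choice; the mice package loaded above provides more principled multiple imputation.

```r
toy <- data.frame(
  age = c(1.2, 5.4, 9.8, 13.1, 17.6),
  hgt = c(76.0, NA, 138.2, 160.3, NA)   # two missing heights
)

# Option 3: drop the incomplete cases
complete <- na.omit(toy)

# Option 2 (simplest form): replace missing heights by the observed median
imputed <- toy
miss <- is.na(imputed[["hgt"]])
imputed[["hgt"]][miss] <- median(toy[["hgt"]], na.rm = TRUE)

print(nrow(complete))   # number of complete cases kept
print(imputed)
```

Dropping cases shrinks the sample, whereas single-value imputation keeps all rows but understates the variability of the imputed variable, which is why the choice depends on how many values are missing.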

===Latent growth and growth mixture modeling (LGM/GMM)===

LGM and GMM represent structural equation modeling techniques that capture inter-individual differences in longitudinal change corresponding to a particular treatment. For instance, patients’ different timing patterns of the treatment effects may represent the underlying sources of HTE. LGM distinguishes whether (yes/no) and how (fast/slow, temporary/lasting) patients respond to treatment. The heterogeneous individual growth trajectories are estimated from intra-individual changes over time by examining common population parameters, i.e., slopes, intercepts, and error variances. Suppose each individual has a unique initial status (intercept) and response rate (slope) during a specific time interval. Then the variances of the individuals’ baseline measures (intercepts) and changes (slopes) in health outcomes will represent the degree of HTE. The LGM-identified HTE of individual growth curves can be attributed to observed predictors, including both fixed and time-varying covariates.

LGM assumes that all individuals are from the same population (too restrictive in some cases). If the HTE is due to observed demographic variables, such as age, gender, and marital status, one may utilize multiple-group LGM. Despite its successful applications for modeling longitudinal change, there may be multiple subpopulations with unobserved heterogeneities. Growth mixture modeling (GMM) extends LGM to allow the identification and prediction of unobserved subpopulations in longitudinal data analysis. Each unobserved subpopulation may constitute its own latent class and behave differently than individuals in other latent classes. Within each latent class, there are also different trajectories across individuals; however, different latent classes don’t share common population parameters. Suppose we are interested in studying retirees’ psychological well-being change trajectory when multiple unknown subpopulations exist. We can add another layer (a latent class variable) on the LGM framework so that the unobserved latent classes can be inferred from the data. The covariates in GMM are designed to affect growth factors distinctly across different latent classes. Therefore, there are two types of HTE: 1) the latent class variable in GMM divides individuals into groups with different growth curves; and 2) coefficient estimates vary across latent classes.

Latent variables are not directly observed – they are inferred (via a model) from other actually observed and directly measured variables. Models that explain observed variables in terms of latent variables are called latent variable models. When the latent (unobserved) variable is discrete, it is referred to as a latent class variable.
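
The difference between a single-population growth model and a mixture of latent classes can be illustrated with a small simulation. Everything below is assumed for illustration (two latent classes, their mean intercepts and slopes, the noise levels), and the per-subject slopes are estimated by ordinary least squares rather than by an actual LGM/GMM fit.

```r
set.seed(3)
n_subj <- 100
times  <- 0:4                            # five measurement occasions
class  <- rep(1:2, each = n_subj / 2)    # latent class (unobserved in practice)

# Class-specific mean trajectories: class 1 changes slowly, class 2 quickly
mean_int   <- c(10, 10)[class]
mean_slope <- c(0.5, 2.0)[class]

intercept <- rnorm(n_subj, mean_int,   sd = 1.0)   # individual initial status
slope     <- rnorm(n_subj, mean_slope, sd = 0.2)   # individual response rate

# Observed outcomes: one row per subject, one column per occasion
y <- sapply(times, function(t) intercept + slope * t + rnorm(n_subj, sd = 0.5))

# Per-subject OLS slope as a crude estimate of the individual growth rate
est_slope <- apply(y, 1, function(yi) coef(lm(yi ~ times))[2])

print(var(est_slope))                   # pooled variance, inflated by the class mixture
print(tapply(est_slope, class, var))    # within-class variances
print(tapply(est_slope, class, mean))   # class-specific mean slopes
```

In this simulated setting the pooled slope variance is dominated by the difference between the two classes, which is the kind of unobserved heterogeneity that a latent class variable is meant to capture.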

Breast Cancer Example: Recall the lmer function (lme4 package) from the earlier discussions, where Linear Mixed Models (LMM) are used for longitudinal data to examine change of outcomes over time relative to predictive covariates. LMM assumptions include:

(i) continuous longitudinal outcome

(ii) Gaussian random-effects and errors

(iii) linearity of the relationships with the outcome

(iv) homogeneous population

(v) missing at random data

The objectives of LGM/GMM models (see Latent Class Mixed Models, lcmm R package<sup>6,7</sup>) are to extend the linear mixed model estimation to:

(i) heterogeneous populations (relax (iv) above). Use hlme for latent class linear mixed models (i.e. Gaussian continuous outcome)

(ii) other types of longitudinal outcomes : ordinal, (bounded) quantitative non-Gaussian outcomes (relax (i), (ii), (iii), (iv)). Use lcmm for general latent class mixed models with outcomes of different nature

(iii) joint analysis of a time-to-event (relax (iv), (v)). Use Jointlcmm for joint latent class models with a longitudinal outcome and a right-censored (left-truncated) time-to-event

Let’s use these data (http://www.ats.ucla.edu/stat/data/hdp.csv), representing cancer phenotypes and predictors (e.g., "IL6", "CRP", "LengthofStay", "Experience") and outcome measures (e.g., remission) collected on patients, nested within doctors (DID) and within hospitals (HID).

We can illustrate the latent class linear mixed models implemented in hlme through a study of the quadratic trajectories of the response (remission) with TumorSize, adjusting for CO2*Pain interaction and assuming correlated random-effects for the functions of SmokingHx and Sex. To estimate the corresponding standard linear mixed model using 1 latent class where CO2 interacts with Pain:

# install.packages("lcmm")
library("lcmm")

hdp <- read.csv("http://www.ats.ucla.edu/stat/data/hdp.csv")
hdp <- within(hdp, {
Married <- factor(Married, levels = 0:1, labels = c("no", "yes"))
DID <- factor(DID)
HID <- factor(HID)
})

# Add a new subject ID column (“ID”, appended as the last column of the data); this is required by the '''hlme''' call
hdp$\$$ID <- seq.int(nrow(hdp))

model.hlme <- hlme(remission ~ IL6 + CRP + LengthofStay + Experience + I(tumorsize^2) + co2*pain + I(tumorsize^2)*pain, random=~ SmokingHx + Sex, subject='ID', data=hdp, ng=1)
summary(model.hlme)

Heterogenous linear mixed model
fitted by maximum likelihood method

hlme(fixed = remission ~ IL6 + CRP + LengthofStay + Experience +
I(tumorsize^2) + co2 * pain + I(tumorsize^2) * pain, random = ~SmokingHx +
Sex, subject = "ID", ng = 1, data = hdp)

Statistical Model:
Dataset: hdp
Number of subjects: 8525
Number of observations: 8525
Number of latent classes: 1
Number of parameters: 21

Iteration process:
'''Convergence criteria satisfied'''
Number of iterations: 34
Convergence criteria: parameters= 1.2e-09
: likelihood= 8.3e-06
: second derivatives= 2.7e-05

Goodness-of-fit statistics:
maximum log-likelihood: -5223.9
AIC: 10489.79
BIC: 10637.86

Maximum Likelihood Estimates:

<center>Fixed effects in the Longitudinal Model:

{| class="wikitable" style="text-align:center; " border="1"
|-
| ||coef||Se||Wald||p-value
|-
|Intercept||0.28636||0.24314||1.178||0.23890
|-
|IL6||-0.01134||0.00183||-6.184||0.00000
|-
|CRP||-0.00674||0.00167||-4.043||0.00005
|-
|LengthofStay||-0.04834||0.00463||-10.436||0.00000
|-
|Experience||0.01695||0.00119||14.263||0.00000
|-
|I(tumorsize^2)||0.00000||0.00001||-0.076||0.93953
|-
|co2||-0.03549||0.16204||-0.219||0.82663
|-
|pain||0.03930||0.04278||0.919||0.35832
|-
|co2:pain||-0.01489||0.02871||-0.519||0.60395
|-
|I(tumorsize^2):pain||0.00000||0.00000||0.553||0.58045
|}
</center>

<center>Variance-covariance matrix of the random-effects

{| class="wikitable" style="text-align:center; " border="1"
|-
| ||intercept||SmokingHxformer||SmokingHxnever||Sexmale
|-
|intercept||0.19310943|| || ||
|-
|SmokingHxformer||-0.10617988||0.209155186|| ||
|-
|SmokingHxnever||-0.12388534||0.068342049||2.262655e-01||
|-
|Sexmale||-0.08130975||-0.007353491||-1.873934e-05||0.1730187

|}
</center>

Residual standard error:

coef: 0.1299767

se: 1.187426

Results interpretation:

(1) The first part of the summary provides information about the dataset, the number of subjects, observations, observations deleted (since by default, missing observations are deleted), number of latent classes and number of parameters.

(2) Next, details about the algorithm convergence are provided along with the number of iterations, the convergence criteria, and the information indicating if the model converged correctly: "convergence criteria satisfied".

(3) The maximum log-likelihood, Akaike criterion (AIC) and Bayesian Information criterion (BIC) are reported.

(4) Estimates of the parameters, their estimated standard errors, the Wald test statistics (with Normal approximation) and the corresponding p-values are reported in the fixed-effects table.

(5) For the random-effect distribution, the estimated matrix of covariance of the random-effects is displayed.

(6) The standard error of the residuals is given along with its estimated standard error.

(7) Neither the quadratic TumorSize term nor Pain appears to be associated with Remission. This may be formally assessed using a multivariate Wald test of both coefficients:

WaldMult(model.hlme, pos=c(6,8))
# pos - a vector containing the indices in model.hlme of the parameters to test
Wald Test p_value
I(tumorsize^2) = pain = 0 0.85562 0.65193

We may consider the model with an adjustment for CRP only on the intercept. Below we estimate the corresponding models for a varying number of latent classes (from 1 to 3) using the default initial values:

# Initial Model: model.hlme <- hlme(remission ~ IL6 + CRP + LengthofStay + Experience + I(tumorsize^2) + co2*pain + I(tumorsize^2)*pain, random=~ SmokingHx + Sex, subject='ID', data=hdp, ng=1)

model.hlme.1 <- hlme(tumorsize ~ IL6 + CRP + LengthofStay, subject='ID', data=hdp, ng=1)
model.hlme.2 <- hlme(tumorsize ~ IL6 + CRP + LengthofStay + SmokingHx, mixture=~ SmokingHx, subject='ID', data=hdp, ng=2)
model.hlme.3 <- hlme(tumorsize ~ IL6 + CRP + LengthofStay + SmokingHx, mixture=~ SmokingHx, subject='ID', data=hdp, ng=3)

The estimation process for a varying number of latent classes can be summarized with summarytable, which gives the log-likelihood, the number of parameters, the Bayesian Information Criterion, and the posterior proportion of each class:

summarytable(model.hlme.1, model.hlme.2, model.hlme.3)
G loglik npm BIC %class1 %class2 %class3
model.hlme.1 1 -33301.82 5 66648.89 100.000000
model.hlme.2 2 -31592.79 11 63285.15 99.214076 0.7859238
model.hlme.3 3 -31589.55 15 63314.86 6.357771 82.2991202 11.34311

The program took 404.65 seconds

In this example, the optimal number of latent classes according to the BIC is two (the smallest BIC). The posterior classification is described with:
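
The BIC column can be reproduced from the log-likelihood and the number of parameters as BIC = -2 loglik + npm log(N), taking N to be the number of subjects (8525). The values below are copied from the summarytable output, so the result matches the table only up to rounding; the variable names are illustrative.

```r
loglik <- c(-33301.82, -31592.79, -31589.55)   # log-likelihoods from summarytable
npm    <- c(5, 11, 15)                         # number of parameters per model
n_subj <- 8525                                 # number of subjects

bic <- -2 * loglik + npm * log(n_subj)
print(round(bic, 2))
print(which.min(bic))    # index of the model with the smallest BIC
```

The third class improves the log-likelihood only slightly while adding four parameters, so the penalty term outweighs the gain and the two-class model is preferred.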

postprob(model.hlme.2)

Posterior classification:
class1 class2
N 8458.00 67.00
% 99.21 0.79

Posterior classification table:
--> mean of posterior probabilities in each class
prob1 prob2
class1 0.8555 0.1445
class2 0.4362 0.5638

Posterior probabilities above a threshold (%):
class1 class2
prob>0.7 92.48 2.99
prob>0.8 77.38 0.00
prob>0.9 38.53 0.00

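In this example, class 1 includes a posteriori 8458 subjects (99.21%) while class 2 includes 67 subjects (0.79%). Subjects assigned to class 1 have a mean posterior probability of 0.8555 of belonging to that class, whereas subjects assigned to class 2 have a mean posterior probability of only 0.5638 for class 2, so membership in the small second class is uncertain.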
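
The three parts of the postprob output can be derived from a matrix of posterior class-membership probabilities. The sketch below uses five hypothetical subjects with invented probabilities, and it assumes that each subject is assigned to the class with the highest posterior probability and that the threshold percentages are computed within each assigned class.

```r
# Hypothetical posterior class-membership probabilities for five subjects
post <- rbind(c(0.95, 0.05),
              c(0.80, 0.20),
              c(0.60, 0.40),
              c(0.30, 0.70),
              c(0.45, 0.55))
colnames(post) <- c("prob1", "prob2")

# Assign each subject to the class with the highest posterior probability
assigned <- max.col(post, ties.method = "first")
print(table(assigned))                  # class sizes

# Mean posterior probabilities within each assigned class
class_table <- apply(post, 2, function(p) tapply(p, assigned, mean))
print(class_table)

# Share of subjects in each class whose own-class probability exceeds 0.7
own_prob  <- post[cbind(seq_len(nrow(post)), assigned)]
above_0.7 <- tapply(own_prob > 0.7, assigned, mean)
print(above_0.7)
```

Diagonal entries of the classification table close to 1 indicate well-separated classes; values near 0.5, as for class 2 in the output above, indicate ambiguous assignments.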

In class 1, 92.48% of the subjects were classified with a posterior probability above 0.7, while only 2.99% of the subjects in class 2 were classified with a posterior probability above 0.7. Goodness-of-fit of the model can be assessed by displaying the residuals and the mean predictions of the model, as in the figures below, according to the time variable given in var.time:

plot(model.hlme.2)
# Figure (left panel)
plot(model.hlme.2, which="fit", var.time="Age", bty="l", ylab=" Remission ", xlab="Age", lwd=2)
# Figure (right panel)
plot(model.hlme.2, which="fit", var.time="Age", bty="l", ylab=" Remission ", xlab="Age", lwd=2, marg=FALSE)

<center>[[Image:SMHS_Methods4.png|500px]] </center>

<center>[[Image:SMHS_Methods5.png|500px]] </center>

<center>[[Image:SMHS_Methods6.png|500px]] </center>

The latent process mixed models implemented in lcmm are illustrated through the study of the linear trajectory of ntumors with Age, adjusted for Sex and assuming correlated random-effects for the intercept and Age. The following lines estimate the corresponding latent process mixed model with different link functions:

model.hlme.lin <- lcmm(ntumors ~ Age*Sex, random=~ Age ,subject='ID', data=hdp)
model.hlme.beta <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', data=hdp, link='beta')
model.hlme.spl <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', data=hdp, link='splines')
model.hlme.spl5q <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', data=hdp, link='5-quant-splines')

link function: An optional family of link functions.

*"linear" (the default) specifies a linear link function leading to a standard linear mixed model (homogeneous or heterogeneous as estimated in hlme).
*"beta" estimates a link function from the family of Beta cumulative distribution functions.
*"thresholds" uses a threshold model to describe the correspondence between each level of an ordinal outcome and the underlying latent process.
*"splines" approximates the link function by I-splines. In this case, the number of nodes and the nodes location can also be specified: the number of nodes is entered first, followed by -, then the location is specified with "equi", "quant" or "manual" (for, respectively, equidistant nodes, nodes at quantiles of the marker distribution, or interior nodes entered manually in the argument intnodes), followed by - and finally "splines". For example, "7-equi-splines" means I-splines with 7 equidistant nodes, "6-quant-splines" means I-splines with 6 nodes located at the quantiles of the marker distribution and "9-manual-splines" means I-splines with 9 nodes, the vector of 7 interior nodes being entered in the argument intnodes.

summary (model.hlme.lin)

General latent class mixed model fitted by maximum likelihood method

lcmm(fixed = ntumors ~ Age * Sex, random = ~Age, subject = "ID",data = hdp)

Statistical Model:
Dataset: hdp
Number of subjects: 8525
Number of observations: 8525
Number of latent classes: 1
Number of parameters: 8
Link function: linear

Iteration process:
Maximum number of iteration reached without convergence
Number of iterations: 100
Convergence criteria: parameters= 5.4e-10
: likelihood= 5.5e-10
: second derivatives= 1

Goodness-of-fit statistics:
maximum log-likelihood: -19915.24
AIC: 39846.49
BIC: 39902.89

Discrete posterior log-likelihood: 0
Discrete AIC: 16

Mean discrete AIC per subject: 9e-04
Mean UACV per subject: 0
Mean discrete LL per subject: 0

Maximum Likelihood Estimates:

Fixed effects in the longitudinal model:

coef Se Wald p-value
intercept (not estimated) 0.00000
Age 0.09491
Sexmale -0.66303
Age:Sexmale 0.01132

Variance-covariance matrix of the random-effects:
intercept Age
intercept 20.5013715
Age -0.2889814 0.007696382

Residual standard error (not estimated) = 1

Parameters of the link function:

coef Se Wald p-value
Linear 1 (intercept) -0.36768
Linear 2 (std err) 0.71432

Objects model.hlme.lin, model.hlme.beta, model.hlme.spl and model.hlme.spl5q are latent process mixed models that assume the exact same trajectory for the underlying latent process but use, respectively, a linear link, a Beta CDF link, I-splines with 5 equidistant knots (default with link=’splines’) and I-splines with 5 knots at quantiles. model.hlme.lin reduces to a standard linear mixed model (link=’linear’ by default). The only difference from an hlme object is the parameterization of the intercept and the residual standard error, which are considered as rescaling parameters.

col <- rainbow(4)
plot(model.hlme.lin, which="linkfunction", bty='l', ylab="Number-of-Tumors", col=col[1], lwd=2, xlab="underlying latent process")
plot(model.hlme.beta, which="linkfunction", add=TRUE, col=col[2], lwd=2)
plot(model.hlme.spl, which="linkfunction", add=TRUE, col=col[3], lwd=2)
plot(model.hlme.spl5q, which="linkfunction", add=TRUE, col=col[4], lwd=2)
legend(x="topleft",legend=c("linear", "beta","splines (5equidistant)", "splines (5 at quantiles)"), lty=1,col=col,bty="n",lwd=2)

# to obtain confidence bands use function predictlink
link.lin <- predictlink(model.hlme.lin, ndraws=2000)

Error in predictlink.lcmm(model.hlme.spl, ndraws = 2000):
No confidence intervals can be produced since the program did not converge properly

model.hlme.lin$\$$conv # double-check the convergence of the algorithm
[1] 2
# status of convergence:
# =1 if the convergence criteria were satisfied,
# =2 if the maximum number of iterations was reached,
# =4 or 5 if a problem occurred during optimisation

model.hlme.lin <- lcmm(ntumors ~ Age*Sex, random=~ Age ,subject='ID', epsY = 0.5, convB = 1e-01, convL = 1e-01, convG = 1e-01, maxiter=200, data=hdp); model.hlme.lin$\$$conv

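# Once the (relaxed) convergence criteria are satisfied, we can obtain confidence intervals.
# Note that thresholds of 1e-01 are much looser than the package defaults, so the estimates should be interpreted with caution.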
link.lin <- predictlink(model.hlme.lin, ndraws=2000)

# plot(model.hlme.lin, which="linkfunction", bty='l', ylab="Number-of-Tumors", col=col[1], lwd=2, xlab="underlying latent process")
plot(link.lin, add=TRUE, col=col[1], lty=2, lwd=2)
legend(x="left", legend=c("95% confidence bands", "for linear fit"), lty=c(2,NA), col=c(col[1],NA), bty="n", lwd=2)

<center>[[Image:SMHS_Methods7.png|500px]] </center>

# Repeat using the other link functions … model.hlme.beta, model.hlme.spl, …
model.hlme.beta <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', data=hdp, link='beta',
convB = 1e-01, convL = 1e-01, convG = 1e-01, maxiter=200); model.hlme.beta$\$$conv
link.beta <- predictlink(model.hlme.beta, ndraws=2000)
plot(link.beta, add=TRUE, col=col[2], lty=2, lwd=2)
legend(x="left", legend=c("95% confidence bands", "for BETA fit"), lty=c(2,NA), col=c(col[2],NA), bty="n", lwd=2)

===Footnotes===
*1 http://cran.r-project.org/web/packages/rpart/index.html
*2 http://www.mayo.edu/hsr/techrpt/61.pdf
*3 http://dx.doi.org/10.1371/journal.pone.0027608
*4 http://www.nature.com/pr/journal/v73/n3/abs/pr2012189a.html
*5 http://stat-www.berkeley.edu/users/breiman/RandomForests/
*6 http://cran.r-project.org/web/packages/lcmm/
*7 http://arxiv.org/pdf/1503.00890v1.pdf

===[[SMHS_MethodsHeterogeneity_MetaAnalysis|Next see: Meta-Analysis]]===
* [[SMHS_MethodsHeterogeneity|Back to the Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research section]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php/SMHS_MethodsHeterogeneity_HTE}}

SMHS MethodsHeterogeneity HTE

2016-05-23T18:40:39Z

Pineaumi: /* Latent growth and growth mixture modeling (LGM/GMM) */

==[[SMHS_MethodsHeterogeneity| Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research]] - Methods and Approaches for HTE Analytics ==

===Overview===

Recursive partitioning is a data mining technique for exploring structure and patterns in complex data. It facilitates the visualization of decision rules for predicting categorical (classification tree) or continuous (regression tree) outcome variables. The R rpart package<sup>1</sup> provides the tools for Classification and Regression Tree (CART) modeling; conditional inference trees and random forests are available in separate packages (e.g., party and randomForest). Additional resources include An Introduction to Recursive Partitioning Using the RPART Routines<sup>2</sup>. The Appendix includes a description of the main CART analysis steps.

install.packages("rpart")
library("rpart")

===CART===
Classification and Regression Tree (CART) is a decision-tree based technique that considers how variation observed in a given response variable (continuous or categorical) can be understood through a systematic deconstruction of the overall study population into subgroups, using explanatory variables of interest. For HTE analysis, CART is best suited for early-stage, exploratory analyses. Its relative simplicity can be powerful in identifying basic relationships between variables of interest, and thus identify potential subgroups for more advanced analyses. The key to CART is its ‘systematic’ approach to the development of the subgroups, which are constructed sequentially through repeated, binary splits of the population of interest, one explanatory variable at a time. In other words, each ‘parent’ group is divided into two ‘child’ groups, with the objective of creating increasingly homogeneous subgroups. The process is repeated and the subgroups are then further split, until no additional variables are available for further subgroup development. The resulting tree structure is oftentimes overgrown, but additional techniques are used to ‘trim’ the tree to a point at which its predictive power is balanced against issues of over-fitting. Because the CART approach does not make assumptions regarding the distribution of the dependent variable, it can be used in situations where other multivariate modeling techniques often used for exploratory predictive risk modeling would not be appropriate – namely in situations where data are not normally distributed.

CART analyses are useful in situations where there is some evidence to suggest that HTE exists, but the subgroups defining the heterogeneous response are not well understood. CART allows for an exploration of response in a myriad of complex subpopulations, and more recently developed ensemble methods (such as Bayesian Additive Regression Trees) allow for more robust analyses through the combination of multiple CART analyses.
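The repeated binary splitting described above can be illustrated with a minimal sketch. It assumes the Gini impurity as the homogeneity criterion (one common choice for classification trees) and uses a small made-up data set; the helper names gini and split_impurity are hypothetical and are not part of rpart.

```r
# Gini impurity of a vector of class labels: 1 - sum(p_k^2)
gini <- function(labels) {
  p <- table(labels) / length(labels)
  1 - sum(p^2)
}

# Weighted impurity of the two 'child' groups obtained by splitting x at a threshold
split_impurity <- function(x, y, threshold) {
  left  <- y[x <  threshold]
  right <- y[x >= threshold]
  (length(left) * gini(left) + length(right) * gini(right)) / length(y)
}

# Toy data (hypothetical): one numeric predictor and a binary outcome
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c("a", "a", "a", "b", "b", "a", "b", "b")

# Candidate thresholds: midpoints between consecutive sorted predictor values
xs <- sort(unique(x))
candidates <- (head(xs, -1) + tail(xs, -1)) / 2

# Evaluate every candidate split and keep the one with the most homogeneous children
impurities <- sapply(candidates, function(t) split_impurity(x, y, t))
best_threshold <- candidates[which.min(impurities)]
best_gain <- gini(y) - min(impurities)   # impurity reduction relative to the parent

print(best_threshold)
print(best_gain)
```

A full tree would apply the same search recursively within each child group and over all explanatory variables, which is what rpart automates.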

====Example Fifth Dutch growth study====

# Let’s use the Fifth Dutch growth study (2009) data, fdgs<sup>3</sup>. Is it true that “the world’s tallest nation has stopped growing taller: the height of Dutch children from 1955 to 2009”<sup>4</sup>?

#install.packages("mice")
library("mice")
?fdgs
head(fdgs)

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||ID ||Reg ||Age ||Sex ||HGT ||WGT ||HGT.Z ||WGT.Z
|-
|1 ||100001||West||13.09514||boy||175.5||75.0||1.751||2.410
|-
|2 ||100003||West||13.81793 ||boy||148.4||40.0||2.292||1.494
|-
|3 ||100004||West||13.97125||boy||159.9||46.5||0.743||0.783
|-
|4 ||100005||West||13.98220 ||girl||159.7||46.5 ||0.743 ||0.783
|-
|5||100006||West||13.52225||girl||160.3||47.8||0.414||0.355
|-
|6||100018||East||10.21492||boy||157.8||39.7||2.025||0.823
|}
</center>

summary(fdgs)

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
|ID ||Reg ||Age ||Sex ||HGT
|-
|Min.:100001||North:732||Min.:0.008214||boy:4829||Min.:46.0
|-
|1st Qu.:106353||East:2528||1st Qu.:1.618754||girl:5201||1st Qu.:83.8
|-
|Median:203855||South:2931||Median:8.084873|| ||Median:131.5
|-
|Mean:180091||West:2578||Mean:8.157936|| ||Mean:123.9
|-
|3rd Qu.:210591||City:1261||3rd Qu.:13.547570|| ||3rd Qu.:162.3
|-
|Max.:401955|| ||Max.:21.993155|| ||Max.:208.0
|-
| || || || ||NA's: 23
|}
</center>

====(1) Classification Tree====

Let's use the data frame '''fdgs''' to predict Region, from Age, Height, and Weight.
# grow tree
fit.1 <- rpart(reg ~ age + hgt + wgt, method="class", data= fdgs[,-1])

printcp(fit.1) # display the results
plotcp(fit.1) # visualize cross-validation results
summary(fit.1) # detailed summary of splits

# plot tree
par(oma=c(0,0,2,0))
plot(fit.1, uniform=TRUE, margin=0.3, main="Classification Tree for Region (FDGS Data)")
text(fit.1, use.n=TRUE, all=TRUE, cex=1.0)

<center>[[Image:SMHS_Methods2.png|500px]] </center>

# create a better plot of the classification tree
post(fit.1, title = "Classification Tree for Region (FDGS Data)", file = "")

<center>[[Image:SMHS_Methods3.png|500px]] </center>

====(2) Pruning the tree====

pruned.fit.1 <- prune(fit.1, cp = fit.1$cptable[which.min(fit.1$cptable[,"xerror"]), "CP"])

# plot the pruned tree
plot(pruned.fit.1, uniform=TRUE, main="Pruned Classification Tree for Region (FDGS Data)")
text(pruned.fit.1, use.n=TRUE, all=TRUE, cex=1.0)
post(pruned.fit.1, title = "Pruned Classification Tree for Region (FDGS Data)")

Not much change, as the initial tree is not complex!
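The pruning call above picks the complexity parameter (CP) with the smallest cross-validated error. The sketch below reproduces that selection on a made-up table laid out like an rpart cptable (columns CP, nsplit, rel error, xerror, xstd), and also shows the "1-SE" rule, an alternative that prefers the simplest tree whose cross-validated error is within one standard error of the minimum. The numbers are hypothetical and are not output from the FDGS fit.

```r
# Hypothetical table laid out like fit$cptable from rpart (values are made up)
cptable <- matrix(c(
  0.100, 0, 1.00, 1.00, 0.04,
  0.050, 1, 0.90, 0.93, 0.04,
  0.020, 3, 0.80, 0.90, 0.04,
  0.010, 6, 0.77, 0.92, 0.04
), ncol = 5, byrow = TRUE)
colnames(cptable) <- c("CP", "nsplit", "rel error", "xerror", "xstd")

# Rule used in the text: CP at the minimum cross-validated error
i_min  <- which.min(cptable[, "xerror"])
cp_min <- cptable[i_min, "CP"]

# Alternative 1-SE rule: the first (simplest) row whose xerror is within
# one standard error of the minimum
threshold <- cptable[i_min, "xerror"] + cptable[i_min, "xstd"]
i_1se  <- which(cptable[, "xerror"] <= threshold)[1]
cp_1se <- cptable[i_1se, "CP"]

print(c(cp_min = unname(cp_min), cp_1se = unname(cp_1se)))
```

Either value can then be passed as the cp argument of prune(); the 1-SE choice typically yields a smaller tree.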

====(3) Random Forests ====
Random forests may improve predictive accuracy by growing a large number of trees, each fit to a bootstrap sample of the cases and using a random subset of the variables at each split. Each case is classified by every tree in this "forest", and the final predicted outcome combines the results across all of the trees (an average in regression, a majority vote in classification). See the randomForest package<sup>5</sup>.

library(randomForest)
fit.2 <- randomForest(reg ~ age + hgt + wgt, na.action = na.omit, data = fdgs[,-1]) # reg is a factor, so a classification forest is grown
print(fit.2) # view results
importance(fit.2) # importance of each predictor
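How a forest combines its trees can be shown with a small base-R sketch. The per-tree predictions below are invented for illustration (they are not produced by fit.2), and majority_vote is a hypothetical helper.

```r
# Hypothetical class predictions from 5 trees (columns) for 4 cases (rows)
votes <- matrix(c(
  "West",  "West",  "East",  "West",  "South",
  "East",  "East",  "East",  "South", "East",
  "South", "City",  "South", "South", "North",
  "North", "North", "West",  "North", "North"
), nrow = 4, byrow = TRUE)

# Classification: the most frequent label across trees wins
majority_vote <- function(v) names(which.max(table(v)))
forest_class <- apply(votes, 1, majority_vote)

# Regression: average the numeric predictions across trees
# (hypothetical height predictions from 3 trees for 2 cases)
tree_preds <- matrix(c(170, 172, 168,
                       150, 155, 151), nrow = 2, byrow = TRUE)
forest_mean <- rowMeans(tree_preds)

print(forest_class)
print(forest_mean)
```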

Note on missing values/incomplete data: If the data have missing values, we have 3 choices:

1. Use a different tool (rpart handles missing values well)

2. Impute the missing values

3. For a small number of missing cases, we can use na.action = na.omit
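Options 2 and 3 can be contrasted on a toy data frame. This is only a sketch: the data are made up, and single median imputation is used purely for illustration (multiple imputation, e.g. with the mice package loaded earlier, is generally preferable).

```r
# Toy data with missing heights and weights (hypothetical values)
toy <- data.frame(
  age = c(1.2, 5.0, 9.3, 13.1, 17.8),
  hgt = c(76.0, NA, 135.5, 160.2, NA),
  wgt = c(10.1, 18.4, 30.2, NA, 65.0)
)

# Option 3: drop every row that has at least one missing value
complete_cases <- na.omit(toy)

# Option 2 (simplest form): replace each NA by the median of its column
imputed <- as.data.frame(lapply(toy, function(col) {
  col[is.na(col)] <- median(col, na.rm = TRUE)
  col
}))

print(nrow(complete_cases))   # rows kept by na.omit
print(imputed)
```

Note how na.omit discards three of the five rows here, which is why it is only advisable when few cases are incomplete.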

===Latent growth and growth mixture modeling (LGM/GMM)===

LGM and GMM represent structural equation modeling techniques that capture inter-individual differences in longitudinal change corresponding to a particular treatment. For instance, patients’ different timing patterns of the treatment effects may represent the underlying sources of HTE. LGM distinguishes whether (yes/no) and how (fast/slow, temporary/lasting) patients respond to treatment. The heterogeneous individual growth trajectories are estimated from intra-individual changes over time by examining common population parameters, i.e., slopes, intercepts, and error variances. Suppose each individual has a unique initial status (intercept) and response rate (slope) during a specific time interval. Then the variances of the individuals’ baseline measures (intercepts) and changes (slopes) in health outcomes will represent the degree of HTE. The LGM-identified HTE of individual growth curves can be attributed to observed predictors, including both fixed and time-varying covariates.
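The idea that the spread of individual intercepts and slopes reflects HTE can be sketched with simulated data. All values below (sample size, means, standard deviations, time points) are assumptions chosen for illustration, and per-subject least squares is used as a simple stand-in for a fitted latent growth model.

```r
set.seed(1)

n     <- 200        # number of individuals (assumed)
times <- 0:4        # common measurement occasions (assumed)

# Each individual has their own initial status and response rate
b0 <- rnorm(n, mean = 50, sd = 5)   # individual intercepts
b1 <- rnorm(n, mean = 2,  sd = 1)   # individual slopes

# n x length(times) matrix of outcomes: linear growth plus measurement noise
y <- sapply(times, function(t) b0 + b1 * t + rnorm(n, sd = 1))

# Recover each individual's trajectory with a per-subject linear fit
est <- t(apply(y, 1, function(yi) coef(lm(yi ~ times))))
colnames(est) <- c("intercept", "slope")

# The variances of the estimated intercepts and slopes summarize
# how heterogeneous the individual trajectories are
print(colMeans(est))
print(apply(est, 2, var))
```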

LGM assumes that all individuals are from the same population (too restrictive in some cases). If the HTE is due to observed demographic variables, such as age, gender, and marital status, one may utilize multiple-group LGM. Despite its successful applications for modeling longitudinal change, there may be multiple subpopulations with unobserved heterogeneities. Growth mixture modeling (GMM) extends LGM to allow the identification and prediction of unobserved subpopulations in longitudinal data analysis. Each unobserved subpopulation may constitute its own latent class and behave differently than individuals in other latent classes. Within each latent class, there are also different trajectories across individuals; however, different latent classes don’t share common population parameters. Suppose we are interested in studying retirees’ psychological well-being change trajectory when multiple unknown subpopulations exist. We can add another layer (a latent class variable) on the LGM framework so that the unobserved latent classes can be inferred from the data. The covariates in GMM are designed to affect growth factors distinctly across different latent classes. Therefore, there are two types of HTE: 1) the latent class variable in GMM divides individuals into groups with different growth curves; and 2) coefficient estimates vary across latent classes.

Latent variables are not directly observed – they are inferred (via a model) from other variables that are observed and directly measured. Models that explain observed variables in terms of latent variables are called latent variable models. When the latent (unobserved) variable is discrete, it is referred to as a latent class variable.

Breast Cancer Example: Recall the earlier discussion of the lmer function (lme4 package), where linear mixed models (LMM) are used for longitudinal data to examine change in outcomes over time relative to predictive covariates. LMM assumptions include:

(i) continuous longitudinal outcome

(ii) Gaussian random-effects and errors

(iii) linearity of the relationships with the outcome

(iv) homogeneous population

(v) missing at random data

The objectives of LGM/GMM models (see Latent Class Mixed Models, lcmm R package<sup>6,7</sup>) are to extend the linear mixed model estimation to:

(i) heterogeneous populations (relax (iv) above). Use hlme for latent class linear mixed models (i.e. Gaussian continuous outcome)

(ii) other types of longitudinal outcomes: ordinal, (bounded) quantitative non-Gaussian outcomes (relax (i), (ii), (iii), (iv)). Use lcmm for general latent class mixed models with outcomes of different nature

(iii) joint analysis of a time-to-event (relax (iv), (v)). Use Jointlcmm for joint latent class models with a longitudinal outcome and a right-censored (left-truncated) time-to-event

Let’s use these data (http://www.ats.ucla.edu/stat/data/hdp.csv), representing cancer phenotypes and predictors (e.g., "IL6", "CRP", "LengthofStay", "Experience") and outcome measures (e.g., remission) collected on patients, nested within doctors (DID) and within hospitals (HID).

We can illustrate the latent class linear mixed models implemented in hlme through a study of the quadratic trajectories of the response (remission) with TumorSize, adjusting for CO2*Pain interaction and assuming correlated random-effects for the functions of SmokingHx and Sex. To estimate the corresponding standard linear mixed model using 1 latent class where CO2 interacts with Pain:

# install.packages("lcmm")
library("lcmm")

hdp <- read.csv("http://www.ats.ucla.edu/stat/data/hdp.csv")
hdp <- within(hdp, {
Married <- factor(Married, levels = 0:1, labels = c("no", "yes"))
DID <- factor(DID)
HID <- factor(HID)
})

# add a new subject ID column ("ID", the last column in the data); this is necessary for the '''hlme''' call
hdp$ID <- seq.int(nrow(hdp))

model.hlme <- hlme(remission ~ IL6 + CRP + LengthofStay + Experience + I(tumorsize^2) + co2*pain + I(tumorsize^2)*pain, random=~ SmokingHx + Sex, subject='ID', data=hdp, ng=1)
summary(model.hlme)

Heterogenous linear mixed model
fitted by maximum likelihood method

hlme(fixed = remission ~ IL6 + CRP + LengthofStay + Experience +
I(tumorsize^2) + co2 * pain + I(tumorsize^2) * pain, random = ~SmokingHx +
Sex, subject = "ID", ng = 1, data = hdp)

Statistical Model:
Dataset: hdp
Number of subjects: 8525
Number of observations: 8525
Number of latent classes: 1
Number of parameters: 21

Iteration process:
'''Convergence criteria satisfied'''
Number of iterations: 34
Convergence criteria: parameters= 1.2e-09
: likelihood= 8.3e-06
: second derivatives= 2.7e-05

Goodness-of-fit statistics:
maximum log-likelihood: -5223.9
AIC: 10489.79
BIC: 10637.86

Maximum Likelihood Estimates:

<center>Fixed effects in the Longitudinal Model:

{| class="wikitable" style="text-align:center; " border="1"
|-
| ||coef||Se||Wald||p-value
|-
|Intercept||0.28636||0.24314||1.178||0.23890
|-
|IL6||-0.01134||0.00183||-6.184||0.00000
|-
|CRP||-0.00674||0.00167||-4.043||0.00005
|-
|LengthofStay||-0.04834||0.00463||-10.436||0.00000
|-
|Experience||0.01695||0.00119||14.263||0.00000
|-
|I(tumorsize^2)||0.00000||0.00001||-0.076||0.93953
|-
|co2||-0.03549||0.16204||-0.219||0.82663
|-
|pain||0.03930||0.04278||0.919||0.35832
|-
|co2:pain||-0.01489||0.02871||-0.519||0.60395
|-
|I(tumorsize^2):pain||0.00000||0.00000||0.553||0.58045
|}
</center>

<center>Variance-covariance matrix of the random-effects

{| class="wikitable" style="text-align:center; " border="1"
|-
| ||intercept||SmokingHxformer||SmokingHxnever||Sexmale
|-
|intercept||0.19310943|| || ||
|-
|SmokingHxformer||-0.10617988||0.209155186|| ||
|-
|SmokingHxnever||-0.12388534||0.068342049||2.262655e-01||
|-
|Sexmale||-0.08130975||-0.007353491||-1.873934e-05||0.1730187

|}
</center>

Residual standard error:

coef: 0.1299767

se: 1.187426

Results interpretation:

(1) The first part of the summary provides information about the dataset, the number of subjects, observations, observations deleted (since by default, missing observations are deleted), number of latent classes and number of parameters.

(2) Next, details about the algorithm convergence are provided, along with the number of iterations, the convergence criteria, and a message indicating whether the model converged correctly: "convergence criteria satisfied".

(3) The maximum log-likelihood, Akaike criterion (AIC) and Bayesian Information criterion (BIC) are reported.

(4) Estimates of parameters, the estimated standard error, the Wald Test statistics (with Normal approximation) and the corresponding p-values are reported below.

(5) For the random-effect distribution, the estimated matrix of covariance of the random-effects is displayed.

(6) The standard error of the residuals is given along with its estimated standard error.

(7) Neither the quadratic TumorSize term nor Pain appears to be associated with Remission. This may be formally assessed using a multivariate Wald test:

WaldMult(model.hlme, pos=c(6,8))
# pos - a vector containing the indices in model.hlme of the parameters to test
Wald Test p_value
I(tumorsize^2) = pain = 0 0.85562 0.65193
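The reported p-values can be checked by hand. The sketch below assumes the usual large-sample reference distributions: a standard Normal for a single-coefficient Wald statistic and a chi-square with 2 degrees of freedom for the joint test of two parameters. The inputs are the rounded values printed above, so small rounding differences from the reported output are expected.

```r
# Single-coefficient Wald test for 'pain' (values taken from the fixed-effects table)
coef_pain <- 0.03930
se_pain   <- 0.04278
z_pain    <- coef_pain / se_pain
p_pain    <- 2 * pnorm(-abs(z_pain))      # two-sided Normal p-value

# Joint (multivariate) Wald test of I(tumorsize^2) = pain = 0:
# the statistic is compared with a chi-square on 2 degrees of freedom
wald_joint <- 0.85562
p_joint    <- pchisq(wald_joint, df = 2, lower.tail = FALSE)

print(round(c(z_pain = z_pain, p_pain = p_pain, p_joint = p_joint), 5))
```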

We may consider the model with an adjustment for CRP only on the intercept. Below we estimate the corresponding models for a varying number of latent classes (from 1 to 3) using the default initial values:

# Initial Model: model.hlme <- hlme(remission ~ IL6 + CRP + LengthofStay + Experience + I(tumorsize^2) + co2*pain + I(tumorsize^2)*pain, random=~ SmokingHx + Sex, subject='ID', data=hdp, ng=1)

model.hlme.1 <- hlme(tumorsize ~ IL6 + CRP + LengthofStay, subject='ID', data=hdp, ng=1)
model.hlme.2 <- hlme(tumorsize ~ IL6 + CRP + LengthofStay + SmokingHx, mixture=~ SmokingHx, subject='ID', data=hdp, ng=2)
model.hlme.3 <- hlme(tumorsize ~ IL6 + CRP + LengthofStay + SmokingHx, mixture=~ SmokingHx, subject='ID', data=hdp, ng=3)

The estimation process for a varying number of latent classes can be summarized with summarytable, which gives the log-likelihood, the number of parameters, the Bayesian Information Criterion, and the posterior proportion of each class:

summarytable(model.hlme.1, model.hlme.2, model.hlme.3)
G loglik npm BIC %class1 %class2 %class3
model.hlme.1 1 -33301.82 5 66648.89 100.000000
model.hlme.2 2 -31592.79 11 63285.15 99.214076 0.7859238
model.hlme.3 3 -31589.55 15 63314.86 6.357771 82.2991202 11.34311

The program took 404.65 seconds
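The BIC column of this table can be reproduced from the log-likelihood and the number of parameters. The sketch assumes the standard definition BIC = -2 logLik + npm * log(N), with N taken as the number of subjects (8525 here); the inputs are the rounded values printed above.

```r
n_subjects <- 8525

# Log-likelihoods and parameter counts copied from the summarytable output
loglik <- c(model.hlme.1 = -33301.82,
            model.hlme.2 = -31592.79,
            model.hlme.3 = -31589.55)
npm    <- c(5, 11, 15)

# BIC penalizes each extra parameter by log(N)
bic <- -2 * loglik + npm * log(n_subjects)

print(round(bic, 2))
print(names(which.min(bic)))   # model with the smallest BIC
```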

In this example, the optimal number of latent classes according to the BIC is two (the smallest BIC). The posterior classification is described with:

postprob(model.hlme.2)

Posterior classification:
class1 class2
N 8458.00 67.00
% 99.21 0.79

Posterior classification table:
--> mean of posterior probabilities in each class
prob1 prob2
class1 0.8555 0.1445
class2 0.4362 0.5638

Posterior probabilities above a threshold (%):
class1 class2
prob>0.7 92.48 2.99
prob>0.8 77.38 0.00
prob>0.9 38.53 0.00

In this example, the first class includes a posteriori 8458 subjects (99.21%) while class 2 includes 67 subjects (0.79%). Subjects were classified in class 1 with a mean posterior probability of 0.8555.
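To make the notion of posterior classification concrete, here is a minimal sketch for a two-class Gaussian mixture with known parameters. The class proportions, means, standard deviations and observations are all made up; postprob performs the analogous computation using the fitted mixed model.

```r
# Assumed mixture parameters (hypothetical)
pi_k  <- c(0.7, 0.3)   # class proportions
mu    <- c(0, 3)       # class means
sigma <- c(1, 1)       # class standard deviations

# Hypothetical observations, one per subject
y <- c(-0.5, 0.2, 1.4, 1.6, 2.9, 3.5)

# Bayes rule: P(class k | y) is proportional to pi_k * f_k(y)
dens <- sapply(1:2, function(k) pi_k[k] * dnorm(y, mu[k], sigma[k]))
post <- dens / rowSums(dens)
colnames(post) <- c("prob1", "prob2")

# Each subject is assigned to the class with the highest posterior probability
assigned <- apply(post, 1, which.max)

# Mean posterior probabilities within each assigned class
mean_post <- t(sapply(1:2, function(k) colMeans(post[assigned == k, , drop = FALSE])))
rownames(mean_post) <- c("class1", "class2")

print(table(assigned))
print(round(mean_post, 4))
```

Diagonal entries of the last table close to 1 indicate a clear-cut classification; values near 0.5, as for class 2 in the output above, indicate that the classes are poorly separated.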

In class 1, 92.48% of the subjects were classified with a posterior probability above 0.7, while 2.99% of the subjects in class 2 were classified with a posterior probability above 0.7. Goodness-of-fit of the model can be assessed by displaying the residuals and the mean predictions of the model (see the figures below), according to the time variable given in var.time:

plot(model.hlme.2)
# Figure (left panel)
plot(model.hlme.2, which="fit", var.time="Age", bty="l", ylab=" Remission ", xlab="Age", lwd=2)
# Figure (right panel)
plot(model.hlme.2, which="fit", var.time="Age", bty="l", ylab=" Remission ", xlab="Age", lwd=2, marg=FALSE)

<center>[[Image:SMHS_Methods4.png|500px]] </center>

<center>[[Image:SMHS_Methods5.png|500px]] </center>

<center>[[Image:SMHS_Methods6.png|500px]] </center>

The latent process mixed models implemented in lcmm are illustrated through the study of the linear trajectory of ntumors with Age, adjusted for Sex and assuming correlated random-effects for the intercept and Age. The following lines estimate the corresponding latent process mixed model with different link functions:

model.hlme.lin <- lcmm(ntumors ~ Age*Sex, random=~ Age ,subject='ID', data=hdp)
model.hlme.beta <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', data=hdp, link='beta')
model.hlme.spl <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', data=hdp, link='splines')
model.hlme.spl5q <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', data=hdp, link='5-quant-splines')

link function: An optional family of link functions. By default, "linear" specifies a linear link function leading to a standard linear mixed model (homogeneous or heterogeneous as estimated in hlme). Other options include:

*"beta" for estimating a link function from the family of Beta cumulative distribution functions,
*"thresholds" for using a threshold model to describe the correspondence between each level of an ordinal outcome and the underlying latent process, and
*"splines" for approximating the link function by I-splines. For this latter case, the number of nodes and the node locations should also be specified. The number of nodes is entered first, followed by -, then the location is specified with "equi", "quant" or "manual" for, respectively, equidistant nodes, nodes at quantiles of the marker distribution, or interior nodes entered manually in the argument intnodes. It is followed by - and finally "splines". For example, "7-equi-splines" means I-splines with 7 equidistant nodes, "6-quant-splines" means I-splines with 6 nodes located at the quantiles of the marker distribution, and "9-manual-splines" means I-splines with 9 nodes, the vector of 7 interior nodes being entered in the argument intnodes.
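The naming convention for spline links can be made explicit with a small parser. parse_spline_link is a hypothetical helper written for this illustration (it is not part of lcmm); it only decodes strings of the form "number-location-splines" as described above.

```r
# Decode a spline link specification such as "7-equi-splines"
parse_spline_link <- function(link) {
  parts <- strsplit(link, "-", fixed = TRUE)[[1]]
  if (length(parts) != 3 || parts[3] != "splines") {
    stop("expected a string of the form '<n>-<location>-splines'")
  }
  n_nodes  <- suppressWarnings(as.integer(parts[1]))
  location <- parts[2]
  if (is.na(n_nodes) || !(location %in% c("equi", "quant", "manual"))) {
    stop("invalid number of nodes or node location")
  }
  # The two boundary nodes are not counted among the interior nodes
  list(n_nodes = n_nodes, location = location, n_interior = n_nodes - 2L)
}

print(parse_spline_link("7-equi-splines"))
print(parse_spline_link("9-manual-splines"))
```

For "9-manual-splines" the helper reports 7 interior nodes, matching the length of the vector expected in intnodes.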

summary (model.hlme.lin)

General latent class mixed model fitted by maximum likelihood method

lcmm(fixed = ntumors ~ Age * Sex, random = ~Age, subject = "ID",data = hdp)

Statistical Model:
Dataset: hdp
Number of subjects: 8525
Number of observations: 8525
Number of latent classes: 1
Number of parameters: 8
Link function: linear

Iteration process:
Maximum number of iteration reached without convergence
Number of iterations: 100
Convergence criteria: parameters= 5.4e-10
: likelihood= 5.5e-10
: second derivatives= 1

Goodness-of-fit statistics:
maximum log-likelihood: -19915.24
AIC: 39846.49
BIC: 39902.89

Discrete posterior log-likelihood: 0
Discrete AIC: 16

Mean discrete AIC per subject: 9e-04
Mean UACV per subject: 0
Mean discrete LL per subject: 0

Maximum Likelihood Estimates:

Fixed effects in the longitudinal model:

coef Se Wald p-value
intercept (not estimated) 0.00000
Age 0.09491
Sexmale -0.66303
Age:Sexmale 0.01132

Variance-covariance matrix of the random-effects:
intercept Age
intercept 20.5013715
Age -0.2889814 0.007696382

Residual standard error (not estimated) = 1

Parameters of the link function:

coef Se Wald p-value
Linear 1 (intercept) -0.36768
Linear 2 (std err) 0.71432

Objects model.hlme.lin, model.hlme.beta, model.hlme.spl and model.hlme.spl5q are latent process mixed models that assume the exact same trajectory for the underlying latent process but use, respectively, a linear link, a Beta CDF link, I-splines with 5 equidistant knots (default with link='splines'), and I-splines with 5 knots at quantiles. model.hlme.lin reduces to a standard linear mixed model (link='linear' by default). The only difference from an hlme object is the parameterization of the intercept and the residual standard error, which are treated as rescaling parameters.

col <- rainbow(4)
plot(model.hlme.lin, which="linkfunction", bty='l', ylab="Number-of-Tumors", col=col[1], lwd=2, xlab="underlying latent process")
plot(model.hlme.beta, which="linkfunction", add=T, col=col[2], lwd=2)
plot(model.hlme.spl, which="linkfunction", add=T, col=col[3], lwd=2)
plot(model.hlme.spl5q, which="linkfunction", add=T, col=col[4], lwd=2)
legend(x="topleft",legend=c("linear", "beta","splines (5equidistant)", "splines (5 at quantiles)"), lty=1,col=col,bty="n",lwd=2)

# to obtain confidence bands use function predictlink
link.lin <- predictlink(model.hlme.lin, ndraws=2000)

Error in predictlink.lcmm(model.hlme.spl, ndraws = 2000):
No confidence intervals can be produced since the program did not converge properly

model.hlme.lin$conv # double-check the convergence of the algorithm
[1] 2
# status of convergence:
# =1 if the convergence criteria were satisfied,
# =2 if the maximum number of iterations was reached,
# =4 or 5 if a problem occurred during optimisation

model.hlme.lin <- lcmm(ntumors ~ Age*Sex, random=~ Age ,subject='ID', epsY = 0.5, convB = 1e-01, convL = 1e-01, convG = 1e-01, maxiter=200, data=hdp); model.hlme.lin$conv

# Now that the (relaxed) convergence criteria are satisfied, we can obtain CIs
link.lin <- predictlink(model.hlme.lin, ndraws=2000)

# plot(model.hlme.lin, which="linkfunction", bty='l', ylab="Number-of-Tumors", col=col[1], lwd=2, xlab="underlying latent process")
plot(link.lin, add=TRUE, col=col[1], lty=2, lwd=2)
legend(x="left", legend=c("95% confidence bands", "for linear fit"), lty=c(2,NA), col=c(col[1],NA), bty="n", lwd=2)

<center>[[Image:SMHS_Methods7.png|500px]] </center>

# Repeat using the other link functions … model.hlme.beta, model.hlme.spl, …
model.hlme.beta <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', data=hdp, link='beta',
convB = 1e-01, convL = 1e-01, convG = 1e-01, maxiter=200); model.hlme.beta$conv
link.beta <- predictlink(model.hlme.beta, ndraws=2000)
plot(link.beta, add=TRUE, col=col[2], lty=2, lwd=2)
legend(x="left", legend=c("95% confidence bands", "for BETA fit"), lty=c(2,NA), col=c(col[2],NA), bty="n", lwd=2)

===Footnotes===
*1 http://cran.r-project.org/web/packages/rpart/index.html
*2 http://www.mayo.edu/hsr/techrpt/61.pdf
*3 http://dx.doi.org/10.1371/journal.pone.0027608
*4 http://www.nature.com/pr/journal/v73/n3/abs/pr2012189a.html
*5 http://stat-www.berkeley.edu/users/breiman/RandomForests/
*6 http://cran.r-project.org/web/packages/lcmm/
*7 http://arxiv.org/pdf/1503.00890v1.pdf

===[[SMHS_MethodsHeterogeneity_MetaAnalysis|Next see: Meta-Analysis]]===
* [[SMHS_MethodsHeterogeneity|Back to the Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research section]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php/SMHS_MethodsHeterogeneity_HTE}}

SMHS MethodsHeterogeneity HTE

2016-05-23T18:34:56Z

Pineaumi: /* Latent growth and growth mixture modeling (LGM/GMM) */

==[[SMHS_MethodsHeterogeneity| Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research]] - Methods and Approaches for HTE Analytics ==

===Overview===

Recursive partitioning is a data mining technique for exploring structure and patterns in complex data. It facilitates the visualization of decision rules for predicting categorical (classification tree) or continuous (regression tree) outcome variables. The R rpart package<sup>1</sup> provides tools for Classification and Regression Tree (CART) modeling; conditional inference trees and random forests are available in related packages (e.g., party and randomForest). Additional resources include An Introduction to Recursive Partitioning Using the RPART Routines<sup>2</sup>. The Appendix includes a description of the main CART analysis steps.

install.packages("rpart")
library("rpart")

===CART===
Classification and Regression Tree (CART) is a decision-tree based technique that considers how variation observed in a given response variable (continuous or categorical) can be understood through a systematic deconstruction of the overall study population into subgroups, using explanatory variables of interest. For HTE analysis, CART is best suited for early-stage, exploratory analyses. Its relative simplicity can be powerful in identifying basic relationships between variables of interest, and thus identify potential subgroups for more advanced analyses. The key to CART is its ‘systematic’ approach to the development of the subgroups, which are constructed sequentially through repeated, binary splits of the population of interest, one explanatory variable at a time. In other words, each ‘parent’ group is divided into two ‘child’ groups, with the objective of creating increasingly homogeneous subgroups. The process is repeated and the subgroups are then further split, until no additional variables are available for further subgroup development. The resulting tree structure is oftentimes overgrown, but additional techniques are used to ‘trim’ the tree to a point at which its predictive power is balanced against issues of over-fitting. Because the CART approach does not make assumptions regarding the distribution of the dependent variable, it can be used in situations where other multivariate modeling techniques often used for exploratory predictive risk modeling would not be appropriate – namely in situations where data are not normally distributed.

CART analyses are useful in situations where there is some evidence to suggest that HTE exists, but the subgroups defining the heterogeneous response are not well understood. CART allows for an exploration of response in a myriad of complex subpopulations, and more recently developed ensemble methods (such as Bayesian Additive Regression Trees) allow for more robust analyses through the combination of multiple CART analyses.

====Example Fifth Dutch growth study====

# Let’s use the Fifth Dutch growth study (2009) data, fdgs<sup>3</sup>. Is it true that “the world’s tallest nation has stopped growing taller: the height of Dutch children from 1955 to 2009”<sup>4</sup>?

#install.packages("mice")
library("mice")
?fdgs
head(fdgs)

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||ID ||Reg ||Age ||Sex ||HGT ||WGT ||HGT.Z ||WGT.Z
|-
|1 ||100001||West||13.09514||boy||175.5||75.0||1.751||2.410
|-
|2 ||100003||West||13.81793 ||boy||148.4||40.0||2.292||1.494
|-
|3 ||100004||West||13.97125||boy||159.9||46.5||0.743||0.783
|-
|4 ||100005||West||13.98220 ||girl||159.7||46.5 ||0.743 ||0.783
|-
|5||100006||West||13.52225||girl||160.3||47.8||0.414||0.355
|-
|6||100018||East||10.21492||boy||157.8||39.7||2.025||0.823
|}
</center>

summary(fdgs)

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
|ID ||Reg ||Age ||Sex ||HGT
|-
|Min.:100001||North:732||Min.:0.008214||boy:4829||Min.:46.0
|-
|1st Qu.:106353||East:2528||1st Qu.:1.618754||girl:5201||1st Qu.:83.8
|-
|Median:203855||South:2931||Median:8.084873|| ||Median:131.5
|-
|Mean:180091||West:2578||Mean:8.157936|| ||Mean:123.9
|-
|3rd Qu.:210591||City:1261||3rd Qu.:13.547570|| ||3rd Qu.:162.3
|-
|Max.:401955|| ||Max.:21.993155|| ||Max.:208.0
|-
| || || || ||NA's: 23
|}
</center>

====(1) Classification Tree====

Let's use the data frame '''fdgs''' to predict Region, from Age, Height, and Weight.
# grow tree
fit.1 <- rpart(reg ~ age + hgt + wgt, method="class", data= fdgs[,-1])

printcp(fit.1) # display the results
plotcp(fit.1) # visualize cross-validation results
summary(fit.1) # detailed summary of splits

# plot tree
par(oma=c(0,0,2,0))
plot(fit.1, uniform=TRUE, margin=0.3, main="Classification Tree for Region (FDGS Data)")
text(fit.1, use.n=TRUE, all=TRUE, cex=1.0)

<center>[[Image:SMHS_Methods2.png|500px]] </center>

# create a better plot of the classification tree
post(fit.1, title = "Classification Tree for Region (FDGS Data)", file = "")

<center>[[Image:SMHS_Methods3.png|500px]] </center>

====(2) Pruning the tree====

pruned.fit.1 <- prune(fit.1, cp = fit.1$cptable[which.min(fit.1$cptable[,"xerror"]), "CP"])

# plot the pruned tree
plot(pruned.fit.1, uniform=TRUE, main="Pruned Classification Tree for Region (FDGS Data)")
text(pruned.fit.1, use.n=TRUE, all=TRUE, cex=1.0)
post(pruned.fit.1, title = "Pruned Classification Tree for Region (FDGS Data)")

Not much change, as the initial tree is not complex!

====(3) Random Forests ====
Random forests may improve predictive accuracy by growing a large number of trees, each fit to a bootstrap sample of the cases and using a random subset of the variables at each split. Each case is classified by every tree in this "forest", and the final predicted outcome combines the results across all of the trees (an average in regression, a majority vote in classification). See the randomForest package<sup>5</sup>.

library(randomForest)
fit.2 <- randomForest(reg ~ age + hgt + wgt, na.action = na.omit, data = fdgs[,-1]) # reg is a factor, so a classification forest is grown
print(fit.2) # view results
importance(fit.2) # importance of each predictor

Note on missing values/incomplete data: If the data have missing values, we have 3 choices:

1. Use a different tool (rpart handles missing values well)

2. Impute the missing values

3. For a small number of missing cases, we can use na.action = na.omit

===Latent growth and growth mixture modeling (LGM/GMM)===

LGM and GMM represent structural equation modeling techniques that capture inter-individual differences in longitudinal change corresponding to a particular treatment. For instance, patients’ different timing patterns of the treatment effects may represent the underlying sources of HTE. LGM distinguishes whether (yes/no) and how (fast/slow, temporary/lasting) patients respond to treatment. The heterogeneous individual growth trajectories are estimated from intra-individual changes over time by examining common population parameters, i.e., slopes, intercepts, and error variances. Suppose each individual has a unique initial status (intercept) and response rate (slope) during a specific time interval. Then the variances of the individuals’ baseline measures (intercepts) and changes (slopes) in health outcomes will represent the degree of HTE. The LGM-identified HTE of individual growth curves can be attributed to observed predictors, including both fixed and time-varying covariates.

LGM assumes that all individuals are from the same population (too restrictive in some cases). If the HTE is due to observed demographic variables, such as age, gender, and marital status, one may utilize multiple-group LGM. Despite its successful applications for modeling longitudinal change, there may be multiple subpopulations with unobserved heterogeneities. Growth mixture modeling (GMM) extends LGM to allow the identification and prediction of unobserved subpopulations in longitudinal data analysis. Each unobserved subpopulation may constitute its own latent class and behave differently than individuals in other latent classes. Within each latent class, there are also different trajectories across individuals; however, different latent classes don’t share common population parameters. Suppose we are interested in studying retirees’ psychological well-being change trajectory when multiple unknown subpopulations exist. We can add another layer (a latent class variable) on the LGM framework so that the unobserved latent classes can be inferred from the data. The covariates in GMM are designed to affect growth factors distinctly across different latent classes. Therefore, there are two types of HTE: 1) the latent class variable in GMM divides individuals into groups with different growth curves; and 2) coefficient estimates vary across latent classes.

Latent variables are not directly observed – they are inferred (via a model) from other variables that are observed and directly measured. Models that explain observed variables in terms of latent variables are called latent variable models. When the latent (unobserved) variable is discrete, it is referred to as a latent class variable.

Breast Cancer Example: Recall the earlier discussion of the lmer function (lme4 package), where linear mixed models (LMM) are used for longitudinal data to examine change in outcomes over time relative to predictive covariates. LMM assumptions include:

(i) continuous longitudinal outcome

(ii) Gaussian random-effects and errors

(iii) linearity of the relationships with the outcome

(iv) homogeneous population

(v) missing at random data

The objectives of LGM/GMM models (see Latent Class Mixed Models, lcmm R package<sup>6,7</sup>) are to extend the linear mixed model estimation to:

(i) heterogeneous populations (relax (iv) above). Use hlme for latent class linear mixed models (i.e. Gaussian continuous outcome)

(ii) other types of longitudinal outcomes: ordinal, (bounded) quantitative non-Gaussian outcomes (relax (i), (ii), (iii), (iv)). Use lcmm for general latent class mixed models with outcomes of different nature

(iii) joint analysis of a time-to-event (relax (iv), (v)). Use Jointlcmm for joint latent class models with a longitudinal outcome and a right-censored (left-truncated) time-to-event

Let’s use these data (http://www.ats.ucla.edu/stat/data/hdp.csv), representing cancer phenotypes and predictors (e.g., "IL6", "CRP", "LengthofStay", "Experience") and outcome measures (e.g., remission) collected on patients, nested within doctors (DID) and within hospitals (HID).

We can illustrate the latent class linear mixed models implemented in hlme through a study of the quadratic trajectories of the response (remission) with TumorSize, adjusting for CO2*Pain interaction and assuming correlated random-effects for the functions of SmokingHx and Sex. To estimate the corresponding standard linear mixed model using 1 latent class where CO2 interacts with Pain:

# install.packages("lcmm")
library("lcmm")

hdp <- read.csv("http://www.ats.ucla.edu/stat/data/hdp.csv")
hdp <- within(hdp, {
Married <- factor(Married, levels = 0:1, labels = c("no", "yes"))
DID <- factor(DID)
HID <- factor(HID)
})

Add a new subject ID column (the last column in the data, “ID”); this is necessary for the '''hlme''' call:
hdp$ID <- seq.int(nrow(hdp))

model.hlme <- hlme(remission ~ IL6 + CRP + LengthofStay + Experience + I(tumorsize^2) + co2*pain + I(tumorsize^2)*pain, random=~ SmokingHx + Sex, subject='ID', data=hdp, ng=1)
summary(model.hlme)

Heterogenous linear mixed model
fitted by maximum likelihood method

hlme(fixed = remission ~ IL6 + CRP + LengthofStay + Experience +
I(tumorsize^2) + co2 * pain + I(tumorsize^2) * pain, random = ~SmokingHx +
Sex, subject = "ID", ng = 1, data = hdp)

Statistical Model:
Dataset: hdp
Number of subjects: 8525
Number of observations: 8525
Number of latent classes: 1
Number of parameters: 21

Iteration process:
'''Convergence criteria satisfied'''
Number of iterations: 34
Convergence criteria: parameters= 1.2e-09
: likelihood= 8.3e-06
: second derivatives= 2.7e-05

Goodness-of-fit statistics:
maximum log-likelihood: -5223.9
AIC: 10489.79
BIC: 10637.86

Maximum Likelihood Estimates:

<center>Fixed effects in the Longitudinal Model:

{| class="wikitable" style="text-align:center; " border="1"
|-
| ||coef||Se||Wald||p-value
|-
|Intercept||0.28636||0.24314||1.178||0.23890
|-
|IL6||-0.01134||0.00183||-6.184||0.00000
|-
|CRP||-0.00674||0.00167||-4.043||0.00005
|-
|LengthofStay||-0.04834||0.00463||-10.436||0.00000
|-
|Experience||0.01695||0.00119||14.263||0.00000
|-
|I(tumorsize^2)||0.00000||0.00001||-0.076||0.93953
|-
|co2||-0.03549||0.16204||-0.219||0.82663
|-
|pain||0.03930||0.04278||0.919||0.35832
|-
|co2:pain||-0.01489||0.02871||-0.519||0.60395
|-
|I(tumorsize^2):pain||0.00000||0.00000||0.553||0.58045
|}
</center>

<center>Variance-covariance matrix of the random-effects

{| class="wikitable" style="text-align:center; " border="1"
|-
| ||intercept||SmokingHxformer||SmokingHxnever||Sexmale
|-
|intercept||0.19310943|| || ||
|-
|SmokingHxformer||-0.10617988||0.209155186|| ||
|-
|SmokingHxnever||-0.12388534||0.068342049||2.262655e-01||
|-
|Sexmale||-0.08130975||-0.007353491||-1.873934e-05||0.1730187

|}
</center>

Residual standard error:

coef: 0.1299767

se: 1.187426

Results interpretation:

(1) The first part of the summary provides information about the dataset, the number of subjects, observations, observations deleted (since by default, missing observations are deleted), number of latent classes and number of parameters.

(2) Next, details about the algorithm convergence are provided, along with the number of iterations, the convergence criteria, and a message indicating whether the model converged correctly: "convergence criteria satisfied".

(3) The maximum log-likelihood, Akaike criterion (AIC) and Bayesian Information criterion (BIC) are reported.

(4) Estimates of parameters, the estimated standard error, the Wald Test statistics (with Normal approximation) and the corresponding p-values are reported below.

(5) For the random-effect distribution, the estimated matrix of covariance of the random-effects is displayed.

(6) The standard error of the residuals is given along with its estimated standard error.

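(7) TumorSize does not appear to be associated with Remission, either directly or through its interaction with Pain. This may be formally assessed using a multivariate Wald test (here pos=c(6,8) selects the I(tumorsize^2) and pain coefficients, as the output shows):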

WaldMult(model.hlme, pos=c(6,8))
# pos - a vector containing the indices in model.hlme of the parameters to test
Wald Test p_value
I(tumorsize^2) = pain = 0 0.85562 0.65193
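
The Wald quantities reported above can be checked by hand. The sketch below uses only base R and a few values copied from the fixed-effects table (rounded as displayed, so the recomputed statistics agree with the table only up to rounding). The variable names est, se, wald, pval and p.mult are illustrative and are not part of lcmm.

```r
# Values copied from the fixed-effects table (rounded as displayed)
est <- c(IL6 = -0.01134, CRP = -0.00674, LengthofStay = -0.04834, Experience = 0.01695)
se  <- c(IL6 =  0.00183, CRP =  0.00167, LengthofStay =  0.00463, Experience = 0.00119)

wald <- est / se                   # univariate Wald statistic
pval <- 2 * pnorm(-abs(wald))      # two-sided p-value under the Normal approximation
round(cbind(est, se, wald, pval), 5)

# The multivariate test of 2 parameters refers the statistic 0.85562
# to a chi-square distribution with 2 degrees of freedom
p.mult <- pchisq(0.85562, df = 2, lower.tail = FALSE)
round(p.mult, 5)
```

The last value reproduces the p-value printed by WaldMult, which suggests that the joint test of the two coefficients is a chi-square test with 2 degrees of freedom.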

We may consider the model with an adjustment for CRP only on the intercept. Below we estimate the corresponding models for a varying number of latent classes (from 1 to 3) using the default initial values:

# Initial Model: model.hlme <- hlme(remission ~ IL6 + CRP + LengthofStay + Experience + I(tumorsize^2) + co2*pain + I(tumorsize^2)*pain, random=~ SmokingHx + Sex, subject='ID', data=hdp, ng=1)

model.hlme.1 <- hlme(tumorsize ~ IL6 + CRP + LengthofStay, subject='ID', data=hdp, ng=1)
model.hlme.2 <- hlme(tumorsize ~ IL6 + CRP + LengthofStay + SmokingHx, mixture=~ SmokingHx, subject='ID', data=hdp, ng=2)
model.hlme.3 <- hlme(tumorsize ~ IL6 + CRP + LengthofStay + SmokingHx, mixture=~ SmokingHx, subject='ID', data=hdp, ng=3)

The estimation process for a varying number of latent classes can be summarized with summarytable, which gives the log-likelihood, the number of parameters, the Bayesian Information Criterion, and the posterior proportion of each class:

summarytable(model.hlme.1, model.hlme.2, model.hlme.3)
G loglik npm BIC %class1 %class2 %class3
model.hlme.1 1 -33301.82 5 66648.89 100.000000
model.hlme.2 2 -31592.79 11 63285.15 99.214076 0.7859238
model.hlme.3 3 -31589.55 15 63314.86 6.357771 82.2991202 11.34311

The program took 404.65 seconds
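
As a sanity check, the BIC column of this table can be recomputed from the loglik and npm columns, assuming the usual definition BIC = -2*loglik + npm*log(N) with N = 8525 subjects. The numbers below are copied from the table; the variable names are illustrative.

```r
n      <- 8525                                   # number of subjects
loglik <- c(-33301.82, -31592.79, -31589.55)     # loglik column
npm    <- c(5, 11, 15)                           # number of parameters

bic <- -2 * loglik + npm * log(n)
round(bic, 2)       # agrees with the BIC column up to rounding of loglik
which.min(bic)      # index of the model with the smallest BIC
```

The third class raises the log-likelihood by only about 3 units while adding 4 parameters, so the log(N) penalty outweighs the gain.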

In this example, the optimal number of latent classes according to the BIC is two (the smallest BIC). The posterior classification is described with:

postprob(model.hlme.2)

Posterior classification:
class1 class2
N 8458.00 67.00
% 99.21 0.79

Posterior classification table:
--> mean of posterior probabilities in each class
prob1 prob2
class1 0.8555 0.1445
class2 0.4362 0.5638

Posterior probabilities above a threshold (%):
class1 class2
prob>0.7 92.48 2.99
prob>0.8 77.38 0.00
prob>0.9 38.53 0.00

In this example, class 1 includes a posteriori 8458 subjects (99.21%), while class 2 includes 67 subjects (0.79%). Subjects were classified in class 1 with a mean posterior probability of 0.8555 (85.55%).

In class 1, 92.48% of the subjects were classified with a posterior probability above 0.7, whereas only 2.99% of the subjects in class 2 were classified with a posterior probability above 0.7. Goodness-of-fit of the model can be assessed by displaying the residuals and the mean predictions of the model (shown in the figures below), according to the time variable given in var.time:

plot(model.hlme.2)
# Figure (left panel)
plot(model.hlme.2, which="fit", var.time="Age", bty="l", ylab=" Remission ", xlab="Age", lwd=2)
# Figure (right panel)
plot(model.hlme.2, which="fit", var.time="Age", bty="l", ylab=" Remission ", xlab="Age", lwd=2, marg=FALSE)

<center>[[Image:SMHS_Methods4.png|500px]] </center>

<center>[[Image:SMHS_Methods5.png|500px]] </center>

<center>[[Image:SMHS_Methods6.png|500px]] </center>

The latent process mixed models implemented in lcmm are illustrated through the study of the linear trajectory of ntumors with Age, adjusted for Sex and assuming correlated random-effects for the intercept and Age. The following lines estimate the corresponding latent process mixed model with different link functions:

model.hlme.lin <- lcmm(ntumors ~ Age*Sex, random=~ Age ,subject='ID', data=hdp)
model.hlme.beta <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', data=hdp, link='beta')
model.hlme.spl <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', data=hdp, link='splines')
model.hlme.spl5q <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', data=hdp, link='5-quant-splines')

link function: An optional family of link functions to estimate. The options are:

*"linear" (the default) specifies a linear link function leading to a standard linear mixed model (homogeneous or heterogeneous as estimated in hlme).
*"beta" estimates a link function from the family of Beta cumulative distribution functions.
*"thresholds" uses a threshold model to describe the correspondence between each level of an ordinal outcome and the underlying latent process.
*"splines" approximates the link function by I-splines. In this case, the number of nodes and their location should also be specified: the number of nodes is entered first, followed by -, then the location ("equi", "quant" or "manual" for, respectively, equidistant nodes, nodes at quantiles of the marker distribution, or interior nodes entered manually in the argument intnodes), followed by - and finally "splines". For example, "7-equi-splines" means I-splines with 7 equidistant nodes, "6-quant-splines" means I-splines with 6 nodes located at the quantiles of the marker distribution, and "9-manual-splines" means I-splines with 9 nodes, the vector of 7 interior nodes being entered in the argument intnodes.

summary (model.hlme.lin)

General latent class mixed model fitted by maximum likelihood method

lcmm(fixed = ntumors ~ Age * Sex, random = ~Age, subject = "ID",data = hdp)

Statistical Model:
Dataset: hdp
Number of subjects: 8525
Number of observations: 8525
Number of latent classes: 1
Number of parameters: 8
Link function: linear

Iteration process:
Maximum number of iteration reached without convergence
Number of iterations: 100
Convergence criteria: parameters= 5.4e-10
: likelihood= 5.5e-10
: second derivatives= 1

Goodness-of-fit statistics:
maximum log-likelihood: -19915.24
AIC: 39846.49
BIC: 39902.89

Discrete posterior log-likelihood: 0
Discrete AIC: 16

Mean discrete AIC per subject: 9e-04
Mean UACV per subject: 0
Mean discrete LL per subject: 0

Maximum Likelihood Estimates:

Fixed effects in the longitudinal model:

coef Se Wald p-value
intercept (not estimated) 0.00000
Age 0.09491
Sexmale -0.66303
Age:Sexmale 0.01132

Variance-covariance matrix of the random-effects:
intercept Age
intercept 20.5013715
Age -0.2889814 0.007696382

Residual standard error (not estimated) = 1

Parameters of the link function:

coef Se Wald p-value
Linear 1 (intercept) -0.36768
Linear 2 (std err) 0.71432

Objects model.hlme.lin, model.hlme.beta, model.hlme.spl and model.hlme.spl5q are latent process mixed models that assume the exact same trajectory for the underlying latent process but use, respectively, a linear link, a Beta CDF link, I-splines with 5 equidistant knots (the default with link='splines') and I-splines with 5 knots at quantiles. model.hlme.lin reduces to a standard linear mixed model (link='linear' by default). The only difference from an hlme object is the parameterization of the intercept and the residual standard error, which are treated as rescaling parameters.

col <- rainbow(4)
plot(model.hlme.lin, which="linkfunction", bty='l', ylab="Number-of-Tumors", col=col[1], lwd=2, xlab="underlying latent process")
plot(model.hlme.beta, which="linkfunction", add=T, col=col[2], lwd=2)
plot(model.hlme.spl, which="linkfunction", add=T, col=col[3], lwd=2)
plot(model.hlme.spl5q, which="linkfunction", add=T, col=col[4], lwd=2)
legend(x="topleft",legend=c("linear", "beta","splines (5equidistant)", "splines (5 at quantiles)"), lty=1,col=col,bty="n",lwd=2)

# to obtain confidence bands use function predictlink
link.lin <- predictlink(model.hlme.lin, ndraws=2000)

Error in predictlink.lcmm(model.hlme.spl, ndraws = 2000):
No confidence intervals can be produced since the program did not converge properly

model.hlme.lin$conv # double-check the convergence of the algorithm
[1] 2
# status of convergence:
# =1 if the convergence criteria were satisfied,
# =2 if the maximum number of iterations was reached,
# =4 or 5 if a problem occurred during optimisation

model.hlme.lin <- lcmm(ntumors ~ Age*Sex, random=~ Age ,subject='ID', epsY = 0.5, convB = 1e-01, convL = 1e-01, convG = 1e-01, maxiter=200, data=hdp); model.hlme.lin$conv

# Now that we have convergence, we can obtain CI’s!!!
link.lin <- predictlink(model.hlme.lin, ndraws=2000)

# plot(model.hlme.lin, which="linkfunction", bty='l', ylab="Number-of-Tumors", col=col[1], lwd=2, xlab="underlying latent process")
plot(link.lin, add=TRUE, col=col[1], lty=2, lwd=2)
legend(x="left", legend=c("95% confidence bands", "for linear fit"), lty=c(2,NA), col=c(col[1],NA), bty="n", lwd=2)

<center>[[Image:SMHS_Methods7.png|500px]] </center>

# Repeat using the other link functions … model.hlme.beta, model.hlme.spl, …
model.hlme.beta <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', data=hdp, link='beta',
convB = 1e-01, convL = 1e-01, convG = 1e-01, maxiter=200); model.hlme.beta$conv
link.beta <- predictlink(model.hlme.beta, ndraws=2000)
plot(link.beta, add=TRUE, col=col[2], lty=2, lwd=2)
legend(x="left", legend=c("95% confidence bands", "for BETA fit"), lty=c(3,NA), col=c(col[2],NA), bty="n", lwd=1)

===Footnotes===
*1 http://cran.r-project.org/web/packages/rpart/index.html
*2 http://www.mayo.edu/hsr/techrpt/61.pdf
*3 http://dx.doi.org/10.1371/journal.pone.0027608
*4 http://www.nature.com/pr/journal/v73/n3/abs/pr2012189a.html
*5 http://stat-www.berkeley.edu/users/breiman/RandomForests/
*6 http://cran.r-project.org/web/packages/lcmm/
*7 http://arxiv.org/pdf/1503.00890v1.pdf

===[[SMHS_MethodsHeterogeneity_MetaAnalysis|Next see: Meta-Analysis]]===
* [[SMHS_MethodsHeterogeneity|Back to the Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research section]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php/SMHS_MethodsHeterogeneity_HTE}}

SMHS MethodsHeterogeneity HTE

2016-05-23T18:33:22Z

Pineaumi: /* Footnotes */

==[[SMHS_MethodsHeterogeneity| Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research]] - Methods and Approaches for HTE Analytics ==

===Overview===

Recursive partitioning is a data mining technique for exploring structure and patterns in complex data. It facilitates the visualization of decision rules for predicting categorical (classification tree) or continuous (regression tree) outcome variables. The R rpart package1 provides the tools for Classification and Regression Tree (CART) modeling; conditional inference trees and random forests are available in other packages (e.g., party and randomForest). Additional resources include an Introduction to Recursive Partitioning Using the RPART Routines2. The Appendix includes a description of the main CART analysis steps.

install.packages("rpart")
library("rpart")

===CART===
Classification and Regression Tree (CART) is a decision-tree based technique that considers how variation observed in a given response variable (continuous or categorical) can be understood through a systematic deconstruction of the overall study population into subgroups, using explanatory variables of interest. For HTE analysis, CART is best suited for early-stage, exploratory analyses. Its relative simplicity can be powerful in identifying basic relationships between variables of interest, and thus identify potential subgroups for more advanced analyses. The key to CART is its ‘systematic’ approach to the development of the subgroups, which are constructed sequentially through repeated, binary splits of the population of interest, one explanatory variable at a time. In other words, each ‘parent’ group is divided into two ‘child’ groups, with the objective of creating increasingly homogeneous subgroups. The process is repeated and the subgroups are then further split, until no additional variables are available for further subgroup development. The resulting tree structure is oftentimes overgrown, but additional techniques are used to ‘trim’ the tree to a point at which its predictive power is balanced against issues of over-fitting. Because the CART approach does not make assumptions regarding the distribution of the dependent variable, it can be used in situations where other multivariate modeling techniques often used for exploratory predictive risk modeling would not be appropriate – namely in situations where data are not normally distributed.

CART analyses are useful in situations where there is some evidence to suggest that HTE exists, but the subgroups defining the heterogeneous response are not well understood. CART allows for an exploration of response in a myriad of complex subpopulations, and more recently developed ensemble methods (such as Bayesian Additive Regression Trees) allow for more robust analyses through the combination of multiple CART analyses.
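
The repeated binary splitting described above can be sketched for a single numeric predictor. The code below is a minimal base R illustration that scores candidate cut points by weighted Gini impurity, one common CART splitting criterion. The helper names (gini, best_split) and the toy data are hypothetical, and rpart's own implementation is considerably more elaborate.

```r
# Gini impurity of a vector of class labels
gini <- function(y) {
  p <- table(y) / length(y)
  1 - sum(p^2)
}

# Best binary split of one numeric predictor x for class labels y
best_split <- function(x, y) {
  xs   <- sort(unique(x))
  cuts <- (head(xs, -1) + tail(xs, -1)) / 2     # midpoints between distinct values
  score <- sapply(cuts, function(thr) {
    left  <- y[x <  thr]
    right <- y[x >= thr]
    # impurity of the two child groups, weighted by their sizes
    (length(left) * gini(left) + length(right) * gini(right)) / length(y)
  })
  list(threshold = cuts[which.min(score)], impurity = min(score))
}

# Toy data (hypothetical): responders tend to be older
age      <- c(23, 31, 35, 42, 47, 51, 58, 63)
response <- c("no", "no", "no", "no", "yes", "yes", "yes", "yes")
res <- best_split(age, response)
res
```

A full tree applies the same search over every explanatory variable, keeps the best split, and then recurses within each child group.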

====Example Fifth Dutch growth study====

# Let’s use the Fifth Dutch growth study (2009) fdgs3. Is it true that “the world’s tallest nation has stopped growing taller: the height of Dutch children from 1955 to 2009”4?

#install.packages("mice")
library("mice")
?fdgs
head(fdgs)

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||ID ||Reg ||Age ||Sex ||HGT ||WGT ||HGT.Z ||WGT.Z
|-
|1 ||100001||West||13.09514||boy||175.5||75.0||1.751||2.410
|-
|2 ||100003||West||13.81793 ||boy||148.4||40.0||2.292||1.494
|-
|3 ||100004||West||13.97125||boy||159.9||46.5||0.743||0.783
|-
|4 ||100005||West||13.98220 ||girl||159.7||46.5 ||0.743 ||0.783
|-
|5||100006||West||13.52225||girl||160.3||47.8||0.414||0.355
|-
|6||100018||East||10.21492||boy||157.8||39.7||2.025||0.823
|}
</center>

summary(fdgs)

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
|ID ||Reg ||Age ||Sex ||HGT
|-
|Min.:100001||North:732||Min.:0.008214||boy:4829||Min.:46.0
|-
|1st Qu.:106353||East:2528||1st Qu.:1.618754||girl:5201||1st Qu.:83.8
|-
|Median:203855||South:2931||Median:8.084873|| ||Median:131.5
|-
|Mean:180091||West:2578||Mean:8.157936|| ||Mean:123.9
|-
|3rd Qu.:210591||City:1261||3rd Qu.:13.547570|| ||3rd Qu.:162.3
|-
|Max.:401955|| ||Max.:21.993155|| ||Max.:208.0
|-
| || || || ||NA's: 23
|}
</center>

====(1) Classification Tree====

Let's use the data frame '''fdgs''' to predict Region, from Age, Height, and Weight.
# grow tree
fit.1 <- rpart(reg ~ age + hgt + wgt, method="class", data= fdgs[,-1])

printcp(fit.1) # display the results
plotcp(fit.1) # visualize cross-validation results
summary(fit.1) # detailed summary of splits

# plot tree
par(oma=c(0,0,2,0))
plot(fit.1, uniform=TRUE, margin=0.3, main="Classification Tree for Region (FDGS Data)")
text(fit.1, use.n=TRUE, all=TRUE, cex=1.0)

<center>[[Image:SMHS_Methods2.png|500px]] </center>

# create a better plot of the classification tree
post(fit.1, title = "Classification Tree for Region (FDGS Data)", file = "")

<center>[[Image:SMHS_Methods3.png|500px]] </center>

====(2) Pruning the tree====

# prune at the complexity parameter (CP) with the smallest cross-validated error
pruned.fit.1 <- prune(fit.1, cp = fit.1$cptable[which.min(fit.1$cptable[,"xerror"]),"CP"])

# plot the pruned tree
plot(pruned.fit.1, uniform=TRUE, main="Pruned Classification Tree for Region (FDGS Data)")
text(pruned.fit.1, use.n=TRUE, all=TRUE, cex=1.0)
post(pruned.fit.1, title = "Pruned Classification Tree for Region (FDGS Data)")

Not much change, as the initial tree is not complex!
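
To see pruning have a visible effect, the same recipe can be applied to a deliberately overgrown tree. The sketch below uses the built-in iris data rather than fdgs (an assumption made only so that the example is self-contained) and forces a large tree with cp = 0 and minsplit = 5.

```r
library(rpart)

set.seed(1)   # cross-validated errors (xerror) depend on the random fold assignment
fit <- rpart(Species ~ ., data = iris, method = "class",
             control = rpart.control(cp = 0, minsplit = 5))   # deliberately overgrown

cp.tab  <- fit$cptable
best.cp <- cp.tab[which.min(cp.tab[, "xerror"]), "CP"]   # CP with the smallest CV error
pruned  <- prune(fit, cp = best.cp)

# number of terminal nodes before and after pruning
c(leaves.before = sum(fit$frame$var == "<leaf>"),
  leaves.after  = sum(pruned$frame$var == "<leaf>"))
```

Pruning can only remove splits, so the pruned tree never has more leaves than the original.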

====(3) Random Forests ====
Random forests may improve predictive accuracy by generating a large number of bootstrapped trees (based on random samples of variables). It classifies cases using each tree in this new "forest", and decides the final predicted outcome by combining the results across all of the trees (an average in regression, a majority vote in classification). See the randomForest package5.

library(randomForest)
# reg is a factor, so randomForest fits a classification forest (it has no method argument)
fit.2 <- randomForest(reg ~ age + hgt + wgt, na.action = na.omit, data = fdgs[,-1])
print(fit.2) # view results
importance(fit.2) # importance of each predictor
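
The majority-vote idea can be illustrated with plain bootstrap aggregation of rpart trees. This is a simplified sketch: it resamples cases only, whereas a random forest additionally samples a random subset of predictors at each split. It uses the built-in iris data so that it is self-contained, and all object names are illustrative.

```r
library(rpart)

set.seed(2016)
n.trees <- 25
train   <- iris

# Fit each tree to a bootstrap resample of the rows
trees <- lapply(seq_len(n.trees), function(b) {
  idx <- sample(nrow(train), replace = TRUE)
  rpart(Species ~ ., data = train[idx, ], method = "class")
})

# Each tree votes: one row per case, one column per tree
votes <- sapply(trees, function(tr) {
  as.character(predict(tr, newdata = train, type = "class"))
})

# Final prediction: the most frequent class among the trees' votes
majority <- apply(votes, 1, function(v) names(which.max(table(v))))
mean(majority == train$Species)   # in-sample agreement (optimistic)
```

The agreement rate is computed on the training cases and is therefore optimistic. randomForest reports an out-of-bag error estimate instead.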

Note on missing values/incomplete data: If the data have missing values, we have 3 choices:

1. Use a different tool (rpart handles missing values well)

2. Impute the missing values

3. For a small number of missing cases, we can use na.action = na.omit
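
Options 2 and 3 can be contrasted on a small made-up data frame. The values below are hypothetical and the mean imputation shown is deliberately crude. For real analyses, a multiple-imputation approach such as the mice() function in the mice package loaded earlier would be the more principled choice.

```r
# Toy data frame (hypothetical values) with missing heights
toy <- data.frame(age = c(1.2, 5.0, 9.3, 13.1, 17.8),
                  hgt = c(76.0, NA, 135.2, NA, 181.0))

# Option 3: drop incomplete rows
complete <- na.omit(toy)
nrow(complete)

# Option 2 (crude): replace each missing height by the observed mean
imputed <- toy
imputed$hgt[is.na(imputed$hgt)] <- mean(toy$hgt, na.rm = TRUE)
imputed
```

Dropping rows shrinks the sample, while single mean imputation keeps all rows but understates the variability of hgt.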

===Latent growth and growth mixture modeling (LGM/GMM)===

LGM and GMM represent structural equation modeling techniques that capture inter-individual differences in longitudinal change corresponding to a particular treatment. For instance, patients’ different timing patterns of the treatment effects may represent the underlying sources of HTE. LGM distinguishes whether (yes/no) and how (fast/slow, temporary/lasting) patients respond to treatment. The heterogeneous individual growth trajectories are estimated from intra-individual changes over time by examining common population parameters, i.e., slopes, intercepts, and error variances. Suppose each individual has a unique initial status (intercept) and response rate (slope) during a specific time interval. Then the variances of the individuals’ baseline measures (intercepts) and changes (slopes) in health outcomes will represent the degree of HTE. The LGM-identified HTE of individual growth curves can be attributed to observed predictors, including both fixed and time-varying covariates.
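
The role of the intercept and slope variances can be illustrated with simulated trajectories. The sketch below uses base R only; the population means and standard deviations are assumed values chosen for illustration. The per-individual least-squares fits are a crude two-stage approximation, not the SEM or mixed-model estimator.

```r
set.seed(42)
n.id  <- 200      # individuals
times <- 0:4      # measurement occasions

# Individual-specific intercepts and slopes (assumed population values)
b0 <- rnorm(n.id, mean = 50, sd = 5)
b1 <- rnorm(n.id, mean = 2,  sd = 1)

long <- data.frame(id   = rep(seq_len(n.id), each = length(times)),
                   time = rep(times, times = n.id))
long$y <- b0[long$id] + b1[long$id] * long$time + rnorm(nrow(long), sd = 1)

# Per-individual least-squares fits recover each trajectory
fits <- t(sapply(split(long, long$id), function(d) coef(lm(y ~ time, data = d))))
colnames(fits) <- c("intercept", "slope")

colMeans(fits)         # average trajectory
apply(fits, 2, var)    # between-individual variability (degree of heterogeneity)
```

Larger variances of the fitted intercepts and slopes correspond to more heterogeneous baseline status and response rates. These two-stage variances also include estimation noise, so they tend to overstate the true between-individual variances.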

LGM assumes that all individuals are from the same population (too restrictive in some cases). If the HTE is due to observed demographic variables, such as age, gender, and marital status, one may utilize multiple-group LGM. Despite its successful applications for modeling longitudinal change, there may be multiple subpopulations with unobserved heterogeneities. Growth mixture modeling (GMM) extends LGM to allow the identification and prediction of unobserved subpopulations in longitudinal data analysis. Each unobserved subpopulation may constitute its own latent class and behave differently than individuals in other latent classes. Within each latent class, there are also different trajectories across individuals; however, different latent classes don’t share common population parameters. Suppose we are interested in studying retirees’ psychological well-being change trajectory when multiple unknown subpopulations exist. We can add another layer (a latent class variable) on the LGM framework so that the unobserved latent classes can be inferred from the data. The covariates in GMM are designed to affect growth factors distinctly across different latent classes. Therefore, there are two types of HTE: 1) the latent class variable in GMM divides individuals into groups with different growth curves; and 2) coefficient estimates vary across latent classes.

Latent variables are not directly observed – they are inferred (via a model) from other variables that are observed and directly measured. Models that explain observed variables in terms of latent variables are called latent variable models. When the latent (unobserved) variable is discrete, it is referred to as a latent class variable.

Breast Cancer Example: Recall from the earlier review discussions of lmer (lme4 package) that Linear Mixed Models (LMM) are used with longitudinal data to examine how outcomes change over time relative to predictive covariates. LMM assumptions include:

(i) continuous longitudinal outcome

(ii) Gaussian random-effects and errors

(iii) linearity of the relationships with the outcome

(iv) homogeneous population

(v) missing at random data

The objectives of LGM/GMM models (see Latent Class Mixed Models, lcmm R package6,7) are to extend the linear mixed model estimation to:

(i) heterogeneous populations (relax (iv) above). Use hlme for latent class linear mixed models (i.e. Gaussian continuous outcome)

(ii) other types of longitudinal outcomes: ordinal and (bounded) quantitative non-Gaussian outcomes (relax (i), (ii), (iii), (iv)). Use lcmm for general latent class mixed models with outcomes of different nature

(iii) joint analysis of a time-to-event (relax (iv), (v)). Use Jointlcmm for joint latent class models with a longitudinal outcome and a right-censored (left-truncated) time-to-event

Let’s use these data (http://www.ats.ucla.edu/stat/data/hdp.csv), representing cancer phenotypes and predictors (e.g., "IL6", "CRP", "LengthofStay", "Experience") and outcome measures (e.g., remission) collected on patients, nested within doctors (DID) and within hospitals (HID).

We can illustrate the latent class linear mixed models implemented in hlme through a study of the quadratic trajectories of the response (remission) with TumorSize, adjusting for CO2*Pain interaction and assuming correlated random-effects for the functions of SmokingHx and Sex. To estimate the corresponding standard linear mixed model using 1 latent class where CO2 interacts with Pain:

# install.packages("lcmm")
library("lcmm")

hdp <- read.csv("http://www.ats.ucla.edu/stat/data/hdp.csv")
hdp <- within(hdp, {
Married <- factor(Married, levels = 0:1, labels = c("no", "yes"))
DID <- factor(DID)
HID <- factor(HID)
})

Add a new subject ID column (the last column in the data, “ID”); this is necessary for the hlme call:
hdp$ID <- seq.int(nrow(hdp))

model.hlme <- hlme(remission ~ IL6 + CRP + LengthofStay + Experience + I(tumorsize^2) + co2*pain + I(tumorsize^2)*pain, random=~ SmokingHx + Sex, subject='ID', data=hdp, ng=1)
summary(model.hlme)

Heterogenous linear mixed model
fitted by maximum likelihood method

hlme(fixed = remission ~ IL6 + CRP + LengthofStay + Experience +
I(tumorsize^2) + co2 * pain + I(tumorsize^2) * pain, random = ~SmokingHx +
Sex, subject = "ID", ng = 1, data = hdp)

Statistical Model:
Dataset: hdp
Number of subjects: 8525
Number of observations: 8525
Number of latent classes: 1
Number of parameters: 21

Iteration process:
Convergence criteria satisfied
Number of iterations: 34
Convergence criteria: parameters= 1.2e-09
: likelihood= 8.3e-06
: second derivatives= 2.7e-05

Goodness-of-fit statistics:
maximum log-likelihood: -5223.9
AIC: 10489.79
BIC: 10637.86

Maximum Likelihood Estimates:

<center>Fixed effects in the Longitudinal Model:

{| class="wikitable" style="text-align:center; " border="1"
|-
| ||coef||Se||Wald||p-value
|-
|Intercept||0.28636||0.24314||1.178||0.23890
|-
|IL6||-0.01134||0.00183||-6.184||0.00000
|-
|CRP||-0.00674||0.00167||-4.043||0.00005
|-
|LengthofStay||-0.04834||0.00463||-10.436||0.00000
|-
|Experience||0.01695||0.00119||14.263||0.00000
|-
|I(tumorsize^2)||0.00000||0.00001||-0.076||0.93953
|-
|co2||-0.03549||0.16204||-0.219||0.82663
|-
|pain||0.03930||0.04278||0.919||0.35832
|-
|co2:pain||-0.01489||0.02871||-0.519||0.60395
|-
|I(tumorsize^2):pain||0.00000||0.00000||0.553||0.58045
|}
</center>

<center>Variance-covariance matrix of the random-effects

{| class="wikitable" style="text-align:center; " border="1"
|-
| ||intercept||SmokingHxformer||SmokingHxnever||Sexmale
|-
|intercept||0.19310943|| || ||
|-
|SmokingHxformer||-0.10617988||0.209155186|| ||
|-
|SmokingHxnever||-0.12388534||0.068342049||2.262655e-01||
|-
|Sexmale||-0.08130975||-0.007353491||-1.873934e-05||0.1730187

|}
</center>

Residual standard error:

coef: 0.1299767

se: 1.187426

Results interpretation:

(1) The first part of the summary provides information about the dataset, the number of subjects, observations, observations deleted (since by default, missing observations are deleted), number of latent classes and number of parameters.

(2) Next, details about the algorithm convergence are provided, along with the number of iterations, the convergence criteria, and a message indicating whether the model converged correctly: "convergence criteria satisfied".

(3) The maximum log-likelihood, Akaike criterion (AIC) and Bayesian Information criterion (BIC) are reported.

(4) Estimates of parameters, the estimated standard error, the Wald Test statistics (with Normal approximation) and the corresponding p-values are reported below.

(5) For the random-effect distribution, the estimated matrix of covariance of the random-effects is displayed.

(6) The standard error of the residuals is given along with its estimated standard error.

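(7) TumorSize does not appear to be associated with Remission, either directly or through its interaction with Pain. This may be formally assessed using a multivariate Wald test (here pos=c(6,8) selects the I(tumorsize^2) and pain coefficients, as the output shows):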

WaldMult(model.hlme, pos=c(6,8))
# pos - a vector containing the indices in model.hlme of the parameters to test
Wald Test p_value
I(tumorsize^2) = pain = 0 0.85562 0.65193

We may consider the model with an adjustment for CRP only on the intercept. Below we estimate the corresponding models for a varying number of latent classes (from 1 to 3) using the default initial values:

# Initial Model: model.hlme <- hlme(remission ~ IL6 + CRP + LengthofStay + Experience + I(tumorsize^2) + co2*pain + I(tumorsize^2)*pain, random=~ SmokingHx + Sex, subject='ID', data=hdp, ng=1)

model.hlme.1 <- hlme(tumorsize ~ IL6 + CRP + LengthofStay, subject='ID', data=hdp, ng=1)
model.hlme.2 <- hlme(tumorsize ~ IL6 + CRP + LengthofStay + SmokingHx, mixture=~ SmokingHx, subject='ID', data=hdp, ng=2)
model.hlme.3 <- hlme(tumorsize ~ IL6 + CRP + LengthofStay + SmokingHx, mixture=~ SmokingHx, subject='ID', data=hdp, ng=3)

The estimation process for a varying number of latent classes can be summarized with summarytable, which gives the log-likelihood, the number of parameters, the Bayesian Information Criterion, and the posterior proportion of each class:

summarytable(model.hlme.1, model.hlme.2, model.hlme.3)
G loglik npm BIC %class1 %class2 %class3
model.hlme.1 1 -33301.82 5 66648.89 100.000000
model.hlme.2 2 -31592.79 11 63285.15 99.214076 0.7859238
model.hlme.3 3 -31589.55 15 63314.86 6.357771 82.2991202 11.34311

The program took 404.65 seconds

In this example, the optimal number of latent classes according to the BIC is two (the smallest BIC). The posterior classification is described with:

postprob(model.hlme.2)

Posterior classification:
class1 class2
N 8458.00 67.00
% 99.21 0.79

Posterior classification table:
--> mean of posterior probabilities in each class
prob1 prob2
class1 0.8555 0.1445
class2 0.4362 0.5638

Posterior probabilities above a threshold (%):
class1 class2
prob>0.7 92.48 2.99
prob>0.8 77.38 0.00
prob>0.9 38.53 0.00
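
The three summaries printed by postprob can be mimicked on a small made-up matrix of posterior class-membership probabilities. This is only a sketch of how such tables may be formed, not lcmm's internal code; the matrix pp and all object names are hypothetical.

```r
# Hypothetical posterior class-membership probabilities for 6 subjects
pp <- matrix(c(0.95, 0.05,
               0.85, 0.15,
               0.72, 0.28,
               0.60, 0.40,
               0.30, 0.70,
               0.45, 0.55),
             ncol = 2, byrow = TRUE,
             dimnames = list(NULL, c("prob1", "prob2")))

assigned <- max.col(pp)     # class with the highest posterior probability
table(assigned)             # posterior classification (N per class)

# Mean posterior probabilities among the subjects assigned to each class
class.tab <- apply(pp, 2, function(p) tapply(p, assigned, mean))
round(class.tab, 4)

# Percentage of each class's members whose own-class probability exceeds a threshold
own <- pp[cbind(seq_len(nrow(pp)), assigned)]
sapply(c(0.7, 0.8, 0.9), function(th) tapply(own > th, assigned, mean) * 100)
```

Diagonal entries of the classification table close to 1 indicate well-separated classes, while values near 0.5 (as for class 2 in the output above) indicate ambiguous assignments.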

In this example, class 1 includes a posteriori 8458 subjects (99.21%), while class 2 includes 67 subjects (0.79%). Subjects were classified in class 1 with a mean posterior probability of 0.8555 (85.55%).

In class 1, 92.48% of the subjects were classified with a posterior probability above 0.7, whereas only 2.99% of the subjects in class 2 were classified with a posterior probability above 0.7. Goodness-of-fit of the model can be assessed by displaying the residuals and the mean predictions of the model (shown in the figures below), according to the time variable given in var.time:

plot(model.hlme.2)
# Figure (left panel)
plot(model.hlme.2, which="fit", var.time="Age", bty="l", ylab=" Remission ", xlab="Age", lwd=2)
# Figure (right panel)
plot(model.hlme.2, which="fit", var.time="Age", bty="l", ylab=" Remission ", xlab="Age", lwd=2, marg=FALSE)

<center>[[Image:SMHS_Methods4.png|500px]] </center>

<center>[[Image:SMHS_Methods5.png|500px]] </center>

<center>[[Image:SMHS_Methods6.png|500px]] </center>

The latent process mixed models implemented in lcmm are illustrated through the study of the linear trajectory of ntumors with Age, adjusted for Sex and assuming correlated random-effects for the intercept and Age. The following lines estimate the corresponding latent process mixed model with different link functions:

model.hlme.lin <- lcmm(ntumors ~ Age*Sex, random=~ Age ,subject='ID', data=hdp)
model.hlme.beta <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', data=hdp, link='beta')
model.hlme.spl <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', data=hdp, link='splines')
model.hlme.spl5q <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', data=hdp, link='5-quant-splines')

link function: An optional family of link functions to estimate. The options are:

*"linear" (the default) specifies a linear link function leading to a standard linear mixed model (homogeneous or heterogeneous as estimated in hlme).
*"beta" estimates a link function from the family of Beta cumulative distribution functions.
*"thresholds" uses a threshold model to describe the correspondence between each level of an ordinal outcome and the underlying latent process.
*"splines" approximates the link function by I-splines. In this case, the number of nodes and their location should also be specified: the number of nodes is entered first, followed by -, then the location ("equi", "quant" or "manual" for, respectively, equidistant nodes, nodes at quantiles of the marker distribution, or interior nodes entered manually in the argument intnodes), followed by - and finally "splines". For example, "7-equi-splines" means I-splines with 7 equidistant nodes, "6-quant-splines" means I-splines with 6 nodes located at the quantiles of the marker distribution, and "9-manual-splines" means I-splines with 9 nodes, the vector of 7 interior nodes being entered in the argument intnodes.
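
The three-part format of the spline link strings can be made explicit with a small parser. The helper parse_spline_link below is hypothetical (it is not part of lcmm) and simply splits a specification into its number of nodes and node location.

```r
# Hypothetical helper (not part of lcmm): split a spline link specification
parse_spline_link <- function(link) {
  parts <- strsplit(link, "-", fixed = TRUE)[[1]]
  stopifnot(length(parts) == 3,
            parts[3] == "splines",
            parts[2] %in% c("equi", "quant", "manual"))
  list(nodes = as.integer(parts[1]), location = parts[2])
}

parse_spline_link("5-quant-splines")   # the specification used for model.hlme.spl5q
spec <- parse_spline_link("7-equi-splines")
spec
```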

summary (model.hlme.lin)

General latent class mixed model fitted by maximum likelihood method

lcmm(fixed = ntumors ~ Age * Sex, random = ~Age, subject = "ID",data = hdp)

Statistical Model:
Dataset: hdp
Number of subjects: 8525
Number of observations: 8525
Number of latent classes: 1
Number of parameters: 8
Link function: linear

Iteration process:
Maximum number of iteration reached without convergence
Number of iterations: 100
Convergence criteria: parameters= 5.4e-10
: likelihood= 5.5e-10
: second derivatives= 1

Goodness-of-fit statistics:
maximum log-likelihood: -19915.24
AIC: 39846.49
BIC: 39902.89

Discrete posterior log-likelihood: 0
Discrete AIC: 16

Mean discrete AIC per subject: 9e-04
Mean UACV per subject: 0
Mean discrete LL per subject: 0

Maximum Likelihood Estimates:

Fixed effects in the longitudinal model:

coef Se Wald p-value
intercept (not estimated) 0.00000
Age 0.09491
Sexmale -0.66303
Age:Sexmale 0.01132

Variance-covariance matrix of the random-effects:
intercept Age
intercept 20.5013715
Age -0.2889814 0.007696382

Residual standard error (not estimated) = 1

Parameters of the link function:

coef Se Wald p-value
Linear 1 (intercept) -0.36768
Linear 2 (std err) 0.71432

Objects model.hlme.lin, model.hlme.beta, model.hlme.spl and model.hlme.spl5q are latent process mixed models that assume exactly the same trajectory for the underlying latent process but, respectively, a linear link, a Beta CDF link, I-splines with 5 equidistant knots (the default with link='splines') and I-splines with 5 knots at quantiles. model.hlme.lin reduces to a standard linear mixed model (link='linear' by default). The only difference from an hlme object is the parameterization of the intercept and the residual standard error, which are treated as rescaling parameters.

col <- rainbow(4)
plot(model.hlme.lin, which="linkfunction", bty='l', ylab="Number-of-Tumors", col=col[1], lwd=2, xlab="underlying latent process")
plot(model.hlme.beta, which="linkfunction", add=TRUE, col=col[2], lwd=2)
plot(model.hlme.spl, which="linkfunction", add=TRUE, col=col[3], lwd=2)
plot(model.hlme.spl5q, which="linkfunction", add=TRUE, col=col[4], lwd=2)
legend(x="topleft", legend=c("linear", "beta", "splines (5 equidistant)", "splines (5 at quantiles)"), lty=1, col=col, bty="n", lwd=2)

# to obtain confidence bands use function predictlink
link.lin <- predictlink(model.hlme.lin, ndraws=2000)

Error in predictlink.lcmm(model.hlme.spl, ndraws = 2000):
No confidence intervals can be produced since the program did not converge properly

model.hlme.lin$conv # double-check the convergence of the algorithm
[1] 2
# status of convergence:
# =1 if the convergence criteria were satisfied,
# =2 if the maximum number of iterations was reached,
# =4 or 5 if a problem occurred during optimisation

model.hlme.lin <- lcmm(ntumors ~ Age*Sex, random=~ Age ,subject='ID', epsY = 0.5, convB = 1e-01, convL = 1e-01, convG = 1e-01, maxiter=200, data=hdp); model.hlme.lin$conv

# Now that the (relaxed) convergence criteria are satisfied, we can obtain confidence bands
link.lin <- predictlink(model.hlme.lin, ndraws=2000)

# plot(model.hlme.lin, which="linkfunction", bty='l', ylab="Number-of-Tumors", col=col[1], lwd=2, xlab="underlying latent process")
plot(link.lin, add=TRUE, col=col[1], lty=2, lwd=2)
legend(x="left", legend=c("95% confidence bands", "for linear fit"), lty=c(2,NA), col=c(col[1],NA), bty="n", lwd=2)

<center>[[Image:SMHS_Methods7.png|500px]] </center>

# Repeat using the other link functions … model.hlme.beta, model.hlme.spl, …
model.hlme.beta <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', data=hdp, link='beta',
convB = 1e-01, convL = 1e-01, convG = 1e-01, maxiter=200); model.hlme.beta$conv
link.beta <- predictlink(model.hlme.beta, ndraws=2000)
plot(link.beta, add=TRUE, col=col[2], lty=2, lwd=2)
legend(x="left", legend=c("95% confidence bands", "for BETA fit"), lty=c(2,NA), col=c(col[2],NA), bty="n", lwd=2)

===Footnotes===
*1 http://cran.r-project.org/web/packages/rpart/index.html
*2 http://www.mayo.edu/hsr/techrpt/61.pdf
*3 http://dx.doi.org/10.1371/journal.pone.0027608
*4 http://www.nature.com/pr/journal/v73/n3/abs/pr2012189a.html
*5 http://stat-www.berkeley.edu/users/breiman/RandomForests/
*6 http://cran.r-project.org/web/packages/lcmm/
*7 http://arxiv.org/pdf/1503.00890v1.pdf

===[[SMHS_MethodsHeterogeneity_MetaAnalysis|Next see: Meta-Analysis]]===
* [[SMHS_MethodsHeterogeneity|Back to the Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research section]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php/SMHS_MethodsHeterogeneity_HTE}}

SMHS MethodsHeterogeneity HTE

2016-05-23T18:32:31Z

Pineaumi: /* Latent growth and growth mixture modeling (LGM/GMM) */

==[[SMHS_MethodsHeterogeneity| Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research]] - Methods and Approaches for HTE Analytics ==

===Overview===

Recursive partitioning is a data mining technique for exploring structure and patterns in complex data. It facilitates the visualization of decision rules for predicting categorical (classification tree) or continuous (regression tree) outcome variables. The R rpart package<sup>1</sup> provides the tools for Classification and Regression Tree (CART) modeling; conditional inference trees and random forests are available in separate packages (e.g., party and randomForest). Additional resources include An Introduction to Recursive Partitioning Using the RPART Routines<sup>2</sup>. The Appendix includes a description of the main CART analysis steps.

install.packages("rpart")
library("rpart")

===CART===
Classification and Regression Tree (CART) is a decision-tree based technique that considers how variation observed in a given response variable (continuous or categorical) can be understood through a systematic deconstruction of the overall study population into subgroups, using explanatory variables of interest. For HTE analysis, CART is best suited for early-stage, exploratory analyses. Its relative simplicity can be powerful in identifying basic relationships between variables of interest, and thus identify potential subgroups for more advanced analyses. The key to CART is its ‘systematic’ approach to the development of the subgroups, which are constructed sequentially through repeated, binary splits of the population of interest, one explanatory variable at a time. In other words, each ‘parent’ group is divided into two ‘child’ groups, with the objective of creating increasingly homogeneous subgroups. The process is repeated and the subgroups are then further split, until no additional variables are available for further subgroup development. The resulting tree structure is oftentimes overgrown, but additional techniques are used to ‘trim’ the tree to a point at which its predictive power is balanced against issues of over-fitting. Because the CART approach does not make assumptions regarding the distribution of the dependent variable, it can be used in situations where other multivariate modeling techniques often used for exploratory predictive risk modeling would not be appropriate – namely in situations where data are not normally distributed.

CART analyses are useful in situations where there is some evidence to suggest that HTE exists, but the subgroups defining the heterogeneous response are not well understood. CART allows for an exploration of response in a myriad of complex subpopulations, and more recently developed ensemble methods (such as Bayesian Additive Regression Trees) allow for more robust analyses through the combination of multiple CART analyses.
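
To illustrate how a single binary split is chosen, here is a minimal sketch in base R. It uses the Gini impurity as the homogeneity measure, which is one common choice for classification trees. The data and the function names gini and split_impurity are made up for illustration and are not taken from rpart.

```r
# Gini impurity of a vector of class labels: 1 - sum(p_k^2)
gini <- function(y) {
  p <- table(y) / length(y)
  1 - sum(p^2)
}

# Weighted impurity of the two child groups produced by splitting x at cut
split_impurity <- function(x, y, cut) {
  left <- y[x < cut]
  right <- y[x >= cut]
  (length(left) * gini(left) + length(right) * gini(right)) / length(y)
}

# Toy data (made up for illustration): one predictor, two classes
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c("a", "a", "a", "a", "b", "a", "b", "b")

# Candidate cut points: midpoints between consecutive sorted values
xs <- sort(unique(x))
cuts <- (head(xs, -1) + tail(xs, -1)) / 2

# Evaluate every candidate and keep the one giving the purest children
impurity <- sapply(cuts, function(cc) split_impurity(x, y, cc))
best_cut <- cuts[which.min(impurity)]
```

A tree-growing algorithm repeats this search over all explanatory variables, applies the best split, and then recurses into each child group.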

====Example Fifth Dutch growth study====

# Let’s use the Fifth Dutch growth study (2009) data, fdgs<sup>3</sup>, available in the mice package. Is it true that “the world’s tallest nation has stopped growing taller: the height of Dutch children from 1955 to 2009”<sup>4</sup>?

#install.packages("mice")
library("mice")
?fdgs
head(fdgs)

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||ID ||Reg ||Age ||Sex ||HGT ||WGT ||HGT.Z ||WGT.Z
|-
|1 ||100001||West||13.09514||boy||175.5||75.0||1.751||2.410
|-
|2 ||100003||West||13.81793 ||boy||148.4||40.0||2.292||1.494
|-
|3 ||100004||West||13.97125||boy||159.9||46.5||0.743||0.783
|-
|4 ||100005||West||13.98220 ||girl||159.7||46.5 ||0.743 ||0.783
|-
|5||100006||West||13.52225||girl||160.3||47.8||0.414||0.355
|-
|6||100018||East||10.21492||boy||157.8||39.7||2.025||0.823
|}
</center>

summary(fdgs)

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
|ID ||Reg ||Age ||Sex ||HGT
|-
|Min.:100001||North:732||Min.:0.008214||boy:4829||Min.:46.0
|-
|1st Qu.:106353||East:2528||1st Qu.:1.618754||girl:5201||1st Qu.:83.8
|-
|Median:203855||South:2931||Median:8.084873|| ||Median:131.5
|-
|Mean:180091||West:2578||Mean:8.157936|| ||Mean:123.9
|-
|3rd Qu.:210591||City:1261||3rd Qu.:13.547570|| ||3rd Qu.:162.3
|-
|Max.:401955|| ||Max.:21.993155|| ||Max.:208.0
|-
| || || || ||NA's: 23
|}
</center>

====(1) Classification Tree====

Let's use the data frame '''fdgs''' to predict Region, from Age, Height, and Weight.
# grow tree
fit.1 <- rpart(reg ~ age + hgt + wgt, method="class", data= fdgs[,-1])

printcp(fit.1) # display the results
plotcp(fit.1) # visualize cross-validation results
summary(fit.1) # detailed summary of splits

# plot tree
par(oma=c(0,0,2,0))
plot(fit.1, uniform=TRUE, margin=0.3, main="Classification Tree for Region (FDGS Data)")
text(fit.1, use.n=TRUE, all=TRUE, cex=1.0)

<center>[[Image:SMHS_Methods2.png|500px]] </center>

# create a better plot of the classification tree
post(fit.1, title = "Classification Tree for Region (FDGS Data)", file = "")

<center>[[Image:SMHS_Methods3.png|500px]] </center>

====(2) Pruning the tree====

pruned.fit.1 <- prune(fit.1, cp = fit.1$cptable[which.min(fit.1$cptable[,"xerror"]),"CP"])

# plot the pruned tree
plot(pruned.fit.1, uniform=TRUE, main="Pruned Classification Tree for Region (FDGS Data)")
text(pruned.fit.1, use.n=TRUE, all=TRUE, cex=1.0)
post(pruned.fit.1, title = "Pruned Classification Tree for Region (FDGS Data)")

Not much change, as the initial tree is not complex!
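
The selection of the complexity parameter used in the prune call can be sketched on a small table. The matrix below is made up; it only mimics the column names of the cptable component of an rpart fit. The second rule shown (the "1-SE rule") is a commonly used alternative and is not part of the analysis above.

```r
# Made-up complexity table with the same column names as fit$cptable in rpart
cptable <- cbind(
  CP          = c(0.20, 0.05, 0.02, 0.01),
  nsplit      = c(0, 1, 3, 5),
  "rel error" = c(1.00, 0.80, 0.70, 0.68),
  xerror      = c(1.00, 0.82, 0.80, 0.85),
  xstd        = c(0.03, 0.03, 0.03, 0.03)
)

# Rule used in the text: the CP value with the smallest cross-validated error
i_min <- which.min(cptable[, "xerror"])
cp_min <- cptable[i_min, "CP"]

# Alternative (1-SE rule): the simplest tree whose xerror is within one
# standard error of the minimum; rows are ordered from simplest to largest tree
threshold <- cptable[i_min, "xerror"] + cptable[i_min, "xstd"]
i_1se <- which(cptable[, "xerror"] <= threshold)[1]
cp_1se <- cptable[i_1se, "CP"]
```

Either value could be passed as the cp argument of prune; the 1-SE choice yields a smaller tree whenever several trees have nearly the same cross-validated error.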

====(3) Random Forests ====
Random forests may improve predictive accuracy by growing a large number of trees, each on a bootstrap sample of the cases and with a random subset of the predictors considered at each split. Each case is classified by every tree in this "forest", and the final prediction combines the results across all trees (an average in regression, a majority vote in classification). See the randomForest package<sup>5</sup>.

library(randomForest)
fit.2 <- randomForest(reg ~ age + hgt + wgt, na.action = na.omit, data= fdgs[,-1]) # classification is used because reg is a factor
print(fit.2) # view results
importance(fit.2) # importance of each predictor
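
The way a forest combines its trees can be sketched without the randomForest package. The per-tree predictions below are made up, and majority_vote is a hypothetical helper name, so this only illustrates the aggregation step, not how the trees are grown.

```r
# Made-up predictions: rows are cases, columns are trees of a tiny "forest"
votes <- matrix(
  c("West",  "West", "East",
    "South", "East", "South",
    "City",  "City", "City"),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("case", 1:3), paste0("tree", 1:3))
)

# Classification: the most frequent label across trees wins
majority_vote <- function(v) names(which.max(table(v)))
predicted_class <- apply(votes, 1, majority_vote)

# Regression: the tree predictions are averaged instead
tree_values <- c(162.1, 158.4, 160.3)
predicted_value <- mean(tree_values)
```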

Note on missing values/incomplete data: If the data have missing values, we have 3 choices:

1. Use a different tool (rpart handles missing values well)

2. Impute the missing values

3. For a small number of missing cases, we can use na.action = na.omit
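
Options 2 and 3 can be sketched in base R on a made-up data frame. The mean imputation shown here is only the crudest form of imputation and is used purely for illustration; packages such as mice implement multiple imputation.

```r
# Made-up data frame with a few missing heights and weights
d <- data.frame(
  age = c(1.2, 5.0, 9.3, 13.1, 17.8),
  hgt = c(75.0, NA, 135.2, 160.4, NA),
  wgt = c(10.1, 18.5, NA, 47.0, 68.2)
)

colSums(is.na(d))          # number of missing values per column
d_complete <- na.omit(d)   # option 3: drop the incomplete rows

# Option 2, in its simplest form: replace each NA by the column mean
d_imputed <- d
for (v in names(d_imputed)) {
  miss <- is.na(d_imputed[[v]])
  d_imputed[[v]][miss] <- mean(d_imputed[[v]], na.rm = TRUE)
}
```

Dropping rows is only reasonable when few cases are affected: here na.omit keeps 2 of the 5 rows.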

===Latent growth and growth mixture modeling (LGM/GMM)===

LGM and GMM are structural equation modeling techniques that capture inter-individual differences in longitudinal change in response to a particular treatment. For instance, differences among patients in the timing of treatment effects may be an underlying source of HTE. LGM distinguishes whether (yes/no) and how (fast/slow, temporary/lasting) patients respond to treatment. The heterogeneous individual growth trajectories are estimated from intra-individual changes over time by examining common population parameters, i.e., slopes, intercepts, and error variances. Suppose each individual has a unique initial status (intercept) and response rate (slope) during a specific time interval. Then the variances of the individuals’ baseline measures (intercepts) and changes (slopes) in health outcomes represent the degree of HTE. The LGM-identified HTE of individual growth curves can be attributed to observed predictors, including both fixed and time-varying covariates.
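
The idea that the spread of individual intercepts and slopes reflects the degree of HTE can be sketched with simulated data. All numbers below are arbitrary, and fitting a separate line per subject is only a crude two-stage stand-in for a latent growth model, not the estimation method used by LGM software.

```r
set.seed(1)
n_subjects <- 200
times <- 0:4

# Each subject has its own intercept (initial status) and slope (response rate)
intercepts <- rnorm(n_subjects, mean = 50, sd = 5)
slopes <- rnorm(n_subjects, mean = -2, sd = 1)

# Long-format data: one row per subject and time point, plus measurement noise
long <- expand.grid(time = times, id = seq_len(n_subjects))
long$y <- intercepts[long$id] + slopes[long$id] * long$time +
  rnorm(nrow(long), sd = 1)

# Fit a separate straight line to each subject
fits <- t(sapply(split(long, long$id),
                 function(d) coef(lm(y ~ time, data = d))))
colnames(fits) <- c("intercept", "slope")

colMeans(fits)        # average trajectory
apply(fits, 2, var)   # spread of intercepts and slopes across subjects
```

Because each per-subject estimate carries its own estimation error, these sample variances tend to overstate the true between-subject variances; a mixed or latent growth model separates the two sources.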

LGM assumes that all individuals are from the same population (too restrictive in some cases). If the HTE is due to observed demographic variables, such as age, gender, and marital status, one may utilize multiple-group LGM. Despite its successful applications for modeling longitudinal change, there may be multiple subpopulations with unobserved heterogeneities. Growth mixture modeling (GMM) extends LGM to allow the identification and prediction of unobserved subpopulations in longitudinal data analysis. Each unobserved subpopulation may constitute its own latent class and behave differently than individuals in other latent classes. Within each latent class, there are also different trajectories across individuals; however, different latent classes don’t share common population parameters. Suppose we are interested in studying retirees’ psychological well-being change trajectory when multiple unknown subpopulations exist. We can add another layer (a latent class variable) on the LGM framework so that the unobserved latent classes can be inferred from the data. The covariates in GMM are designed to affect growth factors distinctly across different latent classes. Therefore, there are two types of HTE: 1) the latent class variable in GMM divides individuals into groups with different growth curves; and 2) coefficient estimates vary across latent classes.

Latent variables are not directly observed – they are inferred (via a model) from other actually observed and directly measured variables. Models that explain observed variables in terms of latent variables are called latent variable models. When the latent (unobserved) variable is discrete, it is referred to as a latent class variable.

Breast Cancer Example: Recall the earlier discussion of lmer (lme4 package), where Linear Mixed Models (LMM) are used with longitudinal data to examine change over time in outcomes relative to predictive covariates. LMM assumptions include:

(i) continuous longitudinal outcome

(ii) Gaussian random-effects and errors

(iii) linearity of the relationships with the outcome

(iv) homogeneous population

(v) missing at random data

The objectives of LGM/GMM models (see Latent Class Mixed Models, lcmm R package<sup>6,7</sup>) are to extend the linear mixed model estimation to:

(i) heterogeneous populations (relax (iv) above). Use hlme for latent class linear mixed models (i.e. Gaussian continuous outcome)

(ii) other types of longitudinal outcomes : ordinal, (bounded) quantitative non-Gaussian outcomes (relax (i), (ii), (iii), (iv)). Use lcmm for general latent class mixed models with outcomes of different nature

(iii) joint analysis of a longitudinal outcome and a time-to-event (relax (iv), (v)). Use Jointlcmm for joint latent class models with a longitudinal outcome and a right-censored (possibly left-truncated) time-to-event.

Let’s use these data (http://www.ats.ucla.edu/stat/data/hdp.csv), representing cancer phenotypes and predictors (e.g., "IL6", "CRP", "LengthofStay", "Experience") and outcome measures (e.g., remission) collected on patients, nested within doctors (DID) and within hospitals (HID).

We can illustrate the latent class linear mixed models implemented in hlme through a study of the quadratic trajectories of the response (remission) with TumorSize, adjusting for CO2*Pain interaction and assuming correlated random-effects for the functions of SmokingHx and Sex. To estimate the corresponding standard linear mixed model using 1 latent class where CO2 interacts with Pain:

# install.packages("lcmm")
library("lcmm")

hdp <- read.csv("http://www.ats.ucla.edu/stat/data/hdp.csv")
hdp <- within(hdp, {
Married <- factor(Married, levels = 0:1, labels = c("no", "yes"))
DID <- factor(DID)
HID <- factor(HID)
})

# add a new subject ID column (last column in the data, "ID"); this is necessary for the hlme call
hdp$ID <- seq.int(nrow(hdp))

model.hlme <- hlme(remission ~ IL6 + CRP + LengthofStay + Experience + I(tumorsize^2) + co2*pain + I(tumorsize^2)*pain, random=~ SmokingHx + Sex, subject='ID', data=hdp, ng=1)
summary(model.hlme)

Heterogenous linear mixed model
fitted by maximum likelihood method

hlme(fixed = remission ~ IL6 + CRP + LengthofStay + Experience +
I(tumorsize^2) + co2 * pain + I(tumorsize^2) * pain, random = ~SmokingHx +
Sex, subject = "ID", ng = 1, data = hdp)

Statistical Model:
Dataset: hdp
Number of subjects: 8525
Number of observations: 8525
Number of latent classes: 1
Number of parameters: 21

Iteration process:
Convergence criteria satisfied
Number of iterations: 34
Convergence criteria: parameters= 1.2e-09
: likelihood= 8.3e-06
: second derivatives= 2.7e-05

Goodness-of-fit statistics:
maximum log-likelihood: -5223.9
AIC: 10489.79
BIC: 10637.86

Maximum Likelihood Estimates:

<center>Fixed effects in the Longitudinal Model:

{| class="wikitable" style="text-align:center; " border="1"
|-
| ||coef||Se||Wald||p-value
|-
|Intercept||0.28636||0.24314||1.178||0.23890
|-
|IL6||-0.01134||0.00183||-6.184||0.00000
|-
|CRP||-0.00674||0.00167||-4.043||0.00005
|-
|LengthofStay||-0.04834||0.00463||-10.436||0.00000
|-
|Experience||0.01695||0.00119||14.263||0.00000
|-
|I(tumorsize^2)||0.00000||0.00001||-0.076||0.93953
|-
|co2||-0.03549||0.16204||-0.219||0.82663
|-
|pain||0.03930||0.04278||0.919||0.35832
|-
|co2:pain||-0.01489||0.02871||-0.519||0.60395
|-
|I(tumorsize^2):pain||0.00000||0.00000||0.553||0.58045
|}
</center>

<center>Variance-covariance matrix of the random-effects

{| class="wikitable" style="text-align:center; " border="1"
|-
| ||intercept||SmokingHxformer||SmokingHxnever||Sexmale
|-
|intercept||0.19310943|| || ||
|-
|SmokingHxformer||-0.10617988||0.209155186|| ||
|-
|SmokingHxnever||-0.12388534||0.068342049||2.262655e-01||
|-
|Sexmale||-0.08130975||-0.007353491||-1.873934e-05||0.1730187

|}
</center>

Residual standard error:

coef: 0.1299767

se: 1.187426

Results interpretation:

(1) The first part of the summary provides information about the dataset, the number of subjects, observations, observations deleted (since by default, missing observations are deleted), number of latent classes and number of parameters.

(2) Next, details about the algorithm convergence are provided, along with the number of iterations, the convergence criteria, and a message indicating whether the model converged correctly: "convergence criteria satisfied".

(3) The maximum log-likelihood, Akaike criterion (AIC) and Bayesian Information criterion (BIC) are reported.

(4) Estimates of parameters, the estimated standard error, the Wald Test statistics (with Normal approximation) and the corresponding p-values are reported below.

(5) For the random-effect distribution, the estimated matrix of covariance of the random-effects is displayed.

(6) The standard error of the residuals is given along with its estimated standard error.

(7) Neither the quadratic TumorSize term nor Pain appears to be associated with Remission. This can be assessed formally using a multivariate Wald test of these two coefficients:

WaldMult(model.hlme, pos=c(6,8))
# pos - a vector containing the indices in model.hlme of the parameters to test
Wald Test p_value
I(tumorsize^2) = pain = 0 0.85562 0.65193
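
The univariate Wald statistics and p-values in the fixed-effects table above can be reproduced by hand. The sketch below uses the intercept row (coef 0.28636, Se 0.24314) and the Normal approximation mentioned in item (4); small discrepancies with the table are expected because the printed estimates are rounded.

```r
# Estimate and standard error of the intercept, copied from the table above
coef_hat <- 0.28636
se_hat <- 0.24314

wald <- coef_hat / se_hat          # Wald statistic
p_value <- 2 * pnorm(-abs(wald))   # two-sided p-value, Normal approximation
round(c(Wald = wald, p = p_value), 3)
```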

We may consider the model with an adjustment for CRP only on the intercept. Below we estimate the corresponding models for a varying number of latent classes (from 1 to 3) using the default initial values:

# Initial Model: model.hlme <- hlme(remission ~ IL6 + CRP + LengthofStay + Experience + I(tumorsize^2) + co2*pain + I(tumorsize^2)*pain, random=~ SmokingHx + Sex, subject='ID', data=hdp, ng=1)

model.hlme.1 <- hlme(tumorsize ~ IL6 + CRP + LengthofStay, subject='ID', data=hdp, ng=1)
model.hlme.2 <- hlme(tumorsize ~ IL6 + CRP + LengthofStay + SmokingHx, mixture=~ SmokingHx, subject='ID', data=hdp, ng=2)
model.hlme.3 <- hlme(tumorsize ~ IL6 + CRP + LengthofStay + SmokingHx, mixture=~ SmokingHx, subject='ID', data=hdp, ng=3)

The estimation process for a varying number of latent classes can be summarized with summarytable, which gives the log-likelihood, the number of parameters, the Bayesian Information Criterion, and the posterior proportion of each class:

summarytable(model.hlme.1, model.hlme.2, model.hlme.3)
G loglik npm BIC %class1 %class2 %class3
model.hlme.1 1 -33301.82 5 66648.89 100.000000
model.hlme.2 2 -31592.79 11 63285.15 99.214076 0.7859238
model.hlme.3 3 -31589.55 15 63314.86 6.357771 82.2991202 11.34311

The program took 404.65 seconds
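
The BIC column of the summary table can be checked from the log-likelihood and the number of parameters. The sketch below assumes the usual definition BIC = -2 loglik + npm log(N), with N taken as the number of subjects (8525 here, which equals the number of observations); the values are copied from the output above.

```r
# Values copied from the summarytable() output above
loglik <- c(-33301.82, -31592.79, -31589.55)
npm <- c(5, 11, 15)
n_subjects <- 8525

# BIC = -2 * log-likelihood + (number of parameters) * log(number of subjects)
bic <- -2 * loglik + npm * log(n_subjects)
round(bic, 2)
which.min(bic)   # index of the model with the smallest BIC
```

Going from two to three classes raises the log-likelihood by only about 3.2 while adding four parameters, so the penalty term outweighs the gain.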

In this example, the optimal number of latent classes according to the BIC is two (the smallest BIC). The posterior classification is described with:

postprob(model.hlme.2)

Posterior classification:
class1 class2
N 8458.00 67.00
% 99.21 0.79

Posterior classification table:
--> mean of posterior probabilities in each class
prob1 prob2
class1 0.8555 0.1445
class2 0.4362 0.5638

Posterior probabilities above a threshold (%):
class1 class2
prob>0.7 92.48 2.99
prob>0.8 77.38 0.00
prob>0.9 38.53 0.00

In this example, the first class includes a posteriori 8458 subjects (99.21%), while class 2 includes 67 subjects (0.79%). Subjects were classified in class 1 with a mean posterior probability of 0.8555 (85.55%).
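
The quantities reported by postprob can be sketched from a matrix of posterior class-membership probabilities. The five rows below are made up for illustration; they are not taken from model.hlme.2.

```r
# Made-up posterior class-membership probabilities for five subjects
pp <- matrix(
  c(0.95, 0.05,
    0.72, 0.28,
    0.55, 0.45,
    0.30, 0.70,
    0.88, 0.12),
  ncol = 2, byrow = TRUE,
  dimnames = list(NULL, c("prob1", "prob2"))
)

# Assign each subject to the class with the highest posterior probability
class_hat <- max.col(pp, ties.method = "first")
table(class_hat)

# Mean posterior probabilities within each assigned class
# (rows: assigned class, columns: prob1, prob2)
post_table <- rowsum(pp, class_hat) / as.vector(table(class_hat))

# Share of class-1 subjects classified with a probability above 0.7
share_above <- mean(pp[class_hat == 1, "prob1"] > 0.7)
```

Diagonal entries of post_table close to 1 indicate well-separated classes; values near 0.5, as for class 2 in the output above, indicate ambiguous assignments.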

In class 1, 92.48% of the subjects were classified with a posterior probability above 0.7, while only 2.99% of the subjects in class 2 were classified with a posterior probability above 0.7. Goodness-of-fit of the model can be assessed by displaying the residuals and the mean predictions of the model (see the figures below), according to the time variable given in var.time:

plot(model.hlme.2)
# Figure (left panel)
plot(model.hlme.2, which="fit", var.time="Age", bty="l", ylab=" Remission ", xlab="Age", lwd=2)
# Figure (right panel)
plot(model.hlme.2, which="fit", var.time="Age", bty="l", ylab=" Remission ", xlab="Age", lwd=2, marg=FALSE)

<center>[[Image:SMHS_Methods4.png|500px]] </center>

<center>[[Image:SMHS_Methods5.png|500px]] </center>

<center>[[Image:SMHS_Methods6.png|500px]] </center>

The latent process mixed models implemented in lcmm are illustrated through the study of the linear trajectory of ntumors with Age, adjusted for Sex and assuming correlated random effects for the intercept and Age. The following lines estimate the corresponding latent process mixed models with different link functions:

model.hlme.lin <- lcmm(ntumors ~ Age*Sex, random=~ Age ,subject='ID', data=hdp)
model.hlme.beta <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', data=hdp, link='beta')
model.hlme.spl <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', data=hdp, link='splines')
model.hlme.spl5q <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', data=hdp, link='5-quant-splines')

link: an optional family of link functions. The available options are:

*"linear" (the default) specifies a linear link function, leading to a standard linear mixed model (homogeneous or heterogeneous, as estimated in hlme).
*"beta" estimates a link function from the family of Beta cumulative distribution functions.
*"thresholds" uses a threshold model to describe the correspondence between each level of an ordinal outcome and the underlying latent process.
*"splines" approximates the link function by I-splines. In this case, the number of nodes and their location should also be specified: the number of nodes comes first, followed by "-", then the location ("equi", "quant" or "manual" for, respectively, equidistant nodes, nodes at quantiles of the marker distribution, or interior nodes entered manually in the argument intnodes), followed by "-" and finally "splines". For example, "7-equi-splines" means I-splines with 7 equidistant nodes, "6-quant-splines" means I-splines with 6 nodes located at the quantiles of the marker distribution, and "9-manual-splines" means I-splines with 9 nodes, the vector of 7 interior nodes being entered in the argument intnodes.

summary (model.hlme.lin)

General latent class mixed model fitted by maximum likelihood method

lcmm(fixed = ntumors ~ Age * Sex, random = ~Age, subject = "ID",data = hdp)

Statistical Model:
Dataset: hdp
Number of subjects: 8525
Number of observations: 8525
Number of latent classes: 1
Number of parameters: 8
Link function: linear

Iteration process:
Maximum number of iteration reached without convergence
Number of iterations: 100
Convergence criteria: parameters= 5.4e-10
: likelihood= 5.5e-10
: second derivatives= 1

Goodness-of-fit statistics:
maximum log-likelihood: -19915.24
AIC: 39846.49
BIC: 39902.89

Discrete posterior log-likelihood: 0
Discrete AIC: 16

Mean discrete AIC per subject: 9e-04
Mean UACV per subject: 0
Mean discrete LL per subject: 0

Maximum Likelihood Estimates:

Fixed effects in the longitudinal model:

coef Se Wald p-value
intercept (not estimated) 0.00000
Age 0.09491
Sexmale -0.66303
Age:Sexmale 0.01132

Variance-covariance matrix of the random-effects:
intercept Age
intercept 20.5013715
Age -0.2889814 0.007696382

Residual standard error (not estimated) = 1

Parameters of the link function:

coef Se Wald p-value
Linear 1 (intercept) -0.36768
Linear 2 (std err) 0.71432

Objects model.hlme.lin, model.hlme.beta, model.hlme.spl and model.hlme.spl5q are latent process mixed models that assume exactly the same trajectory for the underlying latent process but, respectively, a linear link, a Beta CDF link, I-splines with 5 equidistant knots (the default with link='splines') and I-splines with 5 knots at quantiles. model.hlme.lin reduces to a standard linear mixed model (link='linear' by default). The only difference from an hlme object is the parameterization of the intercept and the residual standard error, which are treated as rescaling parameters.

col <- rainbow(4)
plot(model.hlme.lin, which="linkfunction", bty='l', ylab="Number-of-Tumors", col=col[1], lwd=2, xlab="underlying latent process")
plot(model.hlme.beta, which="linkfunction", add=TRUE, col=col[2], lwd=2)
plot(model.hlme.spl, which="linkfunction", add=TRUE, col=col[3], lwd=2)
plot(model.hlme.spl5q, which="linkfunction", add=TRUE, col=col[4], lwd=2)
legend(x="topleft", legend=c("linear", "beta", "splines (5 equidistant)", "splines (5 at quantiles)"), lty=1, col=col, bty="n", lwd=2)

# to obtain confidence bands use function predictlink
link.lin <- predictlink(model.hlme.lin, ndraws=2000)

Error in predictlink.lcmm(model.hlme.spl, ndraws = 2000):
No confidence intervals can be produced since the program did not converge properly

model.hlme.lin$conv # double-check the convergence of the algorithm
[1] 2
# status of convergence:
# =1 if the convergence criteria were satisfied,
# =2 if the maximum number of iterations was reached,
# =4 or 5 if a problem occurred during optimisation

model.hlme.lin <- lcmm(ntumors ~ Age*Sex, random=~ Age ,subject='ID', epsY = 0.5, convB = 1e-01, convL = 1e-01, convG = 1e-01, maxiter=200, data=hdp); model.hlme.lin$conv

# Now that the (relaxed) convergence criteria are satisfied, we can obtain confidence bands
link.lin <- predictlink(model.hlme.lin, ndraws=2000)

# plot(model.hlme.lin, which="linkfunction", bty='l', ylab="Number-of-Tumors", col=col[1], lwd=2, xlab="underlying latent process")
plot(link.lin, add=TRUE, col=col[1], lty=2, lwd=2)
legend(x="left", legend=c("95% confidence bands", "for linear fit"), lty=c(2,NA), col=c(col[1],NA), bty="n", lwd=2)

<center>[[Image:SMHS_Methods7.png|500px]] </center>

# Repeat using the other link functions … model.hlme.beta, model.hlme.spl, …
model.hlme.beta <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', data=hdp, link='beta',
convB = 1e-01, convL = 1e-01, convG = 1e-01, maxiter=200); model.hlme.beta$conv
link.beta <- predictlink(model.hlme.beta, ndraws=2000)
plot(link.beta, add=TRUE, col=col[2], lty=2, lwd=2)
legend(x="left", legend=c("95% confidence bands", "for BETA fit"), lty=c(2,NA), col=c(col[2],NA), bty="n", lwd=2)

===Footnotes===
*1 http://cran.r-project.org/web/packages/rpart/index.html
*2 http://www.mayo.edu/hsr/techrpt/61.pdf
*3 http://dx.doi.org/10.1371/journal.pone.0027608
*4 http://www.nature.com/pr/journal/v73/n3/abs/pr2012189a.html
*5 http://stat-www.berkeley.edu/users/breiman/RandomForests/
*6 http://cran.r-project.org/web/packages/lcmm/
*7 http://arxiv.org/pdf/1503.00890v1.pdf

===[[SMHS_MethodsHeterogeneity_MetaAnalysis|Next see: Meta-Analysis]]===
* [[SMHS_MethodsHeterogeneity|Back to the Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research section]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php/SMHS_MethodsHeterogeneity_HTE}}

SMHS MethodsHeterogeneity HTE

2016-05-23T18:31:20Z

Pineaumi: /* Random Forests */

==[[SMHS_MethodsHeterogeneity| Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research]] - Methods and Approaches for HTE Analytics ==

===Overview===

Recursive partitioning is a data mining technique for exploring structure and patterns in complex data. It facilitates the visualization of decision rules for predicting categorical (classification tree) or continuous (regression tree) outcome variables. The R rpart package<sup>1</sup> provides the tools for Classification and Regression Tree (CART) modeling; conditional inference trees and random forests are available in separate packages (e.g., party and randomForest). Additional resources include An Introduction to Recursive Partitioning Using the RPART Routines<sup>2</sup>. The Appendix includes a description of the main CART analysis steps.

install.packages("rpart")
library("rpart")

===CART===
Classification and Regression Tree (CART) is a decision-tree based technique that considers how variation observed in a given response variable (continuous or categorical) can be understood through a systematic deconstruction of the overall study population into subgroups, using explanatory variables of interest. For HTE analysis, CART is best suited for early-stage, exploratory analyses. Its relative simplicity can be powerful in identifying basic relationships between variables of interest, and thus identify potential subgroups for more advanced analyses. The key to CART is its ‘systematic’ approach to the development of the subgroups, which are constructed sequentially through repeated, binary splits of the population of interest, one explanatory variable at a time. In other words, each ‘parent’ group is divided into two ‘child’ groups, with the objective of creating increasingly homogeneous subgroups. The process is repeated and the subgroups are then further split, until no additional variables are available for further subgroup development. The resulting tree structure is oftentimes overgrown, but additional techniques are used to ‘trim’ the tree to a point at which its predictive power is balanced against issues of over-fitting. Because the CART approach does not make assumptions regarding the distribution of the dependent variable, it can be used in situations where other multivariate modeling techniques often used for exploratory predictive risk modeling would not be appropriate – namely in situations where data are not normally distributed.

CART analyses are useful in situations where there is some evidence to suggest that HTE exists, but the subgroups defining the heterogeneous response are not well understood. CART allows for an exploration of response in a myriad of complex subpopulations, and more recently developed ensemble methods (such as Bayesian Additive Regression Trees) allow for more robust analyses through the combination of multiple CART analyses.

====Example Fifth Dutch growth study====

# Let’s use the Fifth Dutch growth study (2009) data, fdgs<sup>3</sup>, available in the mice package. Is it true that “the world’s tallest nation has stopped growing taller: the height of Dutch children from 1955 to 2009”<sup>4</sup>?

#install.packages("mice")
library("mice")
?fdgs
head(fdgs)

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||ID ||Reg ||Age ||Sex ||HGT ||WGT ||HGT.Z ||WGT.Z
|-
|1 ||100001||West||13.09514||boy||175.5||75.0||1.751||2.410
|-
|2 ||100003||West||13.81793 ||boy||148.4||40.0||2.292||1.494
|-
|3 ||100004||West||13.97125||boy||159.9||46.5||0.743||0.783
|-
|4 ||100005||West||13.98220 ||girl||159.7||46.5 ||0.743 ||0.783
|-
|5||100006||West||13.52225||girl||160.3||47.8||0.414||0.355
|-
|6||100018||East||10.21492||boy||157.8||39.7||2.025||0.823
|}
</center>

summary(fdgs)

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
|ID ||Reg ||Age ||Sex ||HGT
|-
|Min.:100001||North:732||Min.:0.008214||boy:4829||Min.:46.0
|-
|1st Qu.:106353||East:2528||1st Qu.:1.618754||girl:5201||1st Qu.:83.8
|-
|Median:203855||South:2931||Median:8.084873|| ||Median:131.5
|-
|Mean:180091||West:2578||Mean:8.157936|| ||Mean:123.9
|-
|3rd Qu.:210591||City:1261||3rd Qu.:13.547570|| ||3rd Qu.:162.3
|-
|Max.:401955|| ||Max.:21.993155|| ||Max.:208.0
|-
| || || || ||NA's: 23
|}
</center>

====(1) Classification Tree====

Let's use the data frame '''fdgs''' to predict Region, from Age, Height, and Weight.
# grow tree
fit.1 <- rpart(reg ~ age + hgt + wgt, method="class", data= fdgs[,-1])

printcp(fit.1) # display the results
plotcp(fit.1) # visualize cross-validation results
summary(fit.1) # detailed summary of splits

# plot tree
par(oma=c(0,0,2,0))
plot(fit.1, uniform=TRUE, margin=0.3, main="Classification Tree for Region (FDGS Data)")
text(fit.1, use.n=TRUE, all=TRUE, cex=1.0)

<center>[[Image:SMHS_Methods2.png|500px]] </center>

# create a better plot of the classification tree
post(fit.1, title = "Classification Tree for Region (FDGS Data)", file = "")

<center>[[Image:SMHS_Methods3.png|500px]] </center>

====(2) Pruning the tree====

pruned.fit.1 <- prune(fit.1, cp = fit.1$cptable[which.min(fit.1$cptable[,"xerror"]),"CP"])

# plot the pruned tree
plot(pruned.fit.1, uniform=TRUE, main="Pruned Classification Tree for Region (FDGS Data)")
text(pruned.fit.1, use.n=TRUE, all=TRUE, cex=1.0)
post(pruned.fit.1, title = "Pruned Classification Tree for Region (FDGS Data)")

Not much change, as the initial tree is not complex!
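
To see the pruning rule at work on a tree that is deliberately overgrown, here is a minimal sketch using the kyphosis data that ships with rpart. The control settings (cp = 0.001, minsplit = 5) and the seed are illustrative assumptions, not values used elsewhere on this page.

```r
library(rpart)   # the kyphosis data frame is included in rpart

set.seed(1)      # cross-validation folds are random, so fix the seed

# Grow a deliberately large tree by lowering the complexity threshold
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
             method = "class",
             control = rpart.control(cp = 0.001, minsplit = 5))

# Pick the complexity parameter with the smallest cross-validated error
cp.table <- fit$cptable
best.cp  <- cp.table[which.min(cp.table[, "xerror"]), "CP"]

# Prune back to that complexity and compare the number of nodes
pruned <- prune(fit, cp = best.cp)
c(full = nrow(fit$frame), pruned = nrow(pruned$frame))
```

The pruned tree can never have more nodes than the full tree; how much smaller it is depends on the cross-validation results.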

====(3) Random Forests ====
Random forests may improve predictive accuracy by growing a large number of trees, each fit to a bootstrap sample of the cases and using a random subset of the predictors at each split. Each case is classified by every tree in this "forest", and the final prediction combines the results across all trees (an average in regression, a majority vote in classification). See the randomForest package<sup>5</sup>.

library(randomForest)
fit.2 <- randomForest(reg ~ age + hgt + wgt, na.action = na.omit, data= fdgs[,-1]) # reg is a factor, so a classification forest is grown
print(fit.2) # view results
importance(fit.2) # importance of each predictor

Note on missing values/incomplete data: If the data have missing values, we have 3 choices:

1. Use a different tool (rpart handles missing values well)

2. Impute the missing values

3. For a small number of missing cases, we can use na.action = na.omit
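
Options 2 and 3 can be compared on the built-in airquality data, as in the sketch below. The mean imputation shown here is only a crude stand-in chosen for brevity (an assumption of this example); the mice package loaded earlier implements multiple imputation, which is generally preferable.

```r
data(airquality)

# Count missing values per column
colSums(is.na(airquality))

# Option 3: drop every row that has at least one missing value
complete <- na.omit(airquality)

# Option 2 (crude version): replace each missing value by the column mean
imputed <- airquality
for (v in names(imputed)) {
  miss <- is.na(imputed[[v]])
  imputed[[v]][miss] <- mean(imputed[[v]], na.rm = TRUE)
}

# Compare how many rows each strategy keeps
c(original = nrow(airquality), complete.cases = nrow(complete), imputed = nrow(imputed))
```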

===Latent growth and growth mixture modeling (LGM/GMM)===

LGM and GMM represent structural equation modeling techniques that capture inter-individual differences in longitudinal change corresponding to a particular treatment. For instance, patients’ different timing patterns of the treatment effects may represent the underlying sources of HTE. LGM distinguishes whether (yes/no) and how (fast/slow, temporary/lasting) patients respond to treatment. The heterogeneous individual growth trajectories are estimated from intra-individual changes over time by examining common population parameters, i.e., slopes, intercepts, and error variances. Suppose each individual has a unique initial status (intercept) and response rate (slope) during a specific time interval. Then the variances of the individuals’ baseline measures (intercepts) and changes (slopes) in health outcomes will represent the degree of HTE. The LGM-identified HTE of individual growth curves can be attributed to observed predictors, including both fixed and time-varying covariates.

LGM assumes that all individuals are from the same population (too restrictive in some cases). If the HTE is due to observed demographic variables, such as age, gender, and marital status, one may utilize multiple-group LGM. Despite its successful applications for modeling longitudinal change, there may be multiple subpopulations with unobserved heterogeneities. Growth mixture modeling (GMM) extends LGM to allow the identification and prediction of unobserved subpopulations in longitudinal data analysis. Each unobserved subpopulation may constitute its own latent class and behave differently than individuals in other latent classes. Within each latent class, there are also different trajectories across individuals; however, different latent classes don’t share common population parameters. Suppose we are interested in studying retirees’ psychological well-being change trajectory when multiple unknown subpopulations exist. We can add another layer (a latent class variable) on the LGM framework so that the unobserved latent classes can be inferred from the data. The covariates in GMM are designed to affect growth factors distinctly across different latent classes. Therefore, there are two types of HTE: 1) the latent class variable in GMM divides individuals into groups with different growth curves; and 2) coefficient estimates vary across latent classes.

Latent variables are not directly observed – they are inferred (via a model) from other actually observed and directly measured variables. Models that explain observed variables in terms of latent variables are called latent variable models. When the latent (unobserved) variable is discrete, it is referred to as a latent class variable.

Breast Cancer Example: Recall the lmer function (lme4 package) from the earlier review discussions, where Linear Mixed Models (LMM) are used for longitudinal data to examine change over time of outcomes relative to predictive covariates. LMM assumptions include:

(i) continuous longitudinal outcome

(ii) Gaussian random-effects and errors

(iii) linearity of the relationships with the outcome

(iv) homogeneous population

(v) missing at random data

The objectives of LGM/GMM models (see Latent Class Mixed Models, lcmm R package) are to extend the linear mixed model estimation to:

(i) heterogeneous populations (relax (iv) above). Use hlme for latent class linear mixed models (i.e. Gaussian continuous outcome)

(ii) other types of longitudinal outcomes : ordinal, (bounded) quantitative non-Gaussian outcomes (relax (i), (ii), (iii), (iv)). Use lcmm for general latent class mixed models with outcomes of different nature

(iii) joint analysis of a longitudinal outcome and a time-to-event (relax (iv), (v)). Use Jointlcmm for joint latent class models with a longitudinal outcome and a right-censored (left-truncated) time-to-event

Let’s use these data (http://www.ats.ucla.edu/stat/data/hdp.csv), representing cancer phenotypes and predictors (e.g., "IL6", "CRP", "LengthofStay", "Experience") and outcome measures (e.g., remission) collected on patients, nested within doctors (DID) and within hospitals (HID).

We can illustrate the latent class linear mixed models implemented in hlme through a study of the quadratic trajectories of the response (remission) with TumorSize, adjusting for CO2*Pain interaction and assuming correlated random-effects for the functions of SmokingHx and Sex. To estimate the corresponding standard linear mixed model using 1 latent class where CO2 interacts with Pain:

# install.packages("lcmm")
library("lcmm")

hdp <- read.csv("http://www.ats.ucla.edu/stat/data/hdp.csv")
hdp <- within(hdp, {
Married <- factor(Married, levels = 0:1, labels = c("no", "yes"))
DID <- factor(DID)
HID <- factor(HID)
})

# add a new subject ID column (last column in the data, "ID"); this is required by the hlme call
hdp$ID <- seq.int(nrow(hdp))

model.hlme <- hlme(remission ~ IL6 + CRP + LengthofStay + Experience + I(tumorsize^2) + co2*pain + I(tumorsize^2)*pain, random=~ SmokingHx + Sex, subject='ID', data=hdp, ng=1)
summary(model.hlme)

Heterogenous linear mixed model
fitted by maximum likelihood method

hlme(fixed = remission ~ IL6 + CRP + LengthofStay + Experience +
I(tumorsize^2) + co2 * pain + I(tumorsize^2) * pain, random = ~SmokingHx +
Sex, subject = "ID", ng = 1, data = hdp)

Statistical Model:
Dataset: hdp
Number of subjects: 8525
Number of observations: 8525
Number of latent classes: 1
Number of parameters: 21

Iteration process:
Convergence criteria satisfied
Number of iterations: 34
Convergence criteria: parameters= 1.2e-09
: likelihood= 8.3e-06
: second derivatives= 2.7e-05

Goodness-of-fit statistics:
maximum log-likelihood: -5223.9
AIC: 10489.79
BIC: 10637.86

Maximum Likelihood Estimates:

<center>Fixed effects in the Longitudinal Model:

{| class="wikitable" style="text-align:center; " border="1"
|-
| ||coef||Se||Wald||p-value
|-
|Intercept||0.28636||0.24314||1.178||0.23890
|-
|IL6||-0.01134||0.00183||-6.184||0.00000
|-
|CRP||-0.00674||0.00167||-4.043||0.00005
|-
|LengthofStay||-0.04834||0.00463||-10.436||0.00000
|-
|Experience||0.01695||0.00119||14.263||0.00000
|-
|I(tumorsize^2)||0.00000||0.00001||-0.076||0.93953
|-
|co2||-0.03549||0.16204||-0.219||0.82663
|-
|pain||0.03930||0.04278||0.919||0.35832
|-
|co2:pain||-0.01489||0.02871||-0.519||0.60395
|-
|I(tumorsize^2):pain||0.00000||0.00000||0.553||0.58045
|}
</center>

<center>Variance-covariance matrix of the random-effects

{| class="wikitable" style="text-align:center; " border="1"
|-
| ||intercept||SmokingHxformer||SmokingHxnever||Sexmale
|-
|intercept||0.19310943|| || ||
|-
|SmokingHxformer||-0.10617988||0.209155186|| ||
|-
|SmokingHxnever||-0.12388534||0.068342049||2.262655e-01||
|-
|Sexmale||-0.08130975||-0.007353491||-1.873934e-05||0.1730187

|}
</center>

Residual standard error:

coef: 0.1299767

se: 1.187426

Results interpretation:

(1) The first part of the summary provides information about the dataset, the number of subjects, observations, observations deleted (since by default, missing observations are deleted), number of latent classes and number of parameters.

(2) Next, details about the algorithm convergence are provided, along with the number of iterations, the convergence criteria, and the information indicating whether the model converged correctly: "convergence criteria satisfied".

(3) The maximum log-likelihood, Akaike criterion (AIC) and Bayesian Information criterion (BIC) are reported.

(4) Estimates of parameters, the estimated standard error, the Wald Test statistics (with Normal approximation) and the corresponding p-values are reported below.

(5) For the random-effect distribution, the estimated matrix of covariance of the random-effects is displayed.

(6) The standard error of the residuals is given along with its estimated standard error.

(7) The quadratic TumorSize term and Pain do not appear to be associated with Remission (both have large p-values in the fixed-effects table). This may be formally assessed using a multivariate Wald test:

WaldMult(model.hlme, pos=c(6,8))
# pos - a vector containing the indices in model.hlme of the parameters to test
Wald Test p_value
I(tumorsize^2) = pain = 0 0.85562 0.65193
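
The reported p-value can be checked by hand: for a joint test of two parameters, the Wald statistic is referred to a chi-square distribution with 2 degrees of freedom. The sketch below reproduces the p-value from the statistic printed above. The helper wald_test and its inputs b and V are hypothetical illustrations of the general form W = b' V^{-1} b; they are not part of the lcmm package.

```r
# Wald statistic reported by WaldMult above, testing 2 parameters jointly
wald.stat <- 0.85562
df <- 2
p.value <- pchisq(wald.stat, df = df, lower.tail = FALSE)
round(p.value, 5)

# General form (hypothetical helper): W = b' V^{-1} b, where b holds the
# estimates under test and V is their estimated covariance matrix
wald_test <- function(b, V) {
  W <- as.numeric(t(b) %*% solve(V) %*% b)
  c(Wald = W, p.value = pchisq(W, df = length(b), lower.tail = FALSE))
}

# Made-up estimates and covariance matrix, for illustration only
b <- c(0.5, -0.2)
V <- diag(c(0.25, 0.04))
wald_test(b, V)
```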

We may consider the model with an adjustment for CRP only on the intercept. Below we estimate the corresponding models for a varying number of latent classes (from 1 to 3) using the default initial values:

# Initial Model: model.hlme <- hlme(remission ~ IL6 + CRP + LengthofStay + Experience + I(tumorsize^2) + co2*pain + I(tumorsize^2)*pain, random=~ SmokingHx + Sex, subject='ID', data=hdp, ng=1)

model.hlme.1 <- hlme(tumorsize ~ IL6 + CRP + LengthofStay, subject='ID', data=hdp, ng=1)
model.hlme.2 <- hlme(tumorsize ~ IL6 + CRP + LengthofStay + SmokingHx, mixture=~ SmokingHx, subject='ID', data=hdp, ng=2)
model.hlme.3 <- hlme(tumorsize ~ IL6 + CRP + LengthofStay + SmokingHx, mixture=~ SmokingHx, subject='ID', data=hdp, ng=3)

The estimation process for a varying number of latent classes can be summarized with summarytable, which gives the log-likelihood, the number of parameters, the Bayesian Information Criterion, and the posterior proportion of each class:

summarytable(model.hlme.1, model.hlme.2, model.hlme.3)
G loglik npm BIC %class1 %class2 %class3
model.hlme.1 1 -33301.82 5 66648.89 100.000000
model.hlme.2 2 -31592.79 11 63285.15 99.214076 0.7859238
model.hlme.3 3 -31589.55 15 63314.86 6.357771 82.2991202 11.34311

The program took 404.65 seconds
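
The BIC column in the table above follows from the log-likelihood and the number of parameters via BIC = -2 logLik + npm log(N), where N = 8525 is the number of subjects. A short check, with the values copied from the summarytable output:

```r
# Values copied from the summarytable output above
loglik <- c(model.hlme.1 = -33301.82,
            model.hlme.2 = -31592.79,
            model.hlme.3 = -31589.55)
npm <- c(5, 11, 15)   # number of parameters per model
N   <- 8525           # number of subjects

# BIC penalizes each extra parameter by log(N)
bic <- -2 * loglik + npm * log(N)
round(bic, 2)

# The preferred model is the one with the smallest BIC
names(which.min(bic))
```

Small differences in the last digit relative to the table come from the rounding of the printed log-likelihoods.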

In this example, the optimal number of latent classes according to the BIC is two (the smallest BIC). The posterior classification is described with:

postprob(model.hlme.2)

Posterior classification:
class1 class2
N 8458.00 67.00
% 99.21 0.79

Posterior classification table:
--> mean of posterior probabilities in each class
prob1 prob2
class1 0.8555 0.1445
class2 0.4362 0.5638

Posterior probabilities above a threshold (%):
class1 class2
prob>0.7 92.48 2.99
prob>0.8 77.38 0.00
prob>0.9 38.53 0.00

In this example, the first class includes a posteriori 8458 subjects (99.21%), while class 2 includes 67 subjects (0.79%). Subjects classified in class 1 have a mean posterior probability of 0.8555 (about 86%) of belonging to that class.
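
The structure of the posterior classification output can be illustrated with a small hand-made example. The matrix post below is hypothetical (six subjects, two classes); each subject is assigned to the class with the highest posterior probability, and the summaries mirror the three blocks printed by postprob.

```r
# Hypothetical posterior class-membership probabilities (rows sum to 1)
post <- matrix(c(0.95, 0.05,
                 0.80, 0.20,
                 0.60, 0.40,
                 0.30, 0.70,
                 0.10, 0.90,
                 0.55, 0.45),
               ncol = 2, byrow = TRUE,
               dimnames = list(NULL, c("prob1", "prob2")))

# Assign each subject to the most probable class
cls <- max.col(post, ties.method = "first")
table(cls)

# Mean posterior probabilities within each assigned class
class.table <- apply(post, 2, function(p) tapply(p, cls, mean))
class.table

# Percentage of subjects in each class whose top probability exceeds a threshold
max.prob <- apply(post, 1, max)
above <- sapply(c(0.7, 0.8, 0.9),
                function(th) tapply(max.prob > th, cls, mean) * 100)
colnames(above) <- c("prob>0.7", "prob>0.8", "prob>0.9")
above
```

Diagonal entries of class.table close to 1 indicate a clear separation between classes; values near 0.5, as for class 2 in the output above, indicate ambiguous assignments.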

In class 1, 92.48% of the subjects were classified with a posterior probability above 0.7, compared with only 2.99% of the subjects in class 2. Goodness-of-fit of the model can be assessed by plotting the residuals and the mean predictions of the model (see the figures below), according to the time variable given in var.time:

plot(model.hlme.2)
# Figure (left panel)
plot(model.hlme.2, which="fit", var.time="Age", bty="l", ylab=" Remission ", xlab="Age", lwd=2)
# Figure (right panel)
plot(model.hlme.2, which="fit", var.time="Age", bty="l", ylab=" Remission ", xlab="Age", lwd=2, marg=FALSE)

<center>[[Image:SMHS_Methods4.png|500px]] </center>

<center>[[Image:SMHS_Methods5.png|500px]] </center>

<center>[[Image:SMHS_Methods6.png|500px]] </center>

The latent process mixed models implemented in lcmm are illustrated through the study of the linear trajectory of ntumors with Age, adjusted for Sex and assuming correlated random-effects for the intercept and Age. The following lines estimate the corresponding latent process mixed models with different link functions:

model.hlme.lin <- lcmm(ntumors ~ Age*Sex, random=~ Age ,subject='ID', data=hdp)
model.hlme.beta <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', data=hdp, link='beta')
model.hlme.spl <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', data=hdp, link='splines')
model.hlme.spl5q <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', data=hdp, link='5-quant-splines')

link: an optional argument specifying the family of link functions:

*"linear" (the default) specifies a linear link function, leading to a standard linear mixed model (homogeneous or heterogeneous, as estimated in hlme).
*"beta" estimates a link function from the family of Beta cumulative distribution functions.
*"thresholds" uses a threshold model to describe the correspondence between each level of an ordinal outcome and the underlying latent process.
*"splines" approximates the link function by I-splines. In this case, the number of nodes and their locations should also be specified: the number of nodes is entered first, followed by "-", then the location is specified with "equi", "quant" or "manual" (for equidistant nodes, nodes at quantiles of the marker distribution, or interior nodes entered manually in the argument intnodes, respectively), followed by "-" and finally "splines". For example, "7-equi-splines" means I-splines with 7 equidistant nodes, "6-quant-splines" means I-splines with 6 nodes located at the quantiles of the marker distribution, and "9-manual-splines" means I-splines with 9 nodes, the vector of 7 interior nodes being entered in the argument intnodes.

summary (model.hlme.lin)

General latent class mixed model fitted by maximum likelihood method

lcmm(fixed = ntumors ~ Age * Sex, random = ~Age, subject = "ID",data = hdp)

Statistical Model:
Dataset: hdp
Number of subjects: 8525
Number of observations: 8525
Number of latent classes: 1
Number of parameters: 8
Link function: linear

Iteration process:
Maximum number of iteration reached without convergence
Number of iterations: 100
Convergence criteria: parameters= 5.4e-10
: likelihood= 5.5e-10
: second derivatives= 1

Goodness-of-fit statistics:
maximum log-likelihood: -19915.24
AIC: 39846.49
BIC: 39902.89

Discrete posterior log-likelihood: 0
Discrete AIC: 16

Mean discrete AIC per subject: 9e-04
Mean UACV per subject: 0
Mean discrete LL per subject: 0

Maximum Likelihood Estimates:

Fixed effects in the longitudinal model:

coef Se Wald p-value
intercept (not estimated) 0.00000
Age 0.09491
Sexmale -0.66303
Age:Sexmale 0.01132

Variance-covariance matrix of the random-effects:
intercept Age
intercept 20.5013715
Age -0.2889814 0.007696382

Residual standard error (not estimated) = 1

Parameters of the link function:

coef Se Wald p-value
Linear 1 (intercept) -0.36768
Linear 2 (std err) 0.71432

Objects model.hlme.lin, model.hlme.beta, model.hlme.spl and model.hlme.spl5q are latent process mixed models that assume the exact same trajectory for the underlying latent process but use, respectively, a linear link, a Beta CDF link, I-splines with 5 equidistant knots (the default with link='splines') and I-splines with 5 knots at quantiles. model.hlme.lin reduces to a standard linear mixed model (link='linear' by default). The only difference from an hlme object is the parameterization of the intercept and the residual standard error, which are treated as rescaling parameters.

col <- rainbow(4)
plot(model.hlme.lin, which="linkfunction", bty='l', ylab="Number-of-Tumors", col=col[1], lwd=2, xlab="underlying latent process")
plot(model.hlme.beta, which="linkfunction", add=T, col=col[2], lwd=2)
plot(model.hlme.spl, which="linkfunction", add=T, col=col[3], lwd=2)
plot(model.hlme.spl5q, which="linkfunction", add=T, col=col[4], lwd=2)
legend(x="topleft",legend=c("linear", "beta","splines (5equidistant)", "splines (5 at quantiles)"), lty=1,col=col,bty="n",lwd=2)

# to obtain confidence bands use function predictlink
link.lin <- predictlink(model.hlme.lin, ndraws=2000)

Error in predictlink.lcmm(model.hlme.spl, ndraws = 2000):
No confidence intervals can be produced since the program did not converge properly

model.hlme.lin$conv # double-check the convergence of the algorithm
[1] 2
# status of convergence:
# =1 if the convergence criteria were satisfied,
# =2 if the maximum number of iterations was reached,
# =4 or 5 if a problem occurred during optimisation

model.hlme.lin <- lcmm(ntumors ~ Age*Sex, random=~ Age ,subject='ID', epsY = 0.5, convB = 1e-01, convL = 1e-01, convG = 1e-01, maxiter=200, data=hdp); model.hlme.lin$conv

# Now that the (relaxed) convergence criteria are satisfied, we can obtain confidence bands
link.lin <- predictlink(model.hlme.lin, ndraws=2000)

# plot(model.hlme.lin, which="linkfunction", bty='l', ylab="Number-of-Tumors", col=col[1], lwd=2, xlab="underlying latent process")
plot(link.lin, add=TRUE, col=col[1], lty=2, lwd=2)
legend(x="left", legend=c("95% confidence bands", "for linear fit"), lty=c(2,NA), col=c(col[1],NA), bty="n", lwd=2)

<center>[[Image:SMHS_Methods7.png|500px]] </center>

# Repeat using the other link functions … model.hlme.beta, model.hlme.spl, …
model.hlme.beta <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', data=hdp, link='beta',
convB = 1e-01, convL = 1e-01, convG = 1e-01, maxiter=200); model.hlme.beta$conv
link.beta <- predictlink(model.hlme.beta, ndraws=2000)
plot(link.beta, add=TRUE, col=col[2], lty=2, lwd=2)
legend(x="left", legend=c("95% confidence bands", "for BETA fit"), lty=c(3,NA), col=c(col[2],NA), bty="n", lwd=1)

===Footnotes===
*1 http://cran.r-project.org/web/packages/rpart/index.html
*2 http://www.mayo.edu/hsr/techrpt/61.pdf
*3 http://dx.doi.org/10.1371/journal.pone.0027608
*4 http://www.nature.com/pr/journal/v73/n3/abs/pr2012189a.html
*5 http://stat-www.berkeley.edu/users/breiman/RandomForests/

===[[SMHS_MethodsHeterogeneity_MetaAnalysis|Next see: Meta-Analysis]]===
* [[SMHS_MethodsHeterogeneity|Back to the Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research section]]

<hr>
* SOCR Home page: http://www.socr.umich.edu

{{translate|pageName=http://wiki.socr.umich.edu/index.php/SMHS_MethodsHeterogeneity_HTE}}

SMHS MethodsHeterogeneity HTE

2016-05-23T18:31:00Z

Pineaumi: /* Example Fifth Dutch growth study */

==[[SMHS_MethodsHeterogeneity| Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research]] - Methods and Approaches for HTE Analytics ==

===Overview===

Recursive partitioning is a data mining technique for exploring structure and patterns in complex data. It facilitates the visualization of decision rules for predicting categorical (classification tree) or continuous (regression tree) outcome variables. The R rpart package<sup>1</sup> provides the tools for Classification and Regression Tree (CART) modeling; conditional inference trees and random forests are available in other packages (e.g., party and randomForest). Additional resources include An Introduction to Recursive Partitioning Using the RPART Routines<sup>2</sup>. The Appendix includes a description of the main CART analysis steps.

install.packages("rpart")
library("rpart")
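
As a quick orientation to rpart for a continuous outcome, the following sketch grows a regression tree on the built-in mtcars data. The choice of data, predictors and minsplit = 10 is an assumption made for illustration only.

```r
library(rpart)

# Regression tree (method = "anova") for fuel efficiency
fit.reg <- rpart(mpg ~ wt + hp + disp, data = mtcars, method = "anova",
                 control = rpart.control(minsplit = 10))
print(fit.reg)   # text listing of the splits and leaf means

# Each prediction is the mean outcome of the leaf the case falls into
pred <- predict(fit.reg, newdata = mtcars)
n.leaves <- sum(fit.reg$frame$var == "<leaf>")

# Training-set root mean squared error (optimistic, no hold-out data here)
rmse <- sqrt(mean((mtcars$mpg - pred)^2))
c(leaves = n.leaves, rmse = rmse)
```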

===CART===
Classification and Regression Tree (CART) is a decision-tree based technique that considers how variation observed in a given response variable (continuous or categorical) can be understood through a systematic deconstruction of the overall study population into subgroups, using explanatory variables of interest. For HTE analysis, CART is best suited for early-stage, exploratory analyses. Its relative simplicity can be powerful in identifying basic relationships between variables of interest, and thus identify potential subgroups for more advanced analyses. The key to CART is its ‘systematic’ approach to the development of the subgroups, which are constructed sequentially through repeated, binary splits of the population of interest, one explanatory variable at a time. In other words, each ‘parent’ group is divided into two ‘child’ groups, with the objective of creating increasingly homogeneous subgroups. The process is repeated and the subgroups are then further split, until no additional variables are available for further subgroup development. The resulting tree structure is oftentimes overgrown, but additional techniques are used to ‘trim’ the tree to a point at which its predictive power is balanced against issues of over-fitting. Because the CART approach does not make assumptions regarding the distribution of the dependent variable, it can be used in situations where other multivariate modeling techniques often used for exploratory predictive risk modeling would not be appropriate – namely in situations where data are not normally distributed.

CART analyses are useful in situations where there is some evidence to suggest that HTE exists, but the subgroups defining the heterogeneous response are not well understood. CART allows for an exploration of response in a myriad of complex subpopulations, and more recently developed ensemble methods (such as Bayesian Additive Regression Trees) allow for more robust analyses through the combination of multiple CART analyses.

====Example Fifth Dutch growth study====

# Let’s use the Fifth Dutch growth study (2009) data, fdgs<sup>3</sup>. Is it true that “the world’s tallest nation has stopped growing taller: the height of Dutch children from 1955 to 2009”<sup>4</sup>?

#install.packages("mice")
library("mice")
?fdgs
head(fdgs)

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||ID ||Reg ||Age ||Sex ||HGT ||WGT ||HGT.Z ||WGT.Z
|-
|1 ||100001||West||13.09514||boy||175.5||75.0||1.751||2.410
|-
|2 ||100003||West||13.81793 ||boy||148.4||40.0||2.292||1.494
|-
|3 ||100004||West||13.97125||boy||159.9||46.5||0.743||0.783
|-
|4 ||100005||West||13.98220 ||girl||159.7||46.5 ||0.743 ||0.783
|-
|5||100006||West||13.52225||girl||160.3||47.8||0.414||0.355
|-
|6||100018||East||10.21492||boy||157.8||39.7||2.025||0.823
|}
</center>

summary(fdgs)

<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
|ID ||Reg ||Age ||Sex ||HGT
|-
|Min.:100001||North:732||Min.:0.008214||boy:4829||Min.:46.0
|-
|1st Qu.:106353||East:2528||1st Qu.:1.618754||girl:5201||1st Qu.:83.8
|-
|Median:203855||South:2931||Median:8.084873|| ||Median:131.5
|-
|Mean:180091||West:2578||Mean:8.157936|| ||Mean:123.9
|-
|3rd Qu.:210591||City:1261||3rd Qu.:13.547570|| ||3rd Qu.:162.3
|-
|Max.:401955|| ||Max.:21.993155|| ||Max.:208.0
|-
| || || || ||NA's: 23
|}
</center>

====(1) Classification Tree====

Let's use the data frame '''fdgs''' to predict Region, from Age, Height, and Weight.
# grow tree
fit.1 <- rpart(reg ~ age + hgt + wgt, method="class", data= fdgs[,-1])

printcp(fit.1) # display the results
plotcp(fit.1) # visualize cross-validation results
summary(fit.1) # detailed summary of splits

# plot tree
par(oma=c(0,0,2,0))
plot(fit.1, uniform=TRUE, margin=0.3, main="Classification Tree for Region (FDGS Data)")
text(fit.1, use.n=TRUE, all=TRUE, cex=1.0)

<center>[[Image:SMHS_Methods2.png|500px]] </center>

# create a better plot of the classification tree
post(fit.1, title = "Classification Tree for Region (FDGS Data)", file = "")

<center>[[Image:SMHS_Methods3.png|500px]] </center>

====(2) Pruning the tree====

pruned.fit.1 <- prune(fit.1, cp = fit.1$cptable[which.min(fit.1$cptable[,"xerror"]),"CP"])

# plot the pruned tree
plot(pruned.fit.1, uniform=TRUE, main="Pruned Classification Tree for Region (FDGS Data)")
text(pruned.fit.1, use.n=TRUE, all=TRUE, cex=1.0)
post(pruned.fit.1, title = "Pruned Classification Tree for Region (FDGS Data)")

Not much change, as the initial tree is not complex!

===Random Forests ===
Random forests may improve predictive accuracy by growing a large number of trees, each fit to a bootstrap sample of the cases and using a random subset of the predictors at each split. Each case is classified by every tree in this "forest", and the final prediction combines the results across all trees (an average in regression, a majority vote in classification). See the randomForest package<sup>5</sup>.

library(randomForest)
fit.2 <- randomForest(reg ~ age + hgt + wgt, na.action = na.omit, data= fdgs[,-1]) # reg is a factor, so a classification forest is grown
print(fit.2) # view results
importance(fit.2) # importance of each predictor
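
The "bootstrap many trees, then vote" idea can be sketched with rpart alone. Note the assumption: this is plain bagging on the built-in iris data, not the randomForest algorithm itself, which additionally samples a random subset of predictors at each split. The number of trees and the seed are arbitrary choices.

```r
library(rpart)

set.seed(2016)
n.trees <- 25
n <- nrow(iris)

# Fit one classification tree per bootstrap sample and collect its
# predictions for all cases (result: an n x n.trees character matrix)
votes <- sapply(seq_len(n.trees), function(b) {
  idx  <- sample(n, replace = TRUE)   # bootstrap sample of the rows
  tree <- rpart(Species ~ ., data = iris[idx, ], method = "class")
  as.character(predict(tree, newdata = iris, type = "class"))
})

# Majority vote across the trees for each case
bagged <- apply(votes, 1, function(v) names(which.max(table(v))))

# Training-set accuracy (optimistic; a real analysis would use
# out-of-bag or hold-out cases)
accuracy <- mean(bagged == iris$Species)
accuracy
```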

Note on missing values/incomplete data: If the data have missing values, we have 3 choices:

1. Use a different tool (rpart handles missing values well)

2. Impute the missing values

3. For a small number of missing cases, we can use na.action = na.omit

===Latent growth and growth mixture modeling (LGM/GMM)===

LGM and GMM represent structural equation modeling techniques that capture inter-individual differences in longitudinal change corresponding to a particular treatment. For instance, patients’ different timing patterns of the treatment effects may represent the underlying sources of HTE. LGM distinguishes whether (yes/no) and how (fast/slow, temporary/lasting) patients respond to treatment. The heterogeneous individual growth trajectories are estimated from intra-individual changes over time by examining common population parameters, i.e., slopes, intercepts, and error variances. Suppose each individual has a unique initial status (intercept) and response rate (slope) during a specific time interval. Then the variances of the individuals’ baseline measures (intercepts) and changes (slopes) in health outcomes will represent the degree of HTE. The LGM-identified HTE of individual growth curves can be attributed to observed predictors, including both fixed and time-varying covariates.

LGM assumes that all individuals are from the same population (too restrictive in some cases). If the HTE is due to observed demographic variables, such as age, gender, and marital status, one may utilize multiple-group LGM. Despite its successful applications for modeling longitudinal change, there may be multiple subpopulations with unobserved heterogeneities. Growth mixture modeling (GMM) extends LGM to allow the identification and prediction of unobserved subpopulations in longitudinal data analysis. Each unobserved subpopulation may constitute its own latent class and behave differently than individuals in other latent classes. Within each latent class, there are also different trajectories across individuals; however, different latent classes don’t share common population parameters. Suppose we are interested in studying retirees’ psychological well-being change trajectory when multiple unknown subpopulations exist. We can add another layer (a latent class variable) on the LGM framework so that the unobserved latent classes can be inferred from the data. The covariates in GMM are designed to affect growth factors distinctly across different latent classes. Therefore, there are two types of HTE: 1) the latent class variable in GMM divides individuals into groups with different growth curves; and 2) coefficient estimates vary across latent classes.

Latent variables are not directly observed – they are inferred (via a model) from other actually observed and directly measured variables. Models that explain observed variables in terms of latent variables are called latent variable models. When the latent (unobserved) variable is discrete, it is referred to as a latent class variable.
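
A small simulation may help make the notion of a latent class concrete. In the sketch below, all numbers (class proportions, class-specific intercepts and slopes, noise levels) are made up for illustration. Each subject follows a linear growth curve whose mean trajectory depends on an unobserved class label; the label is not stored in the analysis data set, yet it leaves a visible trace in the subject-specific slopes.

```r
set.seed(42)
n.subj <- 200
times  <- 0:4

# Unobserved class membership: about 70% class 1 and 30% class 2
latent.class <- sample(1:2, n.subj, replace = TRUE, prob = c(0.7, 0.3))

# Class-specific mean trajectories (hypothetical values)
class.intercept <- c(10, 20)
class.slope     <- c(0.5, -2)

# Subject-specific random intercepts and slopes around the class means
b0 <- class.intercept[latent.class] + rnorm(n.subj, sd = 1)
b1 <- class.slope[latent.class]     + rnorm(n.subj, sd = 0.3)

# Long-format longitudinal data; the class label is deliberately left out
long <- data.frame(ID   = rep(seq_len(n.subj), each = length(times)),
                   time = rep(times, times = n.subj))
long$y <- b0[long$ID] + b1[long$ID] * long$time + rnorm(nrow(long), sd = 0.5)

# Per-subject least-squares slopes cluster around the two class means
slopes <- sapply(split(long, long$ID),
                 function(d) coef(lm(y ~ time, data = d))[["time"]])
slope.means <- tapply(slopes, latent.class, mean)
slope.means
```

A growth mixture model works in the opposite direction: it starts from data like long and infers both the class-specific trajectories and each subject's posterior class probabilities.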

Breast Cancer Example: Recall the lmer function (lme4 package) from the earlier review discussions, where Linear Mixed Models (LMM) are used for longitudinal data to examine change over time of outcomes relative to predictive covariates. LMM assumptions include:

(i) continuous longitudinal outcome

(ii) Gaussian random-effects and errors

(iii) linearity of the relationships with the outcome

(iv) homogeneous population

(v) missing at random data

The objectives of LGM/GMM models (see Latent Class Mixed Models, lcmm R package) are to extend the linear mixed model estimation to:

(i) heterogeneous populations (relax (iv) above). Use hlme for latent class linear mixed models (i.e. Gaussian continuous outcome)

(ii) other types of longitudinal outcomes : ordinal, (bounded) quantitative non-Gaussian outcomes (relax (i), (ii), (iii), (iv)). Use lcmm for general latent class mixed models with outcomes of different nature

(iii) joint analysis of a longitudinal outcome and a time-to-event (relax (iv), (v)). Use Jointlcmm for joint latent class models with a longitudinal outcome and a right-censored (left-truncated) time-to-event

Let’s use these data (http://www.ats.ucla.edu/stat/data/hdp.csv), representing cancer phenotypes and predictors (e.g., "IL6", "CRP", "LengthofStay", "Experience") and outcome measures (e.g., remission) collected on patients, nested within doctors (DID) and within hospitals (HID).

We can illustrate the latent class linear mixed models implemented in hlme through a study of the quadratic trajectories of the response (remission) with TumorSize, adjusting for CO2*Pain interaction and assuming correlated random-effects for the functions of SmokingHx and Sex. To estimate the corresponding standard linear mixed model using 1 latent class where CO2 interacts with Pain:

# install.packages("lcmm")
library("lcmm")

hdp <- read.csv("http://www.ats.ucla.edu/stat/data/hdp.csv")
hdp <- within(hdp, {
Married <- factor(Married, levels = 0:1, labels = c("no", "yes"))
DID <- factor(DID)
HID <- factor(HID)
})

# add a new subject ID column (last column in the data, "ID"); this is required by the hlme call
hdp$ID <- seq.int(nrow(hdp))

model.hlme <- hlme(remission ~ IL6 + CRP + LengthofStay + Experience + I(tumorsize^2) + co2*pain + I(tumorsize^2)*pain, random=~ SmokingHx + Sex, subject='ID', data=hdp, ng=1)
summary(model.hlme)

Heterogenous linear mixed model
fitted by maximum likelihood method

hlme(fixed = remission ~ IL6 + CRP + LengthofStay + Experience +
I(tumorsize^2) + co2 * pain + I(tumorsize^2) * pain, random = ~SmokingHx +
Sex, subject = "ID", ng = 1, data = hdp)

Statistical Model:
Dataset: hdp
Number of subjects: 8525
Number of observations: 8525
Number of latent classes: 1
Number of parameters: 21

Iteration process:
Convergence criteria satisfied
Number of iterations: 34
Convergence criteria: parameters= 1.2e-09
: likelihood= 8.3e-06
: second derivatives= 2.7e-05

Goodness-of-fit statistics:
maximum log-likelihood: -5223.9
AIC: 10489.79
BIC: 10637.86

Maximum Likelihood Estimates:

<center>Fixed effects in the Longitudinal Model:

{| class="wikitable" style="text-align:center; " border="1"
|-
| ||coef||Se||Wald||p-value
|-
|Intercept||0.28636||0.24314||1.178||0.23890
|-
|IL6||-0.01134||0.00183||-6.184||0.00000
|-
|CRP||-0.00674||0.00167||-4.043||0.00005
|-
|LengthofStay||-0.04834||0.00463||-10.436||0.00000
|-
|Experience||0.01695||0.00119||14.263||0.00000
|-
|I(tumorsize^2)||0.00000||0.00001||-0.076||0.93953
|-
|co2||-0.03549||0.16204||-0.219||0.82663
|-
|pain||0.03930||0.04278||0.919||0.35832
|-
|co2:pain||-0.01489||0.02871||-0.519||0.60395
|-
|I(tumorsize^2):pain||0.00000||0.00000||0.553||0.58045
|}
</center>

<center>Variance-covariance matrix of the random-effects

{| class="wikitable" style="text-align:center; " border="1"
|-
| ||intercept||SmokingHxformer||SmokingHxnever||Sexmale
|-
|intercept||0.19310943|| || ||
|-
|SmokingHxformer||-0.10617988||0.209155186|| ||
|-
|SmokingHxnever||-0.12388534||0.068342049||2.262655e-01||
|-
|Sexmale||-0.08130975||-0.007353491||-1.873934e-05||0.1730187

|}
</center>

Residual standard error:

coef: 0.1299767

se: 1.187426

Results interpretation:

(1) The first part of the summary provides information about the dataset, the number of subjects, observations, observations deleted (since by default, missing observations are deleted), number of latent classes and number of parameters.

(2) Next, details about the algorithm convergence are provided, including the number of iterations, the convergence criteria, and a message indicating whether the model converged correctly: "Convergence criteria satisfied".

(3) The maximum log-likelihood, Akaike criterion (AIC) and Bayesian Information criterion (BIC) are reported.

(4) The parameter estimates, their estimated standard errors, the Wald test statistics (with Normal approximation) and the corresponding p-values are reported next.

(5) For the random-effect distribution, the estimated matrix of covariance of the random-effects is displayed.

(6) The standard error of the residuals is given along with its estimated standard error.

(7) Neither the quadratic TumorSize term nor Pain appears to be associated with Remission (both have large p-values). Their joint effect may be formally assessed using a multivariate Wald test:

WaldMult(model.hlme, pos=c(6,8))
# pos - a vector containing the indices in model.hlme of the parameters to test
Wald Test p_value
I(tumorsize^2) = pain = 0 0.85562 0.65193
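
The p-values reported above can be checked by hand. The following is a minimal sketch in base R, assuming that the univariate Wald statistics are compared to a standard Normal distribution and that the multivariate statistic is compared to a chi-square distribution with as many degrees of freedom as tested parameters. The helper name wald_p is hypothetical, and the statistics are copied (already rounded) from the output above, so small discrepancies are expected.

```r
# Hypothetical helper: two-sided p-value for a univariate Wald statistic z = coef / se
wald_p <- function(z) 2 * pnorm(-abs(z))

# Wald statistics copied (rounded) from the fixed-effects table above
z_reported <- c(CRP = -4.043, pain = 0.919, co2 = -0.219)
p_univariate <- wald_p(z_reported)

# Multivariate Wald test of 2 parameters: statistic compared to a chi-square with 2 df
p_joint <- pchisq(0.85562, df = 2, lower.tail = FALSE)

print(round(p_univariate, 5))
print(round(p_joint, 5))
```

Because the inputs are rounded, the recomputed univariate p-values agree with the table only to about three decimal places.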

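Next, we consider simpler models with tumorsize as the response and IL6, CRP and LengthofStay as fixed effects; the models with more than one class also include a class-specific effect of SmokingHx (specified via mixture). Below we estimate the corresponding models for a varying number of latent classes (from 1 to 3) using the default initial values: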

# Initial Model: model.hlme <- hlme(remission ~ IL6 + CRP + LengthofStay + Experience + I(tumorsize^2) + co2*pain + I(tumorsize^2)*pain, random=~ SmokingHx + Sex, subject='ID', data=hdp, ng=1)

model.hlme.1 <- hlme(tumorsize ~ IL6 + CRP + LengthofStay, subject='ID', data=hdp, ng=1)
model.hlme.2 <- hlme(tumorsize ~ IL6 + CRP + LengthofStay + SmokingHx, mixture=~ SmokingHx, subject='ID', data=hdp, ng=2)
model.hlme.3 <- hlme(tumorsize ~ IL6 + CRP + LengthofStay + SmokingHx, mixture=~ SmokingHx, subject='ID', data=hdp, ng=3)

The estimation process for a varying number of latent classes can be summarized with summarytable, which gives the log-likelihood, the number of parameters, the Bayesian Information Criterion, and the posterior proportion of each class:

summarytable(model.hlme.1, model.hlme.2, model.hlme.3)
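<center>
{| class="wikitable" style="text-align:center; " border="1"
|-
| ||G||loglik||npm||BIC||%class1||%class2||%class3
|-
|model.hlme.1||1||-33301.82||5||66648.89||100.000000|| ||
|-
|model.hlme.2||2||-31592.79||11||63285.15||99.214076||0.7859238||
|-
|model.hlme.3||3||-31589.55||15||63314.86||6.357771||82.2991202||11.34311
|}
</center>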

The program took 404.65 seconds
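
The BIC values in the summarytable output can be reproduced from the log-likelihood and the number of parameters. The sketch below assumes the usual definition, BIC = -2 loglik + npm log(N), with N = 8525 subjects (here equal to the number of observations). The data frame fits and its column names are illustrative; the numbers are copied from the output above.

```r
# Values copied from the summarytable() output above
fits <- data.frame(
  model        = c("model.hlme.1", "model.hlme.2", "model.hlme.3"),
  loglik       = c(-33301.82, -31592.79, -31589.55),
  npm          = c(5, 11, 15),
  bic_reported = c(66648.89, 63285.15, 63314.86),
  stringsAsFactors = FALSE
)
n_subjects <- 8525  # number of subjects reported by summary()

# BIC = -2 * log-likelihood + (number of parameters) * log(number of subjects)
fits[["bic_recomputed"]] <- -2 * fits[["loglik"]] + fits[["npm"]] * log(n_subjects)

# The preferred model is the one with the smallest BIC
best <- fits[["model"]][which.min(fits[["bic_recomputed"]])]

print(fits)
print(best)
```

The recomputed values differ from the reported ones only through rounding of the printed log-likelihoods.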

In this example, the optimal number of latent classes according to the BIC is two (the smallest BIC). The posterior classification is described with:

postprob(model.hlme.2)

Posterior classification:
class1 class2
N 8458.00 67.00
% 99.21 0.79

Posterior classification table:
--> mean of posterior probabilities in each class
prob1 prob2
class1 0.8555 0.1445
class2 0.4362 0.5638

Posterior probabilities above a threshold (%):
class1 class2
prob>0.7 92.48 2.99
prob>0.8 77.38 0.00
prob>0.9 38.53 0.00
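
To make the three blocks of this output easier to read, the sketch below recomputes the same kinds of summaries from a small matrix of posterior probabilities. The matrix pp contains made-up values for 6 hypothetical subjects (it is not taken from hdp), and the code is an approximate re-implementation of the idea in base R, not the code used by postprob.

```r
# Toy posterior probabilities for 6 subjects and 2 classes (illustrative values only)
pp <- matrix(c(0.95, 0.05,
               0.85, 0.15,
               0.72, 0.28,
               0.60, 0.40,
               0.25, 0.75,
               0.45, 0.55),
             ncol = 2, byrow = TRUE,
             dimnames = list(NULL, c("prob1", "prob2")))

# Each subject is assigned to the class with the highest posterior probability
assigned <- max.col(pp, ties.method = "first")

# Posterior classification: class sizes and percentages
n_class <- table(factor(assigned, levels = 1:2))
pct_class <- 100 * n_class / nrow(pp)

# Posterior classification table: mean posterior probabilities
# among the subjects assigned to each class (one row per class)
class_table <- t(sapply(1:2, function(g) colMeans(pp[assigned == g, , drop = FALSE])))

# Percentage of subjects in each class whose own-class probability exceeds 0.7
own_prob <- pp[cbind(seq_len(nrow(pp)), assigned)]
above_07 <- sapply(1:2, function(g) 100 * mean(own_prob[assigned == g] > 0.7))

print(n_class)
print(class_table)
print(above_07)
```

Diagonal entries of class_table close to 1 and high percentages above the thresholds indicate well-separated classes.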

In the postprob output for model.hlme.2, the first class includes a posteriori 8458 subjects (99.21%) while class 2 includes 67 subjects (0.79%). Subjects classified in class 1 have a mean posterior probability of belonging to that class of 0.8555 (85.55%); for class 2 the corresponding mean is only 0.5638.

Among the subjects assigned to class 1, 92.48% were classified with a posterior probability above 0.7, whereas only 2.99% of the subjects assigned to class 2 reached that threshold, so membership in class 2 is poorly discriminated. Goodness-of-fit of the model can be assessed by displaying the residuals and the mean predictions of the model (see the figures below), according to the time variable given in var.time:

plot(model.hlme.2)
# Figure (left panel)
plot(model.hlme.2, which="fit", var.time="Age", bty="l", ylab=" Remission ", xlab="Age", lwd=2)
# Figure (right panel)
plot(model.hlme.2, which="fit", var.time="Age", bty="l", ylab=" Remission ", xlab="Age", lwd=2, marg=FALSE)

<center>[[Image:SMHS_Methods4.png|500px]] </center>

<center>[[Image:SMHS_Methods5.png|500px]] </center>

<center>[[Image:SMHS_Methods6.png|500px]] </center>

The latent process mixed models implemented in lcmm are illustrated through the study of the linear trajectory of ntumors with Age adjusted for Sex and assuming correlated random-effects for the intercept and Age. The following lines estimate the corresponding latent process mixed model with different link functions:

model.hlme.lin <- lcmm(ntumors ~ Age*Sex, random=~ Age ,subject='ID', data=hdp)
model.hlme.beta <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', data=hdp, link='beta')
model.hlme.spl <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', data=hdp, link='splines')
model.hlme.spl5q <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', data=hdp, link='5-quant-splines')

link: An optional family of link functions to estimate. The available options are:

*"linear" (the default) specifies a linear link function leading to a standard linear mixed model (homogeneous or heterogeneous, as estimated in hlme).
*"beta" estimates a link function from the family of Beta cumulative distribution functions.
*"thresholds" uses a threshold model to describe the correspondence between each level of an ordinal outcome and the underlying latent process.
*"splines" approximates the link function by I-splines. In this case, the number of nodes and their location should also be specified: the number of nodes is entered first, followed by -, then the location ("equi", "quant" or "manual" for, respectively, equidistant nodes, nodes at quantiles of the marker distribution, or interior nodes entered manually in the argument intnodes), followed by - and finally "splines". For example, "7-equi-splines" means I-splines with 7 equidistant nodes, "6-quant-splines" means I-splines with 6 nodes located at the quantiles of the marker distribution, and "9-manual-splines" means I-splines with 9 nodes, the vector of 7 interior nodes being entered in the argument intnodes.
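
The naming scheme for spline links described above can be sketched with a small helper. The function make_spline_link is hypothetical (it is not part of lcmm); it only assembles the character string passed to the link argument and checks that, for manual placement, the number of interior nodes is the number of nodes minus 2, as in the "9-manual-splines" example.

```r
# Hypothetical helper (not part of lcmm): build an I-splines link string
make_spline_link <- function(n_nodes, location = c("equi", "quant", "manual"),
                             intnodes = NULL) {
  location <- match.arg(location)
  if (location == "manual") {
    # Following the example in the text: 9 nodes go with 7 interior nodes
    stopifnot(length(intnodes) == n_nodes - 2)
  }
  paste(n_nodes, location, "splines", sep = "-")
}

# Strings in the format expected by the link argument
link_equi   <- make_spline_link(7, "equi")
link_quant  <- make_spline_link(5, "quant")
link_manual <- make_spline_link(9, "manual", intnodes = seq(1, 7))  # placeholder node values

print(c(link_equi, link_quant, link_manual))
```

The resulting string would then be supplied as link=link_quant (for example) in a call to lcmm, together with intnodes when the location is "manual".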

summary(model.hlme.lin)

General latent class mixed model fitted by maximum likelihood method

lcmm(fixed = ntumors ~ Age * Sex, random = ~Age, subject = "ID",data = hdp)

Statistical Model:
Dataset: hdp
Number of subjects: 8525
Number of observations: 8525
Number of latent classes: 1
Number of parameters: 8
Link function: linear

Iteration process:
Maximum number of iteration reached without convergence
Number of iterations: 100
Convergence criteria: parameters= 5.4e-10
: likelihood= 5.5e-10
: second derivatives= 1

Goodness-of-fit statistics:
maximum log-likelihood: -19915.24
AIC: 39846.49
BIC: 39902.89

Discrete posterior log-likelihood: 0
Discrete AIC: 16

Mean discrete AIC per subject: 9e-04
Mean UACV per subject: 0
Mean discrete LL per subject: 0

Maximum Likelihood Estimates:

Fixed effects in the longitudinal model:

coef Se Wald p-value
intercept (not estimated) 0.00000
Age 0.09491
Sexmale -0.66303
Age:Sexmale 0.01132

Variance-covariance matrix of the random-effects:
intercept Age
intercept 20.5013715
Age -0.2889814 0.007696382

Residual standard error (not estimated) = 1

Parameters of the link function:

coef Se Wald p-value
Linear 1 (intercept) -0.36768
Linear 2 (std err) 0.71432

Objects model.hlme.lin, model.hlme.beta, model.hlme.spl and model.hlme.spl5q are latent process mixed models that assume the exact same trajectory for the underlying latent process but use, respectively, a linear link, a Beta CDF link, I-splines with 5 equidistant knots (the default with link='splines') and I-splines with 5 knots at quantiles. model.hlme.lin reduces to a standard linear mixed model (link='linear' by default). The only difference from an hlme object is the parameterization of the intercept and the residual standard error, which are treated as rescaling parameters.

col <- rainbow(4)
plot(model.hlme.lin, which="linkfunction", bty='l', ylab="Number-of-Tumors", col=col[1], lwd=2, xlab="underlying latent process")
plot(model.hlme.beta, which="linkfunction", add=T, col=col[2], lwd=2)
plot(model.hlme.spl, which="linkfunction", add=T, col=col[3], lwd=2)
plot(model.hlme.spl5q, which="linkfunction", add=T, col=col[4], lwd=2)
legend(x="topleft",legend=c("linear", "beta","splines (5equidistant)", "splines (5 at quantiles)"), lty=1,col=col,bty="n",lwd=2)

# to obtain confidence bands use function predictlink
link.lin <- predictlink(model.hlme.lin, ndraws=2000)

Error in predictlink.lcmm(model.hlme.spl, ndraws = 2000):
No confidence intervals can be produced since the program did not converge properly

model.hlme.lin$\$$conv # double-check the convergence of the algorithm
[1] 2
# status of convergence:
# =1 if the convergence criteria were satisfied,
# =2 if the maximum number of iterations was reached,
# =4 or 5 if a problem occurred during optimisation
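
A check of the convergence status before requesting confidence bands can be sketched as follows. The helper describe_conv and the object mock_fit are hypothetical: mock_fit is a plain list standing in for a fitted lcmm object, of which only the conv element is used, and the status codes are those listed above.

```r
# Hypothetical helper (not part of lcmm): translate a conv status code into text
describe_conv <- function(code) {
  switch(as.character(code),
    "1" = "convergence criteria satisfied",
    "2" = "maximum number of iterations reached",
    "4" = ,
    "5" = "problem occurred during optimisation",
    "unknown status")
}

# Mock object standing in for a fitted model; only the conv element is used
mock_fit <- list(conv = 2)
status <- describe_conv(mock_fit[["conv"]])

# Only a status of 1 indicates that confidence bands can be computed
if (mock_fit[["conv"]] != 1) {
  message("Confidence bands unavailable: ", status)
}
```

With a real fit, the same test on the conv element would be placed in front of the predictlink call.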

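# Refit with relaxed convergence thresholds (convB, convL, convG) and a larger iteration limit (maxiter)
model.hlme.lin <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', epsY = 0.5, convB = 1e-01, convL = 1e-01, convG = 1e-01, maxiter=200, data=hdp)
model.hlme.lin$\$$conv

# Now that the model reports convergence (under the relaxed criteria), we can obtain confidence bands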
link.lin <- predictlink(model.hlme.lin, ndraws=2000)

# plot(model.hlme.lin, which="linkfunction", bty='l', ylab="Number-of-Tumors", col=col[1], lwd=2, xlab="underlying latent process")
plot(link.lin, add=TRUE, col=col[1], lty=2, lwd=2)
legend(x="left", legend=c("95% confidence bands", "for linear fit"), lty=c(2,NA), col=c(col[1],NA), bty="n", lwd=2)

<center>[[Image:SMHS_Methods7.png|500px]] </center>

# Repeat using the other link functions … model.hlme.beta, model.hlme.spl, …
model.hlme.beta <- lcmm(ntumors ~ Age*Sex, random=~ Age, subject='ID', data=hdp, link='beta',
convB = 1e-01, convL = 1e-01, convG = 1e-01, maxiter=200)
model.hlme.beta$\$$conv
link.beta <- predictlink(model.hlme.beta, ndraws=2000)
plot(link.beta, add=TRUE, col=col[2], lty=2, lwd=2)
legend(x="left", legend=c("95% confidence bands", "for BETA fit"), lty=c(2,NA), col=c(col[2],NA), bty="n", lwd=2)

===Footnotes===
*1 http://cran.r-project.org/web/packages/rpart/index.html
*2 http://www.mayo.edu/hsr/techrpt/61.pdf
*3 http://dx.doi.org/10.1371/journal.pone.0027608
*4 http://www.nature.com/pr/journal/v73/n3/abs/pr2012189a.html
*5 http://stat-www.berkeley.edu/users/breiman/RandomForests/

===[[SMHS_MethodsHeterogeneity_MetaAnalysis|Next see: Meta-Analysis]]===
* [[SMHS_MethodsHeterogeneity|Back to the Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research section]]

<hr>