SMHS MethodsHeterogeneity

From SOCR
Revision as of 08:08, 14 March 2016 by Dinov (talk | contribs)

Scientific Methods for Health Sciences - Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research

Methods and Approaches for HTE Analytics ****
| | Meta-analysis | CART* | N of 1 trials | LGM/GMM** | QTE*** | Nonparametric | Predictive risk models |
| Intent of the Analysis | Exploratory and confirmatory | Exploratory | Exploratory and initial testing | Exploratory, initial testing, and confirmatory | Exploratory, initial testing, and confirmatory | Exploratory and confirmatory | Initial testing and confirmatory |
| Data Structure | Trial summary results, possibly with subgroup results | Panel or cross-section | Repeated measures for a single patient: time series | Time series and panel | Panel and cross-sectional | Panel, time series, and cross-sectional | Panel or cross-sectional |
| Data Size Consideration | Advantage of combining small sample sizes | Large sample sizes | Small sample sizes | LGM: small to large sample sizes; GMM: large sample sizes | Moderate to large sample sizes | Large sample sizes | Sample size depends on the specific risk function |
| Key Strength(s) | Increase statistical power by pooling of results; Possible to identify HTE across trials; Possibility to measure and explain covariate's effect on treatment effect | Does not require assumptions around normality of distribution; Can utilize different types of response variables | Patient is own control; Estimates patient-specific effects | Accounting for unobserved characteristics; Heterogeneous response across time | Robust to outcome outliers; Heterogeneous response across quantiles | No functional form assumptions; Flexible regressions | Multivariate approach to identifying risk factors or HTE |
| Key Limitation(s) | Included studies need to be similar enough to be meaningful; Assumed distribution; Selection bias | Fairly sensitive to changes in underlying data; May not fully identify additive impacts of multiple variables | Requires de novo study; Not applicable to all conditions or treatments | Criteria for optimization solutions not clear | Treatment effect designed for a quantile, not a specific patient | Computationally demanding; Smoothing parameters required for kernel methods | May be more or less interpretable or useful clinically |
  • *CART: Classification and regression tree (CART) analysis
  • ** LGM/GMM: Latent growth modeling/Growth mixture modeling.
  • *** QTE: Quantile Treatment Effect.
  • **** Standard meta-analysis methods, such as fixed and random effects models and tests of heterogeneity, together with various plots and summaries, are available in the R package rmeta. Nonparametric approaches in R are available in the np package.

Additional details are provided in a paper entitled From concepts, theory, and evidence of heterogeneity of treatment effects to methodological approaches: a primer.

HTE Analytics, Latent growth and growth mixture modeling (LGM/GMM)

Meta-analysis

Meta-analysis is an approach to combine treatment effects across trials or studies into an aggregated treatment effect with higher statistical power than that of any individual trial. It may detect HTE by testing for differences in treatment effects across similar RCTs. It requires that the individual treatment effects are similar enough for pooling to be meaningful. In the presence of large clinical or methodological differences between the trials, it may be best to avoid meta-analysis altogether. The presence of HTE across studies in a meta-analysis may be due to differences in the design or execution of the individual trials (e.g., randomization methods, patient selection criteria). Cochran's Q is a method for detecting heterogeneity; it is computed as the weighted sum of squared differences between each study's treatment effect and the pooled effect across the studies. It is a barometer of inter-trial differences impacting the observed study results. A possible source of error in a meta-analysis is publication bias. Trial size may introduce publication bias since larger trials are more likely to be published. Language and accessibility represent other potential confounding factors. When the heterogeneity is not due to poor study design, it may be useful to optimize the treatment benefits for different cohorts of participants.

Cochran's Q statistic is the weighted sum of squares on a standardized scale. Under the null hypothesis of no heterogeneity, Q approximately follows a chi-square distribution with k − 1 degrees of freedom, where k is the number of studies. The corresponding p-value indicates the strength of the evidence for the presence of heterogeneity. The test may have low power to detect heterogeneity, so a cut-off of 0.10 is suggested for significance (Higgins et al., 2003). Conversely, the Q statistic may have too much power as a test of heterogeneity when the number of studies is large.
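As a rough illustration of the definition above, the following sketch computes an inverse-variance version of Cochran's Q by hand, together with the I2 statistic of Higgins et al. (2003). The study-level log odds ratios and variances are made-up numbers, and the variable names are only illustrative; packages such as meta report these quantities directly.

```r
# Hypothetical study-level estimates (log odds ratios) and their variances
log.or <- c(0.10, 0.30, 0.35, 0.65, 0.45, 0.15)
var.or <- c(0.03, 0.03, 0.05, 0.01, 0.05, 0.02)

w      <- 1 / var.or                          # inverse-variance weights
pooled <- sum(w * log.or) / sum(w)            # fixed-effect pooled estimate
Q      <- sum(w * (log.or - pooled)^2)        # weighted sum of squared deviations
df.Q   <- length(log.or) - 1                  # k - 1 degrees of freedom
p.Q    <- pchisq(Q, df = df.Q, lower.tail = FALSE)
I2     <- max(0, (Q - df.Q) / Q) * 100        # percent of variation beyond chance

round(c(pooled = pooled, Q = Q, df = df.Q, p.value = p.Q, I2 = I2), 3)
```

I2 is truncated at zero, so any Q below its degrees of freedom gives I2 = 0%.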

Simulation Example 1

# Install and Load library
install.packages("meta")
library(meta)

# Set number of studies
n.studies = 15

# number of treatments: case1, case2, control
n.trt = 3

# number of outcomes
n.event = 2

# simulate the (balanced) number of cases (case1 and case2) and controls in each study
ctl.group = rbinom(n = n.studies, size = 200, prob = 0.3)
case1.group = rbinom(n = n.studies, size = 200, prob = 0.3)
case2.group = rbinom(n = n.studies, size = 200, prob = 0.3)
# Simulate the number of outcome events (e.g., deaths) and no events in the control group
event.ctl.group = rbinom(n = n.studies, size = ctl.group, prob = rep(0.1, length(ctl.group)))
noevent.ctl.group = ctl.group - event.ctl.group

# Simulate the number of events and no events in the case1 group
event.case1.group = rbinom(n = n.studies, size = case1.group, prob = rep(0.5, length(case1.group)))
noevent.case1.group = case1.group - event.case1.group
# Simulate the number of events and no events in the case2 group
event.case2.group = rbinom(n = n.studies, size = case2.group, prob = rep(0.6, length(case2.group)))
noevent.case2.group = case2.group - event.case2.group
# Run the univariate meta-analysis using metabin() (meta-analysis of binary outcome data),
# which calculates fixed and random effects estimates (risk ratio, odds ratio, risk difference,
# or arcsine difference). Mantel-Haenszel (MH), inverse variance, and Peto methods are
# available for pooling.
# method = a character string indicating which method is used for pooling of studies,
#          e.g., "MH", "Inverse", or "Peto"
# sm = a character string indicating which summary measure is used for pooling of studies,
#      e.g., "OR" (odds ratio), "RR" (risk ratio), or "RD" (risk difference)
# Control vs. Case1; n.e and n.c are the numbers of observations in the experimental and control groups
meta.ctr_case1 <- metabin(event.e = event.case1.group, n.e = case1.group, event.c = event.ctl.group, 
n.c = ctl.group, method = "MH", sm = "OR")
# Here we use the odds ratio (OR), comparing the odds of the event (e.g., death) in the experimental and control groups
forest(meta.ctr_case1)
# [Figure: SMHS Methods8.png (forest plot, control vs. case1)]
# Control vs. Case2
meta.ctr_case2 <- metabin(event.e = event.case2.group, n.e = case2.group, event.c = event.ctl.group, 
n.c = ctl.group, method = "MH", sm = "OR")
forest(meta.ctr_case2)
# [Figure: SMHS Methods9.png (forest plot, control vs. case2)]
# Case1 vs. Case2
meta.case1_case2 <- metabin(event.e = event.case1.group, n.e = case1.group, event.c = event.case2.group, 
n.c = case2.group, method = "MH", sm = "OR")
forest(meta.case1_case2)
summary(meta.case1_case2)
Test of heterogeneity:
    Q 	d.f.  	p-value
11.99   	14   	0.6071
# [Figure: SMHS Methods10.png (forest plot, case1 vs. case2)]

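The heterogeneity test reported with the forest plot (Q = 11.99, d.f. = 14, p-value = 0.6071) provides no evidence to reject the null hypothesis of no between-study heterogeneity, so the fixed effects model is appropriate here. Because Q is smaller than its degrees of freedom, the corresponding I2 statistic, which is truncated at zero, equals 0%.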

Series of “N of 1” trials

This technique combines data from a series of n-of-1 trials to identify HTE. An n-of-1 trial is a repeated crossover trial for a single patient, which randomly assigns the patient to one treatment vs. another for a given time period, after which the patient is re-randomized to treatment for the next time period, usually repeated for 4-6 time periods. Such trials are most feasibly done in chronic conditions, where little or no washout period is needed between treatments and treatment effects are identifiable in the short term, such as pain or reliable surrogate markers. Combining data from identical n-of-1 trials across a set of patients enables statistical analysis that controls for patient fixed or random effects, covariates, centers, or sequence effects (see the figure below). These combined trials are often analyzed within a Bayesian context using shrinkage estimators that combine individual and group mean treatment effects to create a “posterior” individual mean treatment effect estimate, which is a form of inverse variance-weighted average of the individual and group effects. Such trials are typically more expensive than standard RCTs on a per-patient basis. However, they require much smaller sample sizes, often fewer than 100 patients (due to the efficient individual-as-own-control design), and create individual treatment effect estimates that are not possible in a non-crossover design. For the individual patient, the treatment effect can be re-estimated after each time period, and the trial stopped at any point when the more effective treatment is identified with reasonable statistical certainty.
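The shrinkage idea can be sketched with a simple normal-normal model in which the between-patient variance is treated as known. All numbers and variable names below are hypothetical, and a full Bayesian analysis would estimate the between-patient variance rather than fix it.

```r
# Hypothetical per-patient treatment-effect estimates (treatment minus control)
y    <- c(2.1, -0.4, 3.5, 1.2, 0.3)   # individual mean effects
s2   <- c(0.8, 1.5, 2.0, 0.5, 1.0)    # sampling variance of each individual estimate
tau2 <- 1.2                           # assumed between-patient variance of true effects

# Group mean effect: inverse-variance weighted average across patients
w.group <- 1 / (s2 + tau2)
mu      <- sum(w.group * y) / sum(w.group)

# Posterior individual effects: precision-weighted average of each
# patient's own estimate and the group mean
w.ind     <- (1 / s2) / (1 / s2 + 1 / tau2)
posterior <- w.ind * y + (1 - w.ind) * mu

data.frame(observed = y,
           weight.on.own.data = round(w.ind, 2),
           posterior = round(posterior, 2))
```

Patients whose own estimates are noisy (large s2) are pulled more strongly toward the group mean, while precisely estimated patients keep estimates close to their own data.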

Example: A study involving 8 participants collected data across 30 days, with 15 treatment days and 15 control days randomly assigned within each participant. The treatment is represented as a binary variable (control day=0; treatment day=1). The outcome variable represents the response to the intervention within each of the 8 participants. The study employed fixed-effects modeling, creating N − 1 dummy-coded variables to represent the N=8 participants, with the last (i=8) participant serving as the reference (i.e., as the model intercept). Each dummy-coded variable therefore represents the difference between participant i and the 8th participant, so all other participants' values are relative to the values of the 8th (reference) participant. The overall differences across participants in fixed effects can be evaluated with multiple degree-of-freedom F-tests.

[Figure: SMHS Methods11.png]
ID Day Tx SelfEff SelfEff25 WPSS SocSuppt PMss PMss3 PhyAct
1 1 1 33 8 0.97 5.00 4.03 1.03 53
1 2 1 33 8 -0.17 3.87 4.03 1.03 73
1 3 0 33 8 0.81 4.84 4.03 1.03 23
1 4 0 33 8 -0.41 3.62 4.03 1.03 36
... ... ... ... ... ... ... ... ... ...

Complete data is available in the Appendix.


Data Summary (model term: variable name)
  • Intercept: Constant
  • Physical Activity: PhyAct
  • Intervention: Tx
  • WP Social Support: WPSS
  • PM Social Support (1-3): PMss3
  • Self Efficacy: SelfEff25
rm(list=ls())
Nof1 <-read.table("https://umich.instructure.com/files/330385/download?download_frd=1&verifier=DwJUGSd6t24dvK7uYmzA2aDyzlmsohyaK6P7jK0Q", sep=",", header = TRUE)    # 02_Nof1_Data.csv
attach(Nof1)
head(Nof1)
ID Day Tx SelfEff SelfEff25 WPSS SocSuppt PMss PMss3 PhyAct
1 1 1 1 33 8 0.97 5.00 4.03 1.03 53
2 1 2 1 33 8 -0.17 3.87 4.03 1.03 73
3 1 3 0 33 8 0.81 4.84 4.03 1.03 23
4 1 4 0 33 8 -0.41 3.62 4.03 1.03 36
5 1 5 1 33 8 0.59 4.62 4.03 1.03 21
6 1 6 1 33 8 -1.16 2.87 4.03 1.03 0
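# Keep every variable used by the models below (ID, Day, and SelfEff are needed by lmer)
df.1 = data.frame(ID, Day, PhyAct, Tx, WPSS, PMss3, SelfEff, SelfEff25)
library("lme4")    # provides lmer(); run install.packages("lme4") first if needed
# Mixed model with random intercepts for day and for participant
lm.1 = lmer(PhyAct ~ Tx + SelfEff + Tx*SelfEff + (1|Day) + (1|ID), data = df.1)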
summary(lm.1)
Linear mixed model fit by REML ['lmerMod']
Formula: PhyAct ~ Tx + SelfEff + Tx * SelfEff + (1 | Day) + (1 | ID)
  Data: df.1
REML criterion at convergence: 8820
Scaled Residuals
Min 1Q Median 3Q Max


Random Effects
Groups Name Variance Std.Dev.
Day (Intercept) 0.0 0.00
ID (Intercept) 601.5 24.53
Residual 969.0 31.13

Number of obs: 900, groups: Day, 30; ID, 30


Fixed Effects
Estimate Std. Error t value
(Intercept) 38.3772 14.4738 2.651
Tx 4.0283 6.3745 0.632
SelfEff 0.5818 0.5942 0.979
Tx:SelfEff 0.9702 0.2617 3.708


Correlation of Fixed Effects
(Intr) Tx SlfEff
Tx -0.220
SelfEff -0.946 0.208
Tx:SelfEff 0.208 -0.946 -0.220


# Model:  PhyAct = Tx + WPSS + PMss3 + Tx*WPSS + Tx*PMss3 + SelfEff25 + Tx*SelfEff25 + ε
lm.2 = lm(PhyAct ~ Tx + WPSS + PMss3 + Tx*WPSS + Tx*PMss3 + SelfEff25 + Tx*SelfEff25, df.1) 
summary(lm.2)
Call:
lm(formula = PhyAct ~ Tx + WPSS + PMss3 + Tx * WPSS + Tx * PMss3 + 
   SelfEff25 + Tx * SelfEff25, data = df.1)
Residuals
Min 1Q Median 3Q Max
-102.39 -28.24 -1.47 25.16 122.41
Coefficients
Estimate Std. Error t value Pr(>|t|)
(Intercept) 52.0067 1.8080 28.764 < 2e-16 ***
Tx 27.7366 2.5569 10.848 < 2e-16 ***
WPSS 1.9631 2.4272 0.809 0.418853
PMss3 13.5110 2.7853 4.851 1.45e-06 ***
SelfEff25 0.6289 0.2205 2.852 0.004439 **
Tx:WPSS 9.9114 3.4320 2.888 0.003971 **
Tx:PMss3 8.8422 3.9390 2.245 0.025025 *
Tx:SelfEff25 1.0460 0.3118 3.354 0.000829 ***


[Using SAS (StudyI_Analyses.sas, StudyIIab_Analyses.sas)]

Type 3 Tests of Fixed Effects
Effect Num DF Den DF F Value Pr > F
Tx 1 224 67.46 <.0001
ID 7 224 25.95 <.0001
Tx*ID 7 224 2.92 0.0060
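
The SAS programs behind this table are not reproduced here. As a rough R counterpart, the sketch below fits the fixed-effects model described in the example (participants dummy-coded with the 8th participant as reference) to simulated data. The simulated values, seed, and effect sizes are hypothetical and are not the study's data. Note that anova() reports sequential F-tests, which agree with Type 3 tests only in balanced designs such as this one.

```r
set.seed(1)
n.id <- 8; n.day <- 30
sim <- expand.grid(Day = 1:n.day, ID = 1:n.id)   # Day varies fastest within ID

# Randomly assign 15 treatment and 15 control days within each participant
sim$Tx <- unlist(lapply(1:n.id, function(i) sample(rep(0:1, each = n.day / 2))))

# Hypothetical participant-specific baselines and treatment effects
base   <- rnorm(n.id, mean = 50, sd = 15)
effect <- rnorm(n.id, mean = 25, sd = 10)
sim$PhyAct <- base[sim$ID] + effect[sim$ID] * sim$Tx + rnorm(nrow(sim), sd = 30)

# Dummy-code participants, using the 8th participant as the reference level
sim$ID <- relevel(factor(sim$ID), ref = "8")

fit <- lm(PhyAct ~ Tx * ID, data = sim)
anova(fit)   # F-tests for Tx (1 df), ID (7 df), and Tx:ID (7 df)
```

With 8 participants and 30 days each, the model uses 16 coefficients, leaving 240 − 16 = 224 residual degrees of freedom, which matches the denominator degrees of freedom in the table above. The Tx:ID test asks whether the treatment effect differs across participants, i.e., whether there is HTE.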

Quantile Treatment Effect (QTE)

QTE employs quantile regression estimation (QRE) to examine the central tendency and statistical dispersion of the treatment effect in a population, which may not be revealed by the conventional mean estimation in RCTs. For instance, patients with different comorbidity scores may respond differently to a treatment. Quantile regression can reveal HTE according to the ranking of patients’ comorbidity scores or some other relevant covariate by which patients may be ranked. Therefore, in an attempt to inform patient-centered care, quantile regression provides more information on the distribution of the treatment effect than typical conditional mean treatment effect estimation. QTE characterizes the heterogeneous treatment effect on individuals and groups across various positions in the distributions of different outcomes of interest. This feature has drawn substantial attention to quantile regression analysis, which has been employed across a wide range of applications, particularly when evaluating the economic effects of welfare reform.

One caveat of applying QRE in clinical trials for examining HTE is that the QTE doesn’t demonstrate the treatment effect for a given patient. Instead, it focuses on the treatment effect among subjects at the qth quantile, such as those who are exactly at the top 10th percentile in terms of blood pressure or a depression score, for some covariate of interest such as comorbidity score. The patients at the qth quantile before and after the treatment are often two different sets of patients. For this reason, we have to assume that these two groups of patients are homogeneous if they are in the same quantile.
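To make the contrast with a mean effect concrete, the sketch below compares outcome quantiles between a simulated treated group and a simulated control group. The data are hypothetical: treatment is assumed to shift the mean and widen the spread of the outcome. With a single binary treatment and no covariates, these quantile differences are essentially what a quantile regression on the treatment indicator would estimate.

```r
set.seed(2)
n <- 2000
# Hypothetical outcomes: treatment raises the mean and increases the spread
y.control <- rnorm(n, mean = 10, sd = 2)
y.treated <- rnorm(n, mean = 11, sd = 4)

taus <- c(0.10, 0.25, 0.50, 0.75, 0.90)
qte  <- quantile(y.treated, taus) - quantile(y.control, taus)  # effect at each quantile
ate  <- mean(y.treated) - mean(y.control)                      # single average effect

round(qte, 2)
round(ate, 2)
```

Here the average effect is positive, yet the effect at the lowest quantiles is negative and the effect at the highest quantiles is much larger than the average. As the caveat above notes, each difference compares the two distributions at the same quantile and does not track individual patients.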

Income-Food Expenditure Example: Let’s examine the Engel data (N=235) on the relationship between food expenditure (foodexp) and household income (income). We can plot the data and then explore the superposition of the six fitted quantile regression lines.

install.packages("quantreg")
library(quantreg)
data(engel)
attach(engel)



head(engel)
income foodexp
1 420.1577 255.8394
2 541.4117 310.9587
3 901.1575 485.6800
4 639.0802 402.9974
5 750.8756 495.5608
6 945.7989 633.7978
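
The plotting step described above is not shown in this excerpt. A minimal sketch, assuming the six quantile levels are 0.05, 0.10, 0.25, 0.75, 0.90, and 0.95 (the text does not list them), uses rq() from quantreg to fit each line and superimposes the fits on the scatterplot, with the median and least-squares fits added for reference.

```r
library(quantreg)
data(engel)

# Six quantile levels (an assumption; the text does not specify them)
taus <- c(0.05, 0.10, 0.25, 0.75, 0.90, 0.95)

plot(engel$income, engel$foodexp, cex = 0.5,
     xlab = "Household income", ylab = "Food expenditure")

# Fit one quantile regression per level, draw its line, and keep its slope
slopes <- sapply(taus, function(tau) {
  fit <- rq(foodexp ~ income, tau = tau, data = engel)
  abline(fit, col = "gray40")
  coef(fit)["income"]
})

# Median regression and ordinary least squares for comparison
abline(rq(foodexp ~ income, tau = 0.5, data = engel), col = "blue")
abline(lm(foodexp ~ income, data = engel), col = "red", lty = 2)

round(setNames(slopes, taus), 3)
```

Comparing the slopes across quantile levels shows how the income effect on food expenditure varies over the expenditure distribution, which is the kind of heterogeneity a single mean regression line cannot display.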


