Difference between revisions of "SMHS MethodsHeterogeneity"

From SOCR
(Methods and Approaches for HTE Analytics)
 
(17 intermediate revisions by 2 users not shown)
Line 1: Line 1:
 
==[[SMHS| Scientific Methods for Health Sciences]] - Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research ==
 
  
<center>[[Image:SMHS_Methods1.png|500px]] </center>
+
==Methods and Approaches for HTE Analytics==
 
 
Adapted from: http://dx.doi.org/10.1186/1471-2288-12-185
 
 
 
* *CART: Classification and regression tree (CART) analysis
 
* ** LGM/GMM: Latent growth modeling/Growth mixture modeling.
 
* *** QTE: Quantile Treatment Effect.
 
* **** Standard meta-analysis methods, such as fixed- and random-effects models and tests of heterogeneity, together with various plots and summaries, can be found in the R package <b>rmeta</b> (http://cran.r-project.org/web/packages/rmeta). Non-parametric R approaches are included in the <b>np</b> package, http://cran.r-project.org/web/packages/np/vignettes/np.pdf.
 
 
 
===Methods Summaries===
 
 
 
<b>Overview</b>
 
 
 
Recursive partitioning is a data mining technique for exploring structure and patterns in complex data. It facilitates the visualization of decision rules for predicting categorical (classification tree) or continuous (regression tree) outcome variables. The R <b>rpart</b> package provides the tools for Classification and Regression Tree (CART) modeling; conditional inference trees and random forests are available in separate packages (for example, <b>party</b> and <b>randomForest</b>). Additional resources include an Introduction to Recursive Partitioning Using the RPART Routines. The <b>Appendix</b> includes a description of the main CART analysis steps.
 
 
 
<b>install.packages("rpart")</b>
 
<b>library("rpart")</b>
 
 
 
I. <b><u>CART</u></b> (Classification and Regression Tree) is a decision-tree based technique that considers how variation observed in a given response variable (continuous or categorical) can be understood through a systematic deconstruction of the overall study population into subgroups, using explanatory variables of interest. For HTE analysis, CART is best suited for early-stage, exploratory analyses. Its relative simplicity can be powerful in identifying basic relationships between variables of interest, and thus identify potential subgroups for more advanced analyses. The key to CART is its ‘systematic’ approach to the development of the subgroups, which are constructed sequentially through repeated, binary splits of the population of interest, one explanatory variable at a time. In other words, each ‘parent’ group is divided into two ‘child’ groups, with the objective of creating increasingly homogeneous subgroups. The process is repeated and the subgroups are then further split, until no additional variables are available for further subgroup development. The resulting tree structure is oftentimes overgrown, but additional techniques are used to ‘trim’ the tree to a point at which its predictive power is balanced against issues of over-fitting. Because the CART approach does not make assumptions regarding the distribution of the dependent variable, it can be used in situations where other multivariate modeling techniques often used for exploratory predictive risk modeling would not be appropriate – namely in situations where data are not normally distributed.
 
 
 
CART analyses are useful in situations where there is some evidence to suggest that HTE exists, but the subgroups defining the heterogeneous response are not well understood. CART allows for an exploration of response in a myriad of complex subpopulations, and more recently developed ensemble methods (such as Bayesian Additive Regression Trees) allow for more robust analyses through the combination of multiple CART analyses.
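
The exploratory use of CART for HTE described above can be sketched on simulated data. This is a minimal illustration under stated assumptions, not an analysis from the source: the data frame <b>sim</b>, its variables, and the effect size are all hypothetical, and the treatment is constructed to help only patients older than 60.

```r
library(rpart)

set.seed(1)
n <- 1000

# Hypothetical simulated trial (not real data): the treatment raises the
# outcome by 5 units, but only for patients older than 60.
sim <- data.frame(
  age   = runif(n, 30, 90),
  sex   = factor(sample(c("F", "M"), n, replace = TRUE)),
  treat = rbinom(n, 1, 0.5)
)
sim$y <- 10 + 5 * sim$treat * (sim$age > 60) + rnorm(n)

# Regression tree (method = "anova") for the continuous outcome y
fit.sim <- rpart(y ~ age + sex + treat, data = sim, method = "anova")

print(fit.sim)                  # splits and node means
used.vars <- unique(as.character(fit.sim$frame$var))
print(setdiff(used.vars, "<leaf>"))   # variables the tree actually split on
```

If the tree splits on both treat and age (with an age cut near 60), the leaves approximate the subgroup in which the treatment effect differs, which is the kind of candidate subgroup that would then be examined with more formal methods.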
 
 
 
<b>Example: Fifth Dutch growth study</b>
 
 
 
# Let’s use the Fifth Dutch growth study (2009) data, <b>fdgs</b>, from the mice package. Is it true that “the world’s tallest nation has stopped growing taller: the height of Dutch children from 1955 to 2009”?
 
 
 
#install.packages("mice")
 
library("mice")
 
?fdgs
 
head(fdgs)
 
  
 
<center>
 
<center>
{| class="wikitable" style="text-align:center; " border="1"
+
{| class="wikitable" style="text-align:center; width:99%" border="1"
|-
+
! colspan="8" |Methods and Approaches for HTE Analytics ****
| ||ID ||Reg ||Age ||Sex ||HGT ||WGT ||HGT.Z ||WGT.Z
 
 
|-
 
|-
|1 ||100001||West||13.09514||boy||175.5||75.0||1.751||2.410
+
|||Meta-analysis||CART*||N of 1 trials||LGM/GMM**||QTE***||Nonparametric||Predictive risk models
 
|-
 
|-
|2 ||100003||West||13.81793 ||boy||148.4||40.0||2.292||1.494
+
|Intent of the Analysis||Exploratory and confirmatory||Exploratory||Exploratory and initial testing||Exploratory, initial testing, and confirmatory||Exploratory, initial testing, and confirmatory||Exploratory and confirmatory||Initial testing and confirmatory
 
|-
 
|-
|3 ||100004||West||13.97125||boy||159.9||46.5||0.743||0.783
+
|Data Structure ||Trial summary results, possibly with subgroup results||Panel or cross-sectional||Repeated measures for a single patient: time series||Time series and panel||Panel and cross-sectional||Panel, time series, and cross-sectional||Panel or cross-sectional
 
|-
 
|-
|4 ||100005||West||13.98220 ||girl||159.7||46.5 ||0.743 ||0.783
+
|Data Size Consideration ||Advantage of combining small sample sizes||Large sample sizes||Small sample sizes||LGM: small to large sample sizes; GMM: large sample sizes ||Moderate to large sample sizes||Large sample sizes||Sample size depends on the specific risk function
 
|-
 
|-
|5||100006||West||13.52225||girl||160.3||47.8||0.414||0.355
+
|Key Strength(s)||Increase statistical power by pooling of results||Does not require assumptions around normality of distribution; Can utilize different types of response variables; Possible to identify HTE across trials; Possibility to measure and explain covariate's effect on treatment effect ||Patient is own control; Estimates patient-specific effects ||Accounting for unobserved characteristics; Heterogeneous response across time||Robust to outcome outliers; Heterogeneous response across quantiles||No functional form assumptions; Flexible regressions||Multivariate approach to identifying risk factors or HTE
 +
Estimates patient-specific effects
 
|-
 
|-
|6||100018||East||10.21492||boy||157.8||39.7||2.025||0.823
+
|Key Limitation(s)||Included studies need to be similar enough to be meaningful; Assumed distribution; Selection bias||Fairly sensitive to changes in underlying data; May not fully identify additive impacts of multiple variables||Requires de novo study; Not applicable to all conditions or treatments||Criteria for optimization solutions not clear||Treatment effect designed for a quantile, not a specific patient||Computationally demanding; Smoothing parameters required for kernel methods||May be more or less interpretable or useful clinically
 
|}
 
|}
 
</center>
 
</center>
  
summary(fdgs)       
+
Adapted from: http://dx.doi.org/10.1186/1471-2288-12-185
summary(fdgs)
 
  
<center>
+
* *CART: Classification and regression tree (CART) analysis
{| class="wikitable" style="text-align:center; " border="1"
+
* LGM/GMM: Latent growth modeling/Growth mixture modeling.
|-
+
* QTE: Quantile Treatment Effect.
|ID ||Reg ||Age ||Sex ||HGT
+
* Standard meta-analysis like fixed and random effect models, and tests of heterogeneity, together with various plots and summaries, can be found in the [http://cran.r-project.org/web/packages/rmeta R-package rmeta]. Non-parametric R approaches are included in the [http://cran.r-project.org/web/packages/np/vignettes/np.pdf np package].  
|-
 
|Min.:100001||North:732||Min.:0.008214||boy:4829||Min.:46.0
 
|-
 
|1st Qu.:106353||East:2528||1st Qu.:1.618754||girl:5201||1st Qu.:83.8
 
|-
 
|Median:203855||South:2931||Median:8.084873|| ||Median:131.5
 
|-
 
|Mean:180091||West:2578||Mean:8.157936|| ||Mean:123.9
 
|-
 
|3rd Qu.:210591||City:1261||3rd Qu.:13.547570|| ||3rd Qu.:162.3
 
|-
 
|Max.:401955|| ||Max.:21.993155|| ||Max.:208.0
 
|-
 
| || || || ||NA's: 23
 
|}
 
</center>
 
 
 
(1) Classification Tree
 
 
 
Let's use the data frame fdgs to predict Region, from Age, Height, and Weight.
 
# grow tree
 
fit.1 <- rpart(reg ~ age + hgt + wgt,  method="class", data= fdgs[,-1])
 
 
 
printcp(fit.1) # display the results
 
plotcp(fit.1) # visualize cross-validation results
 
summary(fit.1) # detailed summary of splits
 
 
 
# plot tree
 
par(oma=c(0,0,2,0))
 
plot(fit.1, uniform=TRUE,  margin=0.3, main="Classification Tree for Region (FDGS Data)")
 
text(fit.1, use.n=TRUE, all=TRUE, cex=1.0)
 
 
 
<center>[[Image:SMHS_Methods2.png|500px]] </center>
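
One way to check how well such a tree separates the regions is to compare predicted and observed classes. The sketch below refits the same model under a new, hypothetical name (fit.cls) so that it runs on its own; it assumes the mice package is installed and reports only the resubstitution (training-data) accuracy, which tends to be optimistic.

```r
library(rpart)
library(mice)   # provides the fdgs data frame

# Same model as above, refit here so this block is self-contained
fit.cls <- rpart(reg ~ age + hgt + wgt, method = "class", data = fdgs)

# Predicted region for every child in the data
pred.reg <- predict(fit.cls, newdata = fdgs, type = "class")

# Confusion matrix: observed region (rows) vs. predicted region (columns)
conf.mat <- table(observed = fdgs$reg, predicted = pred.reg)
print(conf.mat)

# Proportion classified correctly on the training data
acc <- mean(pred.reg == fdgs$reg, na.rm = TRUE)
print(acc)
```

A confusion matrix concentrated in one or two predicted columns would suggest that age, height, and weight carry little information about region.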
 
  
# create a better plot of the classification tree
+
Additional details are provided in a paper entitled [http://dx.doi.org/10.1186/1471-2288-12-185 From concepts, theory, and evidence of heterogeneity of treatment effects to methodological approaches: a primer].
post(fit.1, title = "Classification Tree for Region (FDGS Data)", file = "")
 
  
<center>[[Image:SMHS_Methods3.png|500px]] </center>
+
==[[SMHS_MethodsHeterogeneity_HTE |HTE Analytics, Latent growth and growth mixture modeling (LGM/GMM)]]==
  
(2) Pruning the tree
+
==[[SMHS_MethodsHeterogeneity_MetaAnalysis |Meta-analysis]]==
  
pruned.fit.1 <- prune(fit.1, cp = fit.1$cptable[which.min(fit.1$cptable[,"xerror"]),"CP"])
+
==[[SMHS_MethodsHeterogeneity_CER| Comparative Effectiveness Research (CER)]]==
  
# plot the pruned tree
+
<hr>
plot(pruned.fit.1, uniform=TRUE,  main="Pruned Classification Tree for Region (FDGS Data)")
+
* SOCR Home page: http://www.socr.umich.edu
text(pruned.fit.1, use.n=TRUE, all=TRUE, cex=1.0)
 
post(pruned.fit.1,  title = "Pruned Classification Tree for Region (FDGS Data)")
 
  
# not much change, as the initial tree is not complex!
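
The pruning call above selects the complexity parameter (CP) with the smallest cross-validated error. The same lookup can be unpacked step by step; the sketch below uses the kyphosis data that ships with rpart rather than fdgs, and the object names (fit.k, cp.tab, best.cp, pruned.k) are illustrative only.

```r
library(rpart)

set.seed(42)   # cross-validation folds are random, so fix the seed

# Classification tree on the kyphosis data bundled with rpart
fit.k <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

# cptable: one row per candidate subtree, with its CP and cross-validated error
cp.tab <- fit.k$cptable
print(cp.tab)

# Row with the smallest cross-validated error (xerror) and its CP value
best.row <- which.min(cp.tab[, "xerror"])
best.cp  <- cp.tab[best.row, "CP"]

# Prune back to the subtree associated with that CP
pruned.k <- prune(fit.k, cp = best.cp)
print(pruned.k)
```

Because xerror comes from random cross-validation folds, the selected CP can change from run to run unless the seed is fixed.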
+
{{translate|pageName=http://wiki.socr.umich.edu/index.php/SMHS_MethodsHeterogeneity_HTE}}

Latest revision as of 13:06, 23 May 2016

Scientific Methods for Health Sciences - Methods for Studying Heterogeneity of Treatment Effects, Case-Studies of Comparative Effectiveness Research

Methods and Approaches for HTE Analytics

{| class="wikitable" style="text-align:center; width:99%" border="1"
|-
! colspan="8" |Methods and Approaches for HTE Analytics ****
|-
| ||Meta-analysis||CART*||N of 1 trials||LGM/GMM**||QTE***||Nonparametric||Predictive risk models
|-
|Intent of the Analysis||Exploratory and confirmatory||Exploratory||Exploratory and initial testing||Exploratory, initial testing, and confirmatory||Exploratory, initial testing, and confirmatory||Exploratory and confirmatory||Initial testing and confirmatory
|-
|Data Structure||Trial summary results, possibly with subgroup results||Panel or cross-sectional||Repeated measures for a single patient: time series||Time series and panel||Panel and cross-sectional||Panel, time series, and cross-sectional||Panel or cross-sectional
|-
|Data Size Consideration||Advantage of combining small sample sizes||Large sample sizes||Small sample sizes||LGM: small to large sample sizes; GMM: large sample sizes||Moderate to large sample sizes||Large sample sizes||Sample size depends on the specific risk function
|-
|Key Strength(s)||Increase statistical power by pooling of results||Does not require assumptions around normality of distribution; Can utilize different types of response variables; Possible to identify HTE across trials; Possibility to measure and explain covariate's effect on treatment effect||Patient is own control; Estimates patient-specific effects||Accounting for unobserved characteristics; Heterogeneous response across time||Robust to outcome outliers; Heterogeneous response across quantiles||No functional form assumptions; Flexible regressions||Multivariate approach to identifying risk factors or HTE
|-
|Key Limitation(s)||Included studies need to be similar enough to be meaningful; Assumed distribution; Selection bias||Fairly sensitive to changes in underlying data; May not fully identify additive impacts of multiple variables||Requires de novo study; Not applicable to all conditions or treatments||Criteria for optimization solutions not clear||Treatment effect designed for a quantile, not a specific patient||Computationally demanding; Smoothing parameters required for kernel methods||May be more or less interpretable or useful clinically
|}

Adapted from: http://dx.doi.org/10.1186/1471-2288-12-185

  • *CART: Classification and regression tree (CART) analysis.
  • **LGM/GMM: Latent growth modeling/Growth mixture modeling.
  • ***QTE: Quantile Treatment Effect.
  • ****Standard meta-analysis methods, such as fixed- and random-effects models and tests of heterogeneity, together with various plots and summaries, can be found in the R package rmeta. Non-parametric R approaches are included in the np package.

Additional details are provided in a paper entitled From concepts, theory, and evidence of heterogeneity of treatment effects to methodological approaches: a primer.

HTE Analytics, Latent growth and growth mixture modeling (LGM/GMM)

Meta-analysis

Comparative Effectiveness Research (CER)



