Difference between revisions of "SMHS Usage Rfundamentals HTML Output Example"

From SOCR
Jump to: navigation, search
(Created page with "== R Fundamentals - Example R-generated HTML Output== The HTML file below is an example of an SMHS_Usage_Rfundamentals|R-output automatically e...")
 
(R Fundamentals - Example R-generated HTML Output)
 
Line 13: Line 13:
 
  <meta name="generator" content="pandoc" />
 
  <meta name="generator" content="pandoc" />
 
   
 
   
  <meta name="author" content="Ivo Dinov" />
+
  <meta name="author" content="SOCR Team" />
 
   
 
   
  <meta name="date" content="2016-04-13" />
+
  <meta name="date" content="2016-04-12" />
 
   
 
   
 
  <title>R HTML Export Demo using the SOCR PD Data</title>
 
  <title>R HTML Export Demo using the SOCR PD Data</title>
Line 60: Line 60:
 
  <div id="header">
 
  <div id="header">
 
  <h1 class="title">R HTML Export Demo using the SOCR PD Data</h1>
 
  <h1 class="title">R HTML Export Demo using the SOCR PD Data</h1>
  <h4 class="author"><em>Ivo Dinov</em></h4>
+
  <h4 class="author"><em>SOCR Team</em></h4>
  <h4 class="date"><em>April 13, 2016</em></h4>
+
  <h4 class="date"><em>April 12, 2016</em></h4>
 
  </div>
 
  </div>
 
   
 
   

Latest revision as of 08:07, 12 April 2016

R Fundamentals - Example R-generated HTML Output

The HTML file below is an example of an R-output automatically exported from RStudio using knitr (see details here).

<!DOCTYPE html>

<html xmlns="http://www.w3.org/1999/xhtml">

<head>

<meta charset="utf-8">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="pandoc" />

<meta name="author" content="SOCR Team" />

<meta name="date" content="2016-04-12" />

<title>R HTML Export Demo using the SOCR PD Data</title>

<style type="text/css">code{white-space: pre;}</style>

<style type="text/css">
  pre:not([class]) {
    background-color: white;
  }
</style>
<script type="text/javascript">
if (window.hljs && document.readyState && document.readyState === "complete") {
   window.setTimeout(function() {
      hljs.initHighlighting();
   }, 0);
}
</script>



</head>

<body>

<style type="text/css">
.main-container {
  max-width: 940px;
  margin-left: auto;
  margin-right: auto;
}
code {
  color: inherit;
  background-color: rgba(0, 0, 0, 0.04);
}
img { 
  max-width:100%; 
  height: auto; 
}
</style>



Data Import from SOCR Data server

<code>library(rvest)</code>
<code>## Warning: package 'rvest' was built under R version 3.2.2</code>
<code>## Loading required package: xml2</code>
<code>## Warning: package 'xml2' was built under R version 3.2.2</code>
<code>wiki <- read_html("http://wiki.socr.umich.edu/index.php/SOCR_Data_PD_BiomedBigMetadata")
 html_nodes(wiki, "#content")</code>
<code>## {xml_nodeset (1)}
 ## [1] <div id="content" class="mw-body-primary" role="main">\n\t<a id="top ...</code>
<code>pd <- html_table(html_nodes(wiki,"table")[[1]])</code>

Data Preparation (factorization of patient diagnosis)

<code>## Convert Dx to a dichotomous variable
 ## pd$Dx <- gsub("PD", 0, pd$Dx)
 ## pd$Dx <- gsub("HC", 1, pd$Dx)
 ## pd$Dx <- gsub("SWEDD", 2, pd$Dx)
 pd$Dx <- as.factor(pd$Dx)</code>

Multivariate Linear Regression Analysis (predict binary DX)

<code>## Full Model
 m1 <- glm(Dx ~ L_caudate_Volume + R_caudate_Volume + L_putamen_Volume + R_putamen_Volume + L_hippocampus_Volume + R_hippocampus_Volume + cerebellum_Volume + L_lingual_gyrus_Volume + R_lingual_gyrus_Volume + L_fusiform_gyrus_Volume + R_fusiform_gyrus_Volume + Sex + Weight + Age + UPDRS_part_I + UPDRS_part_II + UPDRS_part_III + chr12_rs34637584_GT + chr17_rs11868035_GT, data=pd, family="binomial")
 summary(m1)</code>
<code>## 
 ## Call:
 ## glm(formula = Dx ~ L_caudate_Volume + R_caudate_Volume + L_putamen_Volume + 
 ##     R_putamen_Volume + L_hippocampus_Volume + R_hippocampus_Volume + 
 ##     cerebellum_Volume + L_lingual_gyrus_Volume + R_lingual_gyrus_Volume + 
 ##     L_fusiform_gyrus_Volume + R_fusiform_gyrus_Volume + Sex + 
 ##     Weight + Age + UPDRS_part_I + UPDRS_part_II + UPDRS_part_III + 
 ##     chr12_rs34637584_GT + chr17_rs11868035_GT, family = "binomial", 
 ##     data = pd)
 ## 
 ## Deviance Residuals: 
 ##     Min       1Q   Median       3Q      Max  
 ## -2.3126  -0.9250   0.5125   0.8234   1.9586  
 ## 
 ## Coefficients:
 ##                           Estimate Std. Error z value Pr(>|z|)    
 ## (Intercept)             -3.610e+01  1.950e+01  -1.852 0.064086 .  
 ## L_caudate_Volume        -2.210e-03  2.666e-03  -0.829 0.407062    
 ## R_caudate_Volume        -2.253e-03  2.226e-03  -1.012 0.311614    
 ## L_putamen_Volume         3.911e-03  1.985e-03   1.970 0.048863 *  
 ## R_putamen_Volume        -4.274e-03  1.305e-03  -3.274 0.001058 ** 
 ## L_hippocampus_Volume     2.417e-03  1.333e-03   1.813 0.069879 .  
 ## R_hippocampus_Volume    -1.219e-03  1.160e-03  -1.051 0.293196    
 ## cerebellum_Volume        5.974e-04  6.215e-04   0.961 0.336450    
 ## L_lingual_gyrus_Volume   4.474e-04  6.471e-04   0.691 0.489338    
 ## R_lingual_gyrus_Volume   3.031e-04  6.885e-04   0.440 0.659740    
 ## L_fusiform_gyrus_Volume  1.896e-03  6.581e-04   2.882 0.003957 ** 
 ## R_fusiform_gyrus_Volume  8.328e-04  7.843e-04   1.062 0.288270    
 ## Sex                      1.668e-01  1.478e-01   1.129 0.259059    
 ## Weight                  -2.691e-02  7.195e-03  -3.740 0.000184 ***
 ## Age                     -1.520e-02  7.348e-03  -2.069 0.038568 *  
 ## UPDRS_part_I             4.891e-02  1.006e-01   0.486 0.626761    
 ## UPDRS_part_II            4.854e-02  1.487e-02   3.265 0.001095 ** 
 ## UPDRS_part_III           9.177e-02  1.032e-02   8.894  < 2e-16 ***
 ## chr12_rs34637584_GT      1.136e+00  1.527e-01   7.438 1.02e-13 ***
 ## chr17_rs11868035_GT     -7.815e-01  1.481e-01  -5.277 1.31e-07 ***
 ## ---
 ## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
 ## 
 ## (Dispersion parameter for binomial family taken to be 1)
 ## 
 ##     Null deviance: 1467.0  on 1127  degrees of freedom
 ## Residual deviance: 1204.2  on 1108  degrees of freedom
 ## AIC: 1244.2
 ## 
 ## Number of Fisher Scoring iterations: 4</code>
<code>par(mfrow=c(2,4))  # default layout() call may not be  sufficiently large  to contain the entire plot, so we extent it
 plot(m1)</code>

Variable Selection

<code>library(leaps)</code>
<code>## Warning: package 'leaps' was built under R version 3.2.2</code>
<code>library(MASS)
 reg1 <- regsubsets(Dx ~ L_caudate_Volume + R_caudate_Volume + L_putamen_Volume + R_putamen_Volume + L_hippocampus_Volume + R_hippocampus_Volume + cerebellum_Volume + L_lingual_gyrus_Volume + R_lingual_gyrus_Volume + L_fusiform_gyrus_Volume + R_fusiform_gyrus_Volume + Sex + Weight + Age + UPDRS_part_I + UPDRS_part_II + UPDRS_part_III + chr12_rs34637584_GT + chr17_rs11868035_GT, data=pd)
 summary(reg1)</code>
<code>## Subset selection object
 ## Call: regsubsets.formula(Dx ~ L_caudate_Volume + R_caudate_Volume + 
 ##     L_putamen_Volume + R_putamen_Volume + L_hippocampus_Volume + 
 ##     R_hippocampus_Volume + cerebellum_Volume + L_lingual_gyrus_Volume + 
 ##     R_lingual_gyrus_Volume + L_fusiform_gyrus_Volume + R_fusiform_gyrus_Volume + 
 ##     Sex + Weight + Age + UPDRS_part_I + UPDRS_part_II + UPDRS_part_III + 
 ##     chr12_rs34637584_GT + chr17_rs11868035_GT, data = pd)
 ## 19 Variables  (and intercept)
 ##                         Forced in Forced out
 ## L_caudate_Volume            FALSE      FALSE
 ## R_caudate_Volume            FALSE      FALSE
 ## L_putamen_Volume            FALSE      FALSE
 ## R_putamen_Volume            FALSE      FALSE
 ## L_hippocampus_Volume        FALSE      FALSE
 ## R_hippocampus_Volume        FALSE      FALSE
 ## cerebellum_Volume           FALSE      FALSE
 ## L_lingual_gyrus_Volume      FALSE      FALSE
 ## R_lingual_gyrus_Volume      FALSE      FALSE
 ## L_fusiform_gyrus_Volume     FALSE      FALSE
 ## R_fusiform_gyrus_Volume     FALSE      FALSE
 ## Sex                         FALSE      FALSE
 ## Weight                      FALSE      FALSE
 ## Age                         FALSE      FALSE
 ## UPDRS_part_I                FALSE      FALSE
 ## UPDRS_part_II               FALSE      FALSE
 ## UPDRS_part_III              FALSE      FALSE
 ## chr12_rs34637584_GT         FALSE      FALSE
 ## chr17_rs11868035_GT         FALSE      FALSE
 ## 1 subsets of each size up to 8
 ## Selection Algorithm: exhaustive
 ##          L_caudate_Volume R_caudate_Volume L_putamen_Volume
 ## 1  ( 1 ) " "              " "              " "             
 ## 2  ( 1 ) " "              " "              " "             
 ## 3  ( 1 ) " "              " "              " "             
 ## 4  ( 1 ) " "              " "              " "             
 ## 5  ( 1 ) " "              " "              " "             
 ## 6  ( 1 ) " "              " "              " "             
 ## 7  ( 1 ) " "              "*"              " "             
 ## 8  ( 1 ) " "              "*"              " "             
 ##          R_putamen_Volume L_hippocampus_Volume R_hippocampus_Volume
 ## 1  ( 1 ) " "              " "                  " "                 
 ## 2  ( 1 ) " "              " "                  " "                 
 ## 3  ( 1 ) " "              " "                  " "                 
 ## 4  ( 1 ) " "              " "                  " "                 
 ## 5  ( 1 ) " "              " "                  " "                 
 ## 6  ( 1 ) "*"              " "                  " "                 
 ## 7  ( 1 ) "*"              " "                  " "                 
 ## 8  ( 1 ) "*"              " "                  " "                 
 ##          cerebellum_Volume L_lingual_gyrus_Volume R_lingual_gyrus_Volume
 ## 1  ( 1 ) " "               " "                    " "                   
 ## 2  ( 1 ) " "               " "                    " "                   
 ## 3  ( 1 ) " "               " "                    " "                   
 ## 4  ( 1 ) " "               " "                    " "                   
 ## 5  ( 1 ) " "               " "                    " "                   
 ## 6  ( 1 ) " "               " "                    " "                   
 ## 7  ( 1 ) " "               " "                    " "                   
 ## 8  ( 1 ) " "               " "                    " "                   
 ##          L_fusiform_gyrus_Volume R_fusiform_gyrus_Volume Sex Weight Age
 ## 1  ( 1 ) " "                     " "                     " " " "    " "
 ## 2  ( 1 ) " "                     " "                     " " " "    " "
 ## 3  ( 1 ) " "                     " "                     " " " "    " "
 ## 4  ( 1 ) " "                     " "                     " " " "    "*"
 ## 5  ( 1 ) "*"                     " "                     " " " "    "*"
 ## 6  ( 1 ) "*"                     " "                     " " " "    "*"
 ## 7  ( 1 ) "*"                     " "                     " " " "    "*"
 ## 8  ( 1 ) "*"                     " "                     " " "*"    "*"
 ##          UPDRS_part_I UPDRS_part_II UPDRS_part_III chr12_rs34637584_GT
 ## 1  ( 1 ) " "          " "           " "            "*"                
 ## 2  ( 1 ) "*"          " "           " "            "*"                
 ## 3  ( 1 ) "*"          " "           "*"            "*"                
 ## 4  ( 1 ) "*"          " "           "*"            "*"                
 ## 5  ( 1 ) "*"          " "           "*"            "*"                
 ## 6  ( 1 ) "*"          " "           "*"            "*"                
 ## 7  ( 1 ) "*"          " "           "*"            "*"                
 ## 8  ( 1 ) "*"          " "           "*"            "*"                
 ##          chr17_rs11868035_GT
 ## 1  ( 1 ) " "                
 ## 2  ( 1 ) " "                
 ## 3  ( 1 ) " "                
 ## 4  ( 1 ) " "                
 ## 5  ( 1 ) " "                
 ## 6  ( 1 ) " "                
 ## 7  ( 1 ) " "                
 ## 8  ( 1 ) " "</code>

Forward Selection

<code>reg2 <- regsubsets(Dx ~ L_caudate_Volume + R_caudate_Volume + L_putamen_Volume + R_putamen_Volume + L_hippocampus_Volume + R_hippocampus_Volume + cerebellum_Volume + L_lingual_gyrus_Volume + R_lingual_gyrus_Volume + L_fusiform_gyrus_Volume + R_fusiform_gyrus_Volume + Sex + Weight + Age + UPDRS_part_I + UPDRS_part_II + UPDRS_part_III + chr12_rs34637584_GT + chr17_rs11868035_GT, data=pd, method="forward")
 summary(reg2)</code>
<code>## Subset selection object
 ## Call: regsubsets.formula(Dx ~ L_caudate_Volume + R_caudate_Volume + 
 ##     L_putamen_Volume + R_putamen_Volume + L_hippocampus_Volume + 
 ##     R_hippocampus_Volume + cerebellum_Volume + L_lingual_gyrus_Volume + 
 ##     R_lingual_gyrus_Volume + L_fusiform_gyrus_Volume + R_fusiform_gyrus_Volume + 
 ##     Sex + Weight + Age + UPDRS_part_I + UPDRS_part_II + UPDRS_part_III + 
 ##     chr12_rs34637584_GT + chr17_rs11868035_GT, data = pd, method = "forward")
 ## 19 Variables  (and intercept)
 ##                         Forced in Forced out
 ## L_caudate_Volume            FALSE      FALSE
 ## R_caudate_Volume            FALSE      FALSE
 ## L_putamen_Volume            FALSE      FALSE
 ## R_putamen_Volume            FALSE      FALSE
 ## L_hippocampus_Volume        FALSE      FALSE
 ## R_hippocampus_Volume        FALSE      FALSE
 ## cerebellum_Volume           FALSE      FALSE
 ## L_lingual_gyrus_Volume      FALSE      FALSE
 ## R_lingual_gyrus_Volume      FALSE      FALSE
 ## L_fusiform_gyrus_Volume     FALSE      FALSE
 ## R_fusiform_gyrus_Volume     FALSE      FALSE
 ## Sex                         FALSE      FALSE
 ## Weight                      FALSE      FALSE
 ## Age                         FALSE      FALSE
 ## UPDRS_part_I                FALSE      FALSE
 ## UPDRS_part_II               FALSE      FALSE
 ## UPDRS_part_III              FALSE      FALSE
 ## chr12_rs34637584_GT         FALSE      FALSE
 ## chr17_rs11868035_GT         FALSE      FALSE
 ## 1 subsets of each size up to 8
 ## Selection Algorithm: forward
 ##          L_caudate_Volume R_caudate_Volume L_putamen_Volume
 ## 1  ( 1 ) " "              " "              " "             
 ## 2  ( 1 ) " "              " "              " "             
 ## 3  ( 1 ) " "              " "              " "             
 ## 4  ( 1 ) " "              " "              " "             
 ## 5  ( 1 ) " "              " "              " "             
 ## 6  ( 1 ) " "              " "              " "             
 ## 7  ( 1 ) " "              "*"              " "             
 ## 8  ( 1 ) " "              "*"              " "             
 ##          R_putamen_Volume L_hippocampus_Volume R_hippocampus_Volume
 ## 1  ( 1 ) " "              " "                  " "                 
 ## 2  ( 1 ) " "              " "                  " "                 
 ## 3  ( 1 ) " "              " "                  " "                 
 ## 4  ( 1 ) " "              " "                  " "                 
 ## 5  ( 1 ) " "              " "                  " "                 
 ## 6  ( 1 ) "*"              " "                  " "                 
 ## 7  ( 1 ) "*"              " "                  " "                 
 ## 8  ( 1 ) "*"              " "                  " "                 
 ##          cerebellum_Volume L_lingual_gyrus_Volume R_lingual_gyrus_Volume
 ## 1  ( 1 ) " "               " "                    " "                   
 ## 2  ( 1 ) " "               " "                    " "                   
 ## 3  ( 1 ) " "               " "                    " "                   
 ## 4  ( 1 ) " "               " "                    " "                   
 ## 5  ( 1 ) " "               " "                    " "                   
 ## 6  ( 1 ) " "               " "                    " "                   
 ## 7  ( 1 ) " "               " "                    " "                   
 ## 8  ( 1 ) " "               " "                    " "                   
 ##          L_fusiform_gyrus_Volume R_fusiform_gyrus_Volume Sex Weight Age
 ## 1  ( 1 ) " "                     " "                     " " " "    " "
 ## 2  ( 1 ) " "                     " "                     " " " "    " "
 ## 3  ( 1 ) " "                     " "                     " " " "    " "
 ## 4  ( 1 ) " "                     " "                     " " " "    "*"
 ## 5  ( 1 ) "*"                     " "                     " " " "    "*"
 ## 6  ( 1 ) "*"                     " "                     " " " "    "*"
 ## 7  ( 1 ) "*"                     " "                     " " " "    "*"
 ## 8  ( 1 ) "*"                     " "                     " " "*"    "*"
 ##          UPDRS_part_I UPDRS_part_II UPDRS_part_III chr12_rs34637584_GT
 ## 1  ( 1 ) " "          " "           " "            "*"                
 ## 2  ( 1 ) "*"          " "           " "            "*"                
 ## 3  ( 1 ) "*"          " "           "*"            "*"                
 ## 4  ( 1 ) "*"          " "           "*"            "*"                
 ## 5  ( 1 ) "*"          " "           "*"            "*"                
 ## 6  ( 1 ) "*"          " "           "*"            "*"                
 ## 7  ( 1 ) "*"          " "           "*"            "*"                
 ## 8  ( 1 ) "*"          " "           "*"            "*"                
 ##          chr17_rs11868035_GT
 ## 1  ( 1 ) " "                
 ## 2  ( 1 ) " "                
 ## 3  ( 1 ) " "                
 ## 4  ( 1 ) " "                
 ## 5  ( 1 ) " "                
 ## 6  ( 1 ) " "                
 ## 7  ( 1 ) " "                
 ## 8  ( 1 ) " "</code>

Backward Selection

<code>reg3 <- regsubsets(Dx ~ L_caudate_Volume + R_caudate_Volume + L_putamen_Volume + R_putamen_Volume + L_hippocampus_Volume + R_hippocampus_Volume + cerebellum_Volume + L_lingual_gyrus_Volume + R_lingual_gyrus_Volume + L_fusiform_gyrus_Volume + R_fusiform_gyrus_Volume + Sex + Weight + Age + UPDRS_part_I + UPDRS_part_II + UPDRS_part_III + chr12_rs34637584_GT + chr17_rs11868035_GT, data=pd, method="backward")
 summary(reg3)</code>
<code>## Subset selection object
 ## Call: regsubsets.formula(Dx ~ L_caudate_Volume + R_caudate_Volume + 
 ##     L_putamen_Volume + R_putamen_Volume + L_hippocampus_Volume + 
 ##     R_hippocampus_Volume + cerebellum_Volume + L_lingual_gyrus_Volume + 
 ##     R_lingual_gyrus_Volume + L_fusiform_gyrus_Volume + R_fusiform_gyrus_Volume + 
 ##     Sex + Weight + Age + UPDRS_part_I + UPDRS_part_II + UPDRS_part_III + 
 ##     chr12_rs34637584_GT + chr17_rs11868035_GT, data = pd, method = "backward")
 ## 19 Variables  (and intercept)
 ##                         Forced in Forced out
 ## L_caudate_Volume            FALSE      FALSE
 ## R_caudate_Volume            FALSE      FALSE
 ## L_putamen_Volume            FALSE      FALSE
 ## R_putamen_Volume            FALSE      FALSE
 ## L_hippocampus_Volume        FALSE      FALSE
 ## R_hippocampus_Volume        FALSE      FALSE
 ## cerebellum_Volume           FALSE      FALSE
 ## L_lingual_gyrus_Volume      FALSE      FALSE
 ## R_lingual_gyrus_Volume      FALSE      FALSE
 ## L_fusiform_gyrus_Volume     FALSE      FALSE
 ## R_fusiform_gyrus_Volume     FALSE      FALSE
 ## Sex                         FALSE      FALSE
 ## Weight                      FALSE      FALSE
 ## Age                         FALSE      FALSE
 ## UPDRS_part_I                FALSE      FALSE
 ## UPDRS_part_II               FALSE      FALSE
 ## UPDRS_part_III              FALSE      FALSE
 ## chr12_rs34637584_GT         FALSE      FALSE
 ## chr17_rs11868035_GT         FALSE      FALSE
 ## 1 subsets of each size up to 8
 ## Selection Algorithm: backward
 ##          L_caudate_Volume R_caudate_Volume L_putamen_Volume
 ## 1  ( 1 ) " "              " "              " "             
 ## 2  ( 1 ) " "              " "              " "             
 ## 3  ( 1 ) " "              " "              " "             
 ## 4  ( 1 ) " "              " "              " "             
 ## 5  ( 1 ) " "              " "              " "             
 ## 6  ( 1 ) " "              " "              " "             
 ## 7  ( 1 ) " "              "*"              " "             
 ## 8  ( 1 ) " "              "*"              " "             
 ##          R_putamen_Volume L_hippocampus_Volume R_hippocampus_Volume
 ## 1  ( 1 ) " "              " "                  " "                 
 ## 2  ( 1 ) " "              " "                  " "                 
 ## 3  ( 1 ) " "              " "                  " "                 
 ## 4  ( 1 ) " "              " "                  " "                 
 ## 5  ( 1 ) " "              " "                  " "                 
 ## 6  ( 1 ) "*"              " "                  " "                 
 ## 7  ( 1 ) "*"              " "                  " "                 
 ## 8  ( 1 ) "*"              " "                  " "                 
 ##          cerebellum_Volume L_lingual_gyrus_Volume R_lingual_gyrus_Volume
 ## 1  ( 1 ) " "               " "                    " "                   
 ## 2  ( 1 ) " "               " "                    " "                   
 ## 3  ( 1 ) " "               " "                    " "                   
 ## 4  ( 1 ) " "               " "                    " "                   
 ## 5  ( 1 ) " "               " "                    " "                   
 ## 6  ( 1 ) " "               " "                    " "                   
 ## 7  ( 1 ) " "               " "                    " "                   
 ## 8  ( 1 ) " "               " "                    " "                   
 ##          L_fusiform_gyrus_Volume R_fusiform_gyrus_Volume Sex Weight Age
 ## 1  ( 1 ) " "                     " "                     " " " "    " "
 ## 2  ( 1 ) " "                     " "                     " " " "    " "
 ## 3  ( 1 ) " "                     " "                     " " " "    " "
 ## 4  ( 1 ) " "                     " "                     " " " "    "*"
 ## 5  ( 1 ) "*"                     " "                     " " " "    "*"
 ## 6  ( 1 ) "*"                     " "                     " " " "    "*"
 ## 7  ( 1 ) "*"                     " "                     " " " "    "*"
 ## 8  ( 1 ) "*"                     " "                     " " "*"    "*"
 ##          UPDRS_part_I UPDRS_part_II UPDRS_part_III chr12_rs34637584_GT
 ## 1  ( 1 ) " "          " "           " "            "*"                
 ## 2  ( 1 ) "*"          " "           " "            "*"                
 ## 3  ( 1 ) "*"          " "           "*"            "*"                
 ## 4  ( 1 ) "*"          " "           "*"            "*"                
 ## 5  ( 1 ) "*"          " "           "*"            "*"                
 ## 6  ( 1 ) "*"          " "           "*"            "*"                
 ## 7  ( 1 ) "*"          " "           "*"            "*"                
 ## 8  ( 1 ) "*"          " "           "*"            "*"                
 ##          chr17_rs11868035_GT
 ## 1  ( 1 ) " "                
 ## 2  ( 1 ) " "                
 ## 3  ( 1 ) " "                
 ## 4  ( 1 ) " "                
 ## 5  ( 1 ) " "                
 ## 6  ( 1 ) " "                
 ## 7  ( 1 ) " "                
 ## 8  ( 1 ) " "</code>

Reduced Model

<code>m2 <- glm(Dx ~ R_caudate_Volume + R_fusiform_gyrus_Volume + Weight + UPDRS_part_I + UPDRS_part_II + UPDRS_part_III + chr12_rs34637584_GT + chr17_rs11868035_GT, data=pd, family="binomial")
 summary(m2)</code>
<code>## 
 ## Call:
 ## glm(formula = Dx ~ R_caudate_Volume + R_fusiform_gyrus_Volume + 
 ##     Weight + UPDRS_part_I + UPDRS_part_II + UPDRS_part_III + 
 ##     chr12_rs34637584_GT + chr17_rs11868035_GT, family = "binomial", 
 ##     data = pd)
 ## 
 ## Deviance Residuals: 
 ##     Min       1Q   Median       3Q      Max  
 ## -2.1034  -0.9941   0.5432   0.8438   1.9065  
 ## 
 ## Coefficients:
 ##                           Estimate Std. Error z value Pr(>|z|)    
 ## (Intercept)             -4.6549832  7.7471401  -0.601 0.547930    
 ## R_caudate_Volume        -0.0024399  0.0021297  -1.146 0.251937    
 ## R_fusiform_gyrus_Volume  0.0008031  0.0007508   1.070 0.284794    
 ## Weight                  -0.0269228  0.0069556  -3.871 0.000109 ***
 ## UPDRS_part_I             0.0865148  0.0962768   0.899 0.368863    
 ## UPDRS_part_II            0.0471636  0.0143506   3.287 0.001014 ** 
 ## UPDRS_part_III           0.0860347  0.0099415   8.654  < 2e-16 ***
 ## chr12_rs34637584_GT      1.1672942  0.1421042   8.214  < 2e-16 ***
 ## chr17_rs11868035_GT     -0.7393041  0.1429901  -5.170 2.34e-07 ***
 ## ---
 ## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
 ## 
 ## (Dispersion parameter for binomial family taken to be 1)
 ## 
 ##     Null deviance: 1467.0  on 1127  degrees of freedom
 ## Residual deviance: 1238.5  on 1119  degrees of freedom
 ## AIC: 1256.5
 ## 
 ## Number of Fisher Scoring iterations: 4</code>
<code>par(mfrow=c(2,2))
 plot(m2)</code>

Removing Bad Leverage Points

<code>n <- dim(pd)[1]
 case <- c(1:n)
 cooks <- cooks.distance(m2)
 cooks_outlier <- case[cooks>4/(length(pd$Dx)-2)]
 cooks_outlier</code>
<code>##  [1]  653  654  655  656  673  674  675  676 1025 1026 1027 1028</code>
<code>m2 <- glm(Dx ~ R_caudate_Volume + R_fusiform_gyrus_Volume + Weight + UPDRS_part_I + UPDRS_part_II + UPDRS_part_III + chr12_rs34637584_GT + chr17_rs11868035_GT, data=pd[-cooks_outlier,], family="binomial")
 summary(m2)</code>
<code>## 
 ## Call:
 ## glm(formula = Dx ~ R_caudate_Volume + R_fusiform_gyrus_Volume + 
 ##     Weight + UPDRS_part_I + UPDRS_part_II + UPDRS_part_III + 
 ##     chr12_rs34637584_GT + chr17_rs11868035_GT, family = "binomial", 
 ##     data = pd[-cooks_outlier, ])
 ## 
 ## Deviance Residuals: 
 ##     Min       1Q   Median       3Q      Max  
 ## -2.1806  -0.9525   0.5194   0.8178   1.8609  
 ## 
 ## Coefficients:
 ##                           Estimate Std. Error z value Pr(>|z|)    
 ## (Intercept)             -6.6561223  7.9427555  -0.838    0.402    
 ## R_caudate_Volume        -0.0026930  0.0021848  -1.233    0.218    
 ## R_fusiform_gyrus_Volume  0.0010330  0.0007694   1.343    0.179    
 ## Weight                  -0.0310176  0.0071300  -4.350 1.36e-05 ***
 ## UPDRS_part_I             0.1332157  0.0987274   1.349    0.177    
 ## UPDRS_part_II            0.0595924  0.0148204   4.021 5.80e-05 ***
 ## UPDRS_part_III           0.0947331  0.0103083   9.190  < 2e-16 ***
 ## chr12_rs34637584_GT      1.2794104  0.1458133   8.774  < 2e-16 ***
 ## chr17_rs11868035_GT     -0.7348668  0.1466157  -5.012 5.38e-07 ***
 ## ---
 ## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
 ## 
 ## (Dispersion parameter for binomial family taken to be 1)
 ## 
 ##     Null deviance: 1446.8  on 1115  degrees of freedom
 ## Residual deviance: 1190.5  on 1107  degrees of freedom
 ## AIC: 1208.5
 ## 
 ## Number of Fisher Scoring iterations: 4</code>
<code>par(mfrow=c(2,2))
 plot(m2)</code>

Log Transformation

<code>m3 <- glm(Dx ~ log(R_caudate_Volume) + log(R_fusiform_gyrus_Volume) + log(Weight) + UPDRS_part_I + UPDRS_part_II + UPDRS_part_III + chr12_rs34637584_GT + chr17_rs11868035_GT, data=pd[-cooks_outlier,], family="binomial")
 summary(m3)</code>
<code>## 
 ## Call:
 ## glm(formula = Dx ~ log(R_caudate_Volume) + log(R_fusiform_gyrus_Volume) + 
 ##     log(Weight) + UPDRS_part_I + UPDRS_part_II + UPDRS_part_III + 
 ##     chr12_rs34637584_GT + chr17_rs11868035_GT, family = "binomial", 
 ##     data = pd[-cooks_outlier, ])
 ## 
 ## Deviance Residuals: 
 ##     Min       1Q   Median       3Q      Max  
 ## -2.1723  -0.9456   0.5177   0.8192   1.8614  
 ## 
 ## Coefficients:
 ##                               Estimate Std. Error z value Pr(>|z|)    
 ## (Intercept)                  -63.35131   71.87608  -0.881    0.378    
 ## log(R_caudate_Volume)         -2.84111    2.19100  -1.297    0.195    
 ## log(R_fusiform_gyrus_Volume)  10.00512    7.69485   1.300    0.194    
 ## log(Weight)                   -2.44090    0.56224  -4.341 1.42e-05 ***
 ## UPDRS_part_I                   0.13423    0.09872   1.360    0.174    
 ## UPDRS_part_II                  0.06010    0.01482   4.054 5.02e-05 ***
 ## UPDRS_part_III                 0.09425    0.01029   9.160  < 2e-16 ***
 ## chr12_rs34637584_GT            1.28158    0.14584   8.788  < 2e-16 ***
 ## chr17_rs11868035_GT           -0.73351    0.14663  -5.003 5.66e-07 ***
 ## ---
 ## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
 ## 
 ## (Dispersion parameter for binomial family taken to be 1)
 ## 
 ##     Null deviance: 1446.8  on 1115  degrees of freedom
 ## Residual deviance: 1190.3  on 1107  degrees of freedom
 ## AIC: 1208.3
 ## 
 ## Number of Fisher Scoring iterations: 4</code>
<code>par(mfrow=c(2,2))
 plot(m3)</code>

Multicollinearity Check

<code>library(car)
 vif(m2)</code>
<code>##        R_caudate_Volume R_fusiform_gyrus_Volume                  Weight 
 ##                1.024786                1.053087                1.056551 
 ##            UPDRS_part_I           UPDRS_part_II          UPDRS_part_III 
 ##                1.019879                1.027360                1.046336 
 ##     chr12_rs34637584_GT     chr17_rs11868035_GT 
 ##                1.054836                1.060063</code>
<code>## Here, each of the variance inflaction factors is less than 5, so the multicollinearity between these predictors does not affect the result significantly.</code>

Likelihood Ratio Test

<code>with(m2, null.deviance - deviance)</code>
<code>## [1] 256.336</code>
<code>with(m2, df.null - df.residual)</code>
<code>## [1] 8</code>
<code>with(m2, pchisq(null.deviance - deviance, df.null - df.residual, lower.tail = FALSE))</code>
<code>## [1] 7.811597e-51</code>
<code>## The chi-square of 403 with 8 degrees of freedom and an associated p-value of less than 0.001 indicates that the model as a whole fits significantly better than an empty model.</code>

Linear Discriminant Analysis

<code>## Consider time = 0 only
 pdzero <- pd[which(pd$Time==0),]
 ## Split dataset into 70% training and 30% testing set
 set.seed(666)
 train_index <- sample(dim(pdzero)[1], trunc(dim(pdzero)[1]*0.7))
 train <- pdzero[train_index,]
 test <- pdzero[-train_index,]
 library(MASS)
 lda.fit <- lda(Dx~L_caudate_Volume + R_caudate_Volume + L_putamen_Volume + R_putamen_Volume + L_hippocampus_Volume + R_hippocampus_Volume + cerebellum_Volume + L_lingual_gyrus_Volume + R_lingual_gyrus_Volume + L_fusiform_gyrus_Volume + R_fusiform_gyrus_Volume + Sex + Weight + Age + UPDRS_part_I + UPDRS_part_II + UPDRS_part_III + chr12_rs34637584_GT + chr17_rs11868035_GT, data=train)
 lda.predict <- predict(lda.fit, test)
 table(lda.predict$class, test$Dx)</code>
<code>##        
 ##         HC PD SWEDD
 ##   HC    17  3     8
 ##   PD     9 22     3
 ##   SWEDD  7 11     5</code>

Logistic Regression Analysis

<code>## Break up the three-level dependent variable into two binary variables
 pdzero$PD_YN <- ifelse(pdzero$Dx=="PD", 1, 0)
 pdzero$SWEDD_YN <- ifelse(pdzero$Dx=="SWEDD", 1, 0)
 ## Split the dataset into 70% training and 30% testing set
 set.seed(666)
 train_index <- sample(dim(pdzero)[1], trunc(dim(pdzero)[1]*0.7))
 train <- pdzero[train_index,]
 test <- pdzero[-train_index,]
 ## PD_YN as outcome
 log1 <- glm(PD_YN ~ L_caudate_Volume + R_caudate_Volume + L_putamen_Volume + R_putamen_Volume + L_hippocampus_Volume + R_hippocampus_Volume + cerebellum_Volume + L_lingual_gyrus_Volume + R_lingual_gyrus_Volume + L_fusiform_gyrus_Volume + R_fusiform_gyrus_Volume + Sex + Weight + Age + UPDRS_part_I + UPDRS_part_II + UPDRS_part_III + chr12_rs34637584_GT + chr17_rs11868035_GT, data=train, family="binomial")
 probPD <- predict(log1, test, type="response")
 median(probPD) # 0.30</code>
<code>## [1] 0.2983791</code>
<code>resultPD <- ifelse(probPD>0.3, 1, 0)
 table(resultPD, test$PD_YN)</code>
<code>##         
 ## resultPD  0  1
 ##        0 30 13
 ##        1 19 23</code>
<code>## SWEDD as outcome
 log2 <- glm(SWEDD_YN ~ L_caudate_Volume + R_caudate_Volume + L_putamen_Volume + R_putamen_Volume + L_hippocampus_Volume + R_hippocampus_Volume + cerebellum_Volume + L_lingual_gyrus_Volume + R_lingual_gyrus_Volume + L_fusiform_gyrus_Volume + R_fusiform_gyrus_Volume + Sex + Weight + Age + UPDRS_part_I + UPDRS_part_II + UPDRS_part_III + chr12_rs34637584_GT + chr17_rs11868035_GT, data=train, family="binomial")
 probSW <- predict(log2, test, type="response")
 median(probSW) # 0.301</code>
<code>## [1] 0.3013772</code>
<code>resultSW <- ifelse(probSW>0.2, 1, 0)
 table(resultSW, test$SWEDD_YN)</code>
<code>##         
 ## resultSW  0  1
 ##        0 19  2
 ##        1 50 14</code>
<code>## Combine the results of two logistic model together
 finalpred <- NULL
 for (i in 1:dim(test)[1]) {
   if (resultPD[i]==1 & resultSW[i]==0) {
     finalpred[i] <- "PD"
   }
   if (resultPD[i]==1 & resultSW[i]==1) {
     finalpred[i] <- ifelse(probPD[i]>probSW[i], "PD", "SW")
   }
   if (resultPD[i]==0 & resultSW[i]==1) {
     finalpred[i] <- "SW"
   }
   if (resultPD[i]==0 & resultSW[i]==0) {
     finalpred[i] <- "HC"
   }
 }
 table(finalpred, test$Dx)</code>
<code>##          
 ## finalpred HC PD SWEDD
 ##        HC  3  1     1
 ##        PD 15 21     3
 ##        SW 15 14    12</code>


<script>

// add bootstrap table styles to pandoc tables
$(document).ready(function () {
  $('tr.header').parent('thead').parent('table').addClass('table table-condensed');
});

</script>

<script>
  (function () {
    var script = document.createElement("script");
    script.type = "text/javascript";
    script.src  = "https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
    document.getElementsByTagName("head")[0].appendChild(script);
  })();
</script>

</body>
</html>





Translate this page:

(default)
Uk flag.gif

Deutsch
De flag.gif

Español
Es flag.gif

Français
Fr flag.gif

Italiano
It flag.gif

Português
Pt flag.gif

日本語
Jp flag.gif

България
Bg flag.gif

الامارات العربية المتحدة
Ae flag.gif

Suomi
Fi flag.gif

इस भाषा में
In flag.gif

Norge
No flag.png

한국어
Kr flag.gif

中文
Cn flag.gif

繁体中文
Cn flag.gif

Русский
Ru flag.gif

Nederlands
Nl flag.gif

Ελληνικά
Gr flag.gif

Hrvatska
Hr flag.gif

Česká republika
Cz flag.gif

Danmark
Dk flag.gif

Polska
Pl flag.png

România
Ro flag.png

Sverige
Se flag.gif