SOCR EduMaterials AnalysesCommandLineVolumeMultipleRegression

Analyses Command-Line - Volume-based Multiple Linear Regression Analysis

This page describes how to access the Multiple Linear Regression library to compute volume/image MLR analyses. Access is provided through a shell-based command-line interface on local machines. More information about other SOCR Analyses command-line interfaces is available on the SOCR Analyses command-line page.

Introduction

In addition to the browser-based graphical user interfaces, all SOCR Analyses can be run from a command-line shell on local systems.

General Usage

  • Get the latest SOCR JAR files from the SOCR page (http://socr.ucla.edu/htmls/jars/).
  • The command-line interface to SOCR Analyses generally uses EXAMPLE 1 from the list of example data files for the corresponding analysis.
  • All input files are ASCII (see the examples within each specific analysis).
  • A -h flag at the end of the command line indicates that the first row of every ASCII input data file is a header row, so it is not interpreted as data.
  • The number of variables can be given at the end (after the -h flag). If it is not specified, the default is 3.

Try-It-Online

You can test the Multivariate Regression functionality using the Pipeline PWS Web-start server.

Volume Multiple Linear Regression Usage

  • Generic Setting:

java -cp /ifs/ccb/CCB_SW_Tools/others/Statistics/SOCR_Statistics/bin/SOCR_core.jar:/ifs/ccb/CCB_SW_Tools/others/Statistics/SOCR_Statistics/bin/SOCR_plugin.jar edu.ucla.stat.SOCR.analyses.command.volume.VolumeMultipleRegression -dm DesignMatrix.txt -h -regressors [name1,name2,...,name_k] -dim Zmax Ymax Xmax [-intercept interceptConstant] [-p PValue_Filename] [-b Beta_Filename] [-r RValue_Filename] [-t TStat_Filename] -data_type [0,1,2,3,4] [-mask /ifs/tmp/myMaskVolume.img] [-byteorder string]

  • Options:
    • -help: print usage
    • -dm [DesignMatrix.txt]: specify a tab-separated text file containing the design matrix. Note: construct the design matrix carefully. Mac and other platforms may introduce hidden characters (e.g., stray tab or return characters). If this happens, the design-matrix file may need to be imported into an Excel spreadsheet first and then copied back into a plain-text editor on a PC/Windows machine. If you get an error like Beginning the stat analyses ... VolumeMultipleRegression Error!!!!!!!!!!!!!!, review your design-matrix file. Missing values in the design matrix are indicated by ".". A subject with a missing value for a predictor variable (see -regressors) is excluded from the analysis if that predictor is selected as a covariate.
    • -mask [Mask-volume.img]: (optional) specify a mask volume (0 or 1 intensities) restricting the voxels where the regression models are computed. The mask must be a 1-byte (unsigned byte) Analyze-format volume of the same dimensions as the data, with intensities in [0:255]; all voxels with intensity > 0 are considered part of the mask and processed.
    • -h: DesignMatrix contains a header (first row)
    • -regressors [name1,name2,...name_k]: specify which columns/variables from the Design-Matrix should be used as regressors/covariates
      • Interactions: Interactions are specified within the -regressors flag. For interactions, each interaction variable must be composed of individual regressor variables that must also be previously included in the regressor list. For instance, if we need a triple interaction name1*name2*name3, we must (minimally) specify:
        -regressors name1,name2,name3,name1*name2,name2*name3,name1*name3,name1*name2*name3
    • -dim Zmax Ymax Xmax: specify the dimension sizes (for 2D images use Zmax=1; for 1D, use Zmax=Ymax=1)
    • -intercept [Base_Filename]: Base Filename for the 4 Intercept Estimates. This base filename will be appended with:
      • _Intercept_Pvalue.img
      • _Intercept_Beta.img
      • _Intercept_RPartCorr.img
      • _Intercept_TStat.img
    • -i [Base_Filename]: Identical to -intercept [Base_Filename]
    • -p [PValue_Filename]: output the p-value volume (enter only the base of the filename)
    • -b [Beta_Filename]: output the Beta effect-size coefficient volume (enter only the base of the filename)
    • -r [RValue_Filename]: output the partial-correlation volume (enter only the base of the filename)
    • -t [Tstat_Filename]: output the T-statistics for the effect-size Beta (enter only the base of the filename). Note that a single factor (like CDR score) may have several levels (e.g., CDR score in {0.0, 0.5, 1.0, 1.5, etc.}). When the number of levels is greater than 2, the files specified by the -t flag contain the F-statistics (F-maps) for the effect of this factor (CDR). In the special case where the factor has only 2 levels, the F-map is actually identical to the T-statistics map (T-map).
    • -data_type [0,1,2,3,4]: Type=0 is for Unsigned Byte, Type=1 is for Signed Byte, Type=2 is for Unsigned Short Integer, Type=3 is for Signed Short Integer, and Type=4 is for 4-byte Float volume input.
    • -byteorder string: string is one of {big, little, other}.
      • big = BIG_ENDIAN processor
      • little = LITTLE_ENDIAN processor
      • other = default processor (java.nio.ByteOrder.nativeOrder())
    • -byteswap: (deprecated) Only enter this flag if you want the input data to be read in and byteswapped! Note that -byteswap affects the input data, the mask volume and the output results!
      • A better alternative is to always swap bytes (if necessary) outside this program (e.g., %> dd conv=swab if=20088_jacobian.img of=20088_jacobian_BS.img).
    • Memory Use: for large inputs, you may need to request more memory from the JVM. If your data is larger than 200^3 voxels, add these parameters right after the initial java call (-ms1000m -mx2000m); see the example below for where they go. This requests 1-2 GB of RAM for the process. You may need more or less memory depending on the number of volumes and the dimension sizes.
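
The -dim and -data_type options together determine how large each input volume is. The Python sketch below is only an illustration, not part of SOCR: it assumes each .img file holds raw voxel data for a single volume (with the Analyze header in a separate .hdr file), and it reads the size threshold in the memory note as 200 x 200 x 200 voxels, which is an assumption about the intended figure. The function names are hypothetical.

```python
# Bytes per voxel for each -data_type code described in the options list.
BYTES_PER_VOXEL = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4}


def volume_bytes(zmax, ymax, xmax, data_type):
    """Expected size in bytes of one raw volume for the given -dim and -data_type."""
    return zmax * ymax * xmax * BYTES_PER_VOXEL[data_type]


def exceeds_memory_threshold(zmax, ymax, xmax, threshold_voxels=200 ** 3):
    """True when the voxel count is above the (assumed) 200^3 threshold."""
    return zmax * ymax * xmax > threshold_voxels
```

For the 220 x 220 x 220 unsigned-short volumes used in the example script, volume_bytes(220, 220, 220, 2) gives 21296000 bytes per volume. Comparing that number with the actual size of each .img file is a quick way to catch a wrong -dim or -data_type before starting a long run.
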
  • Example: Create a new file (VolumeMultipleRegression.csh) with any editor, paste in the script below, and make sure the file has executable permissions. Some operating systems/platforms may require variants of this (C-shell) script.

#!/bin/csh

date

java -ms200m -mx500m -cp /ifs/ccb/CCB_SW_Tools/others/Statistics/SOCR_Statistics/bin/SOCR_core.jar:/ifs/ccb/CCB_SW_Tools/others/Statistics/SOCR_Statistics/bin/SOCR_plugin.jar edu.ucla.stat.SOCR.analyses.command.volume.VolumeMultipleRegression -dm /ifs/ccb/CCB_SW_Tools/others/Statistics/SOCR_Statistics/SOCR_CSV_test_Scripts_Data/DM.txt -h -regressors CDR,MMSE -dim 220 220 220 -intercept interceptConstant -p /ifs/tmp/P_Value -r /ifs/tmp/R_Value -t /ifs/tmp/TStat_Value -data_type 2

# Or

java -ms200m -mx500m -cp /ifs/ccb/CCB_SW_Tools/others/Statistics/SOCR_Statistics/bin/SOCR_core.jar:/ifs/ccb/CCB_SW_Tools/others/Statistics/SOCR_Statistics/bin/SOCR_plugin.jar edu.ucla.stat.SOCR.analyses.command.volume.VolumeMultipleRegression -dm /ifs/ccb/CCB_SW_Tools/Statistics/SOCR_Statistics/SOCR_CSV_test_Scripts_Data/DM.txt -h -regressors AGE,CDR,AGE*CDR -dim 220 220 220 -p /ifs/ccb/CCB_SW_Tools/Statistics/SOCR_Statistics/SOCR_CSV_test_Scripts_Data/VolumeMultipleRegressionTest/P_Value_mask_New -r /ifs/ccb/CCB_SW_Tools/Statistics/SOCR_Statistics/SOCR_CSV_test_Scripts_Data/VolumeMultipleRegressionTest/R_Value_mask_New -mask /ifs/ccb/CCB_SW_Tools/Statistics/SOCR_Statistics/SOCR_CSV_test_Scripts_Data/VolumeMultipleRegressionTest/UC_mask_final8bit.img -t /ifs/ccb/CCB_SW_Tools/Statistics/SOCR_Statistics/SOCR_CSV_test_Scripts_Data/VolumeMultipleRegressionTest/T_Value_mask_New -data_type 2 -byteorder little &

# Note the specification of the AGE*CDR interaction term in the -regressors list above.

date

exit
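
Writing out every lower-order term of an interaction by hand is error-prone. The Python sketch below is a helper of our own (expand_interaction and regressors_flag are hypothetical names, not part of SOCR) that builds the minimal -regressors list for a single interaction term, following the rule that every component of an interaction must also appear in the list. It assumes the order of terms in the list does not matter to the tool.

```python
from itertools import combinations


def expand_interaction(term):
    """Return all main effects and sub-interactions needed for a term like 'A*B*C'."""
    factors = term.split("*")
    names = []
    # Emit main effects first, then pairs, then triples, and so on.
    for size in range(1, len(factors) + 1):
        for combo in combinations(factors, size):
            names.append("*".join(combo))
    return names


def regressors_flag(term):
    """Format the expanded list as a -regressors command-line argument."""
    return "-regressors " + ",".join(expand_interaction(term))
```

For example, regressors_flag("AGE*CDR") returns "-regressors AGE,CDR,AGE*CDR", which matches the second command in the script above. A triple interaction expands to seven terms.
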

Example Input data files

The design-matrix data file must be provided as a tab-separated ASCII/text file (DM.txt). Its content should follow the syntax below. Note that the first line contains the column headers; this requires the "-h" flag on the command line at execution so that the first line is interpreted as column headers and not as data. The first two columns are the subject identifier and the filename of the corresponding imaging volume, respectively. Columns 3 and on store the corresponding predictor variable (covariate) values. Typically there will be between 1 and 10 covariates. Note that as of March 2009, all covariates have to be numerical values, so string variables must be encoded as numbers. For example, SEX can become 0(Male), 1(Female); and GROUP_ID can be 0(Normal), 1(MCI), 2(AD), as shown below.

SUBJECT_ID FILENAME SEX GROUP_ID AGE CDR MMSE
1 /dir_1/1.img 1 0 76.38 0 29
2 /dir_2/2.img 0 0 79.37 0 30
3 /dir_3/3.img 1 1 65.22 0.5 27
4 /dir_4/4.img 1 1 69.42 0.5 25
5 /dir_5/5.img 1 2 70.75 0.5 26
6 /dir_6/6.img 0 1 73.73 0.5 25
7 /dir_7/7.img 1 0 71.2 0 30
8 /dir_8/8.img 0 1 82.78 0.5 28
9 /dir_9/9.img 1 0 70.8 0 30
10 /dir_10/10.img 1 1 75.35 0.5 24
11 /dir_11/11.img 0 1 85.65 0.5 26
12 /dir_12/12.img 0 0 84.76 0 30
13 /dir_13/13.img 0 1 78.87 0.5 26
14 /dir_14/14.img 0 0 70.87 0 29
15 /dir_15/15.img 1 2 71.12 0.5 21
16 /dir_16/16.img 0 0 72.98 0 29
17 /dir_17/17.img 1 0 73.44 0 30
18 /dir_18/18.img 1 2 74.76 1 25
19 /dir_19/19.img 0 2 63.65 0.5 22
20 /dir_20/20.img 0 0 75.91 0 28
21 /dir_21/21.img 1 0 75.4 0 27
22 /dir_22/22.img 0 1 76.31 0.5 26
23 /dir_23/23.img 0 1 83.24 0.5 28
24 /dir_24/24.img 1 2 90.99 0.5 26
25 /dir_25/25.img 0 0 87.33 0 30
26 /dir_26/26.img 1 0 71.94 0 30
27 /dir_27/27.img 1 1 68.98 0.5 26
28 /dir_28/28.img 1 1 56.25 0.5 24
29 /dir_29/29.img 0 1 67.49 0.5 27
30 /dir_30/30.img 0 1 84.44 0.5 29
31 /dir_31/31.img 0 1 66.37 0.5 28
32 /dir_32/32.img 1 2 83.66 1 20
33 /dir_33/33.img 0 0 76.11 0 30
34 /dir_34/34.img 0 1 77.87 0.5 29
35 /dir_35/35.img 0 1 68.54 0.5 30
36 /dir_36/36.img 0 2 83.37 1 20
37 /dir_37/37.img 0 0 59.98 0 30
38 /dir_38/38.img 0 1 70.03 0.5 25
39 /dir_39/39.img 0 2 65.56 1 24
40 /dir_40/40.img 1 1 80.04 0.5 29
41 /dir_41/41.img 1 0 77.81 0 30
42 /dir_42/42.img 0 0 70.2 0 30
43 /dir_43/43.img 1 0 77.14 0 29
44 /dir_44/44.img 1 1 76.07 0.5 24
45 /dir_45/45.img 1 2 65.94 0.5 25
46 /dir_46/46.img 1 1 64.66 0.5 27
47 /dir_47/47.img 0 1 70.55 0.5 28
48 /dir_48/48.img 0 1 71.3 0.5 25
49 /dir_49/49.img 0 1 86.59 0.5 25
50 /dir_50/50.img 1 1 84.09 0.5 24
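
Because hidden characters and non-numeric covariates only show up as a generic error at run time, it can help to check the design matrix beforehand. The Python sketch below is an illustrative checker of our own (check_design_matrix is a hypothetical name and does not reproduce SOCR's parser). It assumes a header row, tab separators, "." for missing values, and covariates starting in column 3.

```python
def check_design_matrix(text, regressors):
    """Return a list of problems found in tab-separated design-matrix text."""
    problems = []
    if "\r" in text:
        problems.append("carriage returns found; save the file with Unix line endings")
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return problems + ["file is empty"]
    header = lines[0].split("\t")
    # Interaction terms such as AGE*CDR are checked factor by factor.
    factors = {f for name in regressors for f in name.split("*")}
    for factor in sorted(factors):
        if factor not in header:
            problems.append(f"regressor {factor} is not a header column")
    # Row numbers count non-blank lines, with the header as line 1.
    for number, line in enumerate(lines[1:], start=2):
        fields = line.split("\t")
        if len(fields) != len(header):
            problems.append(f"line {number}: expected {len(header)} fields, found {len(fields)}")
            continue
        # Columns 1-2 hold the subject ID and volume filename; the rest are covariates.
        for name, value in zip(header[2:], fields[2:]):
            if value == ".":
                continue  # "." marks a missing value
            try:
                float(value)
            except ValueError:
                problems.append(f"line {number}: {name} value {value!r} is not numeric")
    return problems
```

When reading the file from disk, open it with open(path, newline="") so that Python does not silently translate carriage returns before the check sees them. An empty result list means none of these particular problems were found.
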

Assumptions

The SOCR MLR analysis implements the General Linear Model (GLM) and does *not* require normality. The only two assumptions are:

  • The design matrix X must have full column rank; otherwise the parameter vector β is not identified, and at most its value can be narrowed down to some linear subspace of \(\mathbb{R}^p\). For this property to hold, we must have n > p, where n is the sample size and p is the number of columns of X (a necessary but not sufficient condition).
  • The regressors \(X_i\) are assumed to be error-free, that is they are not contaminated with measurement errors. Although not realistic in many settings, dropping this assumption leads to significantly more difficult errors-in-variables models.
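
The full-column-rank assumption can be checked numerically before running an analysis. The Python sketch below is a minimal illustration using Gaussian elimination with partial pivoting; it is not how SOCR tests rank, and the function names and the absolute tolerance are assumptions. Each row of the matrix is one subject, and the columns are the selected regressors (plus a column of ones if an intercept is fitted).

```python
def column_rank(matrix, tol=1e-9):
    """Rank of a list-of-rows matrix via Gaussian elimination with partial pivoting."""
    rows = [[float(v) for v in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        # Choose the remaining row with the largest entry in this column.
        pivot = max(range(rank, n_rows), key=lambda r: abs(rows[r][col]), default=None)
        if pivot is None or abs(rows[pivot][col]) <= tol:
            continue  # no usable pivot: this column depends on earlier ones
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, n_rows):
            factor = rows[r][col] / rows[rank][col]
            for c in range(col, n_cols):
                rows[r][c] -= factor * rows[rank][c]
        rank += 1
    return rank


def satisfies_rank_assumption(matrix):
    """True when n > p and the matrix has full column rank p."""
    if not matrix:
        return False
    n, p = len(matrix), len(matrix[0])
    return n > p and column_rank(matrix) == p
```

A design in which one covariate is an exact multiple of another (for instance, the same measurement recorded in two units) fails this check, and one of the two columns should be dropped from -regressors. For covariates on very different scales, a relative tolerance may be more appropriate than the fixed one used here.
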

Supplementary information

Auxiliary tools

Masking statistical volumes

This tool constructs masks of statistical analysis results (e.g., P, R or T-stat maps produced by VolumeMultipleRegression) given a user-specified threshold value. The output masks are 1-byte volumes with intensities 0 and 1, determined by the threshold value.

  • Example call:

java -cp /ifs/ccb/CCB_SW_Tools/others/Statistics/SOCR_Statistics/bin/SOCR_core.jar:/ifs/ccb/CCB_SW_Tools/others/Statistics/SOCR_Statistics/bin/SOCR_plugin.jar edu.ucla.stat.SOCR.analyses.command.SplitMaskPositiveNegativeAnalysisResults -dim Zmax Ymax XMax -input filename -mask mask_filename -threshold Value [-below filename] [-above filename] -data_type [0,1,2,3,4] -byteorder little

  • Options:
    • -help: print usage
    • -mask [Mask-volume.img]: specify a mask volume (0 or 1 intensities) restricting the voxels where the regression models are computed (optional); a 1-byte Analyze-format volume of the same dimensions as the data
    • -dim Zmax Ymax Xmax: specify the dimension sizes (for 2D images use Zmax=1; for 1D, use Zmax=Ymax=1)
    • -below [Filename]: output mask of the intensities, where input <= threshold (1-Byte)
    • -above [Filename]: output mask of the intensities, where input > threshold (1-Byte)
    • -threshold [threshold_value]: threshold value separating the below/above intensities of the input file
    • -input [Filename]: input file-name
    • -data_type [0,1,2,3,4]:
      • Type=0 is for Unsigned Byte input volumes
      • Type=1 is for Signed Byte input volumes
      • Type=2 is for Unsigned Short Integer input volumes
      • Type=3 is for Signed Short Integer input volumes
      • Type=4 is for 4-byte Float input volumes
    • -byteorder string: string is one of {big, little, other}.
      • big = BIG_ENDIAN processor
      • little = LITTLE_ENDIAN processor
      • other = default processor (java.nio.ByteOrder.nativeOrder())
    • -byteswap: (deprecated) Only enter this flag if you want the input data to be read in and byteswapped! Note that -byteswap affects the input data, the mask volume and the output results!
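
To make the thresholding rule concrete, the Python sketch below mimics what the below/above masks contain for a raw voxel buffer. It is an illustration only, not the SOCR implementation: split_masks is a hypothetical name, "above" is taken to mean strictly greater than the threshold, and voxels outside the optional mask (intensity 0) are assumed to be set to 0 in both outputs.

```python
import struct

# struct format codes for the -data_type values 0-4.
FORMATS = {0: "B", 1: "b", 2: "H", 3: "h", 4: "f"}


def split_masks(raw, data_type, threshold, byteorder="little", mask=None):
    """Return (below, above) 1-byte masks for a raw voxel buffer."""
    fmt = ("<" if byteorder == "little" else ">") + FORMATS[data_type]
    count = len(raw) // struct.calcsize(fmt)
    values = struct.unpack(f"{fmt[0]}{count}{fmt[1]}", raw)
    below = bytearray(count)
    above = bytearray(count)
    for i, value in enumerate(values):
        if mask is not None and mask[i] == 0:
            continue  # voxel lies outside the mask; leave both outputs at 0
        if value <= threshold:
            below[i] = 1
        else:
            above[i] = 1
    return bytes(below), bytes(above)
```

For a P-value map, the below mask at a threshold such as 0.05 marks the voxels that would be called significant at that level. Note that 4-byte floats cannot represent most decimal thresholds exactly, so voxels sitting exactly at the threshold may fall on either side.
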
