SOCR Data Analysis Documentation

Latest revision as of 20:07, 21 January 2007

Framework and Implementation

Framework of the Analysis Component

In the analysis component, there are three sets of classes: Data, Model, and Result. They are located at:

  1. Data: the edu.ucla.stat.SOCR.analyses.data.Data class.
  2. Model Classes: under the directory of edu.ucla.stat.SOCR.analyses.model.
  3. Result Classes: under the directory of edu.ucla.stat.SOCR.analyses.result.

A Data object is the one that does the work for you. It stores the data you submit as member variables, and it calls the Model classes behind the scenes to compute results. The Result objects it returns have "getter" methods for reading the computed values. All of these methods are public and non-static. See the example below for how this works.
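
The division of labor described above can be sketched with three small stand-in classes. This is only an illustration of the pattern, not SOCR code: SketchData, SketchMeanModel, SketchResult, addValues and modelMean are hypothetical names, and the "model" here just computes a mean.

```java
// Hypothetical stand-ins that mirror the Data / Model / Result split.
// None of these classes or methods belong to SOCR.
final class SketchResult {
    private final double mean;

    SketchResult(double mean) {
        this.mean = mean;
    }

    // Getter used by the caller to read the computed value.
    double getMean() {
        return mean;
    }
}

final class SketchMeanModel {
    // The "model" does the arithmetic and packages it as a result object.
    static SketchResult analyze(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return new SketchResult(sum / values.length);
    }
}

final class SketchData {
    // The data object keeps the submitted values as a member variable.
    private double[] values = new double[0];

    void addValues(double[] submitted) {
        values = submitted.clone();
    }

    // Public, non-static method that calls the model behind the scenes.
    SketchResult modelMean() {
        if (values.length == 0) {
            throw new IllegalStateException("no data submitted");
        }
        return SketchMeanModel.analyze(values);
    }
}

final class ArchitectureSketch {
    public static void main(String[] args) {
        SketchData data = new SketchData();
        data.addValues(new double[] {60, 55, 51, 54, 63});
        System.out.println("mean = " + data.modelMean().getMean());
    }
}
```

The caller only ever touches the data object and the result's getters; the model class stays an implementation detail.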


Example

Here is an example snippet. Suppose we would like to run a simple linear regression on two variables, height and weight, whose data arrays have the same length.


double[] height = new double[] {60, 55, 51, 54, 63};
double[] weight = new double[] {190, 160, 110, 120, 130};

// First, instantiate a Data object.
Data data = new Data();

// Second, submit the independent variable with addPredictor
// and the dependent variable with addResponse.
data.addPredictor(height, DataType.QUANTITATIVE);
data.addResponse(weight, DataType.QUANTITATIVE);

// Last, run the analysis and read the results.
try {
	SimpleLinearRegressionResult result = data.modelSimpleLinearRegression();
	if (result != null) {
		/* Get the model's parameter estimates and statistics.
		   Here are two of them: alpha (intercept) and beta (slope). */
		double alpha = result.getAlpha();
		double beta = result.getBeta();

		System.out.println("alpha = " + alpha);
		System.out.println("beta = " + beta);
	}
} catch (Exception e) {
	System.out.println("Something is wrong. No result generated. " + e);
}
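
To cross-check the values returned for alpha and beta, the ordinary least-squares estimates can be computed by hand. The sketch below is independent of SOCR and assumes that alpha is the intercept and beta the slope of the fitted line y = alpha + beta * x, as the comment in the example states. LeastSquaresSketch and fit are hypothetical names.

```java
// Standalone least-squares sketch; not part of the SOCR library.
final class LeastSquaresSketch {
    private LeastSquaresSketch() {}

    /** Returns {alpha, beta}: intercept and slope of the line y = alpha + beta * x. */
    static double[] fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("x and y must have the same length (at least 2)");
        }
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < x.length; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= x.length;
        meanY /= y.length;

        // Sums of squares and cross-products about the means.
        double sxx = 0.0;
        double sxy = 0.0;
        for (int i = 0; i < x.length; i++) {
            double dx = x[i] - meanX;
            sxx += dx * dx;
            sxy += dx * (y[i] - meanY);
        }
        if (sxx == 0.0) {
            throw new IllegalArgumentException("x values are all identical");
        }
        double beta = sxy / sxx;
        double alpha = meanY - beta * meanX;
        return new double[] {alpha, beta};
    }

    public static void main(String[] args) {
        double[] height = {60, 55, 51, 54, 63};
        double[] weight = {190, 160, 110, 120, 130};
        double[] estimates = fit(height, weight);
        System.out.println("alpha = " + estimates[0]);
        System.out.println("beta = " + estimates[1]);
    }
}
```

For the five observations above, the slope works out to 294 / 93.2, roughly 3.155, and the intercept to roughly -36.545.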


A Few Points to Note

Here are a few things you should know when you code analysis method calls:

  • The Exception caught by the try block above can be any of: DataIsEmptyException (SOCR defined), WrongAnalysisException (SOCR defined), InstantiationException, IllegalAccessException, or ClassNotFoundException. If the analysis type is not specified correctly, a WrongAnalysisException is thrown. If the array that holds the data is null or has length zero, a DataIsEmptyException is thrown. The public methods in Data declare the general "Exception", which covers all of the above. When any exception occurs, check the stack trace for details.
  • For the complete set of available results, see the examples below or check the class API under the edu.ucla.stat.SOCR.analyses.example directory. All of the methods for fetching output are under the edu.ucla.stat.SOCR.analyses.result directory.
  • Data Type: there are two broad categories of data, QUANTITATIVE and FACTOR. QUANTITATIVE covers variables such as height, weight, or SAT score. FACTOR covers categorical variables such as sex (Male/Female) or race (White/Black/Asian/Hispanic, etc.). "QUANTITATIVE" and "FACTOR" are coded as Java constants declared in the DataType.java class.
  • A few words about the addPredictor and addResponse methods: all QUANTITATIVE variables must be submitted as int, long, float, or double, and all FACTOR variables must be submitted as String. If the mapping of "QUANTITATIVE as double" and "FACTOR as String" is not followed, an exception is thrown and the computation stops. For example, if you intend to submit an array of data as QUANTITATIVE but submit it as String, the code cannot compute results.
  • Note that for the models in the Parametric and Non-Parametric categories, you only need to instantiate a "Data" object and then call one of the Data class's methods to get a "Result" object back (for example, "TwoPairedTResult", which is a subclass of "Result"). For the models in the Linear Model category, however, you must also call "addPredictor" and "addResponse" to submit the data.
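
The type-mapping and empty-data rules in the list above can be illustrated with a small validation sketch. It is not how SOCR performs the check: SubmissionCheckSketch, Kind and check are hypothetical names, and IllegalArgumentException stands in for the SOCR-defined exceptions.

```java
import java.lang.reflect.Array;

// Hypothetical validation sketch; the real constants live in DataType.java.
final class SubmissionCheckSketch {
    private SubmissionCheckSketch() {}

    // Stand-ins for the two data categories described in the text.
    enum Kind { QUANTITATIVE, FACTOR }

    /** Throws if the array is null, empty, or does not match the declared category. */
    static void check(Object values, Kind kind) {
        if (values == null) {
            throw new IllegalArgumentException("data is null");
        }
        boolean numeric = values instanceof int[] || values instanceof long[]
                || values instanceof float[] || values instanceof double[];
        boolean text = values instanceof String[];
        if (!numeric && !text) {
            throw new IllegalArgumentException("unsupported data type: " + values.getClass());
        }
        if (Array.getLength(values) == 0) {
            throw new IllegalArgumentException("data is empty");
        }
        if (kind == Kind.QUANTITATIVE && !numeric) {
            throw new IllegalArgumentException("QUANTITATIVE data must be int, long, float or double");
        }
        if (kind == Kind.FACTOR && !text) {
            throw new IllegalArgumentException("FACTOR data must be String");
        }
    }

    public static void main(String[] args) {
        check(new double[] {60, 55, 51}, Kind.QUANTITATIVE);   // accepted
        check(new String[] {"Male", "Female"}, Kind.FACTOR);   // accepted
        try {
            check(new String[] {"60", "55"}, Kind.QUANTITATIVE);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Checking the data at submission time means the caller learns about a mismatch before any model code runs.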

Implemented Analysis Models

As of August 1, 2006, we have implemented:

Under Linear Models:

  • One Way ANOVA
  • Two Way ANOVA
  • Simple Linear Regression
  • Multiple Linear Regression

Under Parametric Testing:

  • One Sample T-Test
  • Two Independent Sample T-Test
  • Two Paired Sample T-Test

Under Non-Parametric Testing:

  • Two Independent Sample Wilcoxon Test
  • Multiple Independent Sample Kruskal-Wallis Test

Under Survival Analysis:

  • Kaplan-Meier Method


More Examples

The linear model examples are based on the logic described in section 1.


The parametric and non-parametric tests are even easier to use with ad hoc static methods.
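
As a rough picture of what a test exposed through a static method looks like, the sketch below computes a one-sample t statistic. It is not SOCR's API: TTestSketch and oneSampleT are hypothetical names, and the hypothesized mean of 50 is an arbitrary value chosen for illustration.

```java
// Standalone sketch of a static test method; not part of the SOCR library.
final class TTestSketch {
    private TTestSketch() {}

    /** Returns t = (mean - mu0) / (s / sqrt(n)), with s the sample standard deviation. */
    static double oneSampleT(double[] sample, double mu0) {
        int n = sample.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two observations");
        }
        double mean = 0.0;
        for (double v : sample) {
            mean += v;
        }
        mean /= n;

        // Unbiased sample variance (divides by n - 1).
        double ss = 0.0;
        for (double v : sample) {
            ss += (v - mean) * (v - mean);
        }
        double variance = ss / (n - 1);
        if (variance == 0.0) {
            throw new IllegalArgumentException("sample has zero variance");
        }
        return (mean - mu0) / Math.sqrt(variance / n);
    }

    public static void main(String[] args) {
        double[] height = {60, 55, 51, 54, 63};
        System.out.println("t = " + oneSampleT(height, 50.0));
    }
}
```

Because the method is static, no object needs to be built or populated before the call; the data and the hypothesized mean are passed in directly.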


Survival Analysis using Kaplan-Meier

  • Example: Survival Analysis Using Kaplan-Meier



