==[[SMHS| Scientific Methods for Health Sciences]] - Multi-Model Inference ==

===Motivation===

===Theory===

====Akaike Information Criterion====

For a given dataset, the Akaike Information Criterion (AIC) measures the relative quality of a statistical model. AIC is rooted in information entropy and quantifies model quality only relative to other candidate models; it does not support hypothesis testing in an absolute sense. For instance, AIC gives no warning when all of the candidate models fit the data poorly.

$$AIC= 2k -2\ln(L),$$

where $k$ is the number of parameters in the statistical model and $L$ is the maximal value of the likelihood function for the estimated model. In R, AIC may be computed using [http://stat.ethz.ch/R-manual/R-devel/library/stats/html/extractAIC.html extractAIC].
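
As a quick illustration, the AIC formula can be computed directly from a model's maximized log-likelihood and parameter count. This is a minimal sketch; the log-likelihood values below are hypothetical:

```python
def aic(log_likelihood, k):
    """Akaike Information Criterion: AIC = 2k - 2*ln(L),
    where log_likelihood = ln(L) is the maximized log-likelihood."""
    return 2 * k - 2 * log_likelihood

# Two hypothetical models fit to the same dataset:
aic_simple = aic(log_likelihood=-120.5, k=3)   # 2*3 - 2*(-120.5) = 247.0
aic_complex = aic(log_likelihood=-118.9, k=7)  # 2*7 - 2*(-118.9) = 251.8
# Here the simpler model wins: its AIC is smaller despite the lower likelihood.
```

Note how the extra four parameters of the second model outweigh its slightly better fit, which is exactly the regularization behavior described below.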

If we have a collection of candidate models for the dataset, the optimal model is the one that minimizes the AIC value, i.e., has maximal [[SMHS_CIs#Maximum_likelihood_estimation_.28MLE.29|log-likelihood]] relative to the number of parameters estimated by the model. AIC includes a fidelity term (rewarding goodness of fit) and a regularization term (penalizing models with a large number of parameters to estimate, which discourages [http://en.wikipedia.org/wiki/Overfitting model overfitting]).

The relation between sample size ($n$) and the number of estimated parameters ($k$) is important. When the sample size is small relative to the number of parameters, $\frac{n}{k} < 40$, the corrected AIC (AICc) should be used:

$$AICc = AIC + 2k\frac{k+1}{n-(k+1)}.$$

Note that AICc converges to AIC as $n$ increases. Since this information criterion is used only for relative comparisons, the use of AICc is generally justified.

====Calculating model weights====

We can rank-order the AIC values of the candidate models and compute $\Delta AIC_i$, the difference between the AIC of model $i$ and the smallest AIC (i.e., the relative AIC of each model with respect to the best model, the one with the smallest AIC value). The Akaike weight ($w_i$) of model $i$ is then:

$$w_i = \frac{e^{-\frac{1}{2}\Delta AIC_i}}{\sum_{j=1}^K{e^{-\frac{1}{2}\Delta AIC_j}}}.$$

Note that the weight of the $i^{th}$ model satisfies $0 \leq w_i \leq 1$ and the AIC weights sum to 1. The larger the model weight, the better the model. For example, a model with weight $w=0.52$ has a 52% probability of being the best model in the candidate set.

====Model selection and multimodel inference====

In practice, the model selection protocol may never be perfect, especially when the $\Delta AIC$ values are small. When $\Delta AIC_{(2)} > 2$ (the difference between the second-ranked and the best model), the first (rank-ordered) model is likely to be the best model.

When $\Delta AIC_{(2)} < 2$, no single model is clearly best, and model averaging is appropriate. Models with large AIC values have small model weights, so they influence the model-averaged result very little. If $\hat{W}_i$ denotes the predicted value of the dependent variable under model $i$, model averaging yields the improved estimate:

$$\hat{\bar{W}}=\sum_{i=1}^K{w_i \hat{W}_i}.$$

That is, the model-averaged prediction is calculated from the predictions of all models, weight-averaged by their AIC weights. Similarly, we can obtain model-averaged estimates of parameters. If the model-$i$ estimate of the parameter $\theta$ is $\hat{\theta}_i$, the averaged parameter estimate is:

$$\hat{\bar{\theta}}=\sum_{i=1}^K{w_i \hat{\theta}_i}.$$

Then, the unconditional variance estimate of the parameter $\theta$ is:

$$var\big(\hat{\bar{\theta}}\big)=\sum_{i=1}^K{w_i \Big[ var(\hat{\theta}_i | g_i) + \big( \hat{\theta}_i -\hat{\bar{\theta}} \big)^2 \Big]},$$

where $\hat{\bar{\theta}}$ is the model-averaged estimate, $w_i$ is the model weight, and $g_i$ denotes the $i^{th}$ model. This variance estimator incorporates both sampling variance and a variance component for model-selection uncertainty. Finally, a confidence interval for the averaged parameter estimate can be constructed as:

$$\hat{\bar{\theta}} \pm z_{\alpha/2} \times \sqrt{var\big(\hat{\bar{\theta}}\big)}.$$

===Applications===
  
  

Revision as of 16:46, 30 September 2014

