Revision as of 13:25, 29 August 2014

Scientific Methods for Health Sciences - Instrument Performance Evaluation: Cronbach's α


Cronbach’s alpha ($\alpha$) is a coefficient of internal consistency and is commonly used as an estimate of the reliability of a psychometric test. Internal consistency is typically based on the correlations between different items on the same test and measures whether several items that propose to measure the same general construct produce similar scores. Cronbach’s alpha is widely used in the social sciences, nursing, business and other disciplines. Here we present a general introduction to Cronbach’s alpha: how it is calculated, how to apply it in research, and some common problems encountered when using it.


We have discussed internal and external consistency and their importance in research studies. How do we measure internal consistency? For example, suppose we are interested in measuring the extent of handicap of patients suffering from a certain disease. The dataset contains 10 items measuring the degree of difficulty experienced in carrying out daily activities. Each item is scored from 1 (no difficulty) to 4 (can’t do). For these items to form a scale they need to have internal consistency: all items should measure the same thing, so they should be correlated with one another. Cronbach’s alpha generally increases when the correlations between items increase.


Cronbach’s Alpha

Cronbach’s Alpha is a measure of the internal consistency, or reliability, of a psychometric instrument: it measures how well a set of items captures a single, unidimensional latent aspect of individuals.

  • Suppose we measure a quantity $X$ which is a sum of $K$ components: $X=Y_{1}+Y_{2}+\cdots+Y_{K}$. Then Cronbach’s alpha is defined as $\alpha =\frac{K}{K-1}\left( 1-\frac{\sum_{i=1}^{K}\sigma_{Y_{i}}^{2}}{\sigma_{X}^{2}}\right)$, where $\sigma_{X}^{2}$ is the variance of the observed total test scores, and $\sigma_{Y_{i}}^{2}$ is the variance of component $i$ for the current sample.
If items are scored from 0 to 1, then $\alpha =\frac{K}{K-1}\left( 1-\frac{\sum_{i=1}^{K}P_{i}Q_{i}}{\sigma_{X}^{2}} \right)$, where $P_{i}$ is the proportion scoring 1 on item $i$ and $Q_{i}=1-P_{i}$. Alternatively, Cronbach’s alpha can be defined as $\alpha=\frac{K\bar c}{\bar v +(K-1)\bar c}$, where $K$ is as above, $\bar v$ is the average variance of the components and $\bar c$ is the average of all covariances between the components across the current sample of persons.
  • The standardized Cronbach’s alpha is defined as $\alpha_{standardized}=\frac{K\bar r}{1+(K-1)\bar r}$, where $\bar r$ is the mean of the $\frac{K(K-1)}{2}$ non-redundant correlation coefficients (i.e., the mean of the upper-triangular, or lower-triangular, correlation matrix).
  • The theoretical value of alpha varies from 0 to 1, since it is a ratio of two variances. The reliability of test scores is the ratio of true-score variance to total-score variance: $\rho_{XX'}=\frac{\sigma_{T}^{2}}{\sigma_{X}^{2}}$.
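The two definitions of alpha above are algebraically equivalent, which is easy to check numerically. A minimal Python sketch (function names and sample data are illustrative, not part of the original text) computes alpha both from the item and total-score variances and from the average variance/covariance form:

```python
import numpy as np

def cronbach_alpha(items):
    """items: n x K array (n subjects, K items). First form of alpha."""
    items = np.asarray(items, dtype=float)
    K = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # sigma^2_{Y_i} per item
    total_var = items.sum(axis=1).var(ddof=1)  # sigma^2_X of total scores
    return (K / (K - 1)) * (1 - item_vars.sum() / total_var)

def cronbach_alpha_cov(items):
    """Equivalent form: alpha = K*c_bar / (v_bar + (K-1)*c_bar)."""
    items = np.asarray(items, dtype=float)
    K = items.shape[1]
    C = np.cov(items, rowvar=False)            # K x K sample covariance matrix
    v_bar = np.diag(C).mean()                  # average item variance
    c_bar = (C.sum() - np.trace(C)) / (K * (K - 1))  # average covariance
    return K * c_bar / (v_bar + (K - 1) * c_bar)

# Hypothetical responses of 5 subjects on 3 items:
data = np.array([[1, 2, 1], [2, 3, 2], [3, 3, 3], [4, 4, 3], [2, 2, 1]])
print(cronbach_alpha(data), cronbach_alpha_cov(data))  # the two values agree
```

The equivalence follows because the variance of the total score equals the sum of all entries of the item covariance matrix, $\sigma_X^2 = K\bar v + K(K-1)\bar c$.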

Internal consistency

Internal consistency is a measure of whether several items that are proposed to measure the same general construct produce similar scores. It is usually measured with Cronbach’s alpha, which is calculated from the pairwise correlations between items. Internal consistency can take values from negative infinity to 1; it is negative when there is greater within-subject variability than between-subject variability. Only positive values of Cronbach’s alpha are meaningful. Cronbach’s alpha generally increases as the inter-correlations among the tested items increase.

Cronbach's alpha Internal consistency
$\alpha$ ≥ 0.9 Excellent (High-Stakes testing)
0.7 ≤ $\alpha$ < 0.9 Good (Low-Stakes testing)
0.6 ≤ $\alpha$ < 0.7 Acceptable
0.5 ≤ $\alpha$ < 0.6 Poor
$\alpha$ < 0.5 Unacceptable

Other Measures

  • Intra-class correlation: Cronbach’s alpha equals the stepped-up intra-class correlation coefficient, which is commonly used in observational studies, if and only if the item variance component equals zero. If this variance component is negative, alpha underestimates the stepped-up intra-class correlation coefficient; if it is positive, alpha overestimates it.

Generalizability theory

Cronbach’s alpha is an unbiased estimate of the generalizability. It can be viewed as a measure of how well the sum score on the selected items captures the expected score in the entire domain, even if that domain is heterogeneous.

Problems with Cronbach’s alpha

  1. It depends not only on the magnitude of the correlations among items, but also on the number of items in the scale. Hence, a scale can be made to look more homogeneous simply by increasing the number of items, even though the average correlation remains the same;
  2. if two scales each measuring a distinct aspect are combined to form one long scale, alpha will probably be high even though the merged scale obviously taps two different attributes;
  3. if alpha is too high, it may indicate a high level of item redundancy.
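Problem 1 is easy to demonstrate with the standardized-alpha formula: holding the average inter-item correlation fixed, alpha rises purely because the number of items grows. A short Python illustration (the value $\bar r = 0.2$ is chosen arbitrarily):

```python
def standardized_alpha(K, r_bar):
    """Standardized Cronbach's alpha: K*r_bar / (1 + (K-1)*r_bar)."""
    return K * r_bar / (1 + (K - 1) * r_bar)

# Fixed average inter-item correlation r_bar = 0.2; only K changes:
for K in (5, 10, 20, 40):
    print(K, round(standardized_alpha(K, 0.2), 3))
# 5 0.556
# 10 0.714
# 20 0.833
# 40 0.909
```

With identical item quality, a 40-item scale looks "excellent" while a 5-item scale looks "poor", which is why alpha alone cannot establish homogeneity.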

Split-Half Reliability

In Split-Half Reliability assessment, the test is split in half (e.g., odd / even items), creating “equivalent forms”. The two “forms” are correlated with each other and the correlation coefficient is adjusted to reflect the entire test length, using the Spearman-Brown prophecy formula. Suppose $Corr(Even,Odd)=r$ is the raw correlation between the even and odd items. Then the adjusted correlation is $r' = \frac{nr}{1 + (n-1)r},$ where $n$ is the number of parts being combined (in this case $n=2$ half-tests).


Index Q1 Q2 Q3 Q4 Q5 Q6 Odd Even
1 1 0 0 1 1 0 2 1
2 1 1 0 1 0 1 1 3
3 1 1 1 1 1 0 3 2
4 1 0 0 0 1 0 2 0
5 1 1 1 1 0 0 2 2
6 0 0 0 0 1 0 1 0
mean 1.833333333 1.33333333
SD 0.752772653 1.21106014
corr(Even,Odd) 0.073127242
AdjCorr(Even,Odd)=$\frac{nr}{1+(n-1)r}$ 0.136288111
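The numbers in this table can be reproduced directly. A brief Python check (NumPy's `corrcoef` computes the same Pearson correlation as above):

```python
import numpy as np

# The six subjects' responses to items Q1-Q6 from the table above:
items = np.array([[1, 0, 0, 1, 1, 0],
                  [1, 1, 0, 1, 0, 1],
                  [1, 1, 1, 1, 1, 0],
                  [1, 0, 0, 0, 1, 0],
                  [1, 1, 1, 1, 0, 0],
                  [0, 0, 0, 0, 1, 0]])
odd  = items[:, 0::2].sum(axis=1)   # Q1 + Q3 + Q5 per subject
even = items[:, 1::2].sum(axis=1)   # Q2 + Q4 + Q6 per subject
r = np.corrcoef(odd, even)[0, 1]    # raw half-test correlation
n = 2                               # the full test = 2 half-tests
r_adj = n * r / (1 + (n - 1) * r)   # Spearman-Brown adjustment
print(round(r, 6), round(r_adj, 6)) # → 0.073127 0.136288
```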


The Kuder–Richardson Formula 20 (KR-20) is an internal-consistency reliability estimate which can be viewed as an average of the split-half reliabilities over every possible split of the items. For a test with $K$ dichotomous test items indexed $i=1$ to $K$: $$KR\text{-}20 = \frac{K}{K-1} \left( 1 - \frac{\sum_{i=1}^K p_i q_i}{\sigma^2_X} \right),$$ where $p_i$ is the proportion of correct responses to test item $i$, $q_i$ is the proportion of incorrect responses to test item $i$ (thus $p_i + q_i = 1$), and the total-score variance in the denominator is $\sigma^2_X = \frac{\sum_{j=1}^n (X_j-\bar{X})^2}{n-1},$ where $n$ is the total sample size.

Cronbach's α and KR-20 are similar. In fact, Cronbach's α is a generalization of KR-20: α can handle both dichotomous and continuous variables, whereas KR-20 applies only to dichotomous (0/1) items; it cannot be used when multiple-choice questions involve partial credit, and such tests require systematic item-based analysis instead.
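The KR-20 formula above is straightforward to implement. A minimal Python sketch (function name hypothetical), applied to the six-subject, six-item 0/1 table from the split-half example:

```python
import numpy as np

def kr20(items):
    """items: n x K array of dichotomous (0/1) responses; KR-20 as defined above."""
    items = np.asarray(items, dtype=float)
    K = items.shape[1]
    p = items.mean(axis=0)                     # proportion answering 1 on each item
    q = 1 - p                                  # proportion answering 0
    var_total = items.sum(axis=1).var(ddof=1)  # sample variance of total scores
    return (K / (K - 1)) * (1 - (p * q).sum() / var_total)

items = np.array([[1, 0, 0, 1, 1, 0],
                  [1, 1, 0, 1, 0, 1],
                  [1, 1, 1, 1, 1, 0],
                  [1, 0, 0, 0, 1, 0],
                  [1, 1, 1, 1, 0, 0],
                  [0, 0, 0, 0, 1, 0]])
print(round(kr20(items), 4))  # → 0.5385
```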

Standard Error of Measurement (SEM)

The greater the reliability of the test, the smaller the SEM.

$$SEM=S\sqrt{1-r_{xx'}},$$ where $r_{xx'}$ is the correlation between two instances of the measurement under identical conditions, and $S$ is the total standard deviation.
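The inverse relationship between reliability and SEM can be seen numerically; the values below are purely illustrative:

```python
import math

def sem(S, r_xx):
    """Standard error of measurement: S * sqrt(1 - reliability)."""
    return S * math.sqrt(1 - r_xx)

# For a fixed total SD S = 10, SEM shrinks as reliability grows:
for r in (0.5, 0.9, 0.99):
    print(r, round(sem(10, r), 3))
# 0.5 7.071
# 0.9 3.162
# 0.99 1.0
```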


  • This article explores the internal validity and reliability of Kolb’s revised Learning Style Inventory (LSI) in a sample of 221 graduate and undergraduate business students. It also reviews research on the LSI and examines the implications of conducting factor analysis on ipsative data (data where respondents compare two or more desirable options and pick the one that is most preferred, sometimes called a "forced choice" scale). Experiential learning theory is presented and the concept of learning styles explained. The paper largely supports prior research on the internal reliability of the scales.
  • This article shows why single-item questions pertaining to a construct are unreliable and should not be used in drawing conclusions. It compares the reliability of a summated multi-item scale versus a single-item question, and shows that a single item is too unreliable to support inferences about a construct.


In R: we use the psy package and its expsy dataset, a data frame with 30 rows and 16 columns (with missing data), where it1-it10 correspond to the ratings of 30 patients on a 10-item scale, r1, r2, r3 to the rating of item 1 by 3 different clinicians for the same 30 patients, and rb1, rb2, rb3 to the binary transformation of r1, r2, r3 (1 or 2 -> 0; 3 or 4 -> 1).

library(psy)  ## provides cronbach() and the expsy dataset
cronbach(v1)  ## v1 is an n*p matrix or data frame with n subjects and p items.
## This function computes Cronbach's reliability coefficient alpha. 
## The coefficient may be applied to a series of items aggregated into a single score. 
## It estimates reliability in the framework of the domain sampling model. 

An example of calculating Cronbach’s alpha:

cronbach(expsy[,1:10])  ## Cronbach's alpha for the items in columns 1 to 10
$sample.size
[1] 27
$number.of.items
[1] 10
$alpha
[1] 0.1762655
## low, because item 2 is reverse-scored (1 is high and 4 is low)     
cronbach(cbind(expsy[,c(1,3:10)],-1*expsy[,2]))
## columns 1 and 3 to 10, plus the reversed column 2: alpha for the revised data
$sample.size
[1] 27
$number.of.items
[1] 10
$alpha
[1] 0.3752657
## To obtain a 95% confidence interval:     
library(boot)
datafile <- cbind(expsy[,c(1,3:10)],-1*expsy[,2])  
## extract the revised (item-2-reversed) data into a new dataset named 'datafile'
cronbach.boot <- function(data,x) {cronbach(data[x,])[[3]]}
## [[3]] extracts the $alpha component of cronbach()'s output
res <- boot(datafile,cronbach.boot,1000)   
res
boot(data = datafile, statistic = cronbach.boot, R = 1000)
Bootstrap Statistics :
    original      bias    std. error
t1* 0.3752657 -0.06104997   0.2372292
quantile(res$t,c(0.025,0.975))
## the 2.5% and 97.5% quantiles form a 95% percentile confidence interval for Cronbach's alpha
     2.5%      97.5% 
-0.2987214  0.6330491
boot.ci(res,type="bca")
## adjusted bootstrap percentile (BCa) confidence interval (better) 
Based on 1000 bootstrap replicates

boot.ci(boot.out = res, type = "bca")

Intervals : 
Level       BCa          
95%   (-0.1514,  0.6668 )  
Calculations and Intervals on Original Scale

Cronbach's $\alpha$ calculations

The table below illustrates the setting and core calculations involved in computing the Cronbach's $\alpha$.

Subjects | Items/Questions Part of the Assessment Instrument | Total Score per Subject
$S_1$: $Y_{1,1}$ $Y_{1,2}$ ... $Y_{1,k}$ | $X_1=\sum_{j=1}^k{Y_{1,j}}$
$S_2$: $Y_{2,1}$ $Y_{2,2}$ ... $Y_{2,k}$ | $X_2=\sum_{j=1}^k{Y_{2,j}}$
... 
$S_n$: $Y_{n,1}$ $Y_{n,2}$ ... $Y_{n,k}$ | $X_n=\sum_{j=1}^k{Y_{n,j}}$
Variance per Item: $\sigma_{Y_{.,1}}^2=\frac{1}{n-1}\sum_{i=1}^n{(Y_{i,1}-\bar{Y}_{.,1})^2}$, $\sigma_{Y_{.,2}}^2=\frac{1}{n-1}\sum_{i=1}^n{(Y_{i,2}-\bar{Y}_{.,2})^2}$, ..., $\sigma_{Y_{.,k}}^2=\frac{1}{n-1}\sum_{i=1}^n{(Y_{i,k}-\bar{Y}_{.,k})^2}$ | Total-Score Variance: $\sigma_X^2=\frac{1}{n-1}\sum_{i=1}^n{(X_i-\bar{X})^2}$


