SMHS Cronbachs
Scientific Methods for Health Sciences - Instrument Performance Evaluation: Cronbach's α
Overview:
Cronbach’s alpha ($\alpha$) is a coefficient of internal consistency that is commonly used as an estimate of the reliability of a psychometric test. Internal consistency is typically based on the correlations between different items on the same test and indicates whether several items that purport to measure the same general construct produce similar scores. Cronbach’s alpha is widely used in the social sciences, nursing, business, and other disciplines. Here we present a general introduction to Cronbach’s alpha: how it is calculated, how to apply it in research, and some common problems that arise when using it.
Motivation:
We have discussed internal and external consistency and their importance in research studies. How do we measure internal consistency? For example, suppose we are interested in measuring the extent of handicap of patients suffering from a certain disease. The dataset contains 10 records measuring the degree of difficulty experienced in carrying out daily activities. Each item is scored from 1 (no difficulty) to 4 (can’t do). When these data are used to form a scale, they need to have internal consistency: all items should measure the same thing, so they should be correlated with one another. Cronbach’s alpha generally increases when the correlations between items increase.
Theory
Cronbach’s Alpha
Cronbach’s alpha is a measure of the internal consistency, or reliability, of a psychometric instrument. It indicates how well a set of items measures a single, one-dimensional latent aspect of individuals.
- Suppose we measure a quantity $X$ that is a sum of $K$ components, $X=Y_{1}+Y_{2}+\cdots+Y_{K}$. Then Cronbach’s alpha is defined as $\alpha =\frac{K}{K-1}\left( 1-\frac{\sum_{i=1}^{K}\sigma_{Y_{i}}^{2}}{\sigma_{X}^{2}}\right)$, where $\sigma_{X}^{2}$ is the variance of the observed total test scores and $\sigma_{Y_{i}}^{2}$ is the variance of component $i$ for the current sample.
- If items are scored 0 or 1, then $\alpha =\frac{K}{K-1}\left( 1-\frac{\sum_{i=1}^{K}P_{i}Q_{i}}{\sigma_{X}^{2}} \right)$, where $P_{i}$ is the proportion scoring 1 on item $i$ and $Q_{i}=1-P_{i}$.
- Alternatively, Cronbach’s alpha can be defined as $\alpha=\frac{K\bar c}{\bar v +(K-1) \bar c}$, where $K$ is as above, $\bar v$ is the average variance of the components, and $\bar c$ is the average of all covariances between different components across the current sample of persons.
- The standardized Cronbach’s alpha is defined as $\alpha_{standardized}=\frac{K\bar r}{1+(K-1)\bar r}$, where $\bar r$ is the mean of the $\frac{K(K-1)}{2}$ non-redundant correlation coefficients (i.e., the mean of the upper triangular, or lower triangular, part of the correlation matrix).
- In theory, alpha varies from 0 to 1 because it estimates a ratio of two variances: the reliability of test scores, $\rho_{XX}=\frac{\sigma_{T}^{2}}{\sigma_{X}^{2}}$, is the ratio of the true-score variance to the total-score variance.
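The variance form and the covariance form of alpha can be seen to agree by expanding the variance of the sum. The following short derivation is a sketch that uses only the symbols defined above ($K$, $\bar v$, $\bar c$):

$$\sigma_X^2 = \sum_{i=1}^{K}\sigma_{Y_i}^2 + \sum_{i\neq j} cov(Y_i,Y_j) = K\bar v + K(K-1)\bar c,$$

$$\alpha = \frac{K}{K-1}\left(1-\frac{K\bar v}{K\bar v + K(K-1)\bar c}\right) = \frac{K}{K-1}\cdot\frac{K(K-1)\bar c}{K\bar v + K(K-1)\bar c} = \frac{K\bar c}{\bar v + (K-1)\bar c}.$$

If every component is first standardized to unit variance, then $\bar v = 1$ and $\bar c = \bar r$, which gives the standardized form $\frac{K\bar r}{1+(K-1)\bar r}$.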
Internal consistency
Internal consistency is a measure of whether several items that purport to measure the same general construct produce similar scores. It is usually measured with Cronbach’s alpha, which is calculated from the pairwise correlations between items. Sample estimates of alpha can take values from negative infinity to 1; alpha is negative when there is greater within-subject variability than between-subject variability. Only positive values of Cronbach’s alpha make sense. Cronbach’s alpha generally increases as the inter-correlations among the items increase.
Cronbach's alpha | Internal consistency |
$\alpha$ ≥ 0.9 | Excellent (High-Stakes testing) |
0.7 ≤ $\alpha$ < 0.9 | Good (Low-Stakes testing) |
0.6 ≤ $\alpha$ < 0.7 | Acceptable |
0.5 ≤ $\alpha$ < 0.6 | Poor |
$\alpha$ < 0.5 | Unacceptable |
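The cut-offs in the table can be applied programmatically. The sketch below is only an illustration in base R; the helper name `interpret_alpha` is hypothetical and not part of any package, and the labels simply copy the table (without the stakes annotations).

```r
# Hypothetical helper: map alpha values to the verbal labels in the table above.
interpret_alpha <- function(alpha) {
  cut(alpha,
      breaks = c(-Inf, 0.5, 0.6, 0.7, 0.9, Inf),
      labels = c("Unacceptable", "Poor", "Acceptable", "Good", "Excellent"),
      right = FALSE)  # intervals are [lower, upper), matching "0.7 <= alpha < 0.9"
}

# Example call on a few illustrative alpha values
interpret_alpha(c(0.45, 0.65, 0.92))
```

Such labels are rules of thumb; the appropriate threshold depends on how the scale will be used.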
Other Measures
- Intra-class correlation: Cronbach’s alpha is often said to equal the stepped-up intra-class correlation coefficient, which is commonly used in observational studies. This holds if and only if the item variance component equals zero. If this variance component is negative, alpha will underestimate the stepped-up intra-class correlation coefficient; if it is positive, alpha will overestimate it.
Generalizability theory
Cronbach’s alpha is an unbiased estimate of the generalizability. It can be viewed as a measure of how well the sum score on the selected items captures the expected score in the entire domain, even if that domain is heterogeneous.
Problems with Cronbach’s alpha
- It depends not only on the magnitude of the correlations among items but also on the number of items in the scale. Hence, a scale can be made to look more homogeneous simply by increasing the number of items, even though the average correlation remains the same.
- If two scales, each measuring a distinct aspect, are combined to form one long scale, alpha will probably be high even though the merged scale is obviously tapping two different attributes.
- If alpha is too high, it may suggest a high level of item redundancy.
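The first problem can be illustrated numerically with the standardized alpha formula $\frac{K\bar r}{1+(K-1)\bar r}$. The sketch below holds the mean inter-item correlation fixed at an arbitrary illustrative value of 0.2 and varies only the number of items; the function name `alpha_std` is hypothetical.

```r
# Standardized alpha as a function of the number of items K and the
# mean inter-item correlation r_bar.
alpha_std <- function(K, r_bar) K * r_bar / (1 + (K - 1) * r_bar)

K <- c(2, 5, 10, 20, 40)                 # scale lengths to compare
alpha_by_length <- alpha_std(K, r_bar = 0.2)

data.frame(items = K, alpha = round(alpha_by_length, 3))
```

With the mean correlation held at 0.2, alpha rises from about 0.33 for 2 items to about 0.91 for 40 items, although the items are no more strongly related to each other.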
Split-Half Reliability
In split-half reliability assessment, the test is split in half (e.g., odd- versus even-numbered items), creating two “equivalent forms”. The scores on the two forms are correlated with each other, and the correlation coefficient is adjusted to reflect the entire test length using the Spearman-Brown prophecy formula. Suppose $Corr(Even,Odd)=r$ is the raw correlation between the even and odd half-test scores. Then the adjusted correlation is $r' = \frac{n r}{1 + (n-1)r},$ where $n$ is the factor by which the test is lengthened (in this case $n=2$, because the full test is twice as long as each half).
Example:
Index | Q1 | Q2 | Q3 | Q4 | Q5 | Q6 | Odd | Even |
1 | 1 | 0 | 0 | 1 | 1 | 0 | 2 | 1 |
2 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 3 |
3 | 1 | 1 | 1 | 1 | 1 | 0 | 3 | 2 |
4 | 1 | 0 | 0 | 0 | 1 | 0 | 2 | 0 |
5 | 1 | 1 | 1 | 1 | 0 | 0 | 2 | 2 |
6 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 |
Mean | | | | | | | 1.833333333 | 1.33333333 |
SD | | | | | | | 0.752772653 | 1.21106014 |
corr(Even,Odd) | 0.073127242 |
AdjCorr(Even,Odd)=$\frac{n r}{1+(n-1) r}$, with $n=2$ | 0.136288111 |
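The numbers in the example table can be reproduced in a few lines of base R. This is a minimal sketch: the matrix `Q` is typed in from the table, and the helper name `spearman_brown` is hypothetical.

```r
# Item responses from the example table: 6 respondents (rows) by 6 items (columns).
Q <- matrix(c(1, 0, 0, 1, 1, 0,
              1, 1, 0, 1, 0, 1,
              1, 1, 1, 1, 1, 0,
              1, 0, 0, 0, 1, 0,
              1, 1, 1, 1, 0, 0,
              0, 0, 0, 0, 1, 0),
            nrow = 6, byrow = TRUE,
            dimnames = list(NULL, paste0("Q", 1:6)))

odd  <- rowSums(Q[, c(1, 3, 5)])   # Q1 + Q3 + Q5
even <- rowSums(Q[, c(2, 4, 6)])   # Q2 + Q4 + Q6

r <- cor(even, odd)                # raw correlation between the two halves

# Spearman-Brown adjustment; n is the lengthening factor (2 for two halves).
spearman_brown <- function(r, n = 2) n * r / (1 + (n - 1) * r)
r_adj <- spearman_brown(r)

c(mean_odd = mean(odd), mean_even = mean(even),
  sd_odd = sd(odd), sd_even = sd(even), r = r, r_adj = r_adj)
```

Note that `sd()` and `cor()` use the $n-1$ denominator, which is what the table values are based on.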
KR-20
The Kuder–Richardson Formula 20 (KR-20) is an internal-consistency reliability estimate for dichotomously scored items; it can be thought of as calculating split-half reliability for every possible split of the items. For a test with $K$ items indexed $i=1$ to $K$: $$\text{KR-20} = \frac{K}{K-1} \left( 1 - \frac{\sum_{i=1}^K p_i q_i}{\sigma^2_X} \right),$$ where $p_i$ is the proportion of correct responses to test item $i$, $q_i$ is the proportion of incorrect responses to test item $i$ (thus $p_i + q_i= 1$), and the variance in the denominator is $\sigma^2_X = \frac{\sum_{j=1}^n (X_j-\bar{X})^2}{n-1},$ where $X_j$ is the total score of respondent $j$ and $n$ is the total sample size.
Cronbach's α and KR-20 are closely related: KR-20 is the special case of Cronbach's α for dichotomous items, whereas Cronbach's α has the advantage that it can handle both dichotomous and continuous variables. KR-20 cannot be used when multiple-choice questions involve partial credit; such items require systematic item-based analysis.
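As an illustration, KR-20 can be computed directly from its definition. The sketch below reuses the six-item, six-respondent 0/1 responses from the split-half example and follows the formula as written above (item terms $p_i q_i$, total-score variance with the $n-1$ denominator); the variable names are illustrative only.

```r
# 0/1 responses: 6 respondents (rows) by 6 items (columns), from the split-half example.
Q <- matrix(c(1, 0, 0, 1, 1, 0,
              1, 1, 0, 1, 0, 1,
              1, 1, 1, 1, 1, 0,
              1, 0, 0, 0, 1, 0,
              1, 1, 1, 1, 0, 0,
              0, 0, 0, 0, 1, 0),
            nrow = 6, byrow = TRUE)

K <- ncol(Q)              # number of items
p <- colMeans(Q)          # proportion scoring 1 on each item
q <- 1 - p                # proportion scoring 0 on each item
X <- rowSums(Q)           # total score per respondent
var_X <- var(X)           # total-score variance (n - 1 denominator)

kr20 <- (K / (K - 1)) * (1 - sum(p * q) / var_X)
kr20
```

Because $p_i q_i$ is a variance with denominator $n$ while `var()` divides by $n-1$, this value can differ slightly from Cronbach's alpha computed with sample item variances, especially in samples as small as this one.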
Standard Error of Measurement (SEM)
The greater the reliability of the test, the smaller the SEM.
$$SEM=S\sqrt{1-r_{xx'}},$$ where $r_{xx'}$ is the reliability of the test, i.e., the correlation between two instances of the measurements under identical conditions, and $S$ is the standard deviation of the total scores.
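A short base-R sketch makes the relationship between reliability and SEM concrete. The function name `sem` and the numbers (a total-score standard deviation of 10 and four reliability values) are hypothetical and chosen only for illustration.

```r
# Standard error of measurement from the total-score SD (s) and reliability (r_xx).
sem <- function(s, r_xx) {
  stopifnot(s >= 0, r_xx >= 0, r_xx <= 1)  # reliability is assumed to lie in [0, 1]
  s * sqrt(1 - r_xx)
}

reliability <- c(0.50, 0.75, 0.90, 0.99)   # hypothetical reliability values
sem_values  <- sem(10, reliability)        # hypothetical SD of 10

data.frame(reliability = reliability, SEM = round(sem_values, 2))
```

For example, with $S=10$ a reliability of 0.75 gives an SEM of 5, and the SEM shrinks toward 0 as the reliability approaches 1.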
Applications
- This article explores the internal validity and reliability of Kolb’s revised Learning Style Inventory (LSI) in a sample of 221 graduate and undergraduate business students. It also reviews research on the LSI and studies the implications of conducting factor analysis using ipsative data (data in which respondents compare two or more desirable options and pick the most preferred one, sometimes called a "forced choice" scale). Experiential learning theory is presented and the concept of learning styles is explained. The paper largely supports prior research on the internal reliability of the scales.
- This article shows why single-item questions pertaining to a construct are not reliable and should not be used in drawing conclusions. It compares the reliability of a summated, multi-item scale with that of a single-item question and shows how unreliable a single item is; it is therefore inappropriate to make inferences based on the analysis of a single-item question used to measure a construct.
Software
- SOCR Cronbach's alpha calculator webapp (coming up) ...
- In R: use the psy package and its psychometric dataset (expsy), a data frame with 30 rows and 16 columns containing some missing data. Columns it1-it10 correspond to the ratings of 30 patients on a 10-item scale; r1, r2, r3 to the ratings of item 1 by 3 different clinicians for the same 30 patients; and rb1, rb2, rb3 to the binary transformations of r1, r2, r3 (1 or 2 -> 0; 3 or 4 -> 1).

    cronbach(v1)
    ## v1 is an n*p matrix or data frame with n subjects and p items.
    ## This call computes Cronbach's reliability coefficient alpha.
    ## This coefficient may be applied to a series of items aggregated in a single score.
    ## It estimates reliability in the framework of the domain sampling model.
An example to calculate Cronbach’s alpha:
    library(psy)
    data(expsy)
    cronbach(expsy[,1:10])
    ## selects columns 1 to 10 and computes Cronbach's alpha

    $sample.size
    [1] 27
    $number.of.items
    [1] 10
    $alpha
    [1] 0.1762655
    ## not good because item 2 is reversed (1 is high and 4 is low)

    cronbach(cbind(expsy[,c(1,3:10)],-1*expsy[,2]))
    ## selects column 1 and columns 3 to 10, appends the reversed column 2,
    ## and then computes Cronbach's alpha for the revised data

    $sample.size
    [1] 27
    $number.of.items
    [1] 10
    $alpha
    [1] 0.3752657

    ## better to obtain a 95% confidence interval:
    datafile <- cbind(expsy[,c(1,3:10)],-1*expsy[,2])
    ## store the revised data in a new dataset named 'datafile'
    library(boot)
    ## statistic for boot(): alpha (third element of the cronbach() result) on the resampled rows x
    cronbach.boot <- function(data,x) {cronbach(data[x,])[[3]]}
    res <- boot(datafile,cronbach.boot,1000)
    res

    Call:
    boot(data = datafile, statistic = cronbach.boot, R = 1000)

    Bootstrap Statistics :
         original       bias    std. error
    t1* 0.3752657  -0.06104997   0.2372292

    quantile(res$t,c(0.025,0.975))
    ## computes the 2.5% and 97.5% quantiles, which form the 95% confidence interval of Cronbach's alpha

          2.5%      97.5%
    -0.2987214  0.6330491
    ## two-sided bootstrapped confidence interval of Cronbach's alpha

    boot.ci(res,type="bca")
    ## adjusted bootstrap percentile (BCa) confidence interval (better)

    BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
    Based on 1000 bootstrap replicates

    CALL :
    boot.ci(boot.out = res, type = "bca")

    Intervals :
    Level       BCa
    95%   (-0.1514,  0.6668 )
    Calculations and Intervals on Original Scale
The CoefficientAlpha R package provides alternative methods for computing Cronbach's alpha coefficient in the presence of missing data and non-normal data. It also reports robust standard errors and confidence interval estimates for alpha.
Cronbach's $\alpha$ calculations
The table below illustrates the setting and core calculations involved in computing the Cronbach's $\alpha$.
Subjects | Items/Questions Part of the Assessment Instrument | | | | Total Score per Subject |
 | $Q_1$ | $Q_2$ | ... | $Q_k$ | |
$S_1$ | $Y_{1,1}$ | $Y_{1,2}$ | … | $Y_{1,k}$ | $X_1=\sum_{j=1}^k{Y_{1,j}}$ |
$S_2$ | $Y_{2,1}$ | $Y_{2,2}$ | … | $Y_{2,k}$ | $X_2=\sum_{j=1}^k{Y_{2,j}}$ |
... | ... | ... | ... | ... | ... |
$S_n$ | $Y_{n,1}$ | $Y_{n,2}$ | … | $Y_{n,k}$ | $X_n=\sum_{j=1}^k{Y_{n,j}}$ |
Variance per Item | $\sigma_{Y_{.,1}}^2=\frac{1}{n-1}\sum_{i=1}^n{(Y_{i,1}-\bar{Y}_{.,1})^2}$ | $\sigma_{Y_{.,2}}^2=\frac{1}{n-1}\sum_{i=1}^n{(Y_{i,2}-\bar{Y}_{.,2})^2}$ | … | $\sigma_{Y_{.,k}}^2=\frac{1}{n-1}\sum_{i=1}^n{(Y_{i,k}-\bar{Y}_{.,k})^2}$ | $\sigma_X^2=\frac{1}{n-1}\sum_{i=1}^n{(X_i-\bar{X})^2}$ |
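The layout of the table translates directly into base R: rows are subjects, columns are items, row sums give the total scores, and the last row gives the variances. The sketch below uses a small hypothetical data matrix (5 subjects, 4 items scored 1 to 4) invented purely for illustration, and also checks the result against the covariance form of alpha.

```r
# Hypothetical ratings: n = 5 subjects (rows) by k = 4 items (columns), scored 1 to 4.
Y <- matrix(c(1, 2, 1, 2,
              2, 2, 3, 2,
              3, 3, 3, 4,
              4, 3, 4, 4,
              2, 1, 2, 1),
            nrow = 5, byrow = TRUE,
            dimnames = list(paste0("S", 1:5), paste0("Q", 1:4)))

k <- ncol(Y)
X <- rowSums(Y)                  # total score per subject (last column of the table)
item_var  <- apply(Y, 2, var)    # variance per item (last row of the table)
total_var <- var(X)              # variance of the total scores

# Variance form of Cronbach's alpha
alpha <- (k / (k - 1)) * (1 - sum(item_var) / total_var)

# Covariance form, using the average variance and average inter-item covariance
C <- cov(Y)
v_bar <- mean(diag(C))
c_bar <- mean(C[upper.tri(C)])
alpha_cov <- k * c_bar / (v_bar + (k - 1) * c_bar)

c(alpha = alpha, alpha_cov = alpha_cov)
```

Both forms return the same value (about 0.93 for these made-up data), since they are algebraically equivalent.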
Problems
References
- SOCR Home page: http://www.socr.umich.edu