Difference between revisions of "SMHS Cronbachs"

From SOCR
(Software)
m (Software)
 
(32 intermediate revisions by 3 users not shown)
Line 1: Line 1:
==[[SMHS| Scientific Methods for Health Sciences]] - Instrument Performance Evaluation: Cronbach's α ==
+
== [[SMHS|Scientific Methods for Health Sciences]] Instrument Performance Evaluation: Cronbach's α ==
  
===Overview:===
+
=== Overview ===
Cronbach’s alpha $\alpha$ is a coefficient of internal consistency and is commonly used as an estimate of the reliability of a psychometric test. Internal consistency is typically a measure based on the correlations between different items on the same test and measures whether several items that propose to measure the same general construct and produce similar scores. Cronbach’s alpha is widely used in the social science, nursing, business and other disciplines. Here we present a general introduction to Cronbach’s alpha, how is it calculated, how to apply it in research and what are some common problems when using Cronbach’s alpha.
+
Cronbach’s alpha (<math>\alpha</math>) is a coefficient of internal consistency, commonly used as an estimate of the reliability of a psychometric test. Internal consistency is typically based on the correlations between different items on the same test and assesses whether several items that purport to measure the same general construct produce similar scores. Cronbach’s alpha is widely used in the social sciences, nursing, business, and other disciplines. This section presents a general introduction to Cronbach’s alpha, including its calculation, application in research, and common issues in its use.
  
===Motivation:===
+
=== Motivation ===
We have discussed about internal and external consistency and their importance in researches and studies. How do we measure internal consistency? For example, suppose we are interested in measuring the extent of handicap of patients suffering from certain disease. The dataset contains 10records measuring the degree of difficulty experienced in carrying out daily activities. Each item is recorded from 1 (no difficulty) to 4 (can’t do). When those data is used to form a scale they need to have internal consistency. All items should measure the same thing, so they could be correlated with one another. Cronbach’s alpha generally increases when correlations between items increase.
+
We have discussed internal and external consistency and their importance in research. How do we measure internal consistency? For example, suppose we are interested in measuring the extent of handicap among patients suffering from a certain disease. The dataset contains 10 records measuring the degree of difficulty experienced in carrying out daily activities. Each item is scored from 1 (no difficulty) to 4 (can’t do). When these data are used to form a scale, they should exhibit internal consistency—all items should measure the same underlying construct and thus be correlated with one another. Cronbach’s alpha generally increases as the correlations between items increase.
  
 +
=== Theory ===
  
===Theory===
+
==== Cronbach’s Alpha ====
====Cronbach’s Alpha====
+
Cronbach’s alpha is a measure of internal consistency or reliability of a psychometric instrument and assesses how well a set of items measures a single, one-dimensional latent trait.
Cronbach’s Alpha is a measure of internal consistency or reliability of a psychometric instrument and measures how well a set of items measure a single, one-dimensional latent aspect of individuals.  
 
  
*Suppose we measure a quantity X, which is a sum of K components: $X=Y_{1}+ Y_{2}++Y_{k}$, then Cronbach’s alpha is defined as $\alpha =\frac{K}{K-1}$  $\left( 1-\frac{\sum_{i=1}^{K}\sigma_{{Y}_{i}^{2}}} {\sigma_{X}^{2}}\right)$, where $\sigma_{X}^{2}$ is the variance of the observed total test scores, and $ \sigma_{{Y}_{i}^{2}} $ is the variance of component $i$ for the current sample.  
+
Suppose we measure a quantity <math>X</math>, which is the sum of <math>K</math> components:
 +
<math>X = Y_1 + Y_2 + \cdots + Y_K</math>. 
 +
Then Cronbach’s alpha is defined as
 +
<math>
 +
\alpha = \frac{K}{K - 1} \left( 1 - \frac{\sum_{i=1}^{K} \sigma_{Y_i}^2}{\sigma_X^2} \right),
 +
</math> 
 +
where <math>\sigma_X^2</math> is the variance of the observed total test scores and <math>\sigma_{Y_i}^2</math> is the variance of component <math>i</math> in the current sample.
  
: If items are scored from 0 to 1, then $\alpha =\frac{K}{K-1}$ $\left( 1-\frac{\sum_{i=1}^{K}P_{i}Q_{i}} {\sigma_{X}^{2}} \right)$, where $P_{i}$ is the proportion scoring 1 on item $i$ and $Q_{i=1}-P_{i}$, alternatively, Cronbach’s alpha can be defined as $\alpha$=$\frac{K\bar c}{(\bar v +(K-1) \bar c )}$,where K is as above, $\bar v$ is the average variance of each component and $\bar c$ is the average of all covariance between the components across the current sample of persons.
+
If items are scored dichotomously (0 or 1), then
 +
<math>
 +
\alpha = \frac{K}{K - 1} \left( 1 - \frac{\sum_{i=1}^{K} P_i Q_i}{\sigma_X^2} \right),
 +
</math> 
 +
where <math>P_i</math> is the proportion scoring 1 on item <math>i</math>, and <math>Q_i = 1 - P_i</math>.
  
*The standardized Cronbach’s alpha can be defined as $\alpha_{standardized}=\frac{K\bar r} {(1+(K-1)\bar r )}$, $\bar r$ is the mean of $\frac {K(K-1)}{2}$ non redundant correlation coefficients (i.e., the mean of an upper triangular, or lower triangular, correlation matrix).
+
Alternatively, Cronbach’s alpha can be expressed as
 +
<math>
 +
\alpha = \frac{K \bar{c}}{\bar{v} + (K - 1) \bar{c}},
 +
</math> 
 +
where <math>\bar{v}</math> is the average variance of each component and <math>\bar{c}</math> is the average covariance between all item pairs.
  
*The theoretical value of alpha varies from 0 to 1 considering it is ratio of two variance. $\rho_{XX}=\frac{\sigma_{T}^{2}} {\sigma_{X}^{2}}$, reliability of test scores is the ratio of the true score and total score variance.  
+
The standardized Cronbach’s alpha is
 +
<math>
 +
\alpha_{\text{standardized}} = \frac{K \bar{r}}{1 + (K - 1) \bar{r}},
 +
</math> 
 +
where <math>\bar{r}</math> is the mean of the <math>K(K - 1)/2</math> non-redundant correlation coefficients (e.g., from the upper triangle of the correlation matrix).
  
====Internal consistency====
+
As a reliability coefficient, alpha is interpreted on a scale from 0 to 1, because reliability is a ratio of variances; sample estimates can nevertheless fall below 0. Reliability of test scores is defined as:
Internal consistency is a measure of whether several items that proposed to measure the same general construct produce similar score. It is usually measured with Cronbach’s alpha, which is calculated from the pairwise correlation between items. Internal consistency can take values from negative infinity to 1. It is negative when there is greater within subject variability than between-subject variability. Only positive values of Cronbach’s alpha make sense. Cronbach’s alpha will generally increases as the inter-correlations among items tested increase.  
+
<math>
 +
\rho_{XX} = \frac{\sigma_T^2}{\sigma_X^2},
 +
</math> 
 +
the ratio of true-score variance to total-score variance.
 +
 
 +
==== Internal Consistency ====
 +
Internal consistency measures whether several items hypothesized to reflect the same construct yield similar scores. It is usually quantified using Cronbach’s alpha, which is derived from pairwise item correlations. Cronbach’s alpha can theoretically range from negative infinity to 1. Negative values occur when within-subject variability exceeds between-subject variability. Only positive values of Cronbach’s alpha are interpretable.
 +
 
 +
Cronbach’s alpha increases as inter-item correlations increase.
  
 
<center>
 
<center>
 
{| class="wikitable" style="text-align:center; width:35%" border="1"
 
{| class="wikitable" style="text-align:center; width:35%" border="1"
 
|-
 
|-
|Cronbach's alpha|| Internal consistency
+
| Cronbach's alpha || Internal consistency
|-
 
| $\alpha$  ≥ 0.9|| Excellent (High-Stakes testing)
 
 
|-
 
|-
|0.7 ≤ $\alpha$ < 0.9|| Good (Low-Stakes testing)
+
| <math>\alpha \geq 0.9</math> || Excellent (High-stakes testing)
 
|-
 
|-
|0.6 ≤ $\alpha$ < 0.7|| Acceptable
+
| <math>0.7 \leq \alpha < 0.9</math> || Good (Low-stakes testing)
 
|-
 
|-
|0.5 ≤ $\alpha$ < 0.6|| Poor
+
| <math>0.6 \leq \alpha < 0.7</math> || Acceptable
 
|-
 
|-
|$\alpha$ < 0.5 ||Unacceptable
+
| <math>0.5 \leq \alpha < 0.6</math> || Poor
 
|-
 
|-
 +
| <math>\alpha < 0.5</math> || Unacceptable
 
|}
 
|}
 
</center>
 
</center>
  
====Other Measures====
+
==== Other Measures ====
* '''Intra-class correlation:''' Cronbach’s alpha equals to the stepped-up intra-class correlation coefficient, which is commonly used in observational studies if and only if the value of the item variance component equals zero. If this variance component is negative, then alpha will underestimate the stepped-up intra-class correlation coefficient; if it’s positive, alpha will overestimate the stepped-up intra-class correlation.
 
  
====Generalizability theory====
+
Intra-class correlation (ICC) assesses the consistency or reproducibility of quantitative measurements made by different observers measuring the same quantity. Broadly, ICC is defined as: 
Cronbach’s alpha is an unbiased estimate of the generalizability. It can be viewed as a measure of how well the sum score on the selected items capture the expected score in the entire domain, even if that domain is heterogeneous.  
+
<math>
 +
\text{ICC} = \frac{\text{Variance due to rated subjects (patients)}}{\text{Variance due to subjects} + \text{Variance due to judges} + \text{Residual variance}}.
 +
</math>
  
====Problems with Cronbach’s alpha====
+
Example: Suppose 4 nurses rate 6 patients on a 10-point depression scale:
# it is dependent not only on the magnitude of the correlations among items, but also on the number of items in the scale. Hence, a scale can be made to look more homogenous simply by increasing the number of items though the average correlation remains the same;
 
# if two scales each measuring a distinct aspect are combined to form a long scale, alpha would probably be high though the merged scale is obviously tapping two different attributes;
 
# if alpha is too high, then it may suggest a high level of item redundancy.
 
  
====Split-Half Reliability====
+
<center>
In Split-Half Reliability assessment, the test is split in half (e.g., odd / even) creating “equivalent forms”. The two “forms” are correlated with each other and the correlation coefficient is adjusted to reflect the entire test length, using the Spearman-Brown Prophecy formula. Suppose the $Corr(Even,Odd)=r$ is the raw correlation between the even and odd items. Then the adjusted correlation will be:$r’ = \frac{n r}{1 + (n-1)r},$ where n = number of items (in this case n=2).
+
{| class="wikitable" style="text-align:center; width:75%" border="1"
 +
|-
 +
! PatientID || NurseRater1 || NurseRater2 || NurseRater3 || NurseRater4
 +
|-
 +
| 1 || 9 || 2 || 5 || 8
 +
|-
 +
| 2 || 6 || 1 || 3 || 2
 +
|-
 +
| 3 || 8 || 4 || 6 || 8
 +
|-
 +
| 4 || 7 || 1 || 2 || 6
 +
|-
 +
| 5 || 10 || 5 || 6 || 9
 +
|-
 +
| 6 || 6 || 2 || 4 || 7
 +
|}
 +
</center>
  
Example:
+
This data can also be formatted in long form:
  
 
<center>
 
<center>
{| class="wikitable" style="text-align:center; width:35%" border="1"
+
{| class="wikitable" style="text-align:center; width:75%" border="1"
 +
|-
 +
! PatientID || Rating || Nurse
 +
|-
 +
| 1 || 9 || 1
 +
|-
 +
| 2 || 6 || 1
 +
|-
 +
| 3 || 8 || 1
 +
|-
 +
| 4 || 7 || 1
 +
|-
 +
| 5 || 10 || 1
 +
|-
 +
| 6 || 6 || 1
 +
|-
 +
| 7 || 2 || 2
 +
|-
 +
| 8 || 1 || 2
 +
|-
 +
| 9 || 4 || 2
 +
|-
 +
| 10 || 1 || 2
 +
|-
 +
| 11 || 5 || 2
 +
|-
 +
| 12 || 2 || 2
 +
|-
 +
| 13 || 5 || 3
 +
|-
 +
| 14 || 3 || 3
 +
|-
 +
| 15 || 6 || 3
 
|-
 
|-
|Index|| Q1|| Q2|| Q3|| Q4|| Q5|| Q6|| Odd|| Even
+
| 16 || 2 || 3
 
|-
 
|-
|1 ||1|| 0|| 0|| 1|| 1|| 0|| 2|| 1
+
| 17 || 6 || 3
 
|-
 
|-
|2|| 1|| 1 ||0 ||1|| 0 ||1|| 1|| 3
+
| 18 || 4 || 3
 
|-
 
|-
|3|| 1|| 1|| 1|| 1|| 1|| 0|| 3|| 2
+
| 19 || 8 || 4
 
|-
 
|-
|4 ||1 ||0 ||0 ||0 ||1 ||0|| 2|| 0
+
| 20 || 2 || 4
 
|-
 
|-
|5|| 1|| 1|| 1|| 1|| 0|| 0|| 2|| 2
+
| 21 || 8 || 4
 
|-
 
|-
|6 ||0|| 0 ||0 ||0 ||1 ||0 ||1|| 0
+
| 22 || 6 || 4
 
|-
 
|-
| colspan=6 rowspan=4| ||mean|| 1.833333333|| 1.33333333
+
| 23 || 9 || 4
 
|-
 
|-
| SD|| 0.752772653|| 1.21106014
+
| 24 || 7 || 4
|-
 
| corr(Even,Odd)|| 0.073127242 || rowspan=2|
 
|-
 
| AdjCorr(Even,Odd)=$\frac{n*r}{(n-1)*(r+1)}$|| 0.136288111
 
|-
 
 
|}
 
|}
 
</center>
 
</center>
  
====KR-20====
+
<pre>
The [http://en.wikipedia.org/wiki/Kuder%E2%80%93Richardson_Formula_20 Kuder–Richardson Formula 20 (KR-20)] is a very reliable internal reliability estimate which simulates calculating split-half reliability for every possible combination of items. For a test with ''K'' test items indexed ''i''=1 to ''K'':
+
install.packages("ICC")
$$KR-20 = \frac{K}{K-1} \left( 1 - \frac{\sum_{i=1}^K p_i q_i}{\sigma^2_X} \right),$$
+
library("ICC")
where $p_i$ is the proportion of ''correct'' responses to test item ''i'', $q_i$ is the proportion of ''incorrect'' responses to test item ''i'' (thus $p_i + q_i= 1$), the variance for the denominator is
 
$\sigma^2_X = \frac{\sum_{i=1}^n (X_i-\bar{X})^2\,{}}{n-1},$ and where $n$ is the total sample size.
 
  
The Cronbach's α and KR-20 are similar -- KR-20 is a derivative of the Cronbach's α with the advantage that it can handle both dichotomous and continuous variables, however, KR-20 can't be used when multiple-choice questions involve partial credit and require systematic item-based analysis.
+
# Load data (adjust path as needed)
 +
dataset <- read.csv('C:\\Users\\Desktop\\Nurse_data.csv', header = TRUE)
 +
dataset <- dataset[, -1]  # Remove PatientID column
  
====Standard Error of Measurement (SEM)====
+
# Fit ICC model
The greater the reliability of the test, the smaller the SEM.
+
icc_result <- ICCest(Rating, Nurse, data = dataset)
 +
icc_result
 +
# ICC: -0.4804401
 +
# 95% CI: (-0.656, -0.035)
 +
</pre>
  
$$SEM=S\sqrt{1-r_{xx}},$$
+
Cronbach’s alpha equals the stepped-up intra-class correlation coefficient in observational studies if and only if the item variance component is zero. If this component is negative, alpha underestimates the ICC; if positive, it overestimates it.
where $r_{xx’}$ is the correlation between two instances of the measurements under identical conditions, and $S$ is the total standard deviation.
 
  
===Applications===
+
==== Generalizability Theory ====
 +
Cronbach’s alpha is an unbiased estimate of generalizability. It can be interpreted as a measure of how well the sum score on selected items captures the expected score over the entire domain—even if the domain is heterogeneous.
  
* [http://link.springer.com/article/10.1007/s10869-005-8262-4 This article] explores the internal validity and reliability of Kolb’s revised learning style inventory in a sample with 221 graduate and undergraduate business students. It also reviewed research on the LSI and studied on implications of conducting factor analysis using ipsative data (type of data where respondents compare two or more desirable options and pick the one that is most preferred (sometimes called a "forced choice" scale). Experiential learning theory is presented and the concept of learning styles explained. This paper largely supports prior research supporting the internal reliability of scales.
+
==== Problems with Cronbach’s Alpha ====
 +
- Alpha depends not only on the magnitude of inter-item correlations but also on the number of items. Scales can appear more homogeneous simply by adding more items, even if average correlation remains unchanged.
 +
- Combining two distinct constructs into one scale may yield a high alpha despite measuring two different attributes.
 +
- Excessively high alpha (<math>> 0.95</math>) may indicate item redundancy.
  
* [https://scholarworks.iupui.edu/bitstream/handle/1805/344/Gliem%20&%20Gliem.pdf?s This article] showed the reason a single-item questions pertaining to a construct are not reliable and should not be used in drawing conclusions. It compared the reliability of a summated, multi-item scale versus a single-item question and showed how unreliable a single item is and therefore not appropriate to make inferences based on analysis of single item question, which are used in measuring a construct.
+
==== Split-Half Reliability ====
 +
In split-half reliability, the test is divided into two halves (e.g., odd vs. even items). The correlation between halves is adjusted using the Spearman–Brown prophecy formula: 
 +
<math>
 +
r' = \frac{n r}{1 + (n - 1) r},
 +
</math> 
 +
where <math>r</math> is the raw correlation between halves and <math>n = 2</math>.
  
===Software===
+
Example:
  
'''In R:''' using [http://cran.r-project.org/web/packages/psy/psy.pdf the ''psy'' package]
+
<center>
+
{| class="wikitable" style="text-align:center; width:35%" border="1"
cronbach(v1)  ## v1 is n*p matrix or data frame with n subjects and p items.
+
|-
## This phrase is used to compute the Cronbach’s reliability coefficient alpha.
+
| Index || Q1 || Q2 || Q3 || Q4 || Q5 || Q6 || Odd || Even
## This coefficient may be applied to a series of items aggregated in a single score.
+
|-
## It estimates reliability in the framework of the domain sampling model.
+
| 1 || 1 || 0 || 0 || 1 || 1 || 0 || 2 || 1
 
+
|-
An example to calculate Cronbach’s alpha:
+
| 2 || 1 || 1 || 0 || 1 || 0 || 1 || 1 || 3
library(psy)
+
|-
data(expsy)   
+
| 3 || 1 || 1 || 1 || 1 || 1 || 0 || 3 || 2
cronbach(expsy[,1:10])  ## this choose the vector of the columns 1 to 10 and calculated the  Cronbach’s Alpha value
+
|-
$\$ $sample.size
+
| 4 || 1 || 0 || 0 || 0 || 1 || 0 || 2 || 0
[1] 27
+
|-
$\$ $number.of.items
+
| 5 || 1 || 1 || 1 || 1 || 0 || 0 || 2 || 2
[1] 10
+
|-
$\$ $alpha
+
| 6 || 0 || 0 || 0 || 0 || 1 || 0 || 1 || 0
[1] 0.1762655
+
|-
## not good because item 2 is reversed (1 is high and 4 is low)   
+
| colspan=6 rowspan=4 | || mean || 1.833 || 1.333
cronbach(cbind(expsy[,c(1,3:10)],-1*expsy[,2]))  ## this choose columns 1 and columns 3 to 10 and added in the reversed column 2, and then calculated the Cronbach’s Alpha value for the revised data
+
|-
$\$ $sample.size
+
| SD || 0.753 || 1.211
[1] 27
+
|-
$\$ $number.of.items
+
| corr(Even, Odd) || 0.073 ||
[1] 10
+
|-
$\$ $alpha
+
| AdjCorr = <math>\frac{n r}{1 + (n-1) r}</math> || 0.136 ||
[1] 0.3752657
+
|}
 
+
</center>
## better to obtain a 95%confidence interval:   
 
datafile <- cbind(expsy[,c(1,3:10)],-1*expsy[,2])  ## extract the revised data into a new dataset named ‘datafile’
 
library(boot)
 
cronbach.boot <- function(data,x) {cronbach(data[x,])[[3]]}
 
res <- boot(datafile,cronbach.boot,1000) 
 
res
 
 
 
Call:
 
boot(data = datafile, statistic = cronbach.boot, R = 1000)
 
Bootstrap Statistics :
 
    original      bias    std. error
 
t1* 0.3752657 -0.06104997  0.2372292
 
 
 
quantile(res$t,c(0.025,0.975))  ## this calculated the 25% and 97.5% value to form the 95% confidence interval of Cronbach’s alpha
 
      2.5%      97.5%
 
-0.2987214  0.6330491
 
## two-sided bootstrapped confidence interval of Cronbach’s alpha boot.ci(res,type="bca")
 
## adjusted bootstrap percentile (BCa) confidence interval (better)  
 
 
 
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
 
Based on 1000 bootstrap replicates
 
 
CALL :
 
boot.ci(boot.out = res, type = "bca")
 
 
Intervals :
 
Level      BCa         
 
95%  (-0.1514,  0.6668 ) 
 
Calculations and Intervals on Original Scale
 
 
 
===Problems===
 
  
 +
==== KR-20 ====
 +
The Kuder–Richardson Formula 20 (KR-20) is a reliability estimate for dichotomous items:
  
 +
<math>
 +
\text{KR-20} = \frac{K}{K - 1} \left( 1 - \frac{\sum_{i=1}^K p_i q_i}{\sigma_X^2} \right),
 +
</math> 
 +
where <math>p_i</math> is the proportion of correct responses to item <math>i</math>, <math>q_i = 1 - p_i</math>, and <math>\sigma_X^2</math> is the sample variance of total scores.
  
===References===
+
KR-20 is a special case of Cronbach’s alpha for binary data. It cannot accommodate partial credit or ordinal responses.
http://mirlyn.lib.umich.edu/Record/004199238
 
http://mirlyn.lib.umich.edu/Record/004232056
 
http://mirlyn.lib.umich.edu/Record/004133572
 
  
 +
==== Standard Error of Measurement (SEM) ====
 +
The greater the reliability, the smaller the SEM:
  
 +
<math>
 +
\text{SEM} = S \sqrt{1 - r_{xx}},
 +
</math> 
 +
where <math>r_{xx}</math> is the reliability and <math>S</math> is the standard deviation of observed scores.
  
 +
=== Applications ===
 +
- A study on Kolb’s revised Learning Style Inventory with 221 business students supports the internal reliability of its scales and discusses factor analysis with ipsative data.
 +
- Research comparing multi-item scales to single-item questions demonstrates that single items yield unreliable results and should not be used for construct inference.
  
 +
=== Software ===
  
 +
SOCR Cronbach's alpha calculator webapp (coming soon).
  
 +
In R, the ''psych'' package is often recommended over older alternatives such as ''psy''. The example below nevertheless uses ''psy'', which provides the <code>expsy</code> dataset:
  
 +
<pre>
 +
# Load required packages
 +
library(psy)
 +
data(expsy)
  
 +
# Compute alpha for first 10 items
 +
cronbach(expsy[, 1:10])
 +
# Result: alpha ≈ 0.176 — low due to reversed item (item 2)
  
 +
# Reverse item 2 (assuming 1=high, 4=low)
 +
revised_data <- cbind(expsy[, c(1, 3:10)], -1 * expsy[, 2])
 +
cronbach(revised_data)
 +
# Result: alpha ≈ 0.375
  
 +
# Bootstrap 95% CI
 +
library(boot)
 +
cronbach.boot <- function(data, indices) {
 +
  cronbach(data[indices, ])[[3]]  # extract alpha
 +
}
 +
res <- boot(revised_data, cronbach.boot, R = 1000)
 +
quantile(res$t, c(0.025, 0.975))
 +
# e.g., [-0.30, 0.63]
  
 +
boot.ci(res, type = "bca")
 +
# BCa CI: (-0.15, 0.67)
 +
</pre>
  
 +
The '''coefficientalpha''' R package offers robust estimation with missing data and non-normality, including standard errors and confidence intervals.
  
 +
=== Cronbach's <math>\alpha</math> Calculations ===
  
===Cronbach's $\alpha$ calculations===
+
The table below illustrates the core structure for computing Cronbach’s alpha.
The table below illustrates the setting and core calculations involved in computing the Cronbach's $\alpha$.
 
  
 
<center>
 
<center>
 
{| align="center" border="1"
 
{| align="center" border="1"
 
|-
 
|-
| rowspan="2"| Subjects || colspan="4" align="center"| Items/Questions Part of the Assessment Instrument|| rowspan="2" | Total Score per Subject
+
| rowspan="2" | Subjects || colspan="4" align="center" | Items/Questions || rowspan="2" | Total Score
 
|-
 
|-
| $Q_1$ ||$Q_2$ ||... ||$Q_k$
+
| <math>Q_1</math> || <math>Q_2</math> || … || <math>Q_k</math>
 
|-
 
|-
| $S_1$||$Y_{1,1}$||$Y_{1,2}$||…||$Y_{1,k}$||$X_1=\sum_{j=1}^k{Y_{1,j}}$
+
| <math>S_1</math> || <math>Y_{1,1}</math> || <math>Y_{1,2}</math> || … || <math>Y_{1,k}</math> || <math>X_1 = \sum_{j=1}^k Y_{1,j}</math>
 
|-
 
|-
| $S_2$||$Y_{2,1}$||$Y_{2,2}$||…||$Y_{2,k}$||$X_2=\sum_{j=1}^k{Y_{2,j}}$
+
| <math>S_2</math> || <math>Y_{2,1}</math> || <math>Y_{2,2}</math> || … || <math>Y_{2,k}</math> || <math>X_2 = \sum_{j=1}^k Y_{2,j}</math>
 
|-
 
|-
| ... ||... ||... ||...||...||...
+
| … || … || … || … || … || …
 
|-
 
|-
| $S_n$||$Y_{n,1}$||$Y_{n,2}$||…||$Y_{n,k}$||$X_n=\sum_{j=1}^k{Y_{n,j}}$
+
| <math>S_n</math> || <math>Y_{n,1}</math> || <math>Y_{n,2}</math> || … || <math>Y_{n,k}</math> || <math>X_n = \sum_{j=1}^k Y_{n,j}</math>
 
|-
 
|-
| Variance per Item||$\sigma_{Y_{.,1}}^2=\frac{1}{n-1}\sum_{i=1}^n{(Y_{i,1}-\bar{Y}_{.,1})^2}$||$$\sigma_{Y_{.,2}}^2=\frac{1}{n-1}\sum_{i=1}^n{(Y_{i,2}-\bar{Y}_{.,2})^2}$$||…||$$\sigma_{Y_{.,k}}^2=\frac{1}{n-1}\sum_{i=1}^n{(Y_{i,k}-\bar{Y}_{.,k})^2}$$||$$\sigma_X^2=\frac{1}{n-1}\sum_{i=1}^n{(X_i-\bar{X})^2}$$
+
| Variance || <math>\sigma_{Y_{.,1}}^2</math> || <math>\sigma_{Y_{.,2}}^2</math> || … || <math>\sigma_{Y_{.,k}}^2</math> || <math>\sigma_X^2</math>
 
|}
 
|}
 
</center>
 
</center>
  
 +
=== Cronbach's <math>\alpha</math> Inference ===
 +
Cronbach’s alpha is a point estimate. Its standard error enables interval estimation and hypothesis testing.
 +
 +
: Parametric CIs use the Pearson correlation matrix (assumes normality).
 +
: Non-parametric CIs use Spearman correlations (robust to non-normality).
 +
: For ordinal data, consider ordinal alpha (Zumbo et al.), which uses polychoric correlations.
 +
 +
The general formula for alpha, writing <math>N</math> for the number of items (denoted <math>K</math> above), is:
 +
<math>
 +
\alpha = \frac{N}{N - 1} \left( 1 - \frac{\sum_{j=1}^N V(Y_j)}{V\left( \sum_{j=1}^N Y_j \right)} \right).
 +
</math>
 +
 +
The estimated variance of <math>\hat{\alpha}</math> is: 
 +
<math>
 +
\hat{\sigma}^2_{\hat{\alpha}} = \frac{N^2}{k(N - 1)^2} \cdot d,
 +
</math> 
 +
where <math>d</math> depends on the covariance matrix <math>S</math>, its trace, and the vector of ones <math>j</math>.
 +
 +
A <math>(1 - \gamma)100\%</math> confidence interval is: 
 +
<math>
 +
\left( \hat{\alpha} - z_{\gamma/2} \hat{\sigma}_{\hat{\alpha}},\
 +
\hat{\alpha} + z_{\gamma/2} \hat{\sigma}_{\hat{\alpha}} \right).
 +
</math>
 +
 +
=== Problems ===
 +
Use the [[SOCR_TurkiyeStudentEvalData|Turkiye Student Course Evaluation survey (N=5,000)]] to compute the ICC and Cronbach’s alpha.
 +
 +
=== References ===
 +
 +
: [https://en.wikipedia.org/wiki/Cronbach%27s_alpha Cronbach's alpha – Wikipedia] 
 +
: [https://en.wikipedia.org/wiki/Kuder%E2%80%93Richardson_Formula_20 KR-20 – Wikipedia] 
 +
: Tsagris, M. (2014). Confidence intervals for Cronbach’s reliability coefficient. ResearchGate. 
 +
: Zumbo, B. D., Gadermann, A. M., & Zeisser, C. (2007). Ordinal versions of coefficients alpha and theta. Practical Assessment, Research & Evaluation, 12(13).
  
 
<hr>
 
<hr>
* SOCR Home page: http://www.socr.umich.edu
+
SOCR Home page: http://www.socr.umich.edu
 
 
 
{{translate|pageName=http://wiki.socr.umich.edu/index.php?title=SMHS_Cronbachs}}
 
{{translate|pageName=http://wiki.socr.umich.edu/index.php?title=SMHS_Cronbachs}}

Latest revision as of 17:30, 10 February 2026

Scientific Methods for Health Sciences – Instrument Performance Evaluation: Cronbach's α

Overview

Cronbach’s alpha (\(\alpha\)) is a coefficient of internal consistency, commonly used as an estimate of the reliability of a psychometric test. Internal consistency is typically based on the correlations between different items on the same test and assesses whether several items that purport to measure the same general construct produce similar scores. Cronbach’s alpha is widely used in the social sciences, nursing, business, and other disciplines. This section presents a general introduction to Cronbach’s alpha, including its calculation, application in research, and common issues in its use.

Motivation

We have discussed internal and external consistency and their importance in research. How do we measure internal consistency? For example, suppose we are interested in measuring the extent of handicap among patients suffering from a certain disease. The dataset contains 10 records measuring the degree of difficulty experienced in carrying out daily activities. Each item is scored from 1 (no difficulty) to 4 (can’t do). When these data are used to form a scale, they should exhibit internal consistency—all items should measure the same underlying construct and thus be correlated with one another. Cronbach’s alpha generally increases as the correlations between items increase.

Theory

Cronbach’s Alpha

Cronbach’s alpha is a measure of internal consistency or reliability of a psychometric instrument and assesses how well a set of items measures a single, one-dimensional latent trait.

Suppose we measure a quantity \(X\), which is the sum of \(K\) components: \[ X = Y_1 + Y_2 + \cdots + Y_K. \] Then Cronbach’s alpha is defined as \[ \alpha = \frac{K}{K - 1} \left( 1 - \frac{\sum_{i=1}^{K} \sigma_{Y_i}^2}{\sigma_X^2} \right), \] where \(\sigma_X^2\) is the variance of the observed total test scores and \(\sigma_{Y_i}^2\) is the variance of component \(i\) in the current sample.

If items are scored dichotomously (0 or 1), then \[ \alpha = \frac{K}{K - 1} \left( 1 - \frac{\sum_{i=1}^{K} P_i Q_i}{\sigma_X^2} \right), \] where \(P_i\) is the proportion scoring 1 on item \(i\), and \(Q_i = 1 - P_i\).

Alternatively, Cronbach’s alpha can be expressed as \[ \alpha = \frac{K \bar{c}}{\bar{v} + (K - 1) \bar{c}}, \] where \(\bar{v}\) is the average item variance and \(\bar{c}\) is the average covariance over all pairs of distinct items.

The standardized Cronbach’s alpha is \[ \alpha_{\text{standardized}} = \frac{K \bar{r}}{1 + (K - 1) \bar{r}}, \] where \(\bar{r}\) is the mean of the \(K(K - 1)/2\) non-redundant correlation coefficients (e.g., from the upper triangle of the correlation matrix).
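As a minimal sketch of the three expressions above, the base-R snippet below evaluates each one on a small 0/1 response matrix (the six-subject, six-item table from the Split-Half Reliability example in this section). The object names (items, alpha_var, alpha_cov, alpha_std) are illustrative assumptions and do not come from any package.

```r
# Six subjects (rows) by six dichotomous items (columns), copied from the
# split-half example table in this section.
items <- matrix(c(1, 0, 0, 1, 1, 0,
                  1, 1, 0, 1, 0, 1,
                  1, 1, 1, 1, 1, 0,
                  1, 0, 0, 0, 1, 0,
                  1, 1, 1, 1, 0, 0,
                  0, 0, 0, 0, 1, 0),
                nrow = 6, byrow = TRUE,
                dimnames = list(NULL, paste0("Q", 1:6)))

K <- ncol(items)

# Variance form: K/(K-1) * (1 - sum of item variances / variance of totals)
item_var  <- apply(items, 2, var)
total_var <- var(rowSums(items))
alpha_var <- K / (K - 1) * (1 - sum(item_var) / total_var)

# Covariance form: K * c_bar / (v_bar + (K-1) * c_bar)
S     <- cov(items)
v_bar <- mean(diag(S))
c_bar <- mean(S[upper.tri(S)])
alpha_cov <- K * c_bar / (v_bar + (K - 1) * c_bar)

# Standardized form: same expression with the mean inter-item correlation
R     <- cor(items)
r_bar <- mean(R[upper.tri(R)])
alpha_std <- K * r_bar / (1 + (K - 1) * r_bar)

round(c(variance = alpha_var, covariance = alpha_cov,
        standardized = alpha_std), 3)
```

The variance and covariance forms agree by construction (about 0.406 for these data); the standardized value generally differs unless all item variances are equal.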

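As a reliability coefficient, alpha is interpreted on a scale from 0 to 1, because reliability is a ratio of variances; sample estimates can nevertheless fall below 0 (see Internal Consistency). Reliability of test scores is defined as \[ \rho_{XX} = \frac{\sigma_T^2}{\sigma_X^2}, \] the ratio of true-score variance to total-score variance.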

Internal Consistency

Internal consistency measures whether several items hypothesized to reflect the same construct yield similar scores. It is usually quantified using Cronbach’s alpha, which is derived from pairwise item correlations. Cronbach’s alpha can theoretically range from negative infinity to 1. Negative values occur when within-subject variability exceeds between-subject variability. Only positive values of Cronbach’s alpha are interpretable.

Cronbach’s alpha increases as inter-item correlations increase.

Cronbach's alpha Internal consistency
\(\alpha \geq 0.9\) Excellent (High-stakes testing)
\(0.7 \leq \alpha < 0.9\) Good (Low-stakes testing)
\(0.6 \leq \alpha < 0.7\) Acceptable
\(0.5 \leq \alpha < 0.6\) Poor
\(\alpha < 0.5\) Unacceptable

Other Measures

Intra-class correlation (ICC) assesses the consistency or reproducibility of quantitative measurements made by different observers measuring the same quantity. Broadly, ICC is defined as \[ \text{ICC} = \frac{\text{Variance due to rated subjects (patients)}}{\text{Variance due to subjects} + \text{Variance due to judges} + \text{Residual variance}}. \]

Example: Suppose 4 nurses rate 6 patients on a 10-point depression scale:

PatientID NurseRater1 NurseRater2 NurseRater3 NurseRater4
1 9 2 5 8
2 6 1 3 2
3 8 4 6 8
4 7 1 2 6
5 10 5 6 9
6 6 2 4 7

These data can also be arranged in long form, with one row per rating:

PatientID Rating Nurse
1 9 1
2 6 1
3 8 1
4 7 1
5 10 1
6 6 1
1 2 2
2 1 2
3 4 2
4 1 2
5 5 2
6 2 2
1 5 3
2 3 3
3 6 3
4 2 3
5 6 3
6 4 3
1 8 4
2 2 4
3 8 4
4 6 4
5 9 4
6 7 4
install.packages("ICC")
library("ICC")

# Load data (adjust path as needed)
dataset <- read.csv('C:\\Users\\Desktop\\Nurse_data.csv', header = TRUE)
dataset <- dataset[, -1]  # Remove PatientID column

# Fit ICC model
icc_result <- ICCest(Rating, Nurse, data = dataset)
icc_result
# ICC: -0.4804401
# 95% CI: (-0.656, -0.035)

Cronbach’s alpha equals the stepped-up intra-class correlation coefficient in observational studies if and only if the item variance component is zero. If this component is negative, alpha underestimates the ICC; if positive, it overestimates it.
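The variance-ratio definition of the ICC given earlier in this subsection can be sketched directly in base R from the wide nurse-by-patient table. This is only one possible estimator: it assumes a two-way layout without replication and method-of-moments variance components, and the object names are illustrative. It is not the same computation as the ICCest call above, so the two need not agree.

```r
# Ratings from the example table: 6 patients (rows) by 4 nurse raters (columns).
ratings <- matrix(c( 9, 2, 5, 8,
                     6, 1, 3, 2,
                     8, 4, 6, 8,
                     7, 1, 2, 6,
                    10, 5, 6, 9,
                     6, 2, 4, 7),
                  nrow = 6, byrow = TRUE)

n <- nrow(ratings)  # number of subjects (patients)
k <- ncol(ratings)  # number of judges (nurses)
grand <- mean(ratings)

# Two-way ANOVA sums of squares (one rating per patient-nurse cell)
ss_subj  <- k * sum((rowMeans(ratings) - grand)^2)
ss_judge <- n * sum((colMeans(ratings) - grand)^2)
ss_total <- sum((ratings - grand)^2)
ss_resid <- ss_total - ss_subj - ss_judge

ms_subj  <- ss_subj  / (n - 1)
ms_judge <- ss_judge / (k - 1)
ms_resid <- ss_resid / ((n - 1) * (k - 1))

# Method-of-moments variance components (assumed estimator)
var_subj  <- (ms_subj  - ms_resid) / k
var_judge <- (ms_judge - ms_resid) / n
var_resid <- ms_resid

# Subject variance as a share of subject + judge + residual variance
icc <- var_subj / (var_subj + var_judge + var_resid)
icc
```

Because the judge variance sits in the denominator, systematic differences between nurses (for example, one rater scoring consistently lower) pull this coefficient down even when the nurses rank the patients similarly.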

Generalizability Theory

Cronbach’s alpha is an unbiased estimate of generalizability. It can be interpreted as a measure of how well the sum score on selected items captures the expected score over the entire domain—even if the domain is heterogeneous.

Problems with Cronbach’s Alpha

- Alpha depends not only on the magnitude of inter-item correlations but also on the number of items. Scales can appear more homogeneous simply by adding more items, even if average correlation remains unchanged.
- Combining two distinct constructs into one scale may yield a high alpha despite measuring two different attributes.
- Excessively high alpha (\(> 0.95\)) may indicate item redundancy.
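The first point can be illustrated with the standardized alpha formula from the Theory section. The sketch below holds a hypothetical mean inter-item correlation fixed at 0.3 (an assumed value, not taken from any dataset here) and varies only the number of items; the function name alpha_standardized is likewise illustrative.

```r
# Standardized alpha as a function of the number of items K and the
# mean inter-item correlation r_bar (formula from the Theory section).
alpha_standardized <- function(K, r_bar) {
  K * r_bar / (1 + (K - 1) * r_bar)
}

# Hypothetical mean inter-item correlation, held fixed while K grows.
r_bar <- 0.3
K_values <- c(5, 10, 20, 40)

data.frame(K = K_values,
           alpha = round(alpha_standardized(K_values, r_bar), 3))
```

With the correlation unchanged, the coefficient rises from about 0.68 at 5 items to about 0.94 at 40 items, which is why a high alpha on a long scale is weak evidence of homogeneity.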

Split-Half Reliability

In split-half reliability, the test is divided into two halves (e.g., odd vs. even items). The correlation between halves is adjusted to the full test length using the Spearman–Brown prophecy formula \[ r' = \frac{n r}{1 + (n - 1) r}, \] where \(r\) is the raw correlation between halves and \(n\) is the factor by which the test length increases (here \(n = 2\)).

Example:

Index Q1 Q2 Q3 Q4 Q5 Q6 Odd Even
1 1 0 0 1 1 0 2 1
2 1 1 0 1 0 1 1 3
3 1 1 1 1 1 0 3 2
4 1 0 0 0 1 0 2 0
5 1 1 1 1 0 0 2 2
6 0 0 0 0 1 0 1 0

Summary statistics for the Odd and Even columns:

Statistic Odd Even
Mean 1.833 1.333
SD 0.753 1.211

corr(Even, Odd) = 0.073
AdjCorr = \(\frac{n r}{1 + (n-1) r}\) = 0.136
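The numbers in this example can be reproduced with a few lines of base R. The sketch below enters the six item columns by hand, forms the odd and even half-scores, and applies the Spearman–Brown adjustment in its standard form \(n r / (1 + (n - 1) r)\); the function and variable names are illustrative.

```r
# Item responses from the example: rows = respondents, columns = Q1..Q6.
items <- matrix(c(1, 0, 0, 1, 1, 0,
                  1, 1, 0, 1, 0, 1,
                  1, 1, 1, 1, 1, 0,
                  1, 0, 0, 0, 1, 0,
                  1, 1, 1, 1, 0, 0,
                  0, 0, 0, 0, 1, 0),
                nrow = 6, byrow = TRUE,
                dimnames = list(NULL, paste0("Q", 1:6)))

# Half-test scores: odd-numbered vs. even-numbered items.
odd  <- rowSums(items[, c(1, 3, 5)])
even <- rowSums(items[, c(2, 4, 6)])

# Raw correlation between the two halves.
r_half <- cor(odd, even)

# Spearman-Brown adjustment; n = 2 restores the full test length.
spearman_brown <- function(r, n = 2) n * r / (1 + (n - 1) * r)
r_adj <- spearman_brown(r_half)

round(c(mean_odd = mean(odd), mean_even = mean(even),
        sd_odd = sd(odd), sd_even = sd(even),
        r = r_half, adjusted = r_adj), 3)
# mean 1.833 / 1.333, SD 0.753 / 1.211, r 0.073, adjusted 0.136
```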

KR-20

The Kuder–Richardson Formula 20 (KR-20) is a reliability estimate for dichotomous items \[ \text{KR-20} = \frac{K}{K - 1} \left( 1 - \frac{\sum_{i=1}^K p_i q_i}{\sigma_X^2} \right), \] where \(K\) is the number of items, \(p_i\) is the proportion of correct responses to item \(i\), \(q_i = 1 - p_i\), and \(\sigma_X^2\) is the variance of the total scores, computed with the same denominator as the item variances \(p_i q_i\).

KR-20 is a special case of Cronbach’s alpha for binary data. It cannot accommodate partial credit or ordinal responses.
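A minimal sketch of this special-case relationship, using the six binary items from the split-half example. It assumes the total-score variance is computed with denominator \(n\) to match \(p_i q_i\); with that choice KR-20 and Cronbach's alpha (computed from ordinary sample variances) coincide. Variable names are illustrative.

```r
# Binary item responses: rows = respondents, columns = Q1..Q6.
items <- matrix(c(1, 0, 0, 1, 1, 0,
                  1, 1, 0, 1, 0, 1,
                  1, 1, 1, 1, 1, 0,
                  1, 0, 0, 0, 1, 0,
                  1, 1, 1, 1, 0, 0,
                  0, 0, 0, 0, 1, 0),
                nrow = 6, byrow = TRUE,
                dimnames = list(NULL, paste0("Q", 1:6)))
K <- ncol(items)

# KR-20: item variances as p * q, total variance with denominator n.
p <- colMeans(items)              # proportion correct per item
q <- 1 - p
total <- rowSums(items)
var_total_n <- mean((total - mean(total))^2)
kr20 <- K / (K - 1) * (1 - sum(p * q) / var_total_n)

# Cronbach's alpha from sample variances (denominator n - 1 throughout).
alpha <- K / (K - 1) * (1 - sum(apply(items, 2, var)) / var(total))

round(c(kr20 = kr20, alpha = alpha), 3)
# both 0.406
```

The two values agree because the factor \(n/(n-1)\) cancels in the ratio of item variances to total variance.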

Standard Error of Measurement (SEM)

The greater the reliability, the smaller the SEM\[ \text{SEM} = S \sqrt{1 - r_{xx}}, \] where \(r_{xx}\) is the reliability and \(S\) is the standard deviation of observed scores.
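To make the relationship concrete, the sketch below evaluates the SEM formula for a scale with an assumed standard deviation of 10 at three assumed reliability levels. All numbers and the function name are hypothetical illustrations, not values from a study.

```r
# Standard error of measurement from the observed-score SD (s)
# and the reliability coefficient (r_xx).
sem <- function(s, r_xx) s * sqrt(1 - r_xx)

reliability <- c(0.75, 0.91, 0.99)   # hypothetical reliabilities
sem_values  <- sem(s = 10, r_xx = reliability)
round(sem_values, 2)
# 5 3 1
```

As the reliability rises from 0.75 to 0.99, the SEM shrinks from half of the standard deviation to a tenth of it.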

Applications

- A study on Kolb’s revised Learning Style Inventory with 221 business students supports the internal reliability of its scales and discusses factor analysis with ipsative data.
- Research comparing multi-item scales to single-item questions demonstrates that single items yield unreliable results and should not be used for construct inference.

Software

SOCR Cronbach's alpha calculator webapp (coming soon).

In R, the `psych` package is recommended over older alternatives like `psy`. The example below nevertheless uses `psy`, whose `cronbach()` function and bundled `expsy` dataset keep the illustration short:

# Load required packages
library(psy)
data(expsy)

# Compute alpha for first 10 items
cronbach(expsy[, 1:10])
# Result: alpha ≈ 0.176 — low due to reversed item (item 2)

# Reverse item 2 (assuming 1=high, 4=low)
revised_data <- cbind(expsy[, c(1, 3:10)], -1 * expsy[, 2])
cronbach(revised_data)
# Result: alpha ≈ 0.375

# Bootstrap 95% CI
library(boot)
cronbach.boot <- function(data, indices) {
  cronbach(data[indices, ])[[3]]  # extract alpha
}
res <- boot(revised_data, cronbach.boot, R = 1000)
quantile(res$t, c(0.025, 0.975))
# e.g., [-0.30, 0.63]

boot.ci(res, type = "bca")
# BCa CI: (-0.15, 0.67)

The coefficientalpha R package offers robust estimation with missing data and non-normality, including standard errors and confidence intervals.

Cronbach's \(\alpha\) Calculations

The table below illustrates the core structure for computing Cronbach’s alpha.

Subjects Items/Questions Total Score
\(Q_1\) \(Q_2\) \(\cdots\) \(Q_k\)
\(S_1\) \(Y_{1,1}\) \(Y_{1,2}\) \(\cdots\) \(Y_{1,k}\) \(X_1 = \sum_{j=1}^k Y_{1,j}\)
\(S_2\) \(Y_{2,1}\) \(Y_{2,2}\) \(\cdots\) \(Y_{2,k}\) \(X_2 = \sum_{j=1}^k Y_{2,j}\)
\(\vdots\) \(\vdots\) \(\vdots\) \(\ddots\) \(\vdots\) \(\vdots\)
\(S_n\) \(Y_{n,1}\) \(Y_{n,2}\) \(\cdots\) \(Y_{n,k}\) \(X_n = \sum_{j=1}^k Y_{n,j}\)
Variance \(\sigma_{Y_{.,1}}^2\) \(\sigma_{Y_{.,2}}^2\) \(\cdots\) \(\sigma_{Y_{.,k}}^2\) \(\sigma_X^2\)
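The layout of this table maps directly onto a short base R computation. The sketch below rebuilds the table for the six binary items of the split-half example (subjects in rows, questions in columns) and then combines the item variances and the total-score variance into alpha. `cronbach_alpha` is a hypothetical helper name, and sample variances (denominator \(n - 1\)) are assumed throughout.

```r
# Item scores Y[i, j]: rows = subjects S1..S6, columns = questions Q1..Q6.
Y <- matrix(c(1, 0, 0, 1, 1, 0,
              1, 1, 0, 1, 0, 1,
              1, 1, 1, 1, 1, 0,
              1, 0, 0, 0, 1, 0,
              1, 1, 1, 1, 0, 0,
              0, 0, 0, 0, 1, 0),
            nrow = 6, byrow = TRUE,
            dimnames = list(paste0("S", 1:6), paste0("Q", 1:6)))

# Alpha from the item variances and the variance of the total score.
cronbach_alpha <- function(Y) {
  k <- ncol(Y)
  item_var  <- apply(Y, 2, var)   # variance of each item column
  total_var <- var(rowSums(Y))    # variance of the total score X
  k / (k - 1) * (1 - sum(item_var) / total_var)
}

# Same layout as the table: item columns, total score, variance row.
layout <- cbind(Y, Total = rowSums(Y))
layout <- rbind(layout, Variance = apply(layout, 2, var))
print(round(layout, 3))

alpha_hat <- cronbach_alpha(Y)
round(alpha_hat, 3)
# 0.406
```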

Cronbach's \(\alpha\) Inference

Cronbach’s alpha is a point estimate. Its standard error enables interval estimation and hypothesis testing.

Parametric CIs use the Pearson correlation matrix (assumes normality).
Non-parametric CIs use Spearman correlations (robust to non-normality).
For ordinal data, consider ordinal alpha (Zumbo et al.), which uses polychoric correlations.

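The general formula for alpha is \[ \alpha = \frac{N}{N - 1} \left( 1 - \frac{\sum_{j=1}^N V(Y_j)}{V\left( \sum_{j=1}^N Y_j \right)} \right), \] where \(N\) is the number of items, \(Y_j\) is the score on item \(j\), and \(V(\cdot)\) denotes variance.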

The estimated variance of \(\hat{\alpha}\) is\[ \hat{\sigma}^2_{\hat{\alpha}} = \frac{N^2}{k(N - 1)^2} \cdot d, \] where \(d\) depends on the covariance matrix \(S\), its trace, and the vector of ones \(j\).

A \((1 - \gamma)100\%\) confidence interval is\[ \left( \hat{\alpha} - z_{\gamma/2} \hat{\sigma}_{\hat{\alpha}},\ \hat{\alpha} + z_{\gamma/2} \hat{\sigma}_{\hat{\alpha}} \right). \]
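Given a point estimate and its standard error, the interval above is a one-line computation. The sketch below takes \(z_{\gamma/2}\) to be the upper \(\gamma/2\) quantile of the standard normal distribution; the function name and the input values (an alpha of 0.80 with a standard error of 0.05) are hypothetical and serve only to show the arithmetic.

```r
# Normal-approximation confidence interval for Cronbach's alpha.
# alpha_hat: point estimate; se: its estimated standard error;
# gamma: 1 - confidence level (0.05 gives a 95% interval).
alpha_ci <- function(alpha_hat, se, gamma = 0.05) {
  z <- qnorm(1 - gamma / 2)   # upper gamma/2 standard normal quantile
  c(lower = alpha_hat - z * se, upper = alpha_hat + z * se)
}

ci <- alpha_ci(alpha_hat = 0.80, se = 0.05)   # hypothetical inputs
round(ci, 3)
# lower 0.702, upper 0.898
```

Because this interval is symmetric, it can extend beyond 1 when the estimate is close to 1 or the standard error is large, which is one reason the bootstrap intervals in the Software section are a useful alternative.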

Problems

Use the Turkiye Student Course Evaluation survey (N=5,000) to compute the ICC and Cronbach’s alpha.

References

Cronbach's alpha – Wikipedia
KR-20 – Wikipedia
Tsagris, M. (2014). Confidence intervals for Cronbach’s reliability coefficient. ResearchGate.
Zumbo, B. D., Gadermann, A. M., & Zeisser, C. (2007). Ordinal versions of coefficients alpha and theta. Practical Assessment, Research & Evaluation, 12(13).

SOCR Home page: http://www.socr.umich.edu


