Difference between revisions of "SMHS Cronbachs"
(→Cronbach’s Alpha) |
m (→Software) |
||
| (7 intermediate revisions by the same user not shown) | |||
| Line 1: | Line 1: | ||
| − | ==[[SMHS| Scientific Methods for Health Sciences]] | + | == [[SMHS|Scientific Methods for Health Sciences]] – Instrument Performance Evaluation: Cronbach's α == |
| − | ===Overview | + | === Overview === |
| − | Cronbach’s alpha | + | Cronbach’s alpha <math>\alpha</math> is a coefficient of internal consistency and is commonly used as an estimate of the reliability of a psychometric test. Internal consistency is typically a measure based on the correlations between different items on the same test and assesses whether several items that propose to measure the same general construct produce similar scores. Cronbach’s alpha is widely used in the social sciences, nursing, business, and other disciplines. This section presents a general introduction to Cronbach’s alpha, including its calculation, application in research, and common issues in its use. |
| − | ===Motivation | + | === Motivation === |
| − | We have discussed | + | We have discussed internal and external consistency and their importance in research. How do we measure internal consistency? For example, suppose we are interested in measuring the extent of handicap among patients suffering from a certain disease. The dataset contains 10 items measuring the degree of difficulty experienced in carrying out daily activities. Each item is scored from 1 (no difficulty) to 4 (can’t do). When these items are combined to form a scale, they should exhibit internal consistency: all items should measure the same underlying construct and thus be correlated with one another. Cronbach’s alpha generally increases as the correlations between items increase. |
| + | === Theory === | ||
| − | + | ==== Cronbach’s Alpha ==== | |
| − | ====Cronbach’s Alpha==== | + | Cronbach’s alpha is a measure of internal consistency or reliability of a psychometric instrument and assesses how well a set of items measures a single, one-dimensional latent trait. |
| − | Cronbach’s | ||
| − | + | Suppose we measure a quantity <math>X</math>, which is the sum of <math>K</math> components: | |
| + | <math>X = Y_1 + Y_2 + \cdots + Y_K</math>. | ||
| + | Then Cronbach’s alpha is defined as: | ||
| + | <math> | ||
| + | \alpha = \frac{K}{K - 1} \left( 1 - \frac{\sum_{i=1}^{K} \sigma_{Y_i}^2}{\sigma_X^2} \right), | ||
| + | </math> | ||
| + | where <math>\sigma_X^2</math> is the variance of the observed total test scores and <math>\sigma_{Y_i}^2</math> is the variance of component <math>i</math> in the current sample. | ||
| − | + | If items are dichotomous (scored 0 or 1), then: |
| + | <math> | ||
| + | \alpha = \frac{K}{K - 1} \left( 1 - \frac{\sum_{i=1}^{K} P_i Q_i}{\sigma_X^2} \right), | ||
| + | </math> | ||
| + | where <math>P_i</math> is the proportion scoring 1 on item <math>i</math>, and <math>Q_i = 1 - P_i</math>. | ||
| − | + | Alternatively, Cronbach’s alpha can be expressed as: | |
| + | <math> | ||
| + | \alpha = \frac{K \bar{c}}{\bar{v} + (K - 1) \bar{c}}, | ||
| + | </math> | ||
| + | where <math>\bar{v}</math> is the average variance of each component and <math>\bar{c}</math> is the average covariance between all item pairs. | ||
| − | + | The standardized Cronbach’s alpha is: | |
| + | <math> | ||
| + | \alpha_{\text{standardized}} = \frac{K \bar{r}}{1 + (K - 1) \bar{r}}, | ||
| + | </math> | ||
| + | where <math>\bar{r}</math> is the mean of the <math>K(K - 1)/2</math> non-redundant correlation coefficients (e.g., from the upper triangle of the correlation matrix). | ||
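| − | ====Internal | + | As an estimate of reliability, alpha should in theory lie between 0 and 1, because reliability is a ratio of two variances; sample estimates of alpha can nevertheless be negative. The reliability of test scores is defined as: |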
| − | Internal consistency | + | <math> |
| + | \rho_{XX} = \frac{\sigma_T^2}{\sigma_X^2}, | ||
| + | </math> | ||
| + | the ratio of true-score variance to total-score variance. | ||
| + | |||
| + | ==== Internal Consistency ==== | ||
| + | Internal consistency measures whether several items hypothesized to reflect the same construct yield similar scores. It is usually quantified using Cronbach’s alpha, which is derived from pairwise item correlations. Estimates of alpha can take any value less than or equal to 1, including negative values. Negative values occur when within-subject variability exceeds between-subject variability. Only positive values of Cronbach’s alpha are interpretable. | ||
| + | |||
| + | Cronbach’s alpha increases as inter-item correlations increase. | ||
<center> | <center> | ||
{| class="wikitable" style="text-align:center; width:35%" border="1" | {| class="wikitable" style="text-align:center; width:35%" border="1" | ||
|- | |- | ||
| − | |Cronbach's alpha|| Internal consistency | + | | Cronbach's alpha || Internal consistency |
|- | |- | ||
| − | | | + | | <math>\alpha \geq 0.9</math> || Excellent (High-stakes testing) |
|- | |- | ||
| − | |0.7 | + | | <math>0.7 \leq \alpha < 0.9</math> || Good (Low-stakes testing) |
|- | |- | ||
| − | |0.6 | + | | <math>0.6 \leq \alpha < 0.7</math> || Acceptable |
|- | |- | ||
| − | |0.5 | + | | <math>0.5 \leq \alpha < 0.6</math> || Poor |
| − | |||
| − | |||
|- | |- | ||
| + | | <math>\alpha < 0.5</math> || Unacceptable | ||
|} | |} | ||
</center> | </center> | ||
| − | ====Other Measures==== | + | ==== Other Measures ==== |
| − | |||
| − | |||
| − | + | Intra-class correlation (ICC) assesses the consistency or reproducibility of quantitative measurements made by different observers measuring the same quantity. Broadly, ICC is defined as: | |
| + | <math> | ||
| + | \text{ICC} = \frac{\text{Variance due to rated subjects (patients)}}{\text{Variance due to subjects} + \text{Variance due to judges} + \text{Residual variance}}. | ||
| + | </math> | ||
| + | |||
| + | Example: Suppose 4 nurses rate 6 patients on a 10-point depression scale: | ||
<center> | <center> | ||
{| class="wikitable" style="text-align:center; width:75%" border="1" | {| class="wikitable" style="text-align:center; width:75%" border="1" | ||
|- | |- | ||
| − | !PatientID||NurseRater1||NurseRater2||NurseRater3||NurseRater4 | + | ! PatientID || NurseRater1 || NurseRater2 || NurseRater3 || NurseRater4 |
|- | |- | ||
| − | |1||9||2||5||8 | + | | 1 || 9 || 2 || 5 || 8 |
|- | |- | ||
| − | |2||6||1||3||2 | + | | 2 || 6 || 1 || 3 || 2 |
|- | |- | ||
| − | |3||8||4||6||8 | + | | 3 || 8 || 4 || 6 || 8 |
|- | |- | ||
| − | |4||7||1||2||6 | + | | 4 || 7 || 1 || 2 || 6 |
|- | |- | ||
| − | |5||10||5||6||9 | + | | 5 || 10 || 5 || 6 || 9 |
|- | |- | ||
| − | |6||6||2||4||7 | + | | 6 || 6 || 2 || 4 || 7 |
| − | |}</center> | + | |} |
| + | </center> | ||
| + | |||
| + | This data can also be formatted in long form: | ||
| − | |||
<center> | <center> | ||
{| class="wikitable" style="text-align:center; width:75%" border="1" | {| class="wikitable" style="text-align:center; width:75%" border="1" | ||
|- | |- | ||
| − | !PatientID||Rating||Nurse | + | ! PatientID || Rating || Nurse |
|- | |- | ||
| − | |1||9||1 | + | | 1 || 9 || 1 |
|- | |- | ||
| − | |2||6||1 | + | | 2 || 6 || 1 |
|- | |- | ||
| − | |3||8||1 | + | | 3 || 8 || 1 |
|- | |- | ||
| − | |4||7||1 | + | | 4 || 7 || 1 |
|- | |- | ||
| − | |5||10||1 | + | | 5 || 10 || 1 |
|- | |- | ||
| − | |6||6||1 | + | | 6 || 6 || 1 |
|- | |- | ||
| − | |7||2||2 | + | | 1 || 2 || 2 |
|- | |- | ||
| − | |8||1||2 | + | | 2 || 1 || 2 |
|- | |- | ||
| − | |9||4||2 | + | | 3 || 4 || 2 |
|- | |- | ||
| − | |10||1||2 | + | | 4 || 1 || 2 |
|- | |- | ||
| − | |11||5||2 | + | | 5 || 5 || 2 |
|- | |- | ||
| − | |12||2||2 | + | | 6 || 2 || 2 |
|- | |- | ||
| − | |13||5||3 | + | | 1 || 5 || 3 |
|- | |- | ||
| − | |14||3||3 | + | | 2 || 3 || 3 |
|- | |- | ||
| − | |15||6||3 | + | | 3 || 6 || 3 |
|- | |- | ||
| − | |16||2||3 | + | | 4 || 2 || 3 |
|- | |- | ||
| − | |17||6||3 | + | | 5 || 6 || 3 |
|- | |- | ||
| − | |18||4||3 | + | | 6 || 4 || 3 |
|- | |- | ||
| − | |19||8||4 | + | | 1 || 8 || 4 |
|- | |- | ||
| − | |20||2||4 | + | | 2 || 2 || 4 |
|- | |- | ||
| − | |21||8||4 | + | | 3 || 8 || 4 |
|- | |- | ||
| − | |22||6||4 | + | | 4 || 6 || 4 |
|- | |- | ||
| − | |23||9||4 | + | | 5 || 9 || 4 |
|- | |- | ||
| − | |24||7||4 | + | | 6 || 7 || 4 |
|} | |} | ||
</center> | </center> | ||
| − | + | <pre> | |
| − | + | install.packages("ICC") | |
| − | + | library("ICC") | |
| − | + | ||
| − | + | # Load the long-format data (adjust path as needed) |
| − | + | dataset <- read.csv('C:\\Users\\Desktop\\Nurse_data.csv', header = TRUE) |
| − | + | dataset$Patient <- factor(rep(1:6, times = 4)) # assumes rows are ordered by nurse, then patient |
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | + | # One-way ICC: patients are the groups (x), ratings are the measurements (y) |
| − | + | icc_result <- ICCest(Patient, Rating, data = dataset) |
| + | icc_result$ICC | ||
| + | # ICC: approximately 0.166 | ||
| + | icc_result[c("LowerCI", "UpperCI")] # 95% confidence limits | ||
| + | </pre> | ||
| − | Cronbach’s alpha equals | + | Cronbach’s alpha equals the stepped-up intra-class correlation coefficient in observational studies if and only if the item variance component is zero. If this component is negative, alpha underestimates the ICC; if positive, it overestimates it. |
| − | ====Generalizability | + | ==== Generalizability Theory ==== |
| − | Cronbach’s alpha is an unbiased estimate of | + | Cronbach’s alpha is an unbiased estimate of generalizability. It can be interpreted as a measure of how well the sum score on selected items captures the expected score over the entire domain—even if the domain is heterogeneous. |
| − | ====Problems with Cronbach’s | + | ==== Problems with Cronbach’s Alpha ==== |
| − | + | - Alpha depends not only on the magnitude of inter-item correlations but also on the number of items. Scales can appear more homogeneous simply by adding more items, even if average correlation remains unchanged. | |
| − | + | - Combining two distinct constructs into one scale may yield a high alpha despite measuring two different attributes. | |
| − | + | - Excessively high alpha (<math>> 0.95</math>) may indicate item redundancy. | |
| − | ====Split-Half Reliability==== | + | ==== Split-Half Reliability ==== |
| − | In | + | In split-half reliability, the test is divided into two halves (e.g., odd vs. even items). The correlation between halves is adjusted using the Spearman–Brown prophecy formula: |
| + | <math> | ||
| + | r' = \frac{n r}{1 + (n - 1) r}, | ||
| + | </math> | ||
| + | where <math>r</math> is the raw correlation between the halves and <math>n</math> is the factor by which the test is lengthened (<math>n = 2</math> when two halves are combined). | ||
Example: | Example: | ||
| Line 156: | Line 188: | ||
{| class="wikitable" style="text-align:center; width:35%" border="1" | {| class="wikitable" style="text-align:center; width:35%" border="1" | ||
|- | |- | ||
| − | |Index|| Q1|| Q2|| Q3|| Q4|| Q5|| Q6|| Odd|| Even | + | | Index || Q1 || Q2 || Q3 || Q4 || Q5 || Q6 || Odd || Even |
|- | |- | ||
| − | |1 ||1|| 0|| 0|| 1|| 1|| 0|| 2|| 1 | + | | 1 || 1 || 0 || 0 || 1 || 1 || 0 || 2 || 1 |
|- | |- | ||
| − | |2|| 1|| 1 ||0 ||1|| 0 ||1|| 1|| 3 | + | | 2 || 1 || 1 || 0 || 1 || 0 || 1 || 1 || 3 |
|- | |- | ||
| − | |3|| 1|| 1|| 1|| 1|| 1|| 0|| 3|| 2 | + | | 3 || 1 || 1 || 1 || 1 || 1 || 0 || 3 || 2 |
|- | |- | ||
| − | |4 ||1 ||0 ||0 ||0 ||1 ||0|| 2|| 0 | + | | 4 || 1 || 0 || 0 || 0 || 1 || 0 || 2 || 0 |
|- | |- | ||
| − | |5|| 1|| 1|| 1|| 1|| 0|| 0|| 2|| 2 | + | | 5 || 1 || 1 || 1 || 1 || 0 || 0 || 2 || 2 |
|- | |- | ||
| − | |6 ||0|| 0 ||0 ||0 ||1 ||0 ||1|| 0 | + | | 6 || 0 || 0 || 0 || 0 || 1 || 0 || 1 || 0 |
|- | |- | ||
| − | | colspan=6 rowspan=4 | + | | colspan=6 rowspan=4 | || mean || 1.833 || 1.333 |
|- | |- | ||
| − | | SD|| 0. | + | | SD || 0.753 || 1.211 |
| − | |- | + | |- |
| − | | corr(Even,Odd)|| 0. | + | | corr(Even, Odd) || 0.073 || |
| − | |- | + | |- |
| − | AdjCorr | + | AdjCorr = <math>\frac{n r}{1 + (n-1) r}</math> || 0.136 || |
| − | | | ||
|} | |} | ||
</center> | </center> | ||
| − | ====KR-20==== | + | ==== KR-20 ==== |
| − | The | + | The Kuder–Richardson Formula 20 (KR-20) is a reliability estimate for dichotomous items: |
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | === | + | <math> |
| + | \text{KR-20} = \frac{K}{K - 1} \left( 1 - \frac{\sum_{i=1}^K p_i q_i}{\sigma_X^2} \right), | ||
| + | </math> | ||
| + | where <math>p_i</math> is the proportion of correct responses to item <math>i</math>, <math>q_i = 1 - p_i</math>, and <math>\sigma_X^2</math> is the sample variance of total scores. | ||
| − | + | KR-20 is a special case of Cronbach’s alpha for binary data. It cannot accommodate partial credit or ordinal responses. | |
| − | + | ==== Standard Error of Measurement (SEM) ==== | |
| + | The greater the reliability, the smaller the SEM: | ||
| − | = | + | <math> |
| + | \text{SEM} = S \sqrt{1 - r_{xx}}, | ||
| + | </math> | ||
| + | where <math>r_{xx}</math> is the reliability and <math>S</math> is the standard deviation of observed scores. | ||
| − | + | === Applications === | |
| + | - A study on Kolb’s revised Learning Style Inventory with 221 business students supports the internal reliability of its scales and discusses factor analysis with ipsative data. | ||
| + | - Research comparing multi-item scales to single-item questions demonstrates that single items yield unreliable results and should not be used for construct inference. | ||
| − | + | === Software === | |
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | + | SOCR Cronbach's alpha calculator webapp (coming soon). | |
| − | |||
| − | |||
| − | |||
| − | |||
| − | + | In R, the `psy` package provides a `cronbach()` function, and the `psych` package offers an alternative `alpha()` function. The example below uses `psy` and its `expsy` dataset: |
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | + | <pre> | |
| − | + | # Load required packages | |
| − | + | library(psy) | |
| + | data(expsy) | ||
| − | + | # Compute alpha for first 10 items | |
| − | + | cronbach(expsy[, 1:10]) | |
| − | + | # Result: alpha ≈ 0.176 — low due to reversed item (item 2) | |
| − | |||
| − | |||
| − | |||
| − | + | # Reverse item 2 (assuming 1=high, 4=low) | |
| − | + | revised_data <- cbind(expsy[, c(1, 3:10)], -1 * expsy[, 2]) | |
| − | + | cronbach(revised_data) | |
| − | + | # Result: alpha ≈ 0.375 | |
| − | |||
| − | |||
| − | |||
| − | + | # Bootstrap 95% CI | |
| − | + | library(boot) | |
| − | + | cronbach.boot <- function(data, indices) { | |
| − | + | cronbach(data[indices, ])[[3]] # extract alpha | |
| − | + | } | |
| + | res <- boot(revised_data, cronbach.boot, R = 1000) | ||
| + | quantile(res$t, c(0.025, 0.975)) | ||
| + | # e.g., [-0.30, 0.63] | ||
| − | + | boot.ci(res, type = "bca") | |
| − | + | # BCa CI: (-0.15, 0.67) | |
| − | + | </pre> | |
| − | |||
| − | |||
| − | + | The '''coefficientalpha''' R package offers robust estimation with missing data and non-normality, including standard errors and confidence intervals. | |
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | + | === Cronbach's <math>\alpha</math> Calculations === | |
| − | + | The table below illustrates the core structure for computing Cronbach’s alpha. | |
| − | The table below illustrates the | ||
<center> | <center> | ||
{| align="center" border="1" | {| align="center" border="1" | ||
|- | |- | ||
| − | | | + | | rowspan="2" | Subjects || colspan="4" align="center" | Items/Questions || rowspan="2" | Total Score |
|- | |- | ||
| − | | | + | | <math>Q_1</math> || <math>Q_2</math> || … || <math>Q_k</math> |
|- | |- | ||
| − | | | + | | <math>S_1</math> || <math>Y_{1,1}</math> || <math>Y_{1,2}</math> || … || <math>Y_{1,k}</math> || <math>X_1 = \sum_{j=1}^k Y_{1,j}</math> |
|- | |- | ||
| − | | | + | | <math>S_2</math> || <math>Y_{2,1}</math> || <math>Y_{2,2}</math> || … || <math>Y_{2,k}</math> || <math>X_2 = \sum_{j=1}^k Y_{2,j}</math> |
|- | |- | ||
| − | | | + | | … || … || … || … || … || … |
|- | |- | ||
| − | | | + | | <math>S_n</math> || <math>Y_{n,1}</math> || <math>Y_{n,2}</math> || … || <math>Y_{n,k}</math> || <math>X_n = \sum_{j=1}^k Y_{n,j}</math> |
|- | |- | ||
| − | | | + | | Variance || <math>\sigma_{Y_{.,1}}^2</math> || <math>\sigma_{Y_{.,2}}^2</math> || … || <math>\sigma_{Y_{.,k}}^2</math> || <math>\sigma_X^2</math> |
|} | |} | ||
</center> | </center> | ||
| − | ===Cronbach's | + | === Cronbach's <math>\alpha</math> Inference === |
| − | + | Cronbach’s alpha is a point estimate. Its standard error enables interval estimation and hypothesis testing. | |
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | : | + | : Parametric CIs use the Pearson correlation matrix (assumes normality). |
| − | + | : Non-parametric CIs use Spearman correlations (robust to non-normality). | |
| − | + | : For ordinal data, consider ordinal alpha (Zumbo et al.), which uses polychoric correlations. | |
| − | : | + | The general formula for alpha, for a scale with <math>K</math> items, is: |
| + | <math> | ||
| + | \alpha = \frac{K}{K - 1} \left( 1 - \frac{\sum_{j=1}^K V(Y_j)}{V\left( \sum_{j=1}^K Y_j \right)} \right). | ||
| + | </math> | ||
| − | + | The estimated variance of <math>\hat{\alpha}</math>, based on a sample of <math>n</math> subjects, is: |
| − | <math> |
| − | \hat{\sigma}^2_{\hat{\alpha}} = \frac{K^2}{n(K - 1)^2} \cdot d, |
| + | </math> | ||
| + | where <math>d</math> depends on the sample covariance matrix <math>S</math> of the items, its trace, and the vector of ones <math>j</math>. | ||
| − | + | A <math>(1 - \gamma)100\%</math> confidence interval is: | |
| + | <math> | ||
| + | \left( \hat{\alpha} - z_{\gamma/2} \hat{\sigma}_{\hat{\alpha}},\ | ||
| + | \hat{\alpha} + z_{\gamma/2} \hat{\sigma}_{\hat{\alpha}} \right). | ||
| + | </math> | ||
| + | === Problems === | ||
| + | Use the [[SOCR_TurkiyeStudentEvalData|Turkiye Student Course Evaluation survey (N=5,000)]] to compute the ICC and Cronbach’s alpha. | ||
| − | ===References=== | + | === References === |
| − | |||
| − | |||
| + | : [https://en.wikipedia.org/wiki/Cronbach%27s_alpha Cronbach's alpha – Wikipedia] | ||
| + | : [https://en.wikipedia.org/wiki/Kuder%E2%80%93Richardson_Formula_20 KR-20 – Wikipedia] | ||
| + | : Tsagris, M. (2014). Confidence intervals for Cronbach’s reliability coefficient. ResearchGate. | ||
| + | : Zumbo, B. D., Gadermann, A. M., & Zeisser, C. (2007). Ordinal versions of coefficients alpha and theta. Practical Assessment, Research & Evaluation, 12(13). | ||
<hr> | <hr> | ||
| − | + | SOCR Home page: http://www.socr.umich.edu | |
| − | |||
{{translate|pageName=http://wiki.socr.umich.edu/index.php?title=SMHS_Cronbachs}} | {{translate|pageName=http://wiki.socr.umich.edu/index.php?title=SMHS_Cronbachs}} | ||
Latest revision as of 17:30, 10 February 2026
Scientific Methods for Health Sciences – Instrument Performance Evaluation: Cronbach's α
Overview
Cronbach’s alpha (\(\alpha\)) is a coefficient of internal consistency and is commonly used as an estimate of the reliability of a psychometric test. Internal consistency is typically based on the correlations between different items on the same test and assesses whether several items that purport to measure the same general construct produce similar scores. Cronbach’s alpha is widely used in the social sciences, nursing, business, and other disciplines. This section presents a general introduction to Cronbach’s alpha, including its calculation, application in research, and common issues in its use.
Motivation
We have discussed internal and external consistency and their importance in research. How do we measure internal consistency? For example, suppose we are interested in measuring the extent of handicap among patients suffering from a certain disease. The dataset contains 10 items measuring the degree of difficulty experienced in carrying out daily activities. Each item is scored from 1 (no difficulty) to 4 (can’t do). When these items are combined to form a scale, they should exhibit internal consistency: all items should measure the same underlying construct and thus be correlated with one another. Cronbach’s alpha generally increases as the correlations between items increase.
Theory
Cronbach’s Alpha
Cronbach’s alpha is a measure of internal consistency or reliability of a psychometric instrument and assesses how well a set of items measures a single, one-dimensional latent trait.
Suppose we measure a quantity \(X\), which is the sum of \(K\) components: \[ X = Y_1 + Y_2 + \cdots + Y_K. \] Then Cronbach’s alpha is defined as: \[ \alpha = \frac{K}{K - 1} \left( 1 - \frac{\sum_{i=1}^{K} \sigma_{Y_i}^2}{\sigma_X^2} \right), \] where \(\sigma_X^2\) is the variance of the observed total test scores and \(\sigma_{Y_i}^2\) is the variance of component \(i\) in the current sample.
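As a rough illustration of the definition, the sketch below applies it in base R to the six binary items (Q1 to Q6) from the split-half example later in this section. The object names `items`, `item_var`, and `total_var` are chosen here for illustration only.

```r
# Six subjects (rows) answering six binary items (columns), taken from the
# split-half example table in this section.
items <- matrix(c(
  1, 0, 0, 1, 1, 0,
  1, 1, 0, 1, 0, 1,
  1, 1, 1, 1, 1, 0,
  1, 0, 0, 0, 1, 0,
  1, 1, 1, 1, 0, 0,
  0, 0, 0, 0, 1, 0
), nrow = 6, byrow = TRUE, dimnames = list(NULL, paste0("Q", 1:6)))

K <- ncol(items)                    # number of components
item_var  <- apply(items, 2, var)   # sample variance of each item
total_var <- var(rowSums(items))    # sample variance of the total score X

alpha <- (K / (K - 1)) * (1 - sum(item_var) / total_var)
round(alpha, 3)  # 0.406
```

With only six subjects this value is very imprecise; the point of the sketch is the arithmetic, not the estimate.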
If items are dichotomous (scored 0 or 1), then: \[ \alpha = \frac{K}{K - 1} \left( 1 - \frac{\sum_{i=1}^{K} P_i Q_i}{\sigma_X^2} \right), \] where \(P_i\) is the proportion scoring 1 on item \(i\), and \(Q_i = 1 - P_i\).
Alternatively, Cronbach’s alpha can be expressed as: \[ \alpha = \frac{K \bar{c}}{\bar{v} + (K - 1) \bar{c}}, \] where \(\bar{v}\) is the average variance of each component and \(\bar{c}\) is the average covariance between all item pairs.
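The two expressions for alpha are algebraically the same, because the variance of the total score equals the sum of all entries of the item covariance matrix. The following base R sketch checks this numerically on the six binary items from the split-half example in this section; the variable names are illustrative.

```r
# Binary item responses from the split-half example (6 subjects x 6 items).
items <- matrix(c(
  1, 0, 0, 1, 1, 0,
  1, 1, 0, 1, 0, 1,
  1, 1, 1, 1, 1, 0,
  1, 0, 0, 0, 1, 0,
  1, 1, 1, 1, 0, 0,
  0, 0, 0, 0, 1, 0
), nrow = 6, byrow = TRUE)

S <- cov(items)                   # item covariance matrix
K <- ncol(items)
v_bar <- mean(diag(S))            # average item variance
c_bar <- mean(S[upper.tri(S)])    # average covariance over distinct item pairs

alpha_cov <- K * c_bar / (v_bar + (K - 1) * c_bar)
# Variance form: sum(S) is the variance of the total score.
alpha_var <- (K / (K - 1)) * (1 - sum(diag(S)) / sum(S))

all.equal(alpha_cov, alpha_var)   # TRUE
```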
The standardized Cronbach’s alpha is: \[ \alpha_{\text{standardized}} = \frac{K \bar{r}}{1 + (K - 1) \bar{r}}, \] where \(\bar{r}\) is the mean of the \(K(K - 1)/2\) non-redundant correlation coefficients (e.g., from the upper triangle of the correlation matrix).
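A minimal base R sketch of the standardized coefficient, again using the six binary items from the split-half example in this section as illustrative data. The mean correlation is taken over the upper triangle of the correlation matrix, as described above.

```r
# Binary item responses from the split-half example (6 subjects x 6 items).
items <- matrix(c(
  1, 0, 0, 1, 1, 0,
  1, 1, 0, 1, 0, 1,
  1, 1, 1, 1, 1, 0,
  1, 0, 0, 0, 1, 0,
  1, 1, 1, 1, 0, 0,
  0, 0, 0, 0, 1, 0
), nrow = 6, byrow = TRUE)

R <- cor(items)                   # item correlation matrix
K <- ncol(items)
r_bar <- mean(R[upper.tri(R)])    # mean of the K(K - 1)/2 distinct correlations

alpha_std <- K * r_bar / (1 + (K - 1) * r_bar)
alpha_std
```

Standardized alpha is the same as ordinary alpha computed on z-scored items, so it differs from the raw coefficient whenever the item variances are unequal.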
As an estimate of reliability, alpha should in theory lie between 0 and 1, because reliability is a ratio of two variances; sample estimates of alpha can nevertheless be negative. The reliability of test scores is defined as: \[ \rho_{XX} = \frac{\sigma_T^2}{\sigma_X^2}, \] the ratio of true-score variance to total-score variance.
Internal Consistency
Internal consistency measures whether several items hypothesized to reflect the same construct yield similar scores. It is usually quantified using Cronbach’s alpha, which is derived from pairwise item correlations. Estimates of alpha can take any value less than or equal to 1, including negative values. Negative values occur when within-subject variability exceeds between-subject variability. Only positive values of Cronbach’s alpha are interpretable.
Cronbach’s alpha increases as inter-item correlations increase. A commonly used rule of thumb for interpreting its value is:
| Cronbach's alpha | Internal consistency |
|---|---|
| \(\alpha \geq 0.9\) | Excellent (High-stakes testing) |
| \(0.7 \leq \alpha < 0.9\) | Good (Low-stakes testing) |
| \(0.6 \leq \alpha < 0.7\) | Acceptable |
| \(0.5 \leq \alpha < 0.6\) | Poor |
| \(\alpha < 0.5\) | Unacceptable |
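The thresholds in the table can be turned into a small lookup helper. The function name `interpret_alpha` is hypothetical; the sketch simply encodes the cut points listed above using base R's `cut()`.

```r
# Map alpha values to the rule-of-thumb labels from the table above.
interpret_alpha <- function(alpha) {
  cut(alpha,
      breaks = c(-Inf, 0.5, 0.6, 0.7, 0.9, Inf),
      labels = c("Unacceptable", "Poor", "Acceptable", "Good", "Excellent"),
      right = FALSE)  # intervals are closed on the left, e.g. [0.7, 0.9)
}

as.character(interpret_alpha(c(0.95, 0.70, 0.65, 0.50, 0.20)))
# "Excellent" "Good" "Acceptable" "Poor" "Unacceptable"
```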
Other Measures
Intra-class correlation (ICC) assesses the consistency or reproducibility of quantitative measurements made by different observers measuring the same quantity. Broadly, ICC is defined as: \[ \text{ICC} = \frac{\text{Variance due to rated subjects (patients)}}{\text{Variance due to subjects} + \text{Variance due to judges} + \text{Residual variance}}. \]
Example: Suppose 4 nurses rate 6 patients on a 10-point depression scale:
| PatientID | NurseRater1 | NurseRater2 | NurseRater3 | NurseRater4 |
|---|---|---|---|---|
| 1 | 9 | 2 | 5 | 8 |
| 2 | 6 | 1 | 3 | 2 |
| 3 | 8 | 4 | 6 | 8 |
| 4 | 7 | 1 | 2 | 6 |
| 5 | 10 | 5 | 6 | 9 |
| 6 | 6 | 2 | 4 | 7 |
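To make the three-component definition concrete, the sketch below estimates the subject, judge, and residual variance components for the ratings above from two-way ANOVA mean squares, using base R only. This assumes a two-way random-effects layout with one rating per patient and nurse; it is one way to instantiate the broad definition, and one-way estimators that ignore the judge term will generally give a different value.

```r
# Ratings from the table above: rows are patients, columns are nurses.
ratings <- matrix(c(
   9, 2, 5, 8,
   6, 1, 3, 2,
   8, 4, 6, 8,
   7, 1, 2, 6,
  10, 5, 6, 9,
   6, 2, 4, 7
), nrow = 6, byrow = TRUE)

n <- nrow(ratings)   # subjects (patients)
k <- ncol(ratings)   # judges (nurses)
grand <- mean(ratings)

# Mean squares for subjects, judges, and the residual.
ms_subj  <- k * sum((rowMeans(ratings) - grand)^2) / (n - 1)
ms_judge <- n * sum((colMeans(ratings) - grand)^2) / (k - 1)
ss_resid <- sum((ratings - grand)^2) - (n - 1) * ms_subj - (k - 1) * ms_judge
ms_resid <- ss_resid / ((n - 1) * (k - 1))

# Method-of-moments variance components.
var_subj  <- (ms_subj - ms_resid) / k
var_judge <- (ms_judge - ms_resid) / n

icc <- var_subj / (var_subj + var_judge + ms_resid)
round(icc, 3)  # 0.29
```

The large judge component reflects the fact that the four nurses use quite different rating levels for the same patients.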
These data can also be formatted in long form, with one row per rating:
| PatientID | Rating | Nurse |
|---|---|---|
| 1 | 9 | 1 |
| 2 | 6 | 1 |
| 3 | 8 | 1 |
| 4 | 7 | 1 |
| 5 | 10 | 1 |
| 6 | 6 | 1 |
| 1 | 2 | 2 |
| 2 | 1 | 2 |
| 3 | 4 | 2 |
| 4 | 1 | 2 |
| 5 | 5 | 2 |
| 6 | 2 | 2 |
| 1 | 5 | 3 |
| 2 | 3 | 3 |
| 3 | 6 | 3 |
| 4 | 2 | 3 |
| 5 | 6 | 3 |
| 6 | 4 | 3 |
| 1 | 8 | 4 |
| 2 | 2 | 4 |
| 3 | 8 | 4 |
| 4 | 6 | 4 |
| 5 | 9 | 4 |
| 6 | 7 | 4 |
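install.packages("ICC")  # run once
library("ICC")
# Load the long-format data (adjust path as needed)
dataset <- read.csv('C:\\Users\\Desktop\\Nurse_data.csv', header = TRUE)
# Assumes rows are ordered by nurse, then by patient: build an explicit patient label (1-6)
dataset$Patient <- factor(rep(1:6, times = 4))
# One-way ICC: patients are the grouping factor (x), ratings are the measurements (y)
icc_result <- ICCest(Patient, Rating, data = dataset)
icc_result$ICC
# ICC: approximately 0.166
icc_result[c("LowerCI", "UpperCI")]  # 95% confidence limits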
Cronbach’s alpha equals the stepped-up intra-class correlation coefficient in observational studies if and only if the item variance component is zero. If this component is negative, alpha underestimates the ICC; if positive, it overestimates it.
Generalizability Theory
Cronbach’s alpha is an unbiased estimate of generalizability. It can be interpreted as a measure of how well the sum score on selected items captures the expected score over the entire domain—even if the domain is heterogeneous.
Problems with Cronbach’s Alpha
- Alpha depends not only on the magnitude of inter-item correlations but also on the number of items. Scales can appear more homogeneous simply by adding more items, even if average correlation remains unchanged.
- Combining two distinct constructs into one scale may yield a high alpha despite measuring two different attributes.
- Excessively high alpha (\(> 0.95\)) may indicate item redundancy.
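The dependence on the number of items can be seen directly from the standardized formula given earlier. The sketch below holds the average inter-item correlation fixed at an illustrative value of 0.3 and varies only the number of items; the function name `alpha_std` is chosen here for illustration.

```r
# Standardized alpha as a function of the number of items K and the
# average inter-item correlation r_bar.
alpha_std <- function(K, r_bar) K * r_bar / (1 + (K - 1) * r_bar)

round(alpha_std(K = c(5, 10, 20), r_bar = 0.3), 3)
# 0.682 0.811 0.896
```

The same modest correlation moves a scale from "acceptable" to nearly "excellent" simply by quadrupling its length.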
Split-Half Reliability
In split-half reliability, the test is divided into two halves (e.g., odd vs. even items). The correlation between the halves is then adjusted using the Spearman–Brown prophecy formula: \[ r' = \frac{n r}{1 + (n - 1) r}, \] where \(r\) is the raw correlation between the halves and \(n\) is the factor by which the test is lengthened (\(n = 2\) when two halves are combined).
Example:
| Index | Q1 | Q2 | Q3 | Q4 | Q5 | Q6 | Odd | Even |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 0 | 1 | 1 | 0 | 2 | 1 |
| 2 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 3 |
| 3 | 1 | 1 | 1 | 1 | 1 | 0 | 3 | 2 |
| 4 | 1 | 0 | 0 | 0 | 1 | 0 | 2 | 0 |
| 5 | 1 | 1 | 1 | 1 | 0 | 0 | 2 | 2 |
| 6 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 |
| | | | | | | Mean | 1.833 | 1.333 |
| | | | | | | SD | 0.753 | 1.211 |
| | | | | | | corr(Even, Odd) | 0.073 | |
| | | | | | | AdjCorr = \(\frac{n r}{1 + (n-1) r}\) | 0.136 | |
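The table values can be reproduced in a few lines of base R. The sketch below forms the odd and even half-scores from the six items, correlates them, and applies the Spearman–Brown adjustment with \(n = 2\); the variable names are illustrative.

```r
# Item responses from the table above (6 subjects x 6 items).
items <- matrix(c(
  1, 0, 0, 1, 1, 0,
  1, 1, 0, 1, 0, 1,
  1, 1, 1, 1, 1, 0,
  1, 0, 0, 0, 1, 0,
  1, 1, 1, 1, 0, 0,
  0, 0, 0, 0, 1, 0
), nrow = 6, byrow = TRUE)

odd  <- rowSums(items[, c(1, 3, 5)])   # Q1 + Q3 + Q5
even <- rowSums(items[, c(2, 4, 6)])   # Q2 + Q4 + Q6

r <- cor(odd, even)                    # raw half-test correlation
n <- 2                                 # full test is twice as long as each half
r_adj <- n * r / (1 + (n - 1) * r)     # Spearman-Brown adjustment

round(c(r = r, adjusted = r_adj), 3)   # 0.073 0.136
```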
KR-20
The Kuder–Richardson Formula 20 (KR-20) is a reliability estimate for dichotomous items: \[ \text{KR-20} = \frac{K}{K - 1} \left( 1 - \frac{\sum_{i=1}^K p_i q_i}{\sigma_X^2} \right), \] where \(p_i\) is the proportion of correct responses to item \(i\), \(q_i = 1 - p_i\), and \(\sigma_X^2\) is the sample variance of total scores.
KR-20 is a special case of Cronbach’s alpha for binary data. It cannot accommodate partial credit or ordinal responses.
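The sketch below illustrates the special-case relationship on the six binary items from the split-half example. One assumption is made explicit: because \(p_i q_i\) is an item variance computed with divisor \(n\), the total-score variance is computed with divisor \(n\) as well. Under that convention KR-20 coincides with Cronbach's alpha computed from the usual \(n - 1\) sample variances, since the divisors cancel in the ratio.

```r
# Binary item responses from the split-half example (6 subjects x 6 items).
items <- matrix(c(
  1, 0, 0, 1, 1, 0,
  1, 1, 0, 1, 0, 1,
  1, 1, 1, 1, 1, 0,
  1, 0, 0, 0, 1, 0,
  1, 1, 1, 1, 0, 0,
  0, 0, 0, 0, 1, 0
), nrow = 6, byrow = TRUE)

K <- ncol(items)
p <- colMeans(items)                         # proportion scoring 1 on each item
q <- 1 - p
total <- rowSums(items)
var_total_n <- mean((total - mean(total))^2) # total-score variance, divisor n

kr20 <- (K / (K - 1)) * (1 - sum(p * q) / var_total_n)

# Cronbach's alpha from sample (n - 1) variances, for comparison.
alpha <- (K / (K - 1)) * (1 - sum(apply(items, 2, var)) / var(total))

all.equal(kr20, alpha)   # TRUE
```

Mixing \(p_i q_i\) with an \(n - 1\) total-score variance would give a slightly different number, which matters mainly in small samples like this one.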
Standard Error of Measurement (SEM)
The greater the reliability, the smaller the SEM: \[ \text{SEM} = S \sqrt{1 - r_{xx}}, \] where \(r_{xx}\) is the reliability and \(S\) is the standard deviation of observed scores.
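As a small worked example, the sketch below plugs Cronbach's alpha in as the reliability \(r_{xx}\) for the total scores of the six binary items from the split-half example. Using alpha as the reliability estimate is an assumption made here for illustration; any other reliability coefficient could be substituted.

```r
# Binary item responses from the split-half example (6 subjects x 6 items).
items <- matrix(c(
  1, 0, 0, 1, 1, 0,
  1, 1, 0, 1, 0, 1,
  1, 1, 1, 1, 1, 0,
  1, 0, 0, 0, 1, 0,
  1, 1, 1, 1, 0, 0,
  0, 0, 0, 0, 1, 0
), nrow = 6, byrow = TRUE)

K <- ncol(items)
total <- rowSums(items)

# Reliability estimate: Cronbach's alpha of the six items.
r_xx <- (K / (K - 1)) * (1 - sum(apply(items, 2, var)) / var(total))

S <- sd(total)                  # standard deviation of observed total scores
sem <- S * sqrt(1 - r_xx)
round(sem, 3)  # 1.134
```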
Applications
- A study on Kolb’s revised Learning Style Inventory with 221 business students supports the internal reliability of its scales and discusses factor analysis with ipsative data.
- Research comparing multi-item scales to single-item questions demonstrates that single items yield unreliable results and should not be used for construct inference.
Software
SOCR Cronbach's alpha calculator webapp (coming soon).
In R, the `psy` package provides a `cronbach()` function, and the `psych` package offers an alternative `alpha()` function. The example below uses `psy` and its `expsy` dataset:
# Load required packages
library(psy)
data(expsy)
# Compute alpha for first 10 items
cronbach(expsy[, 1:10])
# Result: alpha ≈ 0.176 — low due to reversed item (item 2)
# Reverse item 2 (assuming 1=high, 4=low)
revised_data <- cbind(expsy[, c(1, 3:10)], -1 * expsy[, 2])
cronbach(revised_data)
# Result: alpha ≈ 0.375
# Bootstrap 95% CI
library(boot)
cronbach.boot <- function(data, indices) {
  cronbach(data[indices, ])[[3]] # extract alpha
}
res <- boot(revised_data, cronbach.boot, R = 1000)
quantile(res$t, c(0.025, 0.975))
# e.g., [-0.30, 0.63]
boot.ci(res, type = "bca")
# BCa CI: (-0.15, 0.67)
The coefficientalpha R package offers robust estimation with missing data and non-normality, including standard errors and confidence intervals.
Cronbach's \(\alpha\) Calculations
The table below illustrates the core structure for computing Cronbach’s alpha.
| Subjects | Item \(Q_1\) | Item \(Q_2\) | … | Item \(Q_k\) | Total Score |
|---|---|---|---|---|---|
| \(S_1\) | \(Y_{1,1}\) | \(Y_{1,2}\) | … | \(Y_{1,k}\) | \(X_1 = \sum_{j=1}^k Y_{1,j}\) |
| \(S_2\) | \(Y_{2,1}\) | \(Y_{2,2}\) | … | \(Y_{2,k}\) | \(X_2 = \sum_{j=1}^k Y_{2,j}\) |
| … | … | … | … | … | … |
| \(S_n\) | \(Y_{n,1}\) | \(Y_{n,2}\) | … | \(Y_{n,k}\) | \(X_n = \sum_{j=1}^k Y_{n,j}\) |
| Variance | \(\sigma_{Y_{.,1}}^2\) | \(\sigma_{Y_{.,2}}^2\) | … | \(\sigma_{Y_{.,k}}^2\) | \(\sigma_X^2\) |
Cronbach's \(\alpha\) Inference
Cronbach’s alpha is a point estimate. Its standard error enables interval estimation and hypothesis testing.
- Parametric CIs use the Pearson correlation matrix (assumes normality).
- Non-parametric CIs use Spearman correlations (robust to non-normality).
- For ordinal data, consider ordinal alpha (Zumbo et al.), which uses polychoric correlations.
The general formula for alpha, for a scale with \(K\) items, is: \[ \alpha = \frac{K}{K - 1} \left( 1 - \frac{\sum_{j=1}^K V(Y_j)}{V\left( \sum_{j=1}^K Y_j \right)} \right). \]
The estimated variance of \(\hat{\alpha}\), based on a sample of \(n\) subjects, is: \[ \hat{\sigma}^2_{\hat{\alpha}} = \frac{K^2}{n(K - 1)^2} \cdot d, \] where \(d\) depends on the sample covariance matrix \(S\) of the items, its trace, and the vector of ones \(j\).
A \((1 - \gamma)100\%\) confidence interval is: \[ \left( \hat{\alpha} - z_{\gamma/2} \hat{\sigma}_{\hat{\alpha}},\ \hat{\alpha} + z_{\gamma/2} \hat{\sigma}_{\hat{\alpha}} \right). \]
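The page does not spell out the variance term, so the sketch below assumes one commonly cited normal-theory expression (attributed to van Zyl, Neudecker and Nel): with \(S\) the item covariance matrix and \(j\) the vector of ones, \(d = 2\left[(j'Sj)(\operatorname{tr} S^2 + \operatorname{tr}^2 S) - 2 \operatorname{tr} S \,(j'S^2 j)\right] / (j'Sj)^3\). The function name `alpha_ci` and its arguments are hypothetical, and the data are the six binary items from the split-half example, which do not satisfy the normality assumption and serve only to show the mechanics.

```r
# Asymptotic (normal-theory) standard error and confidence interval for alpha.
# Assumption: d follows the van Zyl, Neudecker and Nel expression quoted above.
alpha_ci <- function(items, level = 0.95) {
  n <- nrow(items)                 # number of subjects
  K <- ncol(items)                 # number of items
  S <- cov(items)                  # item covariance matrix
  j <- rep(1, K)                   # vector of ones

  jSj  <- drop(t(j) %*% S %*% j)           # variance of the total score
  jS2j <- drop(t(j) %*% S %*% S %*% j)
  trS  <- sum(diag(S))
  trS2 <- sum(diag(S %*% S))

  alpha <- (K / (K - 1)) * (1 - trS / jSj)
  d  <- 2 * (jSj * (trS2 + trS^2) - 2 * trS * jS2j) / jSj^3
  se <- sqrt(K^2 / (n * (K - 1)^2) * d)

  z <- qnorm(1 - (1 - level) / 2)
  c(alpha = alpha, se = se, lower = alpha - z * se, upper = alpha + z * se)
}

items <- matrix(c(
  1, 0, 0, 1, 1, 0,
  1, 1, 0, 1, 0, 1,
  1, 1, 1, 1, 1, 0,
  1, 0, 0, 0, 1, 0,
  1, 1, 1, 1, 0, 0,
  0, 0, 0, 0, 1, 0
), nrow = 6, byrow = TRUE)

res <- alpha_ci(items)
round(res, 3)
```

With six subjects the interval is very wide and extends below zero, which is a reminder that the normal approximation is meant for much larger samples; a bootstrap interval is often preferred in small or non-normal data.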
Problems
Use the Turkiye Student Course Evaluation survey (N=5,000) to compute the ICC and Cronbach’s alpha.
References
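- Cronbach's alpha – Wikipedia: https://en.wikipedia.org/wiki/Cronbach%27s_alpha
- KR-20 – Wikipedia: https://en.wikipedia.org/wiki/Kuder%E2%80%93Richardson_Formula_20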
- Tsagris, M. (2014). Confidence intervals for Cronbach’s reliability coefficient. ResearchGate.
- Zumbo, B. D., Gadermann, A. M., & Zeisser, C. (2007). Ordinal versions of coefficients alpha and theta. Practical Assessment, Research & Evaluation, 12(13).
SOCR Home page: http://www.socr.umich.edu