Difference between revisions of "SMHS Cronbachs"

From SOCR
(Problems)
m (Software)
 
(4 intermediate revisions by the same user not shown)
Line 1: Line 1:
==[[SMHS| Scientific Methods for Health Sciences]] - Instrument Performance Evaluation: Cronbach's α ==
+
== [[SMHS|Scientific Methods for Health Sciences]] Instrument Performance Evaluation: Cronbach's α ==
  
===Overview:===
+
=== Overview ===
Cronbach’s alpha $\alpha$ is a coefficient of internal consistency and is commonly used as an estimate of the reliability of a psychometric test. Internal consistency is typically a measure based on the correlations between different items on the same test and measures whether several items that propose to measure the same general construct and produce similar scores. Cronbach’s alpha is widely used in the social science, nursing, business and other disciplines. Here we present a general introduction to Cronbach’s alpha, how is it calculated, how to apply it in research and what are some common problems when using Cronbach’s alpha.
+
Cronbach’s alpha (<math>\alpha</math>) is a coefficient of internal consistency and is commonly used as an estimate of the reliability of a psychometric test. Internal consistency is typically based on the correlations between different items on the same test and assesses whether several items that purport to measure the same general construct produce similar scores. Cronbach’s alpha is widely used in the social sciences, nursing, business, and other disciplines. This section presents a general introduction to Cronbach’s alpha, including its calculation, application in research, and common issues in its use.
  
===Motivation:===
+
=== Motivation ===
We have discussed about internal and external consistency and their importance in researches and studies. How do we measure internal consistency? For example, suppose we are interested in measuring the extent of handicap of patients suffering from certain disease. The dataset contains 10records measuring the degree of difficulty experienced in carrying out daily activities. Each item is recorded from 1 (no difficulty) to 4 (can’t do). When those data is used to form a scale they need to have internal consistency. All items should measure the same thing, so they could be correlated with one another. Cronbach’s alpha generally increases when correlations between items increase.
+
We have discussed internal and external consistency and their importance in research. How do we measure internal consistency? For example, suppose we are interested in measuring the extent of handicap among patients suffering from a certain disease. The dataset contains 10 records measuring the degree of difficulty experienced in carrying out daily activities. Each item is scored from 1 (no difficulty) to 4 (can’t do). When these data are used to form a scale, they should exhibit internal consistency—all items should measure the same underlying construct and thus be correlated with one another. Cronbach’s alpha generally increases as the correlations between items increase.
  
 +
=== Theory ===
  
===Theory===
+
==== Cronbach’s Alpha ====
====Cronbach’s Alpha====
+
Cronbach’s alpha is a measure of internal consistency or reliability of a psychometric instrument and assesses how well a set of items measures a single, one-dimensional latent trait.
Cronbach’s Alpha is a measure of internal consistency or reliability of a psychometric instrument and measures how well a set of items measure a single, one-dimensional latent aspect of individuals.  
 
  
*Suppose we measure a quantity X, which is a sum of K components: $X=Y_{1}+ Y_{2}++Y_{k}$, then Cronbach’s alpha is defined as $\alpha =\frac{K}{K-1}$  $\left( 1-\frac{\sum_{i=1}^{K}\sigma^{2}_{{Y}_{i}}} {\sigma_{X}^{2}}\right)$, where $\sigma_{X}^{2}$ is the variance of the observed total test scores, and $ \sigma^{2}_{{Y}_{i}} $ is the variance of component $i$ for the current sample.  
+
Suppose we measure a quantity <math>X</math>, which is the sum of <math>K</math> components:
 +
<math>X = Y_1 + Y_2 + \cdots + Y_K</math>. 
 +
Then Cronbach’s alpha is defined as
 +
<math>
 +
\alpha = \frac{K}{K - 1} \left( 1 - \frac{\sum_{i=1}^{K} \sigma_{Y_i}^2}{\sigma_X^2} \right),
 +
</math> 
 +
where <math>\sigma_X^2</math> is the variance of the observed total test scores and <math>\sigma_{Y_i}^2</math> is the variance of component <math>i</math> in the current sample.
  
: If items are scored from 0 to 1, then $\alpha =\frac{K}{K-1}$ $\left( 1-\frac{\sum_{i=1}^{K}P_{i}Q_{i}} {\sigma_{X}^{2}} \right)$, where $P_{i}$ is the proportion scoring 1 on item $i$ and $Q_{i}=1-P_{i}$. Alternatively, Cronbach’s alpha can be defined as $\alpha$=$\frac{K\bar c}{(\bar v +(K-1) \bar c )}$,where K is as above, $\bar v$ is the average variance of each component and $\bar c$ is the average of all covariance between the components across the current sample of persons.
+
If items are scored dichotomously (0 or 1), then
 +
<math>
 +
\alpha = \frac{K}{K - 1} \left( 1 - \frac{\sum_{i=1}^{K} P_i Q_i}{\sigma_X^2} \right),
 +
</math> 
 +
where <math>P_i</math> is the proportion scoring 1 on item <math>i</math>, and <math>Q_i = 1 - P_i</math>.
  
*The standardized Cronbach’s alpha can be defined as $\alpha_{standardized}=\frac{K\bar r} {(1+(K-1)\bar r )}$, $\bar r$ is the mean of $\frac {K(K-1)}{2}$ non redundant correlation coefficients (i.e., the mean of an upper triangular, or lower triangular, correlation matrix).
+
Alternatively, Cronbach’s alpha can be expressed as
 +
<math>
 +
\alpha = \frac{K \bar{c}}{\bar{v} + (K - 1) \bar{c}},
 +
</math> 
 +
where <math>\bar{v}</math> is the average variance of each component and <math>\bar{c}</math> is the average covariance between all item pairs.
  
*The theoretical value of alpha varies from 0 to 1 considering it is ratio of two variance. $\rho_{XX}=\frac{\sigma_{T}^{2}} {\sigma_{X}^{2}}$, reliability of test scores is the ratio of the true score and total score variance.
+
The standardized Cronbach’s alpha is
 +
<math>
 +
\alpha_{\text{standardized}} = \frac{K \bar{r}}{1 + (K - 1) \bar{r}},
 +
</math> 
 +
where <math>\bar{r}</math> is the mean of the <math>K(K - 1)/2</math> non-redundant correlation coefficients (e.g., from the upper triangle of the correlation matrix).
  
====Internal consistency====
+
The theoretical value of alpha ranges from 0 to 1, as it is a ratio of variances. Reliability of test scores is defined as: 
Internal consistency is a measure of whether several items that proposed to measure the same general construct produce similar score. It is usually measured with Cronbach’s alpha, which is calculated from the pairwise correlation between items. Internal consistency can take values from negative infinity to 1. It is negative when there is greater within subject variability than between-subject variability. Only positive values of Cronbach’s alpha make sense. Cronbach’s alpha will generally increases as the inter-correlations among items tested increase.  
+
<math>
 +
\rho_{XX} = \frac{\sigma_T^2}{\sigma_X^2},
 +
</math> 
 +
the ratio of true-score variance to total-score variance.
 +
 
 +
==== Internal Consistency ====
 +
Internal consistency measures whether several items hypothesized to reflect the same construct yield similar scores. It is usually quantified using Cronbach’s alpha, which is derived from pairwise item correlations. Internal consistency can theoretically range from negative infinity to 1. Negative values occur when within-subject variability exceeds between-subject variability. Only positive values of Cronbach’s alpha are interpretable.
 +
 
 +
Cronbach’s alpha increases as inter-item correlations increase.
  
 
<center>
 
<center>
 
{| class="wikitable" style="text-align:center; width:35%" border="1"
 
{| class="wikitable" style="text-align:center; width:35%" border="1"
 
|-
 
|-
|Cronbach's alpha|| Internal consistency
+
| Cronbach's alpha || Internal consistency
|-
 
| $\alpha$  ≥ 0.9|| Excellent (High-Stakes testing)
 
 
|-
 
|-
|0.7 ≤ $\alpha$ < 0.9|| Good (Low-Stakes testing)
+
| <math>\alpha \geq 0.9</math> || Excellent (High-stakes testing)
 
|-
 
|-
|0.6 ≤ $\alpha$ < 0.7|| Acceptable
+
| <math>0.7 \leq \alpha < 0.9</math> || Good (Low-stakes testing)
 
|-
 
|-
|0.5 ≤ $\alpha$ < 0.6|| Poor
+
| <math>0.6 \leq \alpha < 0.7</math> || Acceptable
 
|-
 
|-
|$\alpha$ < 0.5 ||Unacceptable
+
| <math>0.5 \leq \alpha < 0.6</math> || Poor
 
|-
 
|-
 +
| <math>\alpha < 0.5</math> || Unacceptable
 
|}
 
|}
 
</center>
 
</center>
  
====Other Measures====
+
==== Other Measures ====
* '''Intra-class correlation:''' The Intra-class correlation coefficient (ICC) assesses the consistency, or reproducibility, of quantitative measurements made by different observers measuring the same quantity. Broadly speaking, the ICC is defined as the ratio of between-cluster variance to total variance:
+
 
$$ICC = \frac{Variance due to rated subjects (patients)}{(Variance due to subjects) + (Variance due to Judges) + (Residual Variance)}.$$
+
Intra-class correlation (ICC) assesses the consistency or reproducibility of quantitative measurements made by different observers measuring the same quantity. Broadly, ICC is defined as:
 +
<math>
 +
\text{ICC} = \frac{\text{Variance due to rated subjects (patients)}}{\text{Variance due to subjects} + \text{Variance due to judges} + \text{Residual variance}}.
 +
</math>
  
* Example: Suppose 4 nurses rate 6 patients on a 10 point depression scale:
+
Example: Suppose 4 nurses rate 6 patients on a 10-point depression scale:
  
 
<center>
 
<center>
 
{| class="wikitable" style="text-align:center; width:75%" border="1"
 
{| class="wikitable" style="text-align:center; width:75%" border="1"
 
|-
 
|-
!PatientID||NurseRater1||NurseRater2||NurseRater3||NurseRater4
+
! PatientID || NurseRater1 || NurseRater2 || NurseRater3 || NurseRater4
 
|-
 
|-
|1||9||2||5||8
+
| 1 || 9 || 2 || 5 || 8
 
|-
 
|-
|2||6||1||3||2
+
| 2 || 6 || 1 || 3 || 2
 
|-
 
|-
|3||8||4||6||8
+
| 3 || 8 || 4 || 6 || 8
 
|-
 
|-
|4||7||1||2||6
+
| 4 || 7 || 1 || 2 || 6
 
|-
 
|-
|5||10||5||6||9
+
| 5 || 10 || 5 || 6 || 9
 
|-
 
|-
|6||6||2||4||7
+
| 6 || 6 || 2 || 4 || 7
|}</center>
+
|}
 +
</center>
 +
 
 +
This data can also be formatted in long form:
  
This data can also be presented as a frame:
 
 
<center>
 
<center>
 
{| class="wikitable" style="text-align:center; width:75%" border="1"
 
{| class="wikitable" style="text-align:center; width:75%" border="1"
 
|-
 
|-
!PatientID||Rating||Nurse
+
! PatientID || Rating || Nurse
 
|-
 
|-
|1||9||1
+
| 1 || 9 || 1
 
|-
 
|-
|2||6||1
+
| 2 || 6 || 1
 
|-
 
|-
|3||8||1
+
| 3 || 8 || 1
 
|-
 
|-
|4||7||1
+
| 4 || 7 || 1
 
|-
 
|-
|5||10||1
+
| 5 || 10 || 1
 
|-
 
|-
|6||6||1
+
| 6 || 6 || 1
 
|-
 
|-
|7||2||2
+
| 7 || 2 || 2
 
|-
 
|-
|8||1||2
+
| 8 || 1 || 2
 
|-
 
|-
|9||4||2
+
| 9 || 4 || 2
 
|-
 
|-
|10||1||2
+
| 10 || 1 || 2
 
|-
 
|-
|11||5||2
+
| 11 || 5 || 2
 
|-
 
|-
|12||2||2
+
| 12 || 2 || 2
 
|-
 
|-
|13||5||3
+
| 13 || 5 || 3
 
|-
 
|-
|14||3||3
+
| 14 || 3 || 3
 
|-
 
|-
|15||6||3
+
| 15 || 6 || 3
 
|-
 
|-
|16||2||3
+
| 16 || 2 || 3
 
|-
 
|-
|17||6||3
+
| 17 || 6 || 3
 
|-
 
|-
|18||4||3
+
| 18 || 4 || 3
 
|-
 
|-
|19||8||4
+
| 19 || 8 || 4
 
|-
 
|-
|20||2||4
+
| 20 || 2 || 4
 
|-
 
|-
|21||8||4
+
| 21 || 8 || 4
 
|-
 
|-
|22||6||4
+
| 22 || 6 || 4
 
|-
 
|-
|23||9||4
+
| 23 || 9 || 4
 
|-
 
|-
|24||7||4
+
| 24 || 7 || 4
 
|}
 
|}
 
</center>
 
</center>
  
install.packages("ICC")
+
<pre>
library("ICC")
+
install.packages("ICC")
# save the data in the table above in a local file ""
+
library("ICC")
dataset <- read.csv('C:\\Users\\Desktop\\Nurse_data.csv', header = TRUE)
+
 
# remove the first columns (Patient ID number)
+
# Load data (adjust path as needed)
dataset <- dataset[,-1]
+
dataset <- read.csv('C:\\Users\\Desktop\\Nurse_data.csv', header = TRUE)
attach(dataset)
+
dataset <- dataset[, -1]  # Remove PatientID column
  dataset
 
 
Nest("p", w=0.14, x=Rating, y=Nurse, data=dataset)
 
icc <-ICCest(Rating, Nurse, dataset)
 
icc$UpperCI - icc$LowerCI  # confidence interval width
 
icc
 
  
ICC: -0.4804401
+
# Fit ICC model
95% CI(ICC): (-0.6560437 : -0.03456346)
+
icc_result <- ICCest(Rating, Nurse, data = dataset)
 +
icc_result
 +
# ICC: -0.4804401
 +
# 95% CI: (-0.656, -0.035)
 +
</pre>
  
Cronbach’s alpha equals to the stepped-up intra-class correlation coefficient, which is commonly used in observational studies if and only if the value of the item variance component equals zero. If this variance component is negative, then alpha will underestimate the stepped-up intra-class correlation coefficient; if it’s positive, alpha will overestimate the stepped-up intra-class correlation.
+
Cronbach’s alpha equals the stepped-up intra-class correlation coefficient, which is commonly used in observational studies, if and only if the item variance component is zero. If this component is negative, alpha underestimates the stepped-up intra-class correlation coefficient; if it is positive, alpha overestimates it.
  
====Generalizability theory====
+
==== Generalizability Theory ====
Cronbach’s alpha is an unbiased estimate of the generalizability. It can be viewed as a measure of how well the sum score on the selected items capture the expected score in the entire domain, even if that domain is heterogeneous.  
+
Cronbach’s alpha is an unbiased estimate of generalizability. It can be interpreted as a measure of how well the sum score on selected items captures the expected score over the entire domain—even if the domain is heterogeneous.
  
====Problems with Cronbach’s alpha====
+
==== Problems with Cronbach’s Alpha ====
# it is dependent not only on the magnitude of the correlations among items, but also on the number of items in the scale. Hence, a scale can be made to look more homogenous simply by increasing the number of items though the average correlation remains the same;
+
# Alpha depends not only on the magnitude of the inter-item correlations but also on the number of items. A scale can appear more homogeneous simply by adding more items, even if the average correlation remains unchanged.
# if two scales each measuring a distinct aspect are combined to form a long scale, alpha would probably be high though the merged scale is obviously tapping two different attributes;
+
# Combining two scales that measure distinct constructs into one long scale may yield a high alpha even though the merged scale measures two different attributes.
# if alpha is too high, then it may suggest a high level of item redundancy.
+
# Excessively high alpha (<math>> 0.95</math>) may indicate item redundancy.
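
The dependence of alpha on scale length can be illustrated with the standardized form of alpha, which depends only on the number of items and the mean inter-item correlation. The following is a minimal sketch in base R; the function name <code>alpha_standardized</code>, the scale lengths, and the mean correlation of 0.3 are hypothetical values chosen for illustration.

```r
# Standardized alpha as a function of the number of items K and the
# mean inter-item correlation r_bar.
alpha_standardized <- function(K, r_bar) {
  K * r_bar / (1 + (K - 1) * r_bar)
}

K_values <- c(5, 10, 20)  # hypothetical scale lengths
# Hold the mean inter-item correlation fixed at a hypothetical 0.3
alpha_by_length <- sapply(K_values, alpha_standardized, r_bar = 0.3)

print(round(alpha_by_length, 3))  # 0.682 0.811 0.896
```

With the average correlation held fixed, alpha rises from about 0.68 to about 0.90 purely because the scale gets longer.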
  
====Split-Half Reliability====
+
==== Split-Half Reliability ====
In Split-Half Reliability assessment, the test is split in half (e.g., odd / even) creating “equivalent forms”. The two “forms” are correlated with each other and the correlation coefficient is adjusted to reflect the entire test length, using the Spearman-Brown Prophecy formula. Suppose the $Corr(Even,Odd)=r$ is the raw correlation between the even and odd items. Then the adjusted correlation will be:$r’ = \frac{n r}{(n-1)\, (r+1)},$ where n = number of items (in this case n=2).
+
In split-half reliability, the test is divided into two halves (e.g., odd vs. even items). The correlation between halves is adjusted using the Spearman–Brown prophecy formula
 +
<math>
 +
r' = \frac{n r}{1 + (n - 1) r},
 +
</math> 
 +
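where <math>r</math> is the raw correlation between the two halves and <math>n</math> is the factor by which the test is lengthened (here <math>n = 2</math>, so <math>r' = 2r/(1 + r)</math>).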
  
 
Example:
 
Example:
Line 156: Line 188:
 
{| class="wikitable" style="text-align:center; width:35%" border="1"
 
{| class="wikitable" style="text-align:center; width:35%" border="1"
 
|-
 
|-
|Index|| Q1|| Q2|| Q3|| Q4|| Q5|| Q6|| Odd|| Even
+
| Index || Q1 || Q2 || Q3 || Q4 || Q5 || Q6 || Odd || Even
 +
|-
 +
| 1 || 1 || 0 || 0 || 1 || 1 || 0 || 2 || 1
 +
|-
 +
| 2 || 1 || 1 || 0 || 1 || 0 || 1 || 1 || 3
 
|-
 
|-
|1 ||1|| 0|| 0|| 1|| 1|| 0|| 2|| 1
+
| 3 || 1 || 1 || 1 || 1 || 1 || 0 || 3 || 2
 
|-
 
|-
|2|| 1|| 1 ||0 ||1|| 0 ||1|| 1|| 3
+
| 4 || 1 || 0 || 0 || 0 || 1 || 0 || 2 || 0
 
|-
 
|-
|3|| 1|| 1|| 1|| 1|| 1|| 0|| 3|| 2
+
| 5 || 1 || 1 || 1 || 1 || 0 || 0 || 2 || 2
 
|-
 
|-
|4 ||1 ||0 ||0 ||0 ||1 ||0|| 2|| 0
+
| 6 || 0 || 0 || 0 || 0 || 1 || 0 || 1 || 0
 
|-
 
|-
|5|| 1|| 1|| 1|| 1|| 0|| 0|| 2|| 2
+
| colspan=6 rowspan=4 | || mean || 1.833 || 1.333
 
|-
 
|-
|6 ||0|| 0 ||0 ||0 ||1 ||0 ||1|| 0
+
| SD || 0.753 || 1.211
 
|-
 
|-
| colspan=6 rowspan=4| ||mean|| 1.833333333|| 1.33333333
+
| corr(Even, Odd) || 0.073 ||
 
|-
 
|-
| SD|| 0.752772653|| 1.21106014
+
| AdjCorr = <math>\frac{n r}{1 + (n-1) r}</math> || 0.136 ||
|-
 
| corr(Even,Odd)|| 0.073127242 || rowspan=2|
 
|-
 
| AdjCorr(Even,Odd)=$\frac{nr}{(n-1)(r+1)}$|| 0.136288111
 
|-
 
 
|}
 
|}
 
</center>
 
</center>
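
The split-half calculation in the table above can be reproduced with a few lines of base R. This is a sketch under the assumption that the odd and even half-scores are the sums of items Q1, Q3, Q5 and Q2, Q4, Q6, respectively; the helper name <code>spearman_brown</code> is introduced here for illustration only.

```r
# Item responses from the table above (6 subjects, 6 dichotomous items)
items <- matrix(c(1, 0, 0, 1, 1, 0,
                  1, 1, 0, 1, 0, 1,
                  1, 1, 1, 1, 1, 0,
                  1, 0, 0, 0, 1, 0,
                  1, 1, 1, 1, 0, 0,
                  0, 0, 0, 0, 1, 0),
                ncol = 6, byrow = TRUE,
                dimnames = list(NULL, paste0("Q", 1:6)))

odd  <- rowSums(items[, c(1, 3, 5)])  # half-score on odd-numbered items
even <- rowSums(items[, c(2, 4, 6)])  # half-score on even-numbered items

r <- cor(even, odd)                   # raw correlation between the halves

# Spearman-Brown adjustment; n is the lengthening factor (2 for split halves)
spearman_brown <- function(r, n = 2) n * r / (1 + (n - 1) * r)
r_adjusted <- spearman_brown(r)

print(round(c(r = r, adjusted = r_adjusted), 3))  # 0.073 0.136
```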
  
====KR-20====
+
==== KR-20 ====
The [http://en.wikipedia.org/wiki/Kuder%E2%80%93Richardson_Formula_20 Kuder–Richardson Formula 20 (KR-20)] is a very reliable internal reliability estimate which simulates calculating split-half reliability for every possible combination of items. For a test with ''K'' test items indexed ''i''=1 to ''K'':
+
The Kuder–Richardson Formula 20 (KR-20) is a reliability estimate for dichotomous items:
$$KR-20 = \frac{K}{K-1} \left( 1 - \frac{\sum_{i=1}^K p_i q_i}{\sigma^2_X} \right),$$
 
where $p_i$ is the proportion of ''correct'' responses to test item ''i'', $q_i$ is the proportion of ''incorrect'' responses to test item ''i'' (thus $p_i + q_i= 1$), the variance for the denominator is
 
$\sigma^2_X = \frac{\sum_{i=1}^n (X_i-\bar{X})^2\,{}}{n-1},$ and where $n$ is the total sample size.
 
 
 
The Cronbach's α and KR-20 are similar -- KR-20 is a derivative of the Cronbach's α with the advantage that it can handle both dichotomous and continuous variables, however, KR-20 can't be used when multiple-choice questions involve partial credit and require systematic item-based analysis.
 
 
 
====Standard Error of Measurement (SEM)====
 
The greater the reliability of the test, the smaller the SEM.
 
 
 
$$SEM=S\sqrt{1-r_{xx}},$$
 
where $r_{xx’}$ is the correlation between two instances of the measurements under identical conditions, and $S$ is the total standard deviation.
 
  
===Applications===
+
<math>
 +
\text{KR-20} = \frac{K}{K - 1} \left( 1 - \frac{\sum_{i=1}^K p_i q_i}{\sigma_X^2} \right),
 +
</math> 
 +
where <math>p_i</math> is the proportion of correct responses to item <math>i</math>, <math>q_i = 1 - p_i</math>, and <math>\sigma_X^2</math> is the sample variance of total scores.
  
* [http://link.springer.com/article/10.1007/s10869-005-8262-4 This article] explores the internal validity and reliability of Kolb’s revised learning style inventory in a sample with 221 graduate and undergraduate business students. It also reviewed research on the LSI and studied on implications of conducting factor analysis using ipsative data (type of data where respondents compare two or more desirable options and pick the one that is most preferred (sometimes called a "forced choice" scale). Experiential learning theory is presented and the concept of learning styles explained. This paper largely supports prior research supporting the internal reliability of scales.
+
KR-20 is a special case of Cronbach’s alpha for binary data. It cannot accommodate partial credit or ordinal responses.
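
The relationship between KR-20 and Cronbach's alpha can be checked numerically. The sketch below reuses the six dichotomous items from the split-half example as hypothetical test data. One detail is an assumption worth flagging: <math>p_i q_i</math> is the item variance computed with divisor <math>n</math>, so KR-20 coincides exactly with alpha only when the total-score variance uses the same divisor. Pairing it with the <math>n-1</math> sample variance gives a slightly different value.

```r
# Hypothetical dichotomous responses (6 subjects, 6 items)
items <- matrix(c(1, 0, 0, 1, 1, 0,
                  1, 1, 0, 1, 0, 1,
                  1, 1, 1, 1, 1, 0,
                  1, 0, 0, 0, 1, 0,
                  1, 1, 1, 1, 0, 0,
                  0, 0, 0, 0, 1, 0),
                ncol = 6, byrow = TRUE)

K <- ncol(items)             # number of items
n <- nrow(items)             # number of subjects
p <- colMeans(items)         # proportion scoring 1 on each item
q <- 1 - p
total <- rowSums(items)      # total score per subject

var_sample <- var(total)                 # divisor n - 1
var_pop    <- var_sample * (n - 1) / n   # divisor n

kr20_sample <- (K / (K - 1)) * (1 - sum(p * q) / var_sample)
kr20_pop    <- (K / (K - 1)) * (1 - sum(p * q) / var_pop)

# Cronbach's alpha from item and total variances (same divisor in both)
alpha_hat <- (K / (K - 1)) * (1 - sum(apply(items, 2, var)) / var_sample)

print(c(kr20_sample = kr20_sample, kr20_pop = kr20_pop, alpha = alpha_hat))
```

Here <code>kr20_pop</code> matches <code>alpha_hat</code>, while <code>kr20_sample</code> is somewhat larger; the gap shrinks as the sample size grows.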
  
* [https://scholarworks.iupui.edu/bitstream/handle/1805/344/Gliem%20&%20Gliem.pdf?s This article] showed the reason a single-item questions pertaining to a construct are not reliable and should not be used in drawing conclusions. It compared the reliability of a summated, multi-item scale versus a single-item question and showed how unreliable a single item is and therefore not appropriate to make inferences based on analysis of single item question, which are used in measuring a construct.
+
==== Standard Error of Measurement (SEM) ====
 +
The greater the reliability, the smaller the SEM:
  
===Software===
+
<math>
 +
\text{SEM} = S \sqrt{1 - r_{xx}},
 +
</math> 
 +
where <math>r_{xx}</math> is the reliability and <math>S</math> is the standard deviation of observed scores.
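
As a brief numerical illustration of the SEM formula, the sketch below evaluates it for a hypothetical observed-score standard deviation of 15 and three hypothetical reliabilities. The function name <code>sem</code> and all input values are assumptions for illustration.

```r
# Standard error of measurement: S * sqrt(1 - r_xx)
sem <- function(S, r_xx) S * sqrt(1 - r_xx)

S_obs <- 15                          # hypothetical SD of observed scores
reliabilities <- c(0.70, 0.80, 0.91) # hypothetical reliability values

sem_values <- sem(S_obs, reliabilities)
print(round(sem_values, 2))  # 8.22 6.71 4.50
```

The SEM shrinks as the reliability increases, as stated above.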
  
* SOCR Cronbach's alpha calculator webapp (coming up) ...
+
=== Applications ===
 +
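* [http://link.springer.com/article/10.1007/s10869-005-8262-4 A study] of Kolb’s revised Learning Style Inventory in a sample of 221 graduate and undergraduate business students largely supports prior research on the internal reliability of its scales and discusses the implications of factor analysis with ipsative (forced-choice) data.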
 +
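* [https://scholarworks.iupui.edu/bitstream/handle/1805/344/Gliem%20&%20Gliem.pdf?s A comparison] of summated multi-item scales with single-item questions shows that single items are unreliable measures of a construct and should not be used to draw inferences about it.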
  
* '''In R:''' using [http://cran.r-project.org/web/packages/psy/psy.pdf the ''psy'' package] and the psychometry dataset (expsy), which is a [http://www.r-tutor.com/r-introduction/data-frame frame] with 30 rows and 16 columns with missing data, where it1-it10 correspond to the rating of 30 patients with a 10 items scale, r1, r2, r3 to the rating of item 1 by 3 different clinicians of the same 30 patients, rb1, rb2, rb3 to the binary transformation of r1, r2, r3 (1 or 2 -> 0; and 3 or 4 -> 1).
+
=== Software ===
 
cronbach(v1)  ## v1 is n*p matrix or data frame with n subjects and p items.
 
## This phrase is used to compute the Cronbach’s reliability coefficient alpha.
 
## This coefficient may be applied to a series of items aggregated in a single score.
 
## It estimates reliability in the framework of the domain sampling model.
 
  
An example to calculate Cronbach’s alpha:
+
SOCR Cronbach's alpha calculator webapp (coming soon).
library(psy)
 
data(expsy)   
 
cronbach(expsy[,1:10]) 
 
## this choose the vector of the columns 1 to 10 and calculated the  Cronbach’s Alpha value
 
  
$sample.size
+
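In R, [http://cran.r-project.org/web/packages/psy/psy.pdf the ''psy'' package] provides the <code>cronbach()</code> function. The example below uses its psychometry dataset <code>expsy</code>, a data frame with 30 rows and 16 columns (with missing data), in which columns it1 to it10 hold the ratings of 30 patients on a 10-item scale: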
[1] 27
 
$number.of.items
 
[1] 10
 
$alpha
 
[1] 0.1762655
 
## not good because item 2 is reversed (1 is high and 4 is low)   
 
  
cronbach(cbind(expsy[,c(1,3:10)],-1*expsy[,2]))
+
<pre>
## this choose columns 1 and columns 3 to 10 and added in the reversed column 2,
+
# Load required packages
## and then calculated the Cronbach’s Alpha value for the revised data
+
library(psy)
 +
data(expsy)
  
$sample.size
+
# Compute alpha for first 10 items
[1] 27
+
cronbach(expsy[, 1:10])
$number.of.items
+
# Result: alpha 0.176 — low due to reversed item (item 2)
[1] 10
 
$alpha
 
[1] 0.3752657
 
  
## better to obtain a 95%confidence interval:   
+
# Reverse item 2 (assuming 1=high, 4=low)
datafile <- cbind(expsy[,c(1,3:10)],-1*expsy[,2])
+
revised_data <- cbind(expsy[, c(1, 3:10)], -1 * expsy[, 2])
## extract the revised data into a new dataset named ‘datafile’
+
cronbach(revised_data)
library(boot)
+
# Result: alpha ≈ 0.375
cronbach.boot <- function(data,x) {cronbach(data[x,])[[3]]}
 
res <- boot(datafile,cronbach.boot,1000) 
 
res
 
  
  Call:
+
# Bootstrap 95% CI
boot(data = datafile, statistic = cronbach.boot, R = 1000)
+
library(boot)
Bootstrap Statistics :
+
cronbach.boot <- function(data, indices) {
    original      bias    std. error
+
  cronbach(data[indices, ])[[3]] # extract alpha
t1* 0.3752657 -0.06104997  0.2372292
+
}
 +
res <- boot(revised_data, cronbach.boot, R = 1000)
 +
quantile(res$t, c(0.025, 0.975))
 +
# e.g., [-0.30, 0.63]
  
quantile(res$t, c(0.025, 0.975))  ## the 2.5% and 97.5% quantiles form the 95% percentile confidence interval of Cronbach’s alpha
+
boot.ci(res, type = "bca")
      2.5%      97.5%
+
# BCa CI: (-0.15, 0.67)
-0.2987214  0.6330491
+
</pre>
## two-sided bootstrapped confidence interval of Cronbach’s alpha boot.ci(res,type="bca")  
 
## adjusted bootstrap percentile (BCa) confidence interval (better)  
 
  
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
+
The '''coefficientalpha''' R package offers robust estimation with missing data and non-normality, including standard errors and confidence intervals.
Based on 1000 bootstrap replicates
 
 
CALL :
 
boot.ci(boot.out = res, type = "bca")
 
 
Intervals :
 
Level      BCa         
 
95%  (-0.1514, 0.6668 ) 
 
Calculations and Intervals on Original Scale
 
  
The [http://cran.r-project.org/web/packages/coefficientalpha/coefficientalpha.pdf CoefficientAlpha R package] provides an alternative methods for computing Cronbach's alpha coefficient in the presence of missing data and for non-normal data. It also reports robust standard error and confidence interval estimates for alpha.
+
=== Cronbach's <math>\alpha</math> Calculations ===
  
===Cronbach's $\alpha$ calculations===
+
The table below illustrates the core structure for computing Cronbach’s alpha.
The table below illustrates the setting and core calculations involved in computing the Cronbach's $\alpha$.
 
  
 
<center>
 
<center>
 
{| align="center" border="1"
 
{| align="center" border="1"
 
|-
 
|-
| rowspan="2"| Subjects || colspan="4" align="center"| Items/Questions Part of the Assessment Instrument|| rowspan="2" | Total Score per Subject
+
| rowspan="2" | Subjects || colspan="4" align="center" | Items/Questions || rowspan="2" | Total Score
 
|-
 
|-
| $Q_1$ ||$Q_2$ ||... ||$Q_k$
+
| <math>Q_1</math> || <math>Q_2</math> || … || <math>Q_k</math>
 
|-
 
|-
| $S_1$||$Y_{1,1}$||$Y_{1,2}$||…||$Y_{1,k}$||$X_1=\sum_{j=1}^k{Y_{1,j}}$
+
| <math>S_1</math> || <math>Y_{1,1}</math> || <math>Y_{1,2}</math> || … || <math>Y_{1,k}</math> || <math>X_1 = \sum_{j=1}^k Y_{1,j}</math>
 
|-
 
|-
| $S_2$||$Y_{2,1}$||$Y_{2,2}$||…||$Y_{2,k}$||$X_2=\sum_{j=1}^k{Y_{2,j}}$
+
| <math>S_2</math> || <math>Y_{2,1}</math> || <math>Y_{2,2}</math> || … || <math>Y_{2,k}</math> || <math>X_2 = \sum_{j=1}^k Y_{2,j}</math>
 
|-
 
|-
| ... ||... ||... ||...||...||...
+
| … || … || … || … || … || …
 
|-
 
|-
| $S_n$||$Y_{n,1}$||$Y_{n,2}$||…||$Y_{n,k}$||$X_n=\sum_{j=1}^k{Y_{n,j}}$
+
| <math>S_n</math> || <math>Y_{n,1}</math> || <math>Y_{n,2}</math> || … || <math>Y_{n,k}</math> || <math>X_n = \sum_{j=1}^k Y_{n,j}</math>
 
|-
 
|-
| Variance per Item||$\sigma_{Y_{.,1}}^2=\frac{1}{n-1}\sum_{i=1}^n{(Y_{i,1}-\bar{Y}_{.,1})^2}$||$$\sigma_{Y_{.,2}}^2=\frac{1}{n-1}\sum_{i=1}^n{(Y_{i,2}-\bar{Y}_{.,2})^2}$$||…||$$\sigma_{Y_{.,k}}^2=\frac{1}{n-1}\sum_{i=1}^n{(Y_{i,k}-\bar{Y}_{.,k})^2}$$||$$\sigma_X^2=\frac{1}{n-1}\sum_{i=1}^n{(X_i-\bar{X})^2}$$
+
| Variance || <math>\sigma_{Y_{.,1}}^2</math> || <math>\sigma_{Y_{.,2}}^2</math> || … || <math>\sigma_{Y_{.,k}}^2</math> || <math>\sigma_X^2</math>
 
|}
 
|}
 
</center>
 
</center>
  
===Cronbach's $\alpha$ inference===
+
=== Cronbach's <math>\alpha</math> Inference ===
Cronbach's $\alpha$ coefficient is a point estimate of the reliability. Its standard error is important to construct an interval estimation of its true value and to obtain statistical inference about its significance. There are parametric and non-parametric methods to estimate the variance of Cronbach's $\alpha$, $V(\alpha)$, see [http://www.researchgate.net/profile/Michail_Tsagris/publication/267097800_Confidence_intervals_for_Cronbachs_reliability_coefficient/links/544568eb0cf2d62c304d7f70.pdf this paper (Confidence intervals for Cronbach’s reliability coefficient)].
+
Cronbach’s alpha is a point estimate. Its standard error enables interval estimation and hypothesis testing.
 
 
The [http://link.springer.com/article/10.1007/BF02296146 Chronbach’s alpha has a known distribution], which allows us to compute its variance. Thus, we can compute the confidence interval for $\alpha$ and make inference (e.g., $H_o: \alpha=\alpha_o$, vs. $H_a: \alpha \not= \alpha_o$).
 
 
 
* '''Inference''': We can have parametric (using Pearson correlation matrix) or non-parametric (using Spearman correlation matrix) confidence intervals (CIs) for $\alpha$. Note that Cronbach's alpha may be appropriate for continuous numeric data types. If we have ordinal data, $\alpha$ may underestimate the true instrument reliability. For ordinal data, [http://www.pareonline.net/getvn.asp?v=17&n=3 Zumbo's ordinal alpha or ordinal omega coefficients] may be more appropriate. These estimators employ a correlation matrix under the assumption of latent multivariate normality. For [http://www.researchgate.net/profile/Michail_Tsagris/publication/267097800_Confidence_intervals_for_Cronbachs_reliability_coefficient/links/544568eb0cf2d62c304d7f70.pdf non-parametric CIs we can use the Spearman correlation matrix (based on data ranking), and for parametric CI’s, we can use raw data and Pearson correlation matrix].
 
  
* '''Confidence Intervals''': the Cronbach’s $\alpha$ reliability coefficient is defined as:
+
: Parametric CIs use the Pearson correlation matrix (assumes normality).
$$\alpha=\frac{N}{N-1}
+
: Non-parametric CIs use Spearman correlations (robust to non-normality).
\left ( 1-\frac{\sum_{j=1}^N{V(Y_j)}}
+
: For ordinal data, consider ordinal alpha (Zumbo et al.), which uses polychoric correlations.
              {V\left ( \sum_{j=1}^N{Y_j} \right )}
 
\right ),$$
 
: where $Y_j$ represents the $j^{th}$ variable $Y$ (the $j^{th}$ item in the $Y$ questionnaire), and $V$ is the variance.
 
  
: The estimated variance of $\alpha$ is:
+
The general formula for alpha is:
$$ \hat{\sigma}^2_{\hat{\alpha}} = V(\hat{\alpha})=d\frac{N^2}{k(N-1)^2},
+
<math>
$$
+
\alpha = \frac{N}{N - 1} \left( 1 - \frac{\sum_{j=1}^N V(Y_j)}{V\left( \sum_{j=1}^N Y_j \right)} \right).
 +
</math>
  
: where $N$ is the number of items, $k$ is the sample size (number of completed questionnaires), $d=\frac{2}{(j^tSj)^3} \left ( (j^tSj) \left ( tr(S^2) +tr^2(S) \right ) - 2(tr(S)(j^tS^2j)\right )$, $S$ is unbiased sample estimate of the true covariance matrix of the question items ($\Sigma$), $tr$ is a trace of a matrix, $j$ is an N-dimensional vector of ones.
+
The estimated variance of <math>\hat{\alpha}</math> is
 +
<math>
 +
\hat{\sigma}^2_{\hat{\alpha}} = \frac{N^2}{k(N - 1)^2} \cdot d,
 +
</math> 
 +
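where <math>N</math> is the number of items, <math>k</math> is the sample size (number of completed questionnaires), <math>d = \frac{2}{(j^T S j)^3} \left[ (j^T S j)\left( \operatorname{tr}(S^2) + \operatorname{tr}^2(S) \right) - 2 \operatorname{tr}(S)\,(j^T S^2 j) \right]</math>, <math>S</math> is the unbiased sample estimate of the covariance matrix of the items, <math>\operatorname{tr}</math> denotes the matrix trace, and <math>j</math> is the <math>N</math>-dimensional vector of ones.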
  
: Thus [[SMHS_HypothesisTesting#Testing_a_claim_about_a_mean_with_large_sample_size |the $(1-\gamma)100\%$ confidence interval for $\alpha$]] is:
+
A <math>(1 - \gamma)100\%</math> confidence interval is:
$$\left ( \hat{\alpha} - z_{\frac{\gamma}{2}}\hat{\sigma}_{\hat{\alpha}}, \hat{\alpha} + z_{\frac{\gamma}{2}}\hat{\sigma}_{\hat{\alpha}} \right ), $$
+
<math>
: where $z_{\frac{\gamma}{2}}$ is the [[EBook#The_Standard_Normal_Distribution |normal distribution critical value]] corresponding to false-positive error rate $\gamma$.
+
\left( \hat{\alpha} - z_{\gamma/2} \hat{\sigma}_{\hat{\alpha}},\
 +
\hat{\alpha} + z_{\gamma/2} \hat{\sigma}_{\hat{\alpha}} \right).
 +
</math>
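
The interval above can be sketched in base R. This is an illustration under stated assumptions, not a reference implementation: the function name <code>alpha_ci</code> is hypothetical, the quantity <math>d</math> is computed as <math>\frac{2}{(j^T S j)^3}\left[(j^T S j)(\operatorname{tr}(S^2) + \operatorname{tr}^2(S)) - 2\operatorname{tr}(S)(j^T S^2 j)\right]</math>, and the nurse ratings from the ICC example are reused with the four raters treated as items. With only six subjects the normal approximation is rough, and the upper limit may exceed 1.

```r
# Nurse ratings (6 patients x 4 raters), raters treated as items
ratings <- matrix(c( 9, 2, 5, 8,
                     6, 1, 3, 2,
                     8, 4, 6, 8,
                     7, 1, 2, 6,
                    10, 5, 6, 9,
                     6, 2, 4, 7),
                  ncol = 4, byrow = TRUE)

alpha_ci <- function(items, gamma = 0.05) {
  items <- as.matrix(items)
  N <- ncol(items)              # number of items
  k <- nrow(items)              # sample size
  S <- cov(items)               # unbiased sample covariance matrix
  j <- rep(1, N)                # vector of ones

  jSj  <- drop(t(j) %*% S %*% j)        # equals the total-score variance
  jS2j <- drop(t(j) %*% S %*% S %*% j)
  trS  <- sum(diag(S))                  # sum of the item variances
  trS2 <- sum(diag(S %*% S))

  alpha_hat <- (N / (N - 1)) * (1 - trS / jSj)
  d  <- (2 / jSj^3) * (jSj * (trS2 + trS^2) - 2 * trS * jS2j)
  se <- sqrt(d * N^2 / (k * (N - 1)^2))

  z <- qnorm(1 - gamma / 2)     # normal critical value
  c(alpha = alpha_hat, lower = alpha_hat - z * se, upper = alpha_hat + z * se)
}

ci <- alpha_ci(ratings)
print(round(ci, 3))
```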
  
===Problems===
+
=== Problems ===
* Use the [[SOCR_TurkiyeStudentEvalData| Turkiye Student Course Evaluation survey (N=5,000)]] to compute the ICC ad Cronbach's alpha.
+
Use the [[SOCR_TurkiyeStudentEvalData|Turkiye Student Course Evaluation survey (N=5,000)]] to compute the ICC and Cronbach’s alpha.
  
===References===
+
=== References ===
* [http://en.wikipedia.org/wiki/Cronbach's_alpha  Cronbach's alpha Wikipedia]
 
*[http://en.wikipedia.org/wiki/Kuder–Richardson_Formula_20  Kuder-Richardson Formula 20 Wikipedia]
 
  
 +
: [https://en.wikipedia.org/wiki/Cronbach%27s_alpha Cronbach's alpha – Wikipedia] 
 +
: [https://en.wikipedia.org/wiki/Kuder%E2%80%93Richardson_Formula_20 KR-20 – Wikipedia] 
 +
: Tsagris, M. (2014). Confidence intervals for Cronbach’s reliability coefficient. ResearchGate. 
 +
: Zumbo, B. D., Gadermann, A. M., & Zeisser, C. (2007). Ordinal versions of coefficients alpha and theta. Practical Assessment, Research & Evaluation, 12(13).
  
 
<hr>
 
<hr>
* SOCR Home page: http://www.socr.umich.edu
+
SOCR Home page: http://www.socr.umich.edu
 
 
 
{{translate|pageName=http://wiki.socr.umich.edu/index.php?title=SMHS_Cronbachs}}
 
{{translate|pageName=http://wiki.socr.umich.edu/index.php?title=SMHS_Cronbachs}}

Latest revision as of 17:30, 10 February 2026

Scientific Methods for Health Sciences – Instrument Performance Evaluation: Cronbach's α

Overview

Cronbach’s alpha (\(\alpha\)) is a coefficient of internal consistency and is commonly used as an estimate of the reliability of a psychometric test. Internal consistency is typically based on the correlations between different items on the same test and assesses whether several items that purport to measure the same general construct produce similar scores. Cronbach’s alpha is widely used in the social sciences, nursing, business, and other disciplines. This section presents a general introduction to Cronbach’s alpha, including its calculation, application in research, and common issues in its use.

Motivation

We have discussed internal and external consistency and their importance in research. How do we measure internal consistency? For example, suppose we are interested in measuring the extent of handicap among patients suffering from a certain disease. The dataset contains 10 records measuring the degree of difficulty experienced in carrying out daily activities. Each item is scored from 1 (no difficulty) to 4 (can’t do). When these data are used to form a scale, they should exhibit internal consistency—all items should measure the same underlying construct and thus be correlated with one another. Cronbach’s alpha generally increases as the correlations between items increase.

Theory

Cronbach’s Alpha

Cronbach’s alpha is a measure of internal consistency or reliability of a psychometric instrument and assesses how well a set of items measures a single, one-dimensional latent trait.

Suppose we measure a quantity \(X\), which is the sum of \(K\) components\[X = Y_1 + Y_2 + \cdots + Y_K\]. Then Cronbach’s alpha is defined as\[ \alpha = \frac{K}{K - 1} \left( 1 - \frac{\sum_{i=1}^{K} \sigma_{Y_i}^2}{\sigma_X^2} \right), \] where \(\sigma_X^2\) is the variance of the observed total test scores and \(\sigma_{Y_i}^2\) is the variance of component \(i\) in the current sample.

If items are scored dichotomously (0 or 1), then\[ \alpha = \frac{K}{K - 1} \left( 1 - \frac{\sum_{i=1}^{K} P_i Q_i}{\sigma_X^2} \right), \] where \(P_i\) is the proportion scoring 1 on item \(i\), and \(Q_i = 1 - P_i\).

Alternatively, Cronbach’s alpha can be expressed as\[ \alpha = \frac{K \bar{c}}{\bar{v} + (K - 1) \bar{c}}, \] where \(\bar{v}\) is the average variance of each component and \(\bar{c}\) is the average covariance between all item pairs.

The standardized Cronbach’s alpha is\[ \alpha_{\text{standardized}} = \frac{K \bar{r}}{1 + (K - 1) \bar{r}}, \] where \(\bar{r}\) is the mean of the \(K(K - 1)/2\) non-redundant correlation coefficients (e.g., from the upper triangle of the correlation matrix).
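
As a rough illustration of the formulas above, the following base-R sketch computes alpha in three ways on a small made-up score matrix. The values in `scores` and all variable names are illustrative assumptions, not data from this page. The variance form and the average-covariance form are algebraically identical, so they should agree; the standardized form generally differs unless all items have equal variances.

```r
# Hypothetical data: 5 subjects (rows) by 4 items (columns)
scores <- matrix(c(3, 4, 3, 4,
                   2, 2, 3, 2,
                   4, 4, 4, 3,
                   1, 2, 1, 2,
                   3, 3, 4, 4),
                 ncol = 4, byrow = TRUE)

K <- ncol(scores)

# Variance form: K/(K-1) * (1 - sum of item variances / variance of total score)
alpha_var <- K / (K - 1) *
  (1 - sum(apply(scores, 2, var)) / var(rowSums(scores)))

# Average-covariance form: K*c_bar / (v_bar + (K-1)*c_bar)
S <- cov(scores)
v_bar <- mean(diag(S))          # average item variance
c_bar <- mean(S[upper.tri(S)])  # average inter-item covariance
alpha_cov <- K * c_bar / (v_bar + (K - 1) * c_bar)

# Standardized form: K*r_bar / (1 + (K-1)*r_bar)
R <- cor(scores)
r_bar <- mean(R[upper.tri(R)])  # mean of the K(K-1)/2 correlations
alpha_std <- K * r_bar / (1 + (K - 1) * r_bar)

print(c(variance_form = alpha_var,
        covariance_form = alpha_cov,
        standardized = alpha_std))
```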

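As a reliability coefficient, the theoretical (population) value of alpha lies between 0 and 1, because reliability is a ratio of variances; sample estimates of alpha, however, can take any value less than or equal to 1, including negative values. Reliability of test scores is defined as\[ \rho_{XX} = \frac{\sigma_T^2}{\sigma_X^2}, \] the ratio of true-score variance to total-score variance.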

Internal Consistency

Internal consistency measures whether several items hypothesized to reflect the same construct yield similar scores. It is usually quantified using Cronbach’s alpha, which is derived from pairwise item correlations. Internal consistency can theoretically range from negative infinity to 1. Negative values occur when within-subject variability exceeds between-subject variability. Only positive values of Cronbach’s alpha are interpretable.

Cronbach’s alpha increases as inter-item correlations increase.

Cronbach's alpha Internal consistency
\(\alpha \geq 0.9\) Excellent (High-stakes testing)
\(0.7 \leq \alpha < 0.9\) Good (Low-stakes testing)
\(0.6 \leq \alpha < 0.7\) Acceptable
\(0.5 \leq \alpha < 0.6\) Poor
\(\alpha < 0.5\) Unacceptable
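
The thresholds in the table above can be applied programmatically. The helper below is a minimal sketch in base R; the function name `interpret_alpha` is an assumption introduced here for illustration, and the cut-offs are simply those listed in the table.

```r
# Map alpha values to the labels in the table above.
# Intervals are closed on the left, e.g. 0.7 <= alpha < 0.9 is "Good".
interpret_alpha <- function(alpha) {
  cut(alpha,
      breaks = c(-Inf, 0.5, 0.6, 0.7, 0.9, Inf),
      labels = c("Unacceptable", "Poor", "Acceptable", "Good", "Excellent"),
      right = FALSE)
}

labels_out <- as.character(interpret_alpha(c(0.45, 0.65, 0.9)))
print(labels_out)
```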

Other Measures

Intra-class correlation (ICC) assesses the consistency or reproducibility of quantitative measurements made by different observers measuring the same quantity. Broadly, ICC is defined as\[ \text{ICC} = \frac{\text{Variance due to rated subjects (patients)}}{\text{Variance due to subjects} + \text{Variance due to judges} + \text{Residual variance}}. \]

Example: Suppose 4 nurses rate 6 patients on a 10-point depression scale:

PatientID NurseRater1 NurseRater2 NurseRater3 NurseRater4
1 9 2 5 8
2 6 1 3 2
3 8 4 6 8
4 7 1 2 6
5 10 5 6 9
6 6 2 4 7

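These data can also be arranged in long format, with one row per rating (here the first column is a running row index rather than a true patient identifier):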

PatientID Rating Nurse
1 9 1
2 6 1
3 8 1
4 7 1
5 10 1
6 6 1
7 2 2
8 1 2
9 4 2
10 1 2
11 5 2
12 2 2
13 5 3
14 3 3
15 6 3
16 2 3
17 6 3
18 4 3
19 8 4
20 2 4
21 8 4
22 6 4
23 9 4
24 7 4
install.packages("ICC")
library("ICC")

# Load data (adjust path as needed)
dataset <- read.csv('C:\\Users\\Desktop\\Nurse_data.csv', header = TRUE)
dataset <- dataset[, -1]  # Remove PatientID column

# Fit ICC model
icc_result <- ICCest(Rating, Nurse, data = dataset)
icc_result
# ICC: -0.4804401
# 95% CI: (-0.656, -0.035)
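
For comparison, here is a minimal base-R sketch of a one-way ICC computed directly from ANOVA mean squares, treating the six patients as the grouping factor and the four nurse ratings as replicate measurements. This is a different calculation from the ICCest call above, which uses the long-format columns as given; the variable names are illustrative assumptions.

```r
# Wide-format ratings from the example table: rows = patients, columns = nurses
ratings <- matrix(c( 9, 2, 5, 8,
                     6, 1, 3, 2,
                     8, 4, 6, 8,
                     7, 1, 2, 6,
                    10, 5, 6, 9,
                     6, 2, 4, 7),
                  ncol = 4, byrow = TRUE)

n <- nrow(ratings)  # number of patients
k <- ncol(ratings)  # number of ratings per patient

grand_mean    <- mean(ratings)
patient_means <- rowMeans(ratings)

# One-way ANOVA mean squares (between patients, within patients)
ms_between <- k * sum((patient_means - grand_mean)^2) / (n - 1)
ms_within  <- sum((ratings - patient_means)^2) / (n * (k - 1))

# One-way random-effects ICC for a single rating
icc_oneway <- (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
print(round(icc_oneway, 3))
```

With these ratings the estimate is roughly 0.17, which suggests that most of the variability comes from differences among raters and residual noise rather than from differences among patients.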

Cronbach’s alpha is often said to equal the stepped-up intra-class correlation coefficient, which is commonly used in observational studies. This holds if and only if the item variance component is zero. If this component is negative, alpha underestimates the stepped-up ICC; if positive, it overestimates it.

Generalizability Theory

Cronbach’s alpha is an unbiased estimate of generalizability. It can be interpreted as a measure of how well the sum score on selected items captures the expected score over the entire domain—even if the domain is heterogeneous.

Problems with Cronbach’s Alpha

- Alpha depends not only on the magnitude of inter-item correlations but also on the number of items. Scales can appear more homogeneous simply by adding more items, even if the average correlation remains unchanged.
- Combining two distinct constructs into one scale may yield a high alpha despite measuring two different attributes.
- Excessively high alpha (\(> 0.95\)) may indicate item redundancy.
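
The dependence on the number of items can be seen directly from the standardized formula \(K \bar{r} / (1 + (K - 1) \bar{r})\). The sketch below holds the average inter-item correlation fixed at a hypothetical value of 0.3 (an assumption chosen only for illustration) and varies \(K\).

```r
# Standardized alpha as a function of item count K and mean correlation r_bar
alpha_standardized <- function(K, r_bar) {
  K * r_bar / (1 + (K - 1) * r_bar)
}

r_bar    <- 0.3           # hypothetical average inter-item correlation
K_values <- c(5, 10, 20)  # number of items in the scale

alpha_values <- alpha_standardized(K_values, r_bar)
print(round(alpha_values, 3))  # 0.682 0.811 0.896
```

Alpha rises from about 0.68 to about 0.90 even though the items are no more strongly related to one another.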

Split-Half Reliability

In split-half reliability, the test is divided into two halves (e.g., odd vs. even items). The correlation between halves is adjusted using the Spearman–Brown prophecy formula\[ r' = \frac{n r}{1 + (n - 1) r}, \] where \(r\) is the raw correlation between halves and \(n\) is the factor by which the test length is increased (\(n = 2\) for two halves, giving \(r' = 2r / (1 + r)\)).

Example:

Index Q1 Q2 Q3 Q4 Q5 Q6 Odd Even
1 1 0 0 1 1 0 2 1
2 1 1 0 1 0 1 1 3
3 1 1 1 1 1 0 3 2
4 1 0 0 0 1 0 2 0
5 1 1 1 1 0 0 2 2
6 0 0 0 0 1 0 1 0
Mean (Odd, Even): 1.833, 1.333
SD (Odd, Even): 0.753, 1.211
corr(Even, Odd) = 0.073
AdjCorr = \(\frac{n r}{1 + (n-1) r}\) = 0.136
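
The summary values in the example can be reproduced with a few lines of base R. This is a minimal sketch using the six-item table above; the Spearman–Brown adjustment is written in its standard form \(n r / (1 + (n - 1) r)\), which reduces to \(2r / (1 + r)\) for two halves. Variable names are illustrative.

```r
# Item responses from the example table: rows = subjects, columns = Q1..Q6
items <- matrix(c(1, 0, 0, 1, 1, 0,
                  1, 1, 0, 1, 0, 1,
                  1, 1, 1, 1, 1, 0,
                  1, 0, 0, 0, 1, 0,
                  1, 1, 1, 1, 0, 0,
                  0, 0, 0, 0, 1, 0),
                ncol = 6, byrow = TRUE)

odd  <- rowSums(items[, c(1, 3, 5)])  # half-test score on odd-numbered items
even <- rowSums(items[, c(2, 4, 6)])  # half-test score on even-numbered items

r <- cor(odd, even)                   # raw correlation between the halves
n <- 2                                # full test is twice as long as each half
r_adj <- n * r / (1 + (n - 1) * r)    # Spearman-Brown adjusted reliability

print(round(c(mean_odd = mean(odd), mean_even = mean(even),
              sd_odd = sd(odd), sd_even = sd(even),
              corr = r, adjusted = r_adj), 3))
```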

KR-20

The Kuder–Richardson Formula 20 (KR-20) is a reliability estimate for dichotomous items\[ \text{KR-20} = \frac{K}{K - 1} \left( 1 - \frac{\sum_{i=1}^K p_i q_i}{\sigma_X^2} \right), \] where \(p_i\) is the proportion of correct responses to item \(i\), \(q_i = 1 - p_i\), and \(\sigma_X^2\) is the sample variance of total scores.

KR-20 is a special case of Cronbach’s alpha for binary data. It cannot accommodate partial credit or ordinal responses.
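
The equivalence with Cronbach’s alpha can be checked numerically. The sketch below reuses the six binary items from the split-half example and assumes that all variances use the same denominator (here the population form, dividing by the number of subjects), since \(p_i q_i\) is the population variance of a 0/1 item. The helper `var_pop` and the other names are assumptions introduced for illustration.

```r
# Binary item responses: rows = subjects, columns = Q1..Q6
items <- matrix(c(1, 0, 0, 1, 1, 0,
                  1, 1, 0, 1, 0, 1,
                  1, 1, 1, 1, 1, 0,
                  1, 0, 0, 0, 1, 0,
                  1, 1, 1, 1, 0, 0,
                  0, 0, 0, 0, 1, 0),
                ncol = 6, byrow = TRUE)

K     <- ncol(items)
total <- rowSums(items)

# Population variance (divides by n rather than n - 1)
var_pop <- function(x) mean((x - mean(x))^2)

# KR-20 from item proportions
p <- colMeans(items)
q <- 1 - p
kr20 <- K / (K - 1) * (1 - sum(p * q) / var_pop(total))

# Cronbach's alpha from item variances, same variance convention
alpha <- K / (K - 1) * (1 - sum(apply(items, 2, var_pop)) / var_pop(total))

print(round(c(KR20 = kr20, alpha = alpha), 3))
```

Both quantities come out at about 0.406 for these data. Mixing conventions (for example, \(p_i q_i\) in the numerator with an \(n - 1\) sample variance in the denominator) would make the two values differ slightly.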

Standard Error of Measurement (SEM)

The greater the reliability, the smaller the SEM\[ \text{SEM} = S \sqrt{1 - r_{xx}}, \] where \(r_{xx}\) is the reliability and \(S\) is the standard deviation of observed scores.
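
A short numeric illustration of this relationship, assuming a hypothetical observed-score standard deviation of 10 and three hypothetical reliability values (none of these numbers come from the page); the function name `sem` is likewise an assumption.

```r
# Standard error of measurement: S * sqrt(1 - r_xx)
sem <- function(sd_observed, reliability) {
  stopifnot(reliability >= 0, reliability <= 1)
  sd_observed * sqrt(1 - reliability)
}

reliabilities <- c(0.70, 0.80, 0.91)  # hypothetical reliabilities
sem_values <- sem(10, reliabilities)  # hypothetical SD of 10
print(round(sem_values, 2))           # 5.48 4.47 3.00
```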

Applications

- A study on Kolb’s revised Learning Style Inventory with 221 business students supports the internal reliability of its scales and discusses factor analysis with ipsative data.
- Research comparing multi-item scales to single-item questions demonstrates that single items yield unreliable results and should not be used for construct inference.

Software

SOCR Cronbach's alpha calculator webapp (coming soon).

In R, the `psych` package (function `alpha()`) is often recommended over older alternatives like `psy`. The example below nevertheless uses the `psy` package and its `expsy` dataset:

# Load required packages
library(psy)
data(expsy)

# Compute alpha for first 10 items
cronbach(expsy[, 1:10])
# Result: alpha ≈ 0.176 — low due to reversed item (item 2)

# Reverse item 2 (assuming 1=high, 4=low)
revised_data <- cbind(expsy[, c(1, 3:10)], -1 * expsy[, 2])
cronbach(revised_data)
# Result: alpha ≈ 0.375

# Bootstrap 95% CI
library(boot)
cronbach.boot <- function(data, indices) {
  cronbach(data[indices, ])[[3]]  # extract alpha
}
res <- boot(revised_data, cronbach.boot, R = 1000)
quantile(res$t, c(0.025, 0.975))
# e.g., [-0.30, 0.63]

boot.ci(res, type = "bca")
# BCa CI: (-0.15, 0.67)

The coefficientalpha R package offers robust estimation with missing data and non-normality, including standard errors and confidence intervals.

Cronbach's \(\alpha\) Calculations

The table below illustrates the core structure for computing Cronbach’s alpha.

Subjects | Item \(Q_1\) | Item \(Q_2\) | \(\cdots\) | Item \(Q_k\) | Total Score
\(S_1\) | \(Y_{1,1}\) | \(Y_{1,2}\) | \(\cdots\) | \(Y_{1,k}\) | \(X_1 = \sum_{j=1}^k Y_{1,j}\)
\(S_2\) | \(Y_{2,1}\) | \(Y_{2,2}\) | \(\cdots\) | \(Y_{2,k}\) | \(X_2 = \sum_{j=1}^k Y_{2,j}\)
\(\vdots\) | \(\vdots\) | \(\vdots\) | | \(\vdots\) | \(\vdots\)
\(S_n\) | \(Y_{n,1}\) | \(Y_{n,2}\) | \(\cdots\) | \(Y_{n,k}\) | \(X_n = \sum_{j=1}^k Y_{n,j}\)
Variance | \(\sigma_{Y_{.,1}}^2\) | \(\sigma_{Y_{.,2}}^2\) | \(\cdots\) | \(\sigma_{Y_{.,k}}^2\) | \(\sigma_X^2\)

Cronbach's \(\alpha\) Inference

Cronbach’s alpha is a point estimate. Its standard error enables interval estimation and hypothesis testing.

Parametric CIs use the Pearson correlation matrix (assumes normality).
Non-parametric CIs use Spearman correlations (robust to non-normality).
For ordinal data, consider ordinal alpha (Zumbo et al.), which uses polychoric correlations.

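The general formula for alpha is\[ \alpha = \frac{N}{N - 1} \left( 1 - \frac{\sum_{j=1}^N V(Y_j)}{V\left( \sum_{j=1}^N Y_j \right)} \right), \] where \(N\) here denotes the number of items (written \(K\) in the Theory section) and \(V(\cdot)\) denotes variance.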

The estimated variance of \(\hat{\alpha}\) is\[ \hat{\sigma}^2_{\hat{\alpha}} = \frac{N^2}{k(N - 1)^2} \cdot d, \] where \(d\) depends on the covariance matrix \(S\), its trace, and the vector of ones \(j\).

A \((1 - \gamma)100\%\) confidence interval is\[ \left( \hat{\alpha} - z_{\gamma/2} \hat{\sigma}_{\hat{\alpha}},\ \hat{\alpha} + z_{\gamma/2} \hat{\sigma}_{\hat{\alpha}} \right). \]
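
Given a point estimate and its standard error, the interval above is straightforward to evaluate. The helper below is a minimal sketch; the function name `alpha_wald_ci` and the input values (an estimate of 0.80 with standard error 0.04) are hypothetical and serve only to show the arithmetic.

```r
# Normal-approximation (Wald) interval: alpha_hat +/- z_{gamma/2} * se
alpha_wald_ci <- function(alpha_hat, se, gamma = 0.05) {
  z <- qnorm(1 - gamma / 2)  # upper gamma/2 quantile of the standard normal
  c(lower = alpha_hat - z * se, upper = alpha_hat + z * se)
}

ci <- alpha_wald_ci(alpha_hat = 0.80, se = 0.04)  # hypothetical inputs
print(round(ci, 3))
```

Because this interval is symmetric, its upper limit can exceed 1 when the estimate is close to 1 or the standard error is large; the bootstrap intervals shown in the Software section are one alternative in that case.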

Problems

Use the Turkiye Student Course Evaluation survey (N=5,000) to compute the ICC and Cronbach’s alpha.

References

Cronbach's alpha – Wikipedia
KR-20 – Wikipedia
Tsagris, M. (2014). Confidence intervals for Cronbach’s reliability coefficient. ResearchGate.
Zumbo, B. D., Gadermann, A. M., & Zeisser, C. (2007). Ordinal versions of coefficients alpha and theta. Practical Assessment, Research & Evaluation, 12(13).

SOCR Home page: http://www.socr.umich.edu


