SMHS ROC

Scientific Methods for Health Sciences - Receiver Operating Characteristic (ROC) Curve

Overview

The Receiver Operating Characteristic (ROC) curve is a fundamental graphical tool used to evaluate the performance of a binary classifier system as its discrimination threshold varies. It illustrates the diagnostic ability of a classifier by plotting the True Positive Rate (TPR) against the False Positive Rate (FPR) across all possible threshold settings.

By visualizing these trade-offs, the ROC curve aids in selecting optimal models and discarding suboptimal ones. The Area Under the Curve (AUC) serves as a single scalar measure of aggregate diagnostic performance.

Motivation

In binary classification tasks, such as diagnosing Disease vs. No Disease, outcomes are often determined by whether a continuous test statistic falls above or below a chosen cutoff. Sensitivity and specificity describe accuracy at a single threshold, but both change as the threshold shifts, so one pair of values cannot summarize the test.

Key objectives of ROC analysis include:

Visualizing trade-offs: Demonstrating the dynamic relationship between sensitivity (TPR) and specificity (\(1 - \text{FPR}\)).
Assessing accuracy: The closer the curve hugs the top-left corner of the ROC space, the better the test. A curve along the 45° diagonal represents random guessing (no discriminative ability).
Selecting optimal thresholds: The slope of the tangent to the ROC curve at a given point reflects the likelihood ratio, enabling threshold selection based on clinical costs or benefits.
Summarizing performance: The AUC quantifies the probability that the classifier ranks a randomly chosen positive instance higher than a randomly chosen negative one.

Theory

The Confusion Matrix and Core Metrics

A binary classifier produces four possible outcomes based on a decision threshold:

Test Result (rows) vs. True Condition, Gold Standard (columns)
Test Result	Disease (Positive)	No Disease (Negative)
Positive	True Positive (TP) (Hit)	False Positive (FP) (Type I Error, \(\alpha\))
Negative	False Negative (FN) (Type II Error, \(\beta\))	True Negative (TN) (Correct Rejection)

Fundamental metrics derived from this matrix:

Sensitivity (True Positive Rate): \[\text{Sensitivity} = \frac{TP}{TP + FN}\]
Specificity (True Negative Rate): \[\text{Specificity} = \frac{TN}{TN + FP}\]
False Positive Rate: \[\text{FPR} = 1 - \text{Specificity} = \frac{FP}{TN + FP}\]
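
The three formulas above translate directly into a few lines of R. The sketch below is only an illustration: the helper name `core_metrics` and the counts passed to it are hypothetical, chosen to make the arithmetic easy to check by hand.

```r
# Minimal sketch: core metrics from the four confusion-matrix counts.
# `core_metrics` is an illustrative helper, not part of any package.
core_metrics <- function(tp, fn, fp, tn) {
  sensitivity <- tp / (tp + fn)   # true positive rate
  specificity <- tn / (tn + fp)   # true negative rate
  fpr         <- fp / (tn + fp)   # false positive rate = 1 - specificity
  c(sensitivity = sensitivity, specificity = specificity, fpr = fpr)
}

# Hypothetical counts: 50 diseased (40 detected), 50 non-diseased (5 false alarms)
m <- core_metrics(tp = 40, fn = 10, fp = 5, tn = 45)
print(m)
```

Sensitivity and specificity each use only one column of the confusion matrix, so neither depends on how common the disease is in the sample.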

Constructing the ROC Curve

To construct an ROC curve, sensitivity and specificity are computed for every feasible cutoff value of the diagnostic test.

Example: Hypothyroidism Diagnosis Using T4 Levels

The table below shows the distribution of T4 measurements in hypothyroid (diseased) and euthyroid (non-diseased) individuals.

Frequency of T4 Levels in Hypothyroid vs. Euthyroid Patients
Group	<1	1–2	2–3	3–4	4–5	5–6	6–7	7–8	8–9	9–10	10–11	>11
Hypothyroid	2	3	1	8	4	4	3	3	1	0	2	1
Euthyroid	0	0	0	0	1	6	11	19	17	20	11	8

We compute performance metrics at different thresholds:

Cut-point = 5 (strict): T4 ≤ 5 → test positive

 \(\text{Sensitivity} = 18/32 = 0.56\)
 \(\text{Specificity} = 92/93 = 0.99\)

Cut-point = 7 (moderate): T4 ≤ 7 → test positive

 \(\text{Sensitivity} = 25/32 = 0.78\)
 \(\text{Specificity} = 75/93 = 0.81\)

Cut-point = 9 (lenient): T4 ≤ 9 → test positive

 \(\text{Sensitivity} = 29/32 = 0.91\)
 \(\text{Specificity} = 39/93 = 0.42\)

Plotting Sensitivity (y-axis) versus FPR = \(1 - \text{Specificity}\) (x-axis) for all thresholds yields the ROC curve.
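
The full set of ROC points can be obtained from the frequency table with cumulative sums. The sketch below assumes, as in the worked cut-points, that a T4 value at or below the cut-point counts as a positive test, and it uses the upper edge of each bin (1 through 11) as the candidate cut-points. The variable names are illustrative.

```r
# Bin counts from the T4 table (bins: <1, 1-2, ..., 10-11, >11)
hypo <- c(2, 3, 1, 8, 4, 4, 3, 3, 1, 0, 2, 1)       # hypothyroid (diseased)
eu   <- c(0, 0, 0, 0, 1, 6, 11, 19, 17, 20, 11, 8)  # euthyroid (non-diseased)

# Candidate cut-points: upper edge of every bin except the open-ended last one
cuts <- 1:11

# Test positive when T4 <= cut, so cumulative counts give TP and FP
tp <- cumsum(hypo)[cuts]
fp <- cumsum(eu)[cuts]

roc_points <- data.frame(
  cut         = cuts,
  sensitivity = tp / sum(hypo),
  specificity = 1 - fp / sum(eu),
  fpr         = fp / sum(eu)
)
print(round(roc_points, 2))

# Add the trivial end points (0,0) and (1,1) before drawing the curve
plot(c(0, roc_points$fpr, 1), c(0, roc_points$sensitivity, 1),
     type = "b", xlab = "FPR (1 - Specificity)", ylab = "Sensitivity (TPR)",
     main = "ROC curve for T4")
abline(a = 0, b = 1, lty = 2)  # chance line
```

The rows for cut-points 5, 7, and 9 reproduce the sensitivity and specificity values computed by hand above.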

Applications and Interpretation

Area Under the Curve (AUC)

The AUC provides a standardized measure of overall diagnostic accuracy. A commonly used rough grading is:

0.90–1.00: Excellent
0.80–0.90: Good
0.70–0.80: Fair
0.60–0.70: Poor
0.50–0.60: Fail (no better than chance)

In the hypothyroidism example, the AUC is 0.86, indicating good discriminative ability.
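
As a rough check on this figure, the AUC can be computed from the binned T4 table in two ways: by the trapezoidal rule applied to the empirical ROC points, and as the proportion of (hypothyroid, euthyroid) pairs in which the hypothyroid patient falls in the lower T4 bin, counting pairs in the same bin as one half. The sketch below assumes that lower T4 indicates disease; the variable names are illustrative. Both calculations give about 0.855, consistent with the rounded value of 0.86.

```r
# Bin counts from the T4 table (bins: <1, 1-2, ..., 10-11, >11)
hypo <- c(2, 3, 1, 8, 4, 4, 3, 3, 1, 0, 2, 1)       # hypothyroid (diseased)
eu   <- c(0, 0, 0, 0, 1, 6, 11, 19, 17, 20, 11, 8)  # euthyroid (non-diseased)

# (1) Trapezoidal rule over the empirical ROC points, from (0,0) to (1,1)
tpr <- c(0, cumsum(hypo) / sum(hypo))
fpr <- c(0, cumsum(eu) / sum(eu))
auc_trapezoid <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)

# (2) Pairwise ranking: P(hypothyroid T4 < euthyroid T4) + 0.5 * P(same bin)
eu_above  <- sum(eu) - cumsum(eu)  # euthyroid patients in strictly higher bins
auc_pairs <- sum(hypo * (eu_above + 0.5 * eu)) / (sum(hypo) * sum(eu))

print(c(trapezoid = auc_trapezoid, pairs = auc_pairs))
```

The agreement between the two numbers illustrates the interpretation of the AUC as the probability that a randomly chosen diseased patient is ranked as more abnormal than a randomly chosen non-diseased one.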

Optimal Threshold Selection and Cost Analysis

While AUC summarizes global performance, clinical decisions require a single operating threshold. The optimal choice depends on the relative costs of false positives (FP) and false negatives (FN).

The slope method identifies the optimal point on the ROC curve as the one where the tangent slope equals \[\text{Slope} = \frac{\text{Cost}(FP) \times P(\text{Negative})}{\text{Cost}(FN) \times P(\text{Positive})}\]

The two examples below assume equal prevalence, \(P(\text{Positive}) = P(\text{Negative})\), so the slope reduces to the cost ratio:

If missing a disease (FN) is 8× more costly than a false alarm (FP), the target slope is \(1/8\).
If treatment risks make FP 2× more costly than FN, the target slope is 2.

This approach balances clinical priorities with statistical performance.
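
One way to apply the slope method to a discrete ROC curve is to pick the operating point that maximizes \(\text{TPR} - m \cdot \text{FPR}\), where \(m\) is the target slope; this is the point at which a line of slope \(m\) touches the curve from above. The sketch below applies that rule to the binned T4 data, treating T4 at or below the cut-point as a positive test. The helper names `target_slope` and `best_cut`, the restriction to cut-points 1 through 11, and the choice of equal prevalence are assumptions made for illustration.

```r
# Bin counts from the T4 table (bins: <1, 1-2, ..., 10-11, >11)
hypo <- c(2, 3, 1, 8, 4, 4, 3, 3, 1, 0, 2, 1)       # hypothyroid (diseased)
eu   <- c(0, 0, 0, 0, 1, 6, 11, 19, 17, 20, 11, 8)  # euthyroid (non-diseased)

# Target slope from misclassification costs and disease prevalence
target_slope <- function(cost_fp, cost_fn, prevalence) {
  (cost_fp * (1 - prevalence)) / (cost_fn * prevalence)
}

# Cut-point on the empirical ROC curve that maximizes TPR - m * FPR
best_cut <- function(m, cuts = 1:11) {
  tpr <- cumsum(hypo)[cuts] / sum(hypo)
  fpr <- cumsum(eu)[cuts] / sum(eu)
  cuts[which.max(tpr - m * fpr)]
}

# FP twice as costly as FN, equal prevalence assumed: slope = 2
m <- target_slope(cost_fp = 2, cost_fn = 1, prevalence = 0.5)
print(m)
print(best_cut(m))
```

With equal costs and equal prevalence (\(m = 1\)) the same rule reduces to maximizing the Youden index, \(\text{Sensitivity} + \text{Specificity} - 1\). A larger slope, which penalizes false positives more heavily, moves the chosen cut-point toward the strict end of the curve.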

Practical Implementation in R

Modern ROC analysis leverages R packages such as `pROC` and `ROCR` for robust computation and visualization.

Example 1: Basic ROC Analysis for T4 Test

# Install and load required package
if (!require("pROC")) install.packages("pROC")
library(pROC)

# Simulated data based on T4 distribution
response <- c(rep(1, 32), rep(0, 93))  # 1 = Hypothyroid, 0 = Euthyroid

# Simulated T4 values (lower = more likely diseased)
predictor_pos <- c(rep(4, 18), rep(6, 7), rep(8, 4), rep(10, 3))
predictor_neg <- c(rep(4, 1), rep(6, 17), rep(8, 36), rep(10, 39))
predictor <- c(predictor_pos, predictor_neg)

# Build ROC object: levels = c(controls, cases);
# direction = ">" because controls (euthyroid) have higher T4 than cases
roc_obj <- roc(response, predictor, levels = c(0, 1), direction = ">")

# Plot ROC curve with AUC
plot(roc_obj, main = "ROC Curve for T4", col = "blue", print.auc = TRUE)

# Identify optimal threshold (Youden index)
coords(roc_obj, "best", ret = c("threshold", "specificity", "sensitivity"))

Example 2: Comparing Classifiers for Alzheimer’s Disease

This example compares Random Forest and Logistic Regression models using Global Gray Matter Volume (GMV) and demographic features.

# Load required libraries
if (!require("randomForest")) install.packages("randomForest")
if (!require("ROCR")) install.packages("ROCR")
library(randomForest)
library(ROCR)
# install.packages(c("xml2", "rvest"))
library(xml2)
library(rvest)

# Scrape the data table from the SOCR wiki page
wiki_page <- read_html("https://wiki.socr.umich.edu/index.php/SOCR_Data_July2009_ID_NI#Curvedness_Data")
dataset <- as.data.frame(html_table(html_nodes(wiki_page, "table")[[2]]))

# Dynamically load GMV dataset from SOCR GitHub
# url <- "https://raw.githubusercontent.com/SOCR/SOCR_Data/master/CSV_SOCR_Data/CSV_July2009_ID_NI_Curvedness_Data.csv"
# dataset <- read.csv(url, stringsAsFactors = FALSE)

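# Preprocess: binary classification (AD vs. non-AD); exclude MCI.
# The column names (Group, GMV, Age, Sex) and group labels ("NonAD", "AD", "MCI")
# are assumed here; check them with str(dataset) and adjust if the table differs.
dataset_clean <- subset(dataset, Group != "MCI")
dataset_clean$Group <- factor(dataset_clean$Group, levels = c("NonAD", "AD"))  # "AD" is the positive class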

# Ensure GMV, Age are numeric; Sex as factor
dataset_clean$GMV <- as.numeric(dataset_clean$GMV)
dataset_clean$Age <- as.numeric(dataset_clean$Age)
dataset_clean$Sex <- as.factor(dataset_clean$Sex)

# Random Forest model
set.seed(123)
rf_model <- randomForest(Group ~ GMV + Age + Sex, data = dataset_clean, ntree = 100)
# In-sample probabilities for class "AD"; these are optimistic for a random forest.
# predict(rf_model, type = "prob") without new data returns out-of-bag estimates instead.
rf_probs <- predict(rf_model, dataset_clean, type = "prob")[, "AD"]

# Logistic regression model (GMV and Age only, so the two models do not share the same predictors)
glm_model <- glm(Group ~ GMV + Age, data = dataset_clean, family = binomial)
glm_probs <- predict(glm_model, dataset_clean, type = "response")

# ROC performance
pred_rf <- prediction(rf_probs, dataset_clean$Group)
perf_rf <- performance(pred_rf, "tpr", "fpr")
auc_rf <- performance(pred_rf, "auc")@y.values[[1]]

pred_glm <- prediction(glm_probs, dataset_clean$Group)
perf_glm <- performance(pred_glm, "tpr", "fpr")
auc_glm <- performance(pred_glm, "auc")@y.values[[1]]

# Comparative plot
plot(perf_rf, col = "blue", lwd = 2, main = "ROC Comparison: RF vs GLM")
plot(perf_glm, col = "red", lwd = 2, add = TRUE)
abline(a = 0, b = 1, lty = 2, col = "gray")
legend("bottomright",
       legend = c(sprintf("Random Forest (AUC = %.3f)", auc_rf),
                  sprintf("Logistic Reg (AUC = %.3f)", auc_glm)),
       col = c("blue", "red"), lwd = 2)

Problems

Problem 6.1: ROC Construction

A study evaluates a biomarker for distinguishing lung cancer subtypes:

Biomarker Range	Type A (Positive)	Type B (Negative)
< 2	3	4
2–4	6	2
4–6	15	7
6–8	7	33
> 8	1	38

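Task: Construct the ROC curve using cut-points at 2, 4, 6, and 8. Treat Type A as the positive class and a biomarker value at or below the cut-point as a positive test, since Type A patients tend to have lower values.
Compute the AUC and interpret whether the test is clinically useful.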

Problem 6.2: Clinical Application

True or False: When screening for a serious, treatable disease (e.g., early-stage cancer), it is generally more important to have a test with high specificity than high sensitivity.

Answer: False. For treatable but serious diseases, high sensitivity is prioritized to minimize false negatives (missed cases). Specificity can be improved in follow-up confirmatory testing.

Problem 6.3: Definitions

The Positive Predictive Value (PPV) is calculated as:

(a) True Positives / Total Population
(b) True Negatives / (True Negatives + False Positives)
(c) True Positives / (True Positives + False Positives)
(d) True Negatives / Test Negatives

Answer: (c). PPV is the probability that a person with a positive test truly has the disease.

Problems 6.4–6.6: Performance Metrics

A new diabetes test yields:

	Disease Present	Disease Absent	Total
Test Positive	80	70	150
Test Negative	10	240	250
Total	90	310	400

6.4 Sensitivity: \[80/90 \approx 89\%\]
6.5 Specificity: \[240/310 \approx 77\%\]
6.6 PPV: \[80/150 \approx 53\%\]
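
These three answers can be checked directly from the 2×2 table. The sketch below stores the counts in a matrix (the object name `tab` is illustrative) and shows that sensitivity and specificity condition on the columns (true disease status), whereas PPV conditions on the row of positive tests.

```r
# Diabetes test results: rows = test result, columns = true disease status
tab <- matrix(c(80,  70,
                10, 240),
              nrow = 2, byrow = TRUE,
              dimnames = list(Test    = c("Positive", "Negative"),
                              Disease = c("Present", "Absent")))

sensitivity <- tab["Positive", "Present"] / sum(tab[, "Present"])   # 80 / 90
specificity <- tab["Negative", "Absent"]  / sum(tab[, "Absent"])    # 240 / 310
ppv         <- tab["Positive", "Present"] / sum(tab["Positive", ])  # 80 / 150

# Report as percentages
print(round(100 * c(sensitivity = sensitivity,
                    specificity = specificity,
                    ppv = ppv), 1))
```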

References

Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. *Radiology*. 1982;143(1):29–36.
Metz CE. Basic principles of ROC analysis. *Seminars in Nuclear Medicine*. 1978;8(4):283–298.
SOCR Data: Global Gray Matter Volume (GMV) Alzheimer’s Dataset
pROC and ROCR R package documentation

SOCR Home page: http://www.socr.umich.edu
