Revision as of 11:31, 28 July 2014
The Scientific Methods for Health Sciences EBook is still under active development. It is expected to be complete by Sept 01, 2014, when this banner will be removed.
Contents
- 1 SOCR Wiki: Scientific Methods for Health Sciences
- 2 Preface
- 3 Chapter I: Fundamentals
- 3.1 Exploratory Data Analysis, Plots and Charts
- 3.2 Ubiquitous Variation
- 3.3 Parametric Inference
- 3.4 Probability Theory
- 3.5 Odds Ratio/Relative Risk
- 3.6 Probability Distributions
- 3.7 Resampling and Simulation
- 3.8 Design of Experiments
- 3.9 Intro to Epidemiology
- 3.10 Experiments vs. Observational Studies
- 3.11 Estimation
- 3.12 Hypothesis Testing
- 3.13 Statistical Power, Sensitivity and Specificity
- 3.14 Data Management
- 3.15 Bias and Precision
- 3.16 Association and Causality
- 3.17 Rate-of-change
- 3.18 Clinical vs. Statistical Significance
- 3.19 Correction for Multiple Testing
- 4 Chapter II: Applied Inference
- 4.1 Epidemiology
- 4.2 Correlation and Regression (ρ and slope inference, 1-2 samples)
- 4.3 ROC Curve
- 4.4 ANOVA
- 4.5 Non-parametric inference
- 4.6 Instrument Performance Evaluation: Cronbach's α
- 4.7 Measurement Reliability and Validity
- 4.8 Survival Analysis
- 4.9 Decision Theory
- 4.10 CLT/LLNs – limiting results and misconceptions
- 4.11 Association Tests
- 4.12 Bayesian Inference
- 4.13 PCA/ICA/Factor Analysis
- 4.14 Point/Interval Estimation (CI) – MoM, MLE
- 4.15 Study/Research Critiques
- 4.16 Common mistakes and misconceptions in using probability and statistics, identifying potential assumption violations, and avoiding them
- 5 Chapter III: Linear Modeling
- 5.1 Multiple Linear Regression (MLR)
- 5.2 Generalized Linear Modeling (GLM)
- 5.3 Analysis of Covariance (ANCOVA)
- 5.4 Multivariate Analysis of Variance (MANOVA)
- 5.5 Multivariate Analysis of Covariance (MANCOVA)
- 5.6 Repeated measures Analysis of Variance (rANOVA)
- 5.7 Partial Correlation
- 5.8 Time Series Analysis
- 5.9 Fixed, Randomized and Mixed Effect Models
- 5.10 Hierarchical Linear Models (HLM)
- 5.11 Multi-Model Inference
- 5.12 Mixture Modeling
- 5.13 Surveys
- 5.14 Longitudinal Data
- 5.15 Generalized Estimating Equations (GEE) Models
- 5.16 Model Fitting and Model Quality (KS-test)
- 6 Chapter IV: Special Topics
- 6.1 Scientific Visualization
- 6.2 PCOR/CER methods Heterogeneity of Treatment Effects
- 6.3 Big-Data/Big-Science
- 6.4 Missing data
- 6.5 Genotype-Environment-Phenotype associations
- 6.6 Medical imaging
- 6.7 Data Networks
- 6.8 Adaptive Clinical Trials
- 6.9 Databases/registries
- 6.10 Meta-analyses
- 6.11 Causality/Causal Inference, SEM
- 6.12 Classification methods
- 6.13 Time-series analysis
- 6.14 Scientific Validation
- 6.15 Geographic Information Systems (GIS)
- 6.16 Rasch measurement model/analysis
- 6.17 MCMC sampling for Bayesian inference
- 6.18 Network Analysis
SOCR Wiki: Scientific Methods for Health Sciences
Electronic book (EBook) on Scientific Methods for Health Sciences (coming up ...)
Preface
The Scientific Methods for Health Sciences (SMHS) EBook is designed to support a 4-course training of scientific methods for graduate students in the health sciences.
Format
Follow the instructions in this page to expand, revise or improve the materials in this EBook.
Learning and Instructional Usage
This section describes the means of traversing, searching, discovering and utilizing the SMHS EBook resources in both formal and informal learning settings.
Copyrights
The SMHS EBook is a freely and openly accessible electronic book developed by SOCR and the general community.
Chapter I: Fundamentals
Exploratory Data Analysis, Plots and Charts
Review of data types, exploratory data analyses and graphical representation of information.
Ubiquitous Variation
There are many ways to quantify variability, which is present in all natural processes.
Parametric Inference
Foundations of parametric (model-based) statistical inference.
Probability Theory
Random variables, stochastic processes, and events are the core concepts necessary to define likelihoods of certain outcomes or results to be observed. We define event manipulations and present the fundamental principles of probability theory including conditional probability, total and Bayesian probability laws, and various combinatorial ideas.
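The conditional and total probability laws mentioned above can be made concrete with a short example. The sketch below (in Python, with made-up prevalence and test-accuracy numbers, purely for illustration) applies Bayes' rule to compute the probability of disease given a positive diagnostic test:

```python
# Bayes' rule with hypothetical numbers (illustration only):
# prevalence P(D) = 1%, sensitivity P(+|D) = 95%, false-positive rate P(+|~D) = 5%
p_d = 0.01
p_pos_given_d = 0.95
p_pos_given_not_d = 0.05

# Law of total probability: P(+) = P(+|D)P(D) + P(+|~D)P(~D)
p_pos = p_pos_given_d * p_d + p_pos_given_not_d * (1 - p_d)

# Bayes' rule: P(D|+) = P(+|D)P(D) / P(+)
p_d_given_pos = p_pos_given_d * p_d / p_pos
print(round(p_d_given_pos, 3))  # 0.161: most positive tests are false positives
```

Despite the test's high sensitivity, the low prevalence means a positive result still corresponds to only about a 16% chance of disease, a classic illustration of why the prior matters.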
Odds Ratio/Relative Risk
The relative risk, RR (a measure of dependence comparing two probabilities in terms of their ratio), and the odds ratio, OR (the ratio of the odds, p/(1-p), of an event in two groups), are widely applicable in many healthcare studies.
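Both quantities follow directly from a 2x2 exposure-by-outcome table. A minimal sketch, using made-up counts for illustration:

```python
# Hypothetical 2x2 table (counts are invented for illustration):
#             Disease   No disease
# Exposed       a=20       b=80
# Unexposed     c=10       d=90
a, b, c, d = 20, 80, 10, 90

risk_exposed = a / (a + b)        # P(disease | exposed)   = 0.20
risk_unexposed = c / (c + d)      # P(disease | unexposed) = 0.10

rr = risk_exposed / risk_unexposed   # relative risk: ratio of the two risks
odds_ratio = (a / b) / (c / d)       # odds ratio: ad/bc

print(rr)                   # 2.0
print(round(odds_ratio, 2)) # 2.25
```

Note that when the outcome is rare, the OR approximates the RR; here the outcome is common enough (10-20%) that the two diverge noticeably.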
Probability Distributions
Probability distributions are mathematical models for processes that we observe in nature. Although there are different types of distributions, they have common features and properties that make them useful in various scientific applications.
Resampling and Simulation
Resampling is a technique for estimation of sample statistics (e.g., medians, percentiles) by using subsets of available data or by randomly drawing data with replacement. Simulation is a computational technique that imitates the behavior of a real-world process or system over time, without waiting for it to happen by chance.
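A common resampling application is the bootstrap. The sketch below (Python standard library only; the data vector is invented for illustration) resamples with replacement to build a percentile interval for a sample median:

```python
import random
from statistics import median

random.seed(42)  # fixed seed for reproducibility
data = [2.1, 3.5, 2.9, 4.0, 3.3, 2.7, 3.8, 3.1]  # hypothetical sample

# Bootstrap: estimate the sampling variability of the median by
# repeatedly resampling the data with replacement.
n_boot = 2000
boot_medians = sorted(
    median(random.choices(data, k=len(data))) for _ in range(n_boot)
)

# 95% percentile interval for the median
lo, hi = boot_medians[int(0.025 * n_boot)], boot_medians[int(0.975 * n_boot)]
print(lo, hi)
```

The endpoints are order statistics of the bootstrap distribution, so they always lie within the range of the observed data.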
Design of Experiments
Design of experiments (DOE) is a technique for systematic and rigorous problem solving that applies data collection principles to ensure the generation of valid, supportable and reproducible conclusions.
Intro to Epidemiology
Epidemiology is the study of the distribution and determinants of disease frequency in human populations. This section presents the basic epidemiology concepts. More advanced epidemiological methodologies are discussed in the next chapter.
Experiments vs. Observational Studies
Experimental and observational studies have different characteristics and are useful in complementary investigations of association and causality.
Estimation
Estimation is a method of using sample data to approximate the values of specific population parameters of interest like population mean, variability or 97th percentile. Estimated parameters are expected to be interpretable, accurate and optimal, in some form.
Hypothesis Testing
Hypothesis testing is a quantitative decision-making technique for examining the characteristics (e.g., centrality, span) of populations or processes based on observed experimental data.
Statistical Power, Sensitivity and Specificity
The fundamental concepts of type I (false-positive) and type II (false-negative) errors lead to the important study-specific notions of statistical power, sample size, effect size, sensitivity and specificity.
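Sensitivity and specificity are simple functions of the four cells of a confusion matrix. A minimal sketch with hypothetical counts:

```python
# Hypothetical diagnostic-test counts (illustration only)
tp, fn = 90, 10   # diseased subjects: test positive / test negative
tn, fp = 80, 20   # healthy subjects:  test negative / test positive

sensitivity = tp / (tp + fn)  # true-positive rate: P(test+ | diseased)
specificity = tn / (tn + fp)  # true-negative rate: P(test- | healthy)
print(sensitivity, specificity)  # 0.9 0.8
```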
Data Management
All modern data-driven scientific inquiries demand deep understanding of tabular, ASCII, binary, streaming, and cloud data management, processing and interpretation.
Bias and Precision
Bias and precision are two important and complementary characteristics of estimated parameters that quantify the accuracy and variability of approximated quantities.
Association and Causality
An association is a relationship between two, or more, measured quantities that renders them statistically dependent, so that the occurrence of one affects the probability of the other. A causal relation is a specific type of association between an event (the cause) and a second event (the effect) that is considered to be a consequence of the first event.
Rate-of-change
Rate of change is a technical indicator describing the rate in which one quantity changes in relation to another quantity.
Clinical vs. Statistical Significance
Statistical significance addresses the question of whether or not the results of a statistical test meet an accepted quantitative criterion, whereas clinical significance addresses the question of whether the observed difference between two treatments (e.g., a new and an old therapy) found in the study is large enough to alter clinical practice.
IV. HS 850: Fundamentals
Multiple Testing Inference
1) Overview: Multiple testing refers to situations where several hypotheses are tested simultaneously. This is very common in empirical research, and methods beyond the traditional single-test rules need to be applied to adjust for the multiple testing problem. In this lecture, we introduce the area of multiple testing: we present the basic concepts, discuss the general problems that arise, and cover ways to address them efficiently, including the Bonferroni correction, Tukey's procedure, the family-wise error rate (FWER), and the false discovery rate (FDR).
2) Motivation: We have learned how to perform hypothesis testing with statistical tests. However, multiple testing problems arise when one considers a set of statistical inferences simultaneously, or infers a subset of parameters selected based on the observed values. So what can we do to adjust for multiple testing? How can we keep the prescribed family-wise error rate of α in an analysis involving more than one comparison? Clearly, the error rate for each individual comparison must be more stringent than α. Multiple testing corrections are the way to go, and we introduce some commonly used methods for adjusting for this type of error.
3) Theory
3.1) Family-Wise Error Rate (FWER): the probability of making at least one type I error among all the hypotheses when performing multiple hypothesis tests. FWER exerts a more stringent control over false discovery than false-discovery-rate controlling procedures. Suppose we perform simultaneous tests on m hypotheses H_1,H_2,…,H_m with corresponding p-values p_1,p_2,…,p_m. Let I_0 be the subset of true null hypotheses, with m_0 denoting its size. Our aim is to achieve an overall type I error rate of α from this multiple testing. With V denoting the number of false positives (see the table below), FWER=Pr(V≥1)=1-Pr(V=0). By ensuring FWER≤α, the probability of making even one type I error in the family is controlled at level α.
{| class="wikitable" style="text-align:center; width:75%" border="1"
|-
!  !! Null hypothesis is True !! Alternative hypothesis is True !! Total
|-
| Declared significant || V (number of false positives) || S (number of true positives) || R
|-
| Declared non-significant || U (number of true negatives) || T (number of false negatives) || m-R
|-
| Total || m_0 (number of true null hypotheses) || m-m_0 (number of true alternatives) || m
|}
A procedure controls the FWER in the weak sense if the FWER control at level α is guaranteed only when all hypotheses are true. A procedure controls the FWER in the strong sense if the FWER control at level α is guaranteed for any configuration of true and non-true null hypotheses.
Controlling the FWER:

Bonferroni correction: rejecting all p_i≤α/m controls FWER≤α, which is proved through Boole's inequality: FWER = Pr(⋃_{i∈I_0} {p_i ≤ α/m}) ≤ ∑_{i∈I_0} Pr(p_i ≤ α/m) ≤ m_0(α/m) ≤ m(α/m) = α. This is the simplest and most conservative method to control the FWER, though it can be overly conservative when there are a large number of tests and/or the test statistics are positively correlated. It controls the probability of false positives only.

Tukey's procedure: only applicable for pairwise comparisons. It assumes independence of the observations being tested as well as equal variation across observations. For each pair, the procedure calculates the standardized range statistic (Y_A-Y_B)/SE, where Y_A is the larger of the two means being compared, Y_B is the smaller one, and SE is the standard error of the data.

Šidák procedure: works for independent tests, where each hypothesis is tested at level α_SID = 1-(1-α)^(1/m). This is slightly more powerful than Bonferroni, but the gain is small.

Holm's step-down procedure: order the p-values from lowest to highest as p_(1),p_(2),…,p_(m), with corresponding hypotheses H_(1),H_(2),…,H_(m). Let R be the smallest k such that p_(k) > α/(m+1-k). Reject the null hypotheses H_(1),H_(2),…,H_(R-1); if R=1, none of the hypotheses are rejected. This method is uniformly more powerful than Bonferroni's; it is based on the Bonferroni bound and places no restriction on the joint distribution of the test statistics.

Hochberg's step-up procedure: order the p-values from lowest to highest as p_(1),p_(2),…,p_(m), with corresponding hypotheses H_(1),H_(2),…,H_(m). For a given α, let R be the largest k such that p_(k) ≤ α/(m+1-k). Reject the null hypotheses H_(1),H_(2),…,H_(R). It is more powerful than Holm's; however, it is based on the Simes test, so it holds only under independence (and under some forms of positive dependence).
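The two step-wise procedures just described can be sketched in Python (the p-values below are hypothetical, chosen only to show that Hochberg can reject more than Holm):

```python
def holm_reject(pvals, alpha=0.05):
    """Holm step-down: walk p-values from smallest to largest; stop at the
    first k with p_(k) > alpha/(m+1-k), rejecting everything before it."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    reject = [False] * m
    for k, i in enumerate(order, start=1):
        if pvals[i] > alpha / (m + 1 - k):
            break
        reject[i] = True
    return reject


def hochberg_reject(pvals, alpha=0.05):
    """Hochberg step-up: find the largest k with p_(k) <= alpha/(m+1-k)
    and reject the k hypotheses with the smallest p-values."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    reject = [False] * m
    for k in range(m, 0, -1):
        if pvals[order[k - 1]] <= alpha / (m + 1 - k):
            for j in range(k):
                reject[order[j]] = True
            break
    return reject


p = [0.01, 0.04, 0.03, 0.005]
print(holm_reject(p))      # Holm rejects only the two smallest p-values
print(hochberg_reject(p))  # Hochberg rejects all four (it is more powerful)
```

On this example Holm stops at the third ordered p-value (0.03 > 0.05/2), while Hochberg succeeds already at k=4 (0.04 ≤ 0.05/1) and rejects everything.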
3.4) FDR (false discovery rate): a statistical method used in multiple hypothesis testing to adjust for multiple comparisons. It is designed to control the expected proportion of incorrectly rejected null hypotheses. Compared to the FWER, it exerts a less stringent control over false discovery: it controls the expected proportion of false discoveries rather than the probability of making even one, and thereby enjoys greater power at the cost of an increased rate of type I errors.
Using the notation from the table above, define Q as the proportion of false discoveries among the discoveries, Q=V/R. Then the FDR is defined as FDR = Q_e = E[Q] = E[V/(V+S)] = E[V/R], where V/R is defined to be 0 when R=0. Our aim is to keep the FDR below a threshold α (or q). The q-value is the FDR analogue of the p-value: the q-value of an individual hypothesis test is the minimum FDR at which the test may be called significant.
Controlling procedures for the FDR: With m null hypotheses H_1,H_2,…,H_m and corresponding p-values p_1,p_2,…,p_m, order the p-values in increasing order and denote them p_(1),p_(2),…,p_(m).

Benjamini-Hochberg (BH) procedure: controls the FDR at level α. For a given α, find the largest k such that p_(k) ≤ (k/m)α; then reject all H_(i) for i=1,…,k. This method is valid when the m tests are independent, and also in some cases of dependence; it guarantees E(Q) ≤ (m_0/m)α ≤ α.

Benjamini-Hochberg-Yekutieli (BY) procedure: controls the FDR under more general dependence assumptions. It modifies the BH condition to p_(k) ≤ k/(m·c(m)) α. If the tests are independent or positively correlated, choose c(m)=1; for arbitrarily dependent tests, choose c(m)=∑_{i=1}^m 1/i, which can be approximated by ∑_{i=1}^m 1/i ≈ ln(m)+γ (γ being the Euler-Mascheroni constant).
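A minimal sketch of the BH step-up rule (the p-values are hypothetical; for real analyses R's p.adjust or equivalent library code is preferable):

```python
def benjamini_hochberg(pvals, alpha=0.05):
    """BH step-up: find the largest k with p_(k) <= (k/m)*alpha and
    reject the k hypotheses with the smallest p-values."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    reject = [False] * m
    for k in range(m, 0, -1):
        if pvals[order[k - 1]] <= (k / m) * alpha:
            for j in range(k):
                reject[order[j]] = True
            break
    return reject


p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
print(benjamini_hochberg(p, alpha=0.05))  # only the two smallest survive
```

Here k=2 is the largest index satisfying p_(k) ≤ (k/8)·0.05 (0.008 ≤ 0.0125), so exactly two hypotheses are rejected.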
Example: Suppose we have computed a vector of p-values (p_1,p_2,…,p_n). Let's compare the corrections using different strategies. In R, p.adjust takes a set of p-values and returns p-values adjusted using one of several methods: c("holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none").
 > p.adjust(c(0.05,0.05,0.1),"bonferroni")
 [1] 0.15 0.15 0.30
 > p.adjust(c(0.05,0.05,0.1),"fdr")
 [1] 0.075 0.075 0.100
 > p.adjust(c(0.05,0.05,0.1),"holm")
 [1] 0.15 0.15 0.15
4) Applications
4.1) This article (http://wiki.stat.ucla.edu/socr/index.php/SOCR_EduMaterials_AnalysesCommandLineFDR_Correction) presents information on how to use the SOCR analyses library to compute the False Discovery Rate (FDR) correction for multiple testing in volumetric and shape-based analyses. It provides the specific procedure for computing the FDR with SOCR in multiple testing settings, illustrates it with examples, and supplies supplementary information about the FDR.
4.2) This article (http://home.uchicago.edu/amshaikh/webfiles/palgrave.pdf) is a comprehensive introduction to multiple testing. It describes the problem of multiple testing more formally and discusses methods that account for the multiplicity issue. In particular, recent developments based on resampling yield an improved ability to reject false hypotheses compared to classical methods such as Bonferroni.
5) Software
http://bioinformatics.oxfordjournals.org/content/21/12/2921.full
http://socr.ucla.edu/htmls/SOCR_Analyses.html
http://graphpad.com/quickcalcs/PValue1.cfm
http://wiki.stat.ucla.edu/socr/index.php/SOCR_EduMaterials_AnalysesCommandLineFDR_Correction
6) Problems
6.1) Suppose a study is conducted to test a new drug and 10 hypotheses are being tested simultaneously. Using the Bonferroni correction, calculate the significance level of each individual test needed to maintain an overall type I error of 5%, and the probability of observing at least one significant result under this correction.
6.2) Consider a study of a new cancer drug with three treatments: the new medicine, the old medicine, and the combination of the two. We perform pairwise tests on these three treatments and want to maintain a type I error rate of 5%. Consider Tukey's correction and describe how you would apply it here.
7) References
- http://mirlyn.lib.umich.edu/Record/004199238
- http://mirlyn.lib.umich.edu/Record/004232056
- http://mirlyn.lib.umich.edu/Record/004133572
Correction for Multiple Testing
Multiple testing refers to analytical protocols involving testing of several (typically more than two) hypotheses. Multiple testing studies require correction for the type I (false-positive) error rate, which can be done using Bonferroni's method, Tukey's procedure, family-wise error rate (FWER) control, or the false discovery rate (FDR).
Chapter II: Applied Inference
Epidemiology
Correlation and Regression (ρ and slope inference, 1-2 samples)
ROC Curve
ANOVA
Non-parametric inference
Instrument Performance Evaluation: Cronbach's α
Measurement Reliability and Validity
Survival Analysis
Decision Theory
CLT/LLNs – limiting results and misconceptions
Association Tests
Bayesian Inference
PCA/ICA/Factor Analysis
Point/Interval Estimation (CI) – MoM, MLE
Study/Research Critiques
Common mistakes and misconceptions in using probability and statistics, identifying potential assumption violations, and avoiding them
Chapter III: Linear Modeling
Multiple Linear Regression (MLR)
Generalized Linear Modeling (GLM)
Analysis of Covariance (ANCOVA)
First, see the ANOVA section above.
Multivariate Analysis of Variance (MANOVA)
Multivariate Analysis of Covariance (MANCOVA)
Repeated measures Analysis of Variance (rANOVA)
Partial Correlation
Time Series Analysis
Fixed, Randomized and Mixed Effect Models
Hierarchical Linear Models (HLM)
Multi-Model Inference
Mixture Modeling
Surveys
Longitudinal Data
Generalized Estimating Equations (GEE) Models
Model Fitting and Model Quality (KS-test)
Chapter IV: Special Topics
Scientific Visualization
PCOR/CER methods Heterogeneity of Treatment Effects
Big-Data/Big-Science
Missing data
Genotype-Environment-Phenotype associations
Medical imaging
Data Networks
Adaptive Clinical Trials
Databases/registries
Meta-analyses
Causality/Causal Inference, SEM
Classification methods
Time-series analysis
Scientific Validation
Geographic Information Systems (GIS)
Rasch measurement model/analysis
MCMC sampling for Bayesian inference
Network Analysis
- SOCR Home page: http://www.socr.umich.edu