SMHS ParamInference
Scientific Methods for Health Sciences - Parametric Inference
Overview
In statistical inference, we aim to draw conclusions about an underlying population based on a sample drawn from it. For example, we often do this by estimating the parameters of a probability density function from observations. In an idealized case, we would have a perfect model with unknown parameters, and we would make inferences about the population by estimating those parameters from the available data. In this section, we introduce the concepts of random variables, parametric models, and inference based on these models.
Motivation
Consider the well-known example of flipping a coin 10 times. Experience tells us that the number of heads in one experiment of 10 flips follows a Binomial distribution with $ p=P(head) $ in a single flip. Here, we have chosen the model to be Binomial $ (n,p) $, where $ n=10 $. The next step is to determine the value of $ p $. An obvious way of doing this is to flip the coin many times (say 100), count the heads, and estimate $ p $ as the number of heads divided by 100, say $ 63/100 $. Based on this information, we model the number of heads in our experiment as Binomial $ (10,0.63) $. That is, we can infer that we would get an average of 6.3 heads per 10 flips if we repeated the experiment many times. So, what is a random variable? How do we build a parametric model based on the data? What kind of inference can we make based on the parametric model?
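The arithmetic in this example can be reproduced in R. The following is a minimal sketch; the variable names (heads, flips, p_hat, and so on) are illustrative and not part of the original example.

```r
# Estimate p from the 100 calibration flips described above
heads <- 63              # observed number of heads
flips <- 100             # number of calibration flips
p_hat <- heads / flips   # point estimate of p

# Expected number of heads in a 10-flip experiment under Binomial(10, p_hat)
n <- 10
expected_heads <- n * p_hat

# Probability of each possible head count (0 through 10) under the fitted model
head_probs <- dbinom(0:n, size = n, prob = p_hat)

print(p_hat)            # 0.63
print(expected_heads)   # 6.3
```

The vector head_probs holds the full fitted distribution, so its probability-weighted average of 0:n recovers the same expected value of 6.3.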
Theory
- Random variable: a variable whose value is subject to variations due to chance (i.e., randomness). It can take on a set of values, each with an associated probability for discrete variables or a probability density function for continuous variables. The value of a random variable represents the possible outcomes of a yet-to-be-performed experiment, or the possible outcomes of a past experiment whose already-existing value is uncertain. The possible values of a random variable and their associated probabilities (known as a probability distribution) can be further described with mathematical functions.
- There are two main types of random variables; a random variable can also be a mixture of both types:
- Discrete random variables: take values in a specified finite or countable list, with probabilities given by a probability mass function that characterizes the probability distribution;
- Continuous random variables: take any numerical value in an interval or collection of intervals, with probabilities given by a probability density function that characterizes the probability distribution.
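The difference between a probability mass function and a probability density function can be illustrated in R. This is a minimal sketch; the choice of a Binomial(10, 0.5) and a standard normal distribution is an assumption made only for illustration.

```r
# Discrete case: the Binomial(10, 0.5) probability mass function
pmf <- dbinom(0:10, size = 10, prob = 0.5)
total_mass <- sum(pmf)   # probabilities over all possible values sum to 1

# Continuous case: the standard normal probability density function
# The area under the whole density curve is 1
total_area <- integrate(dnorm, lower = -Inf, upper = Inf)$value

# For a continuous variable, probabilities come from areas over intervals,
# not from the density value at a single point
prob_interval <- pnorm(1) - pnorm(-1)   # P(-1 < X < 1)

print(total_mass)
print(total_area)
print(prob_interval)
```

The d-prefixed functions return the mass or density, while the p-prefixed functions return cumulative probabilities.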
- Parameter: a characteristic or measurable factor that helps define a particular system. It is an important element to consider when evaluating or understanding an event. For example, in statistics μ often denotes the mean and σ the standard deviation. The following table provides a list of commonly used parameters and symbols with descriptions:
Parameter | Description | Parameter | Description |
$\bar{x}$ | Sample mean | α,β,γ | Greek |
μ | Population mean | θ | Lower case for Theta |
σ | Population standard deviation | φ | Lower case for Phi |
$σ^2$ | Population variance | ω | Lower case for Omega |
s | Sample standard deviation | ∆ | Increment |
$s^2$ | Sample variance | ν | Nu |
λ | Poisson mean, Lambda | τ | Tau |
χ | χ distribution, Chi | η | Eta |
ρ | The density, Rho | τ | Sometimes used in tau function |
ϕ | Normal density function, Phi | Θ | Parameter space |
Γ | Gamma | Ω | Sample Space, Omega |
∂ | Partial derivative | δ | Lower case for Delta |
S | Sample space | Κ,k | Kappa |
- Parametric model: a collection of probability distributions that can be described using a finite number of parameters. These parameters are usually collected together to form a single k-dimensional parameter vector $\theta=(\theta_1,\theta_2,…,\theta_k)$. The main characteristic of a parametric model is that all of its parameters lie in a finite-dimensional parameter space.
- Each member $ p_θ $ of the parametric model is described by a finite-dimensional parameter $ θ $. The set of all allowable values for the parameter is denoted $ Θ⊆R^k $, and the model itself is written as $ P=\{p_θ |θ∈Θ\} $. When the model consists of absolutely continuous distributions, it is often specified in terms of the corresponding probability density functions, $ P=\{f_θ |θ∈Θ\} $. The model is identifiable if the mapping $ θ→p_θ $ is invertible, that is, there are no two different parameter values $ θ_1 $ and $ θ_2 $ such that $ p_{θ_1} =p_{θ_2} $.
- Consider one of the most popular distributions, the normal distribution, with parameter $ θ=(μ,σ) $, where $ μ∈R $ is a location parameter and $ σ>0 $ is a scale parameter. This parameterized family is:
$$ P=\{f_θ (x)=\frac{1}{\sqrt{2π}σ} e^{-\frac{(x-μ)^2}{2σ^2}} |μ∈R,σ>0\}.$$
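As a quick check on the normal family, the density can be coded directly from its standard form, $ f_θ(x)=\frac{1}{\sqrt{2π}σ} e^{-(x-μ)^2/(2σ^2)} $, and compared with R's built-in dnorm. This is a minimal sketch; the helper name normal_density and the values μ=1, σ=2 are illustrative assumptions.

```r
# Normal density written out from the formula, with theta = (mu, sigma)
normal_density <- function(x, mu, sigma) {
  stopifnot(sigma > 0)   # sigma is a scale parameter and must be positive
  1 / (sqrt(2 * pi) * sigma) * exp(-(x - mu)^2 / (2 * sigma^2))
}

# Evaluate both versions on a small grid of x values
x <- seq(-3, 3, by = 0.5)
by_hand  <- normal_density(x, mu = 1, sigma = 2)
built_in <- dnorm(x, mean = 1, sd = 2)

all.equal(by_hand, built_in)   # TRUE
```

Each pair (μ, σ) selects one member of the family, so changing the arguments moves (μ) or stretches (σ) the same bell-shaped curve.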
- Parametric inference: Often, we are interested in estimating $ \theta $, or more generally, a function of $ \theta $, say $ g(\theta) $. Let’s consider a few examples that will enable us to understand this.
- Let $ X_1,X_2,…,X_n $ be the outcomes of $n$ independent flips of the same coin. Here, we code $ X_i=1 $ if the $i^{th}$ toss produces a head and $ X_i=0 $ if it produces a tail. The parameter $ \theta $, the probability of flipping a head in a single toss, can be any number between 0 and 1. The $ X_i $'s are i.i.d., and their common distribution $ p_{\theta} $ is the Bernoulli $ (\theta) $ distribution, which has probability mass function $ f(x,\theta)=\theta^x (1-\theta)^{1-x}, x \in \{0,1\} $. If we repeat the experiment with the same coin many times, the average number of heads per $n$ flips will be $n \theta$.
- Let $ X_1,X_2,…,X_n $ be the numbers of customers that arrive at $n$ different identical counters in unit time. Then the $ X_i$'s can be thought of as i.i.d. random variables with a Poisson distribution with mean $ \theta $, which varies in the set $ (0,\infty) $, the parameter space $ \Theta $. The probability mass function is $ f(x,\theta)=e^{-\theta} \frac{\theta^{x}}{x!}$, for $x=0, 1, 2, ...$.
- After determining the parameters in the model, we can apply the characteristics of the distribution and the model to the data. The characteristics of various distributions are discussed further in the Distribution section. Hypothesis testing and estimation are also discussed later.
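The coin and customer-arrival examples above can be sketched in R by simulating data from each model and estimating $ \theta $ with the sample mean, which is the maximum likelihood estimate in both the Bernoulli and the Poisson model. The true values 0.63 and 3, the sample size, and the seed are assumptions chosen only for illustration.

```r
set.seed(1)   # fixed seed so the simulated data are reproducible

# Coin example: simulate n Bernoulli(theta) tosses
# (theta = 0.63 is an assumed value)
n <- 1000
theta_coin <- 0.63
tosses <- rbinom(n, size = 1, prob = theta_coin)
theta_hat_coin <- mean(tosses)     # sample proportion of heads

# Counter example: simulate n Poisson(theta) arrival counts
# (theta = 3 is an assumed value)
theta_pois <- 3
arrivals <- rpois(n, lambda = theta_pois)
theta_hat_pois <- mean(arrivals)   # sample mean of the counts

print(c(coin = theta_hat_coin, poisson = theta_hat_pois))
```

With 1000 observations, both estimates should land close to the values used to generate the data, and they tighten further as n grows.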
Random number generation
- R examples: generating random values from common distributions.
- Generate 10 random values that follow a normal distribution with mean 0 and variance 1, $ N(0,1) $. Use rnorm for this; runif draws from the uniform distribution on an interval instead. The values returned differ from run to run:
> rnorm(10,0,1)
- Generate 5 random values that follow a Poisson distribution with $\lambda = 2$:
> rpois(5,2)
[1] 3 2 1 4 1
- Generate 5 random values that follow a Binomial distribution with $ p = 0.3, n = 10 $:
> rbinom(5,10,0.3)
[1] 2 3 3 2 3
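Because these generators are pseudo-random, the draws can be made repeatable by fixing the seed. The following is a minimal sketch; the seed value 2024 and the sample sizes are arbitrary assumptions.

```r
# Fixing the seed makes pseudo-random draws repeatable
set.seed(2024)
first_draw <- rnorm(10, mean = 0, sd = 1)

set.seed(2024)
second_draw <- rnorm(10, mean = 0, sd = 1)

identical(first_draw, second_draw)   # TRUE: same seed, same numbers

# With a large sample, the sample mean and variance approach the
# model values (0 and 1 for the standard normal)
big_sample <- rnorm(100000, mean = 0, sd = 1)
print(c(mean = mean(big_sample), var = var(big_sample)))
```

Setting the seed at the top of a script is a common way to make simulation results reproducible for other readers.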
Applications
- The article titled Parametric Inference For Imperfectly Observed Gibbsian Fields presents a maximum likelihood estimation method for imperfectly observed Gibbsian fields on a finite lattice. The method adapts the algorithm given in Younes. The presentation of the new algorithm is followed by a theorem about the limit of the second derivative of the likelihood as the lattice grows, which is related to the convergence of the method. The paper also offers some practical remarks about implementing the procedure.
- Another article considers graphical models for annotation, phylogenetics, and alignment problems: hidden Markov models for annotation, tree models for phylogenetics, and pair hidden Markov models for alignment. A single algorithm, the sum-product algorithm, solves many of the inference problems associated with these different statistical models. The article introduces the polytope propagation algorithm for computing the Newton polytope of an observation from a graphical model. This algorithm is a geometric version of the sum-product algorithm and is used to analyze the parametric behavior of maximum a posteriori inference calculations for graphical models.
Software
- SOCR Distributions
- Bivariate Normal Experiment
- Multinomial Distribution
- Activities with Binomial Distributions
Problems
- Suppose we roll a fair die. What is the probability of rolling three sixes in a row? What kind of model are we making inferences about?
- Consider an unfair coin-flipping game, where the probability of flipping a head is unknown. Construct an experiment to estimate the probability of flipping a head in a single flip. What is the probability of getting 5 heads in 8 flips?
- Random number generators are commonly used in scientific studies. Explain how they work.
- Tom, a realtor, sells an average of 3 houses per day. What is the probability that exactly 4 houses will be sold tomorrow?
- Suppose that the average number of patients with cancer seen per day is 5. What is the probability that fewer than 4 patients with cancer will be seen on the next day?
References
- SOCR Home page: http://www.socr.umich.edu