SMHS ParamInference

From SOCR
Revision as of 15:57, 25 July 2014 by WikiSysop (talk | contribs)

Scientific Methods for Health Sciences - Parametric Inference

IV. HS 850: Fundamentals

Parametric Inference

1) Overview: Statistics aims to retrieve the 'causes' (e.g., the parameters of a probability density function) from observations. In statistical inference, we aim to collect information about the underlying population based on a sample drawn from it. The ideal case is to find a suitable model with unknown parameters, from which we can make further inferences about the population, and whose parameters can be determined from the data we have. In this lecture, we introduce the concepts of random variables and parametric models, and show how to make inferences based on a parametric model.


2) Motivation: Consider the well-known example of flipping a coin 10 times. Experience tells us that the number of heads in one experiment of 10 flips follows a Binomial distribution with $ p=P(head) $ in one flip. Here, we have chosen the model to be Binomial $ (n,p) $, where $ n=10 $. The next step is to determine the value of $ p $. An obvious way of doing this is to flip the coin many times (say 100), count the number of heads, and estimate $ p $ by the number of heads in the 100 flips divided by 100, say $ 63/100 $. Based on this information, the number of heads in our experiment follows a Binomial $ (10,0.63) $ distribution. That is, we can infer that we will flip an average of 6.3 heads per 10 flips if we repeat the experiment enough times. So, what is a random variable? How do we build a parametric model from data? What kind of inference can we make based on a parametric model?
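This estimation step can be sketched in R (the language used in the examples below); the true value p = 0.63 and the sample size of 100 are illustrative assumptions mirroring the 63/100 above:

```r
# Simulate 100 flips of a coin with (assumed) true p = 0.63 and estimate p
# by the observed proportion of heads.
set.seed(1)                       # fix the seed so the sketch is reproducible
flips <- rbinom(100, size = 1, prob = 0.63)   # 1 = head, 0 = tail
p.hat <- mean(flips)              # estimate of p = P(head)
p.hat                             # close to 0.63 for a sample this size
10 * p.hat                        # estimated average number of heads in 10 flips
```

With p estimated, the Binomial(10, p.hat) model can then be used for further inference about 10-flip experiments.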


3) Theory

  • 3.1) Random variable: a variable whose value is subject to variations due to chance (i.e., randomness). It can take on a set of values, each with an associated probability for discrete variables or a probability density function for continuous variables. The value of a random variable represents the possible outcomes of a yet-to-be-performed experiment, or the possible outcomes of a past experiment whose already-existing value is uncertain. The possible values of a random variable and their associated probabilities (known as a probability distribution) can be further described with mathematical functions.

There are two types of random variables. Discrete random variables take on a specified finite or countable list of values and are endowed with a probability mass function characteristic of their probability distribution. Continuous random variables take on any numerical value in an interval or collection of intervals, via a probability density function characteristic of their probability distribution. Mixtures of both types are also possible.
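The distinction can be illustrated with R's built-in distribution functions (the particular distributions chosen here are illustrative):

```r
# Discrete vs. continuous: a Binomial(10, 0.5) variable has a probability
# mass function, while a Normal(0, 1) variable has a probability density.
dbinom(6, size = 10, prob = 0.5)   # P(X = 6), an actual probability
sum(dbinom(0:10, 10, 0.5))         # pmf values over all outcomes sum to 1
dnorm(0)                           # a density value, not a probability
```

For a continuous variable, probabilities come from integrating the density (e.g., via pnorm), not from evaluating it at a point.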


  • 3.2) Parameters: a characteristic, or measurable factor, that can help define a particular system. It is an important element to consider in the evaluation or comprehension of an event. For example, μ is often used for the mean and σ for the standard deviation in statistics. The following table provides a list of commonly used parameters with descriptions:

Parameter	Description
x̅	Sample mean
μ	Population mean
σ	Population standard deviation
σ^2	Population variance
s	Sample standard deviation
s^2	Sample variance
λ	Poisson mean, Lambda
χ	χ distribution, Chi
ρ	The density, Rho
ϕ	Normal density function, Phi
Γ	Gamma
/	Per / divided
S	Sample space
α, β, γ	Greek letters
θ	Lower case Theta
φ	Lower case Phi
ω	Lower case Omega
Δ	Increment, capital Delta
ν	Nu
τ	Tau; sometimes used in the tau function
η	Eta
Θ	Parameter space, capital Theta
Ω	Sample space, Omega
δ	Lower case Delta
Κ, k	Kappa


3.3) Parametric model: a collection of probability distributions that can be described using a finite number of parameters. These parameters are usually collected together to form a single k-dimensional parameter vector $ θ=(θ_1,θ_2,…,θ_k) $. The main characteristic of a parametric model is that all the parameters lie in finite-dimensional parameter spaces.

Each member $ p_θ $ of the collection is described by a finite-dimensional parameter $ θ $. The set of all allowable values for the parameter is denoted $ Θ⊆R^k $, and the model itself is written as $ P=\{p_θ | θ∈Θ\} $. When the model consists of absolutely continuous distributions, it is often specified in terms of the corresponding probability density functions: $ P=\{f_θ | θ∈Θ\} $. The model is identifiable if the mapping $ θ→p_θ $ is invertible, that is, there are no two different parameter values $ θ_1 $ and $ θ_2 $ such that $ p_{θ_1}=p_{θ_2} $.


Consider one of the most popular examples, the normal distribution, where the parameter is $ θ=(μ,σ) $, with $ μ∈R $ a location parameter and $ σ>0 $ a scale parameter. This parameterized family is: $ P=\{f_θ(x)=\frac{1}{σ\sqrt{2π}} e^{-\frac{(x-μ)^2}{2σ^2}} | μ∈R, σ>0\} $.
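As a quick numerical check of this family, the density formula can be evaluated directly and compared against R's built-in dnorm (the particular values of μ, σ, and x below are arbitrary choices):

```r
# Evaluate the N(mu, sigma) density from the formula above and compare it
# with R's built-in dnorm at a grid of points.
mu <- 2; sigma <- 1.5
x <- seq(-3, 7, by = 0.5)
f <- 1 / (sigma * sqrt(2 * pi)) * exp(-(x - mu)^2 / (2 * sigma^2))
all.equal(f, dnorm(x, mean = mu, sd = sigma))   # the two agree
```

Each choice of (μ, σ) picks out one member f_θ of the family P.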


3.4) Parametric inference: here, we would be interested in estimating $ θ $, or more generally, a function of $ θ $, say $ g(θ) $. Let’s consider a few examples that will enable us to understand this.

    • Let $ x_1,x_2,…,x_n $ be the outcomes of n independent flips of the same coin. Here, we code $ X_i=1 $ if the ith toss produces a head and $ X_i=0 $ if it produces a tail. Then $ θ $, the probability of flipping a head in a single toss, can be any number between 0 and 1. The $ x_i $'s are i.i.d., and the common distribution $ p_θ $ is the Bernoulli $ (θ) $ distribution, which has probability mass function $ f(x,θ)=θ^x (1-θ)^{1-x}, x∈\{0,1\} $. If we repeat the experiment with the same coin enough times, the average number of heads will be $ nθ $.
    • Let $ x_1,x_2,…,x_n $ be the numbers of customers that arrive at n different identical counters in unit time. Then the $ X_i's $ can be thought of as i.i.d. random variables with a Poisson distribution with mean $ θ $, which varies in the set $ (0,∞) $, representing the parameter space $ Θ $. The probability mass function is $ f(x,θ)=\frac{e^{-θ} θ^{x}}{x!} $.
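In both examples the sample mean is the natural estimate of θ; a short R sketch (the true parameter values 0.3 and 4 and the sample size are illustrative assumptions):

```r
# In both the Bernoulli and the Poisson example, estimate theta by the
# sample mean of the observations.
set.seed(42)
coin <- rbinom(1000, size = 1, prob = 0.3)    # Bernoulli(theta = 0.3) flips
mean(coin)                                    # estimate of theta, near 0.3
arrivals <- rpois(1000, lambda = 4)           # Poisson(theta = 4) counts
mean(arrivals)                                # estimate of theta, near 4
```

In both cases the sample mean is also the maximum likelihood estimate of θ.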


After determining the parameters in the model, we will be able to apply the characteristics of the distribution and the model to the data. The characteristics of various distributions are discussed further in the Distribution section. We will also discuss hypothesis testing and estimation later.

R examples:

Random number generator: suppose the random variable follows a normal distribution, say $ N(0,1) $. ## random number generator to generate 10 random variables following a normal distribution with mean 0, variance 1:

> rnorm(10, 0, 1)

(The output varies from run to run; use set.seed() to make it reproducible.)


  1. generate 5 random variables following a Poisson distribution with lambda = 2

> rpois(5,2)

[1] 3 2 1 4 1


  2. generate 5 random variables following a Binomial distribution with $ p = 0.3, n = 10 $

> rbinom(5,10,0.3)

[1] 2 3 3 2 3
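The outputs above vary from run to run; fixing the seed makes a simulation reproducible (the seed value 2014 is an arbitrary choice):

```r
# Resetting the seed before each call makes the generator reproduce the
# same pseudo-random sequence.
set.seed(2014)
rpois(5, 2)
set.seed(2014)
rpois(5, 2)   # identical to the previous call, because the seed was reset
```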



4) Applications

4.1) This article (http://link.springer.com/article/10.1007/BF00341287) titled Parametric Inference For Imperfectly Observed Gibbsian Fields presents a maximum likelihood estimation method for imperfectly observed Gibbsian fields on a finite lattice. This method is an adaptation of the algorithm given in Younes. Presentation of the new algorithm is followed by a theorem about the limit of the second derivative of the likelihood when the lattice increases, which is related to convergence of the method. The paper offers some practical remarks about the implementation of the procedure.


4.2) This article (http://www.pnas.org/content/101/46/16138.short) studies graphical models that have been applied to problems in computational biology, including hidden Markov models for annotation, tree models for phylogenetics, and pair hidden Markov models for alignment. A single algorithm, the sum-product algorithm, solves many of the inference problems that are associated with different statistical models. This article introduces the polytope propagation algorithm for computing the Newton polytope of an observation from a graphical model. This algorithm is a geometric version of the sum-product algorithm and is used to analyze the parametric behavior of maximum a posteriori inference calculations for graphical models.


5) Software

http://socr.ucla.edu/htmls/SOCR_Distributions.html

http://socr.ucla.edu/htmls/exp/Bivariate_Normal_Experiment.html

http://socr.ucla.edu/htmls/dist/Multinomial_Distribution.html

http://wiki.stat.ucla.edu/socr/index.php/SOCR_EduMaterials_Activities_Binomial_Distributions



6) Problems

6.1) Suppose we are rolling a fair die. What is the probability that we roll three sixes in a row? What kind of model are we inferring on?


6.2) Consider an unfair coin-flipping game, where the probability of flipping a head is unknown. Construct an experiment to estimate the probability of flipping a head in a single flip. What is the probability that we flip 5 heads out of 8 flips?


6.3) Random number generators are commonly used in scientific studies. Explain how they work.


6.4) The average number of homes sold by realtor Tom is 3 houses per day. What is the probability that exactly 4 houses will be sold tomorrow?


6.5) Suppose that the average number of patients with cancer seen per day is 5. What is the probability that fewer than 4 patients with cancer will be seen the next day?


7) References

http://www.itl.nist.gov/div898/handbook/eda/eda.htm

http://mirlyn.lib.umich.edu/Record/000252958

http://mirlyn.lib.umich.edu/Record/012841334



ANSWERS:

6.4) This is a Poisson experiment in which we have the following parameters: μ=3, since 3 houses are sold per day on average; x=4, since we want the likelihood that 4 houses will be sold tomorrow.

$ p(x;μ)=\frac{e^{-μ}μ^{x}}{x!}, \quad p(4;3)=\frac{e^{-3}3^{4}}{4!}≈0.168 $
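The hand calculation can be checked with R's Poisson probability mass function:

```r
# P(X = 4) for a Poisson variable with mean 3, two equivalent ways.
dpois(4, lambda = 3)                      # built-in pmf
exp(-3) * 3^4 / factorial(4)              # same value from the formula
round(dpois(4, lambda = 3), 3)            # 0.168, matching the answer above
```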


6.5) This is a Poisson experiment in which we know the following parameters: μ=5, since 5 patients with cancer are seen per day on average; $ x=0,1,2, or 3 $, since we want 0, 1, 2, or 3 patients with cancer to be seen the next day.

$ p(x;μ)=\frac{e^{-μ}μ^{x}}{x!}, \quad p(x≤3;5)=p(0;5)+p(1;5)+p(2;5)+p(3;5) $


$ =\frac{e^{-5}5^{0}}{0!}+\frac{e^{-5}5^{1}}{1!}+\frac{e^{-5}5^{2}}{2!}+\frac{e^{-5}5^{3}}{3!} $

$ ≈0.0067+0.0337+0.0842+0.1404≈0.2650 $
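This cumulative calculation can likewise be verified in R:

```r
# P(X <= 3) for a Poisson variable with mean 5, two equivalent ways.
ppois(3, lambda = 5)                      # built-in cumulative probability
sum(dpois(0:3, lambda = 5))               # same, summing the pmf terms
round(ppois(3, lambda = 5), 4)            # 0.2650, matching the answer above
```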














