Scientific Methods for Health Sciences - Parametric Inference
IV. HS 850: Fundamentals
Parametric Inference
1) Overview: Statistics aims to retrieve the ‘causes’ (e.g., the parameters of a probability density function) from observations. In statistical inference, we aim to collect information about the underlying population based on a sample drawn from it. The ideal case is to find a suitable model with unknown parameters, determine those parameters from the data we have, and then make further inference about the population. In this lecture, we introduce the concepts of random variables, parametric models, and inference based on a parametric model.
2) Motivation: Consider the well-known example of flipping a coin 10 times. Experience tells us that the number of heads in one experiment of 10 flips follows a Binomial distribution with $ p=P(head) $ in one flip. Here, we have chosen the model to be Binomial $ (n,p) $ with $ n=10 $, so the next step is to determine the value of $ p $. An obvious way to do this is to flip the coin many times (say 100), count the number of heads, and estimate $ p $ as the number of heads divided by 100, say $ 63/100 $. Based on this information, the number of heads in our experiment follows a Binomial $ (10,0.63) $ distribution; that is, we infer that we will flip an average of 6.3 heads per 10 flips if we repeat the experiment enough times. So, what is a random variable? How do we build a parametric model from data? What kind of inference can we make based on the parametric model?
3) Theory
- 3.1) Random variable: a variable whose value is subject to variations due to chance (i.e., randomness). It can take on a set of values, each with an associated probability for discrete variables or a probability density function for continuous variables. The value of a random variable represents the possible outcomes of a yet-to-be-performed experiment, or the possible outcomes of a past experiment whose already-existing value is uncertain. The possible values of a random variable and their associated probabilities (known as a probability distribution) can be further described with mathematical functions.
There are two types of random variables. Discrete random variables take on a specified finite or countable list of values and are endowed with a probability mass function characteristic of their probability distribution. Continuous random variables take on any numerical value in an interval or collection of intervals, via a probability density function characteristic of their probability distribution. A random variable may also be a mixture of both types.
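As a brief illustration (a sketch using R's built-in distribution functions), probabilities for a discrete variable come directly from its probability mass function, while probabilities for a continuous variable come from integrating its density:
## discrete: P(X = 3) for X ~ Binomial(10, 0.3), via the probability mass function
> dbinom(3, size = 10, prob = 0.3)    ## ≈ 0.2668
## continuous: P(0 ≤ X ≤ 1) for X ~ N(0,1), by integrating the density
> integrate(dnorm, lower = 0, upper = 1)
## equivalently, via the cumulative distribution function
> pnorm(1) - pnorm(0)                 ## ≈ 0.3413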
- 3.2) Parameters: a characteristic or measurable factor that helps define a particular system; it is an important element to consider in evaluating or comprehending an event. For example, μ is often used for the mean and σ for the standard deviation in statistics. The following table provides a list of commonly used parameters with descriptions:
Parameter | Description | Parameter | Description
x̄ | Sample mean | α, β, γ | Generic Greek-letter parameters
μ | Population mean | θ | Lower case theta
σ | Population standard deviation | φ | Lower case phi
σ^2 | Population variance | ω | Lower case omega
s | Sample standard deviation | ∆ | Increment (capital delta)
s^2 | Sample variance | ν | Nu
λ | Poisson mean, lambda | τ | Tau
χ | Chi (as in the χ^2 distribution) | η | Eta
ρ | The density, rho | τ | Sometimes used in the tau function
ϕ | Normal density function, phi | Θ | Parameter space (capital theta)
Γ | Gamma | Ω | Sample space (capital omega)
∂ | Partial derivative | δ | Lower case delta
S | Sample space | Κ, k | Kappa
3.3) Parametric model: a collection of probability distributions that can be described using a finite number of parameters. These parameters are usually collected together to form a single k-dimensional parameter vector $ θ=(θ_1,θ_2,…,θ_k) $. The main characteristic of a parametric model is that all the parameters lie in finite-dimensional parameter spaces.
Each member $ p_θ $ of the collection is described by a finite-dimensional parameter $ θ $. The set of all allowable values for the parameter is denoted $ Θ⊆R^k $, and the model itself is written as $ P=\{p_θ \mid θ∈Θ\} $. When the model consists of absolutely continuous distributions, it is often specified in terms of the corresponding probability density functions: $ P=\{f_θ \mid θ∈Θ\} $. The model is identifiable if the mapping $ θ→p_θ $ is invertible, that is, there are no two different parameter values $ θ_1 $ and $ θ_2 $ such that $ p_{θ_1}=p_{θ_2} $.
Consider one of the most popular distributions, the normal distribution, where the parameter is $ θ=(μ,σ) $: $ μ∈R $ is a location parameter and $ σ>0 $ is a scale parameter. This parameterized family is $ P=\{f_θ(x)=\frac{1}{\sqrt{2π}\,σ} e^{-\frac{(x-μ)^2}{2σ^2}} \mid μ∈R, σ>0\} $.
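As a quick R sketch, each parameter value $ θ=(μ,σ) $ picks out one member $ f_θ $ of this family; dnorm evaluates the corresponding density:
## density of N(0,1) at x = 0, i.e. f_θ(0) with θ = (0,1)
> dnorm(0, mean = 0, sd = 1)    ## ≈ 0.3989
## density of N(1,2) at x = 0, i.e. f_θ(0) with θ = (1,2)
> dnorm(0, mean = 1, sd = 2)    ## ≈ 0.1760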
3.4) Parametric inference: here, we are interested in estimating $ θ $ or, more generally, a function of $ θ $, say $ g(θ) $. Let us consider a few examples to illustrate this.
- Let $ x_1,x_2,…,x_n $ be the outcomes of n independent flips of the same coin, where we code $ X_i=1 $ if the i-th toss produces a head and $ X_i=0 $ if it produces a tail. Then $ θ $, the probability of flipping a head in a single toss, can be any number between 0 and 1. The $ X_i $'s are i.i.d., and the common distribution $ p_θ $ is the Bernoulli $ (θ) $ distribution, which has probability mass function $ f(x,θ)=θ^x (1-θ)^{1-x}, x∈\{0,1\} $. If we repeat the experiment with the same coin enough times, the average number of heads will be $ nθ $.
- Let $ x_1,x_2,…,x_n $ be the numbers of customers that arrive at n different identical counters in unit time. Then the $ X_i $'s can be thought of as i.i.d. random variables with a Poisson distribution with mean $ θ $, which varies in the set $ (0,∞) $ representing the parameter space $ Θ $. The probability mass function is $ f(x,θ)=\frac{e^{-θ} θ^{x}}{x!} $. (A short R sketch of estimating $ θ $ in both examples follows below.)
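As a minimal R sketch of both examples (with hypothetical true values θ = 0.5 for the coin and θ = 2 for the counters), the sample mean estimates θ in each model:
> set.seed(1)                              ## for reproducibility
> x <- rbinom(100, size = 1, prob = 0.5)   ## 100 Bernoulli(θ) coin flips
> mean(x)                                  ## sample proportion of heads estimates θ
> y <- rpois(100, lambda = 2)              ## customer counts at 100 identical counters
> mean(y)                                  ## sample mean estimates the Poisson mean θ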
After determining the parameters of the model, we can apply the characteristics of the distribution and the model to the data. The characteristics of various distributions are discussed further in the Distribution section; hypothesis testing and estimation are discussed later.
R examples:
Random number generators: suppose we want to draw random variables from a given distribution. (Note that runif below draws from the uniform distribution on (0,1); to draw 10 variables from the normal distribution $ N(0,1) $ with mean 0 and variance 1, use rnorm(10,0,1).)
## generate 10 random variables from a uniform distribution on (0,1):
> runif(10,0,1)
[1] 0.64900447 0.82074379 0.56889471 0.95659206 0.69771341 0.19772881 0.07656862
[8] 0.29823980 0.31825198 0.45029058
# generate 5 random variables from a Poisson distribution with lambda = 2
> rpois(5,2)
[1] 3 2 1 4 1
# generate 5 random variables from a binomial distribution with n = 10 trials and success probability p = 0.3
> rbinom(5,10,0.3)
[1] 2 3 3 2 3
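These draws change on every call. Assuming reproducible examples are wanted, set.seed fixes the generator state, and rnorm gives the normal draws mentioned above:
> set.seed(42)       ## fix the random number generator state
> rnorm(10, 0, 1)    ## 10 draws from N(0,1); rerunning after set.seed(42) repeats them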
4) Applications
4.1) This article (http://link.springer.com/article/10.1007/BF00341287), titled Parametric Inference For Imperfectly Observed Gibbsian Fields, presents a maximum likelihood estimation method for imperfectly observed Gibbsian fields on a finite lattice. The method adapts the algorithm given in Younes. The presentation of the new algorithm is followed by a theorem about the limit of the second derivative of the likelihood as the lattice grows, which relates to the convergence of the method. The paper also offers practical remarks about implementing the procedure.
4.2) This article (http://www.pnas.org/content/101/46/16138.short) discusses graphical models that have been applied to problems in computational biology, including hidden Markov models for annotation, tree models for phylogenetics, and pair hidden Markov models for alignment. A single algorithm, the sum-product algorithm, solves many of the inference problems associated with these different statistical models. The article introduces the polytope propagation algorithm for computing the Newton polytope of an observation from a graphical model. This algorithm is a geometric version of the sum-product algorithm and is used to analyze the parametric behavior of maximum a posteriori inference calculations for graphical models.
5) Software
http://socr.ucla.edu/htmls/SOCR_Distributions.html
http://socr.ucla.edu/htmls/exp/Bivariate_Normal_Experiment.html
http://socr.ucla.edu/htmls/dist/Multinomial_Distribution.html
http://wiki.stat.ucla.edu/socr/index.php/SOCR_EduMaterials_Activities_Binomial_Distributions
6) Problems
6.1) Suppose we are rolling a fair die. What is the probability of rolling three sixes in a row? What kind of model are we inferring on?
6.2) Consider the unfair coin-flipping game, where the probability of flipping a head is unknown. Construct an experiment to estimate the probability of flipping a head in a single toss. What is the probability of flipping 5 heads out of 8 flips?
6.3) Random number generators are commonly used in scientific studies. Explain how they work.
6.4) The average number of homes sold by realtor Tom is 3 houses per day. What is the probability that exactly 4 houses will be sold tomorrow?
6.5) Suppose that the average number of patients with cancer seen per day is 5. What is the probability that fewer than 4 patients with cancer will be seen on the next day?
7) References
http://www.itl.nist.gov/div898/handbook/eda/eda.htm
http://mirlyn.lib.umich.edu/Record/000252958
http://mirlyn.lib.umich.edu/Record/012841334
ANSWERS:
6.4) This is a Poisson experiment with the following parameters: μ=3, given that 3 houses are sold per day on average; x=4, given that we want the probability that exactly 4 houses will be sold tomorrow.
$ p(x;μ)=\frac{e^{-μ} μ^{x}}{x!},\quad p(4;3)=\frac{e^{-3} 3^{4}}{4!}≈0.168 $
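This can be checked against R's built-in Poisson probability mass function:
> dpois(4, lambda = 3)    ## P(X = 4) for X ~ Poisson(3); ≈ 0.168, matching the hand calculation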
6.5) This is a Poisson experiment with the following parameters: μ=5, given that 5 patients with cancer are seen per day on average; $ x=0,1,2, $ or $ 3 $, since we want the probability that 0, 1, 2, or 3 patients with cancer are seen the next day.
$ p(x;μ)=\frac{e^{-μ} μ^{x}}{x!},\quad p(x≤3;5)=p(0;5)+p(1;5)+p(2;5)+p(3;5) $
$ =\frac{e^{-5} 5^{0}}{0!}+\frac{e^{-5} 5^{1}}{1!}+\frac{e^{-5} 5^{2}}{2!}+\frac{e^{-5} 5^{3}}{3!} $
$ ≈0.0067+0.03369+0.084224+0.140375≈0.2650 $
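Again, this can be checked in R, either with the cumulative distribution function or by summing the individual terms:
> ppois(3, lambda = 5)           ## P(X ≤ 3) for X ~ Poisson(5); ≈ 0.2650
> sum(dpois(0:3, lambda = 5))    ## the same sum of individual Poisson terms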
- SOCR Home page: http://www.socr.umich.edu