SOCR EduMaterials ModelerActivities NormalBetaModelFit

From SOCR
Revision as of 17:28, 5 July 2007 by IvoDinov (talk | contribs)

SOCR Educational Materials - Activities - SOCR Normal and Beta Distribution Model Fit Activity

Summary

This activity describes SOCR model fitting using Normal or Beta distribution models. Model fitting is the process of determining the parameters of an analytical model so that the parameter estimates are optimal according to some criterion. There are many strategies for parameter estimation. Most of them differ in the underlying cost function and in the optimization strategy used to maximize or minimize that cost function.

Goals

The aims of this activity are to:

  • motivate the need for (analytical) modeling of natural processes
  • illustrate how to use the SOCR Modeler to fit models to real data
  • present applications of model fitting

Background & Motivation

Suppose we are given the sequence of numbers {1, 2, 3, 4, 5, 6, 7, 8, 9, 10} and asked to find the best (Continuous) Uniform Distribution that fits the data. In this case there are two parameters to estimate: the minimum (m) and the maximum (M) of the data. These parameters determine exactly the support (domain) of the continuous distribution, and we can explicitly write the density of the (best fit) continuous uniform distribution as:

\(f(x) = {{1}\over{M-m}}\), for \(m \le x \le M\) and \(f(x)=0\), for \(x \notin [m, M]\). For the data above, \(m = 1\) and \(M = 10\), so \(f(x) = 1/9\) on \([1, 10]\).
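
The uniform fit above can be sketched in a few lines of Python. This is only an illustration, not part of the SOCR Modeler: the function names `uniform_pdf` and `uniform_prob` are hypothetical, and the support is estimated by the sample minimum and maximum, as in the text.

```python
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Estimate the support by the sample minimum and maximum (the approach in the text).
m, M = min(data), max(data)

def uniform_pdf(x, m, M):
    """Density of the continuous Uniform(m, M) distribution at x."""
    return 1.0 / (M - m) if m <= x <= M else 0.0

def uniform_prob(a, b, m, M):
    """P(a <= X <= b) for X ~ Uniform(m, M): overlap length divided by M - m."""
    lo, hi = max(a, m), min(b, M)
    return max(hi - lo, 0.0) / (M - m)

print(uniform_pdf(5, m, M))      # density inside the support
print(uniform_pdf(11, m, M))     # density outside the support
print(uniform_prob(2, 5, m, M))  # probability of the event 2 <= X <= 5
```

Once the two parameters are fixed, probabilities such as \(P(2 \le X \le 5)\) follow from the model alone, with no further data needed.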

Having this model distribution, we can use its analytical form (\(f(x)\)) to compute probabilities of events and critical functional values and, in general, to do inference on the native process without acquiring additional data. Hence a good strategy for model fitting is extremely useful in data analysis and statistical inference. Of course, any inference based on models is only going to be as good as the data and the optimization strategy used to generate the model.

Let's look at another motivational example. This time, suppose we have recorded the following measurements from a process: {1.2, 1.7, 3.4, 1.5, 1.1, 1.7, 3.5, 2.5}. Taking a bin size of 1 (bins [1, 2), [2, 3) and [3, 4)), we can easily calculate the frequency histogram for this process as {5, 1, 2}.

SOCR Activities NormalBetaModelFit Dinov 070507 Fig1.png
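
The bin counts for these measurements can be checked with a short Python sketch. It assumes half-open bins of width 1 aligned at the integers, which is one reasonable reading of "bin-size of 1". The helper name `histogram` is hypothetical.

```python
import math
from collections import Counter

data = [1.2, 1.7, 3.4, 1.5, 1.1, 1.7, 3.5, 2.5]
bin_size = 1.0

def histogram(values, width):
    """Count values in half-open bins [k*width, (k+1)*width), k an integer."""
    counts = Counter(math.floor(v / width) for v in values)
    first, last = min(counts), max(counts)
    # Include empty bins between the first and last occupied bin.
    return [(k * width, (k + 1) * width, counts.get(k, 0))
            for k in range(first, last + 1)]

for left, right, n in histogram(data, bin_size):
    print(f"[{left:g}, {right:g}): {n}")
```

The counts must add up to the number of measurements, which is a quick sanity check on any hand-computed histogram.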

We can now ask about the best Beta distribution model fit to the histogram of the data!

Exercises

Exercise 1

Go to the SOCR Modeler and click on the Data Generation tab. Select 200 observations from the Generalized Beta Distribution, as shown in the image below, using the parameters \( \alpha=1.5; \beta=3; A=0; B=7\). Copy these 200 values to the clipboard (Ctrl-C) and paste them into the Data tab of LineCharts --> PowerTransformHistogramChart under SOCR Charts. Then map this column to XYValue (under the MAP tab) and click Update_Chart. This generates the histogram of the 200 observations, which should look like a discrete analog of the Generalized Beta density curve. You can see exactly what the Generalized Beta Distribution looks like by going to SOCR Distributions and selecting \( Beta(\alpha=1.5; \beta=3; A=0; B=7)\).

SOCR Activities PowerTransformGraphing Dinov 022007 Fig10.jpg

SOCR Activities PowerTransformGraphing Dinov 022007 Fig9.jpg
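
For readers who want to reproduce the spirit of Exercise 1 outside the SOCR applets, the following Python sketch draws 200 observations from a Generalized Beta distribution and recovers the shape parameters. It is an illustration under stated assumptions. The estimator is the method of moments with the support \([A, B]\) treated as known, which is not necessarily the criterion the SOCR Modeler uses. The seed and the helper name `fit_beta_moments` are arbitrary choices.

```python
import random
from statistics import fmean, pvariance

ALPHA, BETA, A, B = 1.5, 3.0, 0.0, 7.0  # parameters from Exercise 1
rng = random.Random(2007)  # fixed, arbitrary seed so the run is repeatable

# A Generalized Beta variate on [A, B] is a rescaled standard Beta variate on [0, 1].
sample = [A + (B - A) * rng.betavariate(ALPHA, BETA) for _ in range(200)]

def fit_beta_moments(values, a, b):
    """Method-of-moments estimates of (alpha, beta), assuming support [a, b] is known."""
    unit = [(v - a) / (b - a) for v in values]  # map observations back to [0, 1]
    mean, var = fmean(unit), pvariance(unit)
    common = mean * (1.0 - mean) / var - 1.0
    return mean * common, (1.0 - mean) * common

alpha_hat, beta_hat = fit_beta_moments(sample, A, B)
print(f"alpha estimate: {alpha_hat:.2f}, beta estimate: {beta_hat:.2f}")
```

With only 200 observations the estimates will not match \(\alpha=1.5\) and \(\beta=3\) exactly. Rerunning with a different seed or a larger sample shows how the fit varies with the data.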

Applications

TBD



