SOCR EduMaterials Activities 2D PointSegmentation EM Mixture

From SOCR
Revision as of 15:47, 4 April 2007 by IvoDinov (talk | contribs)

SOCR Educational Materials - Activities - SOCR Activity Demonstrating Expectation Maximization and Mixture Modeling

Summary

This activity demonstrates mixture modeling and expectation maximization (EM) applied to the problem of 2D point cluster segmentation.

Background

You may find this review of the mathematics of Expectation Maximization and Mixture Modeling useful. In this activity we demonstrate how EM and mixture modeling may be used to classify points in 2D into clusters. Such segmentation is an important step in solving many practical problems. 1D and 3D applications of EM mixture modeling are also included at the end.
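For orientation, the EM updates for a mixture of K Gaussian kernels can be written as below. This is the standard textbook formulation, not a description of the applet's internal code; the symbols (mixture weights \pi_k, means \mu_k, covariances \Sigma_k, responsibilities \gamma_{ik}, sample size n) are notation assumed here for illustration.

```latex
% E-step: responsibility of kernel k for point x_i
\gamma_{ik} = \frac{\pi_k \, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)}
                   {\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_i \mid \mu_j, \Sigma_j)}

% M-step: re-estimate the parameters from the responsibilities
N_k = \sum_{i=1}^{n} \gamma_{ik}, \qquad
\pi_k = \frac{N_k}{n}, \qquad
\mu_k = \frac{1}{N_k} \sum_{i=1}^{n} \gamma_{ik} \, x_i, \qquad
\Sigma_k = \frac{1}{N_k} \sum_{i=1}^{n} \gamma_{ik} \, (x_i - \mu_k)(x_i - \mu_k)^{\top}

% Log-likelihood monitored across iterations
\ell = \sum_{i=1}^{n} \log \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)
```

The two steps alternate until \ell stops increasing noticeably. A point is then assigned to the kernel with the largest \gamma_{ik}.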

Exercises

Exercise 1: SOCR Charts Activity

  • This exercise demonstrates the application of expectation maximization and mixture modeling to cluster classification of points in 2D. Go to SOCR Charts and select Line Charts --> SOCR EM MixtureModelChart. The image below shows the overall look-and-feel of the SOCR EM Mixture Model applet. Try adding several clusters of points and selecting different numbers of kernels/mixtures to fit to the data. Notice the adaptive behavior of the algorithm.
SOCR Activities EMMixtureModel Dinov 040407 Fig1.jpg
  • Data: You can enter data (paired X,Y observations) via the Data tab, using copy-and-paste, or via the add random points button (RandomPts). This applet was designed specifically so that you may enter your own paired observations in tabular form. If you are not interested in data analysis but want to explore the properties of the EM mixture model further, you may try the second SOCR applet (SOCR Experiments, select Mixture Model EM Experiment from the drop-down list of SOCR Experiments on the top-left), which allows you to add points manually by clicking on the graphing canvas.
  • Applet specifications: You can fit one of two models to your data (Linear or elliptical Gaussian) using the GaussianMix button. The Normal button allows you to choose Slow, Normal or Fast speed for the iterative EM estimation. The ClearPts button removes all data and resets the applet. The InitKernels button re-initializes the kernels (location and size) and is useful for shaking the EM algorithm out of a local optimum. The drop-down list on the top-right allows you to select the number of kernels you want to fit to the data. There is no exact method, manual or automatic, for determining this number in all situations. Typically, one uses the physical or biological properties of the studied data/process to choose an appropriate integer value. The Step, Run and Stop buttons on the top drive the iterative EM algorithm: single iteration, continuous iteration and halt, respectively. The Segment button labels/classifies the points once the EM algorithm has converged or has been stopped.
  • Segmentation Results: Once the EM algorithm converges to a visually satisfactory result, stop the iterative process (Stop button) and click on the Segment button. You will obtain a color classification of all points in 2D based on which of the kernels is most likely to contain each point in its neighborhood. In addition to this visual classification, the Data tab-panel will contain a couple of result columns with the complete analytical description of the kernels (as 2D Gaussians), the mixture-model weight coefficients, the log-likelihood function (quantifying how good the match between the mixture model and the data is) and the membership of each data point in one of the kernels.
SOCR Activities EMMixtureModel Dinov 040407 Fig4.jpg
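The fit-then-segment workflow described in this exercise can be imitated outside the applet. The following is a minimal sketch in standard-library Python, assuming elliptical (full-covariance) 2D Gaussian kernels; it is not the SOCR implementation. The function names (gauss2d, e_step, m_step, fit), the deterministic quantile-based start, the small ridge added to the variances and the synthetic two-cluster data are all assumptions made for this illustration.

```python
import math
import random

def gauss2d(p, mu, cov):
    """Density of a 2D Gaussian; cov is the triple (sxx, sxy, syy)."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = p[0] - mu[0], p[1] - mu[1]
    # Quadratic form using the closed-form inverse of a 2x2 matrix.
    q = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))

def e_step(points, weights, means, covs):
    """Return per-point kernel responsibilities and the log-likelihood."""
    resp, loglik = [], 0.0
    for p in points:
        dens = [w * gauss2d(p, m, c) for w, m, c in zip(weights, means, covs)]
        total = max(sum(dens), 1e-300)  # guard against numerical underflow
        loglik += math.log(total)
        resp.append([d / total for d in dens])
    return resp, loglik

def m_step(points, resp, k, ridge=1e-6):
    """Re-estimate weights, means and covariances from the responsibilities."""
    weights, means, covs = [], [], []
    for j in range(k):
        r = [row[j] for row in resp]
        nj = max(sum(r), 1e-12)
        mx = sum(ri * p[0] for ri, p in zip(r, points)) / nj
        my = sum(ri * p[1] for ri, p in zip(r, points)) / nj
        sxx = sum(ri * (p[0] - mx) ** 2 for ri, p in zip(r, points)) / nj
        syy = sum(ri * (p[1] - my) ** 2 for ri, p in zip(r, points)) / nj
        sxy = sum(ri * (p[0] - mx) * (p[1] - my) for ri, p in zip(r, points)) / nj
        weights.append(nj / len(points))
        means.append((mx, my))
        # Assumed ridge: keeps the covariance determinant strictly positive.
        covs.append((sxx + ridge, sxy, syy + ridge))
    return weights, means, covs

def fit(points, k, iters=100):
    """Run EM from a simple deterministic start; return parameters, labels, trace."""
    n = len(points)
    by_x = sorted(points)
    # Assumed start: kernel centers at evenly spaced quantiles of the x-sorted data.
    means = [by_x[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    vx = sum((p[0] - cx) ** 2 for p in points) / n
    vy = sum((p[1] - cy) ** 2 for p in points) / n
    covs = [(vx, 0.0, vy)] * k
    weights = [1.0 / k] * k
    trace = []
    for _ in range(iters):
        resp, loglik = e_step(points, weights, means, covs)
        trace.append(loglik)
        weights, means, covs = m_step(points, resp, k)
    resp, loglik = e_step(points, weights, means, covs)
    trace.append(loglik)
    # Segmentation: label each point with its most responsible kernel.
    labels = [max(range(k), key=lambda j: row[j]) for row in resp]
    return weights, means, covs, labels, trace

# Synthetic data: two well-separated clusters of 100 points each.
rng = random.Random(0)
cluster_a = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100)]
cluster_b = [(rng.gauss(6, 1), rng.gauss(6, 1)) for _ in range(100)]
weights, means, covs, labels, trace = fit(cluster_a + cluster_b, k=2)
print("weights:", [round(w, 3) for w in weights])
print("means:", [(round(mx, 2), round(my, 2)) for mx, my in means])
print("final log-likelihood:", round(trace[-1], 2))
```

The returned labels play the role of the color classification produced by the Segment button, and trace holds the log-likelihood at each iteration, which should not decrease (up to rounding) as EM proceeds.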


Exercise 2: 1D and 3D Examples of SOCR EM Mixture Modeling

You may also see the action of the same SOCR EM Mixture modeling algorithm for analyzing 1D or 3D data.

  • For 1D data, you can see the EM mixture model fitting used by the SOCR Modeler to fit a polynomial, spectral or distribution model to (randomly sampled or observed) data. To see this, go to SOCR Modeler and select MixedFit_Modeler from the drop-down list of models on the top-left. The figure below shows the result of fitting a 3-kernel Mixture of Normal (Gaussian) distributions to the histogram of a random sample of 100 Cauchy random variables.
SOCR Activities EMMixtureModel Dinov 040407 Fig2.jpg
  • A demonstration of 3D data analysis using the SOCR EM Mixture model is included in the LONI Viz Manual. It shows how 3D brain imaging data may be segmented into three tissue types (White Matter, Gray Matter and Cerebrospinal Fluid). LONI Viz sends the segmentation tasks to SOCR, and SOCR returns the 3D segmented volumes, which are superimposed dynamically on top of the initial anatomical brain imaging data in real time. The figure below illustrates this functionality. Other external computational tools could also invoke SOCR statistical computing resources directly by using the SOCR JAR binaries and the SOCR Documentation.
SOCR Activities EMMixtureModel Dinov 040407 Fig3.jpg


Questions

  • How stable is the EM algorithm in finding a solution? Does this solution appear to be unique?
  • What affects the convergence properties of the EM algorithm (data characteristics, number and properties of the initial kernel(s), etc.)?
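One way to explore these questions outside the applet is to restart EM from several random initial kernels and compare the final log-likelihoods, much as the InitKernels button does interactively. The sketch below uses a 1D Gaussian mixture to stay short. The function name em_1d, the random choice of data points as starting locations, the variance ridge and the synthetic three-cluster sample are assumptions for illustration, not part of SOCR.

```python
import math
import random

def em_1d(data, k, rng, iters=200):
    """Fit a k-kernel 1D Gaussian mixture by EM from a random start.

    Returns the log-likelihood from the last E-step and the sorted kernel means.
    """
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / n
    mus = rng.sample(data, k)      # random initial kernel locations
    variances = [var] * k          # broad initial kernel sizes
    weights = [1.0 / k] * k
    loglik = float("-inf")
    for _ in range(iters):
        # E-step: responsibilities of each kernel for each observation.
        resp, loglik = [], 0.0
        for x in data:
            dens = [w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
                    for w, m, v in zip(weights, mus, variances)]
            total = max(sum(dens), 1e-300)  # guard against underflow
            loglik += math.log(total)
            resp.append([d / total for d in dens])
        # M-step: weighted re-estimation of each kernel.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            mus[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            variances[j] = sum(r[j] * (x - mus[j]) ** 2
                               for r, x in zip(resp, data)) / nj + 1e-6
            weights[j] = nj / n
    return loglik, sorted(mus)

# Synthetic sample: three clusters centered at -5, 0 and 5.
data_rng = random.Random(1)
data = [data_rng.gauss(c, 1.0) for c in (-5.0, 0.0, 5.0) for _ in range(60)]

# Five restarts that differ only in the random initial kernels.
results = [em_1d(data, k=3, rng=random.Random(seed)) for seed in range(5)]
for seed, (loglik, mus) in enumerate(results):
    print(seed, round(loglik, 2), [round(m, 2) for m in mus])
best = max(results, key=lambda r: r[0])
```

If all restarts report nearly the same log-likelihood and kernel means, the solution is stable for this data; clearly different values across restarts indicate that EM has settled in different local optima. Changing the cluster separation, the sample size or k is a quick way to see which data characteristics matter.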
