SOCR Resampling HTML5 Project
SOCR Project - SOCR HTML5 Resampling, Randomization and Simulation Project
Project goals
The goal of this project is to design a new, modern and portable SOCR web-app, based purely on HTML5, CSS3 and JavaScript, that demonstrates the concepts of statistical resampling, randomization and probabilistic simulation. The implementation demands platform portability, computational efficiency, usability (complete functionality via a user-friendly modern interface), extensibility, and ease of documentation, support and servicing for the entire community.
Project specification
The general statistics-education community needs new web-applications (web-apps) that run in the browser on portable devices and demonstrate, graphically and interactively, simulation, sampling-resampling and bootstrapping-based statistical inference. This project specification describes some specific examples, applications and use-cases that would aid the design of a new SOCR web-app that we can test in the classroom. The core functionality, usability and appearance of this new web-app are described below.
The two basic directions for Sampling/Resampling-based Inference are:
- Simulation-Driven: We have several experiments (dice, coins, cards, etc.) that generate one or many samples. First, we need to replicate 3+ of these simulations in HTML5. Then we can show the sample (the user controls the sample size, N), animate resampling from the sample K times (K defaults to 10,000, but is generally in the range [10:100,000]), present the bootstrap distribution and show the resampling-based inference (e.g., the outcomes may be H/T, Die<3, or 5-card-hand has a pair).
- Data-Driven: The user provides their own dataset and postulates a hypothesis. We show the data graphically and animate K (K defaults to 10,000, but is generally in the range [10:100,000]) resamples with replacement, then make the bootstrap-based inference, as in the simulation-driven case (1).
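The simulation-driven direction can be sketched in a few lines of plain JavaScript. This is only an illustrative sketch, not the project's implementation: the function names (`mulberry32`, `simulateCoinFlips`, `resampleWithReplacement`, `bootstrapDistribution`) are hypothetical, the seeded generator is an assumption made so that runs are repeatable (`Math.random` would work as well), and the statistic shown is the proportion of heads.

```javascript
// Small seeded PRNG (mulberry32 variant); an assumption for repeatability only.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6D2B79F5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // value in [0, 1)
  };
}

// Simulate n fair-coin flips: 1 = Head, 0 = Tail.
function simulateCoinFlips(n, rng) {
  return Array.from({ length: n }, () => (rng() < 0.5 ? 1 : 0));
}

// Draw one resample of the same size as the data, with replacement.
function resampleWithReplacement(data, rng) {
  return data.map(() => data[Math.floor(rng() * data.length)]);
}

// Arithmetic mean; for 0/1 data this is the proportion of heads.
function mean(xs) {
  return xs.reduce((sum, x) => sum + x, 0) / xs.length;
}

// Evaluate a statistic on each of k resamples to get its bootstrap distribution.
function bootstrapDistribution(data, k, statistic, rng) {
  const out = new Array(k);
  for (let i = 0; i < k; i++) {
    out[i] = statistic(resampleWithReplacement(data, rng));
  }
  return out;
}

// Usage: N = 100 flips, K = 10,000 resamples (the default mentioned above).
const rng = mulberry32(2012);
const flips = simulateCoinFlips(100, rng);
const dist = bootstrapDistribution(flips, 10000, mean, rng);
console.log("observed proportion of heads:", mean(flips));
console.log("mean of bootstrap proportions:", mean(dist));
```

The same `bootstrapDistribution` call would serve the data-driven case if `flips` were replaced by a user-supplied numeric column.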
Use-Cases/Utilization Protocol
- Identify Data
- User-specified Data: Provide a generic SOCR data-spreadsheet where users can paste in multicolumn data (e.g., SOCR Data).
- Data from SOCR Experiments (see Applets and Activities)
- Map Data to discrete Graphical Objects in a Data-Canvas
- Select a column from the Data-Spreadsheet
- Choose object type (e.g., Coin, Die, Card, etc.)
- User Resampling Functionality (User control specs)
- Sampling with or without replacement
- Specify N=original data sample size, K=number of resamples, M=size of each of the samples to be drawn.
- Animate the drawing of each sample, one observation at a time (M observations per sample), for each of the K samples
- Animate each resample (K resamples in total).
- Typical sizes: N~100, K~10,000, M~100
- User selects hypothesis
- Running the experiment
- Discrete mode or Animated mode
- Step = obtain one sample (of size M)
- Run = obtain all K samples (each of size M)
- Visualize the results (either statically, discrete mode, or dynamically, animation mode)
- Show summary statistics tables
- All samples as a K×M table (columns hold the observations drawn within one resampling step, so there are M columns; rows hold the resamples, so there are K rows)
- Bootstrap-based inference (responding to the user's hypothesis), just as in the SOCR General CI applet (bootstrap estimation).
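The protocol above (sampling with or without replacement, Step versus Run, the table of all samples, and a bootstrap interval) might be prototyped as follows. This is a minimal sketch under stated assumptions: `drawSample`, `createExperiment` and `percentileInterval` are hypothetical names, the samples are stored as K rows of M observations, and a simple percentile interval stands in for whatever inference the final web-app provides.

```javascript
// Draw one sample of size m from data, with or without replacement.
function drawSample(data, m, withReplacement, rng = Math.random) {
  if (withReplacement) {
    return Array.from({ length: m }, () => data[Math.floor(rng() * data.length)]);
  }
  if (m > data.length) {
    throw new RangeError("M cannot exceed N when sampling without replacement");
  }
  // Partial Fisher-Yates shuffle on a copy: the first m entries form the sample.
  const pool = data.slice();
  for (let i = 0; i < m; i++) {
    const j = i + Math.floor(rng() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, m);
}

// Step = obtain one sample (of size M); Run = obtain all K samples.
// Assumption: results are kept as an array of K rows, each of length M.
function createExperiment(data, { k, m, withReplacement = true, rng = Math.random }) {
  const samples = [];
  const step = () => {
    if (samples.length < k) samples.push(drawSample(data, m, withReplacement, rng));
    return samples.length; // number of samples drawn so far
  };
  const run = () => {
    while (samples.length < k) step();
    return samples;
  };
  return { step, run, samples };
}

// Percentile interval from a bootstrap distribution (one simple choice of method).
function percentileInterval(values, level = 0.95) {
  const sorted = values.slice().sort((a, b) => a - b);
  const alpha = (1 - level) / 2;
  const lower = sorted[Math.floor(alpha * (sorted.length - 1))];
  const upper = sorted[Math.ceil((1 - alpha) * (sorted.length - 1))];
  return [lower, upper];
}

// Usage with the typical sizes N~100, K~10,000, M~100.
const data = Array.from({ length: 100 }, (_, i) => i + 1);
const experiment = createExperiment(data, { k: 10000, m: 100 });
const means = experiment.run().map((row) => row.reduce((s, x) => s + x, 0) / row.length);
console.log("95% percentile interval for the mean:", percentileInterval(means));
```

Keeping `step` and `run` on the same object lets an animated mode call `step` once per frame while a discrete mode calls `run` directly.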
Social networking/sharing
The SOCR HTML5 Randomization and Resampling webapp should allow users who have online data to share the state of their entire web-app (in playable format) with any other user using unique URLs. For example:
This unique URL web-app play format uses the following components:
- Main Applet URL: http://SOCR.ucla.edu/htmls/HTML5/ResamplingSimulation/
- PHP script that harvests the tabular data from URL: File:Jnlp writer php.zip (WebApp.php)
- Reference to an online Dataset (tabular format required): ?http://wiki.stat.ucla.edu/socr/index.php/SOCR_Data_Dinov_010309_HousingPriceIndex
- Variable Mapping Syntax: &cell_start=(0,0)&cell_end=(20,15)&K=10000&M=100
- Load-only or Play/Run action: &play=true
See this example using a similar invocation protocol for the Pipeline environment (http://ucla.in/xIL1E8).
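On the client side, the components listed above could be recovered from a shared URL roughly as follows. This is a hedged sketch only: `parseShareUrl` is a hypothetical helper, the behaviour of WebApp.php is not modelled, and the example URL in the usage lines is merely the listed components concatenated for illustration (the real invocation may differ, for instance by routing through the PHP script). The sketch assumes the dataset reference follows the `?` directly and contains no `&`.

```javascript
// Parse a share URL of the assumed form:
//   <app-url>?<dataset-url>&cell_start=(r,c)&cell_end=(r,c)&K=...&M=...&play=true
function parseShareUrl(url) {
  const q = url.indexOf("?");
  if (q < 0) return null; // no shared state in the URL

  // The first token is the dataset reference; the rest are key=value pairs.
  const [dataset, ...pairs] = url.slice(q + 1).split("&");
  const params = {};
  for (const pair of pairs) {
    const eq = pair.indexOf("=");
    if (eq > 0) params[pair.slice(0, eq)] = decodeURIComponent(pair.slice(eq + 1));
  }

  // Convert "(row,col)" into [row, col]; return null if absent or malformed.
  const cell = (s) => {
    const m = /^\((\d+),(\d+)\)$/.exec(s || "");
    return m ? [Number(m[1]), Number(m[2])] : null;
  };

  return {
    appUrl: url.slice(0, q),
    datasetUrl: decodeURIComponent(dataset),
    cellStart: cell(params.cell_start),
    cellEnd: cell(params.cell_end),
    K: params.K ? Number(params.K) : 10000, // K defaults to 10,000 per the specification
    M: params.M ? Number(params.M) : null,  // no default for M is stated, so leave unset
    play: params.play === "true",           // otherwise load-only
  };
}

// Usage: hypothetical URL assembled from the listed components.
const shared = parseShareUrl(
  "http://SOCR.ucla.edu/htmls/HTML5/ResamplingSimulation/" +
    "?http://wiki.stat.ucla.edu/socr/index.php/SOCR_Data_Dinov_010309_HousingPriceIndex" +
    "&cell_start=(0,0)&cell_end=(20,15)&K=10000&M=100&play=true"
);
console.log(shared);
```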
See also
The links below provide some interesting examples of Java code for dynamic animations. These may be useful for the new SOCR Resampling/Simulation Experiment when we get to illustrating the random sampling/resampling/drawing of data (or SOCR Experiments objects) and depicting this via animation. Some of these have very clever image warping/Bezier/path function representations which may be applicable to our coins, cards and dice.
- SOCR Resampling and Simulation activity and Webapp.
- Java Animation 1, e.g., TransformAnim.
- See the Randomization web-cast and this example randomization web-site.
Exemplary tools that can be employed
- SOCR 3.0 GoogleCode, JavaScript Redesign (SVN Source Code Repository)
- JSXGraph HTML5/JS Mathematical Functions Charts and graphs
- D3
- See the JavaScript InfoVis Toolkit
- Manual Graphics Paint canvas in HTML5
- RGraph HTML5 Charts and Graphs
- Rendera: Interactive HTML5/CSS3/JS web-page Editor
Extensions
Some of the following features may be extremely interesting to include in the Randomization webapp:
- Video capability for training and demonstration purposes.
- Social-networking mechanism for sharing the webapp and its state (perhaps multiple users interacting jointly, as a team, with the same instance of the web-app?).
- Dynamic HTML5-based spreadsheet for data manipulation (e.g., DHTMLX).
- Data drag-and-drop functionality in the webapp.
References
- SOCR GSoC 2012 Randomization Project
- Ivo Dinov's Bootstrapping Notes
- Wikipedia resampling section
- Introducing Statistical Inference to Biology Students Through Bootstrapping and Randomization
- George Cobb's 2009 TISE paper