# SOCR ResamplingSimulation Activity

## SOCR Educational Materials - Activities - SOCR Resampling, Randomization and Simulation Activity

This activity illustrates the processes of sampling, resampling, simulation and randomization using the SOCR Resampling, Randomization and Simulation Webapp. The webapp is implemented in HTML5/JavaScript and should run on any computer, operating system and web browser.

## Goals

The aims of this activity are to:

- Demonstrate the concepts of simulation and data generation
- Illustrate data resampling on a massive scale
- Reinforce the concept of resampling and randomization based statistical inference
- Demonstrate the similarities and differences between parametric-based and resampling-based statistical inference

## Background

Random (re)sampling introduces stochasticity into the sampling scheme. What that randomness represents depends on what is sampled and on the distribution it is sampled from. In parametric-based statistical inference, the random sampling reflects the stochastic nature of selecting observations from the sample space. In contrast, in randomization-based inference (e.g., bootstrapping), the random sampling reflects the resampling of the observed data and the stochastic assignment of units to treatments or groups.
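
The two resampling schemes mentioned above can be sketched in a few lines of Python. This is only an illustration and not the webapp's implementation: the function names `bootstrap_resample` and `permute_groups` are hypothetical, and the data values are made up for the example.

```python
import random

def bootstrap_resample(sample, rng):
    """Draw len(sample) values from the sample, with replacement."""
    return [rng.choice(sample) for _ in sample]

def permute_groups(group_a, group_b, rng):
    """Pool both groups, shuffle, and re-split at the original group sizes.

    This mimics a random re-assignment of units to the two groups.
    """
    pooled = list(group_a) + list(group_b)
    rng.shuffle(pooled)
    return pooled[:len(group_a)], pooled[len(group_a):]

rng = random.Random(2024)  # fixed seed so repeated runs give the same draws

# Made-up illustrative measurements for two groups
group_a = [5.1, 4.8, 6.0, 5.5, 5.2]
group_b = [6.3, 5.9, 6.8, 6.1, 7.0]

boot = bootstrap_resample(group_a, rng)
new_a, new_b = permute_groups(group_a, group_b, rng)
print("bootstrap resample of group A:", boot)
print("re-assigned groups:", new_a, new_b)
```

A bootstrap resample can repeat or omit observed values, whereas a permutation keeps every observed value exactly once and only changes its group label.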

## Requirements & usability

A modern web browser with HTML5 and JavaScript enabled is required (mobile devices, tablets and phones should work fine).

- Go to the SOCR Resampling/Simulation Webapp.
- Test the webapp
- Report any constructive and critical feedback

## Learning Activity

Load the SOCR resampling and randomization webapp in your browser.

You can perform single-sample or multiple-sample statistical inference using this resource. Let's take a 2-sample case as a specific example, where we are looking for group differences. Follow this protocol to get some simulations/results (either for teaching/learning randomization-based inference, or to do real data analysis):

- You can either generate random data or copy-paste in your own data. For instance, you can generate data using coins, cards, etc., or use one of the SOCR datasets (e.g., Human Heights/Weights)
- Simulation-Driven Randomization Inference:

- To use the Coin-Toss experiment to generate data, click “*Binomial Coin Toss*”
- Choose the parameters: number of coins, probability of Heads, and number of samples (e.g., k=2)
- Click “*Generate Dataset*” (you can click this button multiple times; notice how the data samples change)
- Click “*Generate Random Samples*”
- Select sample sizes (e.g., 10) and number of repeated samples (e.g., 10,000)
- Click the “RUN” button
- You can inspect all samples (for the k=2 groups) in the right panel of the webapp (use “Show” button and inspect all the glyphs on the top)
- Then select “Test Statistics”, e.g., p-value, and click the “Infer” button
- This automatically opens the “Inference Plot” tab, where the randomization distribution (of p-values) is shown and the initial p_o value is drawn on top to show its relation to the resampling-based distribution.
- You can always make modifications of your prior choices in the “Control” tab.

- Data-Driven Randomization Inference:

- Back at the webapp startup screen, select the “Use Excel Datasheet” option
- Click the “Reset” button to remove any previous data from the webapp buffer.
- Copy-paste data from any data table, for instance from this Heights/Weights dataset.
- Select a set of, say, 20 Weights and click “Use Selected” (this represents sample 1). Repeat this selection with another set of 20 Weights (sample 2).
- Click “Proceed”. You should see a summary indicating the sample-sizes of the 2 groups of data you selected
- Click “Done” – this will open the “Control” panel
- Select sample sizes (e.g., 10) and number of repeated samples (e.g., 10,000)
- Click the “RUN” button
- You can inspect all samples (for the k=2 groups) in the right panel of the webapp (use “Show” button and inspect all the glyphs on the top)
- Then select “Test Statistics”, e.g., p-value, and click the “Infer” button
- This automatically opens the “Inference Plot” tab, where the randomization distribution (of p-values) is shown and the initial p_o value is drawn on top to show its relation to the resampling-based distribution.
- You can always make modifications of your prior choices in the “Control” tab.

- Some new features (e.g., data import from WorldBank and other URLs) will be added in the next 2 weeks
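
For readers who want to see the logic of the 2-sample protocol outside the browser, the steps can be approximated in Python. This is a minimal sketch under stated assumptions, not the webapp's code: it uses the absolute difference in group means as the test statistic and re-assigns pooled observations to groups at random (a permutation test). The helper names `coin_toss_sample` and `permutation_p_value`, and the parameter values, are hypothetical choices for illustration.

```python
import random

def coin_toss_sample(n_obs, n_coins, p_heads, rng):
    """Simulate n_obs observations; each is the number of heads in n_coins tosses."""
    return [sum(rng.random() < p_heads for _ in range(n_coins))
            for _ in range(n_obs)]

def mean(xs):
    return sum(xs) / len(xs)

def permutation_p_value(group_a, group_b, n_resamples, rng):
    """Return (observed statistic, p-value) for the absolute difference in means.

    The p-value is the share of random re-assignments whose statistic is at
    least as large as the observed one. Adding 1 to numerator and denominator
    counts the observed assignment itself, so the estimate is never zero.
    """
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # random re-assignment of units to the two groups
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return observed, (extreme + 1) / (n_resamples + 1)

rng = random.Random(7)  # fixed seed for repeatability

# k=2 groups of size 10, each observation from 20 coin tosses (assumed values)
group_a = coin_toss_sample(n_obs=10, n_coins=20, p_heads=0.5, rng=rng)
group_b = coin_toss_sample(n_obs=10, n_coins=20, p_heads=0.7, rng=rng)

observed, p_value = permutation_p_value(group_a, group_b, 10_000, rng)
print("observed |mean difference|:", observed)
print("permutation p-value:", p_value)
```

For the data-driven variant, replace the two simulated groups with the two sets of values you selected (for example, two sets of 20 Weights); the resampling step itself is unchanged.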

## Practice experiments

Repeat the protocol above with different (observed or simulated) data and different study designs (e.g., single sample vs. multiple samples).
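
As one possible single-sample experiment, the sketch below estimates a bootstrap percentile interval for a sample mean. It is an illustration only: the webapp may compute its single-sample results differently, the function name `bootstrap_mean_interval` is hypothetical, and the data values are made up.

```python
import random

def bootstrap_mean_interval(sample, n_resamples, rng, level=0.95):
    """Percentile bootstrap interval for the mean of a single sample."""
    n = len(sample)
    means = []
    for _ in range(n_resamples):
        resample = rng.choices(sample, k=n)  # draw n values with replacement
        means.append(sum(resample) / n)
    means.sort()
    alpha = (1 - level) / 2
    # Pick the order statistics closest to the alpha and 1 - alpha quantiles
    lower = means[int(alpha * (n_resamples - 1))]
    upper = means[int((1 - alpha) * (n_resamples - 1))]
    return lower, upper

rng = random.Random(11)  # fixed seed for repeatability

# Made-up illustrative measurements
sample = [127.4, 112.9, 136.5, 153.0, 142.3, 144.2, 123.3, 141.5, 136.4, 112.3]

lower, upper = bootstrap_mean_interval(sample, n_resamples=1000, rng=rng)
print("bootstrap 95% interval for the mean:", (lower, upper))
```

Try changing the sample size or the number of resamples and observe how the width of the interval responds.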
