# SOCR Wiki Activities Project

## SOCR Project - SOCR Activities Project

### Project goals

The goal of this project is to develop new hands-on dynamic activities that demonstrate the use of various SOCR tools, data and resources.

### Examples of activities

Explore the different SOCR tools, existing activities and data and either design a new activity, or extend and improve an existing SOCR activity. Some examples of proposed new activities are included below.

#### SOCR Cartography, GIS mapping and Spatial Statistics

The SOCR Cartography project provides a mechanism to plot, analyze and morph GIS maps, spatial statistics and cartographic maps according to data anchored at geographic locations (e.g., counties within the US, California or worldwide). We need to develop activities, instructions and demonstrations on how to analyze spatial data using the new SOCR Cartography applet.

#### SOCR 3D Charts

Develop new SOCR activities for the new SOCR 3D Charts applet. For instance, use any spatial or geographic data, e.g., California Ozone data, to render a 3D plot of two spatial variables and one altitude variable, or use the multivariate US population data by county to analyze the relations between different demographic, socioeconomic and health variables.

#### Random Rectangle Areas Activity

Design a new SOCR Applet and/or Activity that demonstrates estimation, random sampling, and bias. We can use line lengths, number of balls in urns, etc. Show how SOCR applets can be used to control many of the parameters (e.g., number of objects, number of sub-objects, sizes, distributions, etc.). For example, this activity may demonstrate how subjective samples compare to random samples and whether there is sampling bias. The aim is to show why randomization is an important part of the process of data collection. This activity may also showcase random sampling, parametric assumptions, the effects of sample size, confidence and prediction.

- Look at the image and write down your guess as to the average area of the rectangles on the sheet. Each small square is one square unit.

- Now select 5 rectangles that you consider representative and write down the area of each. Compute the average of the five areas and compare it to your guess (are these close?).

- Use the SOCR random number generator to select 5 distinct random numbers between 1 and 100, each identifying one rectangle, and find the average area of these 5 rectangles. Repeat this process 10 times, and record the average area each time.

- Repeat the previous step using a sample of 10 (instead of 5) distinct random rectangles and compute the average area. Do that again 10 times and record the average area.

- Using all these data, calculate the means, standard deviations, and five-number summaries for the 2 distributions of sample averages (samples of 5 and samples of 10 rectangles).

- Questions:
  - How do the centers and spreads of the various distributions compare?
  - Which method of sampling (subjective or random) do you think is doing a better job? Why?
  - How does the amount of spread in the 10-rectangle-based sampling distribution compare with the 5-rectangle-based sampling distribution? Which distribution gives a more trustworthy estimate of the true population mean?
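
The random-sampling steps of this activity can also be sketched outside the SOCR applets. The Python example below is only an illustration under stated assumptions: the population of 100 rectangle areas is synthetic (the actual rectangle sheet is not reproduced here), and the helper names `sample_means` and `five_number_summary` are hypothetical. It draws 10 samples of 5 and 10 samples of 10 distinct rectangles and summarizes the resulting sample averages.

```python
import random
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the inclusive quartile method."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, q2, q3, max(values))


def sample_means(areas, sample_size, repeats, rng):
    """Average area of `sample_size` distinct random rectangles, repeated `repeats` times."""
    means = []
    for _ in range(repeats):
        # rng.sample draws without replacement, so the rectangles are distinct.
        chosen = rng.sample(range(len(areas)), sample_size)
        means.append(statistics.mean([areas[i] for i in chosen]))
    return means


rng = random.Random(2024)  # fixed seed so the run is reproducible

# ASSUMPTION: synthetic population of 100 rectangles with integer sides of 1 to 10 units.
areas = [rng.randint(1, 10) * rng.randint(1, 10) for _ in range(100)]
true_mean = statistics.mean(areas)

results = {}
for n in (5, 10):
    results[n] = sample_means(areas, n, repeats=10, rng=rng)
    print(
        f"n={n}: mean={statistics.mean(results[n]):.2f}, "
        f"sd={statistics.stdev(results[n]):.2f}, "
        f"five-number summary={five_number_summary(results[n])}"
    )
print(f"true population mean area = {true_mean:.2f}")
```

With only 10 repetitions per sample size, any single run is noisy. On average, the sample means based on 10 rectangles should vary less around the population mean than those based on 5, which is the pattern the questions above ask students to look for.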

#### Data Analysis Activities

Data-analysis activities involve choosing a dataset from the SOCR Data collection, drafting interesting hypotheses and/or research questions about the data (or the underlying process), using the SOCR tools to analyze the data, and drawing statistical inferences and making decisions about the initial research questions based on the findings.

For example, one can use the Ranking of the top 100 Countries in the World dataset to ask many questions about which factors influence, and how, the human perception of which countries are the "best" to live in. One may ask how the overall country ranking (a lower rank like 1, 2, 3, etc. is better, a higher rank like 98, 99, 100 is worse) is affected by the country's religiosity, education level, health services, etc. Fitting a multiple linear regression with the overall ranking as the response (dependent variable) and all of the other variables as predictors (independent variables, or covariates) may provide clues about how well the predictors explain the observed country rankings.
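
As a rough illustration of the regression step, the sketch below fits a multiple linear regression by ordinary least squares using only the Python standard library. It does not use the actual country dataset or the SOCR tools: the two predictors (`education`, `health`) and the rank-like response are synthetic stand-ins, and the helpers `solve_linear_system` and `fit_ols` are hypothetical names introduced for this example.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ols(rows, y):
    """Least-squares fit via the normal equations; returns [intercept, b1, b2, ...]."""
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


rng = random.Random(7)  # fixed seed so the run is reproducible

# ASSUMPTION: synthetic predictors on a 0-100 scale for 100 hypothetical countries.
rows = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(100)]
# ASSUMPTION: rank-like response in which higher education and health give a lower (better) value.
response = [100.0 - 0.5 * e - 0.4 * h + rng.gauss(0, 1) for e, h in rows]

coef = fit_ols(rows, response)
print("intercept, education, health:", [round(c, 3) for c in coef])
```

Negative slopes here mean that higher predictor values go with a lower, that is better, rank-like response. With the real data, the signs and sizes of the fitted coefficients are what would be read as clues about each predictor's relationship to the overall ranking.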
