Selecting an Experimental Design | AP Stats Unit 3 Study Guide

Quick Summary

This guide will equip you to select and justify the most appropriate experimental design for a given research question. You will learn to distinguish between a completely randomized design, a randomized block design, and a matched pairs design, understanding that the goal of blocking is to reduce unwanted variability in the results, allowing for a clearer conclusion about cause and effect. After mastering this material, you will be able to describe in detail how to implement any of these designs in a real-world scenario.

Key Concepts

An experimental design is the overall strategy for how you will collect data in an experiment. The goal is to structure the experiment so that you can make a valid cause-and-effect conclusion. The choice of design depends on the nature of your experimental units and the variables you suspect might influence the outcome.

Completely Randomized Design (CRD)
- What it is: The most basic experimental design. In a CRD, all experimental units are assigned to the treatments completely at random. Think of it as putting all the subjects' names in a hat, mixing them up, and drawing out names for each treatment group.
- When to use it: This design is best when you believe the experimental units are relatively homogeneous (similar to each other) with respect to any variables that could affect the response.
- Primary Goal: To create treatment groups that are roughly equivalent at the beginning of the experiment. The random assignment should balance out the effects of all other variables (both known and unknown) among the treatment groups, so that the only systematic difference between them is the treatment they receive.
- Example: To test the effect of three different energy drinks on students' test scores, you could take a group of 60 student volunteers and randomly assign 20 to Drink A, 20 to Drink B, and 20 to Drink C.
[Image: Diagram of a Completely Randomized Design showing a pool of experimental units being randomly assigned to three different treatment groups.]
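The names-in-a-hat idea can be sketched in a few lines of Python. This is only an illustration, assuming the 60 volunteers from the example are numbered 1 to 60 and that the group sizes divide evenly; the function name `completely_randomized` and the seed value are hypothetical choices made for this sketch.

```python
import random

def completely_randomized(units, treatments, seed=None):
    """Shuffle all units once, then deal them into equal-sized treatment groups.

    Assumes len(units) is a multiple of len(treatments).
    """
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)  # the "mix the names in the hat" step
    size = len(shuffled) // len(treatments)
    return {
        t: sorted(shuffled[i * size:(i + 1) * size])
        for i, t in enumerate(treatments)
    }

# 60 student volunteers, labeled 1..60 (labels are an assumption for the sketch)
groups = completely_randomized(range(1, 61), ["Drink A", "Drink B", "Drink C"], seed=1)
for name, members in groups.items():
    print(name, members)
```

Every student has the same chance of landing in each group, which is what lets the groups be treated as roughly equivalent before the drinks are given.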
Randomized Block Design
- What it is: A design where experimental units are first grouped into blocks based on a shared characteristic that is expected to affect the response variable. Then, within each block, a completely randomized design is carried out.
- What is a block? A block is a group of experimental units that are known before the experiment to be similar in some way that is expected to affect the response to the treatments.
- When to use it: Use this design when you have a group of experimental units that are heterogeneous (different from each other) in a specific way. By blocking, you can control for the effect of that specific variable.
- Primary Goal: To reduce unwanted variability. By grouping similar subjects together, you can account for the variation between the blocks and get a much clearer picture of the effect of the treatment within the blocks. The key idea is: "Block what you can, randomize what you cannot."
- Example: A researcher wants to test the effectiveness of a new fertilizer on tomato plants. They know that the location in the greenhouse (full sun vs. partial shade) will significantly impact growth. They should create two blocks: "full sun" and "partial shade." Within the full sun block, they would randomly assign half the plants to the new fertilizer and half to a control. They would do the same thing independently within the partial shade block. This separates the effect of the location from the effect of the fertilizer.
[Image: Diagram of a Randomized Block Design. It shows a pool of subjects first being separated into two blocks (e.g., Block A and Block B). Then, within each block, subjects are randomly assigned to the available treatments.]
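A possible sketch of the fertilizer example in Python follows. The source does not say how many tomato plants there are, so the 10 plants per location, the plant labels, and the function name `randomized_block` are all assumptions made for illustration.

```python
import random

def randomized_block(blocks, treatments, seed=None):
    """Randomly assign treatments separately within each block.

    blocks: dict mapping block name -> list of unit labels.
    Assumes each block size is a multiple of len(treatments).
    """
    rng = random.Random(seed)
    assignment = {}
    for block_name, units in blocks.items():
        shuffled = list(units)
        rng.shuffle(shuffled)  # a fresh, independent shuffle for this block only
        size = len(shuffled) // len(treatments)
        assignment[block_name] = {
            t: shuffled[i * size:(i + 1) * size]
            for i, t in enumerate(treatments)
        }
    return assignment

# Hypothetical greenhouse with 10 plants in each location
blocks = {
    "full sun": [f"S{i}" for i in range(1, 11)],
    "partial shade": [f"P{i}" for i in range(1, 11)],
}
plan = randomized_block(blocks, ["new fertilizer", "control"], seed=2)
for block_name, groups in plan.items():
    print(block_name, groups)
```

Because the shuffle happens inside each block, both treatments always appear equally often in full sun and in partial shade, so location cannot be mixed up with the fertilizer effect.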
Matched Pairs Design
- What it is: A special and powerful type of randomized block design where the "blocks" are of size two. These pairs of experimental units are matched as closely as possible.
- Two common forms of matched pairs:
  1. Similar Subjects: Two subjects who are very similar in key characteristics are paired up. For each pair, one subject is randomly assigned to Treatment 1 and the other to Treatment 2. (e.g., pairing two people with the same age, gender, and fitness level to test a new diet).
  2. Each Subject as Their Own Control: A single subject receives both treatments, often in a random order. The two measurements on the same subject form the "pair." (e.g., a person's blood pressure is measured before and after taking a medication; a person tests both a name-brand and a generic tire on their car, with the order randomized).
- When to use it: This is the ideal design when you can create very close matches or when it is feasible for each subject to receive both treatments.
- Primary Goal: To control for as much variability as possible. Since the two units in a pair are so similar (or are the same person), almost all the variation between them can be attributed to the treatment. This makes it easier to detect a treatment effect.
[Image: Diagram of a Matched Pairs Design. It shows subjects being formed into pairs. For each pair, a coin flip determines which subject gets Treatment A and which gets Treatment B.]
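The coin flip within each pair could be sketched as below. The subject names, the treatment labels, and the function name `assign_within_pairs` are hypothetical, and a simulated 50/50 draw stands in for a physical coin.

```python
import random

def assign_within_pairs(pairs, seed=None):
    """For each matched pair, a simulated coin flip decides who gets which treatment."""
    rng = random.Random(seed)
    plan = []
    for first, second in pairs:
        if rng.random() < 0.5:  # "heads": first subject gets Treatment A
            plan.append({"Treatment A": first, "Treatment B": second})
        else:                   # "tails": the assignment is reversed
            plan.append({"Treatment A": second, "Treatment B": first})
    return plan

# Hypothetical pairs, already matched on age, gender, and fitness level
pairs = [("Ana", "Ben"), ("Cy", "Dee"), ("Eli", "Fay")]
plan = assign_within_pairs(pairs, seed=3)
for row in plan:
    print(row)
```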

Key Vocabulary

Completely Randomized Design (CRD): An experimental design where all experimental units are assigned to treatments entirely by chance.
Block: A group of experimental units that are known before the experiment to be similar in some way that is expected to affect the response to the treatments.
Randomized Block Design: An experimental design where subjects are first divided into blocks, and then a completely randomized design is conducted separately within each block.
Matched Pairs Design: A special case of a randomized block design where blocks are of size two. The "pairs" can be two similar subjects or one subject who receives both treatments.
Variability: The natural differences in the responses of experimental units. A key goal of good design is to control and reduce this variability to better see the effect of the treatments.

Calculator Tech (TI-84)

No major calculator functions are required for this topic. The focus is on the conceptual understanding and description of the designs.

How to Show Work on the FRQ

On the AP exam, you will be asked to describe how to implement a specific experimental design. Your description must be clear, detailed, and replicable. Use the following templates.

How to Describe a Completely Randomized Design

Your description must clearly explain the process of random assignment.

Label: "Obtain a list of all [number] experimental units. Assign each unit a unique integer from 1 to [total number N]."
Randomize: "Use a random number generator (like randInt(1, N)) to generate [number for group 1] unique integers."
Assign: "The units corresponding to these numbers will receive [Treatment 1]. The next [number for group 2] unique integers generated will be assigned to [Treatment 2], and the remaining [number for group 3] units will be assigned to [Treatment 3]."
Compare: "After [time period], we will measure the [response variable] for each unit and compare the [e.g., average response] across the treatment groups."
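The Label, Randomize, and Assign steps can be mimicked in Python to see what "unique integers" means in practice. This sketch assumes the 60-unit, three-group setup from the energy drink example; `draw_unique` is a hypothetical helper that imitates repeated randInt(1, N) calls while ignoring repeats.

```python
import random

def draw_unique(n_total, n_needed, used, rng):
    """Imitate repeated randInt(1, N) calls, skipping integers already drawn."""
    chosen = []
    while len(chosen) < n_needed:
        x = rng.randint(1, n_total)  # inclusive on both ends, like randInt(1, N)
        if x not in used:            # ignore repeats
            used.add(x)
            chosen.append(x)
    return chosen

rng = random.Random(3)  # seed fixed only so the sketch is reproducible
N = 60
used = set()
group1 = draw_unique(N, 20, used, rng)  # first 20 unique integers
group2 = draw_unique(N, 20, used, rng)  # next 20 unique integers
group3 = [u for u in range(1, N + 1) if u not in used]  # the remaining units
print(sorted(group1), sorted(group2), group3, sep="\n")
```

On paper, the phrase "ignoring repeats" does the same job as the `used` set here.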

How to Describe a Randomized Block Design

Your description must first explain the blocking, then the randomization within the blocks.

Create Blocks: "First, create blocks based on [the blocking variable]. For example, create one block of all the [description of block 1, e.g., 'males'] and a second block of all the [description of block 2, e.g., 'females']."
Randomize Within Block 1: "Within the first block ([block 1 name]), label the [number] units from 1 to $N_1$. Use a random number generator to select [number] units to receive [Treatment 1]. The remaining units in this block will receive [Treatment 2]."
Randomize Within Block 2: "Repeat this process independently for the second block ([block 2 name]). Label the [number] units from 1 to $N_2$. Use a random number generator to select [number] units to receive [Treatment 1]. The remaining units in this block will receive [Treatment 2]."
Compare: "After [time period], we will measure the [response variable] for all units. We will then compare the results of the treatments, making sure to do the comparison separately within each block first, and then overall."
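To illustrate the Compare step, the sketch below computes the treatment difference separately within each block. All response values are made up for illustration, and `mean_response` is a hypothetical helper.

```python
from statistics import mean

# Hypothetical results: (block, treatment, response). All numbers are invented.
results = [
    ("block 1", "Treatment 1", 30), ("block 1", "Treatment 1", 34),
    ("block 1", "Treatment 2", 25), ("block 1", "Treatment 2", 27),
    ("block 2", "Treatment 1", 18), ("block 2", "Treatment 1", 20),
    ("block 2", "Treatment 2", 14), ("block 2", "Treatment 2", 16),
]

def mean_response(rows, block, treatment):
    """Average response for one treatment inside one block."""
    return mean(r for b, t, r in rows if b == block and t == treatment)

within_block_diff = {}
for block in ("block 1", "block 2"):
    within_block_diff[block] = (
        mean_response(results, block, "Treatment 1")
        - mean_response(results, block, "Treatment 2")
    )
    print(block, "mean difference (T1 - T2):", within_block_diff[block])
```

In this invented data the two blocks have very different overall response levels, yet the treatment difference inside each block is easy to read because units are only compared with similar units.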

How to Describe a Matched Pairs Design

Your description must explain how pairs are formed and how treatments are randomized within each pair.

Form Pairs: "Create pairs of experimental units based on [criteria for matching, e.g., 'similar age and weight']. OR, if applicable: 'Each subject will serve as their own control, so each subject forms a pair of observations (one for each treatment).'"
Randomize Within Pairs: "For each pair, we will use a random process, such as flipping a coin. If the coin lands on heads, the first subject in the pair will receive [Treatment 1] and the second subject will receive [Treatment 2]. If the coin lands on tails, the assignment will be reversed." (If using subjects as their own control, you would say: "The order of treatments will be randomized for each subject.")
Compare: "After the experiment, we will measure the [response variable]. For each pair, we will calculate the difference in the response variable between the two treatments. We will then analyze these differences to see if there is evidence that one treatment is more effective."
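The "calculate the difference for each pair" step might look like this in Python. The five pairs of response values are invented purely for illustration.

```python
from statistics import mean, stdev

# Hypothetical responses for 5 pairs (made-up numbers, illustration only).
# Position i in each list belongs to the same pair.
treatment_1 = [12.1, 9.8, 11.4, 10.2, 13.0]
treatment_2 = [11.3, 9.9, 10.6, 9.5, 12.2]

# One difference per pair: Treatment 1 minus Treatment 2
differences = [a - b for a, b in zip(treatment_1, treatment_2)]

print("differences:", [round(d, 2) for d in differences])
print("mean difference:", round(mean(differences), 2))
print("SD of differences:", round(stdev(differences), 2))
```

The analysis is carried out on the single list of differences, not on the two treatment lists as if they were independent groups.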

Practice Problems

Problem 1:

A biologist wants to test the effectiveness of a new pesticide on soybean plants. She has 40 soybean plants available for her experiment. She suspects that the amount of sunlight a plant receives (full sun vs. partial shade) could affect how well the pesticide works. 20 of the plants are in a full-sun location, and 20 are in a partial-shade location. She plans to compare the new pesticide to a placebo (a spray with no active ingredient). Describe a randomized block design for this experiment.

Solution:

First, we will create two blocks based on the amount of sunlight: one block for the 20 plants in the full-sun location and a second block for the 20 plants in the partial-shade location. This is appropriate because sunlight is expected to affect the response variable (plant health/insect damage). Within the full-sun block, we will label the 20 plants with unique integers from 1 to 20. Using a random number generator, we will select 10 unique numbers. The plants corresponding to these numbers will receive the new pesticide, and the remaining 10 plants in the full-sun block will receive the placebo. We will repeat this random assignment process independently for the 20 plants in the partial-shade block. After a set period of time, we will measure the level of insect damage (the response variable) for all 40 plants and compare the effectiveness of the pesticide to the placebo within each block and overall.

Problem 2:

An athletic shoe company has developed a new type of running shoe insert that it claims can improve a runner's time in a 1-mile race. They recruit 50 amateur runners to participate in an experiment. Explain how you would implement a matched pairs design for this experiment and justify why this design is preferable to a completely randomized design.

Solution:

A matched pairs design is most appropriate here. Each of the 50 runners will serve as their own control. For each runner, we will randomly determine the order in which they use the inserts. We can do this by flipping a coin for each runner: if it's heads, the runner will first race the mile with the new inserts and then, after a sufficient rest period, race the mile with standard inserts. If it's tails, the order will be reversed (standard inserts first, then new inserts). We will record the 1-mile race time for each condition. After all runners have completed both races, we will analyze the differences in race times for each individual runner. This matched pairs design is preferable to a completely randomized design because running ability varies tremendously from person to person. By having each runner test both inserts, we control for the massive variability in natural running speed among the participants. This allows us to isolate the effect of the insert itself, making it much easier to detect a true difference if one exists.
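A small simulation can make the justification concrete. Everything here is an assumed model for illustration, not data from the problem: each of the 50 runners gets a made-up baseline mile time, each race adds a little random noise, and the insert is given no effect at all.

```python
import random
from statistics import stdev

rng = random.Random(4)  # seed fixed only so the sketch is reproducible
n = 50

# Assumed model (hypothetical numbers): baseline mile time in seconds per runner,
# with large runner-to-runner spread and small race-to-race noise.
baseline = [rng.gauss(480, 60) for _ in range(n)]
with_insert = [b + rng.gauss(0, 5) for b in baseline]
with_standard = [b + rng.gauss(0, 5) for b in baseline]

# Matched pairs analysis: one difference per runner, so the baseline cancels out
paired_diffs = [a - b for a, b in zip(with_insert, with_standard)]

print("SD of individual race times:", round(stdev(with_standard), 1))
print("SD of within-runner differences:", round(stdev(paired_diffs), 1))
```

Under these assumptions the within-runner differences vary far less than the raw race times, because each runner's natural speed drops out of the subtraction. That smaller spread is what makes a real insert effect easier to detect.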

Common Mistakes to Avoid

Confusing Blocking with Stratifying: This is the most common error. Blocking is for experiments; you group similar subjects to control for a variable before assigning treatments. Stratifying is for sampling; you split the population into groups (strata) and sample from each one to ensure representation in your sample. You block to reduce variability in the response; you stratify to reduce variability in your sample estimates.
Forgetting to Randomize Within Blocks: Simply creating blocks is not enough. The key to a valid randomized block design is the random assignment of treatments within each block. Failing to describe this step invalidates the design.
Choosing a Poor Blocking Variable: You should only block on a variable that you strongly suspect will affect the response variable. Blocking by a variable that is unrelated to the response (e.g., blocking by eye color when testing a new fertilizer) does not reduce variability and only complicates the experiment unnecessarily.
Describing a Block Design as "Completely Randomized": If you first separate subjects into groups (blocks) and then randomize within those groups, it is a randomized block design, not a completely randomized design. A CRD has only one single pool of subjects for randomization.
Stating that Blocking Eliminates Confounding: Blocking does not eliminate all confounding. It is a powerful tool to control for the specific variable you blocked on. Random assignment is still essential within the blocks to help balance out all other potential confounding variables that you did not block on.
