PrepGo

Introduction to Experimental Design - AP Statistics Study Guide

Written by AP Content Team, Verified for 2026 AP Exams, Last updated: May 2026

Learn with study guides reviewed by top AP teachers. This guide takes about 24 minutes to read.

Quick Summary

This guide will enable you to distinguish between observational studies and experiments, recognizing that only well-designed experiments can establish a cause-and-effect relationship. You will learn to identify the key components of an experiment—subjects, factors, treatments, and response variables—and master the four foundational principles of experimental design: comparison, random assignment, control, and replication. By the end of this lesson, you will be able to design and describe a completely randomized design, a randomized block design, and a matched pairs design to answer a research question.

Key Concepts

  • Observational Study vs. Experiment: This is the most fundamental distinction in data collection.

    • An observational study observes individuals and measures variables of interest but does not attempt to influence the responses. For example, a study might track the health of people who choose to drink coffee versus those who do not. Observational studies can show an association but cannot establish causation, because confounding variables may explain the association.

    • An experiment deliberately imposes some treatment on individuals to measure their responses. For example, researchers might randomly assign participants either to drink coffee or not, and then track their health. By controlling other variables and using random assignment, a well-designed experiment can establish a cause-and-effect relationship.

  • The Language of Experiments:

    • Experimental Units: The individuals (people, animals, or objects) to which treatments are applied. When the units are human, they are often called subjects.

    • Explanatory Variable (Factor): The variable that is intentionally manipulated by the researchers.

    • Levels: The specific values of the explanatory variable (factor) used in the experiment.

    • Treatment: A specific experimental condition applied to the units. If there is only one factor, the treatments are the same as the levels. If there are multiple factors, a treatment is a combination of levels from each factor.

    • Response Variable: The variable that is measured to assess the outcome of the study.

    Example: A study wants to test the effect of different fertilizers (Factor) on the height of tomato plants (Response Variable). The fertilizers are Fertilizer A, Fertilizer B, and a No-Fertilizer control (Levels). The three treatments are applying Fertilizer A, applying Fertilizer B, and applying no fertilizer. The experimental units are the tomato plants.
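To make the factor, level, and treatment vocabulary concrete, here is a minimal Python sketch that enumerates treatments. The single-factor case mirrors the fertilizer example above. The second factor (watering frequency) and its levels are hypothetical, added only to show how treatments become combinations of levels when there is more than one factor.

```python
from itertools import product

# One factor: the treatments are simply the levels (from the fertilizer example).
fertilizer_levels = ["Fertilizer A", "Fertilizer B", "No fertilizer"]
one_factor_treatments = [(level,) for level in fertilizer_levels]

# Two factors: each treatment is one level of each factor.
# The watering factor is a hypothetical addition, not part of the original example.
watering_levels = ["daily", "weekly"]
two_factor_treatments = list(product(fertilizer_levels, watering_levels))

print(len(one_factor_treatments))   # 3 treatments
print(len(two_factor_treatments))   # 3 levels x 2 levels = 6 treatments
for treatment in two_factor_treatments:
    print(treatment)
```

With two factors, every plant would receive one of the six (fertilizer, watering) combinations rather than one of three fertilizers.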

  • The Four Principles of Experimental Design:

    1. Comparison: An experiment must compare two or more treatments. Simply giving a treatment to one group and observing the result is not enough, because any change could be due to the placebo effect, the passage of time, or other factors rather than the treatment itself. We need to compare the results to a group that received a different treatment or to a control group (which may receive a placebo).

    2. Random Assignment: This is the cornerstone of a valid experiment. Treatments must be assigned to experimental units using a chance process (like flipping a coin, drawing names from a hat, or using a random number generator). Random assignment creates roughly equivalent groups at the beginning of the experiment by balancing the effects of lurking variables that we cannot control. This is the mechanism that allows us to infer causation.

    3. Control: Researchers must actively control for other variables that could affect the response. This is done by keeping these other variables constant for all experimental units. For example, in the fertilizer experiment, all plants should receive the same amount of water and sunlight and be planted in the same type of soil. This prevents these other variables from becoming confounding variables.

    4. Replication: The experiment must use enough experimental units in each treatment group that differences in the effects of the treatments can be distinguished from chance variation among individual units. With only a single unit per group, a treatment effect cannot be separated from that unit's individual characteristics.
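The way random assignment and replication work together can be sketched with a small simulation. This is an illustration under stated assumptions, not a proof: the lurking variable (age, drawn uniformly from 20 to 60), the function name `average_imbalance`, and the group sizes are all hypothetical choices.

```python
import random
from statistics import mean

def average_imbalance(n_units, n_trials=500, seed=1):
    """Average gap in mean age between two randomly assigned groups."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_trials):
        # Hypothetical lurking variable: each unit's age, which we do not control.
        ages = [rng.uniform(20, 60) for _ in range(n_units)]
        rng.shuffle(ages)                      # a chance process decides the groups
        half = n_units // 2
        group_a, group_b = ages[:half], ages[half:]
        gaps.append(abs(mean(group_a) - mean(group_b)))
    return mean(gaps)

small = average_imbalance(4)     # 2 units per group
large = average_imbalance(400)   # 200 units per group
print(f"Average age gap with 2 per group:   {small:.2f} years")
print(f"Average age gap with 200 per group: {large:.2f} years")
```

With only two units per group, chance alone often leaves the groups quite different in age. With 200 per group, random assignment tends to produce groups that are roughly equivalent on the lurking variable.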

  • Confounding: A confounding variable is a variable that is related to both the explanatory variable and the response variable, making it impossible to determine which variable is causing the change in the response. Random assignment is the primary tool to combat confounding by balancing the effects of unknown or uncontrollable confounding variables among the treatment groups.

    [Image: Diagram showing a confounding variable with arrows pointing to both the explanatory variable and the response variable, illustrating the tangled relationship.]
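A short simulation can show what confounding does to a comparison. In this sketch the treatment has no effect at all, and a hypothetical confounder `z` both raises the response and makes units more likely to choose the treatment. All numbers (the 0.8 and 0.2 choice probabilities, the 10-point shift, the group size of 2000) are assumptions made for illustration.

```python
import random
from statistics import mean

rng = random.Random(42)

def response(z):
    # The treatment has NO effect here; only the confounder z shifts the response.
    return 50 + 10 * z + rng.gauss(0, 5)

def group_gap(assign):
    """Difference in mean response (treated minus untreated) for 2000 units."""
    treated, untreated = [], []
    for _ in range(2000):
        z = rng.random() < 0.5                 # hypothetical confounder (True/False)
        (treated if assign(z) else untreated).append(response(z))
    return mean(treated) - mean(untreated)

# Observational study: units with z are far more likely to choose the treatment.
observational = group_gap(lambda z: rng.random() < (0.8 if z else 0.2))
# Experiment: a coin flip decides the treatment, regardless of z.
randomized = group_gap(lambda z: rng.random() < 0.5)

print(f"Observational gap: {observational:.2f}")
print(f"Randomized gap:    {randomized:.2f}")
```

The self-selected groups show a sizable gap even though the treatment does nothing, while the randomized groups show a gap near zero because `z` is balanced across them.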

  • Types of Experimental Designs:

    • Completely Randomized Design (CRD): The simplest design. All experimental units are assigned to treatments completely at random.

      [Image: Flowchart of a Completely Randomized Design. Box on left says "Experimental Units". Arrow points to "Random Assignment". From there, arrows split to "Treatment Group 1", "Treatment Group 2", etc. Below these, arrows converge to a final box "Compare Results".]
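The flowchart above can be sketched in a few lines of Python. The function name `completely_randomized_design`, the plant labels, and the seed are hypothetical; the idea is simply to shuffle every unit and then deal the units out to the treatments.

```python
import random

def completely_randomized_design(units, treatments, seed=None):
    """Shuffle all units, then deal them out to treatments in near-equal groups."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

plants = [f"plant_{i}" for i in range(1, 31)]   # 30 hypothetical tomato plants
groups = completely_randomized_design(
    plants, ["Fertilizer A", "Fertilizer B", "No fertilizer"], seed=2026
)
for treatment, members in groups.items():
    print(treatment, len(members))
```

Dealing from a shuffled list, rather than flipping a coin for each unit, keeps the group sizes as equal as possible.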

    • Randomized Block Design: Used to reduce unwanted variability. First, subjects are grouped into blocks based on a shared characteristic that is expected to affect the response (e.g., age, gender, breed). Then, within each block, subjects are randomly assigned to the different treatments. This ensures that each treatment is tested on subjects with similar characteristics, making the comparison more precise.

      [Image: Flowchart of a Randomized Block Design. Box on left says "Experimental Units". Arrow points to "Group into Blocks". From there, arrows point to "Block 1", "Block 2", etc. Within each block box, an arrow points to "Random Assignment", which then splits to "Treatment 1", "Treatment 2", etc. At the end, all results are brought together in a "Compare Results" box.]
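A minimal sketch of the same flow in Python follows. The function name `randomized_block_design`, the subject labels, and the age-based blocks are hypothetical. The key point is that shuffling happens separately inside each block.

```python
import random
from collections import defaultdict

def randomized_block_design(units, block_of, treatments, seed=None):
    """Group units into blocks, then randomly assign treatments within each block."""
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for unit in units:
        blocks[block_of(unit)].append(unit)
    assignment = {}
    for block, members in blocks.items():
        rng.shuffle(members)                   # randomization happens inside the block
        for i, unit in enumerate(members):
            assignment[unit] = (block, treatments[i % len(treatments)])
    return assignment

# 12 hypothetical subjects, blocked by age group.
subjects = [(f"S{i}", "under 40" if i <= 6 else "40 and over") for i in range(1, 13)]
assignment = randomized_block_design(
    subjects, lambda s: s[1], ["Treatment 1", "Treatment 2"], seed=7
)
for subject, (block, treatment) in assignment.items():
    print(subject[0], block, treatment)
```

Each age block ends up with three subjects on each treatment, so age cannot systematically favor either treatment.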

    • Matched Pairs Design: A special type of block design in which either the blocks are of size two or each subject receives both treatments. The "pairs" can be two very similar subjects (like twins) or the same subject at two different times or on two different parts of the body (e.g., testing a new skin cream on a person's left arm vs. right arm). When a pair consists of two subjects, randomly assign one treatment to each member of the pair; when each subject receives both treatments, randomize the order (or placement) of the treatments.
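For the skin cream example, the randomization within each subject might look like the sketch below. The subject labels, the seed, and the name "comparison cream" for the second treatment are hypothetical; the source names only the new cream.

```python
import random

rng = random.Random(3)
subjects = [f"subject_{i}" for i in range(1, 9)]   # 8 hypothetical subjects

plan = {}
for subject in subjects:
    arms = ["left arm", "right arm"]
    rng.shuffle(arms)   # plays the role of a coin flip for this subject
    # "comparison cream" is a hypothetical label for the second treatment.
    plan[subject] = {"new cream": arms[0], "comparison cream": arms[1]}

for subject, arms in plan.items():
    print(subject, arms)
```

Every subject receives both treatments, and chance alone decides which arm gets which cream.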

  • Blinding and Placebos:

    • A placebo is a "dummy" treatment with no active ingredient. The placebo effect occurs when subjects show a response simply because they believe they are receiving a real treatment. Using a placebo as a control allows researchers to separate the physiological effects of a treatment from the psychological placebo effect.

    • Blinding is the practice of withholding information about which treatment is being administered.

      • Single-blind: Either the subjects or the researchers interacting with them do not know who is receiving which treatment.

      • Double-blind: Neither the subjects nor the researchers interacting with them know who is receiving which treatment. This is the gold standard as it prevents bias from both the subjects' expectations and the researchers' potential influence.

  • Scope of Inference:

    • Random Sampling of subjects from a population allows us to generalize our conclusions to that larger population.

    • Random Assignment of subjects to treatments allows us to make cause-and-effect conclusions.

    [Image: A 2x2 grid. Top labels: "Random Assignment", "No Random Assignment". Side labels: "Random Sampling", "No Random Sampling". The four cells contain the appropriate inference: (1) Generalize and Infer Causation, (2) Infer Causation but cannot Generalize, (3) Generalize but cannot Infer Causation, (4) Cannot Generalize or Infer Causation.]
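The 2x2 grid can also be written as a tiny lookup function. This is only a restatement of the two rules above; the function name `scope_of_inference` is hypothetical.

```python
def scope_of_inference(random_sampling: bool, random_assignment: bool) -> str:
    """Return the conclusions a study design supports."""
    generalize = (
        "can generalize to the population"
        if random_sampling
        else "cannot generalize to the population"
    )
    causation = "can infer causation" if random_assignment else "cannot infer causation"
    return f"{generalize}; {causation}"

for sampling in (True, False):
    for assignment in (True, False):
        print(f"sampling={sampling}, assignment={assignment}: "
              f"{scope_of_inference(sampling, assignment)}")
```

The two questions are answered independently: sampling determines generalization, and assignment determines causation.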

Key Vocabulary

  • Experiment: A study in which researchers deliberately impose treatments on experimental units to observe their responses, allowing for the investigation of cause-and-effect relationships.

  • Random Assignment: The use of a chance process to assign experimental units to treatment groups. This creates roughly equivalent groups and is the key principle for establishing causation.

  • Confounding Variable: A variable that is associated with both the explanatory variable and the response variable, making it difficult to separate their effects on the response.

  • Control Group: A group of experimental units that receives a baseline treatment, such as a placebo or the existing standard treatment, used for comparison with the other treatment groups.

  • Randomized Block Design: An experimental design where subjects are first grouped into blocks based on a shared characteristic, and then random assignment to treatments is carried out separately within each block.

  • Placebo Effect: The phenomenon where some subjects experience a real change in their condition after receiving a fake or inactive treatment simply because they expect it to work.

  • Double-Blind: An experimental procedure in which neither the subjects nor the researchers administering the treatments and measuring the response know which treatment any individual subject is receiving.

Calculator Tech (TI-84)

While most of this topic is conceptual, the TI-84 can be used to perform random assignment.

  • Using randInt( for Random Assignment: To randomly assign 20 subjects to two treatment groups (10 in each):

    1. Label the subjects with unique integers from 1 to 20.

    2. On the calculator, press MATH, go to the PRB menu, and select 5:randInt(.

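    3. Enter randInt(1, 20, 10). This will generate 10 random integers from 1 to 20, inclusive. Because randInt( can return the same integer more than once, the list may contain repeats.

    4. Ignore any repeated numbers and keep generating integers until you have 10 different ones. The subjects with these 10 labels will be assigned to Treatment Group A; the remaining 10 subjects will be assigned to Treatment Group B. If your calculator has a built-in function for random integers without repetition, use that instead to sample without replacement in one step.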
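For readers who want to check the procedure off the calculator, the steps above can be mimicked in Python. This is a sketch of the same idea, not TI-84 code; the function name `draw_distinct` and the seed are hypothetical.

```python
import random

def draw_distinct(low, high, count, seed=None):
    """Draw random integers one at a time, skipping repeats, until `count` are distinct."""
    rng = random.Random(seed)
    chosen = []
    while len(chosen) < count:
        value = rng.randint(low, high)   # like randInt(low, high): repeats are possible
        if value not in chosen:          # ignore a label that was already drawn
            chosen.append(value)
    return chosen

group_a = sorted(draw_distinct(1, 20, 10, seed=5))
group_b = [label for label in range(1, 21) if label not in group_a]
print("Group A:", group_a)
print("Group B:", group_b)
```

In Python, `random.sample(range(1, 21), 10)` gives 10 distinct labels in one step, which plays the same role as a calculator function that samples without replacement.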

How to Show Work on the FRQ

Describing an experimental design on an FRQ requires a clear, step-by-step process that another person could easily follow. Always be specific about the randomization method.

Template for a Completely Randomized Design (CRD):

  1. Subjects & Treatments: Start with the [number] of [experimental units/subjects]. The treatments are [list the specific treatments, including any control or placebo].

  2. Random Assignment: Describe the randomization process. "For each subject, we will [flip a coin / roll a die / use a random number generator]. If we get [outcome 1], the subject will be assigned to [Treatment Group 1]. If we get [outcome 2], the subject will be assigned to [Treatment Group 2]." OR "Label each of the [number] subjects with a unique integer from 1 to [number]. Then write the numbers on identical slips of paper, place them in a hat, and mix thoroughly. The subjects whose numbers are on the first [group size] slips drawn will be assigned to [Treatment Group 1], the next [group size] to [Treatment Group 2], and so on."

  3. Administration of Treatments: State that all other potential confounding variables (e.g., environment, diet, care) will be kept the same for all groups. If applicable, mention if the experiment will be single- or double-blind.

  4. Response Variable & Comparison: After [a specified amount of time], we will measure the [name of the response variable] for all subjects. We will then compare the [average/proportion of the response variable] across the treatment groups to see if there is a statistically significant difference.

Template for a Randomized Block Design:

  1. Subjects & Blocking: Start with the [number] of [experimental units/subjects].

  2. Form Blocks: Create blocks based on the [blocking variable, e.g., age, fitness level]. For example, "We will form blocks of subjects with similar [blocking variable]."

  3. Random Assignment within Blocks: "Within each block, we will randomly assign the subjects to the treatments. To do this, for each block, we will [describe the randomization process, e.g., 'flip a coin for each pair of subjects to assign one to Treatment A and the other to Treatment B']." This step is critical; randomization must occur within each block.

  4. Administration & Response: The rest is similar to the CRD. Describe administering the treatments and keeping other variables constant. After the experiment, measure the [response variable].

  5. Comparison: "We will compare the results of the treatments within each block and then combine the results from all blocks to make an overall comparison."

Practice Problems

Problem 1: A pharmaceutical company has developed a new drug designed to reduce high blood pressure. They recruit 100 volunteers with high blood pressure to participate in a study. Describe a completely randomized design for this study.

Solution:

The 100 volunteers are the subjects. The two treatments will be the new drug and a placebo. First, we will label each volunteer with a unique integer from 1 to 100. We will then use a random number generator to select 50 unique integers from this range. The 50 volunteers corresponding to these numbers will be assigned to the experimental group and will receive the new drug. The remaining 50 volunteers will be assigned to the control group and will receive a visually identical placebo. To prevent bias, this experiment should be double-blind, meaning neither the volunteers nor the health professionals administering the pills and measuring blood pressure will know who is in which group. We will measure the blood pressure of all 100 volunteers at the start of the study and again after 8 weeks. Finally, we will compare the average reduction in blood pressure between the drug group and the placebo group to determine if the new drug is effective.
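The final comparison step can be sketched as follows. The readings are made-up numbers for four volunteers per group, used only to show the arithmetic, and the sketch assumes a baseline reading was taken at the start so that a reduction can be computed.

```python
from statistics import mean

# Made-up readings (systolic, mmHg) for illustration only; not real study data.
drug_baseline, drug_week8 = [150, 160, 155, 148], [140, 147, 146, 141]
placebo_baseline, placebo_week8 = [152, 158, 149, 151], [150, 155, 147, 150]

def reductions(baseline, followup):
    """Per-volunteer drop in blood pressure (positive means it went down)."""
    return [before - after for before, after in zip(baseline, followup)]

drug_mean = mean(reductions(drug_baseline, drug_week8))
placebo_mean = mean(reductions(placebo_baseline, placebo_week8))
difference = drug_mean - placebo_mean

print("Mean reduction, drug group:   ", drug_mean)
print("Mean reduction, placebo group:", placebo_mean)
print("Difference in mean reductions:", difference)
```

Whether a difference of this size is statistically significant is a separate question that requires an inference procedure.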

Problem 2: An agricultural scientist wants to test the effect of a new fertilizer on the yield of corn. She has a large field to use for the experiment, but she knows that the soil moisture varies from the north end (drier) to the south end (wetter). She has 30 plots of land available for the experiment and wants to test the new fertilizer against the current standard fertilizer. Describe a randomized block design for this experiment.

Solution:

The experimental units are the 30 plots of land. The treatments are the new fertilizer and the standard fertilizer. Because soil moisture is expected to affect corn yield and varies from north to south, we will block on the north-south position of the plots. We will create 15 blocks, each consisting of two plots at about the same north-south position, so that the two plots in a block have similar soil moisture. Within each block, we will label one plot "1" and the other "2" and flip a coin. If it's heads, plot 1 will receive the new fertilizer and plot 2 will receive the standard fertilizer. If it's tails, the assignment will be reversed. All other variables, such as watering schedule and pest control, will be kept constant for all 30 plots. At the end of the growing season, we will measure the yield of corn (in bushels per acre) for each plot. We will then compute the difference in yield (new minus standard) within each block and combine these differences across all 15 blocks to determine whether the new fertilizer provides a significantly different yield.
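The coin-flip assignment and the within-block comparison in this solution can be sketched in Python. The plot labels, the seed, and the helper name `mean_within_block_difference` are hypothetical, and no yield data is assumed; the helper only shows how yields would be combined once they are measured.

```python
import random

rng = random.Random(11)
# 15 blocks of two plots each; the plot labels are hypothetical.
blocks = {b: (f"block{b}_plot1", f"block{b}_plot2") for b in range(1, 16)}

assignment = {}
for b, (first, second) in blocks.items():
    heads = rng.random() < 0.5   # one coin flip per block
    new_plot, standard_plot = (first, second) if heads else (second, first)
    assignment[b] = {"new": new_plot, "standard": standard_plot}

def mean_within_block_difference(yields):
    """Average of (new - standard) yield across blocks.

    `yields` maps each plot label to its measured yield.
    """
    diffs = [yields[a["new"]] - yields[a["standard"]] for a in assignment.values()]
    return sum(diffs) / len(diffs)

for b, plots in assignment.items():
    print(b, plots)
```

Taking the difference inside each block first means that block-to-block differences in soil moisture cancel out of the comparison.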

Common Mistakes to Avoid

  • Confusing Random Sampling with Random Assignment: Random sampling involves selecting individuals from a population to participate in a study and is essential for generalizing results to that population. Random assignment involves assigning participants already in the study to treatment groups and is essential for making cause-and-effect conclusions. You can have one without the other.

  • Confusing Blocking with Stratifying: These are parallel concepts for different contexts. Blocking is done in experiments to reduce variability by creating homogeneous groups before random assignment. Stratifying is done in sampling to create representative samples by dividing the population into homogeneous groups before taking a random sample from each group.

  • Vague Description of Randomization: Do not just say "randomly assign subjects to groups." You must describe a specific, implementable process. The "names in a hat" method is a classic, foolproof description. Using a random number generator is also excellent, as long as you are specific about how you will use it (e.g., "label subjects 1-50, then generate 25 unique numbers for Group A").

  • Stating that Experiments Eliminate Confounding: Experiments don't magically eliminate confounding variables. Instead, random assignment works to balance the effects of potential confounding variables (both known and unknown) across the treatment groups, so that they do not systematically favor one outcome over another.

  • Misinterpreting "Control": The word "control" has two meanings in experiments. The principle of control refers to keeping extraneous variables constant for all groups. A control group is a specific group used for comparison. Don't use these terms interchangeably.
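To illustrate the blocking-versus-stratifying distinction from the list above, here is a minimal sketch of stratified random sampling. The population of 100 students, the class-year strata, and the function name `stratified_sample` are hypothetical. Note that this code selects who is in the study; it assigns no treatments, which is what separates stratifying from blocking.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, per_stratum, seed=None):
    """Take a separate simple random sample from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in population:
        strata[stratum_of(person)].append(person)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, per_stratum))   # sampling, not assignment
    return sample

# Hypothetical population: 50 juniors and 50 seniors.
population = [(f"student_{i}", "junior" if i % 2 else "senior") for i in range(1, 101)]
sample = stratified_sample(population, lambda p: p[1], per_stratum=5, seed=8)
for person in sample:
    print(person)
```

In a blocked experiment, the analogous step would randomly assign treatments within each group instead of drawing a sample from it.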