PrepGo

Random Sampling and Data Collection - AP Statistics Study Guide

Written by AP Content Team, Verified for 2026 AP Exams, Last updated: May 2026

This guide takes about 18 minutes to read.

Quick Summary

This guide will enable you to distinguish between observational studies and experiments, the two primary methods of data collection. You will learn that only well-designed experiments can establish a cause-and-effect relationship by controlling for confounding variables. By the end of this lesson, you will be able to identify the key components of an experiment—such as treatments, experimental units, and variables—and articulate why association does not imply causation in the context of observational studies.

Key Concepts

  • Observational Study vs. Experiment: This is the most fundamental distinction in data collection.

    • An observational study observes individuals and measures variables of interest but does not attempt to influence the responses. Researchers are passive observers. For example, a study that tracks a group of smokers and a group of non-smokers to compare their rates of lung cancer is an observational study because researchers are not assigning people to smoke.

    • An experiment deliberately imposes some treatment on individuals (the experimental units) to measure their responses. Researchers are active participants who manipulate a variable to see what happens. For example, a study that randomly assigns 100 people to take a new vitamin and 100 people to take a placebo is an experiment because a treatment is being actively imposed.

  • Why Experiments are Necessary for Causation:

    • The primary goal of an experiment is to establish that changes in an explanatory variable cause changes in a response variable.

    • Observational studies can show an association or correlation between variables, but they cannot prove causation. For example, an observational study might find that people who drink coffee live longer. This is an association. It does not mean coffee causes a longer life. Perhaps people who drink coffee also tend to exercise more or have healthier diets.

  • Confounding Variables:

    • A confounding variable is a variable that is related to both the explanatory variable and the response variable, making it impossible to separate its effect on the response from the effect of the explanatory variable. Confounding is the primary reason observational studies cannot be used to show cause-and-effect.

    • Classic Example: A study finds a strong positive association between ice cream sales (explanatory variable) and the number of drowning deaths (response variable). Does eating ice cream cause drowning? No. The confounding variable is temperature. When the temperature is high, people buy more ice cream, and people also go swimming more often, which leads to more drowning incidents. Temperature is associated with both ice cream sales and drowning, confounding the relationship between them.

    • [Image: Diagram showing a confounding variable (e.g., Temperature) with arrows pointing to both the explanatory variable (Ice Cream Sales) and the response variable (Drowning Deaths)]
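
The ice cream example can be checked with a small simulation. The sketch below is illustrative only: the numbers, the variable names, and the model (temperature drives both quantities, and ice cream sales have no effect on drownings) are all assumptions made up for this example, not real data. It compares the correlation between sales and drownings before and after the linear effect of temperature is removed from each.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def residuals(ys, xs):
    """What is left of ys after subtracting a least-squares line fitted on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [y - (mean_y + slope * (x - mean_x)) for x, y in zip(xs, ys)]

rng = random.Random(42)  # fixed seed so the run is repeatable
n_days = 2000

# Assumed model: temperature influences both variables.
# Ice cream sales never appear in the formula for drownings.
temperature = [rng.uniform(10, 35) for _ in range(n_days)]
ice_cream_sales = [20 * t + rng.gauss(0, 50) for t in temperature]
drownings = [0.3 * t + rng.gauss(0, 1.5) for t in temperature]

# Correlation that ignores temperature entirely.
raw_r = pearson(ice_cream_sales, drownings)

# Correlation after the linear effect of temperature is removed from both.
adjusted_r = pearson(residuals(ice_cream_sales, temperature),
                     residuals(drownings, temperature))

print(f"correlation ignoring temperature:       {raw_r:.2f}")
print(f"correlation after removing temperature: {adjusted_r:.2f}")
```

Under these assumptions the first correlation should come out strongly positive and the second close to zero, which is the pattern a confounding variable produces. In a real observational study the confounder is often unmeasured, so this kind of adjustment is usually not available.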

  • The Language of Experiments: To understand and describe experiments, you must know the specific vocabulary.

    • Experimental Units: The individuals (people, animals, or objects) to which treatments are applied. If the units are human, they are often called subjects.

    • Explanatory Variable (or Factor): The variable that is intentionally manipulated by the researcher. An experiment may have several explanatory variables.

    • Response Variable: The variable that measures the outcome of the study. It is what you record at the end of the experiment to compare the effects of the treatments.

    • Treatments: The specific experimental conditions applied to the units. A treatment is formed by a specific level of an explanatory variable. If there is one explanatory variable, the levels of that variable are the treatments. If there are multiple explanatory variables, a treatment is a combination of the levels of each variable.

    • Example: A researcher wants to test the effect of a new fertilizer and different watering schedules on tomato plant growth.

      • Experimental Units: The individual tomato plants.

      • Explanatory Variables: (1) Type of fertilizer, (2) Watering schedule.

      • Response Variable: The final weight of tomatoes produced by each plant.

      • Levels & Treatments: Suppose the fertilizer has two levels (New Formula, Old Formula) and the watering schedule has three levels (Daily, Every 3 Days, Weekly). The experiment would have 2 × 3 = 6 total treatments: (New Formula, Daily), (New Formula, Every 3 Days), (New Formula, Weekly), (Old Formula, Daily), (Old Formula, Every 3 Days), (Old Formula, Weekly).
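
The treatment count in the tomato example can be reproduced by pairing every fertilizer level with every watering level. This is a minimal sketch; the list names are chosen here for illustration.

```python
from itertools import product

fertilizers = ["New Formula", "Old Formula"]            # 2 levels
watering_schedules = ["Daily", "Every 3 Days", "Weekly"]  # 3 levels

# Each treatment is one combination of a fertilizer level and a watering level.
treatments = list(product(fertilizers, watering_schedules))

for number, (fertilizer, schedule) in enumerate(treatments, start=1):
    print(f"Treatment {number}: ({fertilizer}, {schedule})")

print(f"Total treatments: {len(treatments)}")
```

Adding a third explanatory variable would mean passing a third list to `product`, and the number of treatments would be multiplied by that variable's number of levels.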

Key Vocabulary

  • Experiment: A study in which researchers deliberately impose treatments on experimental units to measure their responses and determine a cause-and-effect relationship.

  • Observational Study: A study in which researchers observe individuals and measure variables of interest without attempting to influence the responses. Can only show association, not causation.

  • Confounding: Occurs when two variables are associated in such a way that their effects on a response variable cannot be distinguished from each other.

  • Explanatory Variable: The variable that is intentionally manipulated in an experiment to observe its effect on the response variable. Also known as a factor.

  • Response Variable: The variable that measures the outcome of a study.

  • Treatment: A specific condition applied to the experimental units in an experiment, corresponding to a level of the explanatory variable (or, when there are several explanatory variables, a combination of their levels).

  • Experimental Unit: The person, animal, or object to which a treatment is randomly assigned.

Calculator Tech (TI-84)

No major calculator functions are required for this topic.

How to Show Work on the FRQ

On the AP exam, you must be able to clearly communicate your understanding of data collection methods. Use these templates for precise, scorable answers.

How to Justify a Study Type and Explain Confounding

When asked to identify a study type and discuss causation, follow this structure:

  1. Identify the Study Type: "This is an observational study because the researchers did not impose a treatment. They merely observed [describe what was observed] and recorded [the response]." OR "This is an experiment because a treatment, [name the treatment], was deliberately imposed on the [experimental units]."

  2. Address Causation: "Because this is an observational study, we cannot conclude that [explanatory variable] causes a change in [response variable]. There may be a confounding variable."

  3. Describe the Confounding Variable (CRITICAL 2-PART EXPLANATION):

    • "A possible confounding variable is [name the variable]."

    • (Part 1: Link to Explanatory): "It is plausible that [confounding variable] is associated with the explanatory variable because [explain the connection]."

    • (Part 2: Link to Response): "It is also plausible that [confounding variable] is associated with the response variable because [explain the connection]."

    • Conclusion: "Therefore, we cannot tell whether the observed difference in the response is due to the explanatory variable or the confounding variable."

How to Identify the Components of an Experiment

When asked to identify the parts of a described experiment, be specific.

  • Experimental Units: The experimental units are the [specific number and description of the individuals/objects in the study].

  • Explanatory Variable(s): The explanatory variable is [name the general factor being tested].

  • Treatments: The treatments are [list all specific treatments clearly. If there's a control group, list it as a treatment].

  • Response Variable: The response variable is [describe exactly what is being measured and how it will be used to compare the treatments].

Practice Problems

Problem 1: A school district notes that students who are members of the school's orchestra tend to have higher GPAs than students who are not in the orchestra. They conclude that playing a musical instrument causes an increase in academic performance.

(a) Identify the study type.

(b) Explain why the district's conclusion is not justified. Identify a potential confounding variable and describe how it confounds the results.

Solution:

(a) This is an observational study because the researchers did not impose a treatment. They merely observed the students' existing orchestra membership and recorded their GPAs.

(b) Because this is an observational study, the district cannot conclude that playing an instrument causes an increase in GPA. An association was found, but it could be due to a confounding variable. A possible confounding variable is the level of student motivation or parental involvement. (Part 1: Link to Explanatory) It is plausible that highly motivated students or those with highly involved parents are more likely to join the orchestra. (Part 2: Link to Response) It is also plausible that these same students (highly motivated or with involved parents) are more likely to study hard and achieve a high GPA, regardless of their musical activities. Therefore, we cannot tell whether the higher GPAs are due to playing an instrument or due to the underlying motivation and support structure of the students.

Problem 2: A pharmaceutical company wants to test a new drug designed to reduce blood pressure. They recruit 120 volunteers who suffer from high blood pressure. They plan to give one group the new drug and a second group a visually identical pill that contains no active ingredient. After 30 days, they will measure each volunteer's blood pressure.

Identify the key components of this experiment.

Solution:

  • Experimental Units: The experimental units are the 120 volunteers with high blood pressure.

  • Explanatory Variable: The explanatory variable is the type of drug administered.

  • Treatments: There are two treatments: (1) the new drug, and (2) the placebo (the pill with no active ingredient).

  • Response Variable: The response variable is each volunteer's blood pressure, measured after the 30-day period. These measurements will be compared between the two treatment groups.
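
The problem does not say how the two groups are formed. One common approach is random assignment, sketched below under some assumptions: the volunteers are labeled with hypothetical ID numbers 1 through 120, the groups are equal in size (60 each), and the seed is arbitrary and used only so the example is repeatable.

```python
import random

rng = random.Random(2026)  # arbitrary seed, only for repeatability

volunteers = list(range(1, 121))  # hypothetical ID numbers 1..120

# Shuffle a copy of the ID list, then split it in half.
shuffled = volunteers[:]
rng.shuffle(shuffled)

new_drug_group = sorted(shuffled[:60])   # first 60 shuffled IDs get the new drug
placebo_group = sorted(shuffled[60:])    # remaining 60 get the placebo

print("New drug group size:", len(new_drug_group))
print("Placebo group size: ", len(placebo_group))
```

Because chance alone decides who lands in each group, variables such as age or diet should be roughly balanced between the groups, so they are unlikely to be confounded with the treatment.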

Common Mistakes to Avoid

  • Confusing Association with Causation: This is the single most important mistake to avoid. Never state that an observational study proves a cause-and-effect relationship. Use phrases like "is associated with," "is linked to," or "is correlated with," but not "causes."

  • Vague Description of Confounding: It is not enough to simply name a confounding variable (e.g., "lifestyle"). You must complete the two-part explanation: describe how the variable is plausibly linked to both the explanatory variable and the response variable, as shown in the FRQ template.

  • Mixing Up Explanatory and Response Variables: Double-check which variable is being manipulated (explanatory) and which is the outcome being measured (response). A good way to check is to say the sentence: "We are studying the effect of the [explanatory variable] on the [response variable]."

  • Confusing the Explanatory Variable with the Treatments: The explanatory variable is the general factor (e.g., "fertilizer type"), while the treatments are the specific levels or versions being tested (e.g., "Fertilizer A," "Fertilizer B," and "Control Group/No Fertilizer"). Be specific when listing treatments.