Inference and Experiments | AP Stats Unit 3 Study Guide

Quick Summary

This guide explains how the design of a study determines the types of conclusions we can draw from the data. You will learn to analyze a study's methodology to determine if its results can be generalized to a larger population (based on random sampling) and whether a cause-and-effect relationship can be established (based on random assignment in an experiment). Mastering this allows you to critically evaluate statistical claims and justify the scope of inference for any given study.

Key Concepts

The primary goal of statistical inference is to draw conclusions that go beyond the data at hand. The "scope of inference" refers to the extent to which we can legitimately make these conclusions. There are two fundamental questions we must ask about any study's design:

Were the individuals randomly selected from a population? This determines if we can generalize our findings.
Were the treatments randomly assigned to the individuals? This determines if we can conclude causation.

Let's break down these two pillars of inference.

Generalizability (Random Sampling):
- If a study uses random sampling (like a Simple Random Sample) to select participants from a specific population, the sample is likely to be representative of that population.
- Therefore, any conclusions or findings from the sample can be generalized to the entire population from which it was drawn.
- If a study uses a non-random method to gather participants (e.g., volunteers or a convenience sample), the sample is likely to differ systematically from any larger population.
- Therefore, the results cannot be generalized to any larger population. The conclusions apply only to the individuals who participated in the study.

Causation (Random Assignment):
- If a study is an experiment that uses random assignment to place participants into treatment groups, it helps ensure that the groups are roughly equivalent at the start.
- On average, this process balances the effects of lurking or confounding variables across the groups, so those variables cannot systematically favor one treatment. Therefore, if there is a statistically significant difference in the outcomes between the groups, we can conclude that the treatment caused the difference.
- If a study is an observational study where researchers do not randomly assign treatments (they just observe choices and outcomes), we cannot conclude causation.
- An observed relationship in such a study is merely an association or correlation. A confounding variable could be the true cause of the observed difference.
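
To see why random assignment supports causal conclusions, the following Python sketch compares two ways of forming groups. It is an illustration under stated assumptions, not an AP exam requirement: the `motivation` scores are simulated, and the function names are made up for this example. Under random assignment, the gap in average motivation between the two groups should be near zero on average. When highly motivated subjects choose the treatment themselves, a large gap exists before any treatment is applied.

```python
import random
import statistics


def mean_gap_after_random_assignment(scores, rng):
    """Split subjects into two equal groups by chance.

    Returns the difference in the mean lurking-variable score
    (treatment group minus control group).
    """
    shuffled = scores[:]          # copy so the original list is untouched
    rng.shuffle(shuffled)         # chance decides who goes where
    half = len(shuffled) // 2
    treatment, control = shuffled[:half], shuffled[half:]
    return statistics.mean(treatment) - statistics.mean(control)


def mean_gap_after_self_selection(scores):
    """Subjects with above-median scores choose the treatment themselves."""
    cutoff = statistics.median(scores)
    treatment = [s for s in scores if s > cutoff]
    control = [s for s in scores if s <= cutoff]
    return statistics.mean(treatment) - statistics.mean(control)


rng = random.Random(1)  # fixed seed so the run is reproducible

# Simulated lurking variable (hypothetical "motivation" score) for 200 subjects.
motivation = [rng.gauss(50, 10) for _ in range(200)]

# Repeat the random assignment many times and average the gaps.
random_gaps = [mean_gap_after_random_assignment(motivation, rng) for _ in range(1000)]
avg_random_gap = statistics.mean(random_gaps)
self_selected_gap = mean_gap_after_self_selection(motivation)

print(f"Average gap under random assignment: {avg_random_gap:.2f}")
print(f"Gap under self-selection:            {self_selected_gap:.2f}")
```

Any single random assignment can still produce a small gap by chance. The point is that the gaps do not systematically favor one group.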

The Four Scenarios for Scope of Inference

We can combine these two concepts into a powerful framework for determining the scope of inference for any study.

[Image: A 2x2 grid. Rows are "Random Assignment" and "No Random Assignment". Columns are "Random Sampling" and "No Random Sampling".]

Scenario 1: Random Sampling + Random Assignment
- Design: An experiment on a randomly selected sample of subjects. This is the "gold standard" of study design.
- Inference: We can generalize the results to the population from which the sample was drawn, AND we can conclude a cause-and-effect relationship.
- Example: Randomly select 500 adults with high blood pressure from the U.S. population. Randomly assign half to a new medication and half to a placebo.

Scenario 2: Random Sampling + NO Random Assignment
- Design: An observational study on a randomly selected sample.
- Inference: We can generalize the findings to the population, but we can only conclude an association, not causation.
- Example: Randomly select 1,000 U.S. adults and find that those who report eating breakfast regularly have lower rates of heart disease. We can't say that eating breakfast causes lower rates of heart disease, only that the two are associated among U.S. adults.

Scenario 3: NO Random Sampling + Random Assignment
- Design: An experiment on a group of volunteers or a convenience sample.
- Inference: We cannot generalize the results to a larger population, but we can conclude a cause-and-effect relationship for the group of individuals in the study.
- Example: Recruit 100 student volunteers for a study on memory. Randomly assign half to listen to classical music while studying and half to study in silence. If the music group performs better, we can say the music caused the improvement for these students, but not for all students in general.

Scenario 4: NO Random Sampling + NO Random Assignment
- Design: An observational study on a group of volunteers or a convenience sample.
- Inference: We cannot generalize to any population, and we cannot conclude causation. The results are limited to describing an association within the specific group of participants.
- Example: A doctor observes that among her patients (a convenience sample), those who own a pet seem to have lower stress levels. This is just an interesting observation with a very limited scope of inference.
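
The four scenarios reduce to two independent yes/no questions, which can be sketched as a small Python function. The function name `scope_of_inference` and the wording of the returned labels are illustrative choices for this guide, not standard terminology.

```python
def scope_of_inference(random_sampling, random_assignment):
    """Return (who the results apply to, what relationship can be claimed).

    random_sampling:   were subjects randomly selected from a population?
    random_assignment: were treatments randomly assigned to subjects?
    """
    if random_sampling:
        applies_to = "the population that was sampled"
    else:
        applies_to = "only the study participants"

    if random_assignment:
        relationship = "cause and effect"
    else:
        relationship = "association only"

    return applies_to, relationship


# Print all four cells of the 2x2 grid.
for sampling in (True, False):
    for assignment in (True, False):
        applies_to, relationship = scope_of_inference(sampling, assignment)
        print(f"sampling={sampling}, assignment={assignment}: "
              f"{relationship}, applies to {applies_to}")
```

Each answer depends on only one of the two design questions, which is why sampling and assignment must be justified separately.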

Key Vocabulary

Inference: The process of using data from a sample to draw conclusions about a larger population or about a cause-and-effect relationship.
Generalizability: The extent to which the findings from a sample can be reliably applied to the broader population from which the sample was drawn.
Causation: A relationship where a change in one variable (the explanatory variable) is directly responsible for causing a change in another variable (the response variable).
Association: A statistical relationship between two variables. They tend to vary together, but one does not necessarily cause the other.
Random Sampling: A selection process that uses chance to choose individuals from a population. In a simple random sample of size n, every group of n individuals has an equal chance of being selected. This is the key to generalizability.
Random Assignment: An experimental technique where subjects are placed into different treatment groups by a chance process. This is the key to establishing causation.
Observational Study: A study in which researchers measure variables of interest without imposing any treatment or attempting to influence the responses.
Experiment: A study in which researchers deliberately impose treatments on subjects to measure their responses and compare the effects of different treatments.

Calculator Tech (TI-84)

No major calculator functions are required for this topic. The focus is on understanding study design and logical reasoning, not on calculations.

How to Show Work on the FRQ

On the AP exam, you will be asked to determine and justify the scope of inference. Your answer must be a clear, well-written justification, not just a simple "yes" or "no." Use the following two-part template to structure your response.

Template for Justifying Scope of Inference

Part 1: Address Generalizability

Start by identifying how the subjects were obtained.
If subjects were randomly selected: "Because the subjects were randomly selected from the population of [clearly state the specific population], the results of this study can be generalized to that population."
If subjects were NOT randomly selected: "Because the subjects were not randomly selected from a larger population (e.g., they were volunteers or a convenience sample), the results of this study cannot be generalized to any larger population. The conclusions apply only to the participants in this study."

Part 2: Address Causation

Next, identify whether treatments were imposed and how subjects were assigned to them.
If treatments were randomly assigned (an experiment): "Because this was an experiment in which treatments were randomly assigned to the subjects, a cause-and-effect relationship can be established between the [state the explanatory variable] and the [state the response variable]."
If treatments were NOT randomly assigned (an observational study): "Because this was an observational study and treatments were not randomly assigned, we cannot conclude a cause-and-effect relationship. We can only conclude that there is an association between [state the explanatory variable] and [state the response variable], as there may be confounding variables that influenced the results."

Practice Problems

Problem 1:

A large school district wants to investigate the relationship between students' participation in extracurricular activities and their academic performance. They obtain a list of all 20,000 high school students in the district and use a computer to select a random sample of 500 students. For each student, they record whether the student participates in at least one extracurricular activity and their current grade point average (GPA). The data show a statistically significant association, with students who participate in activities having a higher average GPA. What conclusions can be drawn from this study? Justify your answer.

Solution:

Because the 500 students were randomly selected from the population of all 20,000 high school students in the district, the results of this study can be generalized to all high school students in that district. However, this was an observational study because treatments (participation in extracurricular activities) were not randomly assigned to the students. Therefore, we cannot conclude a cause-and-effect relationship. We can only conclude that there is an association between participating in extracurricular activities and GPA for students in this district. It is possible that a confounding variable, such as a student's motivation or family background, is responsible for the observed difference in GPA.
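
The random selection step described in Problem 1 could be carried out along these lines. This is a minimal sketch: the student ID numbers 1 through 20,000 and the seed value are assumptions made for illustration, since the problem only says that a computer selected the sample.

```python
import random

rng = random.Random(2024)  # fixed seed (arbitrary) so the selection is reproducible

# Hypothetical ID numbers standing in for the district's list of 20,000 students.
student_ids = range(1, 20001)

# Simple random sample without replacement:
# every set of 500 distinct IDs is equally likely to be chosen.
sample_ids = rng.sample(student_ids, 500)

print(f"Selected {len(sample_ids)} students; first few IDs: {sorted(sample_ids)[:5]}")
```

Chance decides who is in the sample here, but not who participates in extracurricular activities, which is why the study supports generalization but not causation.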

Problem 2:

To test the effectiveness of a new vitamin supplement on memory, a researcher recruits 80 student volunteers from a local university. The researcher randomly assigns 40 of the volunteers to take the vitamin supplement daily for a month and the other 40 to take a placebo. At the end of the month, all students take a standardized memory test. The group taking the vitamin supplement scores significantly higher, on average, than the placebo group. What conclusions can be drawn from this study? Justify your answer.

Solution:

Because this was an experiment in which the treatments (vitamin supplement or placebo) were randomly assigned to the subjects, a cause-and-effect relationship can be established. We can conclude that for the subjects in this study, the vitamin supplement caused an improvement in memory test scores. However, because the 80 subjects were volunteers from a single university and were not randomly selected from a larger population, the results of this study cannot be generalized to any larger population, such as all university students. The conclusion applies only to the 80 volunteers who participated in the experiment.
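
The random assignment in Problem 2 can be sketched the same way. The volunteer labels and the seed are hypothetical. The problem states only that 40 of the 80 volunteers were randomly assigned to each group.

```python
import random

rng = random.Random(7)  # fixed seed (arbitrary) so the assignment is reproducible

# Hypothetical labels for the 80 student volunteers.
volunteers = [f"volunteer_{i:02d}" for i in range(1, 81)]

# Shuffle a copy of the list, then split it in half.
shuffled = volunteers[:]
rng.shuffle(shuffled)
supplement_group = shuffled[:40]   # take the vitamin supplement
placebo_group = shuffled[40:]      # take the placebo

print(f"Supplement group: {len(supplement_group)} volunteers")
print(f"Placebo group:    {len(placebo_group)} volunteers")
```

Here chance decides which treatment each volunteer receives, but the volunteers themselves were not chosen by chance, which is why the study supports causation for these 80 subjects only.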

Common Mistakes to Avoid

Confusing Random Sampling with Random Assignment: These are two different processes with two different purposes. Sampling is about how you select your subjects for the study (purpose: generalizability). Assignment is about how you place those subjects into treatment groups within an experiment (purpose: establishing causation).
Overstating Conclusions: Be precise. Do not claim causation from an observational study. Do not generalize results from a study using volunteers. The scope of inference is strictly limited by the study's design.
Using the Word "Prove": Statistical studies provide strong evidence, but they do not "prove" things with absolute certainty. Use phrases like "the data provide convincing evidence for..." or "we can conclude that..."
Vague Justifications: On the FRQ, do not just say "it was a random sample." Be specific: "Because subjects were randomly selected from the population of all U.S. adults..." Similarly, don't just say "it was an experiment." Say: "Because treatments were randomly assigned to subjects..." This demonstrates a deeper understanding.
