Quick Summary
This guide provides a comprehensive framework for selecting the correct statistical inference procedure from the full suite of options available in AP Statistics. After mastering this material, you will be able to analyze any problem prompt, identify key characteristics such as data type and study design, and confidently name and justify the single most appropriate confidence interval or significance test required. This skill is the capstone of statistical inference and is crucial for success on the AP exam.
Key Concepts
Selecting the correct inference procedure is a systematic process. It involves answering a series of questions about the data and the goal of the analysis. The flowchart below illustrates the decision-making process you should internalize.
[Image: A flowchart titled "How to Select an Appropriate Inference Procedure." The chart starts with a box "What type of data?" pointing to two branches: "Categorical (Proportions/Counts)" and "Quantitative (Means)." Each branch then splits based on "How many samples/groups?" which then leads to the final specific procedure names.]
The Four Key Questions to Ask
What is the goal? Estimation or Hypothesis Testing?
If the prompt asks you to estimate a value, find a range of plausible values, or calculate a margin of error, you need a Confidence Interval.
If the prompt asks you to test a claim, look for evidence against a hypothesis, or determine if a result is statistically significant, you need a Significance Test.
What type of data are you working with?
Categorical Data: The data consists of counts, percentages, or proportions that fall into categories (e.g., "yes/no", "red/green/blue", "supports/opposes/undecided"). This leads to procedures involving proportions (p).
Quantitative Data: The data consists of numerical measurements where arithmetic operations like averaging make sense (e.g., height, weight, test scores, time). This leads to procedures involving means (μ).
How many samples or groups are there?
One Sample: Data is collected from a single group.
Two Samples/Groups: Data is collected from two independent groups.
Paired Data: Data consists of two measurements on the same individual (e.g., before and after a treatment) or on two matched individuals (e.g., twins). This is treated as a single sample of differences.
Multiple Samples/Groups (>2): For AP Statistics, this almost always points toward a Chi-Square test.
Do I know the population standard deviation (σ)?
For inference on means, if σ is known, you use a z-procedure. This is extremely rare in practice and on the AP exam.
If σ is unknown (which is almost always the case), you must estimate it with the sample standard deviation (s). You must use a t-procedure.
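The four questions above can be sketched as a small decision function. This is only an illustrative sketch: the function name `select_procedure`, its argument labels ("interval", "test", "one", "two", "paired", "multiple"), and the exact wording of the returned names are assumptions made for this example, not official AP terminology.

```python
def select_procedure(goal, data_type, design, sigma_known=False):
    """Name an inference procedure from the answers to the four key questions.

    goal:        "interval" or "test"
    data_type:   "categorical" or "quantitative"
    design:      "one", "two", "paired", or "multiple"
    sigma_known: True only in the rare case that sigma is given
    (All labels are hypothetical, chosen for this sketch.)
    """
    if goal not in ("interval", "test"):
        raise ValueError("goal must be 'interval' or 'test'")
    kind = "Interval" if goal == "interval" else "Test"

    if data_type == "categorical":
        if design == "one":
            return f"One-Sample z-{kind} for a Proportion"
        if design == "two":
            return f"Two-Sample z-{kind} for a Difference in Proportions"
        if design == "multiple" and goal == "test":
            # The study design then decides between GOF, homogeneity, and independence.
            return "Chi-Square Test"
        raise ValueError("no matching procedure for this categorical design")

    if data_type == "quantitative":
        # z only when sigma is known; otherwise estimate it with s and use t.
        stat = "z" if sigma_known else "t"
        if design == "one":
            return f"One-Sample {stat}-{kind} for a Mean"
        if design == "paired":
            return f"Paired {stat}-{kind} for a Mean Difference"
        if design == "two":
            return f"Two-Sample {stat}-{kind} for a Difference in Means"
        raise ValueError("no matching procedure for this quantitative design")

    raise ValueError("data_type must be 'categorical' or 'quantitative'")


print(select_procedure("test", "quantitative", "paired"))
print(select_procedure("interval", "categorical", "two"))
```

Notice that the function asks about the goal and data type before anything else, which mirrors the order in which you should read a prompt.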
The Menu of Inference Procedures
I. Procedures for Proportions (Categorical Data)
| Number of Samples/Groups | Goal: Confidence Interval | Goal: Significance Test |
|---|---|---|
| One Sample | One-Sample z-Interval for a Proportion (p) | One-Sample z-Test for a Proportion (p) |
| Two Independent Samples | Two-Sample z-Interval for a Difference in Proportions (p₁ - p₂) | Two-Sample z-Test for a Difference in Proportions (p₁ - p₂) |
II. Procedures for Means (Quantitative Data)
| Number of Samples/Groups | Goal: Confidence Interval | Goal: Significance Test |
|---|---|---|
| One Sample | One-Sample t-Interval for a Mean (μ) | One-Sample t-Test for a Mean (μ) |
| Paired Data | Paired t-Interval for a Mean Difference (μ_diff) | Paired t-Test for a Mean Difference (μ_diff) |
| Two Independent Samples | Two-Sample t-Interval for a Difference in Means (μ₁ - μ₂) | Two-Sample t-Test for a Difference in Means (μ₁ - μ₂) |
III. Procedures for Categorical Data (Distributions & Associations)
These are special cases that use the Chi-Square (χ²) distribution. They are always significance tests.
Chi-Square Goodness of Fit (GOF) Test:
Use Case: You have one categorical variable from one sample.
Question: Does the observed distribution of your sample data match a hypothesized or claimed distribution?
Example: Does the distribution of M&M colors in a bag match the company's claimed percentages?
Chi-Square Test for Homogeneity:
Use Case: You have one categorical variable measured across two or more independent samples/groups.
Question: Is the distribution of the categorical variable the same (homogeneous) across the different groups?
Example: Is the distribution of political affiliation (Democrat, Republican, Independent) the same for high school graduates and college graduates?
Chi-Square Test for Independence:
Use Case: You have two categorical variables measured on one sample.
Question: Is there an association (or are they independent) between the two categorical variables in the population?
Example: In a sample of students, is there an association between their favorite subject (Math, English, Science) and their gender (Male, Female)?
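Homogeneity and independence tests share the same arithmetic on a two-way table of counts, so one short sketch covers both. The function name `chi_square_two_way` and the counts below are invented for illustration (rows could be two groups, columns three categories); they do not come from any real study.

```python
def chi_square_two_way(observed):
    """Return (statistic, degrees_of_freedom, expected) for a two-way table.

    observed is a list of rows, each row a list of counts.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Expected count = (row total * column total) / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Test statistic = sum of (observed - expected)^2 / expected over all cells
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # Degrees of freedom = (rows - 1) * (columns - 1)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


# Hypothetical counts, invented for this example only.
observed = [
    [30, 20, 10],
    [20, 20, 20],
]
statistic, df, expected = chi_square_two_way(observed)
print(statistic, df, expected)
```

The calculation cannot tell you whether the test is for homogeneity or independence. Only the study design (several independent samples versus one sample with two variables) settles that.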
Key Vocabulary
Parameter of Interest: The numerical characteristic of the population that is being estimated or tested (e.g., the true mean weight μ, the true proportion of voters p, the difference between two true means μ₁ - μ₂).
Paired Data: Data collected in pairs where the two values are not independent. Typically, this involves two measurements on the same subject (e.g., pre-test and post-test scores) or on matched subjects (e.g., twins). The analysis is done on the single sample of differences.
Goodness of Fit (GOF): A statistical test used to determine if a sample's categorical data distribution fits a claimed population distribution. It involves one sample and one categorical variable.
Homogeneity: A statistical test used to determine if the distribution of a single categorical variable is the same across two or more independent populations or groups.
Independence: A statistical test used to determine if there is a statistically significant association between two categorical variables within a single population.
Calculator Tech (TI-84)
This topic is about choosing the test, but knowing where to find it on the calculator is a key part of the process. All inference procedures are found under the STAT -> TESTS menu.
One-Sample z-Test (for p):
STAT->TESTS->5:1-PropZTest...
One-Sample z-Interval (for p):
STAT->TESTS->A:1-PropZInt...
Two-Sample z-Test (for p₁ - p₂):
STAT->TESTS->6:2-PropZTest...
Two-Sample z-Interval (for p₁ - p₂):
STAT->TESTS->B:2-PropZInt...
One-Sample t-Test (for μ):
STAT->TESTS->2:T-Test...
One-Sample t-Interval (for μ):
STAT->TESTS->8:TInterval...
Two-Sample t-Test (for μ₁ - μ₂):
STAT->TESTS->4:2-SampTTest...
Two-Sample t-Interval (for μ₁ - μ₂):
STAT->TESTS->0:2-SampTInt...
Paired t-Test/Interval: First, calculate the differences and store them in a list (e.g., L3 = L1 - L2). Then perform a one-sample t-test or t-interval on that list of differences (L3).
Chi-Square GOF Test:
STAT->TESTS->D:χ²GOF-Test... (Note: This is on newer TI-84 models. On older models, you must calculate the test statistic by hand using lists.)
Chi-Square Test (Homogeneity/Independence):
STAT->TESTS->C:χ²-Test... (You must first enter the observed counts into a matrix using the 2nd-function matrix editor.)
How to Show Work on the FRQ
On the AP exam, you may be asked to identify and justify the appropriate inference procedure without actually performing it. To earn full credit, your response must be clear and well-supported.
Template for Justifying an Inference Procedure:
NAME the Procedure: State the full, specific name of the procedure.
Example: "The appropriate procedure is a two-sample t-test for the difference between two population means."
Non-Example: "A t-test." (This is too vague).
JUSTIFY the Choice: Address the key decision points in the context of the problem.
Identify the parameter(s) of interest: Define the parameter(s) you are testing or estimating (e.g., μ₁, μ₂, p, p₁ - p₂).
State the goal: Mention whether you are performing a significance test (testing a claim) or creating a confidence interval (estimating a value).
Describe the data type: Explain whether the data is quantitative or categorical.
Describe the study design: Explain whether you have one sample, two independent samples, or paired data.
Example FRQ Response:
Prompt: A researcher wants to know if a new fuel additive improves gas mileage. They recruit 40 car owners and measure the gas mileage (in mpg) of each car. Then, they add the new fuel additive and measure the gas mileage of the same 40 cars again. They want to determine if there is convincing evidence that the additive increases the mean gas mileage.
Your Response:
Procedure: The appropriate inference procedure is a paired t-test for a mean difference.
Justification:
We want to determine if there is convincing evidence that the additive increases gas mileage, so we will perform a significance test.
The parameter of interest is μ_diff, the true mean difference in gas mileage (after - before) for all cars of this type.
The data being collected, gas mileage in mpg, is quantitative.
The data are paired because two measurements (with and without the additive) are taken on each of the 40 cars. We will analyze the single sample of 40 differences.
The population standard deviation of the differences is unknown, so a t-procedure is appropriate.
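To see what "analyze the single sample of differences" means in practice, here is a minimal sketch of the paired t statistic. The function name `paired_t_statistic` and the mileage values are hypothetical (five made-up cars rather than the 40 in the prompt); a calculator or table would still be needed to turn the statistic into a P-value.

```python
import math
import statistics


def paired_t_statistic(before, after, null_mean_diff=0.0):
    """Return (t, degrees_of_freedom) for paired data, using after - before."""
    if len(before) != len(after):
        raise ValueError("paired data need the same number of measurements")

    # Reduce the two columns to ONE sample of differences.
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)

    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (divides by n - 1)

    # One-sample t statistic computed on the differences
    t = (mean_diff - null_mean_diff) / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical mpg values, invented for illustration.
before = [24.0, 30.0, 27.0, 22.0, 35.0]
after = [25.0, 32.0, 27.0, 24.0, 36.0]

t, df = paired_t_statistic(before, after)
print(t, df)
```

Once the differences are formed, nothing in the calculation refers to two groups. That is why a two-sample procedure would be the wrong name here.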
Practice Problems
Problem 1:
A state's Department of Motor Vehicles (DMV) is concerned about wait times. They believe that the proportion of customers who wait more than 20 minutes is higher at urban DMV locations than at rural locations. To investigate, they take a random sample of 150 customers from urban locations and find that 63 waited more than 20 minutes. They also take an independent random sample of 120 customers from rural locations and find that 42 waited more than 20 minutes. The DMV wants to know if this data provides convincing evidence that the proportion of customers waiting more than 20 minutes is greater at urban locations.
Name the appropriate inference procedure and justify your choice.
Solution:
Procedure: The appropriate inference procedure is a two-sample z-test for a difference in population proportions.
Justification:
Goal: The DMV wants to determine if there is "convincing evidence" that the proportion is greater at urban locations, which indicates a (one-sided) significance test.
Parameters: The parameters of interest are p_urban, the true proportion of all customers at urban locations who wait more than 20 minutes, and p_rural, the true proportion of all customers at rural locations who wait more than 20 minutes. We are testing a claim about the difference, p_urban - p_rural.
Data Type: The data is categorical. Each customer is categorized as either "waited more than 20 minutes" or "did not." We are working with the sample proportions of customers in the first category.
Study Design: We have two independent random samples: one from the population of urban DMV customers and another from the population of rural DMV customers.
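The problem only asks you to name and justify the procedure, but a sketch of the calculation shows how the pieces of the justification are used. It assumes the usual pooled standard error for the test and an upper-tail alternative (p_urban > p_rural); the function name `two_prop_z_test` is hypothetical, and the counts are the ones given in the prompt.

```python
import math
from statistics import NormalDist


def two_prop_z_test(x1, n1, x2, n2):
    """Return (z, p_value) for H0: p1 = p2 versus Ha: p1 > p2."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2

    # Under H0 the two proportions are equal, so pool the successes.
    p_pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))

    z = (p1_hat - p2_hat) / se
    p_value = 1 - NormalDist().cdf(z)  # upper-tail area
    return z, p_value


# Counts from Problem 1: 63 of 150 urban customers, 42 of 120 rural customers.
z, p_value = two_prop_z_test(63, 150, 42, 120)
print(z, p_value)
```

The standard error is built entirely from proportions and sample sizes, with no sample standard deviation s anywhere, which is why this is a z-procedure and not a t-procedure.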
Problem 2:
A sociologist is studying the relationship between education level and opinion on a recent public policy initiative. They survey a single random sample of 500 adults and record both their highest level of education (categorized as "High School," "Some College," "Bachelor's Degree," or "Graduate Degree") and their opinion on the initiative (categorized as "Support," "Oppose," or "No Opinion"). The sociologist wants to determine if there is an association between education level and opinion on this initiative in the adult population.
Name the appropriate inference procedure and justify your choice.
Solution:
Procedure: The appropriate inference procedure is a chi-square test for independence.
Justification:
Goal: The sociologist wants to determine if there is an "association" between two variables, which indicates a significance test.
Parameters: This test does not have a simple parameter like p or μ. The goal is to test the hypothesis that the two categorical variables are independent in the population.
Data Type: The data for both variables, education level and opinion, are categorical.
Study Design: We have one random sample of 500 adults. For each individual in that sample, we have measured two different categorical variables. The test will determine if these two variables are independent.
Common Mistakes to Avoid
Confusing Two-Sample vs. Paired Data: This is the most common error with quantitative data. If two measurements are taken on the same subject (e.g., before/after) or on deliberately matched subjects, the data are paired. Analyze the differences. If two separate, independent groups are being compared, use a two-sample procedure.
Mixing up Homogeneity and Independence: Both use the same chi-square calculation, but the study design is different. Homogeneity compares the distribution of one categorical variable across two or more populations/samples. Independence looks for an association between two categorical variables within one population/sample.
Using a t-test for Proportions: Never use a t-test for proportions. Inference for proportions always uses z-procedures because the standard deviation of the sample proportion is a direct function of the population proportion, p, so there is no separate standard deviation to estimate. Use t-procedures only with quantitative data, when you must estimate an unknown population standard deviation (σ) with s.
Vague Procedure Names: On the FRQ, "t-test" or "chi-square test" is not specific enough. You must state the full name: "one-sample t-test for a mean," "two-sample z-interval for a difference in proportions," or "chi-square test for homogeneity."