Quick Summary
This guide will equip you to set up a valid significance test for a population mean. You will learn to correctly formulate the null and alternative hypotheses based on a research question, identify the appropriate inference procedure (a one-sample t-test), and meticulously check the required conditions to ensure your conclusions are statistically sound.
Key Concepts
A significance test for a population mean (μ) allows us to use sample data to assess the evidence for a claim about an unknown population mean. The setup is the most critical part of this process.
1. Formulating Hypotheses
Hypotheses are always statements about a population parameter, in this case, the population mean μ. They are never about the sample statistic (x̄).
The Null Hypothesis (H₀): This is the "statement of no effect" or the status quo. It always contains a statement of equality. For a population mean, it takes the form:
H₀: μ = μ₀
where μ₀ is the specific, hypothesized value of the population mean that we are assuming to be true.
The Alternative Hypothesis (Hₐ): This is the claim we are trying to find evidence for. It determines whether the test is one-sided or two-sided.
One-Sided (Greater Than): Used when we are looking for evidence that the mean is larger than the hypothesized value.
Hₐ: μ > μ₀
(Keywords: greater than, increased, more than)
One-Sided (Less Than): Used when we are looking for evidence that the mean is smaller than the hypothesized value.
Hₐ: μ < μ₀
(Keywords: less than, decreased, fewer than, lower)
Two-Sided: Used when we are looking for evidence that the mean is simply different from the hypothesized value, without a specific direction.
Hₐ: μ ≠ μ₀
(Keywords: different from, changed, not equal to)
2. Choosing the Correct Test
When the population standard deviation (σ) is unknown (which is almost always the case in practice), we must estimate it using the sample standard deviation (sₓ). This introduces more variability, and we can no longer use a Normal model (z-test). Instead, we must use a t-distribution.
- Name of the Test: One-Sample t-test for a Population Mean
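As a rough illustration of what this test computes once the setup is done, the sketch below standardizes the sample mean using sₓ in place of σ and reports the degrees of freedom (n − 1). The function name and the numbers in the example call are hypothetical and are not taken from this guide.

```python
import math

def one_sample_t_statistic(xbar, s_x, n, mu0):
    """Return (t, df) for a one-sample t-test from summary statistics."""
    if n < 2:
        raise ValueError("need at least two observations to compute s_x")
    standard_error = s_x / math.sqrt(n)  # estimated SD of the sample mean
    t = (xbar - mu0) / standard_error    # how many standard errors x-bar is from mu0
    df = n - 1                           # degrees of freedom of the t-distribution
    return t, df

# Hypothetical summary statistics (assumed for illustration only)
t, df = one_sample_t_statistic(xbar=11.8, s_x=0.6, n=36, mu0=12)
print(f"t = {t:.3f}, df = {df}")
```

Because sₓ varies from sample to sample, this statistic is compared to a t-distribution with n − 1 degrees of freedom rather than to the standard Normal distribution.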
3. Checking the Conditions for Inference
Before you can perform the test, you must verify that three conditions are met. These conditions ensure that the sampling distribution of the test statistic is what we expect it to be (a t-distribution), making our calculations valid.
1. Random: The data must come from a well-designed random sample or a randomized experiment. This ensures the sample is representative of the population and helps prevent bias.
- How to check: Look for the words "random sample" or "randomly selected" in the problem description.
2. 10% Condition (Independence): When sampling without replacement, the sample size n should be no more than 10% of the population size N.
n ≤ 0.10N
Why it's important: This condition ensures that individual observations are reasonably independent. If we sample too large a fraction of the population, the probabilities of selection change, and the standard deviation formula becomes inaccurate.
How to check: State that it's reasonable to assume the population is at least 10 times the sample size. For example, if n=50, state "It's reasonable to assume the population of all [items] is at least 500."
3. Normal/Large Sample: The sampling distribution of the sample mean (x̄) must be approximately Normal. There is a hierarchy for checking this condition:
a) Population is Normal: If the problem states that the parent population is Normally distributed, this condition is met, regardless of sample size.
b) Large Sample (Central Limit Theorem): If the sample size is large (n ≥ 30), the Central Limit Theorem (CLT) states that the sampling distribution of x̄ will be approximately Normal, regardless of the shape of the population distribution. This is the most common way to meet the condition.
c) Small Sample (n < 30): If the sample size is small and the population distribution is unknown, you must examine a graph of the sample data (e.g., a boxplot, dotplot, or histogram). If the graph shows no strong skewness and no outliers, it is reasonable to assume the underlying population distribution is approximately Normal. If there is strong skewness or outliers, you should not proceed with the t-test.
[Image: Three dotplots side-by-side. The first is labeled "Okay to proceed (n=15, roughly symmetric)." The second is labeled "Do not proceed (n=15, strong right skew)." The third is labeled "Do not proceed (n=15, clear outlier)."]
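The condition checks above can be sketched as a few small helper functions. This is only a sketch under stated assumptions: the function names are hypothetical, and the 1.5 × IQR rule used to flag outliers is a common convention swapped in here, not something this guide prescribes. A graph of the sample data is still needed to judge skewness.

```python
import statistics

def ten_percent_ok(n, population_size):
    """10% condition: the sample is at most 10% of the population (10n <= N)."""
    return 10 * n <= population_size

def flag_outliers(data):
    """Flag values beyond 1.5 * IQR from the quartiles (assumed rule, see lead-in)."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low or x > high]

def normal_condition(n, population_is_normal=False, data=None):
    """Walk the hierarchy: Normal population, then large sample, then sample graph."""
    if population_is_normal:
        return "met: population is stated to be Normal"
    if n >= 30:
        return "met: large sample, CLT applies"
    if data is None:
        return "cannot check: need a graph of the sample data"
    outliers = flag_outliers(data)
    if outliers:
        return f"not met: possible outliers {outliers}"
    return "no outliers flagged: still inspect the graph for strong skewness"

print(ten_percent_ok(50, 500))
print(normal_condition(45))
```

The small-sample branch deliberately stops short of declaring the condition met, since skewness is judged from the graph rather than from a single number.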
Key Vocabulary
Null Hypothesis (H₀): The starting assumption or claim about a population parameter (e.g., μ = 100) that a significance test is designed to assess evidence against.
Alternative Hypothesis (Hₐ): The claim about a population parameter that we are trying to find evidence for (e.g., μ > 100).
One-Sample t-test for a Population Mean: The specific statistical test used to draw a conclusion about a single population mean when the population standard deviation (σ) is unknown.
Significance Test: A formal procedure for comparing observed data with a claim (a hypothesis) whose truth is in question.
Conditions for Inference: A set of requirements (Random, 10%, Normal/Large Sample) that must be met to ensure that the calculations and conclusions of a significance test are valid.
Central Limit Theorem (CLT): A fundamental theorem stating that when the sample size (n) is sufficiently large (a common rule of thumb is n ≥ 30), the sampling distribution of the sample mean (x̄) will be approximately Normal, regardless of the shape of the population's distribution.
Calculator Tech (TI-84)
Although this topic focuses on setting up the test, the choices you make in the setup are exactly what you enter in the calculator's T-Test function.
Path: STAT -> TESTS -> 2: T-Test...
You will see the following screen, where you must input the values defined in your "State" and "Plan" steps.
Input Screen:
Inpt: Data Stats
μ₀: [Enter the hypothesized value from H₀]
x̄: [Enter the sample mean]
sₓ: [Enter the sample standard deviation]
n: [Enter the sample size]
μ: ≠μ₀ <μ₀ >μ₀ [Select the form of your Hₐ]
Inpt: Data vs. Stats:
Data: Choose this if you have the raw data entered into a list (e.g., L1). The calculator will compute x̄ and sₓ for you.
Stats: Choose this if you are given the summary statistics (mean, standard deviation, sample size) in the problem.
μ₀: This is the value from your null hypothesis (H₀: μ = μ₀).
x̄, sₓ, n: These are your sample statistics.
μ: ≠μ₀ <μ₀ >μ₀: This is where you specify your alternative hypothesis. Highlight the symbol that matches your Hₐ.
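For readers without a TI-84, the sketch below mirrors the Stats input screen in Python. It takes μ₀, x̄, sₓ, n, and the form of Hₐ, and returns the t statistic, degrees of freedom, and P-value. It is an illustrative sketch, not the calculator's algorithm: the function names are hypothetical, the t-distribution's CDF is approximated with Simpson's rule using only the standard library, and the example numbers are assumed.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] (steps must be even)."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # area under the density between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area

def t_test_from_stats(mu0, xbar, s_x, n, alternative):
    """alternative is '!=', '<', or '>', matching the calculator's last row."""
    t = (xbar - mu0) / (s_x / math.sqrt(n))
    df = n - 1
    if alternative == "<":
        p = t_cdf(t, df)                     # lower tail
    elif alternative == ">":
        p = 1 - t_cdf(t, df)                 # upper tail
    elif alternative == "!=":
        p = 2 * (1 - t_cdf(abs(t), df))      # both tails
    else:
        raise ValueError("alternative must be '!=', '<', or '>'")
    return t, df, p

# Hypothetical inputs (assumed for illustration only)
t, df, p = t_test_from_stats(mu0=12, xbar=11.8, s_x=0.6, n=36, alternative="<")
print(f"t = {t:.3f}, df = {df}, P-value = {p:.4f}")
```

The only setup decision that changes the P-value for a given sample is the choice of alternative, which is why the form of Hₐ must be fixed before looking at the data.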
How to Show Work on the FRQ
To get full credit for setting up a significance test on the AP exam, use the State and Plan steps of the four-step inference process.
STATE
Parameter: Define the parameter of interest, μ, in the context of the problem.
- Template: "Let μ be the true mean [description of what is being measured] for the population of [description of the population]."
Hypotheses: State the null and alternative hypotheses using correct symbols and values.
Template:
H₀: μ = [hypothesized value]
Hₐ: μ [>, <, or ≠] [hypothesized value]
Significance Level: State the alpha (α) level if it is given.
- Template: "We will use a significance level of α = [value]."
PLAN
Name of Procedure: State the full name of the test you will perform.
- Template: "We will perform a one-sample t-test for a population mean."
Check Conditions: Check the three conditions for inference, making sure to connect each one to the context of the problem.
Random: "The problem states that the data come from a random sample of [context]."
10% Condition: "The sample size of n = [value] is less than 10% of the entire population of [context], as it is reasonable to assume there are at least 10 * [n] = [10n] such items."
Normal/Large Sample:
(If n ≥ 30): "Since the sample size is n = [value] ≥ 30, the Central Limit Theorem applies, and we can assume the sampling distribution of the sample mean is approximately Normal."
(If n < 30): "Since the sample size is n = [value] < 30, we must examine the sample data. A [dotplot/boxplot/etc.] of the sample data (sketch or describe it) shows no strong skewness or outliers. Therefore, it is reasonable to assume the underlying population distribution is approximately Normal."
Practice Problems
Problem 1:
A coffee machine at a university is designed to dispense an average of 12 fluid ounces of coffee per cup. The student newspaper suspects the machine is under-filling the cups. They take a random sample of 45 cups and record the amount of coffee in each. State the hypotheses and check the conditions for a significance test to determine if the machine is under-filling.
Solution:
STATE
Parameter: Let μ be the true mean amount of coffee dispensed by the machine in fluid ounces.
Hypotheses: We are testing if the machine is under-filling, which means the mean is less than 12 ounces.
H₀: μ = 12
Hₐ: μ < 12
Significance Level: Not specified, so we don't need to state it.
PLAN
Name of Procedure: We will perform a one-sample t-test for a population mean.
Check Conditions:
Random: The problem states that the students took a "random sample of 45 cups." The condition is met.
10% Condition: The sample size is n = 45. It is reasonable to assume that the machine dispenses far more than 10 * 45 = 450 cups in its lifetime. The condition is met.
Normal/Large Sample: The sample size is n = 45. Since 45 ≥ 30, the Central Limit Theorem applies, and the sampling distribution of the sample mean amount of coffee is approximately Normal. The condition is met.
Problem 2:
A high school counselor wants to know if the mean time students at her school spend on homework per night differs from the national average of 2.1 hours. She takes a random sample of 15 students and records their homework times (in hours):
State the hypotheses and check the conditions for a significance test.
Solution:
STATE
Parameter: Let μ be the true mean time (in hours) spent on homework per night by students at this high school.
Hypotheses: The counselor wants to know if the mean time differs from 2.1 hours, which implies a two-sided test.
H₀: μ = 2.1
Hₐ: μ ≠ 2.1
Significance Level: Not specified.
PLAN
Name of Procedure: We will perform a one-sample t-test for a population mean.
Check Conditions:
Random: The problem states that the counselor took a "random sample of 15 students." The condition is met.
10% Condition: The sample size is n = 15. It is reasonable to assume the high school has more than 10 * 15 = 150 students. The condition is met.
Normal/Large Sample: The sample size is n = 15, which is less than 30. Therefore, we must examine a graph of the sample data. A dotplot of the data is shown below:
[Image: Dotplot of the sample homework times (n = 15).]
The dotplot shows no strong skewness and no outliers. Therefore, it is reasonable to proceed with the assumption that the underlying population of homework times is approximately Normal. The condition is met.
Common Mistakes to Avoid
Using Sample Statistics in Hypotheses: Never write hypotheses using x̄ (e.g., H₀: x̄ = 12). Hypotheses are always about the population parameter μ. The entire purpose of the test is to use the sample statistic x̄ to make an inference about the unknown parameter μ.
Incorrectly Checking the Normal/Large Sample Condition:
For n ≥ 30, don't just write "n ≥ 30, so it's Normal." You must explicitly state that the Central Limit Theorem (CLT) applies and that it is the sampling distribution of x̄ that is approximately Normal.
For n < 30, do not check a graph of the population (which you don't have). You must create and comment on a graph of the sample data.
Forgetting the 10% Condition: This condition is a required part of the "Plan" step. Always address it by stating that the population is reasonably at least 10 times your sample size.
Choosing the Wrong Alternative Hypothesis: Read the prompt carefully. "Less than" or "decreased" implies Hₐ: μ < μ₀. "Greater than" or "increased" implies Hₐ: μ > μ₀. "Different from" or "changed" implies a two-sided test, Hₐ: μ ≠ μ₀.
Confusing the Population Distribution with the Sampling Distribution: Do not confuse the condition that the population should be approximately Normal (checked with a graph of the sample data for small samples) with the conclusion that the sampling distribution of x̄ is approximately Normal (the goal of the condition check).