Setting Up a | AP Stats Unit 7 Study Guide

Quick Summary

This guide will equip you to set up a valid significance test for a population mean. You will learn to correctly formulate the null and alternative hypotheses based on a research question, identify the appropriate inference procedure (a one-sample t-test), and meticulously check the required conditions to ensure your conclusions are statistically sound.

Key Concepts

A significance test for a population mean (μ) allows us to use sample data to assess the evidence for a claim about an unknown population mean. The setup is the most critical part of this process.

1. Formulating Hypotheses

Hypotheses are always statements about a population parameter, in this case, the population mean μ. They are never about the sample statistic (x̄).

The Null Hypothesis (H₀): This is the "statement of no effect" or the status quo. It always contains a statement of equality. For a population mean, it takes the form:
H₀: μ = μ₀
where μ₀ is the specific, hypothesized value of the population mean that we are assuming to be true.
The Alternative Hypothesis (Hₐ): This is the claim we are trying to find evidence for. It determines whether the test is one-sided or two-sided.
- One-Sided (Greater Than): Used when we are looking for evidence that the mean is larger than the hypothesized value.
  Hₐ: μ > μ₀
  (Keywords: greater than, increased, more than)
- One-Sided (Less Than): Used when we are looking for evidence that the mean is smaller than the hypothesized value.
  Hₐ: μ < μ₀
  (Keywords: less than, decreased, fewer than, lower)
- Two-Sided: Used when we are looking for evidence that the mean is simply different from the hypothesized value, without a specific direction.
  Hₐ: μ ≠ μ₀
  (Keywords: different from, changed, not equal to)
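The mapping from research question to symbols can be sketched in Python. This is a minimal illustration only: the function name `state_hypotheses` and the direction labels "greater", "less", and "two-sided" are hypothetical choices made for this example, not AP notation.

```python
# Hypothetical helper: the name and the direction labels are
# illustrative assumptions, not AP notation.
SYMBOLS = {"greater": ">", "less": "<", "two-sided": "≠"}

def state_hypotheses(mu0, direction):
    """Return (H0, Ha) as strings for a test about a population mean."""
    if direction not in SYMBOLS:
        raise ValueError(f"direction must be one of {sorted(SYMBOLS)}")
    h0 = f"H₀: μ = {mu0}"                      # the null always uses equality
    ha = f"Hₐ: μ {SYMBOLS[direction]} {mu0}"   # direction comes from the research question
    return h0, ha

# Example: a suspicion that a mean has "decreased" below 12 is a less-than test.
print(state_hypotheses(12, "less"))
```

Both hypotheses use the same hypothesized value μ₀; only the symbol in Hₐ changes with the wording of the question.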

2. Choosing the Correct Test

When the population standard deviation (σ) is unknown (which is almost always the case in practice), we must estimate it using the sample standard deviation (sₓ). This extra estimation step adds variability, so the standardized statistic no longer follows a Normal model (z-test). Instead, it follows a t-distribution with n − 1 degrees of freedom.

Name of the Test: One-Sample t-test for a Population Mean
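To make the role of sₓ concrete, here is a minimal Python sketch of how the one-sample t statistic is standardized: t = (x̄ − μ₀) / (sₓ / √n), with n − 1 degrees of freedom. The function name `t_statistic` is hypothetical, and the numbers in the example are made up for illustration; they do not come from any problem in this guide.

```python
import math

def t_statistic(xbar, s, n, mu0):
    """Standardize a sample mean using s (the sample SD) in place of sigma."""
    standard_error = s / math.sqrt(n)   # estimated SD of the sample mean
    t = (xbar - mu0) / standard_error
    df = n - 1                          # degrees of freedom for the t-distribution
    return t, df

# Hypothetical summary statistics, for illustration only.
t, df = t_statistic(xbar=11.8, s=0.6, n=45, mu0=12)
print(round(t, 3), df)
```

Computing the statistic belongs to the later "Do" step; it is shown here only to illustrate why estimating σ with sₓ leads to a t procedure.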

3. Checking the Conditions for Inference

Before you can perform the test, you must verify that three conditions are met. These conditions ensure that the sampling distribution of the test statistic is what we expect it to be (a t-distribution), making our calculations valid.

1. Random: The data must come from a well-designed random sample or a randomized experiment. Random selection lets us generalize from the sample to the population and helps prevent bias.
- How to check: Look for the words "random sample" or "randomly selected" in the problem description.
2. 10% Condition (Independence): When sampling without replacement, the sample size n should be no more than 10% of the population size N.
- n ≤ 0.10N
- Why it's important: This condition ensures that individual observations are reasonably independent. If we sample too large a fraction of the population, the probabilities of selection change, and the standard deviation formula becomes inaccurate.
- How to check: State that it's reasonable to assume the population is at least 10 times the sample size. For example, if n=50, state "It's reasonable to assume the population of all [items] is at least 500."
3. Normal/Large Sample: The sampling distribution of the sample mean (x̄) must be approximately Normal. There is a hierarchy for checking this condition:
- a) Population is Normal: If the problem states that the parent population is Normally distributed, this condition is met, regardless of sample size.
- b) Large Sample (Central Limit Theorem): If the sample size is large (n ≥ 30), the Central Limit Theorem (CLT) states that the sampling distribution of x̄ will be approximately Normal, regardless of the shape of the population distribution. This is the most common way to meet the condition.
- c) Small Sample (n < 30): If the sample size is small and the population distribution is unknown, you must examine a graph of the sample data (e.g., a boxplot, dotplot, or histogram). If the graph shows no strong skewness and no outliers, it is reasonable to assume the underlying population distribution is approximately Normal. If there is strong skewness or outliers, you should not proceed with the t-test.

[Image: Three dotplots side-by-side. The first is labeled "Okay to proceed (n=15, roughly symmetric)." The second is labeled "Do not proceed (n=15, strong right skew)." The third is labeled "Do not proceed (n=15, clear outlier)."]
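The condition checks above can be sketched in Python. This is an illustrative sketch, not an AP-required procedure: the function names are hypothetical, and the 1.5 × IQR rule used to flag outliers is an assumption added here (the guide itself only asks you to examine a graph). Python's default quartile method may also differ slightly from a calculator's.

```python
import statistics

def ten_percent_ok(n, population_size):
    """10% condition: n is no more than 10% of N (same as 10n <= N)."""
    return 10 * n <= population_size

def normal_condition(n, population_is_normal=False):
    """Walk the hierarchy: Normal population, then large sample, then graph."""
    if population_is_normal:
        return "met: population stated to be Normal"
    if n >= 30:
        return "met: large sample, CLT applies"
    return "graph the sample: look for strong skewness or outliers"

def has_outliers(data):
    """Flag outliers with the 1.5 x IQR rule (an assumed convention)."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return any(x < low or x > high for x in data)

print(ten_percent_ok(50, 500))      # n = 50 needs a population of at least 500
print(normal_condition(15))         # small sample: the data must be graphed
print(has_outliers([1, 2, 3, 4, 5, 6, 7, 100]))
```

A numeric rule like this cannot judge skewness well, so a sketch or description of the dotplot or boxplot is still needed when the sample is small.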

Key Vocabulary

Null Hypothesis (H₀): The starting assumption or claim about a population parameter (e.g., μ = 100) that a significance test is designed to assess evidence against.
Alternative Hypothesis (Hₐ): The claim about a population parameter that we are trying to find evidence for (e.g., μ > 100).
One-Sample t-test for a Population Mean: The specific statistical test used to draw a conclusion about a single population mean when the population standard deviation (σ) is unknown.
Significance Test: A formal procedure for comparing observed data with a claim (a hypothesis) whose truth is in question.
Conditions for Inference: A set of requirements (Random, 10%, Normal/Large Sample) that must be met to ensure that the calculations and conclusions of a significance test are valid.
Central Limit Theorem (CLT): A fundamental theorem stating that when the sample size (n) is sufficiently large (n ≥ 30), the sampling distribution of the sample mean (x̄) will be approximately Normal, regardless of the population's distribution.

Calculator Tech (TI-84)

This topic focuses on setting up the test, and the calculator's T-Test menu asks for the same pieces you identify during setup.

Path: STAT -> TESTS -> 2: T-Test...

You will see the following screen, where you must input the values defined in your "State" and "Plan" steps.

Input Screen:

Inpt: Data Stats

μ₀: [Enter the hypothesized value from H₀]

x̄: [Enter the sample mean]

sₓ: [Enter the sample standard deviation]

n: [Enter the sample size]

μ: ≠μ₀ <μ₀ >μ₀ [Select the form of your Hₐ]

Calculate Draw

Inpt: Data vs. Stats:
- Choose Data if you have the raw data entered into a list (e.g., L1). The calculator will compute x̄ and sₓ for you.
- Choose Stats if you are given the summary statistics (mean, standard deviation, sample size) in the problem.
μ₀: This is the value from your null hypothesis (H₀: μ = μ₀).
x̄, sₓ, n: These are your sample statistics.
μ: ≠μ₀ <μ₀ >μ₀: This is where you specify your alternative hypothesis. Highlight the symbol that matches your Hₐ.
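For readers who want to double-check the calculator, the Data option can be mirrored with Python's standard library. This is only a sketch of the summary statistics the calculator computes from a list; the variable names are hypothetical, and the values are the homework times from Problem 2 in this guide.

```python
import statistics

# Homework times (hours) from Problem 2, as if entered into list L1.
times = [1.5, 2.0, 2.2, 1.8, 2.5, 2.0, 3.0, 1.7,
         1.9, 2.1, 2.3, 1.6, 2.0, 2.8, 2.4]

n = len(times)                     # sample size
xbar = statistics.mean(times)      # sample mean, x̄
s_x = statistics.stdev(times)      # sample SD, sₓ (divides by n - 1)

print(n, round(xbar, 2), round(s_x, 3))
```

These three values are what you would type in yourself if you chose Stats instead of Data.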

How to Show Work on the FRQ

To get full credit for setting up a significance test on the AP exam, use the State and Plan steps of the four-step inference process.

STATE

Parameter: Define the parameter of interest, μ, in the context of the problem.
- Template: "Let μ be the true mean [description of what is being measured] for the population of [description of the population]."
Hypotheses: State the null and alternative hypotheses using correct symbols and values.
- Template:
  H₀: μ = [hypothesized value]
  Hₐ: μ [>, <, or ≠] [hypothesized value]
Significance Level: State the alpha (α) level if it is given.
- Template: "We will use a significance level of α = [value]."

PLAN

Name of Procedure: State the full name of the test you will perform.
- Template: "We will perform a one-sample t-test for a population mean."
Check Conditions: Check the three conditions for inference, making sure to connect each one to the context of the problem.
- Random: "The problem states that the data come from a random sample of [context]."
- 10% Condition: "The sample size of n = [value] is no more than 10% of the entire population of [context], as it is reasonable to assume there are at least 10 * [n] = [10n] such items."
- Normal/Large Sample:
  - (If n ≥ 30): "Since the sample size is n = [value] ≥ 30, the Central Limit Theorem applies, and we can assume the sampling distribution of the sample mean is approximately Normal."
  - (If n < 30): "Since the sample size is n = [value] < 30, we must examine the sample data. A [dotplot/boxplot/etc.] of the sample data (sketch or describe it) shows no strong skewness or outliers. Therefore, it is reasonable to assume the underlying population distribution is approximately Normal."

Practice Problems

Problem 1:

A coffee machine at a university is designed to dispense an average of 12 fluid ounces of coffee per cup. The student newspaper suspects the machine is under-filling the cups. They take a random sample of 45 cups and record the amount of coffee in each. State the hypotheses and check the conditions for a significance test to determine if the machine is under-filling.

Solution:

STATE

Parameter: Let μ be the true mean amount of coffee (in fluid ounces) dispensed per cup, for the population of all cups filled by this machine.
Hypotheses: We are testing if the machine is under-filling, which means the mean is less than 12 ounces.
H₀: μ = 12
Hₐ: μ < 12
Significance Level: Not specified, so we don't need to state it.

PLAN

Name of Procedure: We will perform a one-sample t-test for a population mean.
Check Conditions:
- Random: The problem states that the students took a "random sample of 45 cups." The condition is met.
- 10% Condition: The sample size is n = 45. It is reasonable to assume that the machine dispenses far more than 10 * 45 = 450 cups in its lifetime. The condition is met.
- Normal/Large Sample: The sample size is n = 45. Since 45 ≥ 30, the Central Limit Theorem applies, and the sampling distribution of the sample mean amount of coffee is approximately Normal. The condition is met.

Problem 2:

A high school counselor wants to know if the mean time students at her school spend on homework per night differs from the national average of 2.1 hours. She takes a random sample of 15 students and records their homework times (in hours):

1.5, 2.0, 2.2, 1.8, 2.5, 2.0, 3.0, 1.7, 1.9, 2.1, 2.3, 1.6, 2.0, 2.8, 2.4

State the hypotheses and check the conditions for a significance test.