Constructing a Confidence Interval for a Population Proportion | AP Stats Unit 6 Study Guide

Quick Summary

This guide will equip you to estimate an unknown population proportion with a specific level of confidence. You will learn to construct and interpret a one-sample z-interval for a population proportion by defining the parameter, verifying the necessary conditions for inference, calculating the interval using the standard formula, and communicating your conclusion in the context of the problem. Mastering this four-step process is essential for success on the AP exam.

Key Concepts

The primary goal of a confidence interval is to provide a range of plausible values for an unknown population parameter. For this topic, our parameter of interest is the population proportion, p. Since we rarely know p, we use data from a sample to estimate it.

The Logic of a Confidence Interval

Our estimate starts with a point estimate, which is our single best guess for the parameter. For a population proportion p, the point estimate is the sample proportion, p̂ (read "p-hat").

However, a point estimate is almost certainly wrong. Different random samples will produce different values of p̂. To account for this sampling variability, we build a "cushion" around our point estimate. This cushion is called the margin of error.

The general formula for any confidence interval is:

Point Estimate ± Margin of Error

The Formula for a One-Sample z-Interval for a Proportion

For a population proportion, the specific formula is:

p̂ ± z*√([p̂(1 - p̂)] / n)

Let's break down each component:

p̂ (the point estimate): The sample proportion of successes.
- Formula: p̂ = x / n
- x: The number of successes in the sample.
- n: The total sample size.
z* (the critical value): This multiplier is set by the confidence level and, together with the standard error, determines the width of the interval. It tells us how many standard errors to extend on either side of the point estimate so that the central C% of the sampling distribution of p̂ is covered. It is found using the standard Normal distribution.
- Common Critical Values:
  - 90% Confidence: z* = 1.645
  - 95% Confidence: z* = 1.960
  - 99% Confidence: z* = 2.576
- [Image: A standard Normal curve with the central C% shaded, showing -z* and +z* as the boundaries. For a 95% interval, the central 95% is shaded, and the boundaries are at -1.96 and 1.96.]
√([p̂(1 - p̂)] / n) (the standard error): This is the standard error of the sample proportion (SEp̂). It estimates the typical distance between a sample proportion p̂ and the population proportion p. We use p̂ in the formula because the true proportion p is unknown. This is the key distinction between standard error (uses a statistic, p̂) and standard deviation of the sampling distribution (uses a parameter, p).
z*√([p̂(1 - p̂)] / n) (the margin of error): The entire second half of the formula. It is calculated by multiplying the critical value by the standard error. The margin of error represents half the total width of the confidence interval.
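
The pieces of the formula above can be traced in a few lines of Python. This is only a sketch for checking hand work, not something you would write on the exam. The function name one_prop_z_interval and the sample values (60 successes out of 150) are made up for illustration; the critical value comes from the standard library's NormalDist.

```python
from math import sqrt
from statistics import NormalDist


def one_prop_z_interval(x, n, conf_level=0.95):
    """Return (p_hat, z_star, std_error, margin, lower, upper) for x successes in n trials."""
    p_hat = x / n                                  # point estimate
    # z* leaves (1 - C)/2 in each tail of the standard Normal curve
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    std_error = sqrt(p_hat * (1 - p_hat) / n)      # standard error of p-hat
    margin = z_star * std_error                    # margin of error
    return p_hat, z_star, std_error, margin, p_hat - margin, p_hat + margin


# Hypothetical data: 60 successes in a sample of 150
p_hat, z_star, se, me, lower, upper = one_prop_z_interval(60, 150, 0.95)
print(f"p-hat = {p_hat:.3f}, z* = {z_star:.3f}, SE = {se:.4f}, ME = {me:.4f}")
print(f"interval: ({lower:.4f}, {upper:.4f})")
```

Because the code keeps full precision for z* and the standard error, its last decimal place can differ slightly from a hand calculation that rounds along the way.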

Conditions for Constructing the Interval

Before we can calculate a valid confidence interval, we must verify three crucial conditions. These conditions ensure that our calculations are trustworthy and the sampling distribution of p̂ is approximately Normal.

Random Condition: The data must come from a well-designed random sample or a randomized experiment.
- Why it's important: This condition ensures the sample is representative of the population, which allows us to generalize our findings from the sample to the larger population. It also helps prevent bias.
10% Condition (Independence): When sampling without replacement, the sample size n must be no more than 10% of the population size N (i.e., n ≤ 0.10N).
- Why it's important: This allows us to treat individual observations as independent, even though we are sampling without replacement. This is necessary for calculating the standard error correctly.
Large Counts Condition (Normality): The number of successes and failures in the sample must both be at least 10.
- Why it's important: This condition ensures that the sampling distribution of p̂ is approximately Normal. This justifies our use of the z* critical value from the Normal distribution to build the interval.
- How to check: Verify that np̂ ≥ 10 and n(1 - p̂) ≥ 10. Note that np̂ is simply the number of successes (x) and n(1 - p̂) is the number of failures.
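
The two numeric conditions can be written as simple comparisons. The sketch below is illustrative only: check_conditions and its arguments are hypothetical names, the Random condition cannot be computed and is passed in as a judgment about how the data were collected, and the example numbers (8 successes in a sample of 50 from a population of 1,200) are invented.

```python
def check_conditions(x, n, population_size=None, is_random=True):
    """Return the three condition checks for a one-sample z-interval as a dict."""
    failures = n - x
    return {
        # Random: a judgment about the study design, not a calculation
        "random": is_random,
        # 10%: n <= 0.10 * N; None means N is unknown and must be argued in words
        "ten_percent": None if population_size is None else n <= 0.10 * population_size,
        # Large Counts: successes and failures must both be at least 10
        "large_counts": x >= 10 and failures >= 10,
    }


# Hypothetical sample: 8 successes out of 50, drawn from a population of 1,200
result = check_conditions(8, 50, population_size=1200)
print(result)
```

In this made-up case the 10% condition holds (50 ≤ 120), but Large Counts fails because there are only 8 successes, so a z-interval would not be appropriate.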

Key Vocabulary

Confidence Interval: An interval of plausible values for an unknown population parameter, calculated from sample data.
Confidence Level: The long-run success rate of the method used to construct the interval. A 95% confidence level means that if we were to take many random samples and construct an interval from each, about 95% of those intervals would capture the true population parameter.
Point Estimate: A single value statistic (e.g., p̂) used to estimate a population parameter (e.g., p).
Margin of Error: The value that quantifies the sampling variability in an estimate; it is half the width of the confidence interval and represents the maximum likely difference between the point estimate and the true parameter.
Standard Error: An estimate of the standard deviation of a statistic, calculated from sample data. For a proportion, the standard error of p̂ is √([p̂(1 - p̂)] / n).
Critical Value (z*): The multiplier used to create the margin of error. It is the z-score that corresponds to the specified confidence level in a standard Normal distribution.
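
The "long-run success rate" in the definition of confidence level can be seen in a small simulation. The sketch below assumes a true proportion of 0.40 and samples of size 200 (both invented for illustration), builds a 95% interval from each of 2,000 simulated samples, and reports the share of intervals that capture the true proportion. The function name capture_rate is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def capture_rate(true_p, n, conf_level, num_samples, seed=0):
    """Share of simulated z-intervals that contain the true proportion."""
    rng = random.Random(seed)                      # fixed seed so the run is repeatable
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    captured = 0
    for _ in range(num_samples):
        # Draw one random sample and count the successes
        x = sum(rng.random() < true_p for _ in range(n))
        p_hat = x / n
        margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= true_p <= p_hat + margin:
            captured += 1
    return captured / num_samples


rate = capture_rate(true_p=0.40, n=200, conf_level=0.95, num_samples=2000)
print(f"Share of intervals capturing p: {rate:.3f}")
```

The share should land close to 0.95, though not exactly on it. Any single interval either captures p or it does not; the 95% describes the method across many samples.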

Calculator Tech (TI-84)

You can calculate a one-sample z-interval for a proportion directly on your calculator, which is highly recommended for the "Do" step of an FRQ to ensure accuracy.

Function: 1-PropZInt

Keystrokes:

Press STAT.
Arrow over to the TESTS menu.
Scroll down to option A: 1-PropZInt... and press ENTER.

Inputs:

x: The number of successes in your sample. This must be a whole number. If you are given a percentage, first calculate x by multiplying the percentage (written as a decimal) by the sample size n and rounding to the nearest whole number.
n: The total sample size.
C-Level: The confidence level, entered as a decimal (e.g., 0.95 for 95% confidence).

After entering the values, select Calculate and press ENTER. The calculator will output the confidence interval (lower bound, upper bound) and will also remind you of the p̂ and n you used.
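
The percentage-to-count step for the x input is a one-line calculation. As a rough sketch, the hypothetical helper below converts a reported percentage into a whole-number count; the figures (58% of 1,020) are borrowed from Practice Problem 2 in this guide.

```python
def successes_from_percentage(percent, n):
    """Convert a reported percentage (58 means 58%) into a whole-number count."""
    # Note: Python's round() sends exact .5 ties to the nearest even integer
    return round(percent / 100 * n)


x = successes_from_percentage(58, 1020)   # 0.58 * 1020 = 591.6
print(x)
```

The rounded count is what goes into the x field; entering a decimal there causes a domain error on the calculator.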

How to Show Work on the FRQ

To earn full credit on an inference question, you must use the four-step State-Plan-Do-Conclude process. This structure demonstrates your complete understanding of the inference procedure.

Template for a One-Sample z-Interval for a Proportion

State:

"We want to estimate the true proportion, p, of [describe the population and the success attribute in context] at a [C]% confidence level."

Plan:

"The procedure is a one-sample z-interval for a population proportion."

"We must check the following conditions:"

Random: "[State how the data were collected]. This was a random sample of [context], so the condition is met."
10% Condition: "The sample size is n = [value]. It is reasonable to assume that the total population of [context] is at least 10 * [n] = [10n]. Therefore, the 10% condition is met."
Large Counts: "The number of successes is [x] and the number of failures is [n-x]. Since both [x] and [n-x] are ≥ 10, the Large Counts condition is met."

Do:

"The sample proportion is p̂ = [x] / [n] = [value]."

"The critical value for [C]% confidence is z* = [value]."

"The confidence interval is calculated as follows:"

Formula: p̂ ± z*√([p̂(1 - p̂)] / n)
Substitution: [p̂ value] ± [z* value]√(([p̂ value](1 - [p̂ value])) / [n value])
Calculation: [p̂ value] ± [margin of error value]
Final Interval: ([lower bound], [upper bound])

(It is highly recommended to use the 1-PropZInt function on your calculator for the final interval and then use the formula to show your work.)

Conclude:

"We are [C]% confident that the interval from [lower bound] to [upper bound] captures the true proportion of [describe the population and the success attribute in context]."

Practice Problems

Problem 1:

A local school district wants to know the proportion of its high school students who have a part-time job. A guidance counselor takes a simple random sample of 200 high school students and finds that 85 of them have a part-time job. Construct and interpret a 95% confidence interval for the true proportion of all high school students in this district who have a part-time job.

Solution:

State:

We want to estimate p, the true proportion of all high school students in this district who have a part-time job, with 95% confidence.

Plan:

The procedure is a one-sample z-interval for a population proportion.

We check the conditions:

Random: The problem states that a "simple random sample of 200 high school students" was taken. The condition is met.
10% Condition: The sample size is n = 200. It is reasonable to assume there are more than 10 * 200 = 2000 high school students in a local school district. The condition is met.
Large Counts: The number of successes (have a job) is 85, and the number of failures (do not have a job) is 200 - 85 = 115. Since both 85 and 115 are ≥ 10, the Large Counts condition is met.

Do:

The sample proportion is p̂ = 85 / 200 = 0.425.

The critical value for 95% confidence is z* = 1.96.

The confidence interval is:

Formula: p̂ ± z*√([p̂(1 - p̂)] / n)
Substitution: 0.425 ± 1.96√([0.425(1 - 0.425)] / 200)
Calculation: 0.425 ± 1.96(0.03496)
Calculation: 0.425 ± 0.0685
Final Interval: (0.3565, 0.4935)

(Using TI-84 1-PropZInt with x=85, n=200, C-Level=0.95 gives (0.35649, 0.49351).)

Conclude:

We are 95% confident that the interval from 0.356 to 0.494 captures the true proportion of all high school students in this district who have a part-time job.
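
As a check on the arithmetic in this solution, the short sketch below recomputes the interval for x = 85 and n = 200 at full precision. It uses only the Python standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x, n, conf_level = 85, 200, 0.95

p_hat = x / n                                          # sample proportion
z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)  # critical value
std_error = sqrt(p_hat * (1 - p_hat) / n)              # standard error of p-hat
margin = z_star * std_error                            # margin of error
lower, upper = p_hat - margin, p_hat + margin

print(f"p-hat = {p_hat:.3f}, SE = {std_error:.5f}, ME = {margin:.4f}")
print(f"interval: ({lower:.5f}, {upper:.5f})")
```

Small differences in the last decimal place relative to hand work come from rounding the standard error before multiplying by z*.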

Problem 2:

A national polling organization reports that in a random sample of 1,020 American adults, 58% said they support a new environmental policy. Construct and interpret a 99% confidence interval for the proportion of all American adults who support the policy.

Solution:

State:

We want to estimate p, the true proportion of all American adults who support the new environmental policy, with 99% confidence.

Plan:

The procedure is a one-sample z-interval for a population proportion.

We check the conditions:

Random: The problem states a "random sample of 1,020 American adults" was used. The condition is met.
10% Condition: The sample size is n = 1,020. It is certain that the population of all American adults is greater than 10 * 1,020 = 10,200. The condition is met.
Large Counts: The number of successes is np̂ = 1020 * 0.58 = 591.6, which we round to x = 592. The number of failures is n(1-p̂) = 1020 * 0.42 = 428.4, which we round to 428. Since both 592 and 428 are ≥ 10, the Large Counts condition is met.

Do:

The sample proportion is p̂ = 0.58.

The critical value for 99% confidence is z* = 2.576.

The confidence interval is:

Formula: p̂ ± z*√([p̂(1 - p̂)] / n)
Substitution: 0.58 ± 2.576√([0.58(1 - 0.58)] / 1020)
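Calculation: 0.58 ± 2.576(0.01545)
Calculation: 0.58 ± 0.0398
Final Interval: (0.5402, 0.6198)

(Using TI-84 1-PropZInt with x=592, n=1020, C-Level=0.99 gives (0.54059, 0.62019). The calculator centers its interval at 592/1020 ≈ 0.5804 rather than 0.58, so it differs slightly from the hand calculation.)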

Conclude:

We are 99% confident that the interval from 0.540 to 0.620 captures the true proportion of all American adults who support the new environmental policy.
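
Because this problem reports a percentage rather than a count, the hand calculation (using p̂ = 0.58) and the calculator (using the rounded count x = 592) start from slightly different sample proportions. The sketch below computes both versions so the size of the gap is visible; the helper name z_interval is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(p_hat, n, conf_level):
    """One-sample z-interval for a proportion, given p-hat directly."""
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


n, conf_level = 1020, 0.99

by_hand = z_interval(0.58, n, conf_level)          # reported percentage
by_count = z_interval(592 / n, n, conf_level)      # rounded count, as on the calculator

print(f"using p-hat = 0.58: ({by_hand[0]:.4f}, {by_hand[1]:.4f})")
print(f"using x = 592:      ({by_count[0]:.4f}, {by_count[1]:.4f})")
```

The two intervals differ by only a few ten-thousandths, so either approach supports the same conclusion in context.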

Common Mistakes to Avoid

Misinterpreting the Confidence Level: Do not say, "There is a 95% chance that the true proportion p is in the interval (0.356, 0.494)." The true proportion p is a fixed value; it is either in the interval or it is not. The 95% refers to the reliability of the method used to generate the interval, not the probability of a single interval being correct.
Misinterpreting the Confidence Interval: Avoid saying, "95% of the sample data falls between 0.356 and 0.494." The interval is about the plausible values for the population parameter, not the distribution of the sample data. Stick to the "We are C% confident..." template.
Forgetting or Botching Condition Checks: You must check all three conditions (Random, 10%, Large Counts) in context. Do not just write "n > 30" (that's for means!) or "np > 10". You must show the numbers: "np̂ = 200(0.425) = 85 ≥ 10 and n(1 - p̂) = 200(0.575) = 115 ≥ 10".
Using p̂ in the "State" Step: The "State" step is about the parameter you are trying to estimate, which is the population proportion p. Do not mention the sample proportion p̂ until the "Do" step.
Calculator Input Error: The 1-PropZInt function requires x, the count of successes, not p̂, the sample proportion. If you are given p̂, you must first calculate x = n · p̂ and round to the nearest whole number before using the calculator.
