Quick Summary
This guide prepares you to set up a Chi-Square Goodness of Fit Test. You will learn to identify when the test is appropriate (comparing the distribution of a single categorical variable to a hypothesized distribution) and to state the null and alternative hypotheses correctly. You will also learn to calculate expected counts and to verify the conditions (Random, 10%, Large Counts) that the test requires.
Key Concepts
The Chi-Square (χ^2) Goodness of Fit (GOF) Test is an inferential procedure used to determine if the observed distribution of a single categorical variable from a sample is significantly different from a hypothesized or claimed population distribution.
1. Hypotheses for a GOF Test
The hypotheses are statements about the population distribution of the variable, compared against a claimed distribution. They are always stated in words, not symbols.
Null Hypothesis (H₀): The null hypothesis states that the population distribution of the variable matches the hypothesized distribution.
- Template: H₀: The distribution of [categorical variable] is the same as the claimed distribution of [state the claimed proportions/percentages].
Alternative Hypothesis (Hₐ): The alternative hypothesis states that the population distribution differs from the hypothesized one in at least one category. It's a "catch-all" that simply says the null is not true.
- Template: Hₐ: The distribution of [categorical variable] is different from the claimed distribution.
Example: A company claims their bag of mixed nuts contains 50% peanuts, 30% cashews, and 20% almonds.
H₀: The distribution of nuts in the bags is 50% peanuts, 30% cashews, and 20% almonds.
Hₐ: The distribution of nuts in the bags is different from 50% peanuts, 30% cashews, and 20% almonds.
2. Conditions for Inference
Just like with proportions and means, we must check three conditions before proceeding with the test.
Random: The data must come from a random sample or a randomized experiment. This helps ensure the sample is representative of the population.
10% Condition (Independence): When sampling without replacement, the sample size n should be no more than 10% of the population size (n ≤ 0.10N). This allows us to assume independence between observations.
Large Counts: All expected counts must be at least 5. This condition ensures that the chi-square distribution is an appropriate model for the sampling distribution of the test statistic.
- Crucially, you check this condition using expected counts, not the observed counts from your sample.
3. Calculating Expected Counts
The expected count for any category is the number of observations we would expect to see if the null hypothesis were true.
Formula: For each category, Expected Count = n * p
- n is the total sample size.
- p is the hypothesized proportion for that category (from H₀).
Example: You take a random sample of 200 nuts from the company mentioned above.
Expected Peanuts = 200 * 0.50 = 100
Expected Cashews = 200 * 0.30 = 60
Expected Almonds = 200 * 0.20 = 40
To check the Large Counts condition, you would confirm that 100, 60, and 40 are all ≥ 5.
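The expected-count calculation and the Large Counts check can be sketched in a few lines of Python. This is only an illustration of the arithmetic above; the function names `expected_counts` and `large_counts_ok` are made up for this example and are not part of any library.

```python
def expected_counts(n, proportions):
    """Expected count for each category under H0: n * p."""
    return {category: n * p for category, p in proportions.items()}


def large_counts_ok(expected, minimum=5):
    """True if every expected count is at least `minimum`."""
    return all(count >= minimum for count in expected.values())


# Claimed distribution from the mixed-nuts example and a sample of 200 nuts.
claimed = {"peanuts": 0.50, "cashews": 0.30, "almonds": 0.20}
expected = expected_counts(200, claimed)

for category, count in expected.items():
    print(f"Expected {category}: {count:.1f}")
print("Large Counts condition met:", large_counts_ok(expected))
```

Note that the check runs on the expected counts, never on the observed counts.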
4. The Chi-Square Test Statistic (χ^2)
The test statistic measures the discrepancy between the observed counts and the expected counts. A larger χ^2 value indicates a greater difference and provides more evidence against the null hypothesis.
Formula:
χ^2 = Σ [ (Observed - Expected)^2 / Expected ]
- You calculate the component for each category and then sum them all up.
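As a rough sketch of the formula, the snippet below computes each category's component and then sums them. The expected counts are the ones from the mixed-nuts example; the observed counts (110, 55, 35) are hypothetical numbers invented purely for illustration, and the function name `chi_square_components` is likewise an assumption of this example.

```python
def chi_square_components(observed, expected):
    """Per-category components (Observed - Expected)^2 / Expected."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have one entry per category")
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]


# Hypothetical observed counts (invented for illustration), in the
# same category order as the expected counts: peanuts, cashews, almonds.
observed = [110, 55, 35]
expected = [100, 60, 40]

components = chi_square_components(observed, expected)
statistic = sum(components)

for o, e, c in zip(observed, expected, components):
    print(f"observed={o}, expected={e}, component={c:.4f}")
print(f"chi-square statistic = {statistic:.4f}")
```

Keeping the components in a list, rather than only the sum, makes it easy to see which categories contribute most to the statistic.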
5. The Chi-Square Distribution
When the null hypothesis is true and the conditions are met, the χ^2 statistic approximately follows a chi-square distribution, which has the following properties:
It is a family of distributions, defined by degrees of freedom (df).
It is always skewed to the right.
Its values are always non-negative (it starts at 0).
As degrees of freedom increase, the distribution becomes less skewed and more symmetric, resembling a normal curve.
[Image: A graph showing several chi-square distributions with increasing degrees of freedom (e.g., df=2, df=5, df=10), illustrating the right skew and how the shape changes.]
6. Degrees of Freedom (df)
For a Chi-Square Goodness of Fit test, the degrees of freedom depend on the number of categories, not the sample size.
Formula: df = k - 1
- k is the number of categories for the variable.
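The degrees of freedom decide which chi-square curve converts a test statistic into a p-value. The sketch below is one possible way to compute the right-tail area using only Python's standard library; it relies on the closed-form expressions that exist for whole-number df, and the function name `chi_square_p_value` is an assumption of this example. The statistic 8.08 with six categories comes from Practice Problem 1; the second call deliberately uses a wrong df to show how much the p-value changes.

```python
import math


def chi_square_p_value(statistic, df):
    """Right-tail area P(X >= statistic) for a chi-square curve with integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if statistic <= 0:
        return 1.0
    half = statistic / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^j / j!
        total = sum(half ** j / math.factorial(j) for j in range(df // 2))
        return math.exp(-half) * total
    # Odd df: an erfc term plus a finite sum with half-integer gamma values
    total = sum(
        half ** (j - 0.5) / math.gamma(j + 0.5)
        for j in range(1, (df - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


statistic = 8.08
k = 6  # number of categories for a six-sided die

print(f"correct df = {k - 1}: p-value = {chi_square_p_value(statistic, k - 1):.4f}")
print(f"wrong df = 3:   p-value = {chi_square_p_value(statistic, 3):.4f}")
```

With the correct df the p-value is above 0.05, while the wrong df pushes it below 0.05, so an incorrect df can reverse the conclusion of a test.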
Key Vocabulary
Chi-Square Goodness of Fit Test: An inference procedure used to decide whether a sample distribution of a single categorical variable is consistent with a hypothesized population distribution.
Observed Counts: The actual number of observations from the sample that fall into each category.
Expected Counts: The number of observations that would be expected to fall into each category if the null hypothesis (H₀) were true.
Chi-Square Test Statistic (χ^2): A measure of how far the observed counts are from the expected counts, calculated by summing the squared differences divided by the expected counts.
Degrees of Freedom (df): For a GOF test, this is the number of categories minus one (df = k - 1). It defines the specific chi-square distribution used to calculate the p-value.
Calculator Tech (TI-84)
The TI-84 has a built-in function that makes performing a Chi-Square GOF test very efficient.
Scenario: You have your observed counts and have calculated your expected counts.
Enter the Data:
Press STAT -> 1:Edit...
Enter your Observed Counts into list L1.
Enter your Expected Counts into list L2.
Important: The counts in L1 and L2 must line up by category.
Run the Test:
Press STAT -> TESTS.
Scroll down to D:χ^2GOF-Test... and press ENTER.
You will see the following inputs:
- Observed: Make sure this is L1 (or whichever list you used).
- Expected: Make sure this is L2 (or whichever list you used).
- df: Enter the degrees of freedom (k - 1). The calculator will not calculate this for you.
Highlight Calculate and press ENTER.
Interpret the Output:
χ^2: The calculated chi-square test statistic.
p: The p-value associated with your test statistic.
df: The degrees of freedom you entered.
CNTRB: A list of the components for each category. This is useful for follow-up analysis to see which categories contributed most to the χ^2 value.
How to Show Work on the FRQ
For any inference question, use the four-step State-Plan-Do-Conclude process to earn full credit.
STATE
Hypotheses:
H₀: The distribution of [categorical variable in context] is [state the hypothesized proportions/percentages, e.g., 25% for each of four categories].
Hₐ: The distribution of [categorical variable in context] is different from the hypothesized distribution.
Significance Level:
- "We will use a significance level of α = 0.05." (Or use the level given in the problem).
PLAN
Name the Test:
- "We will perform a Chi-Square Goodness of Fit Test."
Check the Conditions:
Random: "The problem states the data come from a random sample of [n subjects/items]."
10% Condition: "The sample size of n = [sample size] is less than 10% of all [population in context]." (Assume this holds if the population is large).
Large Counts: "All expected counts are at least 5." You MUST show the calculation for the expected counts and list them. For example:
Expected Category 1 = n * p₁ = [value]
Expected Category 2 = n * p₂ = [value]
...and so on for all categories.
Then state: "Since all expected counts ([list the values]) are ≥ 5, the condition is met."
DO
General Formula: Write the formula for the test statistic: χ^2 = Σ [ (Observed - Expected)^2 / Expected ].
Calculations:
Degrees of Freedom: df = k - 1 = [value].
Test Statistic: Show the calculation for the first one or two components, then use "..." and write the final value from your calculator.
- χ^2 = ([Observed₁ - Expected₁]^2 / Expected₁) + ([Observed₂ - Expected₂]^2 / Expected₂) + ... = [value from calculator].
P-value: State the p-value from your calculator.
- P-value = [value from calculator].
CONCLUDE
Decision: "Because the p-value of [p-value] is [less than / greater than] our significance level of α = [alpha level], we [reject / fail to reject] the null hypothesis."
Context: "There is [convincing / not convincing] evidence to suggest that the distribution of [categorical variable in context] is different from the hypothesized distribution."
Practice Problems
Problem 1: A casino is concerned that one of its six-sided dice is unfair. They roll the die 300 times and record the following outcomes:
| Outcome | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Count | 41 | 58 | 45 | 53 | 62 | 41 |
Do these data provide convincing evidence at the α = 0.05 level that the die is unfair?
Solution:
STATE:
H₀: The distribution of outcomes for the die is uniform (fair), with each outcome having a probability of 1/6.
Hₐ: The distribution of outcomes for the die is not uniform.
We will use a significance level of α = 0.05.
PLAN:
We will perform a Chi-Square Goodness of Fit Test.
Conditions:
Random: The problem states the die was rolled 300 times, which we can consider a random sample of all possible rolls.
10% Condition: Die rolls are independent of one another and are not sampled without replacement from a finite population, so the 10% condition is not needed here.
Large Counts: For a fair die, we expect each outcome to occur with a probability of 1/6. The total sample size is n=300.
Expected count for each outcome = 300 * (1/6) = 50.
Since the expected count of 50 is ≥ 5 for all six categories, the condition is met.
DO:
Degrees of Freedom: There are k=6 categories (outcomes), so df = 6 - 1 = 5.
Test Statistic: χ^2 = Σ [ (Observed - Expected)^2 / Expected ]
χ^2 = (41-50)^2/50 + (58-50)^2/50 + (45-50)^2/50 + (53-50)^2/50 + (62-50)^2/50 + (41-50)^2/50
χ^2 = 1.62 + 1.28 + 0.5 + 0.18 + 2.88 + 1.62 = 8.08
P-value: Using a TI-84 with χ^2 = 8.08 and df = 5, we find the p-value ≈ 0.1519.
- (Calculator: χ^2cdf(8.08, 1E99, 5))
CONCLUDE:
Because the p-value of 0.1519 is greater than our significance level of α = 0.05, we fail to reject the null hypothesis.
There is not convincing evidence to suggest that the die is unfair. The observed differences from the expected counts are small enough to be attributed to random chance.
Problem 2: The U.S. Census Bureau reported the following age distribution for the year 2010: 19.0% were under 15, 21.2% were 15-29, 26.5% were 30-49, and 33.3% were 50 or older. A sociologist takes a random sample of 500 U.S. residents in the current year and finds 85 are under 15, 110 are 15-29, 120 are 30-49, and 185 are 50 or older. Is there convincing evidence that the current age distribution is different from the 2010 distribution? Use α = 0.01.
Solution:
STATE:
H₀: The current age distribution of U.S. residents is the same as the 2010 distribution (19.0% under 15, 21.2% 15-29, 26.5% 30-49, 33.3% 50+).
Hₐ: The current age distribution of U.S. residents is different from the 2010 distribution.
We will use a significance level of α = 0.01.
PLAN:
We will perform a Chi-Square Goodness of Fit Test.
Conditions:
Random: The problem states the data come from a random sample of 500 U.S. residents.
10% Condition: The sample size of n = 500 is less than 10% of all U.S. residents.
Large Counts: We calculate the expected counts based on the 2010 distribution and n=500.
Expected Under 15: 500 * 0.190 = 95
Expected 15-29: 500 * 0.212 = 106
Expected 30-49: 500 * 0.265 = 132.5
Expected 50+: 500 * 0.333 = 166.5
Since all expected counts (95, 106, 132.5, 166.5) are ≥ 5, the condition is met.
DO:
Degrees of Freedom: There are k=4 age categories, so df = 4 - 1 = 3.
Test Statistic: χ^2 = Σ [ (Observed - Expected)^2 / Expected ]
Observed counts are 85, 110, 120, 185.
χ^2 = (85-95)^2/95 + (110-106)^2/106 + (120-132.5)^2/132.5 + (185-166.5)^2/166.5
χ^2 = 1.0526 + 0.1509 + 1.1792 + 2.0556 ≈ 4.438
Using the χ^2GOF-Test on a TI-84 (L1: {85, 110, 120, 185}, L2: {95, 106, 132.5, 166.5}, df: 3) gives the same value, χ^2 ≈ 4.438.
P-value: The calculator gives a p-value ≈ 0.2179.
CONCLUDE:
Because the p-value of 0.2179 is greater than our significance level of α = 0.01, we fail to reject the null hypothesis.
There is not convincing evidence to suggest that the current age distribution of U.S. residents is different from the 2010 distribution.
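As a cross-check on the arithmetic in Problem 2, the sketch below recomputes the expected counts, the test statistic, and the p-value from the numbers given in the problem. It uses only Python's standard library. The helper `p_value_df3` is an assumption of this example: it uses a closed-form right-tail formula that is valid only for exactly 3 degrees of freedom, so it is not a general replacement for the calculator.

```python
import math

# Hypothesized 2010 proportions and the observed sample counts, in the
# same order: under 15, 15-29, 30-49, 50 or older.
proportions = [0.190, 0.212, 0.265, 0.333]
observed = [85, 110, 120, 185]
n = sum(observed)

expected = [n * p for p in proportions]
components = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
statistic = sum(components)
df = len(observed) - 1


def p_value_df3(x):
    """Right-tail chi-square area, valid only when df is exactly 3."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


p_value = p_value_df3(statistic)
alpha = 0.01
decision = "reject H0" if p_value < alpha else "fail to reject H0"

print("expected counts:", [round(e, 1) for e in expected])
print("components:", [round(c, 4) for c in components])
print(f"chi-square = {statistic:.3f}, df = {df}, p-value = {p_value:.4f}")
print("decision at alpha = 0.01:", decision)
```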
Common Mistakes to Avoid
Using Proportions Instead of Counts: The Chi-Square test statistic formula, χ^2 = Σ [ (Observed - Expected)^2 / Expected ], is defined only for counts. Never use percentages or proportions in the formula itself. Always convert hypothesized percentages to expected counts before calculating.
Incorrect Degrees of Freedom: Remember that for a GOF test, degrees of freedom are based on the number of categories (k), not the sample size (n). The formula is always df = k - 1.
Checking Conditions with Observed Counts: The "Large Counts" condition must be checked using expected counts. It's possible for an observed count to be less than 5, but as long as the corresponding expected count is 5 or more, the condition is met for that category.
Stating the Alternative Hypothesis Too Specifically: Do not state Hₐ as "all proportions are different from the hypothesized values." The test only tells you if the overall distribution is different, meaning at least one of the proportions is different. The standard wording "the distribution is different from the claimed distribution" is the correct and safest way to state Hₐ.