PrepGo

Summary Statistics for a Quantitative Variable - AP Statistics Study Guide

Written by AP Content Team, Verified for 2026 AP Exams, Last updated: May 2026

Quick Summary

This guide will equip you to master the numerical description of quantitative data. You will learn to calculate and interpret various measures of center (mean, median) and variability (standard deviation, IQR, range), understanding how each is affected by the shape of the distribution and the presence of outliers. By the end, you will be able to select the most appropriate summary statistics for a given dataset and precisely describe the position of any value within that dataset.

Key Concepts

1. Measures of Center: Where is the "Typical" Value?

Measures of center give us a single value that attempts to describe the middle or typical entry of a dataset.

  • The Mean (x̄ or μ):

    • What it is: The arithmetic average of the data values.

    • Calculation: Sum all the data values and divide by the number of values (n).

    • Formula: x̄ = (Σxᵢ) / n

    • Key Property: The mean is not resistant. This means it is strongly influenced by outliers and skewness. An unusually high value will pull the mean up, while an unusually low value will pull it down. Think of it as the "balance point" of the distribution.

    • When to use: Best for data that is roughly symmetric and does not have strong outliers.

  • The Median (M):

    • What it is: The midpoint of the distribution.

    • Calculation:

      1. Arrange the data in ascending order.

      2. If the number of data points (n) is odd, the median is the single middle value.

      3. If n is even, the median is the average of the two middle values.

    • Key Property: The median is resistant. It is not significantly affected by outliers or skewness. It only cares about the middle position, not the actual values of the extremes. It divides the data into two equal halves: 50% of the data is at or below the median, and 50% is at or above it.

    • When to use: The best choice for data that is skewed or has outliers.

  • Comparing the Mean and Median: The relationship between the mean and median gives a clue about the shape of the distribution.

    • Symmetric Distribution: Mean ≈ Median

    • Skewed Right Distribution: Mean > Median (The long right tail and any high outliers pull the mean to the right).

    • Skewed Left Distribution: Mean < Median (The long left tail and any low outliers pull the mean to the left).

    [Image: Three distributions (symmetric, skewed left, skewed right) with the mean and median labeled on each to show their relative positions.]
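The difference in resistance can be seen with a short Python sketch. The quiz scores below are made up for illustration, and the variable names are arbitrary; the point is only to compare how much each statistic moves when one unusually high value is added.

```python
from statistics import mean, median

# Hypothetical quiz scores, made up for illustration
scores = [70, 72, 75, 78, 80]

# The same scores with one unusually high value added
scores_with_outlier = scores + [150]

print("Without outlier:", mean(scores), median(scores))
print("With outlier:   ", mean(scores_with_outlier), median(scores_with_outlier))
```

In this made-up example the mean jumps from 75 to 87.5, while the median only moves from 75 to 76.5.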

2. Measures of Variability (Spread): How Spread Out is the Data?

Measures of variability describe how much the data values differ from each other and from the center.

  • The Range:

    • What it is: The difference between the maximum and minimum values.

    • Calculation: Range = Maximum - Minimum

    • Key Property: Very easy to calculate, but it is not resistant as it is determined entirely by the two most extreme values (potential outliers).

  • The Interquartile Range (IQR):

    • What it is: The range of the middle 50% of the data. It measures the spread of the data while ignoring the extremes.

    • Calculation: IQR = Q3 - Q1

      • Quartile 1 (Q1): The 25th percentile. The median of the lower half of the data.

      • Quartile 3 (Q3): The 75th percentile. The median of the upper half of the data.

    • Key Property: The IQR is resistant to outliers and skewness, making it an excellent measure of spread for non-symmetric data. It is the natural partner to the median.

  • The Standard Deviation (s or σ):

    • What it is: The "typical" or "average" distance of a data point from the mean. A small standard deviation means data points are clustered tightly around the mean. A large standard deviation means data points are widely spread out.

    • Calculation: Conceptually, it's the square root of the (approximate) average squared deviation from the mean. For a sample, the sum of squared deviations is divided by n - 1 rather than n. You will almost always use a calculator for this.

    • Formula: s = √[ Σ(xᵢ - x̄)² / (n - 1) ]

    • Key Properties:

      • The standard deviation is not resistant. Because it uses the mean in its calculation, it is heavily affected by outliers and skew.

      • Standard deviation can never be negative. It is 0 only if all data values are identical.

      • It is the natural partner to the mean.
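As a rough sketch, the three measures of spread can be computed in Python as follows. The data are made up, and the `quartiles` helper is a hypothetical function written for this example; it assumes the "median of each half" method described above, with the middle value left out of both halves when n is odd. Other software may use slightly different quartile rules.

```python
from statistics import median, stdev

def quartiles(data):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    ordered = sorted(data)
    half = len(ordered) // 2
    lower = ordered[:half]     # values below the median position
    upper = ordered[-half:]    # values above the median position
    return median(lower), median(upper)

# Hypothetical data, made up for illustration
data = [4, 7, 8, 10, 12, 15, 21]

data_range = max(data) - min(data)
q1, q3 = quartiles(data)
iqr = q3 - q1
s = stdev(data)                # sample standard deviation (divides by n - 1)

print("Range:", data_range)
print("IQR:", iqr)
print("Standard deviation:", round(s, 3))
```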

3. The Five-Number Summary and Outliers

This summary provides a quick and effective overview of a distribution's center and spread.

  • The Five-Number Summary: Consists of five key values, always listed in this order:

    1. Minimum

    2. First Quartile (Q1)

    3. Median (M or Q2)

    4. Third Quartile (Q3)

    5. Maximum

  • Identifying Outliers: The 1.5 x IQR Rule

    This is the formal mathematical rule for identifying potential outliers.

    1. Calculate the IQR (Q3 - Q1).

    2. Calculate the Upper Fence: Q3 + 1.5(IQR)

    3. Calculate the Lower Fence: Q1 - 1.5(IQR)

    4. Any data point that falls above the upper fence or below the lower fence is considered an outlier.
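The four steps above can be sketched in Python. The function names and the dataset are hypothetical, chosen only for this example, and the quartiles again assume the "median of each half" method.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using medians of the two halves."""
    ordered = sorted(data)
    half = len(ordered) // 2
    q1 = median(ordered[:half])
    q3 = median(ordered[-half:])
    return ordered[0], q1, median(ordered), q3, ordered[-1]

def outlier_fences(data):
    """Return (lower fence, upper fence) from the 1.5 x IQR rule."""
    _, q1, _, q3, _ = five_number_summary(data)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(data):
    """Return every value that falls outside the fences."""
    lower, upper = outlier_fences(data)
    return [x for x in data if x < lower or x > upper]

# Hypothetical data, made up for illustration
values = [2, 20, 22, 23, 25, 26, 28, 30, 55]

print("Five-number summary:", five_number_summary(values))
print("Fences:", outlier_fences(values))
print("Outliers:", find_outliers(values))
```

For this made-up dataset, Q1 = 21 and Q3 = 29, so the fences are 9 and 41, and both 2 and 55 are flagged.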

4. Choosing the Right Summary Statistics

This is a critical skill for describing distributions.

  • If the distribution is roughly symmetric and has no outliers, use the Mean for center and the Standard Deviation for spread.

  • If the distribution is skewed or has outliers, use the Median for center and the IQR for spread.

5. The Effect of Linear Transformations

What happens to summary statistics if we change the units of our data? (e.g., converting from inches to centimeters, or adding 5 points to every test score).

Let's say we transform our original data x into new data y using the formula y = a + bx.

  • Measures of Center (Mean, Median, Quartiles):

    • Adding a constant a to each data point adds a to the measure of center.

    • Multiplying each data point by b multiplies the measure of center by b.

    • Rule: Measures of center are affected by both addition/subtraction and multiplication/division.

  • Measures of Spread (Range, IQR, Standard Deviation):

    • Adding a constant a to each data point does NOT change the spread. (Imagine shifting the entire dataset on a number line; its width doesn't change).

    • Multiplying each data point by b multiplies the measure of spread by the absolute value of b, |b|. (Spread cannot be negative).

    • Rule: Measures of spread are affected only by multiplication/division.
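These rules can be checked numerically. The sketch below assumes the transformation has the form new value = a + b × (old value); the names `a`, `b`, and `linear_transform` are chosen for this example, and the data are made up. A negative multiplier is used on purpose to show why the spread scales by the absolute value.

```python
from statistics import mean, median, stdev

def linear_transform(data, a, b):
    """Return a + b*x for every x in data."""
    return [a + b * x for x in data]

def spread_range(data):
    """Range = maximum - minimum."""
    return max(data) - min(data)

# Hypothetical data, made up for illustration
x = [2, 4, 6, 8, 10]
y = linear_transform(x, a=5, b=-3)

# Center: shifted by a and scaled by b
print("Mean:  ", mean(x), "->", mean(y))        # 5 + (-3)(6) = -13
print("Median:", median(x), "->", median(y))

# Spread: ignores a, scales by |b| = 3
print("Range: ", spread_range(x), "->", spread_range(y))
print("SD:    ", round(stdev(x), 3), "->", round(stdev(y), 3))
```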

Key Vocabulary

  • Mean (x̄): The arithmetic average of a dataset. It is not resistant to outliers.

  • Median (M): The midpoint of a sorted dataset. It is resistant to outliers.

  • Standard Deviation (s): A measure of the typical distance of a data point from the mean. It is not resistant.

  • Interquartile Range (IQR): The range of the middle 50% of the data (Q3 - Q1). It is a resistant measure of spread.

  • Resistant: A statistic is resistant if its value is not strongly affected by extreme values (outliers) in the dataset.

  • Five-Number Summary: A list of key values describing a distribution: Minimum, Q1, Median, Q3, Maximum.

  • Outlier: A data point that falls significantly above or below the main pattern of the data, often identified using the 1.5 x IQR rule.

Calculator Tech (TI-84)

To calculate all key summary statistics at once, use the 1-Var Stats function.

Example: Find the summary statistics for the dataset: {10, 12, 15, 15, 17, 22, 30}

Step 1: Enter the data into a list.

  1. Press STAT.

  2. Select 1:Edit....

  3. If there is data in list L1, move the cursor to highlight L1 at the top, press CLEAR, then ENTER.

  4. Type each data point into L1, pressing ENTER after each one.

Step 2: Calculate the statistics.

  1. Press STAT again.

  2. Use the right arrow to move to the CALC menu.

  3. Select 1: 1-Var Stats.

  4. The 1-Var Stats menu will appear.

    • List: Make sure it says L1 (or whichever list you used). To get L1, press 2nd -> 1.

    • FreqList: Leave this blank.

    • Calculate: Highlight this and press ENTER.

Step 3: Read the output screen.

You will see a screen full of values. Here are the important ones for this topic:

  • x̄: The mean.

  • Σx: The sum of the data values.

  • Sx: The sample standard deviation. This is the one you will almost always use in AP Statistics.

  • σx: The population standard deviation. (Use only if you know you have data for the entire population).

  • n: The number of data points.

  • minX: The minimum value.

  • Q1: The first quartile.

  • Med: The median.

  • Q3: The third quartile.

  • maxX: The maximum value.

Note: The calculator provides all five components of the five-number summary. You must calculate the IQR and Range yourself from these values.
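If you want to double-check the calculator away from the exam, a rough Python analog of 1-Var Stats might look like the sketch below. The function `one_var_stats` and its dictionary keys are hypothetical, chosen to mirror the calculator's labels. The quartiles assume the "median of each half" method, which should agree with the calculator for this dataset but may not match every software package.

```python
from statistics import mean, median, stdev, pstdev

def one_var_stats(data):
    """Return a dictionary of summary statistics labeled like the TI-84 output."""
    ordered = sorted(data)
    half = len(ordered) // 2
    return {
        "mean": mean(ordered),
        "sum": sum(ordered),
        "Sx": stdev(ordered),        # sample SD (divides by n - 1)
        "sigma_x": pstdev(ordered),  # population SD (divides by n)
        "n": len(ordered),
        "minX": ordered[0],
        "Q1": median(ordered[:half]),
        "Med": median(ordered),
        "Q3": median(ordered[-half:]),
        "maxX": ordered[-1],
    }

# Dataset from the example above
stats = one_var_stats([10, 12, 15, 15, 17, 22, 30])
for name, value in stats.items():
    print(f"{name}: {value}")

# As on the calculator, IQR and range are computed by hand from the output
iqr = stats["Q3"] - stats["Q1"]
data_range = stats["maxX"] - stats["minX"]
print("IQR:", iqr, "Range:", data_range)
```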

How to Show Work on the FRQ

For this topic, FRQs will ask you to calculate and interpret statistics, or to compare distributions. Simply stating a number is not enough. You must always relate your answer back to the context of the problem.

Template for Interpreting a Statistic:

"The [name of statistic] is [value with units]. This means that [interpretation in context]."

  • Interpreting the Median: "The median test score is 85 points. This means that about half of the students scored at or below 85 points, and about half scored at or above 85 points."

  • Interpreting the Standard Deviation: "The standard deviation of the commute times is 4.5 minutes. This means that the typical distance a student's commute time is from the mean commute time is about 4.5 minutes."

  • Interpreting the IQR: "The IQR for the salaries is $12,000. This means that the range of the middle 50% of salaries is $12,000."

Template for Showing Outlier Calculation:

  1. State the Rule: "To check for outliers, I will use the 1.5 x IQR rule."

  2. Calculate IQR: "First, I calculated the IQR = Q3 - Q1 = [value] - [value] = [IQR value]."

  3. Calculate Fences: "Lower Fence = Q1 - 1.5(IQR) = [value] - 1.5([IQR value]) = [lower fence value]." "Upper Fence = Q3 + 1.5(IQR) = [value] + 1.5([IQR value]) = [upper fence value]."

  4. Conclusion: Use whichever statement applies. Either "Since [data point] is below the lower fence of [value] / above the upper fence of [value], it is an outlier." or "There are no data points outside of these fences, so there are no outliers."

Practice Problems

Problem 1:

The following data represent the number of hours a small group of 10 students spent studying for a final exam: 8, 10, 12, 12, 13, 14, 15, 16, 18, 26.

(a) Calculate the five-number summary for these data.

(b) Calculate the IQR and check for outliers using the 1.5 x IQR rule.

(c) Which measures of center and spread (Mean/SD or Median/IQR) would be more appropriate to describe this distribution? Justify your answer.

Solution:

(a) First, order the data: 8, 10, 12, 12, 13, 14, 15, 16, 18, 26.

  • Minimum: 8

  • Maximum: 26

  • Median: With n=10 (even), the median is the average of the 5th and 6th values. (13 + 14) / 2 = 13.5

  • Q1: The median of the lower half (8, 10, 12, 12, 13) is 12.

  • Q3: The median of the upper half (14, 15, 16, 18, 26) is 16.

  • Five-Number Summary: Minimum=8, Q1=12, Median=13.5, Q3=16, Maximum=26.

(b)

  • Calculate IQR: IQR = Q3 - Q1 = 16 - 12 = 4 hours.

  • Calculate Fences:

    • Lower Fence = Q1 - 1.5(IQR) = 12 - 1.5(4) = 12 - 6 = 6 hours.

    • Upper Fence = Q3 + 1.5(IQR) = 16 + 1.5(4) = 16 + 6 = 22 hours.

  • Conclusion: The minimum value (8) is not below the lower fence of 6. However, the maximum value of 26 hours is above the upper fence of 22 hours. Therefore, 26 is an outlier.

(c) The Median and IQR are more appropriate. The presence of the outlier at 26 hours skews the distribution to the right. Resistant measures like the median and IQR provide a better description of the typical study time and the spread of the middle 50% of students, as they are not influenced by this extreme value. The mean would be pulled higher by the outlier, making it seem like the "typical" student studied longer than they actually did.
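As an optional check on this solution, the Python sketch below recomputes the quartiles, fences, and outlier for the Problem 1 data, and compares the mean with the median to support part (c). The variable names are arbitrary, and the quartiles assume the same "median of each half" method used in the solution.

```python
from statistics import mean, median

# Study hours from Problem 1
hours = [8, 10, 12, 12, 13, 14, 15, 16, 18, 26]

ordered = sorted(hours)
half = len(ordered) // 2
q1 = median(ordered[:half])      # median of the lower five values
q3 = median(ordered[-half:])     # median of the upper five values
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in ordered if x < lower_fence or x > upper_fence]

print("Q1, Q3, IQR:", q1, q3, iqr)
print("Fences:", lower_fence, upper_fence)
print("Outliers:", outliers)
print("Mean vs. median:", mean(hours), median(hours))
```

The mean works out to 14.4 hours against a median of 13.5 hours, which is consistent with the outlier at 26 pulling the mean upward.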


Problem 2:

A company that rents electric scooters charges a $1.00 unlocking fee plus $0.30 per minute. The summary statistics for the duration of 500 recent rides are: Mean = 12.2 minutes, Median = 10.5 minutes, Standard Deviation = 4.0 minutes, IQR = 6.0 minutes. The company decides to change its pricing to a $2.00 unlocking fee plus $0.30 per minute. What will be the new mean and standard deviation of the cost of the rides?

Solution:

  1. Define the Transformation: First, let's find the cost of the original rides. Let M be the minutes and C_orig be the cost. The original cost is C_orig = 1.00 + 0.30M. We need to find the mean and standard deviation of C_orig.

  2. Apply Transformation Rules to Original Cost:

    • Mean: Measures of center are affected by both multiplication and addition.

      • Mean(C_orig) = 1.00 + 0.30 * (12.2) = 1.00 + 3.66 = $4.66

    • Standard Deviation: Measures of spread are affected only by multiplication.

      • SD(C_orig) = 0.30 * (4.0) = $1.20

  3. Analyze the New Pricing: The new pricing simply adds a $1.00 surcharge to the original cost (C_new = C_orig + 1.00). This is a simple addition transformation.

  4. Apply Transformation Rules to New Cost:

    • New Mean: Adding a constant affects the mean.

      • New Mean = Mean(C_orig) + 1.00 = $4.66 + $1.00 = $5.66

    • New Standard Deviation: Adding a constant does not affect the standard deviation.

      • New Standard Deviation = SD(C_orig) = $1.20

Final Answer: The new mean cost will be $5.66 and the new standard deviation of the cost will be $1.20.
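The same arithmetic can be sketched in Python by applying the transformation rules directly to the summary statistics. The helper `transform_summary` is a hypothetical function written for this example; it assumes cost = a + b × minutes, where a is the unlocking fee and b is the per-minute rate.

```python
def transform_summary(mean_x, sd_x, a, b):
    """Mean and SD of a + b*x, given the mean and SD of x."""
    new_mean = a + b * mean_x      # center: shifted by a, scaled by b
    new_sd = abs(b) * sd_x         # spread: scaled by |b| only
    return new_mean, new_sd

# Ride durations: mean 12.2 minutes, SD 4.0 minutes
orig_mean, orig_sd = transform_summary(12.2, 4.0, a=1.00, b=0.30)
new_mean, new_sd = transform_summary(12.2, 4.0, a=2.00, b=0.30)

print(f"Original pricing: mean ${orig_mean:.2f}, SD ${orig_sd:.2f}")
print(f"New pricing:      mean ${new_mean:.2f}, SD ${new_sd:.2f}")
```

Going straight from minutes to the new cost gives the same result as the two-step solution above, because only the added constant changed.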

Common Mistakes to Avoid

  • Using Mean/SD for Skewed Data: This is the most common conceptual error. If you are told a distribution is skewed or you identify an outlier, you MUST use the Median and IQR to describe center and spread. Justifying this choice is a frequent FRQ topic.

  • Confusing Resistant and Non-Resistant Statistics: Memorize this: Median and IQR are resistant. Mean, Standard Deviation, and Range are not. Don't mix them up.

  • Incorrectly Applying Transformations to Spread: Remember that adding or subtracting a constant to every data point shifts the entire distribution but does not change its spread. Standard deviation and IQR are unaffected by addition/subtraction.

  • Reporting Calculator Syntax as Work: Writing "1-Var Stats L1" is not sufficient work for an FRQ. You must show the formula for the outlier rule or write a sentence interpreting a statistic in context.

  • Forgetting Context and Units: Your final answers for statistics should always include units (e.g., 13.5 hours, $1.20) and your interpretations must be in the context of the problem (e.g., "the typical distance from the mean ride time is...").