PrepGo

The Language of Variation: Variables - AP Statistics Study Guide

Written by AP Content Team, Verified for 2026 AP Exams, Last updated: May 2026

This guide takes about 14 minutes to read.

Quick Summary

This guide introduces the foundational language of statistics used to describe variation in data. After mastering this lesson, you will be able to identify the individuals and variables in a dataset and confidently classify those variables as either categorical or quantitative. This skill is the critical first step for choosing appropriate graphical displays and statistical analyses, forming the bedrock of everything we will do in this course.

Key Concepts

Statistics is the science of data, and data is born from observing the world around us. The "things" we observe are called individuals, and the characteristics we measure about them are called variables. Understanding the type of variable you are working with is the most important initial step in any statistical investigation.

[Image: A simple flowchart. A box at the top says "Variable". Two arrows point down to two boxes: "Categorical Variable" and "Quantitative Variable". Under "Categorical", examples like "Eye Color" and "State of Residence" are listed. Under "Quantitative", it branches again to "Discrete" (e.g., "Number of Siblings") and "Continuous" (e.g., "Height in cm").]

  • Individuals and Variables

    • An individual (or observational unit) is the person, place, thing, or object described by a set of data. If we are studying the academic performance of students at a high school, each student is an individual.

    • A variable is any characteristic of an individual. Variables can take different values for different individuals. For our high school students, variables could include their GPA, number of AP classes taken, favorite subject, and zip code.

  • Two Fundamental Types of Variables

    Every variable you encounter in this course can be classified into one of two types. The key question to ask is: "Does it make sense to calculate an average (a mean) for the values of this variable?"

    1. Categorical Variables (also called Qualitative Variables)

      • Definition: A categorical variable places an individual into one of several groups or categories. The values are typically labels or names.

      • The Litmus Test: Calculating an average of the values is meaningless. For example, if we have data on eye color (blue, brown, green), you cannot calculate the "average eye color."

      • Examples:

        • Favorite Subject: (Math, English, Science, History)

        • Type of Pet: (Dog, Cat, Fish, Bird)

        • State of Residence: (California, Texas, New York)

        • Student ID Number: While this is a number, it acts as a unique label. The average of a list of ID numbers has no statistical meaning.

        • Zip Code: A classic "trick" variable. It's a number, but its purpose is to designate a geographical location (a category). Averaging the zip codes of all students in a school would produce a nonsensical number.

    2. Quantitative Variables

      • Definition: A quantitative variable takes numerical values for which it makes sense to perform arithmetic operations like adding, subtracting, and, most importantly, averaging.

      • The Litmus Test: Calculating an average of the values is meaningful and provides a useful measure of center. The average height of students in a class is a meaningful statistic.

      • Examples:

        • Height (in inches or cm)

        • GPA (on a 4.0 scale)

        • Age (in years)

        • Number of text messages sent yesterday

        • Time to run a mile (in minutes)

  • Sub-types of Quantitative Variables

    Quantitative variables can be further broken down into two types, which becomes important when choosing certain probability models later in the course.

    • Discrete Quantitative Variable:

      • Definition: A variable that can only take on a finite or "countable" number of values. There are gaps between the possible values.

      • Think: "How many?"

      • Examples:

        • Number of siblings: (0, 1, 2, 3, ... You can't have 2.5 siblings.)

        • Number of AP classes taken: (0, 1, 2, ...)

        • The result of a die roll: (1, 2, 3, 4, 5, 6)

    • Continuous Quantitative Variable:

      • Definition: A variable that can take on any value within a given interval. Between any two possible values, there is always another possible value.

      • Think: "How much?" This type of data is typically measured, not counted.

      • Examples:

        • Height: (A person can be 68 inches, 68.1 inches, 68.11 inches, etc.)

        • Weight: (150.2 lbs, 150.21 lbs, etc.)

        • Exact time to finish a race: (9.58 seconds, 9.581 seconds, etc.)
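
The litmus test described above can be tried out in a few lines of Python. This is only a minimal sketch: the `students` records, the field names, and every value in them are invented for illustration and do not come from any real dataset. It shows that software will happily average any column of numbers, so deciding whether the result means anything is up to you.

```python
from statistics import mean

# Hypothetical data: each dict is one individual (a student),
# and each key is a variable recorded for that individual.
students = [
    {"height_in": 64.0, "siblings": 1, "zip_code": 90210},
    {"height_in": 70.5, "siblings": 0, "zip_code": 10001},
    {"height_in": 67.0, "siblings": 3, "zip_code": 73301},
    {"height_in": 62.5, "siblings": 2, "zip_code": 60614},
]

# Quantitative (continuous): the mean is a meaningful measure of center.
mean_height = mean(s["height_in"] for s in students)

# Quantitative (discrete): the mean is still meaningful,
# even though no single student can have 1.5 siblings.
mean_siblings = mean(s["siblings"] for s in students)

# Categorical stored as a number: Python computes a value,
# but that value does not describe any real location.
mean_zip = mean(s["zip_code"] for s in students)

print(mean_height)    # 66.0
print(mean_siblings)  # 1.5
print(mean_zip)       # 58531.5
```

The last result is a number, yet it is not a zip code anyone lives in, which is why zip code is classified as categorical.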

Key Vocabulary

  • Variable: A characteristic of an individual that can take different values for different individuals.

  • Categorical Variable: A variable that places an individual into a group or category. Its values are labels.

  • Quantitative Variable: A variable that takes numerical values for which arithmetic operations like averaging make sense.

  • Individual: The person, object, or case described by a set of data.

  • Data: The collection of values that a set of variables takes on for a set of individuals.

  • Discrete Variable: A quantitative variable whose possible values are countable, with gaps between them (e.g., number of pets).

  • Continuous Variable: A quantitative variable that can take any value within an interval, typically obtained by measuring (e.g., height).

Calculator Tech (TI-84)

No major calculator functions are required for this topic. The classification of variables is a conceptual skill that you perform before any data is entered into the calculator.

How to Show Work on the FRQ

While an FRQ will rarely ask you to simply "classify this variable," this skill is the required first step for many questions. For example, you must identify the variable type correctly to choose the right graph (e.g., bar chart vs. histogram) or the correct inference procedure (e.g., chi-square test vs. t-test).

When justifying your choice of graph or procedure later in the course, you will earn credit by explicitly identifying the variable and its type.
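
The link between variable type and the choice of tool can be sketched as a simple lookup. The function name `suggest_tools` is hypothetical, and the table contains only the pairings named in this section as examples (bar chart vs. histogram, chi-square test vs. t-test). It is not a complete decision rule, since the right procedure also depends on the question being asked.

```python
def suggest_tools(variable_type: str) -> dict:
    """Return the example display and inference procedure for a variable type.

    Only the pairings mentioned in this guide are included; this is an
    illustrative lookup, not an exhaustive rule.
    """
    tools = {
        "categorical": {"display": "bar chart", "inference": "chi-square test"},
        "quantitative": {"display": "histogram", "inference": "t-test"},
    }
    key = variable_type.strip().lower()
    if key not in tools:
        raise ValueError(f"unknown variable type: {variable_type!r}")
    return tools[key]


print(suggest_tools("categorical")["display"])     # bar chart
print(suggest_tools("Quantitative")["inference"])  # t-test
```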

Template for Justification in an FRQ:

"The variable, [variable name], is quantitative because it is a numerical measurement/count for which an average is meaningful. For example, we can calculate the average [variable name]. Therefore, a [graph or procedure] is an appropriate graphical/analytical tool."

"The variable, [variable name], is categorical because it places each individual into a distinct group, such as [example categories]. An average of these categories would be meaningless. Therefore, a [graph or procedure] is an appropriate graphical/analytical tool."

Practice Problems

Problem 1:

A high school guidance counselor collects the following data from a random sample of 100 seniors:

  • Student's last name

  • Number of AP courses they have taken

  • Primary college major of interest (e.g., Engineering, Biology, Undecided)

  • Their score on the SAT (out of 1600)

  • Whether they have a part-time job (Yes/No)

  • The zip code of their home address

For each of the six variables collected, identify the variable and classify it as either categorical or quantitative. For quantitative variables, further classify them as discrete or continuous. Provide a brief justification for each classification.

Solution:

  • Last Name: This is a categorical variable. It serves as a label for each student (the individual), and two students may even share the same one. It would be meaningless to calculate an "average last name."

  • Number of AP courses taken: This is a quantitative variable because we can calculate a meaningful average number of AP courses for the group of seniors. Specifically, it is discrete because a student can take 0, 1, 2, etc., courses, but not 2.5 courses. The values are countable integers.

  • Primary college major of interest: This is a categorical variable. It places each student into a group based on their academic interest (e.g., the "Engineering" group, the "Biology" group). An average of these majors cannot be calculated.

  • Score on the SAT: This is a quantitative variable because it is a numerical measure of performance, and the average SAT score for the sample is a meaningful statistic. It is technically discrete since scores are reported only in 10-point increments (e.g., 1400, 1410), but because there are so many possible values, it is often treated as continuous for modeling purposes.

  • Whether they have a part-time job: This is a categorical variable. It places students into one of two groups: "Yes" or "No."

  • Zip code: This is a categorical variable. Although it is a number, it represents a geographical location. Calculating the average zip code of the students would produce a number that does not represent a meaningful center of location and has no statistical value.

Problem 2:

A researcher is studying the effectiveness of a new weight-loss program. They record the following information for each of 50 participants:

  • Age (in years)

  • Weight at the start of the program (in pounds)

  • Weight at the end of the program (in pounds)

  • Weight change (calculated from the start and end weights)

  • Satisfaction with the program (rated on a scale of "Very Unsatisfied," "Unsatisfied," "Neutral," "Satisfied," "Very Satisfied")

Identify each variable and classify it as categorical or quantitative. Justify your reasoning.

Solution:

  • Age: This is a quantitative variable. It is a numerical measurement, and we can calculate the mean age of the participants. It is continuous, because age can in principle be measured with increasing precision (e.g., 45.7 years), even when it is recorded in whole years.

  • Weight at the start of the program: This is a quantitative variable. It is a numerical measurement for which an average is meaningful. It is continuous because weight can be measured to many decimal places (e.g., 185.6 pounds, 185.62 pounds).

  • Weight at the end of the program: This is a quantitative variable for the same reasons as starting weight. It is a continuous numerical measurement.

  • Weight change: This is a quantitative variable. It is a calculated numerical value, and the average weight change is a key metric for determining the program's effectiveness. It is also continuous.

  • Satisfaction with the program: This is a categorical variable. It places each participant into one of five ordered categories. While we might assign numbers (1-5) to these levels for analysis, the variable's fundamental nature is categorical. Because the values are labels, we would not compute a mean satisfaction directly; instead, we would analyze the proportion of participants in each category.
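
The two kinds of summary used in this solution can be sketched in Python. The `participants` records below are invented for illustration, and the sign convention for weight change (end weight minus start weight, so a negative value means weight was lost) is an assumption, since the problem statement does not spell out the formula.

```python
from collections import Counter
from statistics import mean

# Hypothetical records for five participants (invented values).
participants = [
    {"start_lb": 200.0, "end_lb": 190.0, "satisfaction": "Satisfied"},
    {"start_lb": 180.0, "end_lb": 176.0, "satisfaction": "Neutral"},
    {"start_lb": 220.0, "end_lb": 205.0, "satisfaction": "Very Satisfied"},
    {"start_lb": 165.0, "end_lb": 166.0, "satisfaction": "Unsatisfied"},
    {"start_lb": 190.0, "end_lb": 180.0, "satisfaction": "Satisfied"},
]

# Quantitative variable: summarize with a mean.
# Assumption: change = end - start, so negative means weight was lost.
changes = [p["end_lb"] - p["start_lb"] for p in participants]
mean_change = mean(changes)

# Categorical variable: summarize with the proportion in each category.
counts = Counter(p["satisfaction"] for p in participants)
proportions = {level: n / len(participants) for level, n in counts.items()}

print(mean_change)               # -7.6
print(proportions["Satisfied"])  # 0.4
```

Note that the quantitative column is reduced to a single center, while the categorical column is reduced to one proportion per category.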

Common Mistakes to Avoid

  • The "It's a Number, So It's Quantitative" Fallacy: This is the most common error. Always ask yourself the key question: "Does an average of this variable make sense in this context?" Jersey numbers, student ID numbers, and zip codes are all numbers, but they are categorical because they function as labels. Averaging them is meaningless.

  • Confusing Discrete vs. Continuous: Remember, if you can count the possible values (even if there are infinitely many, like 0, 1, 2, 3,...), it's discrete. If you have to measure it and it can take any value in an interval, it's continuous. Don't overthink it: the main distinction for AP Statistics is simply categorical vs. quantitative.

  • Misclassifying Survey Scales: A rating scale like "1=Strongly Disagree, 2=Disagree, 3=Neutral..." is technically categorical (specifically, ordinal). For the AP exam, treat it as categorical unless the problem explicitly treats it as quantitative, for example by averaging the numerical responses. The usual way to summarize this kind of data is with the proportion in each category (e.g., "45% of users were 'Satisfied'").

  • Ignoring Context: The classification of a variable can change based on how it is recorded. "Age" is quantitative. But if a survey records age by asking people to check a box ("18-25", "26-35", "36-45"), then the variable becomes categorical. Always pay attention to how the data is actually defined and collected in the problem.
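
The age example in the last point can be sketched in Python. The function name `age_bracket`, the sample ages, and the "other" fallback for ages outside the three listed boxes are all assumptions made for illustration. The sketch shows the same underlying ages recorded two ways: as numbers (quantitative) and as check-box labels (categorical).

```python
from collections import Counter
from statistics import mean


def age_bracket(age: int) -> str:
    """Map a numeric age to the survey check-box it would fall into."""
    if 18 <= age <= 25:
        return "18-25"
    if 26 <= age <= 35:
        return "26-35"
    if 36 <= age <= 45:
        return "36-45"
    return "other"  # assumption: the survey's remaining boxes are not listed


ages = [19, 24, 31, 44, 38]  # hypothetical respondents

# Recorded as numbers: a mean is meaningful.
mean_age = mean(ages)

# Recorded as brackets: only counts or proportions per category make sense.
brackets = [age_bracket(a) for a in ages]
bracket_counts = Counter(brackets)

print(mean_age)                  # 31.2
print(bracket_counts["18-25"])   # 2
```

Once the ages are stored only as brackets, the exact values are gone and the mean can no longer be recovered, which is why the recorded form determines the variable type.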