PrepGo

AP Statistics Practice Quiz: Least Squares Regression

Written by AP Content Team, Verified for 2026 AP Exams, Last updated: May 2026

Test your understanding with short quizzes. This quiz has 16 questions to check your progress.

All Questions (16)

A researcher develops a least-squares regression model to predict a student's final exam score based on the number of hours they studied. The model is given by: `predicted_score = 45 + 5.2 * hours`. What are the estimated coefficients of this model?

A) The slope is 45 and the y-intercept is 5.2.

B) The slope is 5.2 and the y-intercept is 45.

C) The slope is 45 and the response variable is 5.2.

D) The explanatory variable is 5.2 and the y-intercept is 45.

Correct Answer: B

The coefficients of a least-squares regression model are the estimated slope and y-intercept. In the form `ŷ = a + bx`, `a` is the y-intercept and `b` is the slope. Here, the y-intercept is 45 and the slope is 5.2.
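As a minimal sketch, the model from this question can be written as a small Python function. The names `INTERCEPT`, `SLOPE`, and `predicted_score`, and the sample input of 4 hours, are illustrative assumptions and not part of the original question.

```python
# Coefficients read directly from predicted_score = 45 + 5.2 * hours
INTERCEPT = 45.0  # a: predicted score for 0 hours of study
SLOPE = 5.2       # b: predicted change in score per additional hour


def predicted_score(hours: float) -> float:
    """Evaluate the fitted line y-hat = a + b * x."""
    return INTERCEPT + SLOPE * hours


# Hypothetical input: a student who studied 4 hours
print(round(predicted_score(4), 1))
```

Evaluating the function at 0 hours returns the y-intercept, which is one way to check which number plays which role.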

A real estate agent uses a least-squares regression line to model the price of a house (in thousands of dollars) based on its size (in square feet). The resulting equation is `predicted_price = 50 + 0.12 * size`. How should the slope of this line be interpreted?

A) For every 1 square foot increase in size, the predicted price of the house increases by $50.

B) For every 1 square foot increase in size, the predicted price of the house increases by $120.

C) For every $1 increase in price, the predicted size of the house increases by 0.12 square feet.

D) A house with 0 square feet is predicted to cost $50,000.

Correct Answer: B

The slope is the amount the predicted y-value changes for every unit increase in x. Here, the slope is 0.12. Since the price is in thousands of dollars, a 0.12 increase means 0.12 * $1000 = $120. So, for each additional square foot, the predicted price increases by $120.
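The unit conversion in this explanation can be checked with a short Python sketch. The function names and the house sizes of 2000 and 2001 square feet are assumptions chosen only for illustration.

```python
def predicted_price_thousands(size_sqft: float) -> float:
    """Fitted line from the question; the result is in thousands of dollars."""
    return 50 + 0.12 * size_sqft


def to_dollars(amount_in_thousands: float) -> float:
    """Convert an amount expressed in thousands of dollars to dollars."""
    return amount_in_thousands * 1000


# Compare two hypothetical houses that differ by exactly 1 square foot
change_thousands = predicted_price_thousands(2001) - predicted_price_thousands(2000)
change_dollars = to_dollars(change_thousands)
print(round(change_dollars, 2))
```

The change in predicted price does not depend on which starting size is chosen, because the slope of a line is constant.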

Which of the following is a fundamental property of the least-squares regression line?

A) It minimizes the sum of the absolute values of the residuals.

B) It passes through the origin (0,0).

C) It minimizes the sum of the squared residuals.

D) It connects the minimum and maximum data points.

Correct Answer: C

By definition, the least-squares regression line is the line that minimizes the sum of the squared residuals. This is the core principle that determines how the line is fitted to the data, and it is what distinguishes option C from option A, which describes a different fitting criterion.
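The minimizing property can be illustrated with a short Python sketch. The data values below are made up for illustration, and the helper names `fit_least_squares` and `sum_squared_residuals` are assumptions, not functions from any library.

```python
def fit_least_squares(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def sum_squared_residuals(a, b, xs, ys):
    """Sum of (observed - predicted)^2 for the line y-hat = a + b * x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Made-up data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_least_squares(xs, ys)
best = sum_squared_residuals(a, b, xs, ys)

# Nudging either coefficient away from the fitted value increases the sum
nudged = [
    sum_squared_residuals(a + da, b + db, xs, ys)
    for da, db in [(0.1, 0), (-0.1, 0), (0, 0.1), (0, -0.1)]
]
print(all(value > best for value in nudged))
```

Any other line through the same data gives a larger sum of squared residuals, which is why the fitted line is described as the best fit under this criterion.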

A study on the relationship between the age of a car (in years) and its value (in dollars) found a coefficient of determination (r-squared) of 0.75. What is the correct interpretation of this value?

A) 75% of the cars in the study are accurately priced by the model.

B) There is a 75% chance that the model's prediction is correct.

C) The value of the car is predicted to decrease by 75% each year.

D) 75% of the variation in a car's value is explained by the model based on its age.

Correct Answer: D

r-squared, the coefficient of determination, is the proportion of variation in the response variable (the car's value) that is explained by the linear model using the explanatory variable (age). It says nothing about how many individual predictions are correct or about the rate of change in value.
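One common way to compute this proportion is 1 minus the ratio of residual variation to total variation. The sketch below assumes that formula and uses made-up car ages and values; the function name `r_squared` is hypothetical.

```python
def r_squared(xs, ys):
    """Proportion of variation in ys explained by the least-squares line on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    a = y_bar - b * x_bar
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # unexplained
    ss_tot = sum((y - y_bar) ** 2 for y in ys)                    # total
    return 1 - ss_res / ss_tot


# Made-up data: car age in years and value in dollars
ages = [1, 2, 3, 4, 5, 6]
values = [24000, 21500, 20000, 16500, 15500, 12000]

r2 = r_squared(ages, values)
print(f"{r2:.1%} of the variation in value is explained by the model")
```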

For a set of bivariate data, the correlation coefficient is r = 0.8, the standard deviation of the explanatory variable (x) is sx = 2, and the standard deviation of the response variable (y) is sy = 5. What is the slope of the least-squares regression line?

A) 2.0

B) 0.32

C) 3.125

D) 1.25

Correct Answer: A

The slope `b` of the least-squares regression line can be calculated as `b = r(sy/sx)`. Plugging in the given values: `b = 0.8 * (5 / 2) = 0.8 * 2.5 = 2.0`.
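The same calculation as a short Python sketch; the function name `slope_from_summary` is an assumption made for illustration.

```python
def slope_from_summary(r: float, sx: float, sy: float) -> float:
    """Slope of the least-squares line from summary statistics: b = r * (sy / sx)."""
    if sx <= 0 or sy < 0:
        raise ValueError("sx must be positive and sy must be non-negative")
    return r * (sy / sx)


# Values given in the question
b = slope_from_summary(r=0.8, sx=2, sy=5)
print(round(b, 3))
```

Note the order of the ratio: the standard deviation of the response variable goes on top. Swapping the two gives 0.32, which appears among the incorrect answer choices.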

A regression model is created to predict a person's weight (in pounds) based on their height (in inches). The y-intercept of the model is -120 pounds. Why might this y-intercept not have a logical interpretation in this context?

A) The y-intercept should always be positive.

B) A height of 0 inches is a significant extrapolation and is not a meaningful value in this context.

C) The slope, not the y-intercept, is the most important part of the model.

D) Weight cannot be negative, so the model is invalid.

Correct Answer: B

The y-intercept does not always have a logical interpretation in context. It is the predicted y-value when the explanatory variable is 0. In this case, that is the predicted weight for a person with a height of 0 inches, which is physically impossible and far outside the range of the data used to create the model.

In a least-squares regression analysis, which of the following points is the regression line guaranteed to pass through?

A) The origin (0,0)

B) The point representing the median of x and the median of y.

C) The point representing the mean of x and the mean of y, (x-bar, y-bar).

D) The first data point collected.

Correct Answer: C

The least-squares regression line always passes through the point (x-bar, y-bar), which represents the mean of the explanatory variable and the mean of the response variable. It is not required to pass through the origin, the medians, or any individual data point.
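This property can be checked numerically. The sketch below fits the line from raw sums (the closed-form solution of the normal equations) so that the means are not used in the fit itself, then tests whether the line passes through (x-bar, y-bar). The data and the function name `fit_from_sums` are made up for illustration.

```python
def fit_from_sums(xs, ys):
    """Least-squares intercept and slope computed from raw sums only."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    denom = n * sum_xx - sum_x ** 2
    b = (n * sum_xy - sum_x * sum_y) / denom
    a = (sum_y * sum_xx - sum_x * sum_xy) / denom
    return a, b


# Made-up data, for illustration only
xs = [2, 4, 6, 8, 10]
ys = [3.0, 7.5, 8.0, 12.5, 14.0]

a, b = fit_from_sums(xs, ys)
x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)

# Distance between the line's prediction at x-bar and y-bar (zero up to rounding)
gap = abs((a + b * x_bar) - y_bar)
print(gap < 1e-9)
```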

A biologist models the number of chirps a cricket makes per minute based on the ambient temperature in degrees Fahrenheit. The model is `predicted_chirps = -10 + 4 * temperature`. What is the correct interpretation of the y-intercept?

A) For every 1-degree increase in temperature, the number of chirps is predicted to decrease by 10.

B) At a temperature of 0 degrees Fahrenheit, the model predicts -10 chirps per minute.

C) The minimum number of chirps predicted by the model is -10.

D) For every 4-degree increase in temperature, the number of chirps is predicted to increase by 1.

Correct Answer: B

The y-intercept is the predicted y-value when the explanatory variable is 0. In this context, it is the predicted number of chirps when the temperature is 0 degrees Fahrenheit. Even if a negative number of chirps is not physically possible, this is the correct mathematical interpretation of the coefficient.
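A brief Python sketch of this reading of the coefficients; the function name and the temperatures 60 and 61 are assumptions used only for illustration.

```python
def predicted_chirps(temperature_f):
    """Fitted line from the question: predicted chirps per minute."""
    return -10 + 4 * temperature_f


# The y-intercept is the prediction at a temperature of 0 degrees Fahrenheit
at_zero = predicted_chirps(0)

# The slope is the change in the prediction for a 1-degree increase
change_per_degree = predicted_chirps(61) - predicted_chirps(60)

print(at_zero, change_per_degree)  # -10 4
```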

A least-squares regression line is fitted to a dataset where the mean of the explanatory variable is x-bar = 20 and the mean of the response variable is y-bar = 50. If the y-intercept of the line is a = 10, what is the slope, b?

A) 2.0

B) 2.5

C) 0.5

D) 4.0

Correct Answer: A

The least-squares regression line must pass through the point (x-bar, y-bar). Therefore, the point (20, 50) must satisfy the equation `ŷ = a + bx`. Plugging in the known values: `50 = 10 + b * 20`. Solving for b: `40 = 20b`, so `b = 2.0`.
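Rearranging `y-bar = a + b * x-bar` gives `b = (y-bar - a) / x-bar`, which can be sketched in Python as follows. The function name `slope_from_means` is an assumption.

```python
def slope_from_means(x_bar: float, y_bar: float, intercept: float) -> float:
    """Solve y_bar = intercept + b * x_bar for the slope b."""
    if x_bar == 0:
        # With x_bar = 0 the intercept equals y_bar and the slope is not determined
        raise ValueError("x_bar must be nonzero to solve for the slope")
    return (y_bar - intercept) / x_bar


# Values given in the question
b = slope_from_means(x_bar=20, y_bar=50, intercept=10)
print(b)  # 2.0
```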

What is the primary goal when we estimate the parameters for a least-squares regression line model?

A) To find the correlation coefficient, r.

B) To find the estimated slope and y-intercept.

C) To calculate the mean of the x and y variables.

D) To determine the range of the data.

Correct Answer: B

The content states that we 'Estimate parameters for the least-squares regression line model' and that 'The coefficients of the least-squares regression model are the estimated slope and y-intercept.' Therefore, the goal is to find these two coefficients.

A researcher finds that the correlation between two variables is r = -0.9. The standard deviation of the response variable, sy, is 12, and the standard deviation of the explanatory variable, sx, is 15. What is the slope of the least-squares regression line?

A) 1.125

B) -1.125

C) 0.72

D) -0.72

Correct Answer: D

The formula for the slope is b = r(sy/sx). Using the given values: b = -0.9 * (12 / 15) = -0.9 * 0.8 = -0.72. The sign of the slope must match the sign of the correlation coefficient.

The coefficient of determination for a linear model is found to be 0.49. Which statement is a correct conclusion?

A) The correlation coefficient, r, is 0.7 or -0.7.

B) 49% of the data points lie directly on the regression line.

C) The slope of the regression line is 0.49.

D) The model's predictions are incorrect 51% of the time.

Correct Answer: A

The coefficient of determination, r-squared, is the square of the correlation coefficient, r. Therefore, r is either the positive or the negative square root of r-squared. The square root of 0.49 is 0.7, so the correlation is 0.7 or -0.7; r-squared alone does not reveal the sign. The other options misinterpret r-squared.
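A minimal Python sketch of recovering the two candidate values of r; the function name `possible_r` is hypothetical.

```python
import math


def possible_r(r_squared: float) -> tuple:
    """Return both correlation values consistent with a given r-squared."""
    if not 0 <= r_squared <= 1:
        raise ValueError("r-squared must be between 0 and 1")
    root = math.sqrt(r_squared)
    return (-root, root)


negative_r, positive_r = possible_r(0.49)
print(round(negative_r, 3), round(positive_r, 3))
```

Deciding between the two candidates requires extra information, such as the sign of the slope or the direction of the scatterplot.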

In the equation `predicted_y = 150 - 2x`, what does the value -2 represent?

A) The predicted value of y when x is 0.

B) The amount the predicted y-value decreases for every one-unit increase in x.

C) The coefficient of determination.

D) The mean of the x-variable.

Correct Answer: B

The value -2 is the slope of the regression line. The slope is defined as the amount the predicted y-value changes for every one-unit increase in x. Since the slope is negative, it represents a decrease.

Two different datasets are analyzed. For Dataset A, r = 0.5, sx = 10, sy = 5. For Dataset B, r = 0.5, sx = 5, sy = 10. How does the slope of the regression line for Dataset A (b_A) compare to the slope for Dataset B (b_B)?

A) b_A is four times as large as b_B.

B) b_B is four times as large as b_A.

C) The slopes are equal.

D) b_B is twice as large as b_A.

Correct Answer: B

Using the formula b = r(sy/sx): For Dataset A, b_A = 0.5 * (5/10) = 0.5 * 0.5 = 0.25. For Dataset B, b_B = 0.5 * (10/5) = 0.5 * 2 = 1.0. Comparing the two, b_B (1.0) is four times as large as b_A (0.25).
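The comparison can be reproduced with a short Python sketch; the function name `slope` and the variable names are assumptions made for illustration.

```python
def slope(r: float, sx: float, sy: float) -> float:
    """Slope of the least-squares line: b = r * (sy / sx)."""
    return r * (sy / sx)


# Summary statistics given in the question
b_a = slope(r=0.5, sx=10, sy=5)   # Dataset A
b_b = slope(r=0.5, sx=5, sy=10)   # Dataset B

ratio = b_b / b_a
print(b_a, b_b, ratio)  # 0.25 1.0 4.0
```

Because the two datasets share the same r and simply swap sx and sy, the ratio of the slopes is (10/5) divided by (5/10), which is 4.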

A least-squares regression model is created to predict sales based on advertising spending. The model has an r-squared value of 0.90. Which of the following is the most appropriate conclusion?

A) 90% of the variation in sales is explained by the linear relationship with advertising spending.

B) Increasing advertising spending will cause sales to increase.

C) The relationship between sales and advertising spending is positive and very strong.

D) 90% of the company's sales are a direct result of its advertising spending.

Correct Answer: A

r-squared is the proportion of variation in the response variable (sales) explained by the model. Option A is the precise definition. Option B implies causation, which cannot be concluded from regression alone. Option C makes a claim about the direction of the relationship, but r-squared alone does not reveal the sign of r. Option D confuses variation in sales with the sales themselves.

The process of finding the best-fitting straight line for a set of data by ensuring the sum of the squared differences between observed and predicted values is as small as possible is known as:

A) Coefficient of determination.

B) Correlation analysis.

C) Least-squares regression.

D) Parameter estimation.

Correct Answer: C

This question is a direct definition of the core principle of least-squares regression, which, as stated in the content, is a model that 'minimizes the sum of squared residuals'.