The Core Idea: Semi-log Plots
In mathematics, we often seek a function that best models a set of data. For data that grows or decays by a roughly constant percentage over equal intervals, an exponential model may be appropriate. However, a standard scatterplot of the data can be misleading: a curve that appears exponential might also be well represented by a different type of function, such as a quadratic or power function. The fundamental challenge is to determine, with confidence, whether an exponential model is truly the best choice.
This is where the semi-log plot becomes an essential analytical tool. By changing the scale of the vertical axis from linear to logarithmic, a semi-log plot transforms data. If the underlying relationship between the variables is truly exponential, this transformation will cause the data points to appear arranged in a straight line. Our eyes are excellent at detecting linearity, making this a powerful test. This topic also extends to the broader concept of model assessment. Beyond specific tools like semi-log plots, we can determine if any model is a good fit by visually inspecting how closely its graph follows the data or by analyzing a residual plot for randomness.
Key Definitions and Principles
This topic focuses on graphical tests for determining the appropriateness of an exponential model and the general fitness of any model.
Semi-log Plot
A semi-log plot is a graphical representation of data on a coordinate plane where the vertical axis (y-axis) has a logarithmic scale and the horizontal axis (x-axis) has a linear scale.
Linear Scale: Values are spaced evenly. The distance between 1 and 2 is the same as the distance between 9 and 10.
Logarithmic Scale: Values are spaced according to their logarithm. The distance between 1 and 10 is the same as the distance between 10 and 100, and between 100 and 1000. This scale compresses larger values, allowing a wide range of data to be displayed.
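The spacing claim for a logarithmic scale can be checked numerically. The following is a minimal sketch, assuming a base-10 scale; the variable names are illustrative only. It computes where each value would sit on a logarithmic axis and the gap between neighboring positions.

```python
import math

# Values that are each 10 times the previous one
values = [1, 10, 100, 1000]

# On a base-10 logarithmic axis, a value is positioned at log10(value)
positions = [math.log10(v) for v in values]

# Gap between neighboring positions on the axis
gaps = [right - left for left, right in zip(positions, positions[1:])]

print("positions:", positions)
print("gaps:     ", gaps)
```

Every gap comes out the same, which is why 1 to 10, 10 to 100, and 100 to 1000 look equally far apart on a logarithmic axis.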
The Linearity Test for Exponential Models
This is the central principle for using semi-log plots.
Rule: If a set of data has an exponential relationship (i.e., it can be modeled by a function of the form $y = ab^x$), then a semi-log plot of that data will be approximately linear.
Converse: If a semi-log plot of a dataset is approximately linear, then an exponential model is appropriate for that data.
Assessing a Model's Fit
These are general principles for determining if a proposed model is reasonable for a given dataset.
Visual Fit: A model is considered a good fit if its graph on a standard scatterplot appears to pass through or very close to the data points, capturing the overall trend of the data.
Residual Analysis: A residual is the directed vertical distance between an observed data point and the value predicted by a model. It is calculated as $\text{residual} = \text{actual value} - \text{predicted value}$. A model is considered a good fit if a plot of its residuals against the independent variable shows the points are randomly scattered with respect to the horizontal axis. There should be no discernible pattern (e.g., a U-shape, a funnel shape, or a clear trend).
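The residual calculation can be sketched in a few lines. This is an illustration only: the observed and predicted values below are made up, and the helper name `compute_residuals` is hypothetical.

```python
def compute_residuals(observed, predicted):
    """Return actual minus predicted for each data point."""
    return [obs - pred for obs, pred in zip(observed, predicted)]

# Hypothetical observed values and model predictions (illustration only)
observed = [3.0, 5.5, 6.8, 9.1]
predicted = [3.2, 5.0, 7.0, 9.0]

res = compute_residuals(observed, predicted)

for obs, pred, r in zip(observed, predicted, res):
    # A positive residual means the model predicted too low a value
    side = "under-predicted" if r > 0 else "over-predicted"
    print(f"actual={obs}, predicted={pred}, residual={r:+.2f} (model {side})")
```

A positive residual means the observed point lies above the model's graph; a negative residual means it lies below.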
Understanding Data Transformation
The core function of a semi-log plot is to perform a data transformation. We are not changing the data values themselves, but rather the way they are displayed. An exponential function, such as $y = ab^x$, has a constant multiplicative rate of change. By applying a logarithm to the y-values, we are essentially converting this multiplicative relationship into an additive one.
Consider the equation $y = ab^x$. If we take the logarithm of both sides (e.g., the natural log), we get:
$$\ln y = \ln\left(ab^x\right) = \ln a + x \ln b$$
If we let $Y = \ln y$, $B = \ln a$, and $m = \ln b$, the equation becomes $Y = mx + B$. This is the equation of a line where the new vertical variable $Y$ is the logarithm of the original $y$. A semi-log plot achieves this transformation graphically. By plotting the original $(x, y)$ pairs on axes where the vertical scale is logarithmic, we are effectively plotting $(x, \ln y)$ on linear axes. This is why an exponential relationship appears linear on a semi-log plot. This powerful visual check allows us to confirm if our choice of an exponential model is justified by the data's underlying structure.
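The algebra above can be checked numerically. The sketch below assumes hypothetical parameters a = 3 and b = 1.5 for an exponential model y = a·b^x (these numbers are not from the text). It takes the natural log of each y-value and confirms that the transformed values rise by the same amount at every step, with that amount equal to ln(b) and the starting value equal to ln(a).

```python
import math

# Hypothetical parameters for y = a * b**x (illustration only)
a, b = 3.0, 1.5

xs = [0, 1, 2, 3, 4, 5]
ys = [a * b ** x for x in xs]

# Transformed vertical variable: Y = ln(y)
Ys = [math.log(y) for y in ys]

# For evenly spaced x (step 1), the change in Y per step is the slope of the line
slopes = [Y2 - Y1 for Y1, Y2 in zip(Ys, Ys[1:])]

print("step-to-step change in ln(y):", [round(s, 6) for s in slopes])
print("ln(b):", round(math.log(b), 6))
print("ln(y) at x = 0:", round(Ys[0], 6), "| ln(a):", round(math.log(a), 6))
```

A constant step-to-step change is exactly what "linear" means for evenly spaced inputs, so the multiplicative pattern in y has become an additive pattern in ln(y).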
Core Concepts & Rules
A semi-log plot is defined by its axes: the vertical (y) axis is logarithmic, and the horizontal (x) axis is linear.
The primary use of a semi-log plot in this course is to determine if an exponential model is appropriate for a set of data.
Data that can be modeled by an exponential function will appear as a straight line on a semi-log plot.
If a dataset's semi-log plot is not linear, an exponential model is not an appropriate choice.
A good model should visually "fit" the data on a standard scatterplot, meaning its graph follows the trend of the points closely.
A good model will produce a residual plot where the residual points are randomly scattered around the horizontal axis ($\text{residual} = 0$).
A clear pattern in a residual plot (e.g., a curve, a fan shape) indicates that the chosen model is not a good fit for the data.
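To see what a patterned residual plot looks like in numbers, the sketch below deliberately fits the wrong model: a least-squares line through made-up data that doubles at each step. The data and the helper name `fit_line` are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Least-squares line through the points; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up exponential data: y doubles each time x increases by 1
xs = [0, 1, 2, 3, 4, 5]
ys = [2 ** x for x in xs]

# Fit a linear model on purpose, then compute residual = actual - predicted
slope, intercept = fit_line(xs, ys)
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]

for x, r in zip(xs, residuals):
    print(f"x={x}: residual={r:+.3f}")
```

The residuals are positive at both ends and negative in the middle. That U-shape is the kind of clear pattern that signals the linear model is missing the curvature in the data.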
Step-by-Step Example 1: Analyzing a Semi-log Plot
Problem: A biologist tracks the area covered by a specific type of algae in a pond over several weeks. The data is presented in the table below. Two plots of the data are also provided: a standard scatterplot and a semi-log plot. Based on the plots, is an exponential model appropriate for this data? Justify your answer.
Data:
| Weeks (t) | Area (sq. meters) |
|---|---|
| 0 | 5.1 |
| 1 | 9.8 |
| 2 | 21.2 |
| 3 | 43.5 |
| 4 | 85.9 |
| 5 | 175.1 |
Plots:
Plot A: Standard Scatterplot (A graph with linear x-axis and linear y-axis, showing points that form a steep upward curve).
Plot B: Semi-log Plot (A graph with linear x-axis and logarithmic y-axis, showing the same points forming a nearly perfect straight line).
Solution:
Step 1: Analyze the Standard Scatterplot (Plot A).
Observe the standard scatterplot where both axes are linear. The data points clearly form a curve that is increasing at an increasing rate. This visual evidence suggests that an exponential model might be appropriate, but it is not conclusive. Other function types, like a parabola, could also produce a similar curve.
Step 2: Analyze the Semi-log Plot (Plot B).
Observe the semi-log plot. Note that the horizontal axis (Weeks) is linear (0, 1, 2, 3, 4, 5), while the vertical axis (Area) is logarithmic (the markings might be 1, 10, 100, 1000). Examine the pattern of the data points on this plot. The points lie very close to a single straight line.
Step 3: Apply the Linearity Test and Form a Conclusion.
The Essential Knowledge states that if a set of data has an exponential relationship, then a semi-log plot of the data will be approximately linear. Since the data points in Plot B form a clear linear pattern, we can conclude that an exponential model is appropriate for modeling the area of the algae over time.
Justification: An exponential model is appropriate for this dataset because the semi-log plot of the data is approximately linear.
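The visual judgment in this example can be backed up with a quick calculation on the table's data. The sketch below takes the base-10 logarithm of each area and measures how close the points (t, log area) are to a straight line. Using the correlation coefficient as a numerical stand-in for "looks linear" is an added assumption, not part of the example itself, and the helper names are hypothetical.

```python
import math

# Data from the table in this example
weeks = [0, 1, 2, 3, 4, 5]
area = [5.1, 9.8, 21.2, 43.5, 85.9, 175.1]

# Transform the vertical values, as a semi-log plot does graphically
log_area = [math.log10(a) for a in area]

def pearson_r(xs, ys):
    """Correlation coefficient; values near 1 or -1 indicate a near-linear pattern."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def fit_line(xs, ys):
    """Least-squares line through the points; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

r = pearson_r(weeks, log_area)
slope, intercept = fit_line(weeks, log_area)

# Undo the log to read the line as an exponential model: area = initial * factor**t
growth_factor = 10 ** slope
initial_area = 10 ** intercept

print(f"r for (t, log10 area): {r:.4f}")
print(f"estimated weekly growth factor: {growth_factor:.2f}")
print(f"estimated initial area: {initial_area:.2f}")
```

A correlation very close to 1 agrees with what Plot B shows, and the fitted line translates back to an exponential model in which the area roughly doubles each week.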
Step-by-Step Example 2: Using Residuals to Assess Model Fit
Problem: A financial analyst uses a regression model to predict a stock's price based on the number of days since a major company announcement. After fitting a model, the analyst calculates the residuals (Actual Price - Predicted Price) for each day. A plot of these residuals is shown below. Based on the residual plot, is the analyst's model a good fit for the data?
Residual Plot:
(A scatterplot is shown. The horizontal axis is "Days Since Announcement" from 0 to 20. The vertical axis is "Residual (dollars)" from -2 to 2. There are about 20 points scattered both above and below the horizontal line at y=0. There is no curve, U-shape, or any other discernible pattern in the points.)
Solution:
Step 1: Understand the Goal of Residual Analysis.
The purpose of a residual plot is to detect patterns in the errors of a model. If a model is a good fit, its errors (residuals) should be random. If there is a pattern, it implies the model is failing to capture some aspect of the underlying relationship in the data.
Step 2: Examine the Provided Residual Plot.
Observe the distribution of the points on the plot.
The points are scattered on both sides of the horizontal axis (the line $\text{residual} = 0$). Some residuals are positive (the model under-predicted) and some are negative (the model over-predicted).
The vertical spread of the points appears relatively constant across the range of the horizontal axis.
Most importantly, there is no obvious pattern. The points do not form a curve, a line, a U-shape, or a fan shape. They appear to be randomly distributed.
Step 3: Apply the Rule for Residuals and Form a Conclusion.
The Essential Knowledge states that a model for a set of data is a good fit if the residuals of the model are randomly scattered with respect to the horizontal axis. Because the provided residual plot shows no discernible pattern and the points are randomly scattered, we can conclude that the analyst's model is a good fit for the data.
Justification: The model is a good fit for the data because the corresponding residual plot shows the residuals are randomly scattered about the horizontal axis with no clear pattern.
Using Your Calculator
A graphing calculator can be used to create the plots needed to analyze data as described in this topic. While some advanced calculators may have a built-in semi-log plot option, the universal method involves transforming the data manually.
Task: Create a semi-log plot to determine if an exponential model is appropriate for a dataset.
Data:
x-values:
y-values:
Steps (TI-84 Style):
Enter Data:
Press STAT, then select 1:Edit....
In the list editor, clear any old data in L1 and L2.
Enter the x-values into L1.
Enter the y-values into L2.
View the Standard Scatterplot:
Press 2nd then Y= to access STAT PLOT.
Select 1:Plot1....
Turn the plot On. Select the first type (scatterplot).
Set Xlist:L1 and Ylist:L2.
Press ZOOM and select 9:ZoomStat. You will see a distinct upward curve.
Transform the Data for the Semi-log Plot:
Press STAT, then 1:Edit....
Use the arrow keys to move the cursor up to highlight the name L3 at the top of the third column.
Press LOG, then 2nd, then 2 (to get L2), and close the parenthesis. Your entry should look like log(L2).
Press ENTER. L3 will now be populated with the base-10 logarithm of each value from L2.
View the Semi-log Data Plot:
Press 2nd then Y= to access STAT PLOT.
Select 1:Plot1....
Keep Xlist:L1, but change Ylist to L3 (by pressing 2nd then 3).
Press ZOOM and select 9:ZoomStat.
Analyze the Result:
The new plot shows the relationship between $x$ and $\log(y)$. Observe the plot on your calculator screen. The points should now appear to form a straight line.
Conclusion: Because the plot of $(x, \log y)$ is linear, an exponential model is appropriate for the original data.
AP Exam Quick Hit
Common Question Types
Plot Interpretation: You will be shown a standard scatterplot and a semi-log plot of the same dataset and asked to determine if an exponential model is appropriate.
- Example: "Plots A and B show the same data. Is an exponential function a suitable model for the data? Justify your answer by referencing one or both of the plots."
Model Fitness from Residuals: You will be shown a residual plot for a given model and dataset and asked to comment on the appropriateness of the model.
- Example: "The residual plot for a linear model applied to a dataset is shown. Is the linear model a good fit for the data? Explain your reasoning."
Comparing Plots: You may be shown several different plots (e.g., scatterplot, semi-log plot, residual plot) related to one dataset and asked to draw conclusions using all available information.
- Example: "A researcher is deciding between a linear and an exponential model. Based on the provided scatterplot and semi-log plot, which model type is more appropriate?"
Common Mistakes
Confusing Semi-log and Standard Plots: Stating that an exponential model is appropriate because the standard scatterplot is linear. A linear scatterplot indicates a linear model, not an exponential one.
Incomplete Justification: A correct justification must link the specific feature of the plot to the conclusion. It is not enough to say, "The model is exponential because of the semi-log plot." You must state, "The model is exponential because the semi-log plot is linear."
Misinterpreting Residual Patterns: Mistaking random scatter in a residual plot for a "bad" fit. Randomness is the goal. Conversely, identifying a clear U-shape or other pattern in the residuals and incorrectly concluding that the model is a good fit.
Confusing Exponential and Power Models: A semi-log plot (log y-axis) is used to test for exponential ($y = ab^x$) models. A log-log plot (both axes logarithmic) is used for power ($y = ax^b$) models. Using the linearity of one to justify the other is a common error.
Over-reliance on Visuals of the Scatterplot: Concluding that data is definitively exponential just because it looks like a steep curve on a standard scatterplot. The semi-log plot is the required confirmatory test.
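The exponential-versus-power mistake listed above can be illustrated with a short calculation. The sketch below uses made-up data from a power function, y = 3x², and compares how linear the points are under a semi-log transformation (x, log y) versus a log-log transformation (log x, log y). The correlation coefficient is used here as a rough numerical proxy for "looks linear", which is an assumption of this sketch.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient; values near 1 or -1 indicate a near-linear pattern."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data from a power function y = 3 * x**2 (illustration only)
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [3 * x ** 2 for x in xs]

log_x = [math.log10(x) for x in xs]
log_y = [math.log10(y) for y in ys]

# Semi-log: only the vertical values are logged
semilog_r = pearson_r(xs, log_y)

# Log-log: both coordinates are logged
loglog_r = pearson_r(log_x, log_y)

print(f"semi-log linearity (r): {semilog_r:.4f}")
print(f"log-log linearity (r):  {loglog_r:.4f}")
```

For power-function data, the log-log points fall on a straight line while the semi-log points bend, so a curved semi-log plot paired with a straight log-log plot points toward a power model rather than an exponential one.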