## Quick Summary
This guide will enable you to master the center and spread of discrete random variables. You will learn to calculate the mean (expected value) and standard deviation for a given probability distribution and, crucially, to interpret these values in the context of a real-world scenario. This skill allows you to describe the long-run average outcome and the typical variability you can expect from a random process.
## Key Concepts
This section covers the two primary measures used to describe a discrete random variable's probability distribution: its center (mean) and its spread (standard deviation).
### 1. The Mean or Expected Value of a Random Variable
The mean of a discrete random variable, more formally called the expected value, is the long-run average value of the variable over an infinite number of repetitions of the random process. It's a weighted average of the possible outcomes, where each outcome is weighted by its probability.
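The long-run-average idea can be illustrated with a short simulation. This is a minimal sketch, assuming a hypothetical spinner that pays 0, 1, or 2 points with probabilities 0.5, 0.3, and 0.2. These values are invented for illustration and are not part of the examples in this guide. The spinner's weighted average is 0(0.5) + 1(0.3) + 2(0.2) = 0.7.

```python
import random

# Hypothetical distribution (assumed for illustration only).
VALUES = [0, 1, 2]
PROBS = [0.5, 0.3, 0.2]


def simulate_average(n_trials, seed=0):
    """Average of n_trials simulated outcomes drawn from the distribution."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    outcomes = rng.choices(VALUES, weights=PROBS, k=n_trials)
    return sum(outcomes) / n_trials


if __name__ == "__main__":
    # The average should drift toward 0.7 as the number of trials grows.
    for n in (10, 1_000, 100_000):
        print(n, simulate_average(n))
```

With more trials the simulated average should settle near 0.7, even though 0.7 is not itself a possible outcome of a single spin.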
- **Notation:** The mean of a random variable X is denoted by **μ_X** or **E(X)**.
- **Concept:** Imagine you play a game of chance thousands of times. Your average winnings per game would get very close to the game's expected value. It's the theoretical long-run average. The expected value does not have to be one of the possible outcomes of the variable.
- **Formula:** To calculate the mean of a discrete random variable X, you multiply each possible value (x_i) by its probability P(x_i) and then sum all these products.

**Formula for Mean (Expected Value):**

μ_X = E(X) = Σ [x_i * P(x_i)]
= x_1 * P(x_1) + x_2 * P(x_2) + ... + x_n * P(x_n)
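The weighted-sum formula translates directly into code. The sketch below is illustrative only: the function name `expected_value` and the fair six-sided die are assumptions, not part of this guide's examples. Exact fractions are used so that the probabilities sum to exactly 1 and no rounding error creeps in.

```python
from fractions import Fraction


def expected_value(values, probs):
    """Return the sum of x_i * P(x_i) over all outcomes."""
    # Exact check; this relies on probs being Fractions (or exact integers).
    if sum(probs) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in zip(values, probs))


# Hypothetical example: one roll of a fair six-sided die.
die_values = [1, 2, 3, 4, 5, 6]
die_probs = [Fraction(1, 6)] * 6

mean = expected_value(die_values, die_probs)
print(mean, float(mean))  # 7/2 3.5
```

The result, 3.5, is not a face of the die, which matches the point that the expected value need not be a possible outcome.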
**Example:** A local charity sells raffle tickets for $5 each. There is one grand prize of $1000, two second-place prizes of $100, and ten third-place prizes of $20. They will sell 500 tickets in total. Let X = the net winnings from buying one ticket.

First, we need the probability distribution for X.

- Net winnings for the grand prize: $1000 - $5 = $995. Probability = 1/500.
- Net winnings for second prize: $100 - $5 = $95. Probability = 2/500.
- Net winnings for third prize: $20 - $5 = $15. Probability = 10/500.
- Net winnings for losing: $0 - $5 = -$5. Probability = (500 - 1 - 2 - 10) / 500 = 487/500.

**Probability Distribution Table:**

| Net Winnings (x_i) | $995 | $95 | $15 | -$5 |
| :--- | :--- | :--- | :--- | :--- |
| Probability P(x_i) | 1/500 | 2/500 | 10/500 | 487/500 |

**Calculation of Expected Value:**

E(X) = ($995)(1/500) + ($95)(2/500) + ($15)(10/500) + (-$5)(487/500)
E(X) = (995/500) + (190/500) + (150/500) - (2435/500)
E(X) = (1335 - 2435) / 500 = -1100 / 500 = -$2.20

**Interpretation:** If you were to buy one ticket in this raffle many, many times, your long-run average net winnings would be a loss of $2.20 per ticket.

### 2. The Variance and Standard Deviation of a Random Variable

The **standard deviation** of a random variable measures the typical or average distance of the outcomes from the mean (expected value). A small standard deviation means outcomes tend to be close to the mean, while a large standard deviation means outcomes are more spread out.

To find the standard deviation, we first must calculate the **variance**.

- **Variance:** The variance is the average of the squared deviations of the outcomes from the mean.
- **Notation:** The variance of a random variable X is denoted by **σ_X^2** or **Var(X)**. The standard deviation is denoted by **σ_X** or **SD(X)**.
- **Formula:** To calculate the variance, for each outcome, you find its deviation from the mean (x_i - μ_X), square that deviation, multiply by its probability, and then sum all these values. The standard deviation is simply the square root of the variance.

**Formula for Variance:**

σ_X^2 = Var(X) = Σ [(x_i - μ_X)^2 * P(x_i)]
= (x_1 - μ_X)^2 * P(x_1) + (x_2 - μ_X)^2 * P(x_2) + ... + (x_n - μ_X)^2 * P(x_n)

**Formula for Standard Deviation:**

σ_X = √Var(X) = √Σ [(x_i - μ_X)^2 * P(x_i)]

- **Example (continued from above):** Let's calculate the variance and standard deviation of the net winnings (X) from the raffle ticket. We already found μ_X = -$2.20.

**Calculation of Variance:**

σ_X^2 = (995 - (-2.20))^2(1/500) + (95 - (-2.20))^2(2/500) + (15 - (-2.20))^2(10/500) + (-5 - (-2.20))^2(487/500)
σ_X^2 = (997.2)^2(1/500) + (97.2)^2(2/500) + (17.2)^2(10/500) + (-2.8)^2(487/500)
σ_X^2 = (994407.84)(1/500) + (9447.84)(2/500) + (295.84)(10/500) + (7.84)(487/500)
σ_X^2 = (994407.84 + 18895.68 + 2958.40 + 3818.08) / 500
σ_X^2 = 1020080 / 500 = 2040.16

**Calculation of Standard Deviation:**

σ_X = √2040.16 ≈ $45.17

**Interpretation:** On average, the net winnings for a single raffle ticket will typically vary from the mean (-$2.20) by about $45.17. This large standard deviation reflects the high variability in outcomes: most people lose $5, but a few win large prizes.

[Image: A probability histogram for the raffle example. The x-axis shows the four possible net winnings. The y-axis shows the probability. A vertical line is drawn at the mean (μ = -2.20), and a horizontal line segment representing one standard deviation (σ = 45.17) is shown centered at the mean.]

## Key Vocabulary

- **Random Variable**: A variable whose value is a numerical outcome of a random phenomenon. We use a capital letter, like X, to denote a random variable.
- **Probability Distribution**: A table, graph, or formula that gives all possible values of a discrete random variable and their corresponding probabilities. The sum of all probabilities must equal 1.
- **Mean (Expected Value)**: The theoretical long-run average value of a random variable, denoted μ_X or E(X). It is the weighted average of all possible outcomes.
- **Variance**: The expected value of the squared deviations from the mean, denoted σ_X^2. It measures the average squared distance of outcomes from the mean.
- **Standard Deviation**: The square root of the variance, denoted σ_X. It measures the typical or average distance of the outcomes of a random variable from its mean.

## Calculator Tech (TI-84)

Calculating the mean and standard deviation by hand is tedious and prone to error. Use your calculator's statistical functions to do the heavy lifting.

**Steps to Calculate μ_X and σ_X:**

1. **Enter Data:**
   - Press `STAT` -> `1: Edit...`.
   - Enter the possible values of the random variable (the x_i's) into list **L1**.
   - Enter the corresponding probabilities (the P(x_i)'s) into list **L2**.
   - *Example:* For the raffle problem, L1 would be {995, 95, 15, -5} and L2 would be {1/500, 2/500, 10/500, 487/500}.
2. **Calculate Statistics:**
   - Press `STAT`.
   - Arrow over to the `CALC` menu.
   - Select `1: 1-Var Stats`.
3. **Set Up the Command:**
   - A menu will appear.
   - Set `List:` to **L1** (or wherever you put the values).
   - Set `FreqList:` to **L2** (or wherever you put the probabilities). This is the critical step that tells the calculator to use the probabilities as weights.
   - Highlight `Calculate` and press `ENTER`.
4. **Read the Output:**
   - **x̄**: This is the mean (expected value), μ_X. The calculator uses sample notation, but for a probability distribution, this is the true mean.
   - **σx**: This is the population standard deviation, σ_X. This is the correct value to use.
   - **IGNORE Sx**: This is the sample standard deviation, which is not used for probability distributions where the probabilities define the entire population.
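As a cross-check on the calculator output, the same weighted formulas can be evaluated in a few lines of Python. This is a sketch rather than a replacement for showing work; the variable names are arbitrary, and the data are the raffle values and probabilities from the example above. Because the probabilities already sum to 1 and serve as the weights, the result corresponds to the population standard deviation.

```python
from fractions import Fraction
from math import sqrt

# Raffle example: net winnings (like L1) and their probabilities (like L2).
values = [995, 95, 15, -5]
probs = [Fraction(1, 500), Fraction(2, 500), Fraction(10, 500), Fraction(487, 500)]

# Mean: sum of x_i * P(x_i).
mean = sum(x * p for x, p in zip(values, probs))

# Variance: sum of (x_i - mean)^2 * P(x_i); the probabilities act as weights.
variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))

# Standard deviation: square root of the variance.
sd = sqrt(variance)

print(float(mean))      # -2.2
print(float(variance))  # 2040.16
print(round(sd, 2))     # 45.17
```

These match the hand calculation: μ_X = -$2.20, σ_X^2 = 2040.16, and σ_X ≈ $45.17.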
## How to Show Work on the FRQ
To earn full credit on a Free Response Question, you must show more than just the final answer from your calculator. You need to demonstrate your understanding of the underlying formulas.
**Template for Calculating and Interpreting the Mean (Expected Value):**

1. **Formula:** State the formula for expected value in general terms.
   - E(X) = μ_X = Σ[x_i * P(x_i)]
2. **Substitution:** Show the first few terms of the formula with values from the problem, then use "..." for the rest.
   - E(X) = (value₁)(prob₁) + (value₂)(prob₂) + ... + (valueₙ)(probₙ)
3. **Final Answer:** State the final answer, including units. You can get this directly from your calculator.
   - E(X) = [Final Answer with units]
4. **Interpretation (if asked):** Use the following script, filling in the context.
   - "If we were to [describe the random process] many, many times, the long-run average value of [the random variable X in context] would be approximately [mean value with units]."
**Template for Calculating and Interpreting the Standard Deviation:**

1. **Formula:** State the formula for standard deviation.
   - σ_X = √Σ[(x_i - μ_X)^2 * P(x_i)]
2. **Substitution:** Show the first term or two of the calculation inside the square root.
   - σ_X = √[(value₁ - μ_X)^2(prob₁) + (value₂ - μ_X)^2(prob₂) + ... + (valueₙ - μ_X)^2(probₙ)]
3. **Final Answer:** State the final answer, including units, from your calculator.
   - σ_X = [Final Answer with units]
4. **Interpretation (if asked):** Use the following script, filling in the context.
   - "On average, the value of [the random variable X in context] will typically differ from the mean of [mean value with units] by about [SD value with units]."
## Practice Problems

**Problem 1:**
A small auto repair shop has determined the number of major repairs it performs each day is a random variable X with the following probability distribution.
| Number of Repairs (x_i) | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| Probability P(x_i) | 0.2 | 0.4 | 0.3 | 0.1 |
(a) What is the expected number of major repairs the shop will perform on any given day? Show your work.
(b) Calculate and interpret the standard deviation of the number of major repairs.
Solution:
(a) Calculate the expected value.
First, I will define the random variable X = the number of major repairs on a given day.
Formula: E(X) = μ_X = Σ[x_i * P(x_i)]
Substitution: E(X) = (0)(0.2) + (1)(0.4) + (2)(0.3) + (3)(0.1)
Final Answer: E(X) = 0 + 0.4 + 0.6 + 0.3 = 1.3 repairs.
The expected number of major repairs per day is 1.3.
(b) Calculate and interpret the standard deviation.
First, I will calculate the variance using μ_X = 1.3.
Formula: σ_X = √Σ[(x_i - μ_X)^2 * P(x_i)]
Substitution: σ_X = √[(0 - 1.3)^2(0.2) + (1 - 1.3)^2(0.4) + (2 - 1.3)^2(0.3) + (3 - 1.3)^2(0.1)]
σ_X = √[(-1.3)^2(0.2) + (-0.3)^2(0.4) + (0.7)^2(0.3) + (1.7)^2(0.1)]
σ_X = √[(1.69)(0.2) + (0.09)(0.4) + (0.49)(0.3) + (2.89)(0.1)]
σ_X = √[0.338 + 0.036 + 0.147 + 0.289] = √0.81
Final Answer: σ_X = 0.9 repairs.
(Calculator check: L1={0,1,2,3}, L2={0.2,0.4,0.3,0.1}. 1-Var Stats L1, L2 gives x̄=1.3 and σx=0.9.)
Interpretation: On average, the number of major repairs on a given day will typically vary from the mean of 1.3 repairs by about 0.9 repairs.
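To see where each piece of the sum in part (b) comes from, the sketch below lists every outcome's contribution to the variance. It is an illustrative check only; the names `rows` and `term` are arbitrary, and results are rounded because decimal probabilities are stored as floating-point numbers.

```python
from math import sqrt

# Problem 1 distribution: number of major repairs per day.
values = [0, 1, 2, 3]
probs = [0.2, 0.4, 0.3, 0.1]

mean = sum(x * p for x, p in zip(values, probs))

# One row per outcome: (x, deviation from mean, squared deviation * probability).
rows = []
for x, p in zip(values, probs):
    deviation = x - mean
    term = deviation ** 2 * p
    rows.append((x, round(deviation, 4), round(term, 4)))

# The variance is the sum of the per-outcome terms.
variance = sum(term for _, _, term in rows)
sd = sqrt(variance)

for row in rows:
    print(row)
print(round(mean, 4), round(variance, 4), round(sd, 4))
```

The third entry of each row should line up with the terms 0.338, 0.036, 0.147, and 0.289 in the hand calculation, and the last line with μ_X = 1.3, σ_X^2 = 0.81, and σ_X = 0.9.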
**Problem 2:**
An insurance company sells a one-year term life insurance policy to a 30-year-old female. The company charges a premium of $200. Based on mortality tables, the probability that the female will survive the year is 0.9985. If she does not survive, the policy pays out $100,000 to her beneficiary. Let Y be the profit the insurance company makes from this single policy.

(a) Construct the probability distribution for the random variable Y.
(b) Calculate the expected profit for the insurance company. Based on this value, is this a profitable policy for the company in the long run?
(c) A new analyst suggests ignoring the standard deviation because the expected profit is positive. Explain why this would be a mistake by interpreting the standard deviation of Y in this context.

**Solution:**

(a) **Probability Distribution**
The random variable Y is the company's profit. There are two outcomes:

1. The female survives: The company's profit is the premium they keep. Y = +$200. The probability is 0.9985.
2. The female does not survive: The company's profit is the premium minus the payout. Y = $200 - $100,000 = -$99,800. The probability is 1 - 0.9985 = 0.0015.

| Profit (y_i) | $200 | -$99,800 |
| :--- | :--- | :--- |
| Probability P(y_i) | 0.9985 | 0.0015 |

(b) **Expected Profit**
**Formula:** E(Y) = μ_Y = Σ[y_i * P(y_i)]
**Substitution:** E(Y) = ($200)(0.9985) + (-$99,800)(0.0015)
**Final Answer:** E(Y) = $199.70 - $149.70 = $50.
**Conclusion:** Yes, this is a profitable policy for the company. If the company sells many, many of these policies, they can expect to make an average profit of $50 per policy.

(c) **Standard Deviation**
First, calculate the standard deviation. μ_Y = $50.
σ_Y = √[(200 - 50)^2(0.9985) + (-99800 - 50)^2(0.0015)]
σ_Y = √[(150)^2(0.9985) + (-99850)^2(0.0015)]
σ_Y = √[22466.25 + 14955033.75] = √14977500 ≈ $3,870.08

**Interpretation and Explanation:** The standard deviation of the profit is approximately $3,870. This value represents the typical deviation from the mean profit. Ignoring this would be a huge mistake because it quantifies the risk involved. While the company expects to make $50 per policy on average, the actual profit for any single policy will typically be about $3,870 away from that mean. This high standard deviation indicates massive variability and risk. The company needs to sell a very large number of policies to ensure that the few large losses (payouts) are balanced by the many small gains (premiums), allowing their actual average profit to approach the expected value. A small number of unexpected claims could lead to a catastrophic loss.

## Common Mistakes to Avoid

1. **Using Sx instead of σx on the Calculator:** When using `1-Var Stats` with a probability distribution (values in L1, probabilities in L2), the entire population of outcomes is defined. You must use the population standard deviation, σx. The sample standard deviation, Sx, will be incorrect.
2. **Incorrect Interpretation of Expected Value:** Do not say "the average is 1.3 repairs" or "I expect to get 1.3 repairs tomorrow." The correct interpretation must involve the idea of a long-run average over many repetitions. For example: "If we observe many days, the long-run average number of repairs is 1.3."
3. **Forgetting to Take the Square Root:** Students often calculate the variance (σ_X^2) perfectly but then forget the final step of taking the square root to find the standard deviation (σ_X). Always double-check if the question asks for variance or standard deviation.
4. **Arithmetic Errors with Negatives:** Be extremely careful when calculating deviations from the mean, especially when the mean or the values are negative. For example, in the raffle problem the deviation for a losing ticket is -5 - (-2.20). A common error is to compute -5 - 2.20 = -7.20, which is wrong. The correct calculation is -5 + 2.20 = -2.80. Using the calculator for the full calculation avoids these pitfalls.
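The sign pitfall is easy to demonstrate. In this sketch (variable names are arbitrary), the raffle's losing outcome x = -5 and mean μ_X = -2.20 are used to compare the correct squared deviation with the one produced by dropping the second negative sign.

```python
x = -5.0      # net winnings for a losing raffle ticket
mean = -2.20  # expected value found in the raffle example

# Correct: subtracting a negative mean adds its magnitude.
correct_deviation = x - mean      # -5 - (-2.20) = -2.80
correct_squared = correct_deviation ** 2

# Common slip: treating the mean as +2.20.
wrong_deviation = x - abs(mean)   # -5 - 2.20 = -7.20
wrong_squared = wrong_deviation ** 2

print(round(correct_squared, 2))  # 7.84
print(round(wrong_squared, 2))    # 51.84
```

Since this outcome has probability 487/500, the slip would inflate the largest-weight term of the variance by a factor of more than six.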