In this lesson we will begin to explore the concept of statistical inference. We will look at both discrete and continuous probability distributions. The concepts of standard error and the Central Limit Theorem will be introduced which will serve as the base for the remaining lessons in this course.
Lesson 5 Learning Objectives
Upon completion of this lesson, you will be able to:
Before we begin new content, we should review a few terms from previous lessons that we will see again in this lesson:
Discrete: Data that can only take on a set number of values
Continuous: Quantitative data that can take on any value between the minimum and maximum, and any value between two other values
Probability: The likelihood of an event occurring; \(P(A)=\frac{number \: of \:events \:considered\: outcome \:A}{number \:of\: total \:events}\)
\(P(A\;\cap\;B)\): Intersection of A and B; "probability of A and B"
\(P(A\;\cup\;B)\): Union of A and B; "probability of A or B" (this also includes the probability of A and B)
Mean: The numerical average; calculated as the sum of all of the data values divided by the number of values; represented as \(\overline{X}\).
Standard deviation: Roughly the average difference between individual data and the mean; for a sample, represented as s, \(s=\sqrt{\frac{\sum (x-\overline{x})^{2}}{n-1}}\)
Sample: A subset of the population from which data is actually collected
Population: The entire set of possible observations in which we are interested
Statistic: A measure concerning a sample (e.g., sample mean)
Parameter: A measure concerning a population (e.g., population mean)
Descriptive statistics: Methods for summarizing data (e.g., mean, median, mode, range, variance, graphs)
Inferential statistics: Methods for using sample data to make conclusions about a population
z score: Distance between an individual score and the mean in standard deviation units; also known as a standardized score.
Empirical Rule: For bell-shaped distributions, about 68% of the data will be within one standard deviation of the mean, about 95% will be within two standard deviations of the mean, and about 99.7% will be within three standard deviations of the mean
Random variable: a numerical characteristic that takes on different values due to chance
Coin Flips
The number of heads in four flips of a coin (a numerical property of each different sequence of flips) is a random variable because the results will vary between trials.
Heights
Samples of 100 students are repeatedly pulled from the population of all Penn State students and their heights are measured. The mean height of samples of 100 Penn State students is a random variable because the statistic will vary between samples. While most sample means will be similar to the population mean, they will not all equal the population mean due to random sampling variation.
Random variables are classified into two broad types: discrete and continuous. A discrete random variable has a countable set of distinct possible values. A continuous random variable is such that any value (to any number of decimal places) within some interval is a possible value.
Continuous Random Variables:
Note : In practice, we don't measure accurately enough to truly see all possible values of a continuous random variable. For instance, in reality somebody may have exercised 4.2341567 hours last week but they probably would round off to 4. Nevertheless, hours of exercise last week is inherently a continuous random variable.
Probability distribution: A table, graph, or formula that gives the probability of a given outcome's occurrence
For a discrete random variable, its probability distribution (also called the probability distribution function) is any table, graph, or formula that gives each possible value and the probability of that value.
Note: The total of all probabilities across the distribution must be 1, and each individual probability must be between 0 and 1, inclusive.
What if we flipped a fair coin four times? What are the possible outcomes and what is the probability of each?
Figure 1 below is a probability distribution for the number of heads in 4 flips of a coin. Given that P(Heads)=.50, the probability of not flipping heads at all is 1/16, or .0625. In 6.25% of all trials, we can expect that there will be no heads. This may be written as P(X=0)=.0625. Similarly, the probability of flipping heads once in four trials is 4/16, or .25. In 25% of all trials, we can expect that heads will be flipped exactly once. This may be written as P(X=1)=.25.
This probability distribution could be constructed by listing all 16 possible sequences of heads and tails for four flips (i.e., HHHH, HTHH, HTTH, HTTT, etc.), and then counting how many sequences there are for each possible number of heads.
Heads | 0 | 1 | 2 | 3 | 4 |
---|---|---|---|---|---|
Probability | 1/16 | 4/16 | 6/16 | 4/16 | 1/16 |
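The listing-and-counting approach described above can be sketched in a few lines of Python. This is only an illustration: the variable names (`sequences`, `counts`, `distribution`) are our own choices and are not part of the lesson.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2**4 = 16 equally likely sequences of heads (H) and tails (T).
sequences = list(product("HT", repeat=4))

# Count how many sequences contain each possible number of heads.
counts = Counter(seq.count("H") for seq in sequences)

# For a fair coin, every sequence has probability 1/16.
distribution = {
    heads: Fraction(n, len(sequences)) for heads, n in sorted(counts.items())
}

for heads, prob in distribution.items():
    print(heads, prob, float(prob))
```

Note that `Fraction` reduces its results, so 4/16 prints as 1/4 and 6/16 as 3/8; the values are the same as in the table.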
A census was conducted at a university. All students were asked how many tattoos they had.
Figure 2 presents a probability distribution for the discrete variable of number of tattoos for each student. From this table we can find that 85% of students in the population do not have a tattoo, 12% of students in the population have one tattoo, 1.5% of students in the population have two tattoos, and so on. This could be written as P(X=0)=.85, P(X=1)=.12, P(X=2)=.015, etc.
Tattoos | 0 | 1 | 2 | 3 | 4 |
---|---|---|---|---|---|
Probability | .850 | .120 | .015 | .010 | .005 |
Cumulative probability: Likelihood of an outcome less than or equal to a given value occurring
To find a cumulative probability we add the probabilities for all values qualifying as "less than or equal" to the specified value.
Suppose we want to know the probability that the number of heads in four flips is less than two. If we let X represent number of heads we get on four flips of a coin, then:
Because this is a discrete distribution, the probability of flipping less than two heads is equal to flipping one or zero heads:
\(P(X<2)=P(X=0\;\cup\;X=1)\)
Flipping 1 head and flipping 0 heads are mutually exclusive events. Thus, \(P(X=0\;\cup\;X=1)=P(X=0)+P(X=1)\)
We can use the values from Figure 1 above to solve this equation.
\(P(X=0)+P(X=1)=(1/16)+(4/16)=5/16 \)
Cumulative distribution: A listing of all possible values along with the probability of that value and all lower values occurring (i.e., the cumulative probability)
Cumulative probabilities are found by adding the probability up to each column of the table. In Figure 3 we find the cumulative probability for one head by adding the probabilities for zero and one. The cumulative probability for two heads is found by adding the probabilities for zero, one, and two. We continue with this procedure until we reach the maximum number of heads, in this case four, which should have a cumulative probability of 1.00 because 100% of trials must have four or fewer heads.
Heads | 0 | 1 | 2 | 3 | 4 |
---|---|---|---|---|---|
Probability | 1/16 | 4/16 | 6/16 | 4/16 | 1/16 |
Cumulative Probability | 1/16 | 5/16 | 11/16 | 15/16 | 1 |
Let's construct a cumulative distribution for the data concerning number of tattoos.
Tattoos | 0 | 1 | 2 | 3 | 4 |
---|---|---|---|---|---|
Probability | .850 | .120 | .015 | .010 | .005 |
Cumulative Probability | .850 | .970 | .985 | .995 | 1 |
Note that the cumulative probability for the last column is always 1. That is, 100% of trials will be less than or equal to the maximum value.
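A cumulative distribution is just a running total of the probabilities. As a minimal sketch, using the tattoo probabilities from the table above (the list names are our own), Python's `itertools.accumulate` produces the cumulative row:

```python
from itertools import accumulate

tattoos = [0, 1, 2, 3, 4]
probabilities = [0.850, 0.120, 0.015, 0.010, 0.005]

# Running total of the probabilities; rounding hides tiny floating-point error.
cumulative = [round(c, 3) for c in accumulate(probabilities)]

for x, p, c in zip(tattoos, probabilities, cumulative):
    print(x, p, c)
```

The last entry of `cumulative` should always be 1, which is a quick check that the probabilities were entered correctly.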
Law of Large Numbers: Given a large number of repeated trials, the average of the results will be approximately equal to the expected value
Expected value: The mean value in the long run for many repeated samples, symbolized as \(E(X)\)
Expected Value for a Discrete Random Variable
\[E(X)=\sum x_i p_i\]
\(x_i\) = value of the \(i^{th}\) outcome
\(p_i\) = probability of the \(i^{th}\) outcome
According to this formula, we take each possible X value and multiply it by its respective probability. We then add these products to reach our expected value. You may have seen this before referred to as a weighted average. It is known as a weighted average because it takes into account the probability of each outcome and weights it accordingly. This is in contrast to an unweighted average, which would not take into account the probability of each outcome and would weight each possibility equally.
Let's look at a few examples of expected values for a discrete random variable:
A fair six-sided die is tossed. You win \$2 if the result is a “1,” you win \$1 if the result is a “6,” but otherwise you lose \$1.
X | +\$2 | +\$1 | -\$1 |
---|---|---|---|
Probability | 1/6 | 1/6 | 4/6 |
\( E(X)= \$2(\frac {1}{6})+\$1 (\frac {1}{6})+(-\$1)(\frac {4}{6})=\$\frac{-1}{6}= -\$ 0.17 \)
The interpretation is that if you play many times, the average outcome is losing 17 cents per play. Thus, over time you should expect to lose money.
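The long-run interpretation can be checked with a small simulation. The sketch below is illustrative only: the function name `play_once`, the seed, and the number of plays are arbitrary choices of ours, not part of the lesson.

```python
import random

rng = random.Random(0)  # fixed seed (arbitrary) so the run is reproducible


def play_once(rng):
    """Roll a fair die once and return the winnings in dollars."""
    roll = rng.randint(1, 6)
    if roll == 1:
        return 2
    if roll == 6:
        return 1
    return -1


n_plays = 100_000
average = sum(play_once(rng) for _ in range(n_plays)) / n_plays
print(average)  # should land close to the expected value of -1/6
```

With more plays the average tends to settle closer to \(-1/6\), which is the Law of Large Numbers at work.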
Using the probability distribution for number of tattoos, let's find the mean number of tattoos per student.
Tattoos | 0 | 1 | 2 | 3 | 4 |
---|---|---|---|---|---|
Probability | .850 | .120 | .015 | .010 | .005 |
\( E(X)=0 (.85)+1(.12)+ 2(.015) +3 (.010) +4(.005) =.20 \)
The mean number of tattoos per student is .20.
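Both expected values above follow the same recipe, so they can be computed with one small helper. This is a sketch; `expected_value` is a name we made up for illustration.

```python
from fractions import Fraction


def expected_value(values, probabilities):
    """Weighted average: sum of each value times its probability."""
    return sum(x * p for x, p in zip(values, probabilities))


# Die game: win $2, win $1, or lose $1.
die_game = expected_value(
    [2, 1, -1], [Fraction(1, 6), Fraction(1, 6), Fraction(4, 6)]
)

# Number of tattoos per student.
tattoo_mean = expected_value(
    [0, 1, 2, 3, 4], [0.850, 0.120, 0.015, 0.010, 0.005]
)

print(die_game, float(die_game))
print(round(tattoo_mean, 2))
```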
Recall from Lesson 3, in a sample, the mean is symbolized by \(\overline{x}\) and the standard deviation by \(s\). Because the probabilities that we are working with here are computed using the population, they are symbolized using lower case Greek letters. The population mean is symbolized by \(\mu\) (lower case "mu") and the population standard deviation by \(\sigma \) (lower case "sigma").
 | Sample Statistic | Population Parameter |
---|---|---|
Mean | \(\overline{x}\) | \(\mu\) |
Variance | \(s^{2}\) | \(\sigma ^{2}\) |
Standard Deviation | \(s\) | \(\sigma \) |
Also recall that the standard deviation is equal to the square root of the variance. Thus, \(\sigma=\sqrt{(\sigma ^{2})}\)
Knowing the expected value is not the only important characteristic one may want to know about a set of discrete numbers: one may also need to know the spread, or variability, of these data. For instance, you may "expect" to win \$20 when playing a particular game (which appears good!), but the spread for this might be from losing \$20 to winning \$60. Knowing such information can influence your decision on whether to play.
To calculate the standard deviation we first must calculate the variance. From the variance, we take the square root and this provides us the standard deviation. Conceptually, the variance of a discrete random variable is the sum of the squared differences between each value and the mean, each multiplied by the probability of obtaining that value, as seen in the conceptual formulas below:
Conceptual Formulas
Variance for a Discrete Random Variable
\( \sigma ^2= \sum [(x_i-\mu)^2 p_i] \)
Standard Deviation for a Discrete Random Variable
\( \sigma = \sqrt {\sum [(x_i-\mu)^2 p_i]}\)
\(x_i\) = value of the \(i^{th}\) outcome
\(\mu= E(X)=\sum x_i p_i\)
\(p_i\) = probability of the \(i^{th}\) outcome
In these expressions we substitute our result for E(X) into \( \mu\) because \( \mu\) is the symbol used to represent the mean of a population .
However, there is an easier computational formula. The computational formula will give you the same result as the conceptual formula above, but the calculations are simpler.
Computational Formulas
Variance for a Discrete Random Variable
\( \sigma ^2= [\sum (x_i^2 p_i )]-\mu ^2\)
Standard Deviation for a Discrete Random Variable
\( \sigma = \sqrt {[\sum (x_i^2 p_i)] -\mu ^2}\)
\(x_i\) = value of the \(i^{th}\) outcome
\(\mu= E(X)=\sum x_i p_i\)
\(p_i\) = probability of the \(i^{th}\) outcome
Notice in the summation part of this equation that we only square each observed X value and not the respective probability. Also note that the \(\mu\) is outside of the summation.
Going back to the first example used above for expectation involving the dice game, we would calculate the standard deviation for this discrete distribution by first calculating the variance:
X | +\$2 | +\$1 | -\$1 |
---|---|---|---|
Probability | 1/6 | 1/6 | 4/6 |
\( \sigma ^2= [\sum x_i^2 p_i ]-\mu ^2 = [2^2 (\frac{1}{6})+1^2 (\frac{1}{6})+(-1)^2 (\frac{4}{6})]-(- \frac{1}{6})^2\)
\(=[ \frac{4}{6}+\frac {1}{6}+ \frac{4}{6}]-\frac{1}{36} = \frac{53}{36}=1.472 \)
The variance of this discrete random variable is 1.472.
\(\sigma=\sqrt{(\sigma ^{2})}\)
\(\sigma=\sqrt{1.472}=1.213\)
The standard deviation of this discrete random variable is 1.213.
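To see that the conceptual and computational formulas agree, the die game can be worked both ways. The following is a sketch with variable names of our own choosing; exact fractions are used so the two variances can be compared without rounding.

```python
from fractions import Fraction
from math import sqrt

values = [2, 1, -1]
probs = [Fraction(1, 6), Fraction(1, 6), Fraction(4, 6)]

# Mean (expected value).
mu = sum(x * p for x, p in zip(values, probs))

# Conceptual formula: sum of squared deviations times probabilities.
var_conceptual = sum((x - mu) ** 2 * p for x, p in zip(values, probs))

# Computational formula: sum of x^2 * p, minus mu^2 (mu is outside the sum).
var_computational = sum(x ** 2 * p for x, p in zip(values, probs)) - mu ** 2

sigma = sqrt(float(var_computational))

print(var_conceptual, var_computational, round(sigma, 3))
```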
Binomial random variable: A specific type of discrete random variable that counts how often a particular event occurs in a fixed number of tries or trials
For a variable to be a binomial random variable, ALL of the following conditions must be met:
Notation
n = number of trials
p = probability event of interest occurs on any one trial
Number of correct guesses at 30 true-false questions when you randomly guess all answers
There are 30 trials, therefore n = 30
There are two possible outcomes for each guess (correct or incorrect) that are equally probable when guessing randomly, therefore p = 1/2 = .5
The conditions for being a binomial variable lead to a somewhat complicated formula for finding the probability any specific value occurs (such as the probability you get 20 right when you guess at 30 True-False questions).
We'll use Minitab Express to find probabilities for binomial random variables. However, for those of you who are curious, the by hand formula for the probability of getting a specific outcome in a binomial experiment is:
Binomial Random Variable Probability
\[P(x)= \frac {n!}{x!(n-x)!} p^x (1-p)^{n-x}\]
n = number of trials
x = number of successes
p = probability event of interest occurs on any one trial
! is the symbol for factorial. For a review of factorials, see the course algebra review page.
One can use the formula to find the probability or alternatively, use Minitab Express to find the probability. In the homework, you may use the method that you are more comfortable with unless specified otherwise.
In the following Minitab Express example we will find P(x) for n = 20, x =3, and p = 0.4
To calculate binomial random variable probabilities in Minitab:
Minitab output:
Probability Density Function
Binomial with n = 20 and p = 0.4
x | P(X = x) |
---|---|
3.00 | 0.0123 |
To calculate binomial random variable probabilities in Minitab Express:
The result should be the following output:
In the following example, we illustrate how to use the formula to compute binomial probabilities by hand. If you don't like to use the formula, you can also use Minitab Express to find the probabilities.
Red Flowers
Cross-fertilizing a red and a white flower produces red flowers 25% of the time. Now we cross-fertilize five pairs of red and white flowers and produce five offspring. Find the probability that there will be no red flowered plants in the five offspring.
X = # of red flowered plants in the five offspring.
The number of red flowered plants has a binomial distribution with n = 5, p = .25
\(P(X=0)=\frac{5!}{0!(5-0)!} .25 ^0 (1- .25)^5 =1 \times .25^0 \times .75^5 =.237\)
There is a 23.7% chance that none of the five plants will be red flowered.
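The by-hand formula translates directly into code. The sketch below is one possible way to write it in Python; `binomial_pmf` is a helper name we chose for illustration, and `math.comb` supplies the \(\frac{n!}{x!(n-x)!}\) term.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with n trials and probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


# Red flowers: no red-flowered plants among five offspring.
p_no_red = binomial_pmf(0, 5, 0.25)

# The earlier software example: n = 20, x = 3, p = 0.4.
p_three_of_twenty = binomial_pmf(3, 20, 0.4)

print(round(p_no_red, 3))
print(round(p_three_of_twenty, 4))
```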
Cumulative probability: Likelihood that a certain number of successes or fewer will occur.
Binomial random variable probabilities are mutually exclusive, therefore we can use the addition rule that we learned in Lesson 4.
Continuing with the red flowers example, what if we wanted to know the probability that there would be one or fewer red flowered plants?
\begin{align}
P(X\ is\ 1\ or\ less)&=P(X=0)+P(X=1)\\
&= \frac{5!}{0!(5-0)!} .25^0 (1-.25)^5+\frac{5!}{1!(5-1)!} .25^1 (1-.25)^4\\
& = .237 +.396=.633
\end{align}
There is a 63.3% chance that one or fewer of the five plants will be red flowered.
In the red flowers example, we first computed P(X = x) and then P(X ≤ x). This latter expression is called finding a cumulative probability because you are finding the probability that has accumulated from the minimum to some point, i.e., from 0 to 1 in this example.
To use Minitab Express to solve a cumulative probability binomial problem, return to Statistics > Probability Distributions > CDF/PDF > Cumulative Distribution Function (CDF). For Value enter 1. For distribution select the binomial. There are 5 trials and the event probability is .25.
To use Minitab to solve a cumulative probability binomial problem, return to Calc > Probability Distributions > Binomial as shown above. Now however, select the radio button for Cumulative Probability. For Number of Trials enter 5 and the event probability is .25. Click the radio button for Input Constant and enter the x value of 1.
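For readers working outside Minitab, the same cumulative probability can be obtained by summing the individual binomial probabilities. This is a minimal sketch; `binomial_pmf` and `binomial_cdf` are hypothetical helper names, not functions from any statistics package.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for n trials with event probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def binomial_cdf(x, n, p):
    """P(X <= x): add the probabilities for 0, 1, ..., x successes."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))


# Red flowers: one or fewer red-flowered plants among five offspring.
p_one_or_fewer = binomial_cdf(1, 5, 0.25)
print(round(p_one_or_fewer, 4))
```

Because the sum uses unrounded terms, the result can differ in the last digit from a hand calculation that adds already-rounded probabilities.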
The formula given earlier for discrete random variables could be used, but the good news is that for binomial random variables a shortcut formula for expected value (the mean) and standard deviation can also be used.
Binomial Random Variable Formulas
\[\mu=np\]
\[\sigma=\sqrt {np(1-p)}\]
n = number of trials
p = probability event of interest occurs on any one trial
After you use this formula a couple of times, you'll realize this formula matches your intuition. For instance, the “expected” number of correct (random) guesses at 30 True-False questions is np = (30)(.5) = 15 (half of the questions). For a fair six-sided die rolled 60 times, the expected value of the number of times a “1” is tossed is np = (60)(1/6) = 10.
The standard deviations for these would be, for the True-False test, \(\sigma=\sqrt{30 (0.5) (1-0.5)}=\sqrt{7.5}=2.74\), and for the die, \(\sigma=\sqrt{60 \left( \frac{1}{6}\right) \left(1-\frac {1}{6}\right)}=\sqrt{ \frac{50}{6}}=2.89\).
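The two shortcut formulas are easy to wrap in a helper. In this sketch the function name `binomial_mean_sd` is our own invention; it simply returns \(np\) and \(\sqrt{np(1-p)}\).

```python
from math import sqrt


def binomial_mean_sd(n, p):
    """Return (mean, standard deviation) of a binomial random variable."""
    return n * p, sqrt(n * p * (1 - p))


tf_mean, tf_sd = binomial_mean_sd(30, 0.5)      # 30 true-false guesses
die_mean, die_sd = binomial_mean_sd(60, 1 / 6)  # number of 1s in 60 rolls

print(round(tf_mean, 2), round(tf_sd, 2))
print(round(die_mean, 2), round(die_sd, 2))
```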
Roulette
A roulette wheel has 38 slots: 18 are red, 18 are black, and 2 are green. You play five games and always bet on red.
How many games can you expect to win?
Recall, you play five games and always bet on red. \(n=5\) and \(p=\frac{red \;slots}{total \;slots}=\frac{18}{38}\)
\(\mu=np=5 \left( \frac{18}{38}\right)=2.3684\)
\( \sigma=\sqrt{np(1-p)}=\sqrt{5\left(\frac{18}{38} \right) \left(1-\frac{18}{38}\right)}=1.1165\)
Out of 5 games, you can expect to win 2.3684 (with a standard deviation of 1.1165).
What is the probability that you will win all five games?
\(P(x)= \frac {n!}{x!(n-x)!} p^x (1-p)^{n-x}\)
\(P(X=5)= \frac {5!}{5!(5-5)!}\left( \frac{18}{38} \right)^5 \left(1-\frac{18}{38}\right)^{5-5}\)
\(P(X=5)=\frac{5!}{5!0!} \left(.4737^{5}\right) .5263^{0} = 1(.0238)(1)=.0238\)
There is a 2.38% chance that you will win all five out of five games.
If you win three or more games, you make a profit. If you win two or fewer games, you lose money. What is the probability that you will win no more than two games?
\(P(X\leq 2)=P(X=0)+P(X=1)+P(X=2)\)
\(P(X=0)=\frac {5!}{0!(5-0)!} \left ( \frac{18}{38} \right )^0\left(1-\frac{18}{38}\right)^{5-0}=.0404\)
\(P(X=1)=\frac {5!}{1!(5-1)!} \left ( \frac{18}{38} \right )^1\left(1-\frac{18}{38}\right)^{5-1}=.1817\)
\(P(X=2)=\frac {5!}{2!(5-2)!} \left ( \frac{18}{38} \right )^2\left(1-\frac{18}{38}\right)^{5-2}=.3271\)
\(P(X\leq 2)=.0404+.1817+.3271\approx .5493\) (computed from the unrounded probabilities; the rounded terms shown sum to .5492)
There is a 54.93% chance that you will win no more than two games. In other words, there is a 54.93% chance that you will lose money.
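The whole roulette example can be reproduced in one short script. This is a sketch under the lesson's stated setup (five games, always betting on red); the variable and function names are our own.

```python
from math import comb, sqrt

n = 5        # games played
p = 18 / 38  # probability that red comes up on one spin


def binomial_pmf(x, n, p):
    """P(X = x) for n trials with event probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


mean = n * p
sd = sqrt(n * p * (1 - p))
p_win_all = binomial_pmf(5, n, p)
p_lose_money = sum(binomial_pmf(x, n, p) for x in range(3))  # X = 0, 1, 2

print(round(mean, 4), round(sd, 4))
print(round(p_win_all, 4), round(p_lose_money, 4))
```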
We just discussed discrete random variables, and now we consider continuous random variables. Recall, a continuous random variable is such that all values (to any number of decimal places) within some interval are possible outcomes. A continuous random variable has an infinite number of possible values so we can't assign probabilities to each specific value. If we did, the total probability would be infinite, rather than 1, as it is supposed to be.
To describe probabilities for a continuous random variable, we use a probability density function.
Probability density function (PDF): A curve such that the area under the curve within any interval of values along the horizontal gives the probability for that interval
The most commonly encountered type of continuous random variable is a normal random variable , which has a symmetric bell-shaped density function. The center point of the distribution is the mean value, denoted by \(\mu\) ("mu"). The spread of the distribution is determined by the variance, denoted by \(\sigma ^{2}\) ("sigma squared") or by the square root of the variance called standard deviation, denoted by \(\sigma\) ("sigma").
The distribution of IQ scores is normal with a mean of 100 and standard deviation of 15.
In other words, \(\mu=100\) and \(\sigma=15\). The probability density function is shown below.
Notice that the horizontal axis shows IQ score and the bell is centered at the mean of 100.
While we cannot determine the probability for any one given value because the distribution is continuous, we can determine the probability for a given interval of values. The probability for an interval is equal to the area under the density curve. The total area under the curve is 1.00, or 100%. In other words, 100% of observations fall under the curve.
The next figure shows the probability that the IQ of a randomly selected individual will be between 115 and 130. This probability is equal to the shaded area under the curve between 115 and 130.
Soon we will learn how to use the normal distribution (i.e., z distribution) to determine what proportion of the curve is shaded.
The Empirical Rule can be used to estimate the proportion of observations that should fall within the intervals of one, two, and three standard deviations of the mean:
68% of observations: \(\mu\pm 1(\sigma)\)
95% of observations: \(\mu\pm 2(\sigma)\)
99.7% of observations: \(\mu\pm 3(\sigma)\)
Middle 95%
Given that for the distribution of IQ scores, \(\mu=100\) and \(\sigma=15\), let's apply the Empirical Rule to determine between which two scores the middle 95% of individuals fall.
Middle 95%: \(100\pm2(15)=[70,130]\)
The middle 95% of IQ scores fall between 70 and 130.
Middle 99.7%
The Empirical Rule also states that about 99.7% (nearly all) of a bell-shaped dataset will be in the interval \(mean\pm 3(standard\;deviation)\).
\(100\pm 3(15)= [55, 145]\)
99.7% of IQ scores are between 55 and 145. Notice that this interval roughly gives the complete range of the density curve shown above.
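The Empirical Rule percentages are approximations, and they can be compared against the exact normal areas. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the dictionary name `intervals` is our own.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

# For k = 1, 2, 3 standard deviations: (lower bound, upper bound, area between).
intervals = {}
for k in (1, 2, 3):
    low = iq.mean - k * iq.stdev
    high = iq.mean + k * iq.stdev
    intervals[k] = (low, high, iq.cdf(high) - iq.cdf(low))

for k, (low, high, area) in intervals.items():
    print(k, low, high, round(area, 4))
```

The areas come out close to, but not exactly, 68%, 95%, and 99.7%, which is why the rule is stated with "about."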
Here we will walk through a few examples of using Minitab Express to find various probabilities. We will use the following scenario: Suppose vehicle speeds at a highway location have a normal distribution with a mean of 65 mph and a standard deviation of 5 mph.
Remember that the cumulative probability for a value is the probability less than or equal to that value.
Question: What is the probability that a randomly selected vehicle will be going 73 mph or slower?
Here is Minitab output showing that the probability = .9452 that the speed of a randomly selected vehicle is less than or equal to 73 mph.
We can find this probability using either Minitab Express or Minitab:
To calculate normal random variable probabilities in Minitab:
To calculate normal random variable probabilities in Minitab Express:
The result should be the following output:
Here is a figure that illustrates the cumulative probability we found using this procedure:
Sometimes we want to know the probability that a variable has a value greater than some value. For instance, we might want to know the probability that a randomly selected vehicle speed is greater than 73 mph, written \(P(X > 73)\).
Previously we found \(P(Speed\leq 73)=.9452\). The general rule for a "greater than" situation is \(P(greater\;than\;a\;value)=1-P(less\;than\;or\;equal\;to\;the\;value)\). Thus, \(P(Speed>73)=1-.9452=.0548\). The probability that a randomly selected vehicle will be going faster than 73 mph is .0548, or 5.48%.
Question: What is the probability that a randomly selected vehicle will be going more than 60 mph?
Using Minitab we can find that the probability is .1587 that a speed is less than or equal to 60 mph. Thus \(P(Speed>60mph)=1-.1587 = .8413\).
The relevant Minitab output and a figure showing the cumulative probability for 60 mph follows:
Suppose we want to know the probability a normal random variable is within a specified interval. For instance, suppose we want to know the probability a randomly selected speed is between 60 and 73 mph. The simplest approach is to subtract the cumulative probability for 60 mph from the cumulative probability for 73. In other words, \(P(60<Speed<73)=P(Speed<73)-P(Speed<60)=.9452-.1587=.7865\)
This can also be written as P(60 < X < 73) = 0.7865, where X is speed.
The general rule for an "in between" probability is P( between a and b ) = cumulative probability for value b − cumulative probability for value a
This may also be written as \(P(a<X<b)=P(X<b)-P(X<a)\).
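The three kinds of normal probability (less than, greater than, and in between) can also be computed without Minitab. The sketch below assumes Python 3.8 or later and uses `statistics.NormalDist` from the standard library with the vehicle speed scenario; the variable names are our own.

```python
from statistics import NormalDist

speeds = NormalDist(mu=65, sigma=5)

# "Less than or equal to": the cumulative probability.
p_at_most_73 = speeds.cdf(73)

# "Greater than": one minus the cumulative probability.
p_above_73 = 1 - speeds.cdf(73)
p_above_60 = 1 - speeds.cdf(60)

# "In between": difference of two cumulative probabilities.
p_between_60_and_73 = speeds.cdf(73) - speeds.cdf(60)

for label, value in [
    ("P(X <= 73)", p_at_most_73),
    ("P(X > 73)", p_above_73),
    ("P(X > 60)", p_above_60),
    ("P(60 < X < 73)", p_between_60_and_73),
]:
    print(label, round(value, 4))
```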
We may wish to know the value of a variable that is a specified percentile of the values.
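As an illustration outside Minitab, a percentile is the inverse of a cumulative probability. The sketch below uses `statistics.NormalDist.inv_cdf` from the Python standard library; the choice of the 90th percentile and the reuse of the vehicle speed scenario (mean 65 mph, standard deviation 5 mph) are assumptions made only for this example.

```python
from statistics import NormalDist

speeds = NormalDist(mu=65, sigma=5)

# Speed below which 90% of vehicles fall (hypothetical choice of percentile).
p90 = speeds.inv_cdf(0.90)
print(round(p90, 2))

# Check: the cumulative probability at that speed is the percentile we asked for.
print(round(speeds.cdf(p90), 4))
```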
To calculate percentiles in Minitab:
To calculate percentiles in Minitab Express:
The result should be the following output:
Note:
Recall from Lesson 3 the formula for computing the z-score for an individual: \(z=\frac{x-\overline x}{s}\). That formula used sample statistics. This formula can also be written using population parameters: \(z=\frac{x-\mu}{\sigma}\)
Use Table A in Appendix A of your textbook or see a copy at Standard Normal Table
Table A in the textbook gives normal curve cumulative probabilities for standardized scores. This is also known as a z table.
Row labels of Table A give possible z-scores up to one decimal place. The column labels give the second decimal place of the z-score.
The cumulative probability for a value equals the cumulative probability for that value's z-score.
Vehicle speeds at a highway location have a normal distribution with a mean of 65 mph and a standard deviation of 5 mph.
What is the probability that a randomly selected car is going 73 mph or less?
It's often helpful to begin by sketching a normal distribution and shading in the appropriate region. From the graph below we can see that more than half of the curve is shaded in; this means that our final result should be greater than .50.
Let’s use the z table to determine the proportion of the curve under 73 mph.
First, we need to compute the z score for this speed: \(z=\frac{73-65}{5}=1.60\)
Now we can use the z table to determine the proportion of the curve that is less than a z score of 1.6 by looking up 1.60. We look in the 1.6 row and the .00 column (1.6 plus .00 equals 1.60). The cumulative probability for z=1.60 is .9452, the same value that we got previously when using Minitab Express. There is a 94.52% chance of randomly selecting a vehicle that is going 73 mph or less.
What is the probability that a car is going 60 mph or less?
For speed = 60 the z-score is: \(z=\frac{60-65}{5}=-1.00\)
We look up -1.00 on the z table below and find a cumulative probability of .1587. There is a 15.87% chance of randomly selecting a vehicle that is going 60 mph or less.
Table A.1 gives this information:
Suppose pulse rates of adult females have a normal curve distribution with a mean of 75 and a standard deviation of 8. What is the probability that a randomly selected female has a pulse rate greater than 85? Be careful! Notice we want a "greater than" and the interval we want is entirely above average, so we know the answer must be less than .50.
If we use Table A.1, the first step is to calculate the z-score associated with a pulse rate of 85: \(z=\frac{85-75}{8}=1.25\).
Given that z=1.25, we can use the z-table to determine the cumulative probability:
The cumulative probability for z = 1.25 is .8944. This is the proportion below a pulse rate of 85, but we want to know the proportion above a pulse rate of 85.
\(P(X>85) = 1 - P(X<85) = 1 - .8944 = .1056\)
The probability that a randomly selected female will have a pulse rate above 85 is .1056.
We know that for IQ scores \(\mu=100\) and \(\sigma=15\). What proportion of IQ scores fall between 100 and 130?
First we must compute the z score associated with each of these IQ scores.
For an IQ of 100, \(z=\frac{100-100}{15}=0\)
For an IQ of 130, \(z=\frac{130-100}{15}=2.00\)
We are looking for the proportion of observations that fall between a z score of 0 and a z score of 2.00.
Using the z table above, \(P(z<0.00)=.5000\) and \(P(z<2.00)=.9772\)
\(P(0.00<z<2.00)=P(z<2.00)-P(z<0.00)=.9772-.5000=.4772\)
The proportion of IQ scores between 100 and 130 is .4772, or 47.72%.
The following table reviews the procedures that you have just learned for determining various probabilities given observations using a z table.
Type of Question | Steps |
---|---|
Probability less than X | 1. Compute a z score for observation X 2. Look up the cumulative probability on the z table |
Probability greater than X | 1. Compute a z score for observation X 2. Look up the cumulative probability on the z table 3. Subtract the cumulative probability from 1 |
Probability between X and Y | 1. Compute the z scores for both observation X and Y 2. Look up the cumulative probabilities for both z scores 3. Subtract the cumulative probability for X from the cumulative probability for Y |
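The three procedures in the table can be mirrored in code, with the standard normal cumulative distribution standing in for the z table lookup. This is a sketch only: the function names are hypothetical, and `statistics.NormalDist()` (Python 3.8 or later) with no arguments gives the standard normal distribution.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_score(x, mu, sigma):
    """Distance between x and the mean in standard deviation units."""
    return (x - mu) / sigma


def prob_less(x, mu, sigma):
    """Steps 1-2: compute z, then look up the cumulative probability."""
    return Z.cdf(z_score(x, mu, sigma))


def prob_greater(x, mu, sigma):
    """Step 3 for 'greater than': subtract the cumulative probability from 1."""
    return 1 - prob_less(x, mu, sigma)


def prob_between(x, y, mu, sigma):
    """Subtract the cumulative probability for x from that for y (x < y)."""
    return prob_less(y, mu, sigma) - prob_less(x, mu, sigma)


print(round(prob_less(73, 65, 5), 4))            # vehicle speed of 73 mph or less
print(round(prob_greater(85, 75, 8), 4))         # pulse rate above 85
print(round(prob_between(100, 130, 100, 15), 4)) # IQ between 100 and 130
```

Results may differ slightly in the fourth decimal place from a z table, because the table requires rounding z to two decimal places first.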
Practice finding the proportion of observations under the normal curve. Each question can be answered using either Minitab Express or the z table. Work through each example then click the icon to view the solution and compare your answers.
HINT: Drawing the normal curve and shading in the region you are looking for is often helpful.
1. What proportion of the standard normal curve is less than a z score of 1.64?
2. What proportion of the standard normal curve falls above a z score of 1.33?
3. What proportion of the standard normal curve falls between a z score of -.50 and a z score of +.50?
4. At one private school, a minimum IQ score of 125 is necessary to be considered for admission. IQ scores have a mean of 100 and standard deviation of 15. Given this information, what proportion of children are eligible for consideration for admission to this school?
5. ACT scores have a mean of 18 and a standard deviation of 6. What proportion of test takers score between a 20 and 26?
6. A men’s clothing company is doing research on the height of adult American men in order to inform the sizing of the clothing that they offer. The height of males in the United States is normally distributed with a mean of 175 cm and a standard deviation of 15 cm. Men who are more than 30 cm different (shorter or taller) from the mean are classified by the apparel company as special cases because they do not fit in their regular length clothing. Given this information, what proportion of men would be classified as special cases?
In this lesson we examined a number of probability distributions including discrete, binomial, and normal. The next lesson will continue to explore probability distributions with an emphasis on the distribution of sample means. It will also introduce a new distribution that is similar in shape to the normal distribution: the t distribution.
Take a moment to review what you learned in this lesson before continuing to the next.