# 6.2 - Sampling Distribution of the Sample Mean

To review, the **Central Limit Theorem** states that if a large enough sample is taken (typically \(n\geq 30\)), then the sampling distribution of \(\bar{x}\) is approximately a normal distribution with a mean of μ and a standard deviation of \(\frac {\sigma}{\sqrt{n}}\).

Since in practice we usually do not know μ or σ, we estimate μ by \(\bar{x}\) and \(\frac {\sigma}{\sqrt{n}}\) by \( \frac {s}{\sqrt{n}}\). Here *s* is the standard deviation of the sample and serves as the estimate of σ. The expression \( \frac {s}{\sqrt{n}}\) is known as the standard error of the mean, labeled \(SE(\overline{x})\) or \(s_{\overline{x}}\).

**Standard Error of the Mean**

\[SE(\overline{x})= \frac {\sigma}{\sqrt{n}}\]

If \(\sigma\) is unknown, estimate \(\sigma\) using \(s\), which gives \(SE(\overline{x})\approx \frac {s}{\sqrt{n}}\).
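As a rough illustration of the two versions of the formula, the sketch below computes the standard error once from a known \(\sigma\) and once from a sample. The function names and the four height values are hypothetical, made up only for this example.

```python
import math
import statistics


def standard_error_known_sigma(sigma, n):
    """SE of the mean when the population standard deviation is known."""
    return sigma / math.sqrt(n)


def standard_error_from_sample(sample):
    """Estimated SE of the mean: s / sqrt(n), with s the sample standard deviation."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample SD (divides by n - 1)
    return s / math.sqrt(n)


# Hypothetical heights (in inches) of four men, invented for illustration.
heights = [68.0, 71.5, 70.0, 72.5]

print("x-bar (estimate of mu):", statistics.mean(heights))
print("SE with sigma = 3 known:", standard_error_known_sigma(3, len(heights)))
print("SE estimated from s:", standard_error_from_sample(heights))
```

The two standard errors differ because \(s\) computed from only four observations is itself a noisy estimate of \(\sigma\).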

In Lesson 5 we learned that according to the **Law of Large Numbers**, if a large number of trials are performed, the mean of those trials will be approximately equal to the expected value (i.e., mean). We will apply the Law of Large Numbers in the following examples, in which we simulate pulling many random samples from a population. In each of these examples we know the population and draw simple random samples of a consistent size (\(n\)). The statistics computed from these samples can be compared to the values predicted by the **Central Limit Theorem**, which states that the mean of a distribution of sample means will equal \(\mu\), with a standard deviation of \(\frac {\sigma}{\sqrt{n}}\).

Below are a few more examples of simulations that draw a large number of random samples from a known population. These examples show that the mean of a distribution of sample means is approximately equal to the mean of the population and that the standard deviation of a distribution of sample means is approximately equal to \(\frac {\sigma}{\sqrt{n}}\).

### Simulating a Distribution of Sample Means from a Normal Population

Let's generate 500 samples, each consisting of the heights of 4 men. Assume the distribution of male heights is normal with \(\mu=70"\) and \(\sigma=3"\). We will find the mean of each of the 500 samples of size \(n=4\).

Here are the first 10 sample means:

70.4 72.0 72.3 69.9 70.5 70.0 70.5 68.1 69.2 71.8

Theory says that the mean of the \(\bar{x}\)'s equals \(\mu=70\), which is also the population mean, and that \(SE(\bar{x})=\frac {\sigma}{\sqrt{n}}=\frac{3}{\sqrt{4}}=1.50\).

Our simulation shows Average(500 \(\bar{x}\)'s) = 69.957 and SE(500 \(\bar{x}\)'s) = 1.496. These values are similar to what was predicted by our equations, even though our sample sizes were small (\(n<30\)); because the population itself is normal, the sample means are normally distributed for any sample size. In the next example, let's increase our sample size.
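A simulation of this kind can be sketched in a few lines of Python. This is only an illustrative sketch, not the software used to produce the numbers above; the function name `simulate_sample_means` and the seed are arbitrary choices, so the output will be close to, but not identical to, the values reported in this lesson.

```python
import random
import statistics


def simulate_sample_means(mu, sigma, n, num_samples, seed=None):
    """Draw num_samples samples of size n from Normal(mu, sigma); return their means."""
    rng = random.Random(seed)
    means = []
    for _ in range(num_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.mean(sample))
    return means


# 500 samples of n = 4 heights, with mu = 70 and sigma = 3 as in the example.
means = simulate_sample_means(mu=70, sigma=3, n=4, num_samples=500, seed=1)

print("first 10 sample means:", [round(m, 1) for m in means[:10]])
print("average of the 500 x-bars:", round(statistics.mean(means), 3))
print("SD of the 500 x-bars:", round(statistics.stdev(means), 3))
print("theoretical SE = sigma / sqrt(n):", 3 / 4 ** 0.5)
```

Changing the seed gives a different set of 500 samples, but the average of the sample means should stay near 70 and their standard deviation near 1.50.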

**What if we had a larger sample size?**

Let's change the sample size from \(n=4\) to \(n=25\) and get descriptive statistics for 500 samples again:


Theory says that the mean of the \(\bar{x}\)'s equals \(\mu=70\), which is also the population mean, and that \(SE(\bar{x})=\frac {\sigma}{\sqrt{n}}=\frac{3}{\sqrt{25}}=0.60\).

Our simulation shows Average(500 \(\bar{x}\)'s) = 69.983 and SE(500 \(\bar{x}\)'s) = 0.592. Again, the results of our simulation were similar to the results predicted by the equations.
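To see how the spread of the sample means shrinks as \(n\) grows, the sketch below repeats the simulation for several sample sizes and compares each result with \(\frac{\sigma}{\sqrt{n}}\). It is a hedged illustration only: the helper name, the seed, and the extra sample size \(n=100\) are assumptions added here and do not come from the lesson.

```python
import math
import random
import statistics

MU, SIGMA = 70, 3      # population parameters from the height example
NUM_SAMPLES = 500      # number of samples drawn for each sample size


def sd_of_sample_means(n, rng):
    """Simulate NUM_SAMPLES sample means of size n and return their standard deviation."""
    means = [
        statistics.fmean(rng.gauss(MU, SIGMA) for _ in range(n))
        for _ in range(NUM_SAMPLES)
    ]
    return statistics.stdev(means)


rng = random.Random(2)
results = {}
for n in (4, 25, 100):  # n = 100 is an extra size added for illustration
    simulated = sd_of_sample_means(n, rng)
    theoretical = SIGMA / math.sqrt(n)
    results[n] = (simulated, theoretical)
    print(f"n={n:3d}  simulated SD={simulated:.3f}  sigma/sqrt(n)={theoretical:.3f}")
```

Each simulated value should land close to its theoretical counterpart, and both fall as the sample size increases.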

### Simulating a Distribution of Sample Means from a Non-Normal Population

In the previous example, the height of men was normally distributed. Here we will look at an example in which the population is not normally distributed.

**Simulation**: Below is a histogram of the number of CDs owned by the population of Penn State students. The distribution is strongly skewed to the right.


In the population, \(\mu=84\) and \(\sigma=96\).

Let's obtain 500 samples of size 4 from this population and look at the distribution of the 500 \(\bar{x}\)'s:

Theory says that the mean of this distribution should be \(\mu=84\), which is also the population mean, and that \(SE(\overline{x})=\frac{96}{\sqrt{4}}=48\).

Simulation shows Average(500 \(\bar{x}\)'s) = **81.11** and SE(500 \(\bar{x}\)'s for samples of size 4) = **45.1**.

**What if we had a larger sample size?**

Change the sample size from \(n=4\) to \(n=25\) and get descriptive statistics and curve:

Theory says that the mean of this distribution should equal \(\mu=84\), which is also the population mean, and that \(SE(\bar{x})=\frac {96}{\sqrt{25}}=19.2\).

The simulation shows Average(500 \(\bar{x}\)'s) = 83.281 and SE(500 \(\bar{x}\)'s for samples of size 25) = 18.268. A histogram of the 500 \(\bar{x}\)'s computed from samples of size 25 is beginning to look a lot like a normal curve.
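The CD data themselves are not reproduced here, so the sketch below uses a stand-in population: a gamma distribution whose shape and scale are chosen so that its mean is 84 and its standard deviation is 96. That choice is purely an assumption for illustration (the real CD counts are not gamma distributed, and the helper names are hypothetical), but it gives a strongly right-skewed population on which the same experiment can be tried.

```python
import random
import statistics

MU, SIGMA = 84, 96
# Assumed stand-in population: gamma with mean MU and standard deviation SIGMA.
SHAPE = (MU / SIGMA) ** 2   # shape k, chosen so that k * theta = MU
SCALE = SIGMA ** 2 / MU     # scale theta, chosen so that k * theta**2 = SIGMA**2


def sample_means(n, num_samples, rng):
    """Means of num_samples samples of size n drawn from the stand-in population."""
    return [
        statistics.fmean(rng.gammavariate(SHAPE, SCALE) for _ in range(n))
        for _ in range(num_samples)
    ]


def skewness(values):
    """Average cubed z-score; values near 0 indicate a roughly symmetric shape."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)


rng = random.Random(3)
for n in (4, 25):
    means = sample_means(n, 500, rng)
    print(
        f"n={n:2d}  average={statistics.fmean(means):.2f}  "
        f"SD={statistics.stdev(means):.2f}  "
        f"sigma/sqrt(n)={SIGMA / n ** 0.5:.2f}  "
        f"skewness={skewness(means):.2f}"
    )
```

With this stand-in population, the skewness of the sample means should be noticeably smaller for \(n=25\) than for \(n=4\), which is the numerical counterpart of the histogram starting to look like a normal curve.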