Ask Sawal

Discussion Forum

How to find z* for confidence interval?

3 Answer(s) Available
Answer # 1 #

\(z\)-score: The \(z\)-score \(z_{\alpha}\) is the number (a point on the horizontal axis) such that the area to the right of \(z_{\alpha}\), between the standard normal curve and the horizontal axis, is equal to \(\alpha\).

Confidence interval: For a given confidence level, a confidence interval is an interval \((a, b)\) of real numbers that contains a population parameter (such as a population mean) with the given confidence level.

Now that we have seen how to find the critical \(z\)-value for a given confidence level, let's hone our skills by walking through a practice problem.

Example 1: A peanut manufacturer wants to estimate the mean weight of its jars of peanuts. A sample of 60 jars is collected, and the sample mean weight is 3.2 pounds. The manufacturer wants a 95% confidence interval for the true mean weight of all of its peanut jars. What \(z\)-score should be used when constructing the interval?

Step 1: Determine the confidence level, denoted \(C\%\), where \(C\) is a number (decimal) between 0 and 100.

The confidence level is 95%.

Step 2: Obtain the significance level, denoted \(\alpha\), by evaluating \(\alpha = 1 - \frac{C}{100}\).

The significance level is equal to

\(\alpha = 1 - \frac{95}{100} = 1 - 0.95 = 0.05\).

Step 3: Use the \(z\)-table (or a calculator) to obtain the \(z\)-score \(z_{\alpha/2}\).

The value \(z_{\alpha/2} = z_{0.05/2} = z_{0.025}\) is the number on the horizontal axis with area 0.025 to its right, underneath the standard normal curve and above the horizontal axis. Equivalently, the \(z\)-score \(z_{0.025}\) has area \(1 - 0.025 = 0.975\) to its left. Using a \(z\)-table shows that \(z_{0.025} = \mathbf{1.96}\).
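If a \(z\)-table is not at hand, the same critical value can be obtained numerically. Below is a minimal sketch in Python, assuming scipy is available; norm.ppf is the inverse of the standard normal CDF, so evaluating it at \(1 - \alpha/2 = 0.975\) returns \(z_{0.025}\).

```python
from scipy.stats import norm

# 95% confidence level -> significance level alpha = 0.05
confidence = 0.95
alpha = 1 - confidence

# z_{alpha/2} is the point with area alpha/2 to its right,
# i.e. area 1 - alpha/2 to its left.
z_crit = norm.ppf(1 - alpha / 2)

print(round(z_crit, 2))  # 1.96
```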

bacuuf Va
ARTIFICIAL GLASS EYE MAKER
Answer # 2 #

We have seen that the sample mean \(\bar{X}\) has mean \(\mu\) and variance \(\dfrac{\sigma^2}{n}\), and that the distribution of \(\bar{X}\) is approximately Normal when the sample size \(n\) is large. This raises the question: How large is 'a large sample size'? Appropriate guidelines need to take into account the nature of the population being sampled, as far as this is possible; this will be elaborated later in this section.

The Normal approximation for the distribution of \(\bar{X}\) tells us that, for large \(n\),

\[ \bar{X} \stackrel{\mathrm{d}}{\approx} \mathrm{N}\Bigl(\mu, \dfrac{\sigma^2}{n}\Bigr). \]

For a random variable with the standard Normal distribution, \(Z \stackrel{\mathrm{d}}{=} \mathrm{N}(0,1)\), we know that \(\Pr(-2 < Z < 2) \approx 0.95\). To be more precise:

\[ \Pr(-1.96 < Z < 1.96) = 0.95. \]
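As a quick numerical check of this statement (a sketch, assuming Python with scipy is available), the probability can be computed from the standard Normal CDF:

```python
from scipy.stats import norm

# Area under the standard Normal curve between -1.96 and 1.96
prob = norm.cdf(1.96) - norm.cdf(-1.96)
print(round(prob, 4))  # 0.95
```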

We studied how to obtain the value 1.96 in the module Exponential and normal distributions. Figure 29 is a visual reminder.

Figure 29: The standard Normal distribution, \(Z \stackrel{\mathrm{d}}{=} \mathrm{N}(0,1)\).

If we consider the Normal approximation to the distribution of the standardised sample mean, it follows that we can state that, for large \(n\),

\[ \Pr\Bigl(-1.96 < \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\Bigr) \approx 0.95. \]

We multiply through by \(\dfrac{\sigma}{\sqrt{n}}\) to obtain

\[ \Pr\Bigl(-1.96\,\dfrac{\sigma}{\sqrt{n}} < \bar{X} - \mu < 1.96\,\dfrac{\sigma}{\sqrt{n}}\Bigr) \approx 0.95. \]

In other words, the distance between \(\bar{X}\) and \(\mu\) will be less than \(1.96 \dfrac{\sigma}{\sqrt{n}}\) for 95% of sample means.

One further rearrangement gives

\[ \Pr\Bigl(\bar{X} - 1.96\,\dfrac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\,\dfrac{\sigma}{\sqrt{n}}\Bigr) \approx 0.95. \]

It is really important to reflect on this probability statement. Note that it has \(\mu\) in the centre of the inequalities. The population parameter \(\mu\) does not vary: it is fixed, but unknown. The random element in this probability statement is the interval itself: its endpoints, \(\bar{X} \pm 1.96\,\dfrac{\sigma}{\sqrt{n}}\), are random variables, and the interval may or may not contain the fixed value \(\mu\).

This forms the basis for the approximate 95% confidence interval for the true mean \(\mu\). In a given case, we have just a single sample mean \(\bar{x}\). An approximate 95% confidence interval for \(\mu\) is given by

\[ \bar{x} \pm 1.96\,\dfrac{\sigma}{\sqrt{n}}, \quad \text{that is,} \quad \Bigl(\bar{x} - 1.96\,\dfrac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\,\dfrac{\sigma}{\sqrt{n}}\Bigr). \]

However, a problem remains. The uncertainty in the estimate depends on \(\sigma\), which is an unknown parameter: the true standard deviation of the parent distribution.

In the approximate methods used here, we replace the population standard deviation \(\sigma\) with the sample standard deviation.

The sample standard deviation is an estimate of the population standard deviation. Just as \(\bar{X}\) is a random variable that estimates \(\mu\) and has an observed value \(\bar{x}\) for a specific sample, so \(S\) is a random variable that estimates \(\sigma\) and has an observed value \(s\) for a specific sample. The sample standard deviation (which is dealt with in the national curriculum in Year 10) is defined as follows. For a random sample \(X_1, X_2, \dots, X_n\) from a population with standard deviation \(\sigma\), the sample standard deviation is defined to be

\[ S = \sqrt{\dfrac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2}, \]

where \(\bar{X}\) is the sample mean. For a specific random sample \(x_1, x_2, \dots, x_n\), the observed value of the sample standard deviation is

\[ s = \sqrt{\dfrac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}, \]

where \(\bar{x}\) is the observed value of the sample mean.
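As an illustration (a sketch with made-up sample values, not data from the module), the formula above gives the same result as the sample standard deviation computed by the Python standard library, since both use the divisor \(n - 1\):

```python
import math
import statistics

# Hypothetical sample of observed values (illustrative only)
x = [3.1, 3.4, 2.9, 3.3, 3.0, 3.2]
n = len(x)
x_bar = sum(x) / n

# Observed sample standard deviation, using the divisor n - 1
s = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))

print(round(s, 4))
print(round(statistics.stdev(x), 4))  # same value: statistics.stdev also divides by n - 1
```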

It is reasonable to ask whether using \(S\) in place of \(\sigma\) actually works. We have extensively demonstrated the approximate Normality of the distribution of the sample mean. In particular, we have seen that, for large \(n\),

\[ \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \stackrel{\mathrm{d}}{\approx} \mathrm{N}(0,1). \]

But now it seems that we are going to rely on a different, further approximation: that for large \(n\),

\[ \dfrac{\bar{X} - \mu}{S/\sqrt{n}} \stackrel{\mathrm{d}}{\approx} \mathrm{N}(0,1). \]

The result that uses \(S\) in place of \(\sigma\) is also valid; as \(n\) tends to infinity, the sample standard deviation \(S\) gets closer and closer to the true standard deviation \(\sigma\). We could revisit all of the previous examples and demonstrate this for the uniform, exponential and so on; instead, we use the strange-looking distribution to make the point.

Figure 30 shows histograms of \(\dfrac{\bar{X} - \mu}{S/\sqrt{n}}\), for various values of \(n\), based on random samples from the strange distribution considered in the section Sampling from asymmetric distributions (see figure 22). Superimposed Normal distributions are added to enable a visual assessment of the adequacy of the Normal approximation.

Figure 30: Histograms of the standardised sample mean, using the sample standard deviation in the denominator rather than the population standard deviation, for various values of \(n\).

The standardised distributions in figure 30 using \(S\) in the denominator do not approximate Normality as well as the histograms of \(\bar{X}\) in figure 22; some skewness is evident. But for \(n=200\), the approximation is good; keep in mind that we are sampling from a decidedly odd parent distribution here.
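The same point can be checked by simulation. The strange-looking distribution from the module is not reproduced here, so the sketch below (assuming Python with numpy) uses an exponential parent distribution instead, which is also clearly non-Normal; for large \(n\), roughly 95% of the standardised values fall between \(-1.96\) and \(1.96\).

```python
import numpy as np

rng = np.random.default_rng(0)

mu = 1.0        # mean of the Exponential(1) parent distribution
n = 200         # sample size
reps = 10_000   # number of simulated samples

samples = rng.exponential(scale=mu, size=(reps, n))
x_bar = samples.mean(axis=1)
s = samples.std(axis=1, ddof=1)           # sample standard deviation (divisor n - 1)

t_stat = (x_bar - mu) / (s / np.sqrt(n))  # standardised sample mean, using S

# Proportion of standardised values between -1.96 and 1.96: close to 0.95
print(np.mean((t_stat > -1.96) & (t_stat < 1.96)))
```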

Hence, for large \(n\), based on a random sample from a distribution with mean \(\mu\) and standard deviation \(\sigma\), an approximate 95% confidence interval for \(\mu\) is given by

\[ \bar{x} \pm 1.96\,\dfrac{s}{\sqrt{n}}. \]
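In code, this interval can be assembled directly from the sample mean, the sample standard deviation and the sample size. The sketch below uses hypothetical summary values, not data from the module:

```python
import math

# Hypothetical summary statistics from a large random sample
n = 100        # sample size
x_bar = 52.3   # observed sample mean
s = 6.1        # observed sample standard deviation

margin = 1.96 * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"Approximate 95% CI: ({lower:.2f}, {upper:.2f})")
```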

We have focussed so far on 95% confidence intervals, since 95% is the confidence level that is used most commonly. The general form of an approximate \(C\%\) confidence interval for a population mean is

\[ \bar{x} \pm z\,\dfrac{s}{\sqrt{n}}, \]

where the value of \(z\) is appropriate for the confidence level. For a 95% confidence interval, we use \(z=1.96\), while for a 90% confidence interval, for example, we use \(z=1.64\).

In general, for a \(C\%\) confidence interval, we need to find the value of \(z\) that satisfies

\[ \Pr(-z < Z < z) = \dfrac{C}{100}, \quad \text{where } Z \stackrel{\mathrm{d}}{=} \mathrm{N}(0,1). \]
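The value of \(z\) for any confidence level can be obtained from the inverse of the standard Normal CDF: the two tails together have area \(1 - \frac{C}{100}\), so \(z\) is the point with area \(\frac{1}{2}\bigl(1 + \frac{C}{100}\bigr)\) to its left. A sketch, again assuming scipy:

```python
from scipy.stats import norm

for c in (90, 95, 98, 99):
    # z satisfies Pr(-z < Z < z) = c/100, so the area to the left of z is (1 + c/100)/2
    z = norm.ppf((1 + c / 100) / 2)
    print(f"{c}% confidence: z = {z:.3f}")
```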

Figure 32 shows the required value of \(z\) as a function of the confidence level.

Figure 32: The relationship between the confidence level and the value of \(z\) in the formula for an approximate confidence interval.

The following figure is a repeat of figure 28. It shows confidence intervals based on the same estimated mean, but with different confidence levels. Larger confidence levels lead to wider confidence intervals.

Figure 33: Confidence intervals from the same data, but with different confidence levels.

The distance from the sample estimate \(\bar{x}\) to the endpoints of the confidence interval is

\[ E = z\,\dfrac{s}{\sqrt{n}}. \]

The quantity \(E\) is referred to as the margin of error. The margin of error is half the width of the confidence interval. Sometimes confidence intervals are reported as \(\bar{x} \pm E\); for example, as \(9.95 \pm 0.43\). This means that the lower and upper bounds of the interval are not stated directly but must be derived; here the interval is \((9.95 - 0.43,\ 9.95 + 0.43) = (9.52,\ 10.38)\).

In figure 33, we see larger margins of error when the confidence level is larger. This is because the value of \(z\) from the standard Normal distribution will be larger when the confidence level is larger.

We use a confidence interval when we want to make an inference about a population parameter, in this case, the population mean. The confidence interval describes a range of plausible values for the population mean that could have given rise to our random sample of observations. The margin of error in a confidence interval for the mean is based on the standard deviation divided by the square root of sample size; generally, the margin of error for a confidence interval will be smaller than the standard deviation of the sample, unless the sample size is very small.

Sometimes a confidence interval is wrongly interpreted as providing information about plausible values for the range of the data. This is illustrated in the next example.

We have seen in this module that, for large \(n\), an approximate 95% confidence interval for \(\mu\) is

\[ \bar{x} \pm 1.96\,\dfrac{s}{\sqrt{n}}. \]

It is pertinent to ask: How large is 'large'? In effect, what is the smallest sample size for which the approximation is adequate?

A commonly cited guideline is that \(n\) should be greater than 30. However, this guideline does not apply in all cases; sometimes larger sample sizes are needed before the Normal approximation can safely be used. More detailed guidelines, based on the shape of the parent distribution, are as follows:

- For a Normal or other reasonably symmetric parent distribution, a sample size of about \(n = 30\) is usually adequate.
- For a somewhat skewed parent distribution, a sample size of about \(n = 60\) is a safer guide.
- For a strongly skewed parent distribution, such as the exponential distribution, a sample size of about \(n = 130\) may be needed.

Figure 35 provides four example populations. In the top row, from left to right, there is a Normal population, an example of a symmetric distribution (in this case, a triangular distribution), a somewhat skewed distribution and an exponential distribution. In the bottom row, under each distribution, is a random sample. The sizes of the samples shown correspond to the guidelines above. Samples of size \(n = 30\) are taken from the Normal and symmetric populations, a sample of size \(n = 60\) is taken from the skewed distribution, and a sample of size \(n = 130\) from the exponential distribution. The samples from the skewed and exponential distributions appear to be more clearly skewed than those from the Normal and triangular distributions.

Figure 35: Examples of samples from four different populations.

In practice, when calculating a confidence interval for a population mean based on a random sample, we may not have information about the population from which the sample was taken. But we do have the random sample itself! We need to make a judgement about the likely population distribution from which the sample arose. Figure 36 gives examples of ten different samples from each of the four different populations shown in the top part of figure 35. It is possible to get some idea of the shape of the parent distribution from the histogram of the random sample itself, and that may assist us to judge whether the sample size is large enough for the Normal approximation to be adequate.

An important overall message, however, is that sample sizes of a few hundred or so are enough for the use of the Normal approximation in general, unless the parent distribution is really bizarre.

Figure 36: Ten samples from each of four different populations.


Pranii vmko Arpit
SHAPING MACHINE TENDER
Answer # 3 #

Step 1: Divide your confidence level by 2: 0.95/2 = 0.475.
Step 2: Look up the value from Step 1 in a z-table (one that gives the area between 0 and z) and find the corresponding z-value. The z-value that has an area of 0.475 is 1.96.
Step 3: Divide the number of events by the number of trials to get the "p-hat" value: 24/160 = 0.15.
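The steps above stop at \(\hat{p}\); to finish the interval, the standard large-sample formula for a proportion (not stated explicitly above, so treat this as an assumed continuation) is \(\hat{p} \pm z\sqrt{\hat{p}(1 - \hat{p})/n}\). A minimal sketch in Python using the numbers from the steps above:

```python
import math

# Numbers from the steps above
z = 1.96                   # from Step 2
events, trials = 24, 160
p_hat = events / trials    # Step 3: 0.15

# Standard large-sample ("Wald") confidence interval for a proportion
margin = z * math.sqrt(p_hat * (1 - p_hat) / trials)
lower, upper = p_hat - margin, p_hat + margin

print(f"95% CI for the proportion: ({lower:.3f}, {upper:.3f})")
```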

Erland Apa
Character Actor