How do I construct a confidence interval

Confidence interval for the proportion

Basic concepts

A dichotomous population is assumed, in which an unknown proportion $\pi$ of the elements has a property $A$ and the remaining share $1 - \pi$ does not have this property.

An interval estimate for $\pi$ is to be carried out, i.e. a confidence interval is to be constructed for the unknown proportion $\pi$ of the population.

A simple random sample of size $n$ is drawn from this population, so that the sample variables $X_1, \dots, X_n$ are independent and identically Bernoulli distributed (see section Binomial Distribution).

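It has already been shown that the sample proportion

$$\hat{\pi} = \frac{1}{n} \sum_{i=1}^{n} X_i$$

with the expected value

$$E(\hat{\pi}) = \pi$$

and the variance

$$\sigma^2(\hat{\pi}) = \frac{\pi (1 - \pi)}{n}$$

is an unbiased and consistent estimator for $\pi$ (see section Properties of Estimators).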
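
The two properties just stated can be checked empirically. The following is a minimal simulation sketch, not part of the original text: the true proportion `pi_true = 0.3`, the sample size `n = 100`, the number of repetitions and the function name `simulate_sample_proportions` are all illustrative assumptions.

```python
import random
from statistics import fmean, pvariance


def simulate_sample_proportions(pi_true, n, repetitions, seed=0):
    """Draw `repetitions` Bernoulli samples of size n and return each sample proportion."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    proportions = []
    for _ in range(repetitions):
        # x = number of sample elements that have the property
        x = sum(rng.random() < pi_true for _ in range(n))
        proportions.append(x / n)
    return proportions


# Hypothetical values chosen only for illustration.
pi_true, n = 0.3, 100
estimates = simulate_sample_proportions(pi_true, n, repetitions=5000)

print("mean of estimates:    ", fmean(estimates), "| theory:", pi_true)
print("variance of estimates:", pvariance(estimates), "| theory:", pi_true * (1 - pi_true) / n)
```

With these settings the empirical mean should lie close to the true proportion and the empirical variance close to the theoretical variance of the sample proportion, although the exact figures depend on the seed.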

Since the construction of confidence intervals is very laborious for small sample sizes, only the case is considered here in which the sample size $n$ is sufficiently large that the standardized random variable

$$Z = \frac{\hat{\pi} - \pi}{\sqrt{\pi (1 - \pi) / n}}$$

is, by the central limit theorem, approximately standard normally distributed: $Z \sim N(0, 1)$.

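The following probability statement therefore holds:

$$P\left( -z_{1-\alpha/2} \le \frac{\hat{\pi} - \pi}{\sqrt{\pi (1 - \pi) / n}} \le z_{1-\alpha/2} \right) \approx 1 - \alpha$$

where $z_{1-\alpha/2}$ is obtained from the distribution function of the standard normal distribution for the given probability $1 - \alpha/2$.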
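
Instead of reading the quantile from a table, it can be computed numerically. The sketch below uses `NormalDist.inv_cdf` from Python's standard library; the helper name `z_quantile` and the listed significance levels are assumptions made for illustration.

```python
from statistics import NormalDist


def z_quantile(alpha):
    """Return the (1 - alpha/2) quantile of the standard normal distribution."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


# Common significance levels, chosen here only as examples.
for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}  ->  z = {z_quantile(alpha):.4f}")
```

For the usual levels 0.10, 0.05 and 0.01 this gives quantiles of roughly 1.645, 1.960 and 2.576.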

This does not yet yield a suitable confidence interval for $\pi$, because with $\pi$ unknown, the variance of the estimator $\hat{\pi}$ is unknown as well.

This variance must also be estimated from the sample.

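If the unknown proportion $\pi$ in the variance $\sigma^2(\hat{\pi}) = \pi (1 - \pi) / n$ is replaced by the estimator $\hat{\pi}$, one obtains a consistent estimator for the variance of $\hat{\pi}$:

$$\hat{\sigma}^2(\hat{\pi}) = \frac{\hat{\pi} (1 - \hat{\pi})}{n}$$

The confidence interval at the confidence level $1 - \alpha$ can now be derived through elementary transformations:

$$P\left( \hat{\pi} - z_{1-\alpha/2} \sqrt{\frac{\hat{\pi} (1 - \hat{\pi})}{n}} \le \pi \le \hat{\pi} + z_{1-\alpha/2} \sqrt{\frac{\hat{\pi} (1 - \hat{\pi})}{n}} \right) \approx 1 - \alpha$$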

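Thus, for very large sample sizes, an approximate confidence interval for the unknown proportion $\pi$ of a dichotomous population is given by

$$\left[ \hat{\pi} - z_{1-\alpha/2} \sqrt{\frac{\hat{\pi} (1 - \hat{\pi})}{n}}, \ \hat{\pi} + z_{1-\alpha/2} \sqrt{\frac{\hat{\pi} (1 - \hat{\pi})}{n}} \right]$$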

For a sufficient approximation to the normal distribution, the sample size $n$ must not fall below a minimum value; it should, however, be chosen considerably larger than that minimum.

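For a specific sample, the estimation interval is obtained as

$$\left[ p - z_{1-\alpha/2} \sqrt{\frac{p (1 - p)}{n}}, \ p + z_{1-\alpha/2} \sqrt{\frac{p (1 - p)}{n}} \right]$$

where $p = x / n$ is the relative frequency of elements with the property $A$ in the sample and $x$ is their number in the sample.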
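
The estimation interval for a specific sample can be sketched in a few lines of Python. The function name `proportion_interval` and the sample figures (120 elements with the property among 400 drawn) are hypothetical and serve only as an illustration of the large-sample approximation described above.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(x, n, alpha=0.05):
    """Approximate (large-sample) confidence interval for a proportion.

    x     -- number of sample elements that have the property
    n     -- sample size
    alpha -- 1 minus the confidence level
    """
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("need n > 0 and 0 <= x <= n")
    p = x / n                                        # relative frequency in the sample
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)      # standard normal quantile
    half_width = z * sqrt(p * (1.0 - p) / n)         # quantile times estimated standard deviation
    return p - half_width, p + half_width


# Hypothetical sample: 120 of 400 elements have the property.
lower, upper = proportion_interval(x=120, n=400, alpha=0.05)
print(f"estimation interval: [{lower:.4f}, {upper:.4f}]")
```

For this hypothetical sample the relative frequency is 0.3 and the interval comes out at approximately [0.2551, 0.3449]. Note that the interval is not clipped to the range from 0 to 1, so for relative frequencies near 0 or 1 the limits can fall outside that range.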

Further information

Characteristics of the confidence interval

The limits of the confidence interval are random variables, as they depend on the sample result through the estimator $\hat{\pi}$.
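
Because the interval limits vary from sample to sample, repeated sampling produces different intervals, of which only a certain share covers the true proportion. The following simulation sketch illustrates this; the true proportion `pi_true`, the sample size, the number of repetitions and the function name `coverage` are assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist


def coverage(pi_true, n, alpha, repetitions, seed=0):
    """Share of simulated approximate intervals that contain pi_true."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    hits = 0
    for _ in range(repetitions):
        # Draw one Bernoulli sample and count the elements with the property.
        x = sum(rng.random() < pi_true for _ in range(n))
        p = x / n
        half_width = z * sqrt(p * (1.0 - p) / n)
        # The limits differ in every repetition because p does.
        if p - half_width <= pi_true <= p + half_width:
            hits += 1
    return hits / repetitions


# Hypothetical settings for illustration.
share = coverage(pi_true=0.3, n=200, alpha=0.05, repetitions=2000)
print(f"share of intervals covering the true proportion: {share:.3f}")
```

With a large sample size the share should come out near the nominal confidence level of 0.95, although the exact value depends on the seed and on the quality of the normal approximation.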

Notes on estimating the variance of the sample proportion

The variance of the estimator

$$\sigma^2(\hat{\pi}) = \frac{\pi (1 - \pi)}{n}$$

is unknown because it contains the unknown proportion $\pi$.

This variance must also be estimated from the sample by replacing $\pi$ with the estimator $\hat{\pi}$.

The justification for the substitution is given by the fact that the expected value of with increasing sample size