## Normal distribution and its characteristics

The normal distribution is the most important of all probability distributions. It is applied directly to many practical problems, and several very useful distributions are based on it.

**Characteristics**

Many empirical frequency distributions have the following characteristics:

- They are approximately symmetrical, and the mode is close to the centre of the distribution.

- The mean, median, and mode are close together.

- The shape of the distribution can be approximated by a bell: nearly flat on top, then decreasing more quickly, then decreasing more slowly toward the tails of the distribution. This implies that values close to the mean are relatively frequent, and values farther from the mean tend to occur less frequently. Remember that we are dealing with a random variable, so a frequency distribution will not fit this pattern exactly. There will be random variations from this general pattern.

**A theoretical distribution** that has the stated characteristics and can be used to approximate many empirical distributions was devised more than two hundred years ago. It is called the “normal probability distribution,” or the normal distribution. It is sometimes called the Gaussian distribution.

**Probability from the Probability Density Function**

The probability density function for the normal distribution is given by:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]$$

where μ is the mean of the theoretical distribution, σ is the standard deviation, and π = 3.14159… This density function extends from –∞ to +∞. Its shape is a symmetrical bell centred on μ.

Because the normal probability density function is symmetrical, the mean, median and mode coincide at x = μ. Thus, the value of μ determines the location of the center of the distribution, and the value of σ determines its spread.
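As a minimal sketch, the standard normal density formula can be evaluated directly to check the symmetry about μ. The function name `normal_pdf` and the values μ = 10 and σ = 2 are illustrative assumptions, not taken from the text.

```python
import math


def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Normal probability density at x for mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma  # number of standard deviations from the mean
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Illustrative (assumed) parameters: mu = 10, sigma = 2.
peak = normal_pdf(10.0, mu=10.0, sigma=2.0)   # density at the mean
left = normal_pdf(8.0, mu=10.0, sigma=2.0)    # one sigma below the mean
right = normal_pdf(12.0, mu=10.0, sigma=2.0)  # one sigma above the mean

print(f"f(mu) = {peak:.5f}, f(mu - sigma) = {left:.5f}, f(mu + sigma) = {right:.5f}")
```

The density is largest at x = μ and takes equal values at equal distances on either side of it. A larger σ lowers the peak and widens the curve.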

We have seen that probabilities for a continuous random variable are given by integration of the probability density function.

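The probability that a variable, X, is between x1 and x2 according to the normal distribution is given by:

$$P(x_1 < X < x_2) = \int_{x_1}^{x_2} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right] dx$$

This probability corresponds to the area under the density curve between x1 and x2.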
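The integral has no closed form in elementary functions, so in practice it is evaluated numerically or through the error function. The sketch below does both for assumed values (μ = 10, σ = 2, x1 = 8, x2 = 12). The helper names are hypothetical.

```python
import math


def phi(z: float) -> float:
    """Cumulative standard normal probability, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_prob_between(x1: float, x2: float, mu: float, sigma: float) -> float:
    """P(x1 < X < x2) as a difference of two cumulative probabilities."""
    return phi((x2 - mu) / sigma) - phi((x1 - mu) / sigma)


def trapezoid_prob(x1: float, x2: float, mu: float, sigma: float, steps: int = 10_000) -> float:
    """The same probability by trapezoid-rule integration of the density."""
    def pdf(x: float) -> float:
        z = (x - mu) / sigma
        return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

    h = (x2 - x1) / steps
    total = 0.5 * (pdf(x1) + pdf(x2))
    for i in range(1, steps):
        total += pdf(x1 + i * h)
    return total * h


# Assumed example: probability of falling within one standard deviation of the mean.
p_erf = normal_prob_between(8.0, 12.0, mu=10.0, sigma=2.0)
p_numeric = trapezoid_prob(8.0, 12.0, mu=10.0, sigma=2.0)
print(f"erf-based: {p_erf:.6f}, trapezoid: {p_numeric:.6f}")
```

Both routes should agree to several decimal places. The difference-of-cumulative-probabilities form is the one that tables and software use.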

**Using Tables for the Normal Distribution**

Table A1 gives values of the cumulative normal probability as a function of z, the number of standard deviations from the mean. Part of Table A1 is shown below.

For example, suppose we want Φ(–0.76). We look for the row labeled z0 = –0.7 along the sides and the column labeled Δz = –0.06 along the top (since –0.76 = (–0.7) + (–0.06)) and read Φ(–0.76) = 0.2236.

The diagram at the top of the table towards the right indicates that Φ(z) corresponds to the area under the curve to the left of a particular value of z (here z = –0.76).

Because the distribution is symmetrical, there must be a simple relation between Φ(–0.76) and Φ(+0.76), or in general between Φ(–z) and Φ(+z). That relation is:

**Φ(−z1) = 1 − Φ(+z1)**
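The table lookup and the symmetry relation can be checked numerically. This sketch uses `statistics.NormalDist` from the Python standard library as a stand-in for Table A1. That choice is an assumption of this example, not something the text prescribes.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

phi_neg = std_normal.cdf(-0.76)  # area to the left of z = -0.76
phi_pos = std_normal.cdf(0.76)   # area to the left of z = +0.76

print(f"Phi(-0.76)     = {phi_neg:.4f}")
print(f"1 - Phi(+0.76) = {1.0 - phi_pos:.4f}")
```

Both printed values should match the table entry quoted above to four decimal places.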

**Using the Computer**

Cumulative normal probabilities can be obtained from computer software such as Excel. Standard cumulative normal probabilities, Φ(z), can be obtained by the Excel function **=NORMSDIST(z)**, where z = (x – μ)/σ is the standard normal variable. The inverse function is also available in Excel.

If we know a value of the cumulative normal probability, Φ(z), and want to find the value of z to which it applies, we can use the function =NORMSINV(cumulative probability). In both function names the letter “s” stands for the standard form—that is, a relation between Φ and z rather than between Φ and x.
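For readers working outside Excel, a rough Python counterpart is sketched below. It assumes `statistics.NormalDist` (Python 3.8 or later), with `cdf` playing the role of NORMSDIST and `inv_cdf` the role of NORMSINV. The numbers μ = 50, σ = 4 and x = 56 are made up for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # the standard form: relates Phi and z

# Assumed example values.
mu, sigma, x = 50.0, 4.0, 56.0
z = (x - mu) / sigma               # standardize: z = (x - mu) / sigma

prob = std_normal.cdf(z)           # counterpart of =NORMSDIST(z)
z_back = std_normal.inv_cdf(prob)  # counterpart of =NORMSINV(probability)

# The non-standard form relates Phi directly to x.
prob_from_x = NormalDist(mu, sigma).cdf(x)

print(f"z = {z}, Phi(z) = {prob:.4f}, recovered z = {z_back:.4f}")
```

Standardizing first and then using the standard form gives the same probability as working directly with x, μ and σ.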

**Fitting the Normal Distribution to Frequency Data**

Because a normal distribution is described completely by two parameters, its mean and standard deviation, the first step in fitting the normal distribution is usually to calculate the mean and standard deviation of the other distribution. Then we use these parameters to obtain a normal distribution comparable to the other distribution.

**(a) Fitting to a Continuous Frequency Distribution**

First, then, we need to estimate the parameters of the normal distribution that will fit the frequency distribution in which we are interested. Then we can compare the normal distribution having those parameters to the corresponding grouped frequency data.
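The two steps can be sketched as follows. The class boundaries and frequencies are hypothetical data invented for this example, and the use of the sample (n – 1) form of the variance is likewise an assumption.

```python
import math
from statistics import NormalDist

# Hypothetical grouped frequency data (not from the text).
boundaries = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]  # class boundaries
frequencies = [4, 12, 20, 10, 4]                   # observed class frequencies
total = sum(frequencies)

# Step 1: estimate the mean and standard deviation from class midpoints.
classes = list(zip(boundaries, boundaries[1:]))
midpoints = [(lo + hi) / 2.0 for lo, hi in classes]
mean = sum(f * m for f, m in zip(frequencies, midpoints)) / total
variance = sum(f * (m - mean) ** 2 for f, m in zip(frequencies, midpoints)) / (total - 1)
std_dev = math.sqrt(variance)

# Step 2: expected class frequencies under the fitted normal distribution.
fitted = NormalDist(mean, std_dev)
expected = [total * (fitted.cdf(hi) - fitted.cdf(lo)) for lo, hi in classes]

for (lo, hi), obs, exp in zip(classes, frequencies, expected):
    print(f"{lo:5.1f} to {hi:5.1f}: observed {obs:3d}, expected {exp:6.2f}")
```

The expected frequencies add up to slightly less than the total because the fitted normal distribution places a small probability outside the outermost class boundaries.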

**(b) Fitting to a Discrete Frequency Distribution**

If the distribution we are comparing with a normal distribution is discrete, we need a correction for continuity, because the normal distribution is continuous. The correction for continuity will be examined in the next section, in which the discrete binomial distribution is approximated by a normal distribution.

**Normal Approximation to a Binomial Distribution**

It is often desirable to use the **normal distribution** in place of another **probability distribution**. In particular, it is convenient to replace the binomial distribution with the normal when certain conditions are met. Remember, though, that the binomial distribution is discrete, whereas the normal distribution is continuous.

**The shape of the binomial distribution** varies considerably according to its parameters, n and p. If the parameter p, the probability of “success” (or a defective item or a failure, etc.) in a single trial, is sufficiently small (or if q = 1 – p is sufficiently small), the distribution is usually unsymmetrical. If p or q is sufficiently small and if the number of trials, n, is large enough, a binomial distribution can be approximated by a **Poisson distribution**.

On the other hand, if p is sufficiently close to 0.5 and n is sufficiently large, the binomial distribution can be approximated by a normal distribution. Under these conditions the binomial distribution is approximately symmetrical and tends toward a bell shape. A larger value of n allows greater departure of p from 0.5; a **binomial distribution** with very small p (or p very close to 1) can be approximated by a normal distribution if n is very large. If n is large enough, sometimes both the Poisson approximation and the normal approximation are applicable. In that case, use of the normal approximation is usually preferable because the **normal distribution** allows easy calculation of cumulative probabilities using tables or computer software.

This figure compares a binomial distribution with a normal distribution. The parameters of the binomial distribution are **p = 0.4 and n = 20** (for instance, we might take samples of 20 items from a production line when the probability that any one item will require further processing is 0.4). To fit a normal distribution we need to know the mean and the standard deviation. Remember that the mean of a binomial distribution is **μ = np**, and that the standard deviation for that distribution is **σ = √(np(1 − p))**. For these parameters, μ = (20)(0.4) = 8 and σ = √((20)(0.4)(0.6)) = √4.8 ≈ 2.19.

The **normal distribution** is continuous, whereas the binomial distribution is discrete. Probabilities according to the **binomial distribution** are different from zero only when the number of defectives is a whole number, not when the number is between the whole numbers. On the other hand, if we integrate the normal distribution only for limits infinitesimally apart around the whole numbers, the area under the curve will be infinitesimally small. Then the corresponding probability will be zero.

The common solution is to integrate for wider steps, which together cover the whole range. We set limits for integration of the normal distribution halfway between possible values of the **discrete variable**. This modification is called the correction for continuity.
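A short sketch of the correction for continuity, using the binomial parameters from the text (n = 20, p = 0.4). The particular values checked (exactly 8 items, and at most 10 items) are chosen here only for illustration.

```python
import math
from statistics import NormalDist

n, p = 20, 0.4
mu = n * p                           # binomial mean, np
sigma = math.sqrt(n * p * (1 - p))   # binomial standard deviation, sqrt(np(1 - p))
approx = NormalDist(mu, sigma)       # fitted normal distribution


def binomial_pmf(k: int) -> float:
    """Exact binomial probability of exactly k successes in n trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


# P(X = 8): integrate the normal density from 7.5 to 8.5.
exact_point = binomial_pmf(8)
approx_point = approx.cdf(8.5) - approx.cdf(7.5)

# P(X <= 10): the upper limit becomes 10.5, halfway between 10 and 11.
exact_cum = sum(binomial_pmf(k) for k in range(0, 11))
approx_cum = approx.cdf(10.5)

print(f"P(X = 8):   exact {exact_point:.4f}, normal approximation {approx_point:.4f}")
print(f"P(X <= 10): exact {exact_cum:.4f}, normal approximation {approx_cum:.4f}")
```

Without the half-unit shift the point probability would be an integral over a zero-width interval, which is exactly the problem described above.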

**Fitting the Normal Distribution to Cumulative Frequency Data**

**(a) Cumulative Normal Probability and Normal Probability Paper**

Instead of comparing a frequency distribution or probability distribution to a **normal probability distribution** using a **histogram** or the equivalent, often a better alternative is to compare graphically using cumulative probabilities.

On a linear scale, the cumulative normal probability plotted against x gives an S-shaped curve, and departures from such a curve are hard to judge by eye. However, the scale can be modified (or distorted) to give a more convenient comparison. The scale is modified in such a way that cumulative probability plotted against x or z will give a straight line for a normal distribution. A frequency distribution will still show random variations, but real departure from a normal distribution is much easier to spot.

Thus, **cumulative relative frequencies** (on the modified scale) are plotted versus the variable, x, on a linear scale. If the data came from a normal distribution, this plot will give approximately a straight line. If the underlying distribution is appreciably different from a normal distribution, larger deviations and systematic variations will be present.

**Graph paper** using such a modified or distorted scale for cumulative relative frequency, and a uniform scale for the measured variable, is called normal probability paper. This special type of commercial graph paper, like the special types for logarithmic and log-log scales, is available from many suppliers. Commercial normal probability paper comes with a distorted scale for relative cumulative frequency along one axis and corresponding unequally spaced grid lines.

The other scale (with corresponding grid lines) is uniform. **Points** are plotted by hand on this paper with co-ordinates corresponding to relative cumulative frequency (on the distorted scale) versus the value of the variable (on the linear scale). In most cases we will use data from a grouped frequency distribution. Since normal probability paper uses cumulative frequency or probability, data from a grouped frequency distribution should be plotted versus class boundaries, not class midpoints.

The points so plotted can be compared with the **straight line** representing a **normal distribution** fitted to the data and so having the same mean and standard deviation. Since the median of a normal distribution is equal to its mean, one point on this line should be at 50% relative cumulative frequency and x̄, the estimated mean. Another point should be at 97.7% relative cumulative frequency and (x̄ + 2s); a third should be at 2.3% relative cumulative frequency and (x̄ – 2s).
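The distorted scale of normal probability paper is equivalent to converting each relative cumulative frequency into the z value that has that cumulative probability. The sketch below does this conversion for hypothetical grouped data, with an assumed mean and standard deviation estimated beforehand. It computes plotting coordinates only and draws nothing.

```python
from statistics import NormalDist

std_normal = NormalDist()

# Hypothetical grouped data (not from the text), plotted at upper class boundaries.
# The last class is omitted because 100% cannot be placed on the probability scale.
upper_boundaries = [20.0, 30.0, 40.0, 50.0]
cumulative_freq = [4, 16, 36, 46]
total = 50
x_bar, s = 34.6, 10.49  # assumed estimates of the mean and standard deviation

# Position of each point on the distorted (probability) axis, in z units.
z_observed = [std_normal.inv_cdf(c / total) for c in cumulative_freq]

# The fitted normal distribution is the straight line z = (x - x_bar) / s.
z_line = [(x - x_bar) / s for x in upper_boundaries]
deviations = [zo - zl for zo, zl in zip(z_observed, z_line)]

# Reference points on the line at x_bar - 2s, x_bar and x_bar + 2s.
reference = {k: std_normal.cdf(k) for k in (-2.0, 0.0, 2.0)}

for x, zo, zl in zip(upper_boundaries, z_observed, z_line):
    print(f"x = {x:5.1f}: observed z = {zo:6.3f}, fitted line z = {zl:6.3f}")
for k, prob in reference.items():
    print(f"x_bar {k:+.0f}s -> {100.0 * prob:.1f}% relative cumulative frequency")
```

Small, patternless deviations from the line are consistent with a normal distribution, while a systematic curve in the points suggests a different underlying distribution.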

**Transformation of Variables to Give a Normal Distribution**

If the original variable shows a distribution which is not a normal distribution, it is very useful to try to change the variable so that the new form will follow a normal distribution. This strategy is often successful if the original distribution showed a **single mode** somewhere between the smallest and largest values of the variable, but the original distribution was not **symmetrical**.

If the original variable was x, forms of the new variable to try include log x and 1/x. The most common transformation for this purpose is replacing x by ln x, log10 x, or the logarithm of x to any other base.
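One way to judge whether a transformation helped is to compare a measure of asymmetry before and after. The sketch below uses the moment coefficient of skewness, which is an assumption of this example (the text does not name a criterion), on made-up right-skewed data.

```python
import math


def skewness(values: list[float]) -> float:
    """Moment coefficient of skewness: third central moment over (second)^1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical data with a single mode and a long right tail.
x = [2.1, 2.9, 3.4, 4.0, 4.8, 5.9, 7.5, 9.8, 14.2, 25.0]
ln_x = [math.log(v) for v in x]  # transformed variable, ln x

skew_raw = skewness(x)
skew_log = skewness(ln_x)
print(f"skewness of x:    {skew_raw:.2f}")
print(f"skewness of ln x: {skew_log:.2f}")
```

A skewness closer to zero after the transformation indicates a more symmetrical distribution, though a normal probability plot of the transformed values remains the more direct check.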