What is Normal Distribution?

Normal distribution and its characteristics

The normal distribution is the most important of all probability distributions. It is applied directly to many practical problems, and several very useful distributions are based on it.

Characteristics

Many empirical frequency distributions have the following characteristics:

  1. They are approximately symmetrical, and the mode is close to the centre of the distribution.
  2. The mean, median, and mode are close together.
  3. The shape of the distribution can be approximated by a bell: nearly flat on top, then decreasing more quickly, then decreasing more slowly toward the tails of the distribution. This implies that values close to the mean are relatively frequent, and values farther from the mean tend to occur less frequently. Remember that we are dealing with a random variable, so a frequency distribution will not fit this pattern exactly. There will be random variations from this general pattern.

[Figure: histogram of the thickness of a metal part]
A theoretical distribution that has the stated characteristics and can be used to approximate many empirical distributions was devised more than two hundred years ago. It is called the “normal probability distribution,” or the normal distribution. It is sometimes called the Gaussian distribution.

Probability from the Probability Density Function

The probability density function for the normal distribution is given by:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

where μ is the mean of the theoretical distribution, σ is the standard deviation, and π = 3.14159… This density function extends from –∞ to +∞. Its shape is shown in the figure:
[Figure: shape of the normal distribution]
Because the normal probability density function is symmetrical, the mean, median and mode coincide at x = μ. Thus, the value of μ determines the location of the center of the distribution, and the value of σ determines its spread.
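As a rough illustration of how μ and σ control the curve, the density of a normal distribution can be evaluated directly. The sketch below is only an example: the values of mu and sigma are hypothetical, and the helper name normal_pdf is introduced here for illustration. It checks that the density is symmetrical about μ and highest at x = μ, and cross-checks the result against NormalDist from Python's standard library.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)


# Hypothetical parameters, chosen only for illustration.
mu, sigma = 10.0, 2.0

peak = normal_pdf(mu, mu, sigma)         # density at the centre, x = mu
left = normal_pdf(mu - 1.5, mu, sigma)   # same distance below the mean ...
right = normal_pdf(mu + 1.5, mu, sigma)  # ... and above it

# Cross-check the hand-written formula against the standard library.
reference = NormalDist(mu, sigma).pdf(mu)

print(f"peak={peak:.5f} left={left:.5f} right={right:.5f} reference={reference:.5f}")
```

Increasing sigma in this sketch lowers the peak and widens the curve, while changing mu only shifts it along the x axis.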
We have seen that probabilities for a continuous random variable are given by integration of the probability density function.
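The probability that a variable, X, is between x1 and x2 according to the normal distribution is given by:

$$P(x_1 < X < x_2) = \int_{x_1}^{x_2} \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) dx$$

This probability is the area under the density curve between x1 and x2:

[Figure: probability of X between x1 and x2]

[Figure: cumulative normal probability]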
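The integral of the density between x1 and x2 has no closed form, so in practice it is obtained from the cumulative probability or by numerical integration. The following sketch, under assumed parameter values, computes the probability both ways: as a difference of cumulative probabilities, and with a simple trapezoidal rule. The function names and the values of mu, sigma, x1 and x2 are hypothetical.

```python
from statistics import NormalDist


def prob_between(x1, x2, mu, sigma):
    """P(x1 < X < x2) as a difference of cumulative probabilities."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(x2) - dist.cdf(x1)


def prob_between_numeric(x1, x2, mu, sigma, steps=10_000):
    """P(x1 < X < x2) by trapezoidal integration of the density."""
    dist = NormalDist(mu, sigma)
    h = (x2 - x1) / steps
    total = 0.5 * (dist.pdf(x1) + dist.pdf(x2))
    for i in range(1, steps):
        total += dist.pdf(x1 + i * h)
    return total * h


# Hypothetical example: limits one standard deviation either side of the mean.
mu, sigma = 10.0, 2.0
p_cdf = prob_between(8.0, 12.0, mu, sigma)
p_num = prob_between_numeric(8.0, 12.0, mu, sigma)

print(f"from cumulative probabilities: {p_cdf:.4f}")
print(f"from numerical integration:    {p_num:.4f}")
```

Both routes should agree closely; the cumulative-probability route is the one that tables and software functions support directly.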

Using Tables for the Normal Distribution

Table A1 gives values of the cumulative normal probability as a function of z, the number of standard deviations from the mean. Part of Table A1 is shown below.

[Table: part of Table A1]

Suppose we want Φ(–0.76). We look for the row labeled z0 = –0.7 along the side and the column labeled Δz = –0.06 along the top (since –0.76 = (–0.7) + (–0.06)) and read Φ(–0.76) = 0.2236.
The diagram at the top of the table towards the right indicates that Φ(z) corresponds to the area under the curve to the left of a particular value of z (here z = –0.76).
Because the distribution is symmetrical, there must be a simple relation between Φ(–0.76) and Φ(+0.76), or in general between Φ(–z) and Φ(+z). That relation is:

Φ(−z1) = 1 − Φ(+z1)
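The table lookup and the symmetry relation can be checked numerically. This is a minimal sketch that uses NormalDist from Python's standard library in place of the printed table; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

phi_neg = std_normal.cdf(-0.76)  # area to the left of z = -0.76
phi_pos = std_normal.cdf(0.76)   # area to the left of z = +0.76

print(f"Phi(-0.76)     = {phi_neg:.4f}")
print(f"1 - Phi(+0.76) = {1 - phi_pos:.4f}")
```

Rounded to four decimal places, the first value agrees with the table entry 0.2236, and the two printed values coincide as the symmetry relation requires.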

Using the Computer

Cumulative normal probabilities can be obtained from computer software such as Excel. Standard cumulative normal probabilities, Φ(z), can be obtained by the Excel function =NORMSDIST(z), where z = (x − μ)/σ is the standard normal variable. The inverse function is also available in Excel.
If we know a value of the cumulative normal probability, Φ(z), and want to find the value of z to which it applies, we can use the function =NORMSINV(cumulative probability). In both function names the letter “s” stands for the standard form—that is, a relation between Φ and z rather than between Φ and x.
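For readers working outside Excel, the same two operations can be sketched in Python. The functions norm_s_dist and norm_s_inv below are hypothetical names chosen to mirror the Excel functions; they are thin wrappers around NormalDist from the standard library, and the values of x, mu and sigma are made up for the example.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def norm_s_dist(z):
    """Cumulative probability Phi(z), analogous to =NORMSDIST(z)."""
    return std_normal.cdf(z)


def norm_s_inv(cumulative_probability):
    """Value of z for a given Phi(z), analogous to =NORMSINV(p)."""
    return std_normal.inv_cdf(cumulative_probability)


# Hypothetical measurement and distribution parameters.
x, mu, sigma = 13.0, 10.0, 2.0

z = (x - mu) / sigma   # standardize first: note the parentheses
p = norm_s_dist(z)     # cumulative probability up to x
z_back = norm_s_inv(p) # inverse lookup recovers z

print(f"z = {z}, Phi(z) = {p:.4f}, recovered z = {z_back:.4f}")
```

The inverse function only accepts probabilities strictly between 0 and 1, since z tends to –∞ and +∞ at those limits.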

Fitting the Normal Distribution to Frequency Data

A normal distribution is described completely by two parameters: its mean and its standard deviation. Usually, the first step in fitting the normal distribution is therefore to calculate the mean and standard deviation of the other distribution. Then we use these parameters to obtain a normal distribution comparable to the other distribution.

(a) Fitting to a Continuous Frequency Distribution
First, then, we need to estimate the parameters of the normal distribution that will fit the frequency distribution in which we are interested. Then we can compare the normal distribution having those parameters to the corresponding grouped frequency data.
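One way to carry out this comparison is sketched below. The grouped data are hypothetical, invented purely for illustration. The mean and standard deviation are estimated from class midpoints, and the expected frequency in each class is taken as the total count times the normal probability between the class boundaries.

```python
import math
from statistics import NormalDist

# Hypothetical grouped data: (lower boundary, upper boundary, observed frequency).
classes = [
    (2.0, 3.0, 4),
    (3.0, 4.0, 14),
    (4.0, 5.0, 32),
    (5.0, 6.0, 30),
    (6.0, 7.0, 15),
    (7.0, 8.0, 5),
]

n = sum(freq for _, _, freq in classes)
midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]
freqs = [freq for _, _, freq in classes]

# Estimate the two parameters from the grouped data.
mean = sum(m * f for m, f in zip(midpoints, freqs)) / n
variance = sum(f * (m - mean) ** 2 for m, f in zip(midpoints, freqs)) / (n - 1)
std_dev = math.sqrt(variance)

# Expected frequency per class under the fitted normal distribution.
fitted = NormalDist(mean, std_dev)
expected = [n * (fitted.cdf(hi) - fitted.cdf(lo)) for lo, hi, _ in classes]

print(f"mean = {mean:.3f}, standard deviation = {std_dev:.3f}")
for (lo, hi, obs), exp in zip(classes, expected):
    print(f"{lo:.1f} to {hi:.1f}: observed {obs:3d}, expected {exp:6.1f}")
```

The expected frequencies sum to slightly less than the total count, because the fitted normal distribution places a small probability outside the range of the observed classes.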

(b) Fitting to a Discrete Frequency Distribution
If the distribution to which we compare a normal distribution is discrete, we need a correction for continuity, because the normal distribution is continuous. The correction for continuity will be examined in the next section, in which the discrete binomial distribution is approximated by a normal distribution.

Normal Approximation to a Binomial Distribution

It is often desirable to use the normal distribution in place of another probability distribution. In particular, it is convenient to replace the binomial distribution with the normal when certain conditions are met. Remember, though, that the binomial distribution is discrete, whereas the normal distribution is continuous.
The shape of the binomial distribution varies considerably according to its parameters, n and p. If the parameter p, the probability of “success” (or a defective item or a failure, etc.) in a single trial, is sufficiently small (or if q = 1 – p is sufficiently small), the distribution is usually unsymmetrical. If p or q is sufficiently small and if the number of trials, n, is large enough, a binomial distribution can be approximated by a Poisson distribution.
On the other hand, if p is sufficiently close to 0.5 and n is sufficiently large, the binomial distribution can be approximated by a normal distribution. Under these conditions the binomial distribution is approximately symmetrical and tends toward a bell shape. A larger value of n allows greater departure of p from 0.5; a binomial distribution with very small p (or p very close to 1) can be approximated by a normal distribution if n is very large. If n is large enough, sometimes both the Poisson approximation and the normal approximation are applicable. In that case, use of the normal approximation is usually preferable because the normal distribution allows easy calculation of cumulative probabilities using tables or computer software.

[Figure: comparison of a binomial distribution with a normal distribution fitted to it]

This figure compares a binomial distribution with a normal distribution. The parameters of the binomial distribution are p = 0.4 and n = 20 (for instance, we might take samples of 20 items from a production line when the probability that any one item will require further processing is 0.4). To fit a normal distribution we need to know the mean and the standard deviation. Remember that the mean of a binomial distribution is μ = np, and that the standard deviation for that distribution is σ = √(np(1 − p)).
The normal distribution is continuous, whereas the binomial distribution is discrete. Probabilities according to the binomial distribution are different from zero only when the number of defectives is a whole number, not when the number is between the whole numbers. On the other hand, if we integrate the normal distribution only for limits infinitesimally apart around the whole numbers, the area under the curve will be infinitesimally small. Then the corresponding probability will be zero.
The common solution is to integrate for wider steps, which together cover the whole range. We set limits for integration of the normal distribution halfway between possible values of the discrete variable. This modification is called the correction for continuity.
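The correction for continuity can be sketched for the case n = 20 and p = 0.4 discussed above. The normal distribution is given the binomial mean np and standard deviation √(np(1 − p)), and the probability of exactly k items is approximated by the normal area between k − 0.5 and k + 0.5. The function names are illustrative only.

```python
import math
from statistics import NormalDist

n, p = 20, 0.4                       # binomial parameters from the text
mu = n * p                           # binomial mean
sigma = math.sqrt(n * p * (1 - p))   # binomial standard deviation
approx = NormalDist(mu, sigma)


def binomial_pmf(k):
    """Exact binomial probability of exactly k successes."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def normal_with_continuity(k):
    """Normal area between k - 0.5 and k + 0.5 (correction for continuity)."""
    return approx.cdf(k + 0.5) - approx.cdf(k - 0.5)


print(" k  binomial  normal")
for k in range(4, 13):
    print(f"{k:2d}  {binomial_pmf(k):.4f}    {normal_with_continuity(k):.4f}")
```

Because the half-unit intervals join end to end, the approximated probabilities for consecutive whole numbers add up to the normal area over the combined range, just as the exact binomial probabilities add.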

[Figure: comparison at n = 10 and p = 0.5]

Fitting the Normal Distribution to Cumulative Frequency Data

(a) Cumulative Normal Probability and Normal Probability Paper
Instead of comparing a frequency distribution or probability distribution to a normal probability distribution using a histogram or the equivalent, often a better alternative is to compare graphically using cumulative probabilities.
On a linear scale, cumulative normal probability plotted against x or z gives an S-shaped curve, which is awkward to compare by eye. However, the scale can be modified (or distorted) to give a more convenient comparison. The scale is modified in such a way that cumulative probability plotted against x or z will give a straight line for a normal distribution. A frequency distribution will still show random variations, but real departure from a normal distribution is much easier to spot.
Thus, cumulative relative frequencies (on the modified scale) are plotted versus the variable, x, on a linear scale. If the data came from a normal distribution, this plot will give approximately a straight line. If the underlying distribution is appreciably different from a normal distribution, larger deviations and systematic variations will be present.
Graph paper using such a modified or distorted scale for cumulative relative frequency, and a uniform scale for the measured variable, is called normal probability paper. This special type of commercial graph paper, like the special types for logarithmic and log-log scales, is available from many suppliers. Commercial normal probability paper comes with a distorted scale for relative cumulative frequency along one axis and corresponding unequally spaced grid lines.
The other scale (with corresponding grid lines) is uniform. Points are plotted by hand on this paper with co-ordinates corresponding to relative cumulative frequency (on the distorted scale) versus the value of the variable (on the linear scale). In most cases we will use data from a grouped frequency distribution. Since normal probability paper uses cumulative frequency or probability, data from a grouped frequency distribution should be plotted versus class boundaries, not class midpoints.
The points so plotted can be compared with the straight line representing a normal distribution fitted to the data and so having the same mean and standard deviation. Since the median of a normal distribution is equal to its mean, one point on this line should be at 50% relative cumulative frequency and x̄, the estimated mean. Another point should be at 97.7% relative cumulative frequency and (x̄ + 2s), where s is the estimated standard deviation; a third should be at 2.3% relative cumulative frequency and (x̄ – 2s).
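The distorted scale of normal probability paper can be imitated in software: each cumulative relative frequency is replaced by the z value that has that cumulative probability, and z is then plotted on an ordinary linear axis. The sketch below assumes this approach and uses hypothetical grouped data; the function name probability_scale is introduced only for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def probability_scale(cum_rel_freq):
    """Position on the distorted axis: the z value with this cumulative probability."""
    return std_normal.inv_cdf(cum_rel_freq)


# Hypothetical data: (upper class boundary, cumulative relative frequency).
# A cumulative frequency of exactly 0 or 1 cannot be placed on this scale.
points = [(3.0, 0.04), (4.0, 0.18), (5.0, 0.50), (6.0, 0.80), (7.0, 0.95)]
plot_coords = [(x, probability_scale(f)) for x, f in points]

# Positions of the three reference percentages on the distorted scale.
z_50 = probability_scale(0.50)
z_977 = probability_scale(0.977)
z_023 = probability_scale(0.023)

for x, z in plot_coords:
    print(f"x = {x:.1f}  ->  z = {z:+.3f}")
print(f"50% -> {z_50:+.3f}, 97.7% -> {z_977:+.3f}, 2.3% -> {z_023:+.3f}")
```

The reference percentages land at approximately 0, +2 and −2 on the transformed axis, which is why the fitted line passes through the estimated mean and through points two standard deviations either side of it.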

[Figure: normal probability paper]

Transformation of Variables to Give a Normal Distribution

If the original variable shows a distribution which is not a normal distribution, it is very useful to try to change the variable so that the new form will follow a normal distribution. This strategy is often successful if the original distribution showed a single mode somewhere between the smallest and largest values of the variable, but the original distribution was not symmetrical.
If the original variable is x, forms of the new variable to try include log x and 1/x. The most common transformation for this purpose is replacing x by ln x, log10 x, or the logarithm of x to any other base.
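The effect of a logarithmic transformation can be illustrated with a simple skewness measure. The data below are hypothetical values, chosen so that their logarithms are roughly symmetrical; the skewness function is one common definition (the mean cubed standardized deviation) and is only a rough check, not a formal test of normality.

```python
import math
from statistics import mean, pstdev


def skewness(values):
    """Mean of cubed standardized deviations; near 0 for symmetrical data."""
    m = mean(values)
    s = pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


# Hypothetical right-skewed data with a single mode.
data = [0.22, 0.37, 0.61, 0.61, 1.00, 1.00, 1.00, 1.65, 1.65, 2.72, 4.48]
log_data = [math.log(v) for v in data]

skew_raw = skewness(data)
skew_log = skewness(log_data)

print(f"skewness of x:    {skew_raw:.3f}")
print(f"skewness of ln x: {skew_log:.3f}")
```

For these values the raw data are clearly skewed to the right, while the transformed data are close to symmetrical. Using log10 x instead of ln x would only rescale the transformed values and would leave the skewness unchanged.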

About the Author

Principal Data Scientist

Meet Akash, a Principal Data Scientist with expertise in advanced analytics, machine learning, and AI-driven solutions. With a master’s degree from IIT Kanpur, Aakash combines technical knowledge with industry insights to deliver impactful, scalable models for complex business challenges.