How to Calculate Standard Deviation in Excel

Standard deviation is a key statistical measure that helps you understand how spread out the values in a dataset are. In simpler terms, it shows how much the numbers differ from the average (mean). Whether you’re working in education, finance, or research, understanding standard deviation allows you to analyze trends and make informed decisions. This article explains how to calculate standard deviation in Excel for both sample and population data, with examples and practical tips.

What is Standard Deviation?

Standard deviation is a descriptive statistic that measures the dispersion, or spread, of the values in a dataset. It indicates how much the data points deviate from the mean (average) value. A higher standard deviation indicates that data points are more spread out from the mean, while a lower one indicates that they cluster closely around it.

There are two ways to calculate the standard deviation: for a sample and for a population. A population includes all the data points of the case you are studying, whereas a sample is a subset of that population. The two formulas differ slightly: the population formula divides the sum of squared deviations by n, while the sample formula divides by n – 1. Dividing by n – 1 (Bessel's correction) compensates for the fact that a sample tends to underestimate the variability of the population it was drawn from.
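As a rough illustration of the difference between the two formulas, the Python sketch below computes both by hand. The scores are made-up values, not data from this article, and Python is used only as a convenient way to check the arithmetic outside Excel.

```python
import math
import statistics

# Hypothetical quiz scores, used only for illustration.
scores = [62, 70, 75, 81, 92]

n = len(scores)
mean = sum(scores) / n
squared_deviations = sum((x - mean) ** 2 for x in scores)

# Population formula: divide by n (the calculation STDEV.P performs).
population_sd = math.sqrt(squared_deviations / n)

# Sample formula: divide by n - 1 (the calculation STDEV.S performs).
sample_sd = math.sqrt(squared_deviations / (n - 1))

print(round(population_sd, 4), round(sample_sd, 4))
```

For the same data, the sample value is always slightly larger than the population value, and the gap shrinks as n grows.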

Types of Standard Deviation Formulas in Excel

Excel has different functions for calculating standard deviation, depending on whether your data is a sample or the entire population. Because these measures feed into decisions, selecting the correct function matters: for example, using the population function on sample data understates the spread.

Function | Sample/Population | Handles Logical/Text Values | Notes
STDEV.S | Sample | No | Recommended for numeric sample data
STDEVA | Sample | Yes | Treats TRUE = 1, FALSE = 0
STDEV | Sample | No | Legacy function
STDEV.P | Population | No | Recommended for numeric population data
STDEVPA | Population | Yes | Includes TRUE/FALSE values

Calculating Sample Standard Deviation

When working with a sample, there are several formulas provided by Excel. These include STDEV.S, STDEVA, and STDEV.

  1. STDEV.S

This function calculates the standard deviation using only the numeric values present in the data. It completely ignores any text or logical values (e.g., TRUE and FALSE) in the referenced range.

  2. STDEVA

This function calculates the standard deviation for a sample and also counts logical values and text entries in the range. It treats TRUE as 1, and FALSE and text entries as 0, and then calculates the standard deviation.

  3. STDEV

This is a legacy function, used in older versions to calculate the standard deviation of a sample. STDEV is available for backwards compatibility, but is not recommended in newer Excel versions. Use STDEV.S instead.

Calculating Population Standard Deviation

When working with the entire population, Excel provides two major functions to calculate the standard deviation: STDEV.P and STDEVPA.

  1. STDEV.P

Like STDEV.S, this function uses only the numeric values present in the data, but it calculates the standard deviation of the entire population. It ignores text and logical values in the referenced range.

  2. STDEVPA

This function calculates the standard deviation of the entire population and also counts logical values and text entries in the range. Here too, TRUE is treated as 1, and FALSE and text entries are treated as 0.
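The difference between the numeric-only functions and the "A" variants can be sketched in Python. The two helper functions below are hypothetical and only approximate how Excel reads a range; they are not part of any Excel API, and the cell values are invented.

```python
import statistics

def numeric_only(cells):
    """Keep plain numbers only, roughly how STDEV.S / STDEV.P read a range."""
    return [c for c in cells
            if isinstance(c, (int, float)) and not isinstance(c, bool)]

def coerce_like_a_functions(cells):
    """Roughly how STDEVA / STDEVPA read a range:
    TRUE -> 1, FALSE -> 0, text -> 0, empty cells (None) skipped."""
    coerced = []
    for c in cells:
        if c is None:
            continue  # empty cell: skipped
        if isinstance(c, bool):
            coerced.append(1 if c else 0)
        elif isinstance(c, str):
            coerced.append(0)
        else:
            coerced.append(c)
    return coerced

# Hypothetical worksheet column with mixed content.
cells = [10, 20, True, "n/a", None, 30]

print(numeric_only(cells))             # [10, 20, 30]
print(coerce_like_a_functions(cells))  # [10, 20, 1, 0, 30]

stdev_s_like = statistics.stdev(numeric_only(cells))
stdeva_like = statistics.stdev(coerce_like_a_functions(cells))
print(stdev_s_like, stdeva_like)
```

The same column therefore produces different results depending on which family of functions is applied to it.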

Examples for Calculating Standard Deviation in Excel

Let us learn and understand how to apply each of these formulas with the help of unique examples. The examples given below are performed on Excel 2021.

Example 1: Sample Standard Deviation

A sample dataset is used when the size of the population is so large that collecting the data from individuals is impractical. This is because the task becomes tedious, time-consuming, and prone to errors. In these situations, we make informed predictions or inferences about the entire population based on the calculations of the samples.

Example:

Consider a situation where an EdTech organisation wants to calculate the standard deviation of quiz scores from a nationwide assessment conducted across India. Since the quiz was taken by students from thousands of schools across India, collecting and analysing every student’s score would be time-consuming and difficult to manage. Instead, we can collect a few representative scores, such as the top 2, middle 3, and bottom 2, from each participating school and make predictions on the whole population.

Formula Used: =STDEV.S(B2:B15)

[Figure: sample standard deviation formula and its output]

Explanation: Using the formula, the standard deviation was calculated as 18.3045.

Example 2: Population Standard Deviation

Although population datasets tend to be large, there are times when it is important to analyse the whole dataset because every data point contributes to the final decision. Take a manufacturing factory as an example. The supervisor has to calculate the standard deviation of the production times for all the units the factory produces in a day. This helps the executives understand the efficiency of the factory’s production unit. Every unit must be included; leaving out even a single one may lead to the wrong conclusion about the performance of the production unit.

Example Scenario:

Consider a scenario where an EdTech organisation needs to assess the performance ratings of all its instructors for compliance reporting or internal audits. They will consider all the ratings to determine an accurate rating of the instructor and assess whether to continue the contract or terminate it.

Note: Here, for simplicity, we are using a small dataset with just a few ratings as opposed to population data containing hundreds or thousands of entries.

Formula Used: =STDEV.P(B2:F4)

[Figure: population standard deviation formula and its output]

Explanation: Here, we passed the whole block of values from cell B2 to cell F4 to the formula. The result is approximately 0.196.

Example 3: Handling Logical Values

When you are dealing with real-life data, some columns may contain logical values (TRUE and FALSE), like the purchase column in real estate datasets. STDEVA and STDEVPA treat TRUE as 1 and FALSE as 0, whereas STDEV.S and STDEV.P skip logical values in a range. Including TRUE/FALSE values can therefore change the result significantly, especially in educational data where course completion or participation is often logged as Boolean values.

Example Scenario:

An EdTech organisation offers a new online course and wants to calculate the sample standard deviation of the course completion status of the students who enrolled. Students who completed the course are marked as TRUE, and students who did not are marked as FALSE.

Formula Used: =STDEVA(B2:B7)

[Figure: handling logical values, formula and its output]

Explanation: In this example, the data consists of logical values. To calculate the standard deviation, the formula treated TRUE as 1 and FALSE as 0. The result was 0.547722558.

Similarly, for the population dataset, you can replace the =STDEVA() with =STDEVPA().
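The reported value can be cross-checked outside Excel. The worksheet cells are not listed here, so the sketch below assumes a column of three TRUE and three FALSE entries, which is one arrangement consistent with the result above.

```python
import math
import statistics

# Assumed column B2:B7: three TRUE and three FALSE entries
# (an assumption; the actual cells are not shown in the article).
completed = [True, False, True, False, True, False]

# Map TRUE -> 1 and FALSE -> 0, as STDEVA and STDEVPA do.
as_numbers = [1 if flag else 0 for flag in completed]

sample_sd = statistics.stdev(as_numbers)        # comparable to =STDEVA(B2:B7)
population_sd = statistics.pstdev(as_numbers)   # comparable to =STDEVPA(B2:B7)

print(round(sample_sd, 9))  # 0.547722558
print(population_sd)
```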

Example 4: Population Standard Deviation with Mixed Data Types

Example Scenario:

An EdTech organisation stores the number of students enrolled in each of its online courses. Some entries may contain text (e.g., “Pending”) instead of numeric values. STDEV.S and STDEV.P ignore these non-numeric entries. STDEVA and STDEVPA, in contrast, count text and FALSE as 0. Blank cells are skipped by all four functions.

[Figure: population standard deviation with mixed data types, formulas and their output]

Explanation: We used two formulas to calculate the standard deviation. The =STDEV.P formula ignored the text value (so it used only three values), and the standard deviation came out to be 16.32993. The =STDEVPA formula, in contrast, treated “Pending” as 0 (so it used four values), and the result was 87.74964.
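To see why one text entry changes the result so much, here is a small Python check. The enrolment counts 180, 200, and 220 are hypothetical; they were chosen because they reproduce the two reported results, and the article's actual cells may differ.

```python
import statistics

# Hypothetical enrolment counts (an assumption, not the article's data).
enrolments = [180, 200, 220]

# Like STDEV.P: the "Pending" cell is skipped, so only three values are used.
ignoring_text = statistics.pstdev(enrolments)

# Like STDEVPA: the "Pending" cell counts as 0, so four values are used.
text_as_zero = statistics.pstdev(enrolments + [0])

print(round(ignoring_text, 5), round(text_as_zero, 5))  # 16.32993 87.74964
```

The extra 0 sits far from the other values, which pulls the mean down and inflates every deviation.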

How to Manage Outliers in Standard Deviation Calculations

Excel calculates the standard deviation using all the values in the given range. Sometimes this range contains outliers, which are values that are significantly higher or lower than the rest of the data. These values can greatly skew the result and lead to misleading conclusions, especially in sensitive fields like education, healthcare, or finance, where data fluctuates frequently.

It is important to handle outliers before calculating the standard deviation.

1. Use Conditional Formatting to Spot Them

You can highlight and filter out extreme values using Conditional Formatting. You can set custom thresholds to visually flag outliers.

2. Manually Remove or Correct Outliers

After highlighting the outliers, check each point individually based on the context of your research and analysis. If a data point is an error (e.g., typing 900 instead of 90), correct or remove it before calculation.

Caution: Don’t remove outliers without understanding their context—they might represent critical edge cases.

3. Use Filters or Helper Columns

You can use Excel’s filter option to temporarily exclude outliers and compute the standard deviation on the remaining dataset.

4. Interquartile Range (IQR) Method

There is a way to mathematically calculate and find the outliers. You can do this in the following steps:

  • Calculate Q1 (25th percentile) and Q3 (75th percentile)
  • Find IQR = Q3 – Q1
  • Any value less than Q1 – 1.5(IQR) or greater than Q3 + 1.5(IQR) is considered an outlier

You can calculate the quartiles in Excel with the QUARTILE.INC function (QUARTILE is the older, equivalent name): use =QUARTILE.INC(range, 1) for Q1 and =QUARTILE.INC(range, 3) for Q3.
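The IQR steps above can be sketched in Python. The scores are invented, and the sketch relies on the "inclusive" method of statistics.quantiles, which should use the same interpolation as QUARTILE.INC; treat that equivalence as an assumption to verify against your own workbook.

```python
import statistics

# Hypothetical scores with one suspicious entry (900, perhaps a typo for 90).
scores = [52, 55, 58, 60, 61, 63, 65, 900]

# Quartiles with inclusive interpolation (assumed to match QUARTILE.INC).
q1, _median, q3 = statistics.quantiles(scores, n=4, method="inclusive")

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [x for x in scores if x < lower_fence or x > upper_fence]
kept = [x for x in scores if lower_fence <= x <= upper_fence]

print(q1, q3, outliers)
print(statistics.stdev(scores), statistics.stdev(kept))
```

Comparing the two standard deviations on the last line shows how strongly a single extreme value can inflate the result.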

Note: Before removing any value, understand whether it’s truly an outlier or a valid but rare data point. Sometimes, outliers that carry important meaning must be retained in the dataset while calculating statistical measures.

Understanding the Empirical Rule in Standard Deviation

The empirical rule, also known as the 68-95-99.7 rule, helps you understand how data is distributed around the mean in a normal distribution. This rule says that approximately:

  • 68% of the data lies within 1 standard deviation of the mean.
  • 95% of the data lies within 2 standard deviations.
  • 99.7% of the data lies within 3 standard deviations.

Example:

If the average test score is 70 and the standard deviation is 5, this means that:

  • About 68% of students will score between 65 and 75
  • About 95% will score between 60 and 80
  • About 99.7% will score between 55 and 85

This rule provides a clear picture of data spread, but it only holds when the data is approximately normally distributed.
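The ranges in this example can be reproduced with a short Python sketch that uses the standard library's NormalDist class. It assumes a perfectly normal distribution with mean 70 and standard deviation 5, which real test scores will only approximate.

```python
from statistics import NormalDist

mean, sd = 70, 5
scores = NormalDist(mu=mean, sigma=sd)

shares = {}
for k in (1, 2, 3):
    low, high = mean - k * sd, mean + k * sd
    # Share of the distribution that falls between low and high.
    share = scores.cdf(high) - scores.cdf(low)
    shares[k] = (low, high, share)
    print(f"within {k} SD: {low} to {high} covers {share:.1%}")
```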

Steps to Add Standard Deviation Bars in Excel Charts

Adding standard deviation bars to your charts helps visually represent the spread of data points and variability within your dataset. This makes your presentation easy to understand, even to those who don’t have any technical knowledge of standard deviation.

Step 1: Calculate the Standard Deviation for your Data

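Use the appropriate function (for example, =STDEV.S for sample data or =STDEV.P for population data) to calculate the standard deviation of the values you plan to chart.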

Step 2: Create a Chart

  • Select the data range
  • Go to the Insert tab in the Excel ribbon.
  • Choose a Chart Type (such as Column or Line Chart) from the Charts section.
  • In the above example, a line chart would be best to showcase the ups and downs of the data.

Step 3: Add Error Bars

  • Once your chart is selected, go to the Chart Tools section and click on the “Chart Design” tab.
  • Click Add Chart Element > Error Bars > More Error Bars Options.

Step 4: Customise the Error Bar

  • In the Format Error Bars pane, select Custom under the Error Amount section.
  • Click Specify Value to enter the custom values for the error bars.
  • In the Positive Error Value and Negative Error Value boxes, input the standard deviation value that you calculated in step 1. Ensure the values are entered in array format using curly braces (e.g., {5, 5, 5, 5} if all values have the same deviation). You can also reference a cell range.
  • You can further format the line, style, width, etc., of the error bar.

Step 5: Interpret the Standard Deviation Bars

The standard deviation bars represent how much the quiz scores deviate from the mean. A larger bar indicates a wider spread of scores, and a smaller bar indicates that the scores are more tightly grouped around the mean.

How to Interpret Standard Deviation Results

Knowing the standard deviation value helps you interpret your data in several ways:

1. Understanding the spread of data

  • Values required: Mean of the data and Standard deviation of the data
  • While there’s no fixed threshold, a common guideline compares the SD with the mean (the ratio SD / mean is known as the coefficient of variation):

High SD: If the SD is greater than 50% of the mean value, it indicates a high spread.

Low SD: If the SD is less than 50% of the mean value, it indicates a low spread.
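A minimal Python sketch of this guideline follows. The function name spread_label and both datasets are hypothetical, the 50% threshold is the rule of thumb quoted above rather than a statistical standard, and the ratio is only meaningful when the mean is positive.

```python
import statistics

def spread_label(values, threshold=0.5):
    """Return (SD / mean, label) using the 50% rule of thumb."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    ratio = sd / mean  # assumes a positive mean
    return ratio, ("high" if ratio > threshold else "low")

# Hypothetical datasets for illustration.
tight = [68, 70, 72, 71, 69]
wide = [5, 40, 90, 150, 10]

tight_ratio, tight_label = spread_label(tight)
wide_ratio, wide_label = spread_label(wide)
print(round(tight_ratio, 3), tight_label)
print(round(wide_ratio, 3), wide_label)
```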

2. Identifying Outliers

  • Values required: mean ± 3 SDs
  • If a data point lies outside the range of the mean ± 3 SDs, it is highly likely to be an outlier.
  • If adding a data point significantly increases or decreases the standard deviation value, then that data value is highly likely an outlier.
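The mean ± 3 SDs check can be sketched in Python as follows. The scores are invented. Note that in very small datasets a single extreme value inflates the SD so much that it may never fall outside three SDs, so this check works better on larger ranges.

```python
import statistics

# Hypothetical quiz scores: 19 ordinary values plus one extreme entry.
scores = [70, 72, 68, 71, 69, 73, 67, 70, 71, 69,
          72, 68, 70, 71, 69, 70, 72, 68, 71, 150]

mean = statistics.mean(scores)
sd = statistics.stdev(scores)

lower = mean - 3 * sd
upper = mean + 3 * sd

# Values outside mean ± 3 SDs are flagged as likely outliers.
flagged = [x for x in scores if x < lower or x > upper]
print(round(lower, 2), round(upper, 2), flagged)

# Second check from the list above: how much does the SD change without them?
remaining = [x for x in scores if x not in flagged]
print(round(sd, 2), round(statistics.stdev(remaining), 2))
```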

Difference between Standard Deviation and Standard Error

The table below summarises the key differences between Standard Deviation (SD) and Standard Error (SE).

Feature | Standard Deviation (SD) | Standard Error (SE)
What it measures | Dispersion or spread of individual data points | Precision of the sample mean as an estimate of the population mean
Formula | √(∑(xi − μ)² / N) | SD / √n
Use | Describes the variability within a dataset | Describes how reliably the sample mean estimates the population mean
Applies to | Population or sample dataset | Sample mean estimate of the population mean
Interpretation | Higher SD means greater variability among data points | Higher SE means less precision in estimating the population mean
Example Use Case | Variation in quiz scores or sales figures | Estimating the population mean from a sample mean
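To make the relationship between the two measures concrete, the Python sketch below computes both for a made-up sample. In Excel, the same standard error could be obtained with a formula along the lines of =STDEV.S(range)/SQRT(COUNT(range)).

```python
import math
import statistics

# Hypothetical sample of quiz scores.
sample = [62, 70, 75, 81, 92, 68, 77, 85]

sd = statistics.stdev(sample)      # spread of the individual scores
se = sd / math.sqrt(len(sample))   # precision of the sample mean

print(round(sd, 3), round(se, 3))
```

Because SE divides by √n, it shrinks as the sample grows, while SD tends to stay roughly the same.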

Common Mistakes to Avoid While Calculating the Standard Deviation in Excel

1. Using Non-Boolean Categorical Values in =STDEVA

In Excel, the STDEVA function treats the logical values TRUE and FALSE as 1 and 0, respectively. However, non-Boolean text entries like “Yes” or “Pending” are all treated as 0, which distorts the result and even yields a standard deviation of zero when every entry is text. To avoid this problem, make sure the dataset contains actual TRUE or FALSE values when using =STDEVA. If ‘Yes’ and ‘No’ are used, convert them to TRUE and FALSE before calculating.

2. Mixing Data Types in Standard Deviation Calculations

Ensure that your data range contains consistent data types (e.g., only numbers or only Boolean values) when calculating the standard deviation in Excel. Combining different data types, such as text or logical values with numbers, can produce incorrect results. Excel disregards text values and empty cells if you include them in the range for =STDEV.S or =STDEV.P. This means the result is based on fewer values than you expect, which matters most when there are many non-numeric entries. Clean your data before calculating to prevent this problem.

3. Ignoring Empty Cells

Keep in mind that standard deviation functions such as =STDEV.S and =STDEVA skip empty cells rather than counting them as 0. Error values are different: if the range contains an error (like #DIV/0!), the formula returns an error as well. Check that each empty cell really is a missing value and not a forgotten 0, and fix or remove any error values before calculating.

Practical Applications of Standard Deviation

Standard deviation is one of the most common statistical tools, used by everyone from students to professionals to analyse data. Some of its practical applications are as follows:

  1. In engineering projects, precise measurements are crucial. Standard deviation can be used to assess the consistency of measurements in experiments. For example, if you’re measuring the dimensions of a product, a low standard deviation indicates that the measurements are consistent, while a high standard deviation may signal measurement errors or inconsistencies in the manufacturing process.
  2. In manufacturing industries, engineers and manufacturers use standard deviation to monitor product quality. When producing parts like nuts and bolts, the standard deviation of their sizes helps ensure that all products meet the required specifications. A low SD indicates that all the products are of the same size.
  3. In data science and machine learning, standard deviation helps evaluate the spread of data points. When training a machine learning model, to understand how widely data points vary around the mean, data scientists use the measure of standard deviation. This helps in making more accurate predictions or understanding the behaviour of the system being modelled.

Conclusion

In this article, you explored the various methods for calculating standard deviation in Excel. You learnt about the two main types of datasets, namely sample and population, and how to calculate the standard deviation for each. Then we discussed how outliers affect the standard deviation calculation and how you can manage them. Additionally, you visualised the standard deviation, helping you create presentations that are easier to understand and communicate. You are now aware of the difference between standard deviation and standard error. For beginners, we also highlighted common mistakes to avoid when calculating standard deviation in your projects. This article summarises the importance of standard deviation and how you can leverage it to draw meaningful insights from your data.

How to Calculate Standard Deviation in Excel – FAQs

Q1. What is the formula for calculating SD?

The formula for the population standard deviation is σ = √(Σ(xᵢ – μ)² / n), where σ is the standard deviation, xᵢ is each data point in the set, μ is the mean of the data set, n is the number of data points in the set, and Σ represents the sum of all values following it. For a sample, divide by n – 1 instead of n.

Q2. Do I use STDEV.P or STDEV.S?

You should use STDEV.P for population data and STDEV.S for sample data.

Q3. What is the formula for STDEV.S in Excel?

In Excel, the formula for STDEV.S is =STDEV.S(range), which calculates the standard deviation for sample data.

Q4. How to calculate the standard deviation in a sheet?

In Excel, you can use the formula =STDEV.S(range) for sample data or =STDEV.P(range) for population data.

Q5. What is the standard deviation of 1, 2, 3, 4, 5?

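The population standard deviation of 1, 2, 3, 4, 5 is approximately 1.41, which is calculated using the formula σ = √(Σ(xᵢ – μ)² / n) and matches =STDEV.P. The sample standard deviation, which divides by n – 1 and matches =STDEV.S, is approximately 1.58.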
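Both values are easy to confirm with a quick check using Python's standard library, where pstdev divides by n and stdev divides by n – 1:

```python
import statistics

values = [1, 2, 3, 4, 5]

population_sd = statistics.pstdev(values)  # divides by n, like STDEV.P
sample_sd = statistics.stdev(values)       # divides by n - 1, like STDEV.S

print(round(population_sd, 2))  # 1.41
print(round(sample_sd, 2))      # 1.58
```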

About the Author

Principal Data Scientist, Accenture

Meet Akash, a Principal Data Scientist with expertise in advanced analytics, machine learning, and AI-driven solutions. With a master’s degree from IIT Kanpur, Aakash combines technical knowledge with industry insights to deliver impactful, scalable models for complex business challenges.
