SQL Aggregate Functions

What do you mean by Aggregate Functions in SQL?

An SQL aggregate function is an out-of-the-box function used in SQL to compute a value from a pool of values and return a summary value. Aggregate functions work against a multi-row result set and return one summary value. In this way, aggregate functions can help summarize a large dataset to provide an overview. They are often used for business intelligence, sales reports, financial summaries, and performance metrics.

Why do we need Aggregate Functions in SQL?

Summarize Large Data Sets: Aggregate functions allow users to generate a single result from several rows, making the data analysis process less complex.
Enhance Query Efficiency: It reduces the count of returned rows, which makes SQL queries much more efficient and effective.
Simplifying Complicated Computation: Aggregate functions can be used to simplify the computing process instead of manually calculating each value.
Support Decision-Making: Produce data of significance that can help in business and operational decisions.
Data Analysis and Reporting: The Aggregate function helps in analyzing trends, such as calculating the total sales over some time.

When do we need to use Aggregate functions in SQL?

We can use aggregate functions to compute total revenue, maximum sales, etc.

Example: Calculating overall sales revenue.

When reports are being generated, financial reports, sales summaries, and performance metrics are produced by businesses and analysts using aggregate functions.

Example: We can use it to calculate the number of orders in an online store

The need to group and summarize data arises when you want an analysis of that data category by some criteria.

Example: Calculating the total sales of each region

To filter grouped data by conditions, you must use the HAVING clause with aggregate functions.

Example: For calculating the average salary of a department whose salary is above 50,000

Before moving forward with the aggregate function, we need to create a table for Device Maintenance.

CREATE TABLE dvc_mnt (

    tckt_id INT PRIMARY KEY,

    dvc_owner VARCHAR(60),

    city VARCHAR(50),

    dvc_type VARCHAR(40),

    issue VARCHAR(100),

    cost DECIMAL(10,2),

    tch_name VARCHAR(50)

);

INSERT INTO dvc_mnt (

    tckt_id, dvc_owner, city, dvc_type, issue, cost, tch_name

) VALUES

(101, 'Sanjay', 'Mysuru', 'Router', 'Frequent disconnections', 750.00, 'Manoj'),

(102, 'Ritika', 'Kochi', 'Smartphone', 'Battery overheating', 1200.00, 'Neeraj'),

(103, 'Aravind', 'Jodhpur', 'Printer', 'Paper jam issue', 950.00, 'Sowmya'),

(104, 'Ishita', 'Trichy', 'Laptop', 'Screen flickering', 3000.00, 'Karthik'),

(105, 'Vimal', 'Guwahati', 'Smartwatch', 'Charging failure', 650.00, 'Divya'),

(106, 'Harshita', 'Raipur', 'Tablet', 'Software crash', 1800.00, 'Ajay'),

(107, 'Tejas', 'Varanasi', 'Monitor', 'No display', 2200.00, 'Reena');

Select * from dvc_mnt

This is how the table looks once it is created and inserted with values.

Aggregate Functions in MySQL

Aggregate functions include MIN(), MAX(), SUM(), AVG(), and COUNT()

MAX() in MySQL

The Largest (maximum) value from a given column is returned by the MAX() function, an aggregate function in MySQL.

Syntax:

SELECT MAX(column_name) AS max_value 

FROM table_name 

[WHERE condition];

Example:

— To find the highest estimated cost

SELECT MAX(cost) AS highest_cost 

FROM dvc_mnt;

Output:

Explanation: Here, to determine the highest estimated cost in the dataset, the MAX(cost) function extracts the largest value from the cost column.

MIN() in MySQL

The Smallest (minimum) value from a given column is returned by the MIN() function, an aggregate function.

Syntax:

SELECT MIN(clm_name) 

FROM table_name 

[WHERE condition];

Example:

-- To find the cheapest cost

SELECT MIN(cost) AS lowest_cost 

FROM dvc_mnt;

Output:

Explanation: Here, the lowest expected cost in the dataset is returned by the MIN(cost) function, which extracts only the smallest value from the cost column.

SUM() in MySQL

In MySQL, the SUM() function is an aggregate function that determines a numeric column’s total sum.

Syntax:

SELECT SUM(cl) 

FROM tab_name

Example:

-- To find the total costs

SELECT SUM(cost) AS total_costs 

FROM dvc_mnt;

Output:

Explanation: Here, by adding up the costs, the SUM(cost) function determines the total cost and returns a single aggregated value.

AVG() in MySQL

AVG() is an aggregate function in SQL that returns the mean (or average) of every value in a numeric column.

Syntax:

SELECT AVG(cl_name) 

FROM tab_name

Example:

-- To find the average expected costs 

SELECT AVG(cost) AS average_cost 

FROM dvc_mnt;

Output:

Explanation: Here, the AVG(cost) function first adds up the cost for each row, creating a summation of all items in the cost column, and calculates the average by dividing the total by the number of rows.

COUNT() in MySQL

COUNT() in MySQL returns the number of non-NULL values in a specified row or column. It is frequently employed to determine how many orders, clients, or records there are in a dataset.

Syntax:

SELECT COUNT(cl) 

FROM tab_name

Example:

-- To find the total number of issues in the dvc_mnt table

SELECT COUNT(*) AS total_issues 

FROM dvc_mnt

Output:

Explanation: Here, the COUNT(*) function returns the total number of issues placed after counting each row in the dvc_mnt table.

Using GROUP BY with Aggregate Function in MySQL

In MySQL, you are able to use the GROUP BY clause and aggregation functions to group a set of rows with the same value in a specific column. Using the GROUP BY clause, we can calculate totals and averages for each group of rows with aggregate functions.

Syntax:

SELECT cl_name, FUN(cl_name)   

FROM tab_name  

GROUP BY cl_name;

Example:

— Query to find the cost of each region

SELECT city, SUM(cost) AS total_cost  

FROM dvc_mnt  

GROUP BY city;

Output:

Explanation: Here, the total costs for each region are determined by SUM(cost), and records are grouped by city using GROUP BY city.

Using HAVING with Aggregate Function in MySQL

In MySQL, after aggregating the columns using SUM(), AVG(), or COUNT(), the HAVING clause is used to filter grouped results.

Syntax:

SELECT cl_name, FUN(cl_name)   

FROM tab_name  

GROUP BY cl_name  

HAVING cond;

Example:

— Query to find the city with total costs greater than 2000

SELECT city, SUM(cost) AS total_cost  

FROM dvc_mnt  

GROUP BY city  

HAVING total_cost > 2000;

Output:

Explanation: Here, the SUM(cost) determines the total costs for each region, and GROUP BY city groups the records. HAVING total_cost > 2000 filters only cities with total cost greater than 2000.

Using DISTINCT with Aggregate Function in MySQL

When using aggregate functions, the DISTINCT keyword ensures that only distinct values are selected in the results.

Syntax:

SELECT AGG_FUN(DISTINCT cl_name)  

FROM tab_name;

Example:

SELECT COUNT(DISTINCT dvc_owner) AS unique_customers  

FROM dvc_mnt;

Output:

Explanation: Here, the COUNT() function counts unique customers and ignores duplicates.

Common Errors to Avoid

Using aggregate functions without GROUP BY: Always use GROUP BY when you are selecting aggregated columns with non-aggregated columns.
Using HAVING without GROUP BY: The HAVING Clause only works with GROUP BY. So, remember to use the GROUP BY Clause when working with the HAVING Clause.
Use DISTINCT properly: The DISTINCT keyword only works on a single column and not on expressions
Using zero for division in AVG(): CASE WHEN can be used to handle NULL values to prevent division by zero.

Real-world Example

University System: A University wants to analyze the performance of its students across different departments.

Example:

CREATE TABLE students (

    student_id INT PRIMARY KEY,

    student_name VARCHAR(100),

    department VARCHAR(50),

    gpa DECIMAL(3,2)

);

INSERT INTO students (student_id, student_name, department, gpa) VALUES

(101, 'Aarav', 'Computer Science', 3.80),

(102, 'Bhavana', 'Mechanical', 3.40),

(103, 'Chirag', 'Electrical', 3.90),

(104, 'Deepika', 'Computer Science', 3.65),

(105, 'Eshan', 'Mechanical', 3.50),

(106, 'Farah', 'Civil', 3.20),

(107, 'Gautam', 'Electrical', 3.95),

(108, 'Harini', 'Computer Science', 3.70),

(109, 'Ishaan', 'Civil', 3.10),

(110, 'Jhanvi', 'Mechanical', 3.55);

SELECT 

    COUNT(student_id) AS total_students,

    AVG(gpa) AS avg_gpa,

    MAX(gpa) AS highest_gpa,

    MIN(gpa) AS lowest_gpa

FROM students;

Output:

Explanation: Here, AVG(GPA) gives the average GPA for all departments, MAX(GPA) and MIN(GPA) give the highest and lowest GPA of the students, respectively, and COUNT(student_id) provides the total number of students.

Best Practices

When selecting non-aggregate columns, use GROUP BY: Always include GROUP BY when using an aggregate function, along with other columns, to avoid errors.
Use HAVING to Filter Aggregates: HAVING will filter the data after it is aggregated, while WHERE filters the data before it is aggregated.
Optimize Performance using Indexing: Indexing a numeric column will increase the speed at which aggregate functions can be calculated.
Provide for NULL: Use COALESCE() or IFNULL() to avoid NULL values that can affect results.
Do not use aggregate functions in the WHERE clause: It is not better to perform actions with aggregate data in the WHERE Clause.
Beware of division by zero: Using CASE WHEN should relieve errors when presenting calculations of averages or ratios.

Conclusion

SQL aggregate functions are valuable for aspects of data analysis that involve summarizing data, and the use of these functions assists businesses with summarizing large amounts of data to derive meaningful insights. SUM(), AVG(), MAX(), MIN(), and COUNT() can save a lot of time and effort by performing a variety of calculations without manual calculation. In this blog, you have gained knowledge on how these aggregate functions can be used in SQL.

Take your skills to the next level by enrolling in our SQL Training Course today and gaining hands-on experience. Also, prepare for job interviews with our SQL interview questions, prepared by industry experts.