Welcome back to our SQL tutorial series! In this article, we’re going to explore the GROUP BY
clause, which is used to group rows that have the same values in specified columns into summary rows. Understanding how to use GROUP BY
is essential for aggregating data and performing complex analyses.
Understanding the GROUP BY Clause
The GROUP BY
clause groups rows that have the same values in specified columns into summary rows, like “find the number of employees in each department” or “total sales per region.” It is often used with aggregate functions such as COUNT
, SUM
, AVG
, MAX
, and MIN
.
Basic Syntax of GROUP BY
The basic syntax of the GROUP BY
clause is as follows:
SELECT column1, aggregate_function(column2)
FROM table_name
WHERE condition
GROUP BY column1;
Let’s look at some practical examples to understand how to use the GROUP BY
clause effectively.
Example Table: employees
We’ll continue to use the employees
table for our examples:
id | name | position | salary | department_id |
---|---|---|---|---|
1 | John Doe | Software Engineer | 80000.00 | 1 |
2 | Jane Smith | Data Analyst | 70000.00 | 1 |
3 | Emily Davis | Software Engineer | 75000.00 | 1 |
4 | Michael Brown | Project Manager | 80000.00 | 2 |
5 | Sarah Wilson | Data Analyst | 70000.00 | 2 |
departments
department_id | department_name |
---|---|
1 | IT |
2 | HR |
3 | Marketing |
Using GROUP BY with COUNT
To find the number of employees in each department, you use:
SELECT department_id, COUNT(*) AS num_employees
FROM employees
GROUP BY department_id;
This command returns:
department_id | num_employees |
---|---|
1 | 3 |
2 | 2 |
Using GROUP BY with SUM
To calculate the total salary for each department, you use:
SELECT department_id, SUM(salary) AS total_salary
FROM employees
GROUP BY department_id;
This command returns:
department_id | total_salary |
---|---|
1 | 225000.00 |
2 | 150000.00 |
Using GROUP BY with AVG
To calculate the average salary for each department, you use:
SELECT department_id, AVG(salary) AS average_salary
FROM employees
GROUP BY department_id;
This command returns:
department_id | average_salary |
---|---|
1 | 75000.00 |
2 | 75000.00 |
Using GROUP BY with Multiple Columns
You can group by multiple columns to get more detailed summaries. For example, to find the number of employees in each position within each department, you use:
SELECT department_id, position, COUNT(*) AS num_employees
FROM employees
GROUP BY department_id, position;
This command returns:
department_id | position | num_employees |
---|---|---|
1 | Software Engineer | 2 |
1 | Data Analyst | 1 |
2 | Project Manager | 1 |
2 | Data Analyst | 1 |
Using GROUP BY with JOINs
You can use the GROUP BY
clause in conjunction with JOIN
operations to group data from multiple tables. For example, to find the total salary for each department by department name:
SELECT d.department_name, SUM(e.salary) AS total_salary
FROM employees e
JOIN departments d ON e.department_id = d.department_id
GROUP BY d.department_name;
This command returns:
department_name | total_salary |
---|---|
IT | 225000.00 |
HR | 150000.00 |
Using GROUP BY with HAVING
You can use the HAVING
clause to filter groups based on a condition. For example, to find departments with a total salary greater than $150,000:
SELECT department_id, SUM(salary) AS total_salary
FROM employees
GROUP BY department_id
HAVING SUM(salary) > 150000;
This command returns:
department_id | total_salary |
---|---|
1 | 225000.00 |
Conclusion
The GROUP BY
clause is a powerful tool for aggregating data and generating summary reports in SQL. By understanding how to use GROUP BY
with various aggregate functions and combining it with JOIN
and HAVING
clauses, you can perform complex data analyses and derive valuable insights from your data.
In our next tutorial, we’ll explore more advanced SQL topics and commands. Stay tuned and happy coding!