Overview
When writing queries, you will sometimes encounter situations where you need to use multiple JOIN statements. This can lead to poor readability. You can make your queries more readable by utilizing the Common Table Expression (CTE).
What is CTE
Common Table Expression (CTE) is a feature used in SQL to define temporary result sets. CTEs can break complex queries into multiple logical steps that are easier to read and manage. CTEs are defined using the WITH
syntax, and are temporarily created within a query and exist only during the execution of that query.
CTEs are especially useful when writing recursive queries or simplifying hierarchical structures of data and complex join operations.
Use cases
CTEs help make query statements easier to understand logically.
Basic usage
1WITH CTE_Name AS (
2 SELECT ColumnName1, ColumnName2
3 FROM table name
4 WHERE Condition
5)
6SELECT
7FROM CTE_Name;
It’s as simple as the above, and the WITH statement can be referenced while performing the query.
Used in recursive statements
1WITH RECURSIVE SubCategories AS (
2 SELECT
3 CategoryId,
4 ParentCategoryId,
5 Name,
6 1 AS Level -- Set the starting level to 1
7 FROM
8 Categories
9 WHERE
10 CategoryId = 1 -- starting category ID
11
12 UNION ALL
13
14 SELECT
15 c.CategoryId,
16 c.ParentCategoryId,
17 c.Name,
18 sc.Level + 1 -- increase child level
19 FROM
20 Categories c
21 JOIN SubCategories sc ON c.ParentCategoryId = sc.CategoryId
22)
23SELECT * FROM
24FROM SubCategories
25ORDER BY Level, CategoryId;
The above example starts with a category with a CategoryId
of 1 and retrieves that category and all of its subcategories. A CTE that starts with the RECURSIVE
keyword consists of two parts. The base case and the recursive case.
- Base case: The first
SELECT
statement defines the starting point for the recursion. In this case, we select a category with aCategoryId
of 1. - Recursive case: The
SELECT
statement following theUNION ALL
recursively looks up the subcategories. The query uses theSubCategories
CTE, which references itself, to find the subcategories of the parent category at each step.
The Level
column shows the hierarchy level, indicating how deep each category is from the initial category. This makes it easy to understand the hierarchy of each category.
Hands-on
DDL, insert test data
1CREATE TABLE Categories ( 2 CategoryId INT AUTO_INCREMENT PRIMARY KEY, 3 ParentCategoryId INT NULL, 4 Name VARCHAR(255), 5 FOREIGN KEY (ParentCategoryId) REFERENCES Categories(CategoryId) 6); 7 8INSERT INTO Categories (CategoryId, ParentCategoryId, Name) VALUES 9(1, NULL, 'Electronics'), 10(2, 1, 'Computers'), 11(3, 2, 'Laptops'), 12(4, 2, 'Desktops'), 13(5, 1, 'Cameras'), 14(6, 5, 'DSLRs'), 15(7, 5, 'Point & Shoot');
execute query
1+------------+------------------+---------------+-------+ 2| CategoryId | ParentCategoryId | Name | Level | 3+------------+------------------+---------------+-------+ 4| 1 | NULL | Electronics | 1 | 5| 2 | 1 | Computers | 2 | 6| 5 | 1 | Cameras | 2 | 7| 3 | 2 | Laptops | 3 | 8| 4 | 2 | Desktops | 3 | 9| 6 | 5 | DSLR | 3 | 10| 7 | 5 | Point & Shoot | 3 | 11+------------+------------------+---------------+-------+
Complex JOIN statement
- The
Products
table: Stores product information. - Sales` table: Stores details of each sales transaction.
- Employees` table: Stores employee information.
1-- ProductSales CTE
2WITH ProductSales AS (
3 SELECT
4 p.ProductId,
5 p.ProductName,
6 SUM(s.Quantity) AS TotalQuantity
7 FROM
8 Products p
9 JOIN
10 Sales s ON p.ProductId = s.ProductId
11 GROUP BY
12 p.ProductId, p.ProductName
13),
14
15-- EmployeeSales CTE
16 EmployeeSales AS (
17 SELECT
18 e.EmployeeId,
19 e.EmployeeName,
20 s.ProductId,
21 SUM(s.Quantity) AS QuantitySold
22 FROM
23 Sales s
24 JOIN
25 Employees e ON s.EmployeeId = e.EmployeeId
26 GROUP BY
27 e.EmployeeId, e.EmployeeName, s.ProductId
28 )
29
30-- Main query
31SELECT
32 ps.ProductName,
33 ps.TotalQuantity,
34 es.EmployeeName,
35 es.QuantitySold
36FROM
37 ProductSales ps
38 JOIN
39 EmployeeSales es ON ps.ProductId = es.ProductId
40ORDER BY
41 ps.ProductName, es.EmployeeName;
- ProductSales CTE: JOIN the
Products
andSales
tables to calculate the total number of sales (TotalQuantity) for each product. - EmployeeSales CTE: JOIN the
Sales
andEmployees
tables to calculate the products sold by each employee and their quantity (QuantitySold). - In the final
SELECT
statement, JOIN theProductSales
andEmployeeSales
CTEs to produce a result that includes the total sales for each product and the name of the employee who sold that product and the amount sold.
Hands-on
DDL, insert test data
1CREATE TABLE Products ( 2 ProductId INT PRIMARY KEY, 3 ProductName VARCHAR(255) 4); 5 6INSERT INTO Products (ProductId, ProductName) VALUES 7(1, 'Laptop'), 8(2, 'Smartphone'), 9(3, 'Tablet'); 10 11CREATE TABLE Employees ( 12 EmployeeId INT PRIMARY KEY, 13 EmployeeName VARCHAR(255) 14); 15 16INSERT INTO Employees (EmployeeId, EmployeeName) VALUES 17(1, 'John Doe'), 18(2, 'Jane Smith'), 19(3, 'Emily Jones'); 20 21CREATE TABLE Sales ( 22 SaleId INT PRIMARY KEY, 23 ProductId INT, 24 EmployeeId INT, 25 Quantity INT, 26 FOREIGN KEY (ProductId) REFERENCES Products(ProductId), 27 FOREIGN KEY (EmployeeId) REFERENCES Employees(EmployeeId) 28); 29 30INSERT INTO Sales (SaleId, ProductId, EmployeeId, Quantity) VALUES 31(1, 1, 1, 10), 32(2, 1, 2, 5), 33(3, 2, 1, 8), 34(4, 2, 3, 6), 35(5, 3, 2, 3), 36(6, 3, 3, 2), 37(7, 1, 3, 7), 38(8, 2, 2, 4);
execute the query
1+-------------+---------------+--------------+--------------+ 2| ProductName | TotalQuantity | EmployeeName | QuantitySold | 3+-------------+---------------+--------------+--------------+ 4| Laptop | 22 | Emily Jones | 7 | 5| Laptop | 22 | Jane Smith | 5 | 6| Laptop | 22 | John Doe | 10 | 7| Smartphone | 18 | Emily Jones | 6 | 8| Smartphone | 18 | Jane Smith | 4 | 9| Smartphone | 18 | John Doe | 8 | 10| Tablet | 5 | Emily Jones | 2 | 11| Tablet | 5 | Jane Smith | 3 | 12+-------------+---------------+--------------+--------------+
CTE Support
DBMS | WITH Support | WITH RECURSIVE Support | Remarks |
---|---|---|---|
PostgresSQL | Yes | Yes | Yes |
MySQL | Yes | Partially supported | WITH RECURSIVE support in version 8.0 and later |
SQLite | Yes | Yes | Yes |
SQL Server | Yes | Yes | Yes |
Oracle | Yes | Yes | Yes |
IBM DB2 | yes | yes | yes |
MariaDB | Yes | Partially supported | Supports WITH RECURSIVE in version 10.2.2 and later |
Summary
In this post, we’ve covered the concept of CTEs and how they can improve the readability and structure of SQL queries with examples of their use. While CTEs are a powerful tool for writing complex queries, you should keep performance considerations in mind when using them, such as the size of your dataset and the potential for infinite loops in recursive queries.