Overview

When writing queries, you will sometimes encounter situations where you need to use multiple JOIN statements. This can lead to poor readability. You can make your queries more readable by utilizing the Common Table Expression (CTE).

What is CTE

Common Table Expression (CTE) is a feature used in SQL to define temporary result sets. CTEs can break complex queries into multiple logical steps that are easier to read and manage. CTEs are defined using the WITH syntax, and are temporarily created within a query and exist only during the execution of that query.

CTEs are especially useful when writing recursive queries or simplifying hierarchical structures of data and complex join operations.

Use cases

CTEs help make query statements easier to understand logically.

Basic usage

1WITH CTE_Name AS (
2    SELECT ColumnName1, ColumnName2
3    FROM table name
4    WHERE Condition
5)
6SELECT
7FROM CTE_Name;

It’s as simple as the above, and the WITH statement can be referenced while performing the query.

Used in recursive statements

 1WITH RECURSIVE SubCategories AS (
 2    SELECT
 3        CategoryId,
 4        ParentCategoryId,
 5        Name,
 6        1 AS Level -- Set the starting level to 1
 7    FROM
 8        Categories
 9    WHERE
10        CategoryId = 1 -- starting category ID
11
12    UNION ALL
13
14    SELECT
15        c.CategoryId,
16        c.ParentCategoryId,
17        c.Name,
18        sc.Level + 1 -- increase child level
19    FROM
20        Categories c
21            JOIN SubCategories sc ON c.ParentCategoryId = sc.CategoryId
22)
23SELECT * FROM
24FROM SubCategories
25ORDER BY Level, CategoryId;

The above example starts with a category with a CategoryId of 1 and retrieves that category and all of its subcategories. A CTE that starts with the RECURSIVE keyword consists of two parts. The base case and the recursive case.

  • Base case: The first SELECT statement defines the starting point for the recursion. In this case, we select a category with a CategoryId of 1.
  • Recursive case: The SELECT statement following the UNION ALL recursively looks up the subcategories. The query uses the SubCategories CTE, which references itself, to find the subcategories of the parent category at each step.

The Level column shows the hierarchy level, indicating how deep each category is from the initial category. This makes it easy to understand the hierarchy of each category.

Hands-on

  1. DDL, insert test data

     1CREATE TABLE Categories (
     2    CategoryId INT AUTO_INCREMENT PRIMARY KEY,
     3    ParentCategoryId INT NULL,
     4    Name VARCHAR(255),
     5    FOREIGN KEY (ParentCategoryId) REFERENCES Categories(CategoryId)
     6);
     7
     8INSERT INTO Categories (CategoryId, ParentCategoryId, Name) VALUES
     9(1, NULL, 'Electronics'),
    10(2, 1, 'Computers'),
    11(3, 2, 'Laptops'),
    12(4, 2, 'Desktops'),
    13(5, 1, 'Cameras'),
    14(6, 5, 'DSLRs'),
    15(7, 5, 'Point & Shoot');
    
  2. execute query

     1+------------+------------------+---------------+-------+
     2| CategoryId | ParentCategoryId | Name          | Level |
     3+------------+------------------+---------------+-------+
     4|          1 |             NULL | Electronics   |     1 |
     5|          2 |                1 | Computers     |     2 |
     6|          5 |                1 | Cameras       |     2 |
     7|          3 |                2 | Laptops       |     3 |
     8|          4 |                2 | Desktops      |     3 |
     9|          6 |                5 | DSLR          |     3 |
    10|          7 |                5 | Point & Shoot |     3 |
    11+------------+------------------+---------------+-------+
    

Complex JOIN statement

  • The Products table: Stores product information.
  • Sales` table: Stores details of each sales transaction.
  • Employees` table: Stores employee information.
 1-- ProductSales CTE
 2WITH ProductSales AS (
 3    SELECT
 4        p.ProductId,
 5        p.ProductName,
 6        SUM(s.Quantity) AS TotalQuantity
 7    FROM
 8        Products p
 9            JOIN
10        Sales s ON p.ProductId = s.ProductId
11    GROUP BY
12        p.ProductId, p.ProductName
13),
14
15-- EmployeeSales CTE
16     EmployeeSales AS (
17         SELECT
18             e.EmployeeId,
19             e.EmployeeName,
20             s.ProductId,
21             SUM(s.Quantity) AS QuantitySold
22         FROM
23             Sales s
24                 JOIN
25             Employees e ON s.EmployeeId = e.EmployeeId
26         GROUP BY
27             e.EmployeeId, e.EmployeeName, s.ProductId
28     )
29
30-- Main query
31SELECT
32    ps.ProductName,
33    ps.TotalQuantity,
34    es.EmployeeName,
35    es.QuantitySold
36FROM
37    ProductSales ps
38        JOIN
39    EmployeeSales es ON ps.ProductId = es.ProductId
40ORDER BY
41    ps.ProductName, es.EmployeeName;
  • ProductSales CTE: JOIN the Products and Sales tables to calculate the total number of sales (TotalQuantity) for each product.
  • EmployeeSales CTE: JOIN the Sales and Employees tables to calculate the products sold by each employee and their quantity (QuantitySold).
  • In the final SELECT statement, JOIN the ProductSales and EmployeeSales CTEs to produce a result that includes the total sales for each product and the name of the employee who sold that product and the amount sold.

Hands-on

  1. DDL, insert test data

     1CREATE TABLE Products (
     2    ProductId INT PRIMARY KEY,
     3    ProductName VARCHAR(255)
     4);
     5
     6INSERT INTO Products (ProductId, ProductName) VALUES
     7(1, 'Laptop'),
     8(2, 'Smartphone'),
     9(3, 'Tablet');
    10
    11CREATE TABLE Employees (
    12    EmployeeId INT PRIMARY KEY,
    13    EmployeeName VARCHAR(255)
    14);
    15
    16INSERT INTO Employees (EmployeeId, EmployeeName) VALUES
    17(1, 'John Doe'),
    18(2, 'Jane Smith'),
    19(3, 'Emily Jones');
    20
    21CREATE TABLE Sales (
    22    SaleId INT PRIMARY KEY,
    23    ProductId INT,
    24    EmployeeId INT,
    25    Quantity INT,
    26    FOREIGN KEY (ProductId) REFERENCES Products(ProductId),
    27    FOREIGN KEY (EmployeeId) REFERENCES Employees(EmployeeId)
    28);
    29
    30INSERT INTO Sales (SaleId, ProductId, EmployeeId, Quantity) VALUES
    31(1, 1, 1, 10),
    32(2, 1, 2, 5),
    33(3, 2, 1, 8),
    34(4, 2, 3, 6),
    35(5, 3, 2, 3),
    36(6, 3, 3, 2),
    37(7, 1, 3, 7),
    38(8, 2, 2, 4);
    
  2. execute the query

     1+-------------+---------------+--------------+--------------+
     2| ProductName | TotalQuantity | EmployeeName | QuantitySold |
     3+-------------+---------------+--------------+--------------+
     4| Laptop      |            22 | Emily Jones  |            7 |
     5| Laptop      |            22 | Jane Smith   |            5 |
     6| Laptop      |            22 | John Doe     |           10 |
     7| Smartphone  |            18 | Emily Jones  |            6 |
     8| Smartphone  |            18 | Jane Smith   |            4 |
     9| Smartphone  |            18 | John Doe     |            8 |
    10| Tablet      |             5 | Emily Jones  |            2 |
    11| Tablet      |             5 | Jane Smith   |            3 |
    12+-------------+---------------+--------------+--------------+
    

CTE Support

DBMSWITH SupportWITH RECURSIVE SupportRemarks
PostgresSQLYesYesYes
MySQLYesPartially supportedWITH RECURSIVE support in version 8.0 and later
SQLiteYesYesYes
SQL ServerYesYesYes
OracleYesYesYes
IBM DB2yesyesyes
MariaDBYesPartially supportedSupports WITH RECURSIVE in version 10.2.2 and later

Summary

In this post, we’ve covered the concept of CTEs and how they can improve the readability and structure of SQL queries with examples of their use. While CTEs are a powerful tool for writing complex queries, you should keep performance considerations in mind when using them, such as the size of your dataset and the potential for infinite loops in recursive queries.