Common Table Expressions (WITH) & Recursion
Name a query and reuse it like a temporary table with WITH — making complex multi-step SQL readable — and walk hierarchies with a recursive CTE.
What you will learn
- Define a named temporary result with a WITH (CTE) clause
- Chain multiple CTEs to break a hard query into readable steps
- Walk a hierarchy (like categories or org charts) with a recursive CTE
Naming a query to make it readable
When a query needs several steps — summarise, then filter that summary, then join it — nesting subqueries quickly becomes a tangle of brackets. A Common Table Expression (CTE) fixes this. Written with the WITH keyword, a CTE lets you give a query a name and then use that name like a temporary table in the main query below it. It is the same idea as a derived table, but pulled out to the top where it reads top-to-bottom like steps in a recipe.
We will use this orders table:
| id | user_id | total |
|---|---|---|
| 101 | 1 | 500 |
| 102 | 2 | 250 |
| 103 | 1 | 900 |
| 104 | 3 | 300 |
| 105 | 2 | 150 |
A single CTE
Say you want each user’s total spend, then only the users who spent over 600. With a CTE you compute the totals first, name them user_totals, then query that name:
WITH user_totals AS (
SELECT user_id, SUM(total) AS spent
FROM orders
GROUP BY user_id
)
SELECT user_id, spent
FROM user_totals
WHERE spent > 600;Reading it top to bottom: WITH user_totals AS ( ... ) defines the CTE — everything in the brackets is an ordinary query that totals each user’s spend. After the closing bracket, the main query treats user_totals exactly like a table, selecting from it and filtering with WHERE spent > 600. The CTE exists only for this one statement, then disappears.
Note: Output: user_id spent 1 1400 User 1 spent 500 + 900 = 1400 (above 600); users 2 (400) and 3 (300) fall below the cut-off. The result is identical to the derived-table version from the subqueries lesson — but notice how much easier it is to read with the summary named and lifted to the top.
Chaining several CTEs
The real power appears when a problem has multiple stages. You can define several CTEs separated by commas, and each later one can use the earlier ones — turning a scary query into clear, named steps. Here we first total each user’s spend, then compute the average of those totals, then keep users who beat that average:
WITH user_totals AS (
SELECT user_id, SUM(total) AS spent
FROM orders
GROUP BY user_id
),
average_spend AS (
SELECT AVG(spent) AS avg_spent
FROM user_totals
)
SELECT user_id, spent
FROM user_totals, average_spend
WHERE spent > avg_spent;Step by step: the first CTE user_totals produces each user’s spend (1400, 400, 300). The second CTE average_spend reads user_totals to get the average of those numbers (700). The main query then compares each user’s spend against that single average and keeps only those above it. Each step has a clear name, so anyone reading the query can follow the logic without untangling nested brackets.
Note: Output: user_id spent 1 1400 The three user totals are 1400, 400 and 300; their average is about 700. Only user 1’s 1400 beats it. Written as nested subqueries this would be hard to follow — as named CTEs it reads like three plain sentences.
Recursive CTEs — walking a hierarchy
A recursive CTE is a special form that can refer to itself, letting it walk tree-shaped data — a category with sub-categories, an employee’s chain of managers, a folder of sub-folders. It is the one clean way to follow such links to any depth. Suppose a categories table where each row has a parent_id pointing to its parent category (top-level categories have NULL):
WITH RECURSIVE category_tree AS (
SELECT id, name, parent_id, 1 AS depth
FROM categories
WHERE parent_id IS NULL
UNION ALL
SELECT c.id, c.name, c.parent_id, t.depth + 1
FROM categories AS c
INNER JOIN category_tree AS t ON c.parent_id = t.id
)
SELECT id, name, depth FROM category_tree ORDER BY depth, id;A recursive CTE has two halves joined by UNION ALL. The first part (the *anchor*) selects the starting rows — here the top-level categories where parent_id IS NULL, marked depth 1. The second part (the *recursive* member) joins the categories table back to the CTE itself (category_tree), finding the children of rows already collected and adding 1 to their depth. MySQL repeats that second part — children, then grandchildren — until no new rows are found, then stops. WITH RECURSIVE is required to allow the self-reference.
Note: Output (sample): id name depth 1 Electronics 1 2 Clothing 1 3 Phones 2 4 Laptops 2 5 Smartphones 3 The query started at the top-level categories (depth 1), then found their children (Phones, Laptops at depth 2), then the children of those (Smartphones at depth 3) — following the tree down as far as it goes. This is how you list a whole category hierarchy or an org chart in one query.
Watch out: A recursive CTE that never stops finding new rows would loop forever, so MySQL applies a safety limit (cte_max_recursion_depth, default 1000). If you hit it, your data probably has a cycle (A is a child of B which is a child of A) — fix the data or add a depth condition.
Tip: Use a plain CTE to name and reuse a sub-result and make multi-step queries readable; reach for a RECURSIVE CTE only when you must walk hierarchical or tree-shaped data. A CTE and a derived table do the same job — prefer the CTE when the query has several stages, because the named, top-down form is far easier to read and maintain.
Q. What does a CTE (the WITH clause) give you compared with deeply nested subqueries?
✍️ Practice
- Rewrite a derived-table query from the subqueries lesson as a
WITHCTE and confirm it returns the same result. - Chain two CTEs: one that summarises, and one that uses the first’s result.
🏠 Homework
- Build a small
categoriestable with aparent_idcolumn and write a recursive CTE that lists every category with its depth in the tree.