Querying DataExtra· 35 min read

Subqueries & Nested SELECTs

Use the result of one query inside another — to filter by a computed value, match a list, or build a temporary table on the fly.

What you will learn

  • Use a scalar subquery to compare against a computed value
  • Filter with IN and EXISTS subqueries
  • Build a derived table in the FROM clause

A query inside a query

A subquery is a SELECT written inside another query, wrapped in parentheses. MySQL runs the inner query first, then uses its result in the outer one. This lets you ask questions that need two steps — like “show me orders bigger than the average order”, where you must first compute the average, then compare against it.

We will use this orders table throughout (totals: 500, 250, 900, 300, 150 — the average is 420):

iduser_idtotal
1011500
1022250
1031900
1043300
1052150

Scalar subquery — compare against one computed value

A scalar subquery returns a single value (one row, one column). You can drop it straight into a WHERE comparison. Here we find orders whose total is above the average total:

Orders above the average order value
SELECT id, total
FROM orders
WHERE total > (SELECT AVG(total) FROM orders);

MySQL runs the inner SELECT AVG(total) FROM orders first — it returns the single number 420. The outer query then behaves as if you had typed WHERE total > 420, keeping only the orders above the average. The beauty is you never had to know the average yourself; the subquery computes it.

Note: Output: id total 101 500 103 900 The average is 420. Orders 101 (500) and 103 (900) are above it; the other three (250, 300, 150) are below, so they are filtered out. If new orders change the average, the same query stays correct.

IN subquery — match against a list the query produces

When the inner query returns many values (a whole column), use IN to keep outer rows whose value appears in that list. Say we want all orders placed by users who live in Mumbai — the inner query produces the list of Mumbai user ids, and the outer query keeps orders matching any of them:

Orders from users in a particular city
SELECT id, user_id, total
FROM orders
WHERE user_id IN (SELECT id FROM users WHERE city = 'Mumbai');

The inner SELECT id FROM users WHERE city = 'Mumbai' returns a list of ids — say (1, 2). The outer query then keeps every order whose user_id is IN that list. This is often cleaner than a JOIN when you only need to filter, not show columns from the other table.

Note: Output (assuming users 1 and 2 are in Mumbai): id user_id total 101 1 500 102 2 250 103 1 900 105 2 150 Order 104 (user 3) is excluded because user 3 is not in the Mumbai list returned by the inner query.

EXISTS — does any matching row exist?

EXISTS checks whether a subquery returns any rows at all — it answers a yes/no question rather than comparing values. It is the natural way to ask “show users who have placed at least one order”:

Users who have placed at least one order
SELECT name
FROM users
WHERE EXISTS (
  SELECT 1 FROM orders WHERE orders.user_id = users.id
);

For each user, MySQL runs the inner query, which looks for an order belonging to that user (orders.user_id = users.id links the inner query to the current outer row — this is called a correlated subquery). If even one such order exists, EXISTS is true and the user is kept. We select 1 inside because the actual value does not matter — only whether a row comes back.

Note: Output: name Asha Ravi Meera Every user who has at least one matching order is listed. Swap EXISTS for NOT EXISTS and you get the opposite — users with no orders at all.

Derived table — a subquery in FROM

You can also use a subquery as if it were a table, by putting it in the FROM clause. This is called a derived table. Here we first total each user’s spending in an inner query, then filter that summary in the outer query — without storing anything:

Filter a grouped summary using a derived table
SELECT user_id, total_spent
FROM (
  SELECT user_id, SUM(total) AS total_spent
  FROM orders
  GROUP BY user_id
) AS user_totals
WHERE total_spent > 600;

The inner query builds a small temporary table of each user and their total spend, which we name user_totals (a derived table must have an alias). The outer query then treats user_totals like any real table and keeps only the high spenders. This is one way to “filter on an aggregate of an aggregate” cleanly.

Note: Output: userid totalspent 1 1400 User 1 spent 500 + 900 = 1400 (above 600); user 2 spent 250 + 150 = 400 and user 3 spent 300, both below the cut-off, so only user 1 remains.

Tip: Rule of thumb: use a scalar subquery to compare against one computed number, IN/EXISTS to filter by another table’s rows, and a derived table when you need to query the result of a GROUP BY. Many subqueries can also be written as JOINs — pick whichever reads more clearly.

Q. A subquery used inside a WHERE comparison like WHERE total > (SELECT AVG(total) FROM orders) must return…

Answer: A scalar subquery used in a comparison must return exactly one value. To match against many values you would use IN or EXISTS instead.

✍️ Practice

  1. Find all products priced above the average product price using a scalar subquery.
  2. Use IN with a subquery to list orders from customers in a chosen city.

🏠 Homework

  1. Write a derived-table query that lists each customer’s total spend and keeps only those above a threshold you choose.
Want to learn this with a mentor?

CodingClave runs guided, project-based training (28-day, 45-day & 6-month batches).

Explore Training →