Querying DataExtra· 35 min read

GROUP BY & HAVING

Turn rows into grouped reports — totals per category — and filter those groups with HAVING.

What you will learn

  • Group rows with GROUP BY for per-category totals
  • Combine GROUP BY with aggregate functions
  • Filter groups with HAVING (and know how it differs from WHERE)

From individual rows to grouped totals

Aggregate functions like COUNT() and SUM() squeeze many rows into one number. But usually you want one number per category — total sales per city, orders per customer, students per class. That is what GROUP BY does: it gathers rows that share a value into groups, then runs the aggregate once per group.

We will use this orders table for every example below:

iduser_idcitytotal
1011Mumbai500
1022Delhi250
1031Mumbai900
1043Delhi300
1052Mumbai150

GROUP BY in three steps

A grouped query has a clear recipe. Read the steps, then the code:

  1. Pick the grouping column and put it in both SELECT and GROUP BY — here, city.
  2. Add an aggregate that should run per group — SUM(total) to add up sales, COUNT(*) to count orders.
  3. MySQL forms one group per distinct value of the grouping column and returns one row for each.
Total sales and order count per city
SELECT city, SUM(total) AS sales, COUNT(*) AS orders
FROM orders
GROUP BY city;

Line by line: SELECT city shows which group each row is. SUM(total) AS sales adds up the total of every order inside that city group and labels the column sales. COUNT(*) AS orders counts how many orders are in the group. GROUP BY city is the instruction that makes MySQL bundle the rows by city first.

Note: Output: city sales orders Mumbai 1550 3 Delhi 550 2 The five order rows collapsed into two — one per city. Mumbai’s three orders (500 + 900 + 150) sum to 1550; Delhi’s two (250 + 300) sum to 550. The aggregate ran separately inside each group.

Watch out: The golden rule of GROUP BY: every column in your SELECT must either be in the GROUP BY list or be wrapped in an aggregate function. SELECT city, total with GROUP BY city makes no sense — which of the city’s three total values would it show? MySQL would error or pick one unpredictably.

HAVING — filtering the groups

Now suppose you only want cities whose total sales exceed 1000. You cannot use WHERE for this — WHERE filters individual rows before grouping, but “total sales” only exists after grouping. The clause that filters groups is HAVING. It looks just like WHERE, but it runs after GROUP BY and can use aggregate values.

Only cities with sales over 1000
SELECT city, SUM(total) AS sales
FROM orders
GROUP BY city
HAVING SUM(total) > 1000;

The query groups by city and totals each city’s sales exactly as before, but HAVING SUM(total) > 1000 then discards any group whose total is not above 1000. Order matters: WHEREGROUP BYHAVING. (You can use both — WHERE first trims raw rows, then HAVING trims the resulting groups.)

Note: Output: city sales Mumbai 1550 Delhi’s total of 550 is below 1000, so its group is removed by HAVING. Only Mumbai survives. If we had used WHERE SUM(total) > 1000 instead, MySQL would reject the query — you cannot aggregate in WHERE.

WHEREHAVING
FiltersIndividual rowsWhole groups
RunsBefore GROUP BYAfter GROUP BY
Can use aggregates (SUM, COUNT)NoYes

A worked example combining both

Realistic question: “Which cities had more than 1 order worth at least 200 each?” That needs WHERE (to keep only orders ≥ 200) and HAVING (to keep cities with more than one such order):

WHERE and HAVING working together
SELECT city, COUNT(*) AS big_orders
FROM orders
WHERE total >= 200
GROUP BY city
HAVING COUNT(*) > 1;

Note: Output: city big_orders Mumbai 2 First WHERE total >= 200 drops order 105 (150). The remaining Mumbai orders (101, 103) and Delhi orders (102, 104) are grouped; HAVING COUNT(*) > 1 keeps only cities with more than one — Mumbai has 2, Delhi has only 1, so Delhi is removed.

Tip: Remember the sentence: “WHERE filters rows, GROUP BY makes groups, HAVING filters groups.” Almost every SQL reporting question is some combination of those three.

Q. You want only the groups whose COUNT(*) is greater than 5. Which clause do you use?

Answer: Filtering on an aggregate (like COUNT) must happen AFTER grouping, which is what HAVING does. WHERE runs before grouping and cannot use aggregates.

✍️ Practice

  1. Group orders by user_id and show each user’s total spend with SUM(total).
  2. Use HAVING to show only users who placed more than 2 orders.

🏠 Homework

  1. Write a report query that lists each product category with its average price, then HAVING-filter to categories averaging over a chosen amount.
Want to learn this with a mentor?

CodingClave runs guided, project-based training (28-day, 45-day & 6-month batches).

Explore Training →