Summary of sub queries used in MySQL

1. Classification of sub query

1.1. Non associated subquery

Subquery statements can be executed separately without reference to any content in the included statement. It is called non associated subquery.

1.1.1. Single row single column subquery

The subquery statement returns a single row and single column table. This type of subquery is called scalar subquery and can be located on either side of the common operators (=, < >, <, >, < =, > =).

SELECT account_id, product_cd, cust_id, avail_balance FROM account WHERE account_id = (
	SELECT MAX(account_id) FROM account
);

SELECT account_id, product_cd, cust_id, avail_balance FROM account WHERE open_emp_id <> (
	SELECT e.emp_id FROM employee e 
	INNER JOIN branch b ON e.assigned_branch_id = b.branch_id
	WHERE e.title = 'Head Teller' AND b.city = 'Woburn'
);

1.1.2. Multi row single column subquery

The subquery statement returns a table with multiple rows and single columns. Operators that can be used: IN, NOT IN, ALL, ANY.

The IN operator is used to see if an expression can be found IN an expression set.

# IN operator
SELECT emp_id, fname, lname, title FROM employee WHERE emp_id IN (
	SELECT superior_emp_id FROM employee
);

The NOT IN operator is used to check whether an expression is NOT IN an expression set.

# NOT IN operator
SELECT emp_id, fname, lname, title FROM employee WHERE emp_id NOT IN (
	SELECT superior_emp_id FROM employee WHERE superior_emp_id IS NOT NULL
);

The ALL operator is used to compare a single value with each value in the collection. To build this condition, you need to use a comparison operator (=, < >, <, >, < =, > =) with the ALL operator. The condition is true only when the comparison with ALL members in the set is true.

# The ID of any employee is different from that of any supervisor. The result here is the same as that of using NOT IN above.
SELECT emp_id, fname, lname, title FROM employee WHERE emp_id <> ALL (
	SELECT superior_emp_id FROM employee WHERE superior_emp_id IS NOT NULL
);

NOT IN is equivalent to the < > all operator. Generally, NOT IN can be used directly.
When using NOT IN or < > all operators to compare a value with a set, you need to ensure that the value set does not contain NULL values, because any behavior of comparing values with NULL will produce unknown results.

# Find an account whose available balance is less than (the balance of the account with the smallest balance of all Frank Tucker accounts).
SELECT account_id, product_cd, cust_id, avail_balance FROM account WHERE avail_balance < ALL(
	SELECT a.avail_balance FROM account a
	INNER JOIN individual i ON a.cust_id = i.cust_id
	WHERE i.fname = 'Frank' AND i.lname = 'Tucker'
);

ANY operator allows a value to be compared with each value in a set. As long as one holds, the condition is true.

# Find accounts with available balances greater than (the balance of any of Frank Tucker's accounts).
SELECT account_id, product_cd, cust_id, avail_balance FROM account WHERE avail_balance > ANY(
	SELECT a.avail_balance FROM account a
	INNER JOIN individual i ON a.cust_id = i.cust_id
	WHERE i.fname = 'Frank' AND i.lname = 'Tucker'
);

The IN = ANY operator is equivalent. Generally, you can use IN directly.

1.1.3. Multiple-column subqueries

The subquery statement returns a multi column table.

SELECT account_id, product_cd, cust_id FROM account WHERE(open_branch_id, open_emp_id) IN (
	SELECT b.branch_id, e.emp_id FROM branch b 
	INNER JOIN employee e ON b.branch_id = e.assigned_branch_id
	WHERE b.name = 'Woburn Branch' AND (e.title = 'Teller' OR e.title = 'Head Teller')
);

Single column subquery equivalent to the above statement:

SELECT account_id, product_cd, cust_id FROM account WHERE open_branch_id = (
	SELECT branch_id FROM branch WHERE name = 'Woburn Branch' AND open_emp_id IN (
		SELECT emp_id FROM employee WHERE title = 'Teller' OR title = 'Head Teller'
	)
);

1.2. Associated subquery

The associated subquery depends on the containing statement and references one or more of its columns. Unlike the non associated subquery, the associated subquery is not executed once before the inclusion statement is executed, but once for each candidate row.

#Calculate the number of accounts of each customer and retrieve the customers with two accounts
#First retrieve the customer record from the customer table, and then execute a sub query for each customer. Each execution must pass c.cust to the sub query_ ID, if the subquery returns 2, the condition is met and the row is added to the result set.
SELECT c.cust_id, c.cust_type_cd, c.city FROM customer c WHERE 2 = (
	SELECT COUNT(*) FROM account a WHERE a.cust_id = c.cust_id
);
#Retrieve all customers with total accounts between 5000 and 10000.
SELECT c.cust_id, c.cust_type_cd, c.city FROM customer c WHERE (
	SELECT SUM(a.avail_balance) FROM account a WHERE a.cust_id = c.cust_id
) BETWEEN 5000 AND 10000;

The EXISTS and NOT EXISTS operators will check whether the sub query can return at least one row. Therefore, the conditions containing the query only need to know how many rows the sub query returns, regardless of the specific content of the result. Therefore, SELECT 1 can be used directly here.

#EXISTS, query the accounts with transaction records on January 15, 2000
SELECT a.account_id, a.product_cd, a.cust_id, a.avail_balance FROM account a WHERE EXISTS (
	SELECT 1 FROM `transaction` t WHERE t.account_id = a.account_id AND t.txn_date = '2000-1-15'
);

#NOT EXISTS to query all non-commercial customers
SELECT a.account_id, a.product_cd, a.cust_id FROM account a WHERE NOT EXISTS (
	SELECT 1 FROM business b WHERE b.cust_id = a.cust_id
);

Update the data and update the latest transaction time to the account table.

UPDATE account a SET a.last_activity_date = (
	SELECT MAX(t.txn_date) FROM `transaction` t WHERE t.account_id = a.account_id
)WHERE EXISTS (
	SELECT 1 FROM `transaction` t WHERE t.account_id = a.account_id
);

Delete data: delete the Department without employees.

DELETE FROM department WHERE NOT EXISTS (
	SELECT 1 FROM employee WHERE employee.dept_id = department.dept_id
);

Note that when the DELETE statement uses the associated subquery in MySQL, the table alias cannot be used, and a syntax error will be reported.

mysql> DELETE FROM department d WHERE NOT EXISTS (
	SELECT 1 FROM employee e WHERE e.dept_id = d.dept_id
);
#The following error is prompted
1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'd WHERE NOT EXISTS (
	SELECT 1 FROM employee e WHERE e.dept_id = d.dept_id
)' at line 1
mysql> 

2. Use of sub query

2.1. As a data source

Subqueries used in the FROM clause must be non associative. They are executed first and then remain in memory until the containing query is executed.

#Query the number of employees in each department
SELECT d.dept_id, d.`NAME`, e_cnt.how_many FROM department d 
INNER JOIN (
	SELECT dept_id, COUNT(*) how_many FROM employee GROUP BY dept_id
) e_cnt ON d.dept_id = e_cnt.dept_id;

In addition to using queries to summarize existing data, sub queries can also be used to generate data that does not exist in the database.

#Customers are grouped by account balance, but the definitions of these groups are not stored in the database at all. It needs to be generated with sub query.
SELECT `groups`.name, COUNT(*) num_customers FROM (
	#Generate temporary tables using subqueries
	SELECT SUM(a.avail_balance) cust_balance FROM account a 
	INNER JOIN product p ON a.product_cd = p.product_cd 
	WHERE p.product_type_cd = 'ACCOUNT' GROUP BY a.cust_id
) cust_rollup 
INNER JOIN (
	#Use subqueries to generate temporary tables. Note that groups must be apostrophed, or syntax errors will be reported (because groups is a reserved word added in version 8.0)
	SELECT 'Small Fry' name, 0 low_limit, 4999.99 high_limit
	UNION ALL
	SELECT 'Average Joes' name, 5000 low_limit, 9999.99 high_limit
	UNION ALL
	SELECT 'Heavy Hitters' name, 10000 low_limit, 9999999.99 high_limit
) `groups` ON cust_rollup.cust_balance BETWEEN `groups`.low_limit AND `groups`.high_limit
GROUP BY `groups`.name;

Of course, we can directly create a new table to record these non-existent data, but after a long time, we will find that the tables built for these special purposes in the database become many, messy and difficult to maintain. Therefore, it is better to directly use sub query for such similar requirements. Adhere to such a principle: only when there is a clear business need to save these new data can the corresponding new tables be added to the database.

#Group according to account type, account opening employee and account opening bank. Sum all savings account balances
SELECT p.`name` product, b.`name` branch, CONCAT(e.fname,' ',e.lname) name, account_groups.tot_deposits FROM (
	SELECT product_cd, open_branch_id, open_emp_id, SUM(avail_balance) tot_deposits 
	FROM account GROUP BY product_cd, open_branch_id, open_emp_id
) account_groups
INNER JOIN employee e ON e.emp_id = account_groups.open_emp_id
INNER JOIN branch b ON b.branch_id = account_groups.open_branch_id
INNER JOIN product p ON p.product_cd = account_groups.product_cd
WHERE p.product_type_cd = 'ACCOUNT' ORDER BY 1,2;

#Equivalent SQL statement. Sort is added here
SELECT p.`name` product, b.`name` branch, CONCAT(e.fname,' ',e.lname) name, SUM(avail_balance) tot_deposits 
FROM account a 
INNER JOIN employee e ON e.emp_id = a.open_emp_id
INNER JOIN branch b ON b.branch_id = a.open_branch_id
INNER JOIN product p ON p.product_cd = a.product_cd
WHERE p.product_type_cd = 'ACCOUNT'
GROUP BY p.`name`, b.`name`, e.fname, e.lname ORDER BY 1,2;

2.2. As filter condition

#Query employees with the most accounts
SELECT open_emp_id, COUNT(*) how_many FROM account 
GROUP BY open_emp_id HAVING COUNT(*) = (
	SELECT MAX(emp_cnt.how_many) FROM (
		SELECT COUNT(*) how_many FROM account GROUP BY open_emp_id
	) emp_cnt
);

2.3. As an expression builder

Single row single column subquery, that is, scalar subquery, can be used not only in filtering conditions, but also in any position where expressions can appear, including SELECT clause, ORDER BY clause and VALUES clause in INSERT statement.

Use subqueries in the SELECT clause.

#Group according to account type, account opening employee and account opening bank. Sum all savings account balances
SELECT all_prods.product, all_prods.branch, all_prods.name, all_prods.tot_deposits
FROM (
	SELECT (
		SELECT p.name FROM product p WHERE p.product_cd = a.product_cd AND  p.product_type_cd = 'ACCOUNT'
	) product,
	(
		SELECT b.name FROM branch b WHERE b.branch_id = a.open_branch_id
	) branch,
	(
		SELECT CONCAT(e.fname,' ',e.lname) name FROM employee e WHERE e.emp_id = a.open_emp_id
	) name,
	SUM(a.avail_balance) tot_deposits
	FROM account a
	GROUP BY a.product_cd, a.open_branch_id, a.open_emp_id
) all_prods
WHERE all_prods.product IS NOT NULL
ORDER BY 1,2;

Use subqueries in the ORDER BY clause.

#Query employee data and sort according to the boss's last name and employee's last name.
SELECT emp.emp_id, CONCAT(emp.fname, ' ', emp.lname) emp_name ,(
	SELECT CONCAT(boss.fname, ' ', boss.lname) FROM employee boss WHERE boss.emp_id = emp.superior_emp_id
) boss_name
FROM employee emp WHERE emp.superior_emp_id IS NOT NULL
ORDER BY (
	SELECT boss.lname FROM employee boss WHERE boss.emp_id = emp.superior_emp_id
), emp.lname;

In addition to the above associated scalar subqueries, you can also use non associated scalar subqueries to generate values for the INSERT clause.

#Generate a new account
INSERT INTO `account` (`account_id`, `product_cd`, `cust_id`, `open_date`, `last_activity_date`, `status`, `open_branch_id`, `open_emp_id`, `avail_balance`, `pending_balance`) 
VALUES (
	NULL, 
	(SELECT product_cd FROM product WHERE name = 'savings account'), 
	(SELECT cust_id FROM customer WHERE fed_id = '555-55-5555'), 
	'2000-01-15',
	'2005-01-04',
	'ACTIVE', 
	(SELECT branch_id FROM branch WHERE name = 'Quincy Branch'), 
	(SELECT emp_id FROM employee WHERE lname = 'Portman' AND fname = 'Frank'), 
	0, 
	0
);

This method has a disadvantage. When the inserted column is allowed to be NULL, even if the subquery cannot return the value, the INSERT statement will succeed. At this time, the field will be set to NULL, which may deviate from our expectation.
Another way to insert query results into a new table:
(1) If the field types and order of two tables (export table and target table) are the same:

INSERT INTO target_table_name SELECT * FROM source_table_name;

Example:

INSERT INTO `account` SELECT NULL, 'SAV', 5, '2000-01-15', NULL, '2005-01-04','ACTIVE', 3, 15, 0, 0 FROM DUAL;

(2) If you only want to import some of the specified fields, you need to specify the field name of the target table, and the exported table needs to arrange the query fields in the same field type order:

INSERT INTO target_table_name (column1, column2, ...) SELECT column1x, column2x, ... FROM source_table_name;

Example:

INSERT INTO `account` ( `product_cd`, `cust_id`, `open_date`, `last_activity_date`, `status`, `open_branch_id`, `open_emp_id`, `avail_balance`, `pending_balance`) 
SELECT 'SAV', 5, '2000-01-15', '2005-01-04','ACTIVE', 3, 15, 0, 0 FROM DUAL;

SQL inserts the query results into another table

3. Summary of sub query

  • The results returned by sub query can be single row and single column, multiple rows and single column, multiple rows and multiple columns.
  • Subqueries can be independent of include statements (non associative subqueries).
  • A subquery can refer to one or more rows in the containing statement (associated subquery).
  • Subqueries can be used in conditions that use comparison operators and other special-purpose operators (in, not in, exists, not exists).
  • Subqueries can appear in select, update, delete and insert statements.
  • The result set generated by the subquery can be connected with other tables or other subqueries.
  • A subquery can generate values to fill some columns in a table or query result set.
  • Subqueries can be used for select, from, where, having, and order by clauses in queries.

Keywords: Database MySQL SQL

Added by veveu on Thu, 10 Feb 2022 15:28:48 +0200