Tianchi Longzhu SQL training camp 04 Advanced SQL: set operation - addition and subtraction of tables, join, etc

This note is the learning content of Alibaba cloud Tianchi Longzhu plan SQL training camp. The link is: https://tianchi.aliyun.com/specials/promotion/aicampsql

1, Addition and subtraction of tables

1.1 what is set operation

A set represents "the sum of all kinds of things" in mathematics and a set of records in database Specifically, the execution results of tables, views and queries are a collection of records, in which the elements are each row in the table or query results.

In standard SQL, union, intersect, and excel are used to perform Union, intersection, and difference operations on the search results. Operators used for set operations such as union, intersect, and excel are called set operators.

The following Venn's diagram shows the basic operations of several sets


[pictures from the Internet]

In the database, all tables - and query results - can be regarded as sets, so tables can also be regarded as sets for the above set operations. In many cases, this abstraction is very helpful to give a feasible idea for complex query problems

1.2 table addition – UNION

1.2.1 UNION

For table creation code and data import, please use the code provided in Chapter 1
Since the table product2 was not imported earlier, the tutorial omitted. Here is a supplement:

CREATE TABLE `product2` (
  `product_id` char(4) NOT NULL,
  `product_name` varchar(100) NOT NULL,
  `product_type` varchar(32) NOT NULL,
  `sale_price` int DEFAULT NULL,
  `purchase_price` int DEFAULT NULL,
  `regist_date` date DEFAULT NULL,
  PRIMARY KEY (`product_id`)
) ;
INSERT INTO `product2`(`product_id`, `product_name`, `product_type`, `sale_price`, `purchase_price`, `regist_date`) VALUES ('0001', 'T shirt', 'clothes', 1000, 500, '2009-09-20');
INSERT INTO `product2`(`product_id`, `product_name`, `product_type`, `sale_price`, `purchase_price`, `regist_date`) VALUES ('0002', 'Punch', 'Office Supplies', 500, 320, '2009-09-11');
INSERT INTO `product2`(`product_id`, `product_name`, `product_type`, `sale_price`, `purchase_price`, `regist_date`) VALUES ('0003', 'motion T Shirt', 'clothes', 4000, 2800, NULL);
INSERT INTO `product2`(`product_id`, `product_name`, `product_type`, `sale_price`, `purchase_price`, `regist_date`) VALUES ('0009', 'glove', 'clothes', 800, 500, NULL);
INSERT INTO `product2`(`product_id`, `product_name`, `product_type`, `sale_price`, `purchase_price`, `regist_date`) VALUES ('0010', 'kettle', 'kitchenware', 2000, 1700, '2009-09-20');

Next, we will demonstrate the specific usage and query results of UNION:

SELECT product_id, product_name
  FROM product
 UNION
SELECT product_id, product_name
  FROM product2;

The above results include all the goods in the two tables You will find that this is the union operation in the set we learned in school, which can be seen more clearly through the venturi diagram (Fig. 7-1):

Through observation, it can be found that the three records with commodity numbers "0001" ~ "0003" exist in both tables. Therefore, you may think that duplicate records will appear in the results, but set operators such as UNION usually remove duplicate records

The above query is the union operation of two different tables For the same table, union can also be performed

**Exercise question: * * if the chain store wants to increase the inventory of goods with a gross profit margin of more than 50% or a selling price of less than 800, please use UNION to combine the query results of goods that meet the above two conditions respectively

The result should be similar to:

SELECT  product_id,product_name,product_type
       ,sale_price,purchase_price
  FROM product 
 WHERE sale_price<800
  
 UNION
 
SELECT  product_id,product_name,product_type
       ,sale_price,purchase_price
  FROM product 
 WHERE sale_price>1.5*purchase_price;


Thinking: how to write query statements if you don't use UNION?

SELECT  product_id,product_name,product_type
       ,sale_price,purchase_price
  FROM product 
 WHERE sale_price<800 or sale_price>1.5*purchase_price;

1.2.2 Union and OR predicates

For the above exercise, if you have written the query correctly, you will find that using UNION to take the UNION of two query results, using WHERE clause in a query, and then using OR predicate to connect two query conditions can get the same result

So is it unnecessary to introduce UNION? Of course not Indeed, for two different filter result sets in the same table, UNION is used to merge the two result sets, and the filter conditions of two sub queries are connected with OR predicate to get the same result. However, if you want to combine the results in two different tables, you have to use UNION

Moreover, even for the same table, UNION is sometimes used for query efficiency

Exercise:

Use UNION OR or predicates respectively to find goods with gross profit margin less than 30% OR unknown gross profit margin

OR predicate

SELECT  product_id,product_name,product_type
       ,sale_price,purchase_price
  FROM product 
 WHERE sale_price < purchase_price * 1.3 or purchase_price IS NULL;

UNION predicate

SELECT * 
  FROM product 
 WHERE sale_price < purchase_price * 1.3
 
 UNION
SELECT * 
  FROM product 
 WHERE purchase_price IS NULL;

1.2.3 set operation UNION ALL containing duplicate rows

In 1.1.1, we found that the UNION of SQL statement will merge and de duplicate the result sets of two queries. This de duplication will not only remove the duplicate rows in two result sets, but also remove the duplicate rows in one result set However, in practice, it is sometimes necessary not to duplicate the UNION. In fact, the syntax of retaining duplicate lines in the result of UNION is very simple. You only need to add the ALL keyword after UNION

For example, if you want to know the commodity types and quantity of each commodity contained in product and product2, the first step is to select the commodity type field of the two tables, and then use UNION ALL to merge without duplication Next, press product for the results of the two tables_ Type field grouping count

-- Keep duplicate lines
SELECT product_id, product_name
  FROM product
 UNION ALL
SELECT product_id, product_name
  FROM product2;

The query results are as follows:

Exercise:

If the store decides to increase the price of goods with profit less than 50% and selling price less than 1000 in the product table, please use the UNION ALL statement to combine the results that meet the above two conditions respectively The query results are similar to the following table:

SELECT *
  FROM product
	WHERE sale_price / purchase_price > 1.5
 UNION ALL
SELECT *
  FROM product
	WHERE sale_price < 1000;

1.2.4 [extended reading] bag model and set model

We learned in high school mathematics class that a remarkable feature of a set is that the elements in the set are different from each other When we regard the tables in the database as collections, there are actually some problems: whether it is intentional design or unintentional negligence, many tables in the database contain duplicate rows

Bag is a mathematical structure similar to set. The difference is that duplicate elements are allowed in bag. If the same element is added multiple times, there will be multiple elements in the bag

Through the above differences between bag and set definitions, we find that it is more appropriate to use bag model to describe the tables in the database

Whether to allow element repetition leads to some differences between set and bag Take the intersection of bags as an example. Since bags allow elements to appear repeatedly, for two bags, their union operation will be as follows: 1 Does this element appear in at least one bag
2. Calculate the maximum occurrences of this element in two bag s
Therefore, for a = {1,1,1,2,3,5,7} and B = {1,1,2,2,4,6,8}, their sum is equal to {1,1,1,2,2,3,4,5,6,7,8}

1.2.5 implicit type conversion

Generally speaking, we will use UNION to merge columns with exactly the same type and representing the same attribute. However, sometimes, even if the data types are not exactly the same, we will display two columns with different types in one column through implicit type conversion, such as string and numeric types:

SELECT product_id, product_name, '1'
  FROM product
 UNION
SELECT product_id, product_name,sale_price
  FROM product2;

The above query can be executed correctly and the following results are obtained:

Exercise:

Use the SYSDATE() function to return the current date and time, which is a date and time type of data. Try to test the compatibility of this data type with numeric, string and other types
For example, the following code can be executed correctly, indicating that the time date type and string, numerical value and missing value are compatible

SELECT SYSDATE(), SYSDATE(), SYSDATE()
 
 UNION
 
SELECT 'chars', 123,  null

Query results of the above Codes:

1.3 MySQL 8.0 does not support INTERSECT

The intersection of sets is the common part of two sets. Due to the mutual anisotropy of set elements, the significance of the intersection of sets can be seen intuitively only through Wen's graph

Although the delivery calculation of sets has appeared in the SQL standard for many years, it is a pity that MySQL still does not support INTERSECT operation as of MySQL version 8.0

SELECT product_id, product_name
  FROM product
  
INTERSECT
SELECT product_id, product_name
  FROM product2

Error:

1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'SELECT product_id,product_name
FROM product2' at line 6

1.3.1 intersection operation of bag

For two bag s, their intersection operation will be as follows:

  1. Does the element belong to two bag s at the same time
  2. The minimum number of occurrences of this element in two bag s is calculated

Therefore, for two bag s a = {1,1,1,2,3,5,7} and B = {1,1,2,2,4,6,8}, their delivery calculation results are equal to {1,1,2}

1.4 subtraction of difference set, complement set and table

The subtraction operation of finding the set difference set is somewhat different from the subtraction operation of real numbers. When subtracting another set B from one set A, the strategy of directly ignoring the elements that only exist in set B but not in set A is adopted. Therefore, subtraction of sets A and B only subtracts the elements that also belong to set B in set A.

1.4.1 MySQL 8.0 does not support excel operation

MySQL 8.0 does not support the subtraction operator excel However, with the help of the NOT IN predicate learned in Chapter 6, we can also realize table subtraction
Exercise:
Find the items that only exist in the product table but not in the product2 table

SELECT *
  FROM product
	WHERE product_id NOT IN (SELECT product_id 
                            FROM product2)

1.4.2 EXCEPT and NOT predicates

Through the MySQL solution of the above exercises, we find that using the NOT IN predicate can basically achieve the same effect as the excel operation in SQL standard syntax
Exercise:

Use the NOT predicate to subtract the set to find the goods in the product table with a selling price of more than 2000 but a profit of less than 30%. The results should be shown in the following table

SELECT *
  FROM product
	WHERE sale_price >2000 and product_id NOT in 
	(
		SELECT product_id
		FROM product
		WHERE sale_price / purchase_price < 1.3
	)

1.4.3 difference between excel all and bag

Similar to UNION ALL, EXCEPT ALL performs subtraction according to the number of occurrences and uses bag model for operation

For two bag s, their difference operation will be as follows:

  1. Whether the element belongs to a bag as a subtracted,
  2. The number of occurrences of this element in two bag s

These two aspects are calculated Only the elements belonging to the bag of the subtracted number participate in the excel all operation, and the number of times in the difference bag is equal to the difference of the occurrence times of the element in the two bags (if the difference is zero or negative, it does not appear) Therefore, for two bags a = {1,1,1,2,3,5,7} and B = {1,1,2,2,4,6,8}, their difference is equal to {1,3,5,7}

1.4.4 INTERSECT AND and predicates

For two query results of the same table, their INTERSECT can actually connect the search conditions of the two queries with AND predicate equivalently

Exercise:

Use the AND predicate to find products in the product table with a profit margin of more than 50% AND a selling price of less than 1500. The query results are as follows

SELECT *
  FROM product
	WHERE sale_price / purchase_price > 1.5 and sale_price < 1500

1.5 symmetry difference

The symmetry difference between two sets A and B refers to the set composed of elements belonging only to A or B Symmetry difference is also A very basic operation. For example, the intersection of two sets can be regarded as the intersection of two sets, and the symmetry difference of two sets can be removed The above method can also be used in other databases to simply realize the symmetric difference operation of tables or query results: first use UNION to find the UNION of two tables, then use INTERSECT to find the intersection of two tables, and then subtract the intersection from the UNION to get the symmetric difference

However, in MySQL 8.0, because the of two tables or query results cannot be calculated directly, it is not suitable to use the above idea to calculate the symmetry difference Fortunately, there are difference set operations that can be used It can be seen intuitively that the symmetry difference between the two sets is equal to A-B and B-A, so this idea can be used to calculate the symmetry difference in practice
Exercise:

Use the symmetry difference between the product table and the product2 table to query which products are in only one table. The results are similar to:

Tip: use NOT IN to realize the difference set of two tables

SELECT *
  FROM product
	WHERE product_id NOT IN (SELECT product_id FROM product2)

UNION

SELECT *
  FROM product2
	WHERE product_id NOT IN (SELECT product_id FROM product)

1.5.1 realize intersection operation INTERSECT by means of Union and difference set detour

By observing the Venn's graph of set operation, we find that the intersection of two sets can be regarded as the intersection of two sets, and the symmetry difference between the two sets is removed.

2, Join

In the previous section, we learned about set operations such as UNION and INTERSECT. The characteristic of these set operations is to operate in line direction Generally speaking, these set operations will lead to the increase or decrease of the number of record lines Using UNION will increase the number of record lines, while using INTERSECT or EXCEPT will reduce the number of record lines

However, these operations cannot change the changes of columns. Although the number of columns can be increased by using column operations such as functions or CASE expressions, some "extended columns" can only be obtained from the basic information columns provided in a table, which can not provide more information in essence If we want to obtain information from multiple tables, for example, if we want to find out the name, quantity and price of clothing products in a store, we must obtain information from the shopproduct table and the product table respectively

Note: up to now, the examples in this book (except associated sub query) basically select data from one table, but in fact, the expected data are often scattered in different tables,
At this point, you need to use links Previously, when learning the associated subquery, we found that using the associated subquery can also obtain information from other tables, but the link is more suitable for obtaining information from multiple tables

Join is to add columns from other tables by using some association condition (generally using the equality judgment predicate "=") to perform the set operation of "add columns" It can be said that the link is the core operation of SQL query. If you master the link, you can obtain columns from two or even more tables, simplify complex queries such as associated sub queries in the past into more readable forms, and carry out some more complex queries

There are many classification methods of links in SQL. We use the most basic classification methods of inner links and outer links to explain them respectively

2.1 inner join

The syntax format of the inner link is:

-- Inner link
FROM <tb_1> INNER JOIN <tb_2> ON <condition(s)>

The INNER keyword indicates the use of INNER links. As for the meaning of INNER links, there is no need to study them for the time being
For example, the same question just now:

Find out the name, quantity and price of clothes in a store

We further clarify this issue:

Find out the name, price, category and quantity of clothes in Tokyo stores

2.1.1 use the inner link to obtain information from two tables

Let's look at the tables involved first. The product table saves information such as commodity number, commodity name and commodity type. This table can provide detailed information about clothing types, but can't provide store information

Let's look at the shopproduct table, which contains the store number, name, commodity number and quantity However, if you want to obtain information such as product type, name and selling price, you must use the product table

So the key is to find a common column similar to "axis" or "bridge" and connect the two tables with this column This is what join operations do

Let's compare the above two tables. We can find that the commodity number column is a public column. Therefore, it is natural to use this commodity number column as a "bridge" to connect the two tables of product and shopproduct.

Note:
If you have used the vlookup function of excel, you will find that this function can also realize this function In fact, in terms of thinking, associated subqueries are more like
vlookup function: take table A as the main table, and then find rows with equal values in the associated columns of table B one by one according to the values of each row in the associated columns of table A
When the amount of data is small, this method will not have any performance problems, but when the amount of data is large, this method will lead to large computational overhead: for each row of data returned by external queries,
Will pass the value of an associated column to the internal sub query, and then the internal sub query will execute a query according to the incoming value, and then return its query result This makes,
For example, if the returned result of the external main query has 10000 rows, the sub query will be executed 10000 times, which will bring terrible time consumption

We decompose the above problems:
First, find out the store number, store name, commodity number, commodity name, commodity category, commodity selling price and commodity quantity of each store

According to the syntax of INNER JOIN, use INNER JOIN in the FROM clause to join the two tables, and specify the join condition for the ON clause as shopproduct product_ id=product. product_ ID to get the following query statement:

SELECT SP.shop_id
       ,SP.shop_name
       ,SP.product_id
       ,P.product_name
       ,P.product_type
       ,P.sale_price
       ,SP.quantity
  FROM shopproduct AS SP
 INNER JOIN product AS P
    ON SP.product_id = P.product_id;

In the above query, we specify simple aliases for the two tables respectively. This operation is very common when using links. Using aliases will make us less typing when writing queries, and more importantly, it will make the query statements look more concise
The above query will get the following results:

Looking at the query result, we can see that the column in the result already contains all the information we need

Regarding inner links, you should pay attention to the following three points:

Point 1: when joining, you need to use multiple tables in the FROM clause

In the previous FROM clause, there was only one table. This time, we used both the shopproduct and product tables. The two tables can be linked together by using the keyword INNER JOIN:

FROM shopproduct AS SP INNER JOIN product AS P

Point 2: you must use the ON clause to specify the join condition

The ON clause is essential for inner join (you can try to get rid of the ON clause of the above query)

The ON clause is specifically used to specify the join conditions. We specify the columns and comparison conditions used for the join of the two tables after the ON of the above query. Basically, it can play the same filtering role as WHERE. We will discuss this topic further at the end of this chapter

Point 3: the columns in the SELECT clause should preferably be based on the table name The format of the column name

When the columns of the two tables do not have the same name except for the associated columns, we can also not write the table name, but the table name enables us to immediately see which table each column comes from when we read the query code at any time in the future, which can save us a lot of time

However, if two tables have other columns with the same name, you must use the above format to select the column name, otherwise the query statement will report an error

We return to the questions answered in the above query By observing the results of the above query, we find that this result is very close to our goal: to find out the basic information of clothing products in Tokyo stores Next, we only need to take the query result as a table and add a WHERE clause to it to specify the filter criteria

2.1.2 use inner links with WHERE clause

If you need to use the WHERE clause to filter the search results when using inner links, you need to write the WHERE clause after the ON clause

For example, for the above query problem, we can add the WHERE condition on the basis of the previous query

There are several ways to add a WHERE clause. Let's start with the simplest

The first way to add WEHRE clause is to take the above query as a subquery, wrap it in parentheses, and then add filter conditions to the outer query

SELECT *
  FROM (-- Step 1: query results
        SELECT SP.shop_id
               ,SP.shop_name
               ,SP.product_id
               ,P.product_name
               ,P.product_type
               ,P.sale_price
               ,SP.quantity
          FROMshopproduct AS SP
         INNER JOINproduct AS P
            ON SP.product_id = P.product_id) AS STEP1
 WHERE shop_name = 'Tokyo'
   AND product_type = 'clothes' ;


Remember when we learned sub query? The result of subquery is actually a table, just a virtual table. It does not really exist in the database, but a "view" obtained from other tables in the database after filtering, aggregation and other query operations
This writing method can clearly distinguish each operation step, which can help us when we are not very familiar with the execution order of each clause of SQL query

But in fact, if we know that the WHERE clause will be executed after the FROM clause, that is, the WHERE clause will not be executed until we get a new table after INNER JOIN... ON, then we get the standard writing method:

SELECT  SP.shop_id
       ,SP.shop_name
       ,SP.product_id
       ,P.product_name
       ,P.product_type
       ,P.sale_price
       ,SP.quantity
  FROMshopproduct AS SP
 INNER JOINproduct AS P
    ON SP.product_id = P.product_id
 WHERE SP.shop_name = 'Tokyo'
   AND P.product_type = 'clothes' ;

We first give the execution order of the above query:
FROM clause - > where clause - > select clause

In other words, the two tables are first linked according to the linked columns to get a new table, and then the WHERE clause filters the rows of the new table according to two conditions. Finally, the SELECT clause selects the columns we need

In addition, it is not very common to add the conditions in the WHERE clause directly to the ON clause. At this time, it is best to enclose the link conditions and filter conditions with parentheses after the ON clause

SELECT SP.shop_id
       ,SP.shop_name
       ,SP.product_id
       ,P.product_name
       ,P.product_type
       ,P.sale_price
       ,SP.quantity
  FROM shopproduct AS SP
 INNER JOIN product AS P
    ON (SP.product_id = P.product_id
   AND SP.shop_name = 'Tokyo'
   AND P.product_type = 'clothes') ;


However, the above method of putting both filter conditions and link conditions in the ON clause is not easy to read and is not recommended
In addition, the execution order of the standard writing method of first linking and then filtering is that two complete tables are linked and then filtered. If multiple tables are to be linked or the filtering needs to be done is complex, it will be difficult to write SQL queries When using inner join in combination with WHERE clause, we can also change the task order and use the task decomposition method. First, use WHERE to filter the two tables respectively, and then join the above two sub queries

SELECT SP.shop_id
       ,SP.shop_name
       ,SP.product_id
       ,P.product_name
       ,P.product_type
       ,P.sale_price
       ,SP.quantity
  FROM (-- Subquery 1:from shopproduct The table filters out the information of Tokyo stores
        SELECT *
          FROM shopproduct
         WHERE shop_name = 'Tokyo' ) AS SP
 INNER JOIN -- Subquery 2:from product The table filters out the information of clothing products
   (SELECT *
      FROM product
     WHERE product_type = 'clothes') AS P
    ON SP.product_id = P.product_id;


First, separate the complex filter conditions in two tables, and then connect the filter results (as a table) to avoid writing complex filter conditions. Therefore, this seemingly complex writing method actually has a very clear overall logic In the process of writing a query, you should first write it in the way that is most convenient for you to understand. First solve the problem, and then think about the optimization problem
Exercise:
Find out the name and price of clothes in each store The following results are expected:

SELECT SP.*,P.*
  FROM shopproduct AS SP
 INNER JOIN product AS P
 ON (SP.product_id = P.product_id AND P.product_type = 'clothes');
SELECT  SP.shop_id, SP.shop_name, SP.product_id
       ,P.product_name, P.product_type, P.purchase_price
  FROM shopproduct AS SP 
INNER JOIN -- from product Table find out the information of clothing products
  (SELECT product_id, product_name, product_type, purchase_price
     FROM product	
    WHERE product_type = 'clothes')AS P 
   ON SP.product_id = P.product_id;

Although the above-mentioned second method includes subqueries and has more lines of code, it is more suitable for reading because the purpose of each layer is very clear, and in the case of external links, it can also avoid the failure of external links caused by the wrong use of WHERE clauses. For relevant examples, see the chapter "using external links in combination with WHERE clauses" later

Exercise:

Use the method of linking two sub queries and not using sub queries to find out the commodity information with a price lower than 2000 in Tokyo stores. I hope to get the following results

SELECT  SP.shop_id, SP.shop_name, SP.product_id
       ,P.product_name, P.product_type, P.purchase_price
  FROM shopproduct AS SP 
INNER JOIN -- from product Table find out the information of clothing products
  (SELECT product_id, product_name, product_type, purchase_price
     FROM product	
    WHERE sale_price < 2000)AS P 
   ON SP.product_id = P.product_id
	 WHERE shop_name = 'Tokyo';
SELECT  SP.*,P.*
  FROM shopproduct AS SP 
INNER JOIN product AS P
   ON (SP.product_id = P.product_id
	 AND SP.shop_name = 'Tokyo'
	 AND P.sale_price<2000);

2.1.3 using inner links in combination with the GROUP BY clause

When using inner join in combination with GROUP BY clause, it needs to be treated differently according to which table the grouping column is located in

The simplest case is to use the GROUP BY clause before the inner link

However, if the grouped column and the aggregated column are not in the same table, and neither of them is used to connect the two tables, you can only connect them first and then aggregate them
Exercise:

What are the selling prices of the highest priced goods in each store?

SELECT SP.shop_id
      ,SP.shop_name
      ,MAX(P.sale_price) AS max_price
  FROM shopproduct AS SP
 INNER JOIN product AS P
    ON SP.product_id = P.product_id
 GROUP BY SP.shop_id

**Question: * * the above query obtains the highest selling price of each commodity, but I don't know which commodity is the highest selling price How to get the name and selling price of the highest selling goods in each store?

Note: an easy way to solve this problem is to use the window function in the next chapter Of course, we can also use other knowledge we have learned, for example,
After finding the price of the highest selling commodity in each store, use this price to link with the product column, but this method will have problems when the price is not unique

2.1.4 self join

The previous inner links linked two different tables But in fact, a table can also be linked to itself. This connection is called self - linking It should be noted that self links are not the third kind of links that distinguish between inner links and outer links. Self links can be outer links or inner links. It is another classification method different from inner links and outer links

2.1.5 inner link and associated sub query

Recall the problem in the associated sub query in Section 3 of Chapter 5: find out the commodities whose selling price is higher than the average selling price of each commodity category At that time, we used association sub query to implement it

SELECT product_type, product_name, sale_price
  FROM product AS P1
 WHERE sale_price > (SELECT AVG(sale_price)
                       FROM product AS P2
                      WHERE P1.product_type = P2.product_type
                      GROUP BY product_type);

Using inner links can also solve this problem:
First, use GROUP BY to calculate the average price of each category of goods

SELECT  product_type
       ,AVG(sale_price) AS avg_price 
  FROM product 
 GROUP BY product_type;

Next, compare the above query with the table product by product_type (product category) for internal links

SELECT  P1.product_id
       ,P1.product_name
       ,P1.product_type
       ,P1.sale_price
       ,P2.avg_price
  FROM product AS P1 
 INNER JOIN 
   (SELECT product_type,AVG(sale_price) AS avg_price 
      FROM product 
     GROUP BY product_type) AS P2 
    ON P1.product_type = P2.product_type;

Finally, add the WHERE clause to find out the goods whose selling price is higher than the average price of this kind of goods The complete code is as follows:

SELECT  P1.product_id
       ,P1.product_name
       ,P1.product_type
       ,P1.sale_price
       ,P2.avg_price
  FROM product AS P1
 INNER JOIN 
   (SELECT product_type,AVG(sale_price) AS avg_price 
      FROM product 
     GROUP BY product_type) AS P2 
    ON P1.product_type = P2.product_type
 WHERE P1.sale_price > P2.avg_price;


Just in terms of the amount of code, the above method seems more complex than the associated sub query, but this does not mean that the code is more difficult to understand Through the above analysis, it is easy to find that the logic of the above code is actually more in line with our thinking. Therefore, although it looks complex, the thinking is actually clearer
For comparison, try to analyze the following codes:

SELECT  P1.product_id
       ,P1.product_name
       ,P1.product_type
       ,P1.sale_price
       ,AVG(P2.sale_price) AS avg_price
  FROM product AS P1
 INNER JOIN product AS P2
    ON P1.product_type=P2.product_type
 WHERE P1.sale_price > P2.sale_price
 GROUP BY P1.product_id,P1.product_name,P1.product_type,P1.sale_price,P2.product_type

Although the subquery is removed, the level of query statement is less, and the number of code lines seems to be less, in fact, this method may be more difficult to write In practice, you must write code hierarchically according to the idea that it is easy for you to understand, instead of spending a long time writing code that may be more efficient but more difficult for you and others to understand

2.1.6 natural join

Natural join is not the third kind of join different from inner join and outer join. It is actually a special case of inner join - when two tables are naturally joined, the equivalent inner join will be made according to the column names contained in both tables. At this time, it is not necessary to use ON to specify the connection conditions

SELECT *  FROM shopproduct NATURAL JOIN product


Exercise:
Try to write the inner link equivalent to the above natural link

SELECT  SP.product_id,SP.shop_id,SP.shop_name,SP.quantity
       ,P.product_name,P.product_type,P.sale_price
       ,P.purchase_price,P.regist_date  
  FROM shopproduct AS SP 
 INNER JOIN product AS P 
    ON SP.product_id = P.product_id


Natural links can also be used to find the common parts of two tables or subqueries. For example, the problem in the section 7-1 selecting the common part in the table - INTERSECT in the textbook: finding the common part in table product and table product2 can also be realized by natural links:

SELECT * FROM product NATURAL JOIN product2


This result is not consistent with the result given in the book. There is no sports T-shirt, which is due to the register of sports T-shirts_ The date field is empty. During natural linking, the row of sports T-shirts from product and product2 is actually equivalent linking field by field. Recall the comparison method of missing values we learned in the section 6.2isnull and is not null. The two missing values are compared with the equal sign, and the result is not true The link returns only those lines that return true for the link condition

If we modify the query statement:

SELECT * 
  FROM (SELECT product_id, product_name
          FROM product ) AS A 
NATURAL JOIN 
   (SELECT product_id, product_name 
      FROM product2) AS B;

Then you can get the correct result:

2.1.7 intersection using links

We know from the addition and subtraction of the table in the previous section that there is no intersection operation in MySQL 8.0. At that time, we solved the intersection through union and difference sets Now that we've learned about links, let's try using links to find intersections

Exercise: use inner links to find the intersection of product table and product2 table

SELECT P1.*
  FROM product AS P1
 INNER JOIN product2 AS P2
    ON (P1.product_id  = P2.product_id
   AND P1.product_name = P2.product_name
   AND P1.product_type = P2.product_type
   AND P1.sale_price   = P2.sale_price
   AND P1.regist_date  = P2.regist_date)

The following results are obtained


Note that the above results are not consistent with those of P230 - product is missing_ For the row id ='0001 ', observe the source table data and find the register of the row with less data_ Date is the missing value. Recall the IS NULL predicate mentioned in Chapter 6. We know that this is because the missing value cannot be compared with the equal sign

If we just use product_id to connect:

SELECT P1.*
  FROM product AS P1
 INNER JOIN product2 AS P2
    ON P1.product_id = P2.product_id

Query results:


It's the same this time

2.2 external join

The inner join discards the rows in the two tables that do not meet the ON condition. The opposite to the inner join is the outer join The outer link will selectively reserve unmatched rows according to the type of outer link

According to the table in which the reserved row is located, there are three forms of outer links: left, right and all outer links

The left link will save the rows in the left table that cannot be matched according to the ON clause. At this time, the rows corresponding to the right table are all missing values; Right join will save the rows in the right table that cannot be matched according to the ON clause. At this time, the rows corresponding to the left table are all missing values; The full outer join will save the rows in the two tables that cannot be matched according to the ON clause at the same time, and the rows in the corresponding other table will be filled with missing values

The corresponding grammars of the three external links are:

-- Left link     
FROM <tb_1> LEFT  OUTER JOIN <tb_2> ON <condition(s)>
-- Right link     
FROM <tb_1> RIGHT OUTER JOIN <tb_2> ON <condition(s)>
-- Total external connection
FROM <tb_1> FULL  OUTER JOIN <tb_2> ON <condition(s)>

2.2.1 left connection and right connection

Since the positions of the left table and the right table can be exchanged during linking, there is no essential difference between left and right links Next, let's take the left link as an example All contents can get the same result after changing the front and back positions of the two tables and changing the left link to the right link We will introduce the concept of global extraneous links later

2.2.2 use the left link to get information from two tables

If you carefully observe the results before and after the internal link between shopproduct and product, you will find that there are two products in the product table that are not included in the internal link results, that is, these two products are not available in any store (this usually means important business information. For example, these two products are out of stock in all stores and need to be replenished in time) Now, let's try to convert the SELECT statement of the previous inner link into a left link

Exercise: count the stores in which each commodity is sold, including those that are out of stock in each store

The code for using the left link is as follows (note that it is different from the right link in the book):

SELECT SP.shop_id
       ,SP.shop_name
       ,SP.product_id
       ,P.product_name
       ,P.sale_price
  FROM product AS P
  LEFT OUTER JOIN shopproduct AS SP
    ON SP.product_id = P.product_id;

The search results obtained from the above query are as follows (since the ORDER BY clause is not used to specify the order, the order of the results obtained by executing the above code may be different from the following figure):

By observing the above results, we can find that there are two kinds of goods: pressure cooker and ball point pen, which are not sold in all stores Because we SELECT the display order of columns in the SELECT clause and do not sort the results, this fact requires you to observe carefully

  • Key point 1: select all the information in a single table

Compared with the result of inner link, the difference is obvious, that is, the number of rows of the result is different There are 13 records in the internal link results and 15 records in the external link results. What are the two additional records? This is the key point of external connection The two extra records are pressure cooker and ball point pen. These two records do not exist in the shopproduct table, that is, these two kinds of goods are not sold in any store Because the inner link can only SELECT the data that exists in both tables, the two products that exist only in the product table do not appear in the result On the contrary, for external links, data can be read as long as it exists in a table In actual business, for example, when you want to generate a document with a fixed number of lines, you need to use external links If an inner link is used, the number of rows in the result will change according to the store inventory status when the SELECT statement is executed, and the layout of the generated document will be affected. Using an outer link can get a fixed number of rows Nevertheless, we still can't get the information that doesn't exist in the table. As a result, the store number and store name of pressure cooker and ballpoint pen are NULL (we don't know the specific information, so we have no choice) The origin of the external link name is also related to NULL, that is, the result contains information that does not exist in the original table (outside the original table) On the contrary, links that contain only information in the table are called inner links

  • External link point 2: use LEFT and RIGHT to specify the main table

The external link is also very important, that is, which table to use as the main table The final result will include all the data in the main table The keywords for specifying the main table are LEFT and RIGHT As the name suggests, when LEFT is used, the table written on the LEFT in the FROM clause is the main table, and when RIGHT is used, the table on the RIGHT is the main table RIGHT is used in code listing 7 - 11, so the table on the RIGHT, that is, the product table, is the main table We can also rewrite it like Listing 7 - 12, which means exactly the same In this way, you may be confused. Should you use LEFT or RIGHT? In fact, there is no difference in their functions. You can use either one There are usually more cases of using LEFT, but there is no reason to use it, and there is no problem with using RIGHT

By exchanging the order of the two tables and changing LEFT to RIGHT (if it was RIGHT, then LEFT), the two methods will achieve exactly the same result

2.2.3 use left join with WHERE clause

In the previous section, we learned the basic usage of the outer link and the method of using the inner link in combination with the WHERE clause. However, when using the outer link in combination with the WHERE clause, because the result of the outer link is likely to be different from that of the inner link, it will include those rows that cannot be matched in the main table, and fill in the columns in another table with missing values. Due to the existence of these rows, Therefore, when using the WHERE clause in the outer link, the situation will be somewhat different Let's take an example:

Exercise:

Use the external link to find the goods with less than 50 inventory in a store and the corresponding store from the shopproduct table and product table We hope to get the following results


Note that pressure cooker and ball point pen are out of stock in all stores, so they should also be included

According to the idea of "using inner links with WHERE clauses", we will naturally write the following code

SELECT P.product_id
       ,P.product_name
       ,P.sale_price
       ,SP.shop_id
       ,SP.shop_name
       ,SP.quantity
  FROM product AS P
  LEFT OUTER JOIN shopproduct AS SP
    ON SP.product_id = P.product_id
 WHERE quantity< 50

Unfortunately, the results are as follows:


It was observed that the returned results lacked pressure cookers and ball point pens that were not available in all stores. Smart you may easily think that by adding the judgment condition of OR``quantity IS NULL to the WHERE filter condition, you can get the expected result. However, in the actual environment, due to the large amount of data and the data quality is not as "clean" as we think, we can not easily realize the existence of problem data such as missing values. Therefore, let's think about how to rewrite our query to make it suitable for more complex real data.

Referring to the execution order of SQL queries we have mastered (from - > WHERE - > select), we found that the problem may lie in the filter criteria, because the WHERE clause will not be executed until the outer join is completed, so the rows that cannot be matched in the main table will be filtered out by the WHERE criteria.

With this in mind, we can try to move the WHERE clause before the outer join: first write a sub query to filter the goods with quantity < 50 from the shopproduct table, and then link the sub query with the main table.

We write the above ideas into SQL query statements:

SELECT P.product_id
      ,P.product_name
      ,P.sale_price
       ,SP.shop_id
      ,SP.shop_name
      ,SP.quantity 
  FROMproduct AS P
  LEFT OUTER JOIN-- Filter first quantity<50 Commodity
   (SELECT *
      FROMshopproduct
     WHERE quantity < 50 ) AS SP
    ON SP.product_id = P.product_id

The results obtained are as follows:

2.2.4 implement all external links in MySQL

With the understanding of left and right links, it is not difficult to understand the meaning of all external links In essence, all external links are reserved for all rows of the left table and the right table. If they can be associated with ON, the contents of the left table and the right table will be displayed in one row. If they cannot be associated, they will be displayed separately, and then the redundant columns will be filled with missing values.

Unfortunately, mysql8 0 currently does not support global external links, but we can unify the results of left and right links to achieve global external links.

2.3 multi table link

Usually, only two tables are connected, but sometimes more than three tables must be connected at the same time. In principle, there is no limit on the number of connected tables.

2.3.1 internal linking of multiple tables

First, create a table named Inventoryproduct. For three table joins First, we create a table to manage inventory items, assuming that the items are stored in two warehouses P001 and P002

The table creation statement is as follows:

CREATE TABLE Inventoryproduct
( inventory_id       CHAR(4) NOT NULL,
product_id         CHAR(4) NOT NULL,
inventory_quantity INTEGER NOT NULL,
PRIMARY KEY (inventory_id, product_id));

Then insert some data:

-- DML: insert data
START TRANSACTION;
INSERT INTO Inventoryproduct (inventory_id, product_id, inventory_quantity)
VALUES ('P001', '0001', 0);
INSERT INTO Inventoryproduct (inventory_id, product_id, inventory_quantity)
VALUES ('P001', '0002', 120);
INSERT INTO Inventoryproduct (inventory_id, product_id, inventory_quantity)
VALUES ('P001', '0003', 200);
INSERT INTO Inventoryproduct (inventory_id, product_id, inventory_quantity)
VALUES ('P001', '0004', 3);
INSERT INTO Inventoryproduct (inventory_id, product_id, inventory_quantity)
VALUES ('P001', '0005', 0);
INSERT INTO Inventoryproduct (inventory_id, product_id, inventory_quantity)
VALUES ('P001', '0006', 99);
INSERT INTO Inventoryproduct (inventory_id, product_id, inventory_quantity)
VALUES ('P001', '0007', 999);
INSERT INTO Inventoryproduct (inventory_id, product_id, inventory_quantity)
VALUES ('P001', '0008', 200);
INSERT INTO Inventoryproduct (inventory_id, product_id, inventory_quantity)
VALUES ('P002', '0001', 10);
INSERT INTO Inventoryproduct (inventory_id, product_id, inventory_quantity)
VALUES ('P002', '0002', 25);
INSERT INTO Inventoryproduct (inventory_id, product_id, inventory_quantity)
VALUES ('P002', '0003', 34);
INSERT INTO Inventoryproduct (inventory_id, product_id, inventory_quantity)
VALUES ('P002', '0004', 19);
INSERT INTO Inventoryproduct (inventory_id, product_id, inventory_quantity)
VALUES ('P002', '0005', 99);
INSERT INTO Inventoryproduct (inventory_id, product_id, inventory_quantity)
VALUES ('P002', '0006', 0);
INSERT INTO Inventoryproduct (inventory_id, product_id, inventory_quantity)
VALUES ('P002', '0007', 0 );
INSERT INTO Inventoryproduct (inventory_id, product_id, inventory_quantity)
VALUES ('P002', '0008', 18);
COMMIT;

Next, according to the above table, the shopproduct table and the product table, we use the inner connection to find out which goods are available in each store and the total inventory of each kind of goods

SELECT SP.shop_id
       ,SP.shop_name
       ,SP.product_id
       ,P.product_name
       ,P.sale_price
       ,IP.inventory_quantity
  FROM shopproduct AS SP
 INNER JOIN product AS P
    ON SP.product_id = P.product_id
 INNER JOIN Inventoryproduct AS IP
    ON SP.product_id = IP.product_id
 WHERE IP.inventory_id = 'P001';

The following results are obtained


We can see that when connecting the third table, the connection conditions are also specified through the ON Clause (here, the most basic equal sign is used to connect the product table as the connection condition with the product _id in the shopproduct table). Because the product table and shopproduct table have been connected, Therefore, there is no need to link the product table and Inventoryproduct table (although you can also link, the result will not change, because no new restrictions are added in essence)

Even if you want to increase the linked tables to 4 or 5... The method of adding with INNER JOIN is exactly the same

2.3.2 external connection of multiple tables

As we have learned before, external links generally have more rows than internal links, which can give more information about the main table than internal links. Using external links in multi table links also has the same effect

For example,

SELECT P.product_id
       ,P.product_name
       ,P.sale_price
       ,SP.shop_id
       ,SP.shop_name
       ,IP.inventory_quantity
  FROM product AS P
  LEFT OUTER JOIN shopproduct AS SP
ON SP.product_id = P.product_id
LEFT OUTER JOIN Inventoryproduct AS IP
ON SP.product_id = IP.product_id

Query results

2.4 ON Clause advanced - non equivalent link

At the beginning of the introduction of links, the book mentioned that in addition to using equivalent links of equality judgment, you can also use comparison operators to connect In fact, all logical operations including comparison operators (<, < =, >, > =, between) and predicate operations (LIKE, IN, NOT, etc.) can be placed in the ON clause as a join condition

2.4.1 non equivalent self join

Ranking is achieved using non equivalent self left links.

Exercise:

You want to rank the products in the product table according to their selling prices An idea based on set theory and using left-hand links is to find out all goods with a selling price of no less than it for each commodity, and then COUNT the goods with a selling price of no less than it with the COUNT function For example, for the most expensive goods,

SELECT  product_id
       ,product_name
       ,sale_price
       ,COUNT(p2_id) AS rank_id
  FROM (-- Use the left-hand link to find out the items with a price no lower than it for each item
        SELECT P1.product_id
               ,P1.product_name
               ,P1.sale_price
               ,P2.product_id AS P2_id
               ,P2.product_name AS P2_name
               ,P2.sale_price AS P2_price 
          FROM product AS P1 
          LEFT OUTER JOIN product AS P2 
            ON P1.sale_price <= P2.sale_price 
        ) AS X
 GROUP BY product_id, product_name, sale_price
 ORDER BY rank_id; 


Note 1: when the parameter of the count function is the column name, the missing value in the column will be ignored. When the parameter is *, the missing value will not be ignored
Note 2: there are some problems in the above ranking scheme - if the prices of two commodities are equal, the ranking of two commodities will be wrong. For example, the ranking of fork and punch should be sixth, but the above query will lead to the ranking of both products being seventh Try to modify the above query so that both rank sixth
Note 3: in fact, there is a special function for ranking, which is one of the newly added window functions in MySQL 8.0 (the window function will be studied in the next chapter), but in the lower version of MySQL, the above idea of self left connection can only be used

Cumulative summation using non equivalent left link:

Exercise:

Please sum the selling prices cumulatively according to the selling prices of goods from low to high [Note: this case lacks practical significance, and unnecessary complexity is caused by the same prices of two goods, but the table structure of the example database is relatively simple, and no practical examples have been come up with for the time being]

First, according to the meaning of the question, use the left link for each commodity to find the commodity that is lower or equal to the selling price of the commodity

SELECT  P1.product_id
       ,P1.product_name
       ,P1.sale_price
       ,P2.product_id AS P2_id
       ,P2.product_name AS P2_name
       ,P2.sale_price AS P2_price 
  FROM product AS P1 
  LEFT OUTER JOIN product AS P2 
    ON P1.sale_price >= P2.sale_price
 ORDER BY P1.sale_price,P1.product_id	

View query results


There seems to be no problem

Next, follow P1 product_ ID grouping, P2_price summation:

SELECT  product_id
       ,product_name
       ,sale_price
       ,SUM(P2_price) AS cum_price 
  FROM (SELECT  P1.product_id
               ,P1.product_name
               ,P1.sale_price
               ,P2.product_id AS P2_id
               ,P2.product_name AS P2_name
               ,P2.sale_price AS P2_price 
          FROM product AS P1 
          LEFT OUTER JOIN product AS P2 
            ON P1.sale_price >= P2.sale_price
         ORDER BY P1.sale_price,P1.product_id ) AS X
 GROUP BY product_id, product_name, sale_price
 ORDER BY sale_price,product_id;

The query result is:


Observing the above query results, it is found that because the selling prices of two commodities are the same, the cumulative summation error is caused when > = is used for connection, which is caused by the same selling prices of the two commodities Therefore, in fact, we should not only use > = as the connection condition before The purpose of examining our intention to establish a left-hand link is to find out the satisfaction: 1 Lower than the price of the product, or 2 The commodity itself, and 3 If the selling prices of a and B are equal, when establishing a link, if P1 A and P2 A,P2. If B establishes a connection, P1 B and P2 A establishes a connection. Therefore, according to the above constraints and using the order of ID S, the above query is further rewritten as:

SELECT	product_id, product_name, sale_price
       ,SUM(P2_price) AS cum_price 
  FROM
        (SELECT  P1.product_id, P1.product_name, P1.sale_price
                ,P2.product_id AS P2_id
                ,P2.product_name AS P2_name
                ,P2.sale_price AS P2_price 
           FROM product AS P1 
           LEFT OUTER JOIN product AS P2 
             ON ((P1.sale_price > P2.sale_price)
             OR (P1.sale_price = P2.sale_price 
            AND P1.product_id<=P2.product_id))
	      ORDER BY P1.sale_price,P1.product_id) AS X
 GROUP BY product_id, product_name, sale_price
 ORDER BY sale_price,cum_price;


The result is correct this time

2.5 cross join - cross join (Cartesian product)

In the past, A common prerequisite for both external and internal links is the link condition - ON clause, which is used to specify the link condition If you have tried A link query without this link condition, you may have found that there are many lines in the result Removing the ON clause from the join is called cross join. Cross join is also called Cartesian product, which is A mathematical term Cartesian product of two sets is to use each element in set A and each element in set B to form an ordered combination The union, intersection and difference of database tables (or sub queries) are operations such as expanding or filtering restrictions ON the table vertically, which requires that the number of columns in the table and the data types of the columns at the corresponding positions are "compatible", so these operations will not add new columns, while cross connection (Descartes product) is to expand the table horizontally, that is, add new columns, which is consistent with the function of connection However, without the limitation of the ON clause, each row of the left table and the right table will be combined, which often leads to many meaningless rows in the search results Of course, cross - linking can also be useful in some query requirements

The syntax of cross linking has the following forms:

-- 1.Use keywords CROSS JOIN Explicit cross linking
SELECT SP.shop_id
       ,SP.shop_name
       ,SP.product_id
       ,P.product_name
       ,P.sale_price
  FROM shopproduct AS SP
 CROSS JOIN product AS P;
-- 2.Separate two tables with commas,And omit ON clause
SELECT SP.shop_id
       ,SP.shop_name
       ,SP.product_id
       ,P.product_name
       ,P.sale_price
  FROM shopproduct AS SP ,product AS P;


Please try to execute the above statement
You may be surprised at the number of lines in the result, but let's introduce the syntax first The set operator to CROSS JOIN tables that meet the same rules is CROSS JOIN (Cartesian product) The ON Clause used in inner and outer links cannot be used when cross linking. This is because cross linking is a cross combination of all records in two tables. Therefore, the number of records in the result is usually the product of the number of rows in the two tables In this example, there are 13 records in the shopproduct table and 8 records in the product table, so 13 records are included in the result × 8 = 104 records

At this time, some readers may recall that we mentioned multiplication in set operation earlier. We will study it in detail in this section. This is the cross link introduced above Inner link is a part of cross link, and "inner" can also be understood as "the part contained in the cross link result" On the contrary, the "outside" of the outer link can be understood as "the part outside the cross link result"

There are two reasons why cross - linking is not applied to real business One is that the result has no practical value. The other is that it needs a lot of computing time and the support of high-performance equipment because of the large number of rows of the result

2.5.1 relationship between [extended reading] link and Cartesian product

By examining Cartesian product and connection, it is not difficult to find that Cartesian product can be regarded as a special connection (in fact, the syntax of Cartesian product can also be written as CROSS JOIN). The ON clause of this connection is a predicate that is always true

On the other hand, after the Cartesian product is properly restricted, the inner and outer links are obtained

For example, for the shopproduct table and the product table, first establish the Cartesian product:

SELECT SP.*, P.*
  FROM shopproduct AS SP 
 CROSS JOIN product AS P;


Then add the filter condition sp.product to the above Cartesian product_ id=P.product_ ID, the result consistent with the inner link is obtained:

SELECT SP.*, P.*
  FROM shopproduct AS SP 
 CROSS JOIN product AS P
 WHERE SP.product_id = P.product_id;

The query results are as follows:

In fact, as mentioned in the book, in the above writing method, changing CROSS JOIN to comma is the old writing method of inner link. However, in the SQL-92 standards of ANSI and ISO, the writing method of inner Jeon... ON... Has been specified as the standard writing method. Therefore, it is highly recommended that you use the standard writing method when writing SQL queries at ordinary times

2.6 link specific syntax and outdated syntax

On the basis of Cartesian product, we add a WHERE clause and add the previous join condition as the filter condition. We will find that the result is just the result of using inner join directly

Try to run the following query and compare the query results with the results of the first example in the inner link section

SELECT SP.shop_id
      ,SP.shop_name
      ,SP.product_id
       ,P.product_name
       ,P.sale_price
  FROM shopproduct AS SP
 CROSS JOIN product AS P
 WHERE SP.product_id = P.product_id;


We find that the results of these two statements are the same
The syntax of inner link and outer link we learned earlier complies with the provisions of standard SQL and can be executed in all DBMS, so you can use it safely However, if we later engage in system development or read legacy SQL query statements, we will encounter the need to read and maintain the code written by others, and the programs using specific and outdated syntax will become our trouble

SQL is a language with many specific grammars and outdated grammars. Although it has been mentioned many times in this book, links are a part of specific grammars. Now many programmers and system engineers still use these specific grammars For example, the result of replacing the inline SELECT statement originally described in this section with outdated syntax is as follows

Inner links using outdated syntax (the result is the same as listing 7-9)

SELECT SP.shop_id
       ,SP.shop_name
       ,SP.product_id
       ,P.product_name
       ,P.sale_price
  FROM shopproduct SP,product P
 WHERE SP.product_id = P.product_id
   AND SP.shop_id = '000A';


The result of this writing method is exactly the same as the standard syntax, and this syntax can be executed in all DBMS. It is not a specific syntax, but outdated
However, because this syntax is not only outdated, but also has many other problems, it is not recommended for the following three reasons:

First, it is impossible to immediately determine whether it is an inner link or an outer link (or other types of links) using this grammar

Second, because the link conditions are written in the WHERE clause, it is impossible to distinguish which part is the link condition and which part is the constraint for selecting records in a short time

Third, we don't know how long this grammar will last Every DBMS developer will consider abandoning outdated syntax and supporting new syntax Although not immediately out of use, but that day will come

Despite this, there are still many programs written with these outdated syntax, which can run normally so far I think you are likely to encounter such code, so I hope you can understand this knowledge

3, Exercises

3.1 exercise 1

Find out the basic information of products with a price higher than 500 in product and product2.

SELECT *
FROM product
WHERE sale_price > 500

UNION

SELECT *
FROM product2
WHERE sale_price > 500

3.2 exercise 2

With the help of the implementation of symmetric difference, find the intersection of product and product2.

SELECT *
  FROM product
	WHERE product_id IN (SELECT product_id FROM product2)

UNION

SELECT *
  FROM product2
	WHERE product_id IN (SELECT product_id FROM product)

3.3 exercise 3

Which stores sell the highest priced goods in each category?

SELECT SP.shop_id,SP.shop_name,SP.product_id
FROM shop_product AS SP
WHERE SP.product_id in 
(SELECT p1.product_id FROM product AS p1
INNER JOIN 
(SELECT product_type, MAX(sale_price) AS max_sale_price
FROM product
GROUP BY product_type) AS p2
ON p1.product_type = p2.product_type
AND p1.sale_price = p2.max_sale_price) 


Among the three types of goods, there are three maximum values. The query results are goods with ID 0003, 0005 and 0008, but in the shop_ In the product table, there are no corresponding stores selling 0005 and 0008 goods, so the result is only the information of the stores selling 0003 goods.

3.4 exercise 4

Use the inner link and the associated sub link to query the highest selling price of each category of goods.

  • Inner link
SELECT p1.product_id, p1.product_name, p1.product_type, p1.sale_price, p2.max_sale_price 
FROM product AS p1
INNER JOIN 
	(SELECT product_type, MAX(sale_price) AS max_sale_price
	FROM product
	GROUP BY product_type) AS p2
ON p1.product_type = p2.product_type
AND p1.sale_price = p2.max_sale_price;

  • Associated subquery
SELECT p1.product_id, p1.product_name, p1.product_type, p1.sale_price
FROM product AS p1
WHERE p1.sale_price = (SELECT MAX(sale_price) AS max_sale_price
											 FROM product AS p2
											 WHERE p1.product_type = p2.product_type
											 GROUP BY product_type);

3.5 exercise 5

Use the associated sub query: in the product table, retrieve the product_id, produc_name, slae_price, and sort according to the selling price of goods from low to high, and sum the selling prices cumulatively.

select product_id, product_name, sale_price, 
	(
		select sum(sale_price)
        from product p2
        where p2.sale_price <= p1.sale_price
	) sum
from product p1
order by sale_price;

4, Summary

This chapter contains a lot of content and advanced knowledge of SQL, which is not as easy to understand as before. I didn't encounter so many complex problems when I studied with my classmates. This chapter has indeed learned a lot. Of course, it's not enough to go through this chapter because it's very important. Come back and read this blog after a while, We have to learn again ~ it's a long way to learn, let's refuel ~!

Keywords: Database SQL

Added by adt2007 on Tue, 18 Jan 2022 20:49:49 +0200