Alibaba cloud Tianchi Longzhu plan sql -- stack07

Set operation of table, addition and subtraction

A set represents "the sum of all kinds of things" in the field of mathematics and a set of records in the field of database Specifically, the execution results of tables, views and queries are a collection of records, in which the elements are each row in the table or query results.
In standard SQL, union, intersect and excel are used to merge, intersect and difference the search results. Operators such as union, intersect and excel used for set operation are called set operators.
Table addition union

SELECT product_id, product_name
  FROM product
 UNION
SELECT product_id, product_name
  FROM product2;

union and or predicates
For two different filter result sets of the same table, UNION is used to merge the two result sets, and the filter conditions of the two sub queries are connected with OR predicate to get the same result. However, if you want to combine the results in two different tables, you have to use UNION
Moreover, even for the same table, UNION is sometimes used for query efficiency
Use UNION OR or predicates respectively to find out commodities with gross profit rate less than 30% OR unknown gross profit rate

-- use OR predicate
SELECT * 
  FROM product 
 WHERE sale_price / purchase_price < 1.3 
    OR sale_price / purchase_price IS NULL;
-- use UNION
SELECT * 
  FROM product 
 WHERE sale_price / purchase_price < 1.3
 
 UNION
SELECT * 
  FROM product 
 WHERE sale_price / purchase_price IS NULL;

Set operation with duplicate rows union all
The UNION of SQL statement will merge and de duplicate the result sets of two queries. This de duplication will not only remove the duplicate of two result sets, but also remove the duplicate rows in one result set However, in practice, it is sometimes necessary to combine without duplication. In fact, the syntax of retaining duplicate lines in the result of UNION is very simple. You only need to add the ALL keyword after UNION

-- Keep duplicate lines
SELECT product_id, product_name
  FROM product
 UNION ALL
SELECT product_id, product_name
  FROM product2;

bag model and set model
We learned in high school mathematics class that a remarkable feature of a set is that the elements in the set are different from each other When we regard the tables in the database as collections, there are actually some problems: whether it is intentional design or unintentional negligence, many tables in the database contain duplicate rows
Bag is a mathematical structure similar to set. The difference is that duplicate elements are allowed in bag. If the same element is added multiple times, there will be multiple elements in the bag
Through the difference between bag and set, we find that bag model is more suitable to describe the tables in the database in many cases
Whether to allow element repetition leads to some differences between set and bag Take the intersection of bags as an example. Since bags allow elements to appear repeatedly, for two bags, their union operation will be as follows: 1 Whether the element has appeared in at least one bag, 2 The maximum number of occurrences of this element in two bags is calculated Therefore, for a = {1,1,1,2,3,5,7} and B = {1,1,2,2,4,6,8}, their sum is equal to {1,1,1,2,2,3,4,5,6,7,8}

Implicit type conversion

We will use UNION to merge the columns with exactly the same type and representing the same attribute, but sometimes, even if the data types are not exactly the same, we will use implicit type conversion to display two columns with different types in one column, such as string and numeric value types

SELECT product_id, product_name, '1'
  FROM product
 UNION
SELECT product_id, product_name,sale_price
  FROM product2;

mysql 8.0 does not support intersect operation
The intersection of sets is the common part of two sets. Due to the mutual anisotropy of set elements, the significance of the intersection of sets can be seen intuitively only through Wen's graph
Although the delivery calculation of sets has appeared in the SQL standard for many years, it is a pity that MySQL still does not support INTERSECT operation as of MySQL version 8.0
[extended reading] intersection operation of bag
For two bags, their intersection operation will be as follows: 1 Whether the element belongs to two bags at the same time, 2 The minimum number of occurrences of this element in two bags is calculated Therefore, for two bags a = {1,1,1,2,3,5,7} and B = {1,1,2,2,4,6,8}, their delivery calculation result is equal to {1,1,2}

Subtraction of difference set, complement set and table

The subtraction operation of finding the set difference set is somewhat different from the subtraction operation of real numbers. When subtracting another set B from one set A, the strategy of directly ignoring the elements that only exist in set B but not in set A is adopted. Therefore, the subtraction of sets A and B is only to subtract the elements that also belong to set B in set A

MySQL 8.0 does not support excel operation yet
MySQL 8.0 does not support the subtraction operator excel However, with the help of the NOT IN predicate learned in Chapter 6, we can also realize table subtraction

Find out what only exists in product Table but does not exist in product2 Table of goods.

-- use IN Implementation method of clause
SELECT * 
  FROM product
 WHERE product_id NOT IN (SELECT product_id 
                            FROM product2)

Difference between excel all and bag
Similar to UNION ALL, EXCEPT ALL performs subtraction according to the number of occurrences and also uses bag model for operation

For two bag s, their difference will be calculated as follows:

1. Whether the element belongs to the bag as the subtracted number,

2. The number of occurrences of this element in two bag s

These two aspects are calculated Only the elements belonging to the bag of the subtracted number participate in the excel all operation, and the number of times in the difference bag is equal to the difference of the occurrence times of the element in the two bags (if the difference is zero or negative, it does not appear) Therefore, for two bags a = {1,1,1,2,3,5,7} and B = {1,1,2,2,4,6,8}, their difference is equal to {1,3,5,7}

INTERSECT AND and predicates
For two query results of the same table, their INTERSECT can actually connect the search conditions of the two queries with AND predicate equivalently

Use the AND predicate to find products in the product table with a profit margin of more than 50% AND a selling price of less than 1500

SELECT * 
  FROM product
 WHERE sale_price > 1.5 * purchase_price 
   AND sale_price < 1500

Symmetry difference

The symmetry difference between two sets A and B refers to the set composed of elements belonging only to A or B Symmetry difference is also A very basic operation. For example, the intersection of two sets can be regarded as the intersection of two sets, and the symmetry difference of two sets can be removed The above method can also be used in other databases to simply realize the symmetric difference operation of tables or query results: first use UNION to find the UNION of two tables, then use INTERSECT to find the intersection of two tables, and then subtract the intersection from the UNION to get the symmetric difference
Because the of two tables or query results can not be calculated directly, it is not suitable to use the above idea to calculate the symmetry difference Fortunately, there are also difference set operations that can be used It can be seen intuitively that the symmetry difference between two sets is equal to A-B and B-A, so this idea can be used to calculate the symmetry difference in practice

Use the symmetry difference between the product table and the product2 table to query which products are in only one table,
Using NOT IN to realize the difference set of two tables

-- use NOT IN Implement the difference set of two tables
SELECT * 
  FROM product
 WHERE product_id NOT IN (SELECT product_id FROM product2)
UNION
SELECT * 
  FROM product2
 WHERE product_id NOT IN (SELECT product_id FROM product)

The intersection operation INTERSECT is realized by means of Union and difference set detour
By observing the Venn's graph of set operation, we find that the intersection of two sets can be regarded as the intersection of two sets, and the symmetry difference between the two sets is removed

join


Adding a join condition to a set is usually done by adding a join condition to other columns It can be said that the link is the core operation of SQL query. If you master the link, you can obtain columns from two or even more tables, simplify the overly complex queries such as associated sub queries in the past into a more readable form, and carry out some more complex queries
Inner link
The syntax format of the inner link is:

-- Inner link
FROM <tb_1> INNER JOIN <tb_2> ON <condition(s)>

Find out the store number, store name, commodity number, commodity name, commodity category, commodity selling price and commodity quantity information of each store

According to the syntax of INNER JOIN, use INNER JOIN in the FROM clause to join the two tables, and specify the join condition for the ON clause as shopproduct product_ id=product. product_ ID to get the following query statement:

SELECT SP.shop_id
       ,SP.shop_name
       ,SP.product_id
       ,P.product_name
       ,P.product_type
       ,P.sale_price
       ,SP.quantity
  FROMshopproduct AS SP
 INNER JOIN product AS P
    ON SP.product_id = P.product_id;

be careful:
When joining, you need to use multiple tables in the FROM clause
You must use the ON clause to specify the join condition
The columns in the SELECT clause should preferably follow the table name The format of the column name

Use inner join with WHERE clause

1. The method of adding WEHRE clause is to take the above query as a sub query, wrap it in parentheses, and then add filter conditions to the outer query

SELECT *
  FROM (-- Step 1: query results
        SELECT SP.shop_id
               ,SP.shop_name
               ,SP.product_id
               ,P.product_name
               ,P.product_type
               ,P.sale_price
               ,SP.quantity
          FROM shopproduct AS SP
         INNER JOINproduct AS P
            ON SP.product_id = P.product_id) AS STEP1
 WHERE shop_name = 'Tokyo'
   AND product_type = 'clothes' ;

--Execution sequence:
FROM clause->WHERE clause->SELECT clause

Use inner join with GROUP BY clause

When using inner join in combination with GROUP BY clause, it needs to be treated differently according to which table the grouping column is located in
The simplest case is to use the GROUP BY clause before the inner link
However, if the grouped column and the aggregated column are not in the same table, and neither of them is used to connect the two tables, you can only connect them first and then aggregate them

-- Reference answer
SELECT SP.shop_id
      ,SP.shop_name
      ,MAX(P.sale_price) AS max_price
  FROM shopproduct AS SP
 INNER JOINproduct AS P
    ON SP.product_id = P.product_id
 GROUP BY SP.shop_id,SP.shop_name

self join
In the previous internal links, two different tables were linked But in fact, a table can also be connected with itself. This connection is called self - connection It should be noted that self links are not the third kind of links that distinguish between inner links and outer links. Self links can be outer links or inner links. It is another classification method different from inner links and outer links

Inner link and associated subquery

--Previously used associated subquery
SELECT product_type, product_name, sale_price
  FROM product AS P1
 WHERE sale_price > (SELECT AVG(sale_price)
                       FROM product AS P2
                      WHERE P1.product_type = P2.product_type
                      GROUP BY product_type);

--Now use inner links

1.use group by Calculate the average price of goods by commodity category
2.Compare the above query with the table product according to product_type(Product category) for internal connection
3.increase where Clause to find goods whose selling price is higher than the average price of such goods

SELECT  P1.product_id
       ,P1.product_name
       ,P1.product_type
       ,P1.sale_price
       ,P2.avg_price
  FROM product AS P1
 INNER JOIN 
   (SELECT product_type,AVG(sale_price) AS avg_price 
      FROM product 
     GROUP BY product_type) AS P2 
    ON P1.product_type = P2.product_type
 WHERE P1.sale_price > P2.avg_price;

Natural join

Natural links are not the third kind of links that are different from inner links and outer links. They are actually a special case of inner links - when two tables are naturally linked, equivalent inner links will be made according to the column names contained in both tables. At this time, it is not necessary to use ON to specify the connection conditions

--Natural connection
SELECT *  FROM shopproduct NATURAL JOIN product
 The results of the above query, Will put the common columns of the two tables(Here is product_id, There can be multiple public columns)Put in the first column, Then follow the order of the two tables and the order of the columns in the table, List the other columns in both tables.

--Equivalent inner link
SELECT  SP.product_id,SP.shop_id,SP.shop_name,SP.quantity
       ,P.product_name,P.product_type,P.sale_price
       ,P.purchase_price,P.regist_date  
  FROM shopproduct AS SP 
 INNER JOIN product AS P 
    ON SP.product_id = P.product_id

Using natural links, you can also find the common part of two tables or subqueries

SELECT * FROM product NATURAL JOIN product2

When a field value is empty, this method needs to be modified

SELECT * 
  FROM (SELECT product_id, product_name
          FROM product ) AS A 
NATURAL JOIN 
   (SELECT product_id, product_name 
      FROM product2) AS B;

Intersection using links
Use inner join to find the intersection of product table and product2 table

SELECT P1.*
  FROMproduct AS P1
 INNER JOINproduct2 AS P2
    ON (P1.product_id  = P2.product_id
   AND P1.product_name = P2.product_name
   AND P1.product_type = P2.product_type
   AND P1.sale_price   = P2.sale_price
   AND P1.regist_date  = P2.regist_date)

When a field value is empty, this method needs to be modified

outer join

The inner join discards the rows in the two tables that do not meet the ON condition. The opposite of the inner join is the outer join The outer link will selectively reserve the unmatched rows according to the type of outer link

According to the table in which the reserved row is located, there are three forms of outer links: left, right and all outer links

The left link will save the rows in the left table that cannot be matched according to the ON clause. At this time, the rows corresponding to the right table are all missing values; The right link will save the rows in the right table that cannot be matched according to the ON clause. At this time, the rows corresponding to the left table are all missing values; The full outer join will save the rows in the two tables that cannot be matched according to the ON clause at the same time, and the rows in the corresponding other table will be filled with missing values

The corresponding grammars of the three external links are:

-- Left link     
FROM <tb_1> LEFT  OUTER JOIN <tb_2> ON <condition(s)>
-- Right link     
FROM <tb_1> RIGHT OUTER JOIN <tb_2> ON <condition(s)>
-- Total external connection
FROM <tb_1> FULL  OUTER JOIN <tb_2> ON <condition(s)>

Left and right links are similar

Key point 1: select all the information in a single table
External link point 2: use LEFT and RIGHT to specify the main table

Use left join with where clause

However, when using an outer link in combination with a WHERE clause, because the result of the outer link is likely to be different from that of the inner link, it will include those rows that cannot be matched in the main table, and fill in the columns in another table with missing values. Due to the existence of these rows, the situation will be somewhat different when using a WHERE clause in the outer link

SELECT P.product_id
       ,P.product_name
       ,P.sale_price
       ,SP.shop_id
       ,SP.shop_name
       ,SP.quantity
  FROMproduct AS P
  LEFT OUTER JOINshopproduct AS SP
    ON SP.product_id = P.product_id
 WHERE quantity< 50

Observation findings,The returned result is missing the pressure cooker and ballpoint pen that are out of stock in all stores. Smart you may easily think that in WHERE Add in filter condition OR``quantity IS NULL According to the judgment conditions, the expected results can be obtained.

In the actual environment, due to the large amount of data and data quality, it is not as we imagine"clean",We are not easily aware of the existence of problem data such as missing values

External link

With the understanding of left and right links, it is not difficult to understand the meaning of all external links In essence, all external links are reserved for all rows of the left table and the right table. If they can be associated with ON, the contents of the left table and the right table will be displayed in one row. If they cannot be associated, they will be displayed separately, and then the redundant columns will be filled with missing values.

Multiple intra table links

According to the above table, the shopproduct table and the product table, we use the inner link to find out which goods are available in each store and the total inventory of each kind of goods

SELECT SP.shop_id
       ,SP.shop_name
       ,SP.product_id
       ,P.product_name
       ,P.sale_price
       ,IP.inventory_quantity
  FROMshopproduct AS SP
 INNER JOINproduct AS P
    ON SP.product_id = P.product_id
 INNER JOIN Inventoryproduct AS IP
    ON SP.product_id = IP.product_id
 WHERE IP.inventory_id = 'P001';

External linking of multiple tables

External links can generally have more rows than internal links, so they can give more information about the main table than internal links. The use of external links in multi table links also has the same effect

SELECT P.product_id
       ,P.product_name
       ,P.sale_price
       ,SP.shop_id
       ,SP.shop_name
       ,IP.inventory_quantity
  FROMproduct AS P
  LEFT OUTER JOINshopproduct AS SP
ON SP.product_id = P.product_id
LEFT OUTER JOIN Inventoryproduct AS IP
ON SP.product_id = IP.product_id

ON Clause advanced - non equivalent link
At the beginning of the introduction of links, the book mentioned that in addition to using equivalent links of equality judgment, you can also use comparison operators to connect In fact, all logical operations including comparison operators (<, < =, >, > =, between) and predicate operations (LIKE, IN, NOT, etc.) can be placed in the ON clause as a join condition

Non equivalent self left join
Ranking using non equivalent self left links

Hope yes product The goods in the table are ranked according to the selling price. Starting from set theory,The idea of using left-hand links is, For every commodity,Find out all goods that sell for no less than it, Then use it for goods that sell for no less than it COUNT Function count. for example, For the most expensive goods,

SELECT  product_id
       ,product_name
       ,sale_price
       ,COUNT(p2_id) AS rank_id
  FROM (--Use the left-hand link to find out the products with a price no lower than it for each product
        SELECT P1.product_id
               ,P1.product_name
               ,P1.sale_price
               ,P2.product_id AS P2_id
               ,P2.product_name AS P2_name
               ,P2.sale_price AS P2_price 
          FROM product AS P1 
          LEFT OUTER JOIN product AS P2 
            ON P1.sale_price <= P2.sale_price 
        ) AS X
 GROUP BY product_id, product_name, sale_price
 ORDER BY rank_id; 

Cross linking - ceoss join (Cartesian product)

Cartesian product of two sets is to use each element in set A and each element in set B to form an ordered combination

The union, intersection and difference of a database table (or sub query) are operations such as expanding or filtering restrictions ON the table vertically, which requires that the number of columns in the table and the data types of the columns in the corresponding position are "compatible", so these operations will not add new columns, while cross connection (Descartes product) is to expand the table horizontally, that is, add new columns, which is consistent with the function of connection However, without the limitation of the ON clause, each row of the left table and the right table will be combined, which often leads to many meaningless rows appearing in the search results

The syntax of cross linking has the following forms:

-- 1.Use keywords CROSS JOIN Cross connect explicitly
SELECT SP.shop_id
       ,SP.shop_name
       ,SP.product_id
       ,P.product_name
       ,P.sale_price
  FROMshopproduct AS SP
 CROSS JOINproduct AS P;
--2.Separate two tables with commas,And omit ON clause
SELECT SP.shop_id
       ,SP.shop_name
       ,SP.product_id
       ,P.product_name
       ,P.sale_price
  FROMshopproduct AS SP ,product AS P;

Exercises

This time the topic is simple, so I won't write it up

Keywords: SQL cloud computing Alibaba Cloud

Added by Hillu on Sun, 30 Jan 2022 20:48:34 +0200