MySQL window function definition and usage (version 8.0 +)

The document link (window function) of the official mysql English reference manual is as follows: https://dev.mysql.com/doc/refman/8.0/en/window-functions-usage.html
There is no change in the overall situation.

1, Window function definition

The example table is as follows:

mysql> SELECT * FROM sales ORDER BY country, year, product;
+------+---------+------------+--------+
| year | country | product    | profit |
+------+---------+------------+--------+
| 2000 | Finland | Computer   |   1500 |
| 2000 | Finland | Phone      |    100 |
| 2001 | Finland | Phone      |     10 |
| 2000 | India   | Calculator |     75 |
| 2000 | India   | Calculator |     75 |
| 2000 | India   | Computer   |   1200 |
| 2000 | USA     | Calculator |     75 |
| 2000 | USA     | Computer   |   1500 |
| 2001 | USA     | Calculator |     50 |
| 2001 | USA     | Computer   |   1500 |
| 2001 | USA     | Computer   |   1200 |
| 2001 | USA     | TV         |    150 |
| 2001 | USA     | TV         |    100 |
+------+---------+------------+--------+

Window functions perform aggregation like operations on a set of query rows. However, the aggregation operation groups query rows into a single result row, and the window function generates results for each query row:

  • The row on which the function calculation occurs is called the current row.
  • The query rows related to the current row where the function calculation occurs form the window of the current row.
    For example, using the sales information table, these two queries perform aggregation operations to generate a single global and the sum of all rows grouped by country as a group:
mysql> SELECT SUM(profit) AS total_profit
       FROM sales;
+--------------+
| total_profit |
+--------------+
|         7535 |
+--------------+
mysql> SELECT country, SUM(profit) AS country_profit
       FROM sales
       GROUP BY country
       ORDER BY country;
+---------+----------------+
| country | country_profit |
+---------+----------------+
| Finland |           1610 |
| India   |           1350 |
| USA     |           4575 |
+---------+----------------+

In contrast, window operations do not collapse query row groups into a single output row. Instead, they generate a result for each row. Like the previous query, the following query uses SUM(), but this time as a window function:

mysql> SELECT
         year, country, product, profit,
         SUM(profit) OVER() AS total_profit,
         SUM(profit) OVER(PARTITION BY country) AS country_profit
       FROM sales
       ORDER BY country, year, product, profit;
+------+---------+------------+--------+--------------+----------------+
| year | country | product    | profit | total_profit | country_profit |
+------+---------+------------+--------+--------------+----------------+
| 2000 | Finland | Computer   |   1500 |         7535 |           1610 |
| 2000 | Finland | Phone      |    100 |         7535 |           1610 |
| 2001 | Finland | Phone      |     10 |         7535 |           1610 |
| 2000 | India   | Calculator |     75 |         7535 |           1350 |
| 2000 | India   | Calculator |     75 |         7535 |           1350 |
| 2000 | India   | Computer   |   1200 |         7535 |           1350 |
| 2000 | USA     | Calculator |     75 |         7535 |           4575 |
| 2000 | USA     | Computer   |   1500 |         7535 |           4575 |
| 2001 | USA     | Calculator |     50 |         7535 |           4575 |
| 2001 | USA     | Computer   |   1200 |         7535 |           4575 |
| 2001 | USA     | Computer   |   1500 |         7535 |           4575 |
| 2001 | USA     | TV         |    100 |         7535 |           4575 |
| 2001 | USA     | TV         |    150 |         7535 |           4575 |
+------+---------+------------+--------+--------------+----------------+

Each window operation in the query is represented by including an OVER clause, which specifies how to divide the query rows into groups for processing by the window function:

  • The first OVER clause is empty and treats the entire query rowset as a separate partition. Therefore, the window function produces a global sum, but it does so for each row.
  • The second OVER clause partitions rows by country, generating the sum of each partition (each country). This function generates this sum for each partition row.

Window functions are only allowed in the select list and ORDER BY clause.
The query result line is determined from the from clause. After WHERE, GROUP BY and HAVING, the window is executed before ORDER BY, LIMIT and SELECT DISTINCT.

Many aggregate functions allow the over clause, so you can use it as a window function or a non window function, depending on whether the over clause exists: (you can use the over clause as a list of aggregate functions for window functions)

AVG()
BIT_AND()
BIT_OR()
BIT_XOR()
COUNT()
JSON_ARRAYAGG()
JSON_OBJECTAGG()
MAX()
MIN()
STDDEV_POP(), STDDEV(), STD()
STDDEV_SAMP()
SUM()
VAR_POP(), VARIANCE()
VAR_SAMP()

Reference for detailed use of this part: https://dev.mysql.com/doc/refman/8.0/en/aggregate-functions.html

MySQL also supports non aggregate functions that are used only as window functions. For these functions, the OVER clause is mandatory:

CUME_DIST()
DENSE_RANK()
FIRST_VALUE()
LAG()
LAST_VALUE()
LEAD()
NTH_VALUE()
NTILE()
PERCENT_RANK()
RANK()
ROW_NUMBER()

For detailed use of this part, please refer to: https://dev.mysql.com/doc/refman/8.0/en/window-function-descriptions.html

As an example of these non aggregate window functions, the query uses ROW_NUMBER(), which generates the line number of each row in its partition. In this example, the rows are numbered by country. By default, partition rows are unordered and the row number is uncertain. To sort partitioned rows, include an ORDER BY clause in the window definition. The query uses unordered partitions and ordered partitions (row_num1 and row_num2 columns) to illustrate the difference between omitting ORDER BY and including ORDER BY:

mysql> SELECT
         year, country, product, profit,
         ROW_NUMBER() OVER(PARTITION BY country) AS row_num1,
         ROW_NUMBER() OVER(PARTITION BY country ORDER BY year, product) AS row_num2
       FROM sales;
+------+---------+------------+--------+----------+----------+
| year | country | product    | profit | row_num1 | row_num2 |
+------+---------+------------+--------+----------+----------+
| 2000 | Finland | Computer   |   1500 |        2 |        1 |
| 2000 | Finland | Phone      |    100 |        1 |        2 |
| 2001 | Finland | Phone      |     10 |        3 |        3 |
| 2000 | India   | Calculator |     75 |        2 |        1 |
| 2000 | India   | Calculator |     75 |        3 |        2 |
| 2000 | India   | Computer   |   1200 |        1 |        3 |
| 2000 | USA     | Calculator |     75 |        5 |        1 |
| 2000 | USA     | Computer   |   1500 |        4 |        2 |
| 2001 | USA     | Calculator |     50 |        2 |        3 |
| 2001 | USA     | Computer   |   1500 |        3 |        4 |
| 2001 | USA     | Computer   |   1200 |        7 |        5 |
| 2001 | USA     | TV         |    150 |        1 |        6 |
| 2001 | USA     | TV         |    100 |        6 |        7 |
+------+---------+------------+--------+----------+----------+

As mentioned earlier, to use a window function (or treat an aggregate function as a window function), include the OVER clause after the function call. The OVER clause has two forms:

over_clause:
    {OVER (window_spec) | OVER window_name}

Both forms define how the window function should handle query rows. The difference between them is whether the window is defined directly in the OVER clause or provided by the reference of the specified window defined elsewhere in the query:

  • In the first case, the window specification appears directly between the parentheses of the OVER clause.
  • In the second case, window_name is the name of the window specification defined by the window clause elsewhere in the query. For details, see 12.21.4 chapter "name Windows".

For the OVER (window_spec) syntax, the window specification has several parts, all of which are optional:

window_spec:
    [window_name] [partition_clause] [order_clause] [frame_clause]

If OVER() is empty, the window consists of all query rows, and the window function uses all rows to calculate the result. Otherwise, the clause in parentheses determines which query rows are used to evaluate the function results and how to partition and sort them:

  • window_name: the name of the window defined by the window clause elsewhere in the query. If window_ If name appears alone in the OVER clause, it fully defines the window. If you also give partitioning, sorting, or box sentences, they modify the interpretation of the named window. See 12.21.4 "name Windows" for details.
  • partition_ Claim: the PARTITION BY clause indicates how the query rows are divided into groups. The result of the window function for a given row is based on the row of the partition containing the row. If PARTITION BY is omitted, there is a single partition consisting of all query rows. (the partition of window function is different from that of table, so don't confuse it.). The partition can be written as follows:
partition_clause:
    PARTITION BY expr [, expr] ...

Standard SQL only requires PARTITION BY followed by column name. MySQL extensions allow expressions, not just column names. For example, if a table contains a TIMESTAMP column named ts, standard SQL allows PARTITION BY ts, but does not allow PARTITION BY HOUR(ts), while MySQL allows both.

  • order_ Claim: the ORDER BY clause indicates how the rows in each partition are sorted. According to the ORDER BY clause, equal partitioned rows are considered equal. If ORDER BY is omitted, the partition lines are unordered, there is no implicit processing order, and all partition lines are equal.
    The syntax of order by is as follows:
order_clause:
    ORDER BY expr [ASC|DESC] [, expr [ASC|DESC]] ...

Each ORDER BY expression can be followed by ASC or DESC to indicate the sorting direction. If no direction is specified, it defaults to ASC. NULL values are sorted first in ascending order and finally in descending order. ORDER BY in the window definition is applied to each partition. To sort the result set as a whole, include an ORDER BY at the top level of the query.

  • frame_ Claim: a frame is a subset of the current partition. The frame clause specifies how to define the subset. The framework clause has many sub clauses. For details, see 12.21.3 window function box specification.

Keywords: Database MySQL

Added by Bob_PHP_Builder on Sun, 23 Jan 2022 17:35:57 +0200