1. What is a window function?
Window functions are also analysis functions. Oracle from 8.1 6 start to provide window function, which is used to calculate some aggregate value based on group,
The window function specifies the data window size of the analysis function. The data window size may change with the change of rows.
Unlike aggregate functions, which return multiple rows for each group, aggregate functions return only one row for each group
Basic grammar: ‹Analysis function› over (partition by ‹Column name for grouping› order by ‹Column name for sorting›). In grammar‹Analysis function›Mainly composed of sequence function( rank,dense_rank and row_number Equal composition) And aggregate function( sum,avg,count,max and min Etc.) as a window function.
From the perspective of window function composition, it is a function combination of group by and order by. Group by changes the number of rows in the table after grouping and summary. Each row has only one category, while partition by does not reduce the number of rows in the original table.
Just like the composition of window function, it has the functions of grouping and sorting at the same time, and does not reduce the number of rows in the original table.
The OVER keyword means that the function is treated as a window function rather than an aggregate function. The SQL standard allows all aggregate functions to be used as window functions, and the OVER keyword is used to distinguish the two usages.
2. Window function - open window
Options are often added in parentheses after the OVER keyword to change the scope of the window for aggregation. If the options in parentheses after the OVER keyword are empty, the window function aggregates all rows in the result set.
Analysis function over(partition by Listing order by Listing rows between Start position and End position)
Why open the window?
Because the partition() function in the over() bracket can open a separate window for the queried data. For example, query the ranking of students in each class, query the population of each country over the years, and so on. A window is opened in each class and country to execute orders independently.
Rows and range respectively represent the last few rows and the selected data range
Understand the meaning of rows between, also known as window clause:
- preceding: forward
- following: back
- Current row: current row
- Unbounded: borderless, unbounded preceding means starting from the front starting point, unbounded following means to the back end point
- Note: if partition by is not added, the entire dataset will be treated as a partition. If order by is not added, the statistical results of some functions will be affected, such as sum()
3. How to use some analysis functions
1. Aggregate function
Aggregate function | definition |
---|---|
sum() | Sum |
max() | Find the maximum value |
min() | Find the minimum value |
avg() | Average |
count() | Statistics |
2. Sequence function
Sequence function | definition |
---|---|
row_number() | A self incrementing number is generated when sorting by value. When the values are equal, they will not be repeated and empty bits will not be generated |
rank() | A self incrementing number is generated when sorting by value. When the values are equal, it will be repeated and empty bits will be generated |
dense_rank() | When sorting by value, a self incrementing number is generated. When the values are equal, it will be repeated without vacancy |
row_number()
select * ,row_number()over(oder by achievement desc) as ranking from Class table
Query results:
+------------+--------+------+------+------+ | full name | Gender | class | achievement | ranking | +------------+--------+------+------+------+ | Zhang San | male | 1 | 100 | 1 | | Li Si | female | 3 | 100 | 2 | | Zhang San | female | 1 | 100 | 3 | | Wang Wu | female | 2 | 99 | 4 | | Zhao Si | male | 2 | 90 | 5 | | Sun Liu | male | 2 | 90 | 6 | | Pleasant Sheep | male | 3 | 85 | 7 | | Beautiful sheep | female | 4 | 82 | 8 | | Lazy | female | 1 | 80 | 9 | | Slow sheep | female | 2 | 70 | 10 | +------------+--------+------+------+------+
rank()
select * ,rank()over(oder by achievement desc) as ranking from Class table
Query results:
+------------+--------+------+------+------+ | full name | Gender | class | achievement | ranking | +------------+--------+------+------+------+ | Zhang San | male | 1 | 100 | 1 | | Li Si | female | 3 | 100 | 1 | | Zhang San | female | 1 | 100 | 1 | | Wang Wu | female | 2 | 99 | 4 | | Zhao Si | male | 2 | 90 | 5 | | Sun Liu | male | 2 | 90 | 5 | | Pleasant Sheep | male | 3 | 85 | 7 | | Beautiful sheep | female | 4 | 82 | 8 | | Lazy | female | 1 | 80 | 9 | | Slow sheep | female | 2 | 70 | 10 | +------------+--------+------+------+------+
dense_rank()
select * ,row_number()over(oder by achievement desc) as ranking from Class table
Query results:
+------------+--------+------+------+------+ | full name | Gender | class | achievement | ranking | +------------+--------+------+------+------+ | Zhang San | male | 1 | 100 | 1 | | Li Si | female | 3 | 100 | 1 | | Zhang San | female | 1 | 100 | 1 | | Wang Wu | female | 2 | 99 | 2 | | Zhao Si | male | 2 | 90 | 3 | | Sun Liu | male | 2 | 90 | 3 | | Pleasant Sheep | male | 3 | 85 | 4 | | Beautiful sheep | female | 4 | 82 | 5 | | Lazy | female | 1 | 80 | 6 | | Slow sheep | female | 2 | 70 | 7 | +------------+--------+------+------+------+
3. Other categories
Other classes | definition |
---|---|
percent_rank() | rank value of current line in the group - 1 / total number of rows in the group - 1 |
lag() | It is used to count the value of the nth row up in the window. The first parameter is the column name, the second parameter is the nth row up (optional, the default is 1), and the third parameter is the default value (when the nth row up is NULL, the default value is taken, and if it is not specified, it is NULL |
lead() | It is used to calculate the value of the nth row down in the window. The first parameter is the column name, the second parameter is the nth row down (optional, the default is 1), and the third parameter is the default value (when the nth row down is NULL, the default value is taken, or NULL if not specified) |
ntile() | It is used to divide the grouped data into n slices in order and return the current slice value |
first_value() | Get the first value of the current row after sorting in the group |
last_value() | Get the last value of the current row after sorting in the group |
cume_dist() | Returns the number of rows less than or equal to the current value / the total number of rows in the group |
4. OVER() parameter - grouping function
partition by clause:
Window function over Keywords in parentheses can be used partition by Clause to define the partition of the row for aggregation calculation. And group by Different clauses, partition by The partition created by clause is independent of the result set, and the partition created is only for Aggregation is computed, and partitions created by different window functions do not affect each other.
5. OVER() parameter -- sort function
order by clause:
In the window function, you can over Used in options after keywords order by Clause to specify the collation, and some window functions also Requires that a collation must be specified. use order by Clause can sort the result set according to the specified collation, and Aggregate within the specified range. Syntax: ORDER BY Field name RANGE|ROWS BETWEEN Boundary rule 1 AND Boundary rule 2
- The PARTITION BY clause and ORDER BY can be used together to achieve more complex functions