Elasticsearch: paging aggregation results (using bucket_sort)

Note:

  • Requires Elasticsearch 6.1 or later; bucket_sort is not available in earlier versions.

Query:

GET 76/sessions/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "sid": {
              "value": "76e14832"
            }
          }
        },
        {
          "range": {
            "v_ymd": {
              "format": "yyyy-MM-dd", 
              "gte": "2018-02-02",
              "lte": "2018-02-02"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "pv": {
      "nested": {
        "path": "scene"
      },
      "aggs": {
        "pv2": {
          "terms": {
            "field": "scene.pid",
            "size": 1000,
            "shard_size": 10000
          },
          "aggs": {
            "pv_count": {
              "value_count": {
                "field": "scene.pid"
              }
            },
            "r_bucket_sort": {
              "bucket_sort": {
                "sort": {
                  "pv_count": {
                    "order": "desc"
                  }
                },
                "from": 10,
                "size": 10
              }
            }
          }
        }
      }
    }
  }
}

Partial explanation:

  • The outermost "size": 0 means the search returns no matching documents (hits), only the aggregation results;
  • query uses a bool query with a list of must clauses to filter the data, here a term match on sid and a date range on v_ymd;
  • terms splits the data into buckets, one per distinct value of scene.pid, similar to GROUP BY in SQL;
  • In terms, shard_size is the number of buckets each shard returns, and size is the number of buckets terms returns overall; the size in bucket_sort then limits how many of those buckets appear in the response;
  • value_count is a counting function: it counts the values of scene.pid in each bucket;
  • sort specifies the fields to sort by and whether each is sorted ascending or descending. An aggregated field such as pv_count can be used as the sort key;
  • In bucket_sort, from is the number of buckets to skip and size is the number of buckets to return. With "from": 10 and "size": 10, the response holds the 11th through 20th buckets, that is, page 2 at 10 buckets per page.
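
The mapping from a page number to from and size can be sketched in Python. This is a minimal sketch, not part of the original query: the helper name build_page_aggs and the 1-based page convention are assumptions made for illustration, and setting the terms size to exactly from + size is only one possible choice. It builds the request body as a plain dict and does not talk to a cluster.

```python
import json


def build_page_aggs(page, page_size, field="scene.pid", path="scene"):
    """Return the "aggs" part of the request for a 1-based page number.

    Hypothetical helper: the name and the paging convention are assumptions.
    """
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    offset = (page - 1) * page_size  # bucket_sort "from": buckets to skip
    needed = offset + page_size      # buckets that terms has to return
    return {
        "pv": {
            "nested": {"path": path},
            "aggs": {
                "pv2": {
                    # shard_size is left out here, so the server default applies
                    "terms": {"field": field, "size": needed},
                    "aggs": {
                        "pv_count": {"value_count": {"field": field}},
                        "r_bucket_sort": {
                            "bucket_sort": {
                                "sort": {"pv_count": {"order": "desc"}},
                                "from": offset,
                                "size": page_size,
                            }
                        },
                    },
                }
            },
        }
    }


if __name__ == "__main__":
    # Page 2 at 10 buckets per page gives from=10, size=10, as in the query above.
    body = {"size": 0, "aggs": build_page_aggs(page=2, page_size=10)}
    print(json.dumps(body, indent=2))
```

The query part (the bool filter) is omitted here and would be added to the body unchanged.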

Special attention:

  • When bucket_sort is used inside terms, the size in terms must be at least from + size in bucket_sort. bucket_sort only sees the buckets that terms returns, so with a smaller terms size the page comes back short or empty.
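
The effect can be reproduced without a cluster. The following Python sketch is a deliberately simplified model, assumed for illustration only: it ignores shards and shard_size, treats terms as "keep the terms_size largest buckets", and treats bucket_sort as a sort followed by a slice. The function name paginate_buckets and the sample counts are hypothetical.

```python
def paginate_buckets(counts, terms_size, from_, size):
    """Simplified model: terms truncates first, bucket_sort slices afterwards."""
    # terms: keep only the `terms_size` buckets with the highest counts
    kept = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:terms_size]
    # bucket_sort: buckets are already in descending count order; skip `from_`, take `size`
    return kept[from_:from_ + size]


# 30 hypothetical page ids with distinct counts: p0 -> 100, p1 -> 99, ...
counts = {f"p{i}": 100 - i for i in range(30)}

# terms size 15 < from + size (20): only 5 buckets are left for page 2
print(len(paginate_buckets(counts, terms_size=15, from_=10, size=10)))  # 5

# terms size 20 >= from + size: the full page of 10 buckets is returned
print(len(paginate_buckets(counts, terms_size=20, from_=10, size=10)))  # 10
```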

Keywords: SQL

Added by ryanlwh on Mon, 06 Apr 2020 20:42:41 +0300