Detailed explanation of common commands for Tencent Cloud Elasticsearch cluster operation and maintenance (Part 3: indexes)

In the first two articles, we introduced commands commonly used in daily cluster operation and maintenance at the cluster and node levels. In this article, we continue with several commonly used operation and maintenance APIs at the index level.

Index related commands

1. View basic information about the cluster's indexes

GET _cat/indices?v

Return Response:

health status index                           uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   wr_index_1                      cdI0seyjTF2tgV0qqSiFyg   3   1          2            0     13.9kb          6.9kb
green  open   hbase2es                        fK0TeN9zQemltXq3D29z7Q   1   1    1187578       100000    418.6mb        209.3mb
green  open   gcba0_202108                    eRkEpPP9RNeM30A7KBITzA  18   0          5            0     20.5kb         20.5kb
green  open   dcba0_202108                    q0Sq_wqEQc--K7fURhmmow  18   0          5            0     24.5kb         24.5kb

We usually use this API to check which indexes exist in the cluster, along with each index's number of primary shards, storage size, document count, and whether replicas are configured. If the cluster has many indexes, the API returns a lot of data that is hard to read; in that case, a visual interface such as Kibana or Cerebro is more convenient. The API also supports specifying an index, for example:

GET _cat/indices/hbase2es?v

You can also filter the output with grep on a Linux console, for example to show only red indexes (replace {es_host} with the address of your cluster):

curl -s "http://{es_host}:9200/_cat/indices?v" | grep red
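When the output has to be processed in a script rather than read by eye, the same filtering can be done on the response text. The following is a minimal sketch, not part of any Elasticsearch client: the helper names and the red sample row are made up for illustration, and it assumes the output was requested with ?v so that the first line is the header.

```python
def parse_cat_table(text):
    """Turn whitespace-aligned `_cat` output (requested with ?v) into a list of dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        # Trailing columns may be empty (for example, for a red index),
        # so zip() simply stops at the shorter of header and row.
        rows.append(dict(zip(header, line.split())))
    return rows


def indices_with_health(text, health):
    """Return the names of the indexes whose health column equals `health`."""
    return [row["index"] for row in parse_cat_table(text) if row.get("health") == health]


# Made-up sample in the format shown above; the red row is invented for illustration.
SAMPLE = """\
health status index      uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   wr_index_1 cdI0seyjTF2tgV0qqSiFyg   3   1          2            0     13.9kb          6.9kb
red    open   demo_index AAAAAAAAAAAAAAAAAAAAAA   1   1
"""

print(indices_with_health(SAMPLE, "red"))  # ['demo_index']
```

Splitting on whitespace is enough here because index names cannot contain spaces.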

2. View index settings

GET /{index_name}/_settings

Return Response:

{
  "hbase2es" : {
    "settings" : {
      "index" : {
        "routing" : {
          "allocation" : {
            "include" : {
              "_tier_preference" : "data_content"
            }
          }
        },
        "number_of_shards" : "1",
        "translog" : {
          "sync_interval" : "5m",
          "durability" : "async"
        },
        "provided_name" : "hbase2es",
        "merge" : {
          "policy" : {
            "auto_merge_enabled" : "true",
            "inactive_merge_enabled" : "true"
          }
        },
        "creation_date" : "1630254239062",
        "number_of_replicas" : "1",
        "uuid" : "fK0TeN9zQemltXq3D29z7Q",
        "version" : {
          "created" : "7100199"
        }
      }
    }
  }
}

The index settings also show the number of primary shards and replicas of the index. In addition, we can directly see the version, the creation time, the refresh_interval, the translog settings, and so on. Most importantly, the API returns the attributes that have been set on the index, such as the shard allocation policy, whether the index is associated with an ILM policy, whether it is read-only, and whether automatic merging is enabled. This information is a useful reference when troubleshooting a red index.
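Because the response is deeply nested, it can help to flatten it into dotted keys, which are also the names used when setting attributes. A minimal sketch, assuming the response has already been decoded from JSON into a Python dict; the function names are hypothetical and the sample is a shortened copy of the response above.

```python
def flatten(settings, prefix=""):
    """Flatten nested settings into dotted keys such as index.translog.durability."""
    flat = {}
    for key, value in settings.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def index_settings(response, index_name):
    """Return the flattened settings of one index from a GET _settings response."""
    return flatten(response[index_name]["settings"])


# Shortened copy of the response shown above.
RESPONSE = {
    "hbase2es": {
        "settings": {
            "index": {
                "number_of_shards": "1",
                "number_of_replicas": "1",
                "translog": {"sync_interval": "5m", "durability": "async"},
            }
        }
    }
}

flat = index_settings(RESPONSE, "hbase2es")
print(flat["index.translog.durability"])  # async
```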

3. Set index settings

The previous section showed how to view index settings and how much useful information the API returns. To change index settings, use the following API:

PUT /{index_name}/_settings

This API can set many index attributes. The following covers several commonly used ones:

1) Set the number of index replicas

PUT /{index_name}/_settings
{
  "index": {
    "number_of_replicas": 0
  }
}

With this API, we can quickly remove the replicas of one index or a batch of indexes, or add replicas back.
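For scripted use, the same request can be prepared with the Python standard library. This is a sketch under assumptions: the base URL and the wildcard index name are placeholders, authentication is left out, and the request is only built here, not sent.

```python
import json
import urllib.request


def replica_request(base_url, index_name, replicas):
    """Build (but do not send) PUT /{index_name}/_settings for the replica count."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index_name}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Placeholder address and index pattern; a wildcard targets a batch of indexes.
req = replica_request("http://localhost:9200", "wr_index_*", 0)
print(req.get_method(), req.full_url)
# Sending is left to the caller, for example: urllib.request.urlopen(req)
```

Keeping the construction separate from the sending makes it easy to review exactly what will be changed before touching a production cluster.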

2) Set or remove the index read-only block

PUT /{index_name}/_settings
{
  "index": {
    "blocks": {
      "read_only": "true"
    }
  }
}

An index becomes read-only in two main scenarios. In the first, the index is associated with ILM and an ILM action sets it to read-only. In the second, the disk utilization of a node in the cluster exceeds the watermark, and every index with shards on that node is automatically set to read-only. The second kind of read-only block exists to stop data from being written continuously to the index, which would undermine cluster stability. In this case, the business typically sees the write rate of an index suddenly drop to zero. The solution is to expand disk capacity immediately. In ES 7.x and later, the read-only block is released automatically once capacity is freed up; before 7.x, you have to remove it manually through the API. How, then, can we quickly find out which indexes in the cluster are read-only? Execute the following API in Kibana:

GET /_cluster/state/blocks/indices

Return Response:

{
  "cluster_name" : "es-xxx",
  "cluster_uuid" : "2LLChSPGRgqZr1cz3b8cXw",
  "blocks" : {
    "indices" : {
      "wr_index_1" : {
        "5" : {
          "description" : "index read-only (api)",
          "retryable" : false,
          "levels" : [
            "write",
            "metadata_write"
          ]
        }
      }
    }
  }
}
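On a cluster with many indexes, this response can be reduced to a short list of blocked indexes. A minimal sketch, assuming the response has been decoded into a Python dict with the structure shown above; the function name is hypothetical.

```python
def blocked_indices(state, level="write"):
    """Map each index to the descriptions of its blocks that affect `level`."""
    result = {}
    indices = state.get("blocks", {}).get("indices", {})
    for index_name, blocks in indices.items():
        descriptions = [
            block["description"]
            for block in blocks.values()
            if level in block.get("levels", [])
        ]
        if descriptions:
            result[index_name] = descriptions
    return result


# Same structure as the response shown above.
STATE = {
    "cluster_name": "es-xxx",
    "blocks": {
        "indices": {
            "wr_index_1": {
                "5": {
                    "description": "index read-only (api)",
                    "retryable": False,
                    "levels": ["write", "metadata_write"],
                }
            }
        }
    },
}

print(blocked_indices(STATE))  # {'wr_index_1': ['index read-only (api)']}
```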

In addition, once an index is set to read_only, any attempt to change its settings fails with the following error:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "cluster_block_exception",
        "reason" : "index [wr_index_1] blocked by: [FORBIDDEN/5/index read-only (api)];"
      }
    ],
    "type" : "cluster_block_exception",
    "reason" : "index [wr_index_1] blocked by: [FORBIDDEN/5/index read-only (api)];"
  },
  "status" : 403
}

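In this state, you can neither delete nor close the index, and setting index.blocks.read_only: false on the index itself does not restore it either. So how do we return the index to a read-write state? In this case, try the following API (note that PUT _settings without an index name applies to every index in the cluster):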

PUT _settings
{
  "index": {
    "blocks": {
      "read_only": "false"
    }
  }
}

In our cluster operation and maintenance experience, manually blocking reads on an index can sometimes resolve cluster performance problems. For example, we once handled a customer's log cluster with a very large number of nodes and a write rate of about 6 million per second. Operations staff kept running queries against the data, which drove the cluster's CPU and load very high, caused a large number of read and write rejections and circuit-breaker trips, and brought the cluster to the verge of collapse. Nobody knew who was running the queries, so it was impossible to ask them directly to stop. In a situation like this, the fastest way to stop the damage is to immediately block reads on the index being queried. The following API can be executed:

PUT /{index_name}/_settings
{
  "index.blocks.read": true
}

At this point, queries against the index return the following exception:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "cluster_block_exception",
        "reason" : "index [wr_index_1] blocked by: [FORBIDDEN/7/index read (api)];"
      }
    ],
    "type" : "cluster_block_exception",
    "reason" : "index [wr_index_1] blocked by: [FORBIDDEN/7/index read (api)];"
  },
  "status" : 403
}
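The two 403 errors in this section differ only in the block ID and description inside the reason string (5 for the read-only block, 7 for the read block). A client that wants to tell them apart can parse the reason. This is a sketch that assumes the reason keeps the exact format shown in the responses above; the function name is hypothetical.

```python
import re

# Matches reasons such as:
#   index [wr_index_1] blocked by: [FORBIDDEN/7/index read (api)];
BLOCK_RE = re.compile(
    r"index \[(?P<index>[^\]]+)\] blocked by: "
    r"\[(?P<status>\w+)/(?P<block_id>\d+)/(?P<description>[^\]]+)\]"
)


def parse_block_reason(reason):
    """Extract index name, status, block ID and description, or None if no match."""
    match = BLOCK_RE.search(reason)
    if match is None:
        return None
    info = match.groupdict()
    info["block_id"] = int(info["block_id"])
    return info


reason = "index [wr_index_1] blocked by: [FORBIDDEN/7/index read (api)];"
print(parse_block_reason(reason))
```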

3) Enable the index bulk_routing attribute

PUT /{index_name}/_settings
{
  "index.bulk_routing.enabled": true
}

Enabling the bulk_routing attribute of an index is usually suited to log-analysis and time-series scenarios in which documents are written without specifying a doc_id and are not queried by doc_id. With this attribute enabled, write performance usually improves by about 20%. For details, refer to the official Tencent Cloud ES documentation.

4) Enable the index auto-merge policy

PUT /{index_name}/_settings
{
  "merge.policy.auto_merge_enabled":"true",
  "merge.policy.inactive_merge_enabled":"true"
}

When these attributes are enabled, segment merging is triggered automatically. You can also set the maximum number of merge threads:

PUT /{index_name}/_settings
{
    "index.merge.scheduler.max_thread_count": 2
}

5) Set the index shard allocation policy

PUT /{index_name}/_settings
 {
   "index.routing.allocation.include._tier_preference": "data_hot,data_warm,data_cold"
 }

ES 7.x clusters introduce the _tier_preference attribute. Node roles such as data_hot and data_warm mark the type of each node, and specifying _tier_preference in the index settings controls where the shards of the index are allocated. If _tier_preference contains several values, such as data_hot,data_warm, shards are allocated to data_hot nodes first; if the cluster has no data_hot nodes, or none of them is available, ES then tries to allocate the shards on data_warm nodes.
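The fallback order can be pictured with a few lines of code. This is a simplified model of the behaviour described above, not Elasticsearch's actual allocator: the function name is hypothetical, and "available" simply means that at least one usable node of that tier exists.

```python
def choose_tier(tier_preference, available_tiers):
    """Return the first tier in the comma-separated preference that is available."""
    for tier in tier_preference.split(","):
        tier = tier.strip()
        if tier in available_tiers:
            return tier
    return None  # no preferred tier is available


preference = "data_hot,data_warm,data_cold"
print(choose_tier(preference, {"data_hot", "data_warm"}))   # data_hot
print(choose_tier(preference, {"data_warm", "data_cold"}))  # data_warm
```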

6) Limit the number of shards of an index allocated on each node

PUT /{index_name}/_settings
{
    "index": {
        "routing.allocation.total_shards_per_node": 2
    }
}

The complete data of an index is made up of multiple shards, and limiting how many of them are allocated on each node avoids hotspots. When allocating shards, ES prefers nodes with more free disk space and fewer total shards. With poor index planning and a growing cluster, it is easy for several shards of the same index to land on one node while other nodes receive none. Limiting the number of shards per node avoids this problem and keeps the load across nodes even.
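A reasonable starting value for total_shards_per_node follows from simple arithmetic: the total number of shard copies, primaries times (1 + replicas), divided by the number of data nodes and rounded up. The helper below is a sketch of that calculation; the function name is hypothetical, and any extra headroom is a planning decision.

```python
import math


def min_total_shards_per_node(primaries, replicas, data_nodes):
    """Smallest per-node limit that still lets every shard copy be allocated."""
    if data_nodes <= 0:
        raise ValueError("data_nodes must be positive")
    total_copies = primaries * (1 + replicas)
    return math.ceil(total_copies / data_nodes)


print(min_total_shards_per_node(18, 0, 6))  # 3
print(min_total_shards_per_node(3, 1, 4))   # 2
```

Setting the limit to exactly this minimum leaves no slack: if a node goes down, some shards may stay unassigned until it returns, so a slightly larger value is often chosen.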

7) Associate an index with an ILM policy

PUT /{index_name}/_settings
{
    "index.lifecycle.name": "{ilm_policy_name}"
}

As cluster data grows, many log-analysis clusters introduce index lifecycle management (ILM) to maintain the cluster dynamically. We first define an ILM policy and then specify the policy name in the index template, so that newly created indexes are automatically associated with the policy and managed by ILM. Existing indexes that were created before the policy was defined have to be associated manually: execute the API above and specify the policy name in the settings. For detailed usage of index lifecycle management, refer to "Principle and practice of Tencent cloud Elasticsearch index life cycle management".

8) Change the index compression algorithm

PUT /{index_name}/_settings
{
    "index.codec": "best_compression"
}

In stock Elasticsearch, index.codec is a static setting, so the index has to be closed before this request and reopened afterwards; the new codec then applies to segments written or merged from that point on.

4. Freeze / unfreeze an index

POST /{index_name}/_freeze

The main feature of a frozen index is that it no longer occupies memory, only disk space; it can still be queried, but with relatively high latency. In our operation and maintenance experience, freezing indexes is usually a quick way to relieve circuit-breaker problems caused by high cluster memory utilization. For example, we once worked on an IoT customer's cluster that stored the data reported by each sensor. Because each index had as many as two or three thousand fields and the data volume was very large, the memory utilization of the cluster often soared to about 80%, which frequently triggered old-generation GC and caused a large number of read and write exceptions. The customer wanted the cluster repaired as quickly as possible to relieve the Kafka message backlog and the query timeouts. I suggested freezing all indexes on the hot nodes that were more than one month old, which quickly reduces memory utilization and the old-generation GC frequency on the hot nodes. Once the cluster is stable, move older data to cold nodes and expand capacity, and then unfreeze the previously frozen indexes. The API to execute is as follows:

POST /{index_name}/_unfreeze
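Picking the indexes to freeze in a case like the one above can be scripted. The sketch below assumes, purely for illustration, that index names end in a _yyyymm suffix (as gcba0_202108 does in the earlier listing); the function names and the sample names other than gcba0_202108 and hbase2es are hypothetical.

```python
import re

MONTH_SUFFIX = re.compile(r"_(\d{6})$")  # assumed naming convention: name_yyyymm


def indices_before(index_names, cutoff_yyyymm):
    """Return the names whose yyyymm suffix is earlier than the cutoff month."""
    selected = []
    for name in index_names:
        match = MONTH_SUFFIX.search(name)
        # yyyymm strings of equal length sort chronologically,
        # so plain string comparison is sufficient.
        if match and match.group(1) < cutoff_yyyymm:
            selected.append(name)
    return selected


def freeze_paths(index_names):
    """Build the request path for each index, matching POST /{index_name}/_freeze."""
    return [f"/{name}/_freeze" for name in index_names]


names = ["gcba0_202106", "gcba0_202107", "gcba0_202108", "hbase2es"]
old = indices_before(names, "202108")
print(freeze_paths(old))  # ['/gcba0_202106/_freeze', '/gcba0_202107/_freeze']
```

Indexes without a date suffix, such as hbase2es, are left alone rather than guessed at.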

5. Rebuild an index with Reindex

POST _reindex
{
  "source": {
    "index": "{source_index_name}"
  },
  "dest": {
    "index": "{dest_index_name}"
  }
}

Reindex is often used for changing field types, changing the number of primary shards, migrating indexes, and similar scenarios. For example, if a field of the source index was automatically mapped as keyword but we expect text, we can reindex the source index; the mapping of the target index has to be defined before the operation. As another example, customers who want to migrate their clusters to Tencent Cloud can migrate indexes directly through reindex. Note that writes to the index must be stopped while reindex runs, otherwise the data will become inconsistent. If the source index is large and the operation will take a while, add the wait_for_completion=false parameter to the API call so that the Reindex API executes asynchronously and returns a task ID. We can use the task ID to check the reindex status or even cancel the task.

In addition, if the source index does not have the _source attribute enabled, the Reindex operation cannot be performed. In other words, the source index must store the complete original documents.
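The asynchronous flow consists of three requests: start the reindex, check the task, and optionally cancel it. The sketch below only builds the method, path and body of each request; the function names, the index names and the sample task ID are made up for illustration, and sending the requests is left to whichever HTTP client is in use.

```python
def reindex_call(source_index, dest_index):
    """Start an asynchronous reindex; the response contains a "task" field."""
    body = {"source": {"index": source_index}, "dest": {"index": dest_index}}
    return ("POST", "/_reindex?wait_for_completion=false", body)


def extract_task_id(response):
    """Read the task ID from the decoded response of an asynchronous reindex."""
    return response["task"]


def task_status_call(task_id):
    """Check progress of the task."""
    return ("GET", f"/_tasks/{task_id}", None)


def task_cancel_call(task_id):
    """Cancel the task."""
    return ("POST", f"/_tasks/{task_id}/_cancel", None)


# Made-up response in the shape {"task": "<node_id>:<number>"}.
task_id = extract_task_id({"task": "exampleNodeId:12345"})
print(reindex_call("old_index", "new_index"))
print(task_status_call(task_id))
print(task_cancel_call(task_id))
```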

6. Index aliases

1) Add index alias

POST /_aliases
{
    "actions": [
        {
            "add": {
                "index": "{index_name}", 
                "alias": "{alias_name}"
            }
        }
    ]
}

2) Remove index alias

POST /_aliases
{
    "actions": [
        {
            "remove": {
                "index": "{index_name}", 
                "alias": "{alias_name}"
            }
        }
    ]
}

An index alias is usually used when the client should read and write data without being aware of the underlying index. For example, Logstash writes data to the cluster through an alias and does not care which concrete index the data finally lands in. Another scenario arises during cluster operation and maintenance: when we run Reindex and want it to be transparent to the business side, we can add the same alias to both the source index and the target index.

In addition, the following API shows how many aliases currently exist in the cluster and which indexes they are associated with; the is_write_index column shows which of the indexes behind an alias is the write index.

GET /_cat/aliases?v

Return Response:

alias          index                 filter routing.index routing.search is_write_index
ilm-history-3  ilm-history-3-000001  -      -             -              false
.slm-history-3 .slm-history-3-000001 -      -             -              false
.security      .security-7           -      -             -              -
.slm-history-3 .slm-history-3-000002 -      -             -              true
ilm-history-3  ilm-history-3-000002  -      -             -              true

For example, in the output above, the alias ilm-history-3 is currently associated with two indexes, ilm-history-3-000001 and ilm-history-3-000002, and data written through the alias goes to ilm-history-3-000002. Besides adding an alias to an index directly, we can also define aliases in an index template, so that an index is associated with the aliases automatically when it is created and queries through an alias automatically cover these indexes:

PUT _template/{template_name}
{
    "index_patterns" : ["index_name*"],
    "settings" : {
        "number_of_shards" : 10
    },
    "aliases" : {
        "alias_index_1" : {},
        "alias_index_2" : {
            "filter" : {
                "term" : {"key_name" : "value_name" }
            },
            "routing" : "value_name"
        },
        "{index}-alias" : {} 
    }
}
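Returning to the GET /_cat/aliases?v output shown earlier, the write index of each alias can also be extracted programmatically. A minimal sketch, assuming the output was requested with ?v so that the first line is the header; the function name is hypothetical and the sample is copied from the response above.

```python
def write_indices(cat_aliases_text):
    """Map each alias to the index whose is_write_index column is true."""
    lines = [line for line in cat_aliases_text.splitlines() if line.strip()]
    if not lines:
        return {}
    header = lines[0].split()
    result = {}
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        if row.get("is_write_index") == "true":
            result[row["alias"]] = row["index"]
    return result


# Copied from the _cat/aliases response shown above.
SAMPLE = """\
alias          index                 filter routing.index routing.search is_write_index
ilm-history-3  ilm-history-3-000001  -      -             -              false
.slm-history-3 .slm-history-3-000001 -      -             -              false
.security      .security-7           -      -             -              -
.slm-history-3 .slm-history-3-000002 -      -             -              true
ilm-history-3  ilm-history-3-000002  -      -             -              true
"""

print(write_indices(SAMPLE))
```

Aliases such as .security, whose is_write_index column is "-", have no explicit write index and are therefore omitted from the result.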

Summary of common commands for cluster operation and maintenance

This article introduces in detail some commands commonly used by Tencent Cloud ES customers in daily cluster operation and maintenance. With these commands, we can quickly locate, analyze, and solve cluster performance problems. They are summarized in the following table:

Command                        Description
GET _cat/indices?v             View basic information about the cluster's indexes
GET /{index_name}/_settings    View the settings of an index
PUT /{index_name}/_settings    Dynamically set index settings, such as the number of replicas and the refresh interval
POST /{index_name}/_freeze     Freeze an index
POST _reindex                  Rebuild index data
POST /_aliases                 Set aliases for an index

Keywords: ElasticSearch Distribution index ElasticsearchService

Added by arhunter on Mon, 03 Jan 2022 20:12:56 +0200