Elasticsearch - core (10) - cluster configuration

1. Cluster management

1.1 single machine & cluster

  1. A single Elasticsearch server has a maximum load it can handle; once that threshold is exceeded, performance drops sharply or the service becomes unavailable. In production, Elasticsearch therefore normally runs on a cluster of servers
  2. Besides limited load capacity, a single-server deployment has other problems
    • The storage capacity of a single machine is limited
    • A single server is a single point of failure, so high availability cannot be achieved
    • The concurrent processing capability of a single server is limited
  3. There is no limit on the number of nodes in a cluster; two or more nodes already form a cluster
  4. In practice, for performance and high availability, a cluster usually has at least 3 nodes.

1.2 Cluster

  1. A cluster is a group of one or more nodes that together hold the entire data set and jointly provide indexing and search capabilities
  2. An Elasticsearch cluster is identified by a unique name, which is "elasticsearch" by default
  3. This name is important: a node can only join a cluster by specifying that cluster's name
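    • For example, a node declares which cluster it joins with the cluster.name setting (a minimal sketch of elasticsearch.yml; complete configuration files follow in section 2):
# config/elasticsearch.yml (minimal sketch)
cluster.name: elasticsearch
node.name: node-1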

1.3 Node

  1. A cluster consists of many servers, and a node is one of them
  2. As part of the cluster, a node stores data and takes part in the cluster's indexing and search functions
  3. A node is also identified by a name, assigned at startup. In early versions the default was a random Marvel comic character name; in current versions it defaults to the hostname. The name matters for administration, because it is how you map servers in your network to nodes in the Elasticsearch cluster
  4. A node joins a particular cluster by configuring the cluster name. By default, every node joins a cluster called "elasticsearch", which means that if you start several nodes on your network and they can discover each other, they will automatically form and join a cluster named "elasticsearch"
  5. A cluster can contain any number of nodes. If no Elasticsearch node is running on your network and you start one, it will by default create and join a cluster called "elasticsearch"
  6. Cluster health status
    • green: all primary and replica shards are allocated and running normally
    • yellow: all primary shards are allocated, but at least one replica shard is not
    • red: at least one primary shard is not allocated or not running normally
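    • The health status can be checked quickly with the cat health API (a minimal sketch; the full _cluster/health response appears in section 2.3 below):
GET _cat/health?v
# The "status" column of the output shows green, yellow or red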

2. Windows cluster

  1. Copy the ES installation directory once per node and name each copy after its port number (adjust the names to your own needs)

  2. Node description: this test cluster has one master node and two data nodes
    • 9200: master node
    • 9201: data node
    • 9202: data node
  3. Cluster setup builds on the single-node setup; the configuration below is added incrementally. For prerequisites such as the JDK, refer to the single-node installation, and first make sure a single node starts and runs normally

2.1 master node

  1. Modify the config/elasticsearch.yml file and add the following configuration
# Configuration information of cluster node 1
# Cluster name; must be the same on every node
cluster.name: elasticsearch
# The node name must be unique in the cluster
node.name: node-9200
# Master-eligible node
node.master: true
# Data node
node.data: true
 
# ip address
network.host: localhost
# http port
http.port: 9200
# tcp listening port
transport.tcp.port: 9300

# Seed hosts a new node contacts to join the cluster. Note that the port is the tcp (transport) port; configure this when there are several master-eligible nodes
#discovery.seed_hosts: ["localhost:9300", "localhost:9301","localhost:9302"]
#discovery.zen.fd.ping_timeout: 1m
#discovery.zen.fd.ping_retries: 5
 
# Node names that may be elected master when the cluster is first bootstrapped; use the node.name values of master-eligible nodes
#cluster.initial_master_nodes: ["node-9200", "node-9201","node-9202"]
 
# Cross-origin (CORS) configuration
#action.destructive_requires_name: true
http.cors.enabled: true
http.cors.allow-origin: "*"
  2. Start the master node and check that it starts normally
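    • If the node started correctly, the root endpoint on port 9200 answers with the node and cluster name (a sketch; version fields omitted, values illustrative):
GET /
# The response includes, among other fields:
# "name" : "node-9200",
# "cluster_name" : "elasticsearch"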

2.2 data node

  1. The two data nodes share the same configuration; only the node name and the http and transport ports differ
  2. Modify the config/elasticsearch.yml file and add the following configuration
    • Note: node.master: false makes the node a data-only node; set it to true if the node should also be master-eligible
    • Note: discovery.seed_hosts must list the master-eligible node(s) so the data nodes can discover the cluster
    • elasticsearch-9201 configuration
# Configuration information of cluster node 2
# Cluster name; must be the same on every node
cluster.name: elasticsearch
# The node name must be unique in the cluster
node.name: node-9201
# Not master-eligible; data node only
node.master: false
# Data node
node.data: true
 
# ip address
network.host: localhost
# http port
http.port: 9201
# tcp listening port
transport.tcp.port: 9301

# Seed hosts this node contacts to join the cluster (transport address of the master-eligible node)
discovery.seed_hosts: ["localhost:9300"]
discovery.zen.fd.ping_timeout: 1m
discovery.zen.fd.ping_retries: 5
 
# Node names that may be elected master when the cluster is first bootstrapped; use the node.name values of master-eligible nodes
#cluster.initial_master_nodes: ["node-9200", "node-9201","node-9202"]
 
# Cross-origin (CORS) configuration
#action.destructive_requires_name: true
http.cors.enabled: true
http.cors.allow-origin: "*"
  • elasticsearch-9202 configuration
# Configuration information of cluster node 3
# Cluster name; must be the same on every node
cluster.name: elasticsearch
# The node name must be unique in the cluster
node.name: node-9202
# Not master-eligible; data node only
node.master: false
# Data node
node.data: true
 
# ip address
network.host: localhost
# http port
http.port: 9202
# tcp listening port
transport.tcp.port: 9302

# Seed hosts this node contacts to join the cluster; the port is the tcp (transport) port of the master-eligible node
discovery.seed_hosts: ["localhost:9300"]
discovery.zen.fd.ping_timeout: 1m
discovery.zen.fd.ping_retries: 5
 
# Node names that may be elected master when the cluster is first bootstrapped; use the node.name values of master-eligible nodes
#cluster.initial_master_nodes: ["node-9200", "node-9201","node-9202"]
 
# Cross-origin (CORS) configuration
#action.destructive_requires_name: true
http.cors.enabled: true
http.cors.allow-origin: "*"

  3. Start the two data nodes, then check that all three nodes are running normally
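    • All three nodes can be listed with the cat nodes API, the elected master being marked with "*" (a sketch; only the name and master columns are requested, output illustrative):
GET _cat/nodes?v&h=name,master
# name       master
# node-9200  *
# node-9201  -
# node-9202  -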

2.3 cluster test

  1. Get cluster health information
    • GET _cluster/health
    • number_of_nodes: 3, the total number of nodes in the cluster at this point
    • number_of_data_nodes: 3, since all three nodes have node.data: true
{
  "cluster_name" : "elasticsearch",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 7,
  "active_shards" : 14,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
  2. Write data to the master node
PUT cluster/_doc/1
{
	"name": "Cluster test",
	"message": "Master node write"
}

  3. Query the data from either of the two data nodes; the document is returned, showing that it has been replicated successfully
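    • A minimal sketch of the query used here, sent to one of the data nodes (for example the node listening on port 9201):
GET cluster/_search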
{
	"took": 25,
	"timed_out": false,
	"_shards": {
		"total": 1,
		"successful": 1,
		"skipped": 0,
		"failed": 0
	},
	"hits": {
		"total": {
			"value": 1,
			"relation": "eq"
		},
		"max_score": 1,
		"hits": [
			{
				"_index": "cluster",
				"_type": "_doc",
				"_id": "1",
				"_score": 1,
				"_source": {
					"name": "Cluster test",
					"message": "Master node write"
				}
			}
		]
	}
}
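    • As a further check, the cat shards API shows the primary and the replica of the index placed on different nodes (a sketch; which node holds each copy is illustrative):
GET _cat/shards/cluster?v&h=index,shard,prirep,state,node
# index    shard prirep state    node
# cluster  0     p      STARTED  node-9200
# cluster  0     r      STARTED  node-9201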

3. Linux cluster

  1. Prepare three Linux servers
  2. Prepare an ES installation and modify the config/elasticsearch.yml file with the following configuration
    • Note: for prerequisites such as the JDK, refer to the single-node installation in the core module and complete that setup first
# Cluster name
cluster.name: cluster-es
# Node name; must be unique for each node
node.name: node-1
# Bind address; must be different on each node
network.host: linux1
# Master node
node.master: true
node.data: true
http.port: 9200
# Cross domain configuration
http.cors.allow-origin: "*"
http.cors.enabled: true
http.max_content_length: 200mb
# es7.x: this setting is required to elect a master when bootstrapping a new cluster
cluster.initial_master_nodes: ["node-1"]
# New in es7.x: seed hosts for node discovery
discovery.seed_hosts: ["linux1:9300","linux2:9300","linux3:9300"]
gateway.recover_after_nodes: 2
network.tcp.keep_alive: true
network.tcp.no_delay: true
transport.tcp.compress: true
# Number of concurrent shard rebalance operations allowed in the cluster; default 2
cluster.routing.allocation.cluster_concurrent_rebalance: 16
# Number of concurrent shard recoveries per node when nodes are added or removed or during rebalancing; default 2
cluster.routing.allocation.node_concurrent_recoveries: 16
# Number of concurrent initial primary recoveries when a node restarts; default 4
cluster.routing.allocation.node_initial_primaries_recoveries: 16
  3. Distribute the configured ES installation to the remaining two servers; on each of them change only node.name (for example node-2 and node-3) and network.host (linux2 and linux3), and leave the rest of the configuration unchanged
  4. Start the three servers, then run the same cluster tests as in the Windows cluster section
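    • For example, the node list can be checked from any of the three hosts (a sketch; node-2 and node-3 assume the node names suggested in the previous step, and which node is elected master is illustrative):
GET _cat/nodes?v&h=name,master
# name    master
# node-1  *
# node-2  -
# node-3  -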
