Install ES7 stand-alone version and cluster version for centos7

ElasticSearch environment construction

1. Single machine, cluster and node concepts.

1.1 single machine

When a stand-alone ElasticSearch server provides services, it often has the maximum load capacity. If this threshold is exceeded, the server performance will be greatly reduced or even unavailable. Therefore, in the production environment, it generally runs on the specified server cluster.

In addition to load capacity, single point servers have other problems:

  • Limited storage capacity of a single machine
  • Single server is prone to single point of failure and cannot achieve high availability
  • The concurrent processing capacity of a single server is limited

When configuring a server cluster, there is no limit on the number of nodes in the cluster. If there are more than or equal to 2 nodes, it can be regarded as a cluster. Generally, considering high performance and high availability, the number of nodes in the cluster is more than 3.

1.2 cluster

A cluster is organized by one or more server nodes to jointly hold the whole data and provide it together

Index and search functions. An elasticsearch cluster has a unique name ID, which is "elasticsearch" by default. This name is important because a node can only join a cluster by specifying the name of the cluster.

1.3 nodes

The cluster contains many servers, and a node is one of them. As a part of the cluster, it stores data and participates in the indexing and search functions of the cluster.

A node can join a specified cluster by configuring the cluster name. By default, each node is scheduled to join a cluster called "elastic search", which means that if you start several nodes in your network and assume that they can find each other, they will automatically form and join a cluster called elastic search.

In a cluster, you can have as many nodes as you want. Moreover, if no elasticsearch node is running in your network, start a node at this time, and a cluster called "elasticsearch" will be created and joined by default.

2. Installation of Linux stand-alone version

tar.gz can be installed or RPM can be used. It is recommended to use rpm to register in the system service to facilitate start and stop.

2.1 installing tar.gz package

Software download address

Decompression software

tar -zxvf elasticsearch-7.4.2-linux-x86_64.tar.gz -C /usr/local/

Modify profile

/usr/local/elasticsearch-7.4.2/config
vi elasticsearch.yml
# Add the following configuration
# Modify current node name
cluster.name: elasticsearch
# Modify cluster name
node.name: node-1
# Modify data storage address
path.data=/usr/local/elasticsearch-7.4.2/data
# Modify log data storage address
path.log=usr/local/elasticsearch-7.4.2/log
# Modify ip allowed access address
network.host: 0.0.0.0
# Modify access port
http.port: 9200 
# Cluster initialization node
cluster.initial_master_nodes: ["node-1"]

Modify JVM parameters

-Xms128m
-Xmx256m

Create user

Because of security issues, Elasticsearch does not allow the root user to run directly, so create a new user.

# Add esuser user
useradd esuser
# Empowerment 
chown -R esuser:esuser /usr/local/elasticsearch-7.4.2/

Modify / etc/security/limits.conf

# Limit on the number of files that can be opened per process
* soft nofile 65536
* hard nofile 131072
# Operating system level limit on the number of processes created per user
* soft nproc 2048
* hard nproc 4096

Modify / etc/sysctl.conf

# Add the following content to the file
# The number of Vmas (virtual memory areas) that a process can own. The default value is 65536 
vm.max_map_count=655360

Reload

sysctl -p

execute

su esuser
# Enter $ES_HOME/bin
cd /usr/local/elasticsearch-7.4.2/bin
# Background start es
./elasticsearch -d

Test whether the startup is normal

Access virtual machine ip + port 9200. The display is normal as shown in the figure

Out of Service

# Find es service
ps -ef |grep elastic
# Kill es process
kill -9 xxx(Process number)

2.2 installing rpm package

The download address is the same as above, and select rpm package

rpm -ivh elasticsearch-7.6.2.rpm

Modify profile

# Add the following configuration
# Modify current node name
cluster.name: elasticsearch
# Modify cluster name
node.name: node-1
# Modify data storage address
path.data=/usr/local/elasticsearch-7.4.2/data
# Modify log data storage address
path.log=usr/local/elasticsearch-7.4.2/log
# Modify ip allowed access address
network.host: 0.0.0.0
# Modify access port
http.port: 9200 
# Cluster initialization node
cluster.initial_master_nodes: ["node-1"]

Modify file owner

chown elasticsearch:elasticsearch -R /usr/local/elasticsearch-7.4.2/

Start elasticsearch

 systemctl start elasticsearc

View es startup status

systemctl status elasticsearch

If the startup failure error message is not easily observed from the system service, you can also start es through the installation path first

/usr/share/elasticsearch/bin/elasticsearch

2.3 Linux Cluster build

It is basically the same as the stand-alone version, but the configuration file is slightly different

#Cluster name
cluster.name: cluster-es #Node name. The name of each node cannot be duplicate. node.name: node-1
#ip address. The address of each node cannot be repeated
network.host: node107 #Is it qualified to master node.master: true node.data: true http.port: 9200
# The head plug-in needs to open these two configurations
http.cors.allow-origin: "*"
http.cors.enabled: true
http.max_content_length: 200mb
#The configuration added after es7.x is required to elect master cluster.initial when initializing a new cluster_ master_ nodes: ["node-1"]
#New configuration after es7.x, node discovery
discovery.seed_hosts: ["node107:9300","node108:9300","node109:9300"] 
gateway.recover_after_nodes: 2
network.tcp.keep_alive: true
network.tcp.no_delay: true
transport.tcp.compress: true
#The number of data tasks started simultaneously in the cluster is 2 by default cluster.routing.allocation.cluster_concurrent_rebalance: 16 #The number of concurrent recovery threads when adding or deleting nodes and load balancing. The default is 4 cluster.routing.allocation.node_concurrent_recoveries: 16 #When initializing data recovery, the number of concurrent recovery threads is 4 cluster.routing.allocation.node by default_ initial_ primaries_ recoveries: 16

Status indicates whether the current cluster is working normally in general. The three colors have the following meanings

colour meaning
green All primary and replica partitions work normally
yellow All primary shards are running normally, but not all replica shards are running normally
red The main partition does not operate normally

Keywords: ElasticSearch

Added by lrsdsout on Wed, 03 Nov 2021 04:26:25 +0200