One sentence a day
Medalists don't grow on trees; you have to nurture them with love, with hard work, with dedication.
Overview
Sharding is a method for distributing data across multiple machines. MongoDB uses sharding to support deployments with very large data sets and high-throughput operations.
Sharding refers to the process of splitting data and distributing it across different machines. The term partitioning is sometimes used for the same concept. By distributing data across machines, we can store more data and handle more load without needing a powerful mainframe computer.
Database systems with large datasets or high-throughput applications can challenge the capacity of a single server. For example, a high query rate can exhaust the server's CPU capacity, and a working set larger than the system's RAM stresses the I/O capacity of the disk drives.
There are two ways to handle system growth: vertical scaling and horizontal scaling.
- Vertical scaling means increasing the capacity of a single server, for example by using a more powerful CPU, adding more RAM, or increasing the amount of storage space. Limitations in available technology may prevent a single machine from being powerful enough for a given workload. In addition, cloud-based providers have hard ceilings based on the available hardware configurations. As a result, vertical scaling has a practical maximum.
- Horizontal scaling means dividing the system's dataset and load across multiple servers, and adding servers to increase capacity as needed. Although the overall speed or capacity of a single machine may not be high, each machine handles only a subset of the overall workload, which may be more efficient than a single high-speed, high-capacity server. Expanding the capacity of the deployment only requires adding servers as needed, which may cost less overall than high-end hardware for a single machine. MongoDB supports horizontal scaling through sharding.
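To make the idea of horizontal scaling concrete, here is a toy sketch of how documents could be spread over several shards by hashing a key. It is only an illustration under simplifying assumptions: `toyHash`, `pickShard` and `distribute` are made-up names, and MongoDB's real hashed sharding and chunk management work differently.

```javascript
// Toy illustration only: NOT MongoDB's real hashing or chunk logic.

// Simple deterministic string hash, kept in the unsigned 32-bit range.
function toyHash(key) {
  let h = 5381;
  for (const ch of String(key)) {
    h = ((h * 33) ^ ch.charCodeAt(0)) >>> 0;
  }
  return h;
}

// Pick a shard index for a given shard-key value.
function pickShard(key, shardCount) {
  return toyHash(key) % shardCount;
}

// Count how many of the given keys land on each shard.
function distribute(keys, shardCount) {
  const counts = new Array(shardCount).fill(0);
  for (const key of keys) {
    counts[pickShard(key, shardCount)] += 1;
  }
  return counts;
}

// 1000 hypothetical keys spread over two shards.
const keys = Array.from({ length: 1000 }, (_, i) => `user${i}`);
const counts = distribute(keys, 2);
console.log(counts);
```

Each key always maps to the same shard, so every machine only ever sees its own slice of the workload.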
Components
The MongoDB sharded cluster consists of the following components:
- Shard (storage): each shard contains a subset of the sharded data. Each shard can be deployed as a replica set.
- mongos (routing): mongos acts as a query router and provides an interface between client applications and the sharded cluster.
- Config servers: store the metadata and configuration settings of the cluster. Starting with MongoDB 3.4, config servers must be deployed as a replica set (CSRS).
The following figure describes the interaction of components in a sharded cluster:
MongoDB shards data at the collection level and distributes the collection's data across the shards in the cluster.
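The division of labour between the three components can be sketched in a few lines. This is a conceptual model, not how mongos is implemented: the chunk table below is invented for illustration (the real metadata lives on the config servers), and `routeToShard` and `shardsForRange` are hypothetical helper names.

```javascript
// Hypothetical metadata of the kind the config servers keep: each chunk
// covers a half-open range [min, max) of shard-key values on one shard.
const chunks = [
  { min: -Infinity, max: 100, shard: "myshardrs01" },
  { min: 100, max: 200, shard: "myshardrs02" },
  { min: 200, max: Infinity, shard: "myshardrs01" },
];

// What a router conceptually does for a single key:
// find the chunk whose range contains it.
function routeToShard(chunkTable, keyValue) {
  const chunk = chunkTable.find((c) => keyValue >= c.min && keyValue < c.max);
  if (!chunk) throw new Error(`no chunk covers key ${keyValue}`);
  return chunk.shard;
}

// A range query [from, to) may touch several shards;
// collect the distinct targets in chunk order.
function shardsForRange(chunkTable, from, to) {
  const hit = chunkTable.filter((c) => c.min < to && c.max > from);
  return [...new Set(hit.map((c) => c.shard))];
}

console.log(routeToShard(chunks, 150));
console.log(shardsForRange(chunks, 50, 150));
```

A query that includes the shard key can be sent to one shard, while a query over a wide range has to be fanned out to every shard that owns a matching chunk.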
Example
Two shard replica sets (3 + 3 nodes), one config server replica set (3 nodes), and two routing nodes (2 nodes): 11 service nodes in total.
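It can help to write the planned topology down as data before creating anything, so that the node count and the port assignments can be checked. The sketch below does that with the ports used in this walkthrough; the port of the second router (27117) is an assumption, since this article only configures the router on 27017, and the helper names are hypothetical.

```javascript
// Planned topology. Ports are taken from this walkthrough, except 27117,
// which is an ASSUMED port for the second routing node.
const topology = {
  shards: [
    { name: "myshardrs01", ports: [27018, 27118, 27218] },
    { name: "myshardrs02", ports: [27318, 27418, 27518] },
  ],
  configServers: { name: "myconfigrs", ports: [27019, 27119, 27219] },
  routers: [27017, 27117],
};

// Flatten every port in the plan into one list.
function allPorts(t) {
  return [
    ...t.shards.flatMap((s) => s.ports),
    ...t.configServers.ports,
    ...t.routers,
  ];
}

// Total number of service nodes.
function countNodes(t) {
  return allPorts(t).length;
}

// Ports that appear more than once (would clash on a single host).
function duplicatePorts(t) {
  const seen = new Set();
  const dups = [];
  for (const port of allPorts(t)) {
    if (seen.has(port)) dups.push(port);
    seen.add(port);
  }
  return dups;
}

console.log(countNodes(topology)); // 11
console.log(duplicatePorts(topology)); // []
```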
Creating the shard replica sets
All configuration files are placed directly in the corresponding subdirectory of sharded_cluster; the default configuration file name is mongod.conf.
First replica set
1. Prepare the directory for storing data and logs
# ----------- myshardrs01
mkdir -p /mongodb/sharded_cluster/myshardrs01_27018/log
mkdir -p /mongodb/sharded_cluster/myshardrs01_27018/data/db
mkdir -p /mongodb/sharded_cluster/myshardrs01_27118/log
mkdir -p /mongodb/sharded_cluster/myshardrs01_27118/data/db
mkdir -p /mongodb/sharded_cluster/myshardrs01_27218/log
mkdir -p /mongodb/sharded_cluster/myshardrs01_27218/data/db
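Because every node follows the same `<set name>_<port>` layout, the list of directories can also be generated instead of typed by hand. The following is a small sketch of that idea; `replicaSetDirs` is a hypothetical helper that only builds path strings and does not touch the file system.

```javascript
// Build the log and data directories for every node of one replica set,
// following the <baseDir>/<setName>_<port>/{log,data/db} layout used here.
function replicaSetDirs(baseDir, setName, ports) {
  return ports.flatMap((port) => {
    const nodeDir = `${baseDir}/${setName}_${port}`;
    return [`${nodeDir}/log`, `${nodeDir}/data/db`];
  });
}

const dirs = replicaSetDirs("/mongodb/sharded_cluster", "myshardrs01", [
  27018, 27118, 27218,
]);

// Print the equivalent shell commands.
for (const dir of dirs) {
  console.log(`mkdir -p ${dir}`);
}
```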
2. Create or modify the configuration file: vim /mongodb/sharded_cluster/myshardrs01_27018/mongod.conf
systemLog:
  # Send all MongoDB log output to a file
  destination: file
  # Path of the log file that receives all mongod or mongos diagnostic logging
  path: "/mongodb/sharded_cluster/myshardrs01_27018/log/mongod.log"
  # On restart, append new entries to the end of the existing log file
  logAppend: true
storage:
  # Directory where the mongod instance stores its data (applies only to mongod)
  dbPath: "/mongodb/sharded_cluster/myshardrs01_27018/data/db"
  journal:
    # Enable journaling so that data files remain valid and recoverable
    enabled: true
processManagement:
  # Run the mongos or mongod process in the background as a daemon
  fork: true
  # File in which the mongos or mongod process writes its process ID (PID)
  pidFilePath: "/mongodb/sharded_cluster/myshardrs01_27018/log/mongod.pid"
net:
  # Binding to all IPs has a side effect: when the replica set is initialized,
  # the node name is automatically set to the local hostname instead of the IP
  #bindIpAll: true
  # IP addresses the service instance binds to
  bindIp: localhost,192.168.0.2
  # Port to bind
  port: 27018
replication:
  # Name of the replica set
  replSetName: myshardrs01
sharding:
  # Role of this instance in the sharded cluster
  clusterRole: shardsvr
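The configuration files in this walkthrough differ only in the directory, the port, the replica set name and the cluster role, so they lend themselves to a template. Here is a minimal sketch of one; `renderMongodConf` and its parameter names are hypothetical, and the output leaves out the explanatory comments.

```javascript
// Render a mongod.conf with the same settings as the file above.
// Only the node directory, port, set name, role and bind IPs vary.
function renderMongodConf({ baseDir, setName, port, clusterRole, bindIp }) {
  const nodeDir = `${baseDir}/${setName}_${port}`;
  return [
    "systemLog:",
    "  destination: file",
    `  path: "${nodeDir}/log/mongod.log"`,
    "  logAppend: true",
    "storage:",
    `  dbPath: "${nodeDir}/data/db"`,
    "  journal:",
    "    enabled: true",
    "processManagement:",
    "  fork: true",
    `  pidFilePath: "${nodeDir}/log/mongod.pid"`,
    "net:",
    `  bindIp: ${bindIp}`,
    `  port: ${port}`,
    "replication:",
    `  replSetName: ${setName}`,
    "sharding:",
    `  clusterRole: ${clusterRole}`,
    "",
  ].join("\n");
}

const conf = renderMongodConf({
  baseDir: "/mongodb/sharded_cluster",
  setName: "myshardrs01",
  port: 27118,
  clusterRole: "shardsvr",
  bindIp: "localhost,192.168.0.2",
});
console.log(conf);
```

Note that this helper derives the directory name from the replica set name, so it only fits nodes whose directory follows that pattern.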
sharding.clusterRole:
Value | Description |
configsvr | Start this instance as a config server. The instance starts on port 27019 by default. |
shardsvr | Start this instance as a shard. The instance starts on port 27018 by default. |
Note:
Setting sharding.clusterRole requires the mongod instance to run with replication. To deploy the instance as a replica set member, use the replSetName setting and specify the name of the replica set.
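The two rules above (default ports per role, and clusterRole requiring a replica set name) are easy to check mechanically. The sketch below works on a plain object that mirrors the structure of the configuration file; `checkShardingConfig` and `effectivePort` are hypothetical helpers, not part of MongoDB.

```javascript
// Default ports per cluster role, as listed in the table above.
const DEFAULT_PORTS = { configsvr: 27019, shardsvr: 27018 };

// Return a list of problems found in the sharding-related settings.
function checkShardingConfig(conf) {
  const problems = [];
  const role = conf.sharding ? conf.sharding.clusterRole : undefined;
  if (role === undefined) return problems; // not a sharded-cluster member
  if (!Object.keys(DEFAULT_PORTS).includes(role)) {
    problems.push(`unknown clusterRole "${role}"`);
  }
  const replSetName = conf.replication
    ? conf.replication.replSetName
    : undefined;
  if (!replSetName) {
    problems.push("clusterRole is set but replication.replSetName is missing");
  }
  return problems;
}

// Port the instance would listen on: an explicit net.port wins, otherwise
// the role default, otherwise 27017 (the default for a plain mongod).
function effectivePort(conf) {
  if (conf.net && conf.net.port !== undefined) return conf.net.port;
  const role = conf.sharding ? conf.sharding.clusterRole : undefined;
  return Object.keys(DEFAULT_PORTS).includes(role) ? DEFAULT_PORTS[role] : 27017;
}

console.log(checkShardingConfig({ sharding: { clusterRole: "shardsvr" } }));
console.log(effectivePort({ sharding: { clusterRole: "configsvr" } }));
```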
3. Create or modify the configuration file: vim /mongodb/sharded_cluster/myshardrs01_27118/mongod.conf
systemLog:
  # Send all MongoDB log output to a file
  destination: file
  # Path of the log file that receives all mongod or mongos diagnostic logging
  path: "/mongodb/sharded_cluster/myshardrs01_27118/log/mongod.log"
  # On restart, append new entries to the end of the existing log file
  logAppend: true
storage:
  # Directory where the mongod instance stores its data (applies only to mongod)
  dbPath: "/mongodb/sharded_cluster/myshardrs01_27118/data/db"
  journal:
    # Enable journaling so that data files remain valid and recoverable
    enabled: true
processManagement:
  # Run the mongos or mongod process in the background as a daemon
  fork: true
  # File in which the mongos or mongod process writes its process ID (PID)
  pidFilePath: "/mongodb/sharded_cluster/myshardrs01_27118/log/mongod.pid"
net:
  # Binding to all IPs has a side effect: when the replica set is initialized,
  # the node name is automatically set to the local hostname instead of the IP
  #bindIpAll: true
  # IP addresses the service instance binds to
  bindIp: localhost,192.168.0.2
  # Port to bind
  port: 27118
replication:
  # Name of the replica set
  replSetName: myshardrs01
sharding:
  # Role of this instance in the sharded cluster
  clusterRole: shardsvr
4. Create or modify the configuration file: vim /mongodb/sharded_cluster/myshardrs01_27218/mongod.conf
systemLog:
  # Send all MongoDB log output to a file
  destination: file
  # Path of the log file that receives all mongod or mongos diagnostic logging
  path: "/mongodb/sharded_cluster/myshardrs01_27218/log/mongod.log"
  # On restart, append new entries to the end of the existing log file
  logAppend: true
storage:
  # Directory where the mongod instance stores its data (applies only to mongod)
  dbPath: "/mongodb/sharded_cluster/myshardrs01_27218/data/db"
  journal:
    # Enable journaling so that data files remain valid and recoverable
    enabled: true
processManagement:
  # Run the mongos or mongod process in the background as a daemon
  fork: true
  # File in which the mongos or mongod process writes its process ID (PID)
  pidFilePath: "/mongodb/sharded_cluster/myshardrs01_27218/log/mongod.pid"
net:
  # Binding to all IPs has a side effect: when the replica set is initialized,
  # the node name is automatically set to the local hostname instead of the IP
  #bindIpAll: true
  # IP addresses the service instance binds to
  bindIp: localhost,192.168.0.2
  # Port to bind
  port: 27218
replication:
  # Name of the replica set
  replSetName: myshardrs01
sharding:
  # Role of this instance in the sharded cluster
  clusterRole: shardsvr
5. Start the first replica set: one primary, one secondary, and one arbiter
Start the three mongod services in sequence:
/usr/yltrcc/mongodb/bin/mongod -f /mongodb/sharded_cluster/myshardrs01_27018/mongod.conf
/usr/yltrcc/mongodb/bin/mongod -f /mongodb/sharded_cluster/myshardrs01_27118/mongod.conf
/usr/yltrcc/mongodb/bin/mongod -f /mongodb/sharded_cluster/myshardrs01_27218/mongod.conf
6. Initialize the replica set and create the primary node
Use the client to connect to any node, preferably the one that should become the primary:
/usr/yltrcc/mongodb/bin/mongo --host 180.76.159.126 --port 27018
Run the following commands:
// Initialize the replica set
> rs.initiate()
// View the replica set status
myshardrs01:SECONDARY> rs.status()
// View the configuration on the primary node
myshardrs01:PRIMARY> rs.conf()
7. Add the secondary node and the arbiter node
// Add the secondary node
myshardrs01:PRIMARY> rs.add("180.76.159.126:27118")
// Add the arbiter node
myshardrs01:PRIMARY> rs.addArb("180.76.159.126:27218")
// View the configuration
myshardrs01:PRIMARY> rs.conf()
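Instead of initiating with an empty configuration and then adding members one at a time, rs.initiate() also accepts a configuration document that lists every member up front. The sketch below only builds such a document; `buildReplSetConfig` is a hypothetical helper, and the hosts are the ones used in this walkthrough.

```javascript
// Build a replica set configuration document: data-bearing members first,
// then arbiters (marked with arbiterOnly: true). Member _ids are sequential.
function buildReplSetConfig(name, dataHosts, arbiterHosts = []) {
  const members = [];
  for (const host of dataHosts) {
    members.push({ _id: members.length, host });
  }
  for (const host of arbiterHosts) {
    members.push({ _id: members.length, host, arbiterOnly: true });
  }
  return { _id: name, members };
}

const cfg = buildReplSetConfig(
  "myshardrs01",
  ["180.76.159.126:27018", "180.76.159.126:27118"],
  ["180.76.159.126:27218"]
);
console.log(JSON.stringify(cfg, null, 2));
```

In the mongo shell the resulting document would be passed as `rs.initiate(cfg)`, which makes the host names explicit rather than leaving the first member's name to be derived from the local host.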
Second replica set
1. Prepare the directory for storing data and logs
# ----------- second shard replica set
mkdir -p /mongodb/sharded_cluster/myshardrs01_27318/log
mkdir -p /mongodb/sharded_cluster/myshardrs01_27318/data/db
mkdir -p /mongodb/sharded_cluster/myshardrs01_27418/log
mkdir -p /mongodb/sharded_cluster/myshardrs01_27418/data/db
mkdir -p /mongodb/sharded_cluster/myshardrs01_27518/log
mkdir -p /mongodb/sharded_cluster/myshardrs01_27518/data/db
2. Create or modify the configuration file: vim /mongodb/sharded_cluster/myshardrs01_27318/mongod.conf
systemLog:
  # Send all MongoDB log output to a file
  destination: file
  # Path of the log file that receives all mongod or mongos diagnostic logging
  path: "/mongodb/sharded_cluster/myshardrs01_27318/log/mongod.log"
  # On restart, append new entries to the end of the existing log file
  logAppend: true
storage:
  # Directory where the mongod instance stores its data (applies only to mongod)
  dbPath: "/mongodb/sharded_cluster/myshardrs01_27318/data/db"
  journal:
    # Enable journaling so that data files remain valid and recoverable
    enabled: true
processManagement:
  # Run the mongos or mongod process in the background as a daemon
  fork: true
  # File in which the mongos or mongod process writes its process ID (PID)
  pidFilePath: "/mongodb/sharded_cluster/myshardrs01_27318/log/mongod.pid"
net:
  # Binding to all IPs has a side effect: when the replica set is initialized,
  # the node name is automatically set to the local hostname instead of the IP
  #bindIpAll: true
  # IP addresses the service instance binds to
  bindIp: localhost,192.168.0.2
  # Port to bind
  port: 27318
replication:
  # Name of the replica set (must match the name used later in sh.addShard)
  replSetName: myshardrs02
sharding:
  # Role of this instance in the sharded cluster
  clusterRole: shardsvr
3. Create or modify the configuration file: vim /mongodb/sharded_cluster/myshardrs01_27418/mongod.conf
systemLog:
  # Send all MongoDB log output to a file
  destination: file
  # Path of the log file that receives all mongod or mongos diagnostic logging
  path: "/mongodb/sharded_cluster/myshardrs01_27418/log/mongod.log"
  # On restart, append new entries to the end of the existing log file
  logAppend: true
storage:
  # Directory where the mongod instance stores its data (applies only to mongod)
  dbPath: "/mongodb/sharded_cluster/myshardrs01_27418/data/db"
  journal:
    # Enable journaling so that data files remain valid and recoverable
    enabled: true
processManagement:
  # Run the mongos or mongod process in the background as a daemon
  fork: true
  # File in which the mongos or mongod process writes its process ID (PID)
  pidFilePath: "/mongodb/sharded_cluster/myshardrs01_27418/log/mongod.pid"
net:
  # Binding to all IPs has a side effect: when the replica set is initialized,
  # the node name is automatically set to the local hostname instead of the IP
  #bindIpAll: true
  # IP addresses the service instance binds to
  bindIp: localhost,192.168.0.2
  # Port to bind
  port: 27418
replication:
  # Name of the replica set (must match the name used later in sh.addShard)
  replSetName: myshardrs02
sharding:
  # Role of this instance in the sharded cluster
  clusterRole: shardsvr
4. Create or modify the configuration file: vim /mongodb/sharded_cluster/myshardrs01_27518/mongod.conf
systemLog:
  # Send all MongoDB log output to a file
  destination: file
  # Path of the log file that receives all mongod or mongos diagnostic logging
  path: "/mongodb/sharded_cluster/myshardrs01_27518/log/mongod.log"
  # On restart, append new entries to the end of the existing log file
  logAppend: true
storage:
  # Directory where the mongod instance stores its data (applies only to mongod)
  dbPath: "/mongodb/sharded_cluster/myshardrs01_27518/data/db"
  journal:
    # Enable journaling so that data files remain valid and recoverable
    enabled: true
processManagement:
  # Run the mongos or mongod process in the background as a daemon
  fork: true
  # File in which the mongos or mongod process writes its process ID (PID)
  pidFilePath: "/mongodb/sharded_cluster/myshardrs01_27518/log/mongod.pid"
net:
  # Binding to all IPs has a side effect: when the replica set is initialized,
  # the node name is automatically set to the local hostname instead of the IP
  #bindIpAll: true
  # IP addresses the service instance binds to
  bindIp: localhost,192.168.0.2
  # Port to bind
  port: 27518
replication:
  # Name of the replica set (must match the name used later in sh.addShard)
  replSetName: myshardrs02
sharding:
  # Role of this instance in the sharded cluster
  clusterRole: shardsvr
5. Start the second replica set: one primary, one secondary, and one arbiter
Start the three mongod services in sequence:
/usr/yltrcc/mongodb/bin/mongod -f /mongodb/sharded_cluster/myshardrs01_27318/mongod.conf
/usr/yltrcc/mongodb/bin/mongod -f /mongodb/sharded_cluster/myshardrs01_27418/mongod.conf
/usr/yltrcc/mongodb/bin/mongod -f /mongodb/sharded_cluster/myshardrs01_27518/mongod.conf
6. Initialize the replica set and create the primary node
Use the client to connect to any node, preferably the one that should become the primary:
/usr/yltrcc/mongodb/bin/mongo --host 180.76.159.126 --port 27318
Run the following commands:
// Initialize the replica set
> rs.initiate()
// View the replica set status
myshardrs02:SECONDARY> rs.status()
// View the configuration on the primary node
myshardrs02:PRIMARY> rs.conf()
7. Add the secondary node and the arbiter node
// Add the secondary node
myshardrs02:PRIMARY> rs.add("180.76.159.126:27418")
// Add the arbiter node
myshardrs02:PRIMARY> rs.addArb("180.76.159.126:27518")
// View the configuration
myshardrs02:PRIMARY> rs.conf()
Creating the config server replica set
1. Prepare the directory for storing data and logs
# ----------- myconfigrs
mkdir -p /mongodb/sharded_cluster/myconfigrs_27019/log
mkdir -p /mongodb/sharded_cluster/myconfigrs_27019/data/db
mkdir -p /mongodb/sharded_cluster/myconfigrs_27119/log
mkdir -p /mongodb/sharded_cluster/myconfigrs_27119/data/db
mkdir -p /mongodb/sharded_cluster/myconfigrs_27219/log
mkdir -p /mongodb/sharded_cluster/myconfigrs_27219/data/db
2. Create or modify the configuration file: vim /mongodb/sharded_cluster/myconfigrs_27019/mongod.conf
systemLog:
  # Send all MongoDB log output to a file
  destination: file
  # Path of the log file that receives all mongod or mongos diagnostic logging
  path: "/mongodb/sharded_cluster/myconfigrs_27019/log/mongod.log"
  # On restart, append new entries to the end of the existing log file
  logAppend: true
storage:
  # Directory where the mongod instance stores its data (applies only to mongod)
  dbPath: "/mongodb/sharded_cluster/myconfigrs_27019/data/db"
  journal:
    # Enable journaling so that data files remain valid and recoverable
    enabled: true
processManagement:
  # Run the mongos or mongod process in the background as a daemon
  fork: true
  # File in which the mongos or mongod process writes its process ID (PID)
  pidFilePath: "/mongodb/sharded_cluster/myconfigrs_27019/log/mongod.pid"
net:
  # Binding to all IPs has a side effect: when the replica set is initialized,
  # the node name is automatically set to the local hostname instead of the IP
  #bindIpAll: true
  # IP addresses the service instance binds to
  bindIp: localhost,192.168.0.2
  # Port to bind
  port: 27019
replication:
  # Name of the replica set
  replSetName: myconfigrs
sharding:
  # Role of this instance in the sharded cluster
  clusterRole: configsvr
3. Create or modify the configuration file: vim /mongodb/sharded_cluster/myconfigrs_27119/mongod.conf
systemLog:
  # Send all MongoDB log output to a file
  destination: file
  # Path of the log file that receives all mongod or mongos diagnostic logging
  path: "/mongodb/sharded_cluster/myconfigrs_27119/log/mongod.log"
  # On restart, append new entries to the end of the existing log file
  logAppend: true
storage:
  # Directory where the mongod instance stores its data (applies only to mongod)
  dbPath: "/mongodb/sharded_cluster/myconfigrs_27119/data/db"
  journal:
    # Enable journaling so that data files remain valid and recoverable
    enabled: true
processManagement:
  # Run the mongos or mongod process in the background as a daemon
  fork: true
  # File in which the mongos or mongod process writes its process ID (PID)
  pidFilePath: "/mongodb/sharded_cluster/myconfigrs_27119/log/mongod.pid"
net:
  # Binding to all IPs has a side effect: when the replica set is initialized,
  # the node name is automatically set to the local hostname instead of the IP
  #bindIpAll: true
  # IP addresses the service instance binds to
  bindIp: localhost,192.168.0.2
  # Port to bind
  port: 27119
replication:
  # Name of the replica set
  replSetName: myconfigrs
sharding:
  # Role of this instance in the sharded cluster
  clusterRole: configsvr
4. Create or modify the configuration file: vim /mongodb/sharded_cluster/myconfigrs_27219/mongod.conf
systemLog:
  # Send all MongoDB log output to a file
  destination: file
  # Path of the log file that receives all mongod or mongos diagnostic logging
  path: "/mongodb/sharded_cluster/myconfigrs_27219/log/mongod.log"
  # On restart, append new entries to the end of the existing log file
  logAppend: true
storage:
  # Directory where the mongod instance stores its data (applies only to mongod)
  dbPath: "/mongodb/sharded_cluster/myconfigrs_27219/data/db"
  journal:
    # Enable journaling so that data files remain valid and recoverable
    enabled: true
processManagement:
  # Run the mongos or mongod process in the background as a daemon
  fork: true
  # File in which the mongos or mongod process writes its process ID (PID)
  pidFilePath: "/mongodb/sharded_cluster/myconfigrs_27219/log/mongod.pid"
net:
  # Binding to all IPs has a side effect: when the replica set is initialized,
  # the node name is automatically set to the local hostname instead of the IP
  #bindIpAll: true
  # IP addresses the service instance binds to
  bindIp: localhost,192.168.0.2
  # Port to bind
  port: 27219
replication:
  # Name of the replica set
  replSetName: myconfigrs
sharding:
  # Role of this instance in the sharded cluster
  clusterRole: configsvr
5. Start the config server replica set: one primary and two secondaries
Start the three mongod services in sequence:
/usr/yltrcc/mongodb/bin/mongod -f /mongodb/sharded_cluster/myconfigrs_27019/mongod.conf
/usr/yltrcc/mongodb/bin/mongod -f /mongodb/sharded_cluster/myconfigrs_27119/mongod.conf
/usr/yltrcc/mongodb/bin/mongod -f /mongodb/sharded_cluster/myconfigrs_27219/mongod.conf
6. Initialize the replica set and create the primary node
Use the client to connect to any node, preferably the one that should become the primary:
/usr/yltrcc/mongodb/bin/mongo --host 180.76.159.126 --port 27019
Run the following commands:
// Initialize the replica set
> rs.initiate()
// View the replica set status
myconfigrs:SECONDARY> rs.status()
// View the configuration on the primary node
myconfigrs:PRIMARY> rs.conf()
7. Add the two secondary nodes
// Add the secondary nodes
myconfigrs:PRIMARY> rs.add("180.76.159.126:27119")
myconfigrs:PRIMARY> rs.add("180.76.159.126:27219")
// View the configuration
myconfigrs:PRIMARY> rs.conf()
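For the config server replica set, the configuration document passed to rs.initiate() can carry `configsvr: true`, and a config server replica set must not contain arbiters, which is why this set uses two plain secondaries. The following sketch builds such a document and rejects a member list that names the same host twice; `buildConfigServerReplSet` is a hypothetical helper.

```javascript
// Build a config server replica set (CSRS) configuration document.
// All members are data-bearing; arbiters are not allowed in a CSRS.
function buildConfigServerReplSet(name, hosts) {
  if (new Set(hosts).size !== hosts.length) {
    throw new Error("the same host appears more than once");
  }
  return {
    _id: name,
    configsvr: true,
    members: hosts.map((host, index) => ({ _id: index, host })),
  };
}

const csrs = buildConfigServerReplSet("myconfigrs", [
  "180.76.159.126:27019",
  "180.76.159.126:27119",
  "180.76.159.126:27219",
]);
console.log(JSON.stringify(csrs, null, 2));
```

The duplicate check guards against a slip that is easy to make by hand: adding the node you are already connected to as if it were a new member.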
Creating the routing nodes
First routing node
1. Prepare the directory for storing data and logs
# ----------- mongos01
mkdir -p /mongodb/sharded_cluster/mymongos_27017/log
2. Create or modify the configuration file: vi /mongodb/sharded_cluster/mymongos_27017/mongos.conf
systemLog:
  # Send all MongoDB log output to a file
  destination: file
  # Path of the log file that receives all mongod or mongos diagnostic logging
  path: "/mongodb/sharded_cluster/mymongos_27017/log/mongod.log"
  # On restart, append new entries to the end of the existing log file
  logAppend: true
processManagement:
  # Run the mongos or mongod process in the background as a daemon
  fork: true
  # File in which the mongos or mongod process writes its process ID (PID)
  pidFilePath: "/mongodb/sharded_cluster/mymongos_27017/log/mongod.pid"
net:
  # Binding to all IPs has a side effect: when the replica set is initialized,
  # the node name is automatically set to the local hostname instead of the IP
  #bindIpAll: true
  # IP addresses the service instance binds to
  bindIp: localhost,192.168.0.2
  # Port to bind
  port: 27017
sharding:
  # The config server replica set that this router connects to
  configDB: myconfigrs/180.76.159.126:27019,180.76.159.126:27119,180.76.159.126:27219
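The value of configDB follows the pattern `<replica set name>/<host:port>,<host:port>,...`. Typing it by hand is error-prone, so here is a tiny sketch that assembles it from its parts; `buildSeedList` is a hypothetical helper, and it assumes all members run on the same host, as they do in this walkthrough.

```javascript
// Assemble "<replSetName>/<host:port>,<host:port>,..." for members that
// all run on one host and differ only by port.
function buildSeedList(replSetName, host, ports) {
  const members = ports.map((port) => `${host}:${port}`);
  return `${replSetName}/${members.join(",")}`;
}

const configDB = buildSeedList("myconfigrs", "180.76.159.126", [
  27019, 27119, 27219,
]);
console.log(configDB);
```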
3. Start the mongos service:
/usr/yltrcc/mongodb/bin/mongos -f /mongodb/sharded_cluster/mymongos_27017/mongos.conf
4. Log in to mongos from the client:
/usr/yltrcc/mongodb/bin/mongo --host 180.76.159.126 --port 27017
At this point, application data cannot be written through the routing node: mongos is connected only to the config servers, and no shard has been added to the cluster yet.
5. Configure sharding on the routing node
Add shard: sh.addShard("IP:Port")
// Add the first replica set
sh.addShard("myshardrs01/192.168.0.2:27018,180.76.159.126:27118,180.76.159.126:27218")
// Add the second replica set
sh.addShard("myshardrs02/192.168.0.2:27318,180.76.159.126:27418,180.76.159.126:27518")
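Since a failed sh.addShard() has to be cleaned up by hand, it can be worth validating the connection string before sending it. A stray space inside a port number, for instance, is easy to miss. The sketch below parses the `<replica set name>/<host:port>,...` form and rejects malformed members; `parseSeedList` is a hypothetical helper, not a MongoDB function.

```javascript
// Parse "<replSetName>/<host:port>,<host:port>,..." and validate each member.
function parseSeedList(seedList) {
  const slash = seedList.indexOf("/");
  if (slash <= 0) {
    throw new Error("expected <replSetName>/<host:port>,...");
  }
  const replSetName = seedList.slice(0, slash);
  const members = seedList
    .slice(slash + 1)
    .split(",")
    .map((entry) => {
      // Host without whitespace or colon, then a port of 1 to 5 digits.
      const match = /^([^\s:]+):(\d{1,5})$/.exec(entry);
      if (!match) throw new Error(`malformed member "${entry}"`);
      const port = Number(match[2]);
      if (port < 1 || port > 65535) {
        throw new Error(`port out of range in "${entry}"`);
      }
      return { host: match[1], port };
    });
  return { replSetName, members };
}

const parsed = parseSeedList(
  "myshardrs01/192.168.0.2:27018,180.76.159.126:27118,180.76.159.126:27218"
);
console.log(parsed.replSetName, parsed.members.length);
```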
Tip: if adding a shard fails, remove the shard manually, check that the shard information is correct, and then add the shard again.
Removing a shard (for reference):
use admin
db.runCommand( { removeShard: "myshardrs02" } )
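Removing a shard is not instantaneous: the removeShard command starts draining the shard's data, and it is run again to check progress until the reported state is "completed". The sketch below shows that polling loop against a stand-in for the cluster, so it runs without a server; `drainShard` and `makeFakeCluster` are hypothetical names, and the fake only imitates the `state` field of the real response.

```javascript
// Re-issue removeShard until the reported state is "completed".
// runCommand is any function that takes a command document and returns
// the response. Returns the number of calls that were needed.
function drainShard(runCommand, shardName, maxPolls = 100) {
  for (let attempt = 1; attempt <= maxPolls; attempt++) {
    const response = runCommand({ removeShard: shardName });
    if (response.state === "completed") return attempt;
  }
  throw new Error(`shard ${shardName} still draining after ${maxPolls} polls`);
}

// Stand-in for a cluster: reports "started", then "ongoing", and finally
// "completed" on call number pollsUntilDone (which must be at least 2).
function makeFakeCluster(pollsUntilDone) {
  let calls = 0;
  return (command) => {
    calls += 1;
    let state = "ongoing";
    if (calls === 1) state = "started";
    else if (calls >= pollsUntilDone) state = "completed";
    return { ok: 1, state, shard: command.removeShard };
  };
}

console.log(drainShard(makeFakeCluster(4), "myshardrs02"));
```

Against a real cluster the loop would also need a pause between polls, which is left out here to keep the sketch short.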
Enable sharding: sh.enableSharding("database name"), sh.shardCollection("database name.collection name", {"key":1})
Enable sharding for the articledb database on mongos:
sh.enableSharding("articledb")
// View the sharding status
sh.status()
Shard a collection: to shard a collection, use the sh.shardCollection() method to specify the collection and the shard key.
Syntax format: sh.shardCollection(namespace, key, unique)
Parameter Description:
Parameter | Type | Description |
namespace | string | Namespace of the collection to shard, in the form <database>.<collection> |
key | document | The index specification document to use as the shard key. The shard key determines how MongoDB distributes documents among the shards. Unless the collection is empty, the index must exist before the shardCollection command is run. If the collection is empty and no index supporting the shard key exists yet, MongoDB creates the index before sharding the collection. Put simply: a document containing a field and the index traversal direction for that field. |
unique | boolean | When true, the underlying index on the shard key enforces a unique constraint. Hashed shard keys do not support unique constraints. Defaults to false. |
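The rules in the table can be turned into a quick pre-flight check on the arguments. This is a sketch under stated assumptions: `checkShardCollectionArgs` is a hypothetical helper, the collection name `comment` and the field `nickname` are made up for the example, and the check covers only the rules listed above (namespace form, key values of 1 or "hashed", and no unique constraint on a hashed key).

```javascript
// Return a list of problems with arguments intended for sh.shardCollection().
function checkShardCollectionArgs(namespace, key, unique = false) {
  const problems = [];
  // Namespace must be "<database>.<collection>".
  if (!/^[^.\s]+\.\S+$/.test(namespace)) {
    problems.push('namespace must look like "<database>.<collection>"');
  }
  const fields = Object.entries(key);
  if (fields.length === 0) {
    problems.push("shard key must name at least one field");
  }
  // Each field is either ranged (1) or hashed ("hashed").
  for (const [field, kind] of fields) {
    if (kind !== 1 && kind !== "hashed") {
      problems.push(`field "${field}" must be 1 (ranged) or "hashed"`);
    }
  }
  if (unique && fields.some(([, kind]) => kind === "hashed")) {
    problems.push("unique cannot be combined with a hashed shard key");
  }
  return problems;
}

// "comment" and "nickname" are hypothetical names for illustration.
console.log(checkShardCollectionArgs("articledb.comment", { nickname: "hashed" }));
console.log(checkShardCollectionArgs("articledb", { nickname: 1 }));
```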
6. Test inserting data after sharding
7. Add a second routing node
Beautiful sentences
When the film ended, the audience in the cinema all left sobbing. My wife and I stayed to the very end, until the music stopped and the credits had finished rolling. I can't seem to recall the perilous plot in full, but I still savor the beautiful music of Indian films and the love that, though only a hidden thread in the story, never gives up its pursuit.
Hello, I am yltrcc, sharing technical points everyday. Welcome to my official account: ylcoder