1. Operating system
Deploy Kafka on Linux, for two reasons:
- I/O model
- Network transmission efficiency
1.1 I/O model
An I/O model describes how the operating system carries out I/O operations. There are five common models:
- Blocking I/O
- Non-blocking I/O
- I/O multiplexing
- Signal-driven I/O
- Asynchronous I/O
Roughly speaking, each model in this list is more efficient than the one before it. select belongs to the third category (I/O multiplexing), while epoll sits between the third and fourth.
Kafka's client networking layer is built on Java's Selector, which is implemented with epoll on Linux and with select on Windows. Kafka deployed on Linux therefore gets more efficient I/O.
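As a minimal sketch (not Kafka's actual networking code), this is the Selector-based polling loop Java NIO provides; the port number and the empty accept branch are placeholders:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

public class SelectorSketch {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();              // backed by epoll on Linux, select on Windows
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(9092));         // Kafka's default port, used here only for illustration
        server.configureBlocking(false);                  // channels must be non-blocking to register
        server.register(selector, SelectionKey.OP_ACCEPT);

        while (true) {
            selector.select();                            // blocks until at least one channel is ready
            for (SelectionKey key : selector.selectedKeys()) {
                if (key.isAcceptable()) {
                    // accept the connection and register it for OP_READ here
                }
            }
            selector.selectedKeys().clear();              // the ready set must be cleared manually
        }
    }
}
```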
1.2 Network transmission efficiency
When data is transferred between disk and network, Kafka can take advantage of the speed, convenience, and efficiency of the zero-copy mechanism on Linux; this benefit is not available on Windows.
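In Java, this zero-copy path is exposed as FileChannel.transferTo, which delegates to sendfile(2) on Linux. A minimal sketch, with an assumed file name and destination address:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class ZeroCopySend {
    public static void main(String[] args) throws IOException {
        // "segment.log" and the destination address are placeholders for illustration
        try (FileChannel file = FileChannel.open(Paths.get("segment.log"), StandardOpenOption.READ);
             SocketChannel socket = SocketChannel.open(new InetSocketAddress("localhost", 9999))) {
            long position = 0;
            long size = file.size();
            while (position < size) {
                // data flows from the page cache to the socket inside the kernel,
                // never copied through a user-space buffer
                position += file.transferTo(position, size - position, socket);
            }
        }
    }
}
```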
2. Disk
Ordinary mechanical hard disks are sufficient.
If the budget allows, you can build the disks into a RAID group; if funds are tight, read/write performance can still be guaranteed without RAID.
2.1 Choosing a mechanical hard disk or an SSD
Ordinary mechanical hard disks are fine. Kafka reads and writes its logs sequentially, and the main weakness of a mechanical disk is slow random access, so using one does not hurt Kafka's performance.
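A minimal sketch of the append-only write pattern behind this; the segment file name mimics Kafka's log naming, but the code is illustrative rather than Kafka's implementation:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class SequentialAppend {
    public static void main(String[] args) throws IOException {
        try (FileChannel log = FileChannel.open(Paths.get("00000000000000000000.log"),
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            for (int i = 0; i < 1000; i++) {
                // every write lands at the current end of the file, so a mechanical
                // disk never has to seek between writes
                log.write(ByteBuffer.wrap(("message-" + i + "\n").getBytes(StandardCharsets.UTF_8)));
            }
        }
    }
}
```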
2.2 RAID for the disk group
RAID offers redundant storage and load balancing.
However, Kafka ships with its own redundancy mechanism (replication) and achieves load balancing through its partition design. So if you can afford it, you can store data on a RAID group; if cost-effectiveness is the priority, skipping RAID entirely is fine.
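For example, Kafka's redundancy is configured per topic through the replication factor at topic creation (host name and topic are placeholders; the --bootstrap-server flag applies to Kafka 2.2 and later):

```
# Each partition of this topic is stored on 2 brokers, giving Kafka-level
# redundancy without RAID (host and topic names are placeholders).
bin/kafka-topics.sh --bootstrap-server hadoop102:9092 \
  --create --topic test --partitions 3 --replication-factor 2
```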
3. Disk capacity
Disk capacity depends on several factors:
- Number of messages per day
- Size of each message
- Number of replicas
- Message retention time
- Whether compression is enabled
How much storage space does Kafka need?
Design scenario: 100 million log messages are sent to Kafka every day; each message is kept in two copies to prevent data loss; data is retained for two weeks; and the average message size is 1 KB.
- 100 million 1 KB messages per day, with 2 copies each: 100,000,000 × 1 KB × 2 ÷ 1000 ÷ 1000 = 200 GB per day
- Kafka stores other kinds of data (such as indexes) besides the messages themselves, so add 10% redundancy: about 220 GB per day
- Over two weeks: 220 GB × 14 ≈ 3 TB
- If compression is enabled, with a compression ratio of about 0.75, plan for 3 TB × 0.75 = 2.25 TB of total storage
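The same arithmetic as a small, self-contained Java sketch; all figures come from the scenario above:

```java
public class DiskCapacityEstimate {
    public static void main(String[] args) {
        long messagesPerDay = 100_000_000L;  // 100 million messages per day
        long messageSizeKB = 1;              // average message size: 1 KB
        int replicas = 2;                    // copies of each message
        int retentionDays = 14;              // retained for two weeks
        double compressionRatio = 0.75;      // assumed ratio if compression is enabled

        double dailyGB = messagesPerDay * messageSizeKB * replicas / 1000.0 / 1000.0; // = 200 GB
        double withOverheadGB = dailyGB * 1.10;                  // +10% for indexes etc. = 220 GB
        double totalTB = withOverheadGB * retentionDays / 1000;  // ≈ 3 TB over two weeks
        double compressedTB = totalTB * compressionRatio;        // ≈ 2.25 TB with compression

        System.out.printf("daily: %.0f GB, two weeks: %.2f TB, compressed: %.2f TB%n",
                withOverheadGB, totalTB, compressedTB);
    }
}
```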
4. Bandwidth
- If the network has 10-gigabit bandwidth, it will not be the bottleneck. If the data volume is especially large, size the cluster with the design scenario below.
- If the network has 100-megabit or gigabit bandwidth, it becomes the bottleneck when processing large amounts of data. You can size the cluster with the traditional empirical formula below, or work through the design scenario and adapt it to your own production environment.
Empirical formula: number of servers = 2 × (producer peak production rate in MB/s × number of replicas ÷ 100) + 1
Design scenario: the machine room has gigabit bandwidth and we need to process 1 TB of data within one hour. How many Kafka servers do we need?
- Since the link is gigabit Ethernet (1000 Mbps = 1 Gbps), each server can receive at most 1000 Mb of data per second.
- Assume Kafka may occupy 70% of the server's network capacity (the other 30% is reserved for other services), i.e. 700 Mb/s. Kafka also should not run at its peak constantly, so leave two thirds (or even three quarters) of that as headroom: a single Kafka server should actually plan on about 700 Mb ÷ 3 ≈ 240 Mb per second.
- 1 TB must be processed in one hour: 1 TB = 1024 × 1024 × 8 Mb ≈ 8,000,000 Mb, so the required throughput is 8,000,000 Mb ÷ 3600 s ≈ 2330 Mb per second.
- The number of servers required is 2330 Mb ÷ 240 Mb ≈ 10.
- Accounting for replication: with 2 copies of each message, 20 servers are needed; with 3 copies, 30.
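The same sizing in code form; the constants come from the scenario, and the rounding matches the text above:

```java
public class BrokerCountEstimate {
    public static void main(String[] args) {
        double linkMbps = 1000;                    // gigabit link: 1000 Mb/s per server
        double kafkaShare = 0.70;                  // Kafka may use 70% of the NIC
        double headroom = 1.0 / 3.0;               // run at about a third of that share
        double usableMbps = linkMbps * kafkaShare * headroom;  // ≈ 233 Mb/s (text rounds to 240)

        double totalMb = 1024.0 * 1024.0 * 8;      // 1 TB expressed in megabits ≈ 8,000,000 Mb
        double requiredMbps = totalMb / 3600;      // ≈ 2330 Mb/s to finish within an hour

        int replicas = 2;                          // server count scales with the replication factor
        long servers = (long) Math.ceil(requiredMbps / usableMbps) * replicas;
        System.out.println("servers needed: " + servers);  // 10 × 2 = 20
    }
}
```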
5. Memory
It is recommended that a server node running Kafka have at least 16G of memory.
Kafka's memory consists of heap memory + page cache.
5.1 Heap memory configuration
A heap of 10G-15G per node is recommended.
Modify this in kafka-server-start.sh:
if [ "x$KAFKA_HEAP_OPTS" = "x" ]; then export KAFKA_HEAP_OPTS="-Xmx10G -Xms10G" fi
5.2 Checking Kafka's GC status
Find Kafka's process ID with the jps command:
```
[atguigu@hadoop102 kafka]$ jps
2321 Kafka
5255 Jps
1931 QuorumPeerMain
```
Check Kafka's GC status using that process ID:
```
[atguigu@hadoop102 kafka]$ jstat -gc 2321 1s 10
 S0C    S1C    S0U    S1U      EC       EU        OC         OU        MC       MU      CCSC    CCSU   YGC  YGCT   FGC  FGCT   GCT
 0.0   7168.0  0.0   7168.0 103424.0  60416.0 1986560.0  148433.5  52092.0  46656.1  6780.0  6202.2   13  0.531    0  0.000  0.531
 0.0   7168.0  0.0   7168.0 103424.0  60416.0 1986560.0  148433.5  52092.0  46656.1  6780.0  6202.2   13  0.531    0  0.000  0.531
 0.0   7168.0  0.0   7168.0 103424.0  60416.0 1986560.0  148433.5  52092.0  46656.1  6780.0  6202.2   13  0.531    0  0.000  0.531
 0.0   7168.0  0.0   7168.0 103424.0  60416.0 1986560.0  148433.5  52092.0  46656.1  6780.0  6202.2   13  0.531    0  0.000  0.531
 0.0   7168.0  0.0   7168.0 103424.0  60416.0 1986560.0  148433.5  52092.0  46656.1  6780.0  6202.2   13  0.531    0  0.000  0.531
 0.0   7168.0  0.0   7168.0 103424.0  61440.0 1986560.0  148433.5  52092.0  46656.1  6780.0  6202.2   13  0.531    0  0.000  0.531
 0.0   7168.0  0.0   7168.0 103424.0  61440.0 1986560.0  148433.5  52092.0  46656.1  6780.0  6202.2   13  0.531    0  0.000  0.531
 0.0   7168.0  0.0   7168.0 103424.0  61440.0 1986560.0  148433.5  52092.0  46656.1  6780.0  6202.2   13  0.531    0  0.000  0.531
 0.0   7168.0  0.0   7168.0 103424.0  61440.0 1986560.0  148433.5  52092.0  46656.1  6780.0  6202.2   13  0.531    0  0.000  0.531
```
S0C: capacity of survivor space 0
S1C: capacity of survivor space 1
S0U: used size of survivor space 0
S1U: used size of survivor space 1
EC: Eden space capacity
EU: Eden space used
OC: old generation capacity
OU: old generation used
MC: metaspace capacity
MU: metaspace used
CCSC: compressed class space capacity
CCSU: compressed class space used
YGC: number of young-generation GC events
YGCT: total young-generation GC time
FGC: number of full GC events
FGCT: total full GC time
GCT: total time spent in garbage collection
5.3 Checking Kafka's heap memory usage
Inspect Kafka's heap using the process ID:
```
[atguigu@hadoop102 kafka]$ jmap -heap 2321
Attaching to process ID 2321, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.212-b10

using thread-local object allocation.
Garbage-First (G1) GC with 8 thread(s)

Heap Configuration:
   MinHeapFreeRatio         = 40
   MaxHeapFreeRatio         = 70
   MaxHeapSize              = 2147483648 (2048.0MB)
   NewSize                  = 1363144 (1.2999954223632812MB)
   MaxNewSize               = 1287651328 (1228.0MB)
   OldSize                  = 5452592 (5.1999969482421875MB)
   NewRatio                 = 2
   SurvivorRatio            = 8
   MetaspaceSize            = 21807104 (20.796875MB)
   CompressedClassSpaceSize = 1073741824 (1024.0MB)
   MaxMetaspaceSize         = 17592186044415 MB
   G1HeapRegionSize         = 1048576 (1.0MB)

Heap Usage:
G1 Heap:
   regions  = 2048
   capacity = 2147483648 (2048.0MB)
   used     = 246367744 (234.95458984375MB)
   free     = 1901115904 (1813.04541015625MB)
   11.472392082214355% used
G1 Young Generation:
Eden Space:
   regions  = 83
   capacity = 105906176 (101.0MB)
   used     = 87031808 (83.0MB)
   free     = 18874368 (18.0MB)
   82.17821782178218% used
Survivor Space:
   regions  = 7
   capacity = 7340032 (7.0MB)
   used     = 7340032 (7.0MB)
   free     = 0 (0.0MB)
   100.0% used
G1 Old Generation:
   regions  = 147
   capacity = 2034237440 (1940.0MB)
   used     = 151995904 (144.95458984375MB)
   free     = 1882241536 (1795.04541015625MB)
   7.471886074420103% used

13364 interned Strings occupying 1449608 bytes.
```
5.4 Kafka page cache
The page cache is memory managed by the Linux kernel on the server. We only need to ensure that about 25% of an active segment (1G by default) stays in the page cache.
5.5 Summary
Putting the above together (a 10G heap plus roughly 1G of page cache), Kafka needs at least 11G of memory to run smoothly and stably in a big-data scenario. It is therefore recommended that a server node running Kafka have at least 16G of memory.
6. CPU
It is recommended that a Kafka server have more than 32 CPU cores.
Looking at Kafka's thread-related configuration parameters:
| Parameter | Description | Default |
|---|---|---|
| num.network.threads | Number of threads the server uses to receive requests from the network and send responses to it | 3 |
| num.io.threads | Number of threads the server uses to process requests, which may include disk I/O | 8 |
| num.replica.fetchers | Number of replica fetcher threads; increasing this raises the parallelism of replica fetching | 1 |
| num.recovery.threads.per.data.dir | Number of threads per data directory used for log recovery at startup and flushing at shutdown | 1 |
| log.cleaner.threads | Number of background threads used for log cleaning | 1 |
| background.threads | Number of threads used for various background processing tasks | 10 |
Among these, num.recovery.threads.per.data.dir is used only during startup and shutdown, and log cleaning runs only at intervals, so the remaining parameters account for at least 3 + 8 + 1 + 10 = 22 resident threads.
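If more cores are available, the thread counts above are the usual knobs to raise. A hedged example for server.properties; the values below are illustrative assumptions for a 32-core broker, not defaults or official recommendations:

```
# server.properties -- illustrative tuning for a 32-core broker (assumed values)
num.network.threads=9
num.io.threads=16
num.replica.fetchers=3
```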
In a production environment, at least 16 CPU cores, and preferably more than 32, are recommended so that the Kafka cluster can process data smoothly in a big-data environment.