1. Kafka installation and configuration
1.1 Download Kafka
Download kafka_2.11-2.1.1.tgz from the official Kafka website.
1.2 Extract the archive
Extract the archive to a directory of your choice:
tar -zxvf kafka_2.11-2.1.1.tgz -C /path/to/your/directory
1.3 Start Kafka
Switch to the Kafka installation directory and start ZooKeeper first:
bin/zookeeper-server-start.sh config/zookeeper.properties &
Then start the Kafka broker:
bin/kafka-server-start.sh config/server.properties &
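Because both services are started in the background with &, a failed start is easy to miss. The following is a minimal sketch for checking that something is listening on the ZooKeeper port (2181) and the broker port (9092) used elsewhere in this walkthrough. It assumes bash, since it relies on bash's /dev/tcp redirection; port_open is a hypothetical helper name, not part of Kafka.

```shell
# port_open HOST PORT: succeed if a TCP connection can be opened.
# Relies on bash's /dev/tcp pseudo-device, so this needs bash, not plain sh.
port_open() {
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# 2181 is the ZooKeeper port and 9092 the broker port used in this walkthrough.
for port in 2181 9092; do
    if port_open localhost "$port"; then
        echo "port $port: listening"
    else
        echo "port $port: not reachable"
    fi
done
```

An open port only shows that a process accepted the connection. The startup output of each service is still the place to look for errors.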
1.4 Create a topic
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic flumeTopic
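To confirm that the topic exists, the same kafka-topics.sh tool can list and describe topics. The sketch below wraps both calls in a small function. KAFKA_HOME and verify_topic are names assumed for illustration; point KAFKA_HOME at the directory where Kafka was extracted.

```shell
# KAFKA_HOME is an assumed variable pointing at the extracted Kafka directory.
KAFKA_HOME="${KAFKA_HOME:-$PWD}"

# verify_topic TOPIC: list all topics, then describe the given one.
verify_topic() {
    topic="$1"
    tool="$KAFKA_HOME/bin/kafka-topics.sh"
    if [ ! -x "$tool" ]; then
        echo "kafka-topics.sh not found under $KAFKA_HOME" >&2
        return 1
    fi
    # Show every topic registered in ZooKeeper.
    "$tool" --list --zookeeper localhost:2181
    # Show partition and replica details for the requested topic.
    "$tool" --describe --zookeeper localhost:2181 --topic "$topic"
}
```

Calling verify_topic flumeTopic from the Kafka installation directory should include flumeTopic in the list and report one partition with a replication factor of 1, matching the create command above.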
2. Flume installation and configuration
2.1 Download Flume
Download apache-flume-1.9.0-bin.tar.gz from the Flume website.
2.2 Extract the archive
tar -zxvf apache-flume-1.9.0-bin.tar.gz -C /path/to/your/directory
2.3 Configure Flume
Save the following agent definition as conf/a1.conf in the Flume directory:
# Define the a1 agent
a1.sources = src1
a1.channels = ch1
a1.sinks = k1

# Define the source: follow the log file with tail
a1.sources.src1.type = exec
a1.sources.src1.command = tail -F /home/hadoop001/code/iqiyiLog/logdata/logs
a1.sources.src1.channels = ch1

# Define the sink: write events to Kafka
# (Flume 1.9 still accepts these older property names; the current names are
# kafka.topic, kafka.bootstrap.servers, flumeBatchSize and kafka.producer.acks)
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
# Topic that Flume writes to
a1.sinks.k1.topic = flumeTopic
a1.sinks.k1.brokerList = localhost:9092
a1.sinks.k1.batchSize = 20
a1.sinks.k1.requiredAcks = 1
a1.sinks.k1.channel = ch1

# Define the channel
a1.channels.ch1.type = memory
a1.channels.ch1.capacity = 1000
2.4 Start Flume
Switch to the Flume installation directory and run:
./bin/flume-ng agent --conf conf --conf-file conf/a1.conf --name a1 -Dflume.root.logger=INFO,console
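Once the agent is running, the exec source forwards every line appended to the tailed file. If no application is writing to that file yet, test lines can be appended by hand. The function below is a minimal sketch for doing so. append_sample_logs is a hypothetical helper, and the 10.0.0.N addresses and request paths are made-up values that only imitate the field layout of the sample data in section 3.2.

```shell
# append_sample_logs FILE COUNT: append COUNT synthetic access-log lines to FILE.
append_sample_logs() {
    file="$1"
    count="$2"
    i=1
    while [ "$i" -le "$count" ]; do
        # Layout: IP, date, time, quoted request, referrer ("_" for none), status.
        printf '10.0.0.%d %s "GET www/%d HTTP/1.0" _ 200\n' \
            "$i" "$(date '+%Y-%m-%d %H:%M:%S')" "$i" >> "$file"
        i=$((i + 1))
    done
}
```

For example, append_sample_logs /home/hadoop001/code/iqiyiLog/logdata/logs 5 appends five lines to the file named in a1.sources.src1.command.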
3. Flume and Kafka connectivity test
3.1 Data flow
Log data -> Flume -> Kafka
3.2 Consume from Kafka
Switch to the Kafka installation directory and run the console consumer:
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic flumeTopic --from-beginning
The consumer prints each log line as it arrives. Sample output:
124.29.143.100 2019-04-02 08:02:01 "GET www/2 HTTP/1.0" _ 200
100.187.29.143 2019-04-02 08:02:01 "GET toukouxu/821 HTTP/1.0" https://www.sogou.com/web?query = hunting ground
156.143.124.167 2019-04-02 08:02:01 "GET www/4 HTTP/1.0" _ 200
100.167.187.29 2019-04-02 08:02:01 "GET www/1 HTTP/1.0" _ 302
167.29.10.30 2019-04-02 08:02:01 "GET www/6 HTTP/1.0" _ 302
30.124.187.156 2019-04-02 08:02:01 "GET toukouxu/821 HTTP/1.0" https://search.yahoo.com/search?p = my PE teacher 302
30.167.10.156 2019-04-02 08:02:01 "GET www/1 HTTP/1.0" _ 200
143.132.29.124 2019-04-02 08:02:01 "GET www/2 HTTP/1.0" _ 302
132.156.29.187 2019-04-02 08:02:01 "GET www/1 HTTP/1.0" _ 302
29.100.124.132 2019-04-02 08:02:01 "GET www/3 HTTP/1.0" _ 200
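A quick sanity check on the consumed data is to count requests per HTTP status code. The sketch below assumes the status code is the last whitespace-separated field, as in most of the sample lines, and skips any line whose last field is not a three-digit number. count_status is a hypothetical helper name.

```shell
# count_status: read access-log lines on stdin and print "<status> <count>",
# one pair per line, sorted by status code.
count_status() {
    awk '$NF ~ /^[0-9][0-9][0-9]$/ { n[$NF]++ }
         END { for (s in n) print s, n[s] }' | sort
}

# Example with three of the sample lines.
count_status <<'EOF'
124.29.143.100 2019-04-02 08:02:01 "GET www/2 HTTP/1.0" _ 200
100.167.187.29 2019-04-02 08:02:01 "GET www/1 HTTP/1.0" _ 302
167.29.10.30 2019-04-02 08:02:01 "GET www/6 HTTP/1.0" _ 302
EOF
# prints:
# 200 1
# 302 2
```

The console consumer output can be piped into the same function. Since the consumer normally runs until interrupted, add an option such as --max-messages so that it exits and awk can print its totals.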