Background
The business purpose is to analyze the daily logs generated by Nginx and Apache, monitor URLs, IPs, REST interfaces, and other information, and send the data to the Elasticsearch service.
Contrast with Flume
No repeated consumption, no data loss. At present, Flume supports HDFS better (personal understanding).
Offline installation
JAVA_HOME must be configured and must point at Java 8 or above.
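Since the installation is offline, the tarball has to be copied to the host first. A minimal sketch, assuming a Logstash 2.x tarball and a JDK already present on the machine (the version numbers and paths below are placeholders):

```
# unpack the tarball that was copied to the offline host
tar -zxvf logstash-2.4.1.tar.gz -C /home/bingo/install

# JAVA_HOME must point at a Java 8 (or newer) JDK
export JAVA_HOME=/usr/java/jdk1.8.0_151
export PATH=$JAVA_HOME/bin:$PATH

# verify the install
cd /home/bingo/install/logstash-2.4.1
bin/logstash --version
```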
Standard I/O
```
bin/logstash -e 'input { stdin {} } output { stdout{} }'
```
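Typing a line at the prompt echoes it back as an event. To see the full event structure (timestamp, host, message fields) rather than the plain line, the same one-liner can be run with the rubydebug codec, a standard option of the stdout output:

```
bin/logstash -e 'input { stdin {} } output { stdout { codec => rubydebug } }'
```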
File to standard output
First of all, in the Logstash directory:

```
mkdir conf && touch conf/file-stdout.conf
vim conf/file-stdout.conf
```

```
input {
  file {
    path => "/home/bingo/data/test.log"
    start_position => "beginning"
    ignore_older => 0
  }
}
output {
  stdout{}
}
```

Finally, start it:

```
bin/logstash -f conf/file-stdout.conf
```

- Multiple files: `path => "/home/bingo/data/*.log"`
- Multiple directories: `path => "/home/bingo/data/*/*.log"`
- Parameter description: `start_position` defaults to `end`, meaning reading starts from the end of the file; `ignore_older` by default skips logs older than 24 hours, and 0 means no log is ignored for being too old.
After executing the command, the contents of the log file are printed to the console.
- This mode continuously monitors the file for new input.
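Note that the file input records its read position in a sincedb file, so a restarted Logstash will not re-emit lines it has already processed, even with `start_position => "beginning"`. For repeated testing, one common tweak (not part of the original config) is to discard that state:

```
input {
  file {
    path => "/home/bingo/data/test.log"
    start_position => "beginning"
    sincedb_path => "/dev/null"   # forget read positions, so every run re-reads the file
  }
}
```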
File to file
- The startup method is the same as for file to standard output; the difference is the configuration file (start command shown after the config below):
```
touch conf/file-file.conf
vim conf/file-file.conf
```

```
input {
  file {
    path => "/home/connect/install/data/test.log"
    start_position => "beginning"
    ignore_older => 0
  }
}
output {
  file {
    path => "/home/connect/install/data/test1.log"
  }
  stdout{
    codec => rubydebug
  }
}
```
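As noted above, it starts the same way, just with this config file:

```
bin/logstash -f conf/file-file.conf
```

Each event is then written to test1.log by the file output and pretty-printed on the console by the rubydebug codec.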
File to Elasticsearch
```
touch conf/file-es.conf
vim conf/file-es.conf
```

```
input {
  file {
    type => "flow"
    path => "/home/bingo/data/logstash/logs/*/*.txt"
    discover_interval => 5
    start_position => "beginning"
  }
}
output {
  if [type] == "flow" {
    elasticsearch {
      index => "flow-%{+YYYY.MM.dd}"
      hosts => ["master01:9200", "worker01:9200", "worker02:9200"]
    }
  }
}
```
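Once events flow in, the daily indices can be checked with Elasticsearch's standard _cat API (using one of the hosts from the config above):

```
curl 'http://master01:9200/_cat/indices/flow-*?v'
```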
File to Kafka
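The original notes do not include this config, so the following is only a minimal sketch of a file-to-Kafka pipeline built on the standard kafka output plugin; the broker list and topic name are placeholders, and exact option names differ across plugin versions:

```
input {
  file {
    path => "/home/bingo/data/logstash/logs/*/*.txt"
    start_position => "beginning"
  }
}
output {
  kafka {
    # placeholders: point these at your own cluster and topic
    bootstrap_servers => "kafka01:9092,kafka02:9092,kafka03:9092"
    topic_id => "bs_tracking_log_json"
    codec => "json"
  }
}
```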
Kafka to ES
```
touch conf/kafka-es.conf
vim conf/kafka-es.conf
```

```
input {
  kafka {
    zk_connect => "kafkasit02zk01.cnsuning.com:2181,kafkasit02zk02.cnsuning.com:2181,kafkasit02zk03.cnsuning.com:2181"
    auto_offset_reset => "smallest"
    group_id => "bdes_clm_bs_tracking_log_json"
    topic_id => "clm_bs_tracking_log_json"
    consumer_threads => 2
    codec => "json"
    queue_size => 500
    fetch_message_max_bytes => 104857600
  }
}
output {
  elasticsearch {
    hosts => ["olap01-sit.cnsuning.com:9900","olap02-sit.cnsuning.com:9900","olap03-sit.cnsuning.com:9900"]
    document_type => "bs_tracking_log"
    #document_id => "%{[mblnr]}%{[mjahr]}"
    flush_size => 102400
    index => "clm"
    timeout => 10
  }
}
```
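Start it like the other pipelines:

```
bin/logstash -f conf/kafka-es.conf
```

Note that `zk_connect` and `auto_offset_reset => "smallest"` belong to the older ZooKeeper-based kafka input shipped with Logstash 2.x; newer versions of the plugin connect through `bootstrap_servers` instead.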
Reference: Getting started with Logstash basics