Fluentd (td-agent) log processing

1. What is td-agent

td-agent is the stable, packaged distribution of Fluentd, a log collector that provides a rich set of plugins for adapting to different data sources, output destinations, and so on.

In practice, a simple configuration lets us collect information from different sources and deliver it to different destinations: events are first sent to Fluentd, which then forwards them, through the plugins named in the configuration, to targets such as files, SaaS platforms, databases, or even another Fluentd instance.

2. How to install td-agent

Linux distribution: CentOS

2.1 Run the install script

curl -L https://toolbelt.treasuredata.com/sh/install-redhat-td-agent3.sh | sh

2.2 Check whether it is installed

rpm -qa | grep td-agent

2.3 Service commands

Start service (systemd)   systemctl start td-agent
Start service (init.d)    /etc/init.d/td-agent start
View service status       /etc/init.d/td-agent status
Stop service              /etc/init.d/td-agent stop
Restart service           /etc/init.d/td-agent restart

2.4 Default configuration file path

/etc/td-agent/td-agent.conf

2.5 Default log file path

/var/log/td-agent/td-agent.log
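
While adjusting the configuration, it is handy to follow the agent's own log in a second terminal:

tail -f /var/log/td-agent/td-agent.log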

3. Terminology

source: specifies the data source
match: specifies the output destination
filter: specifies an event processing pipeline
system: sets system-wide configuration
label: groups output and filter directives for internal routing
@include: includes other configuration files in a configuration file
plugins: Fluentd collects and sends logs through plugins. Some plugins are built in; non-built-in plugins must be installed before use (see the example below)
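
td-agent bundles its own gem command, td-agent-gem, for installing plugins. For example, installing the Kafka output plugin used later in section 6 looks like this:

td-agent-gem install fluent-plugin-kafka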

4. Configuration file walkthrough

# Receive events from 20000/tcp
# This is used by log forwarding and the fluent-cat command
<source>
  @type forward
  port 20000
</source>

# http://this.host:8081/myapp.access?json={"event":"data"}
<source>
  @type http
  port 8081
</source>

<source>
  @type tail
  path /root/shell/test.log
  tag myapp.access
</source>

# Match events tagged with "myapp.access" and
# store them to /var/log/td-agent/access.%Y-%m-%d
# Of course, you can control how you partition your data
# with the time_slice_format option.
<match myapp.access>
  @type file
  path /var/log/td-agent/access
</match>
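
With the configuration above loaded, each source can be exercised from the shell. A quick sketch (the exact location of the fluent-cat binary varies with the td-agent version):

# forward source: send an event tagged myapp.access to port 20000
echo '{"event":"data"}' | fluent-cat myapp.access --port 20000

# http source: the tag is taken from the URL path
curl -X POST -d 'json={"event":"data"}' http://localhost:8081/myapp.access

# tail source: append a line to the watched file
echo "hello" >> /root/shell/test.log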

source blocks configure where log events come from

@type: specifies the input plugin to use

forward: receives events from another Fluentd instance (or from the fluent-cat command)

http: receives events posted in HTTP requests

tail: reads events from the end of a log file

port: the port to listen on when receiving data sent by other machines

path: the location of the file to read

tag: the tag attached to the events, which is matched against the pattern of a match block

match blocks configure where events are forwarded

myapp.access: the match pattern; events whose tag matches it are handled by this block

@type: the output plugin; events can be written to Kafka, a local file, a database, MongoDB, and so on

path: when outputting to a file, the path of the output file. Other destinations have their own options; when forwarding to Kafka, for example, the Kafka-specific settings are also placed inside the match block.
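
Note that match patterns also accept wildcards: * matches one tag part and ** matches zero or more parts. A minimal sketch using the built-in stdout output:

# handles myapp.access, myapp.error, and any other single-part suffix
<match myapp.*>
  @type stdout
</match>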


A match block can also contain store sections (used together with the copy output plugin): each store directs a copy of the matched events to one destination, so the same events can be written both to Kafka and to a local file; see the examples in section 6.

<source>
  type tail
  format none
  path /var/log/apache2/access_log
  pos_file /var/log/apache2/access_log.pos
  tag mytail
</source>

5. Explanation of selected parameters

format: configures the expression used to parse incoming lines; only lines that match the format expression become events that reach match and get stored by its store sections.

type tail: tail is one of Fluentd's built-in input plugins; its principle is to continuously read incremental lines from the source file, much like the Linux tail command. Other input types such as http and forward can be used instead, and third-party input plugins can be used by replacing tail with the corresponding plugin name, e.g. type tail_ex (note the underscore in tail_ex).

format apache: uses Fluentd's built-in Apache log parser. You can also supply your own expression, as sketched below.
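
If none of the built-in parsers fit, a regular expression with named capture groups can be supplied instead; a minimal sketch (the group names level and message are illustrative):

# each named capture group becomes a field of the event record
format /^(?<level>\w+) (?<message>.*)$/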

path /var/log/apache2/access_log: specifies the location of the log file to collect.

pos_file /var/log/apache2/access_log.pos: this parameter is strongly recommended. The access_log.pos file is created automatically; make sure Fluentd has write permission to it, because the position last read from access_log is recorded in this file. It mainly guarantees that after the fluentd service goes down and restarts, collection resumes where it left off, avoiding loss of log data and preserving the integrity of collection.
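
For reference, the pos file is a small tab-separated text file: each line records the watched path, the last read offset, and the file's inode (both in hexadecimal). Its contents look roughly like this (values illustrative):

/var/log/apache2/access_log	0000000000000d3c	0000000000580b1d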
 

6. Configuration examples

6.1 Send data to both a log file and Kafka via HTTP

 

# http://this.host:8081/mytail?json={"event":"data"}
<source>
  @type http
  port 8081
</source>



<match mytail>
  @type copy
  <store>
    @type kafka
    brokers localhost:9092
    default_topic test1
    default_message_key message
    ack_timeout 2000
    flush_interval 1
    required_acks -1
  </store>
  <store>
    @type file
    path /var/log/td-agent/access
  </store>
</match>
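
Assuming td-agent and a local Kafka broker are both running, the pipeline can be verified end to end (kafka-console-consumer.sh ships with Kafka):

# post a test event; the tag mytail matches the <match mytail> block
curl -X POST -d 'json={"event":"data"}' http://localhost:8081/mytail

# the event should land in the output file...
ls /var/log/td-agent/ | grep access

# ...and on the Kafka topic
kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test1 --from-beginning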

6.2 Send data to both a log file and Kafka by tailing a file


<source>
  type tail
  format none
  path /var/log/apache2/access_log
  pos_file /var/log/apache2/access_log.pos
  tag mytail
</source>

<match mytail>
  @type copy
  <store>
    @type kafka
    brokers localhost:9092
    default_topic test1
    default_message_key message
    ack_timeout 2000
    flush_interval 1
    required_acks -1
  </store>
  <store>
    @type file
    path /var/log/td-agent/access
  </store>
</match>
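
To test this variant, append a line to the watched file; because format none is used, the whole line becomes the message field of the event:

echo "hello fluentd" >> /var/log/apache2/access_log

Then check /var/log/td-agent/access* and the test1 topic as in 6.1.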
