ELK Unified Log Management Platform Part 3 - The Use of the Logstash Grok Plug-in


In this post I will mainly cover the following topics, drawn from hands-on experience, for your reference:

1. A standard specification for Java application log content
2. How to use the Logstash grok plug-in to split the message field
3. Deleting ES indices on a schedule

1. Standard specification for Java application log content:

Recently the company has been rolling out an ELK project, and I am its operator, so there will be plenty of hands-on ELK experience to share here. Our business systems are developed mainly in Java, in particular on frameworks such as Spring Cloud and Spring Boot, so how to standardize the business systems' logs is a question the R&D architects need to answer. At present our ELK log specification is defined as follows:

<pattern>[%date{ISO8601}][%level] %logger{80} [%thread] Line:%-3L [%X{TRACE_ID}] ${dev-group-name}
${app-name} - %msg%n</pattern>

| Time | Log Level | Class File | Thread Name | Line Number | Global Pipeline Number | Development Team | System Name | Log Message |

Time: when the log entry was generated;
Log level: ERROR, WARN, INFO, DEBUG;
Class file: the name of the class that printed the log;
Thread name: the name of the thread executing the operation;
Line number: the location in the code where the log event occurred;
Global pipeline number: an ID (TRACE_ID) that runs through an entire business process;
Development team: the name of the team that develops the system;
System name: the project/application name;
Log message: the detailed log content.

For example, the standard format of log output for a business system is as follows:

[2019-06-24 09:32:14,262] [ERROR] com.bqjr.cmm.aps.job.ApsAlarmJob [scheduling-1] []
tstteam tst Line:157 - ApsAlarmJob class execute method, '[test system early warning] check index abnormal three early warning' early warning error: nested
exception is org.apache.ibatis.exceptions.PersistenceException: ### Error
querying database. Cause: java.lang.NullPointerException ### Cause:
java.lang.NullPointerException org.mybatis.spring.MyBatisSystemException:
nested exception is

2. How to use the grok plug-in of logstash to split the message field?

Now our logs are output with standard fields, but in the Kibana interface they still appear as a single message field. We need to decompose message into its individual fields so that searches can be performed per field.

Our ELK log platform architecture is: every business system has the Filebeat log collector installed; Filebeat ships the logs unmodified to a Kafka cluster; a Logstash cluster consumes from Kafka and writes to the ES cluster; Kibana then provides display and search on top of ES. The reason Logstash sits in the middle is that it has powerful text-processing features, such as the grok plug-in, which can turn raw text into structured, formatted output.

Logstash ships with many built-in regular-expression templates that can match logs from nginx, httpd, syslog and so on.

#Default grok pattern template path in logstash:
/usr/local/logstash-6.2.4/vendor/bundle/jruby/2.3.0/gems/logstash-patterns-core-4.1.2/patterns

#The grok grammar template comes with logstash:
[root@SZ1PRDELK00AP005 patterns]# ll
total 116
-rw-r--r-- 1 root root   271 Jun 24 16:05 application
-rw-r--r-- 1 root root  1831 Apr 13  2018 aws
-rw-r--r-- 1 root root  4831 Apr 13  2018 bacula
-rw-r--r-- 1 root root   260 Apr 13  2018 bind
-rw-r--r-- 1 root root  2154 Apr 13  2018 bro
-rw-r--r-- 1 root root   879 Apr 13  2018 exim
-rw-r--r-- 1 root root 10095 Apr 13  2018 firewalls
-rw-r--r-- 1 root root  5338 Apr 13  2018 grok-patterns
-rw-r--r-- 1 root root  3251 Apr 13  2018 haproxy
-rw-r--r-- 1 root root   987 Apr 13  2018 httpd
-rw-r--r-- 1 root root  1265 Apr 13  2018 java
-rw-r--r-- 1 root root  1087 Apr 13  2018 junos
-rw-r--r-- 1 root root  1037 Apr 13  2018 linux-syslog
-rw-r--r-- 1 root root    74 Apr 13  2018 maven
-rw-r--r-- 1 root root    49 Apr 13  2018 mcollective
-rw-r--r-- 1 root root   190 Apr 13  2018 mcollective-patterns
-rw-r--r-- 1 root root   614 Apr 13  2018 mongodb
-rw-r--r-- 1 root root  9597 Apr 13  2018 nagios
-rw-r--r-- 1 root root   142 Apr 13  2018 postgresql
-rw-r--r-- 1 root root   845 Apr 13  2018 rails
-rw-r--r-- 1 root root   224 Apr 13  2018 redis
-rw-r--r-- 1 root root   188 Apr 13  2018 ruby
-rw-r--r-- 1 root root   404 Apr 13  2018 squid

#Among them is a java template, with many built-in patterns for Java classes, timestamps, etc.
[root@SZ1PRDELK00AP005 patterns]# cat java
JAVACLASS (?:[a-zA-Z$_][a-zA-Z$_0-9]*\.)*[a-zA-Z$_][a-zA-Z$_0-9]*
#Space is an allowed character to match special cases like 'Native Method' or 'Unknown Source'
JAVAFILE (?:[A-Za-z0-9_. -]+)
#Allow special <init>, <clinit> methods
JAVAMETHOD (?:(<(?:cl)?init>)|[a-zA-Z$_][a-zA-Z$_0-9]*)
#Line number is optional in special cases 'Native method' or 'Unknown source'
JAVASTACKTRACEPART %{SPACE}at %{JAVACLASS:class}\.%{JAVAMETHOD:method}\(%{JAVAFILE:file}(?::%{NUMBER:line})?\)
# Java Logs
JAVATHREAD (?:[A-Z]{2}-Processor[\d]+)
JAVACLASS (?:[a-zA-Z0-9-]+\.)+[A-Za-z0-9$]+
JAVAFILE (?:[A-Za-z0-9_.-]+)
JAVALOGMESSAGE (.*)
# MMM dd, yyyy HH:mm:ss eg: Jan 9, 2014 7:13:13 AM
CATALINA_DATESTAMP %{MONTH} %{MONTHDAY}, 20%{YEAR} %{HOUR}:?%{MINUTE}(?::?%{SECOND}) (?:AM|PM)
# yyyy-MM-dd HH:mm:ss,SSS ZZZ eg: 2014-01-09 17:32:25,527 -0800
TOMCAT_DATESTAMP 20%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{HOUR}:?%{MINUTE}(?::?%{SECOND}) %{ISO8601_TIMEZONE}
CATALINALOG %{CATALINA_DATESTAMP:timestamp} %{JAVACLASS:class} %{JAVALOGMESSAGE:logmessage}
# 2014-01-09 20:03:28,269 -0800 | ERROR | com.example.service.ExampleService - something compeletely unexpected happened...
TOMCATLOG %{TOMCAT_DATESTAMP:timestamp} \| %{LOGLEVEL:level} \| %{JAVACLASS:class} - %{JAVALOGMESSAGE:logmessage}
[root@SZ1PRDELK00AP005 patterns]#

#But the default templates alone cannot match our company's custom log format, so I wrote one myself.
[root@SZ1PRDELK00AP005 patterns]# cat application
APP_DATESTAMP 20%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{HOUR}:?%{MINUTE}(?::?%{SECOND})
THREADS_NUMBER (?:[a-zA-Z0-9-]+)
GLOBAL_PIPELINE_NUMBER (?:[a-zA-Z0-9-]+)
DEV_TEAM (?:[a-zA-Z0-9-]+)
SYSTEM_NAME (?:[a-zA-Z0-9-]+)
LINE_NUMBER (Line:[0-9]+)
JAVALOGMESSAGE (.*)
APPLOG \[%{APP_DATESTAMP:timestamp}\] \[%{LOGLEVEL:loglevel}\] %{JAVACLASS:class} \[%{THREADS_NUMBER:threads_number}\] \[%{GLOBAL_PIPELINE_NUMBER:global_pipeline_number}\] %{DEV_TEAM:team} %{SYSTEM_NAME:system_name} %{LINE_NUMBER:linenumber} %{JAVALOGMESSAGE:logmessage}
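Since grok patterns compile down to ordinary regular expressions, the individual sub-patterns can be sanity-checked against a sample line with plain grep before loading them into Logstash. A minimal sketch (the sample line and its field values are illustrative, not taken from a real system):

```shell
# Sanity-check two of the custom sub-patterns with egrep; grok patterns
# are regular expressions underneath, so the syntax carries over directly.
sample='[2019-06-24 09:32:14,262] [ERROR] com.bqjr.cmm.aps.job.ApsAlarmJob [scheduling-1] [abc123] tstteam tst Line:157 - demo message'

echo "$sample" | grep -oE 'Line:[0-9]+'            # LINE_NUMBER -> Line:157
echo "$sample" | grep -oE 'ERROR|WARN|INFO|DEBUG'  # LOGLEVEL    -> ERROR
```

If a sub-pattern fails to match here, it will not match inside %{APPLOG} either, so this is a cheap first check before touching the pipeline.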

# Then you configure logstash

[root@SZ1PRDELK00AP005 patterns]# cat /usr/local/logstash/config/yunyan.conf
input {
  kafka {
    bootstrap_servers => "192.168.1.12:9092,192.168.1.14:9092,192.168.1.15:9092"
    topics_pattern => "elk-tst-tst-info.*"
    group_id => "test-consumer-group"
    codec => json
    consumer_threads => 3
    decorate_events => true
    auto_offset_reset => "latest"
  }
}

filter {
    grok {
             match => {"message" => ["%{APPLOG}","%{JAVALOGMESSAGE:message}"]}  #Note that APPLOG here is the name I defined above.
             overwrite => ["message"]

}
}

output {
  elasticsearch {
     hosts => ["192.168.1.19:9200","192.168.1.24:9200"]
     user => "elastic"
     password => "111111"
     index => "%{[@metadata][kafka][topic]}-%{+YYYY-MM-dd}"
     workers => 1
  }
}

#output {
#   stdout{
#      codec => "rubydebug"
#  }
#}

#It is generally recommended that, while debugging, you first send output to stdout rather than directly to ES. Once the standard output confirms that all fields are parsed correctly, switch the output to ES.
# How to write the regular expression of grok, there is an online grok expression test address: http://grokdebug.herokuapp.com/
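You can also test locally by running a one-off pipeline from stdin with the rubydebug codec. A sketch, assuming Logstash is installed under /usr/local/logstash and the custom application pattern file sits in the default patterns directory shown earlier, so that %{APPLOG} resolves:

```shell
# Feed one sample line through the grok filter and print the parsed
# fields; -e takes the pipeline config inline, and with no input block
# configured, logstash reads events from stdin by default.
echo '[2019-06-24 09:32:14,262] [ERROR] com.bqjr.cmm.aps.job.ApsAlarmJob [scheduling-1] [abc123] tstteam tst Line:157 - demo message' | \
  /usr/local/logstash/bin/logstash -e '
    filter { grok { match => { "message" => ["%{APPLOG}"] } } }
    output { stdout { codec => rubydebug } }'
```

Separately, `/usr/local/logstash/bin/logstash -f /usr/local/logstash/config/yunyan.conf --config.test_and_exit` checks the config file's syntax without starting the pipeline, which is useful before restarting the service.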

After the logs are output in the standard format, searches can be performed in key:value form. For example, typing loglevel:ERROR in the search bar returns only log entries at the ERROR level.

3. Delete ES indices regularly:

The index name is defined in the Logstash output plug-in configuration. For example, a per-day index appends -%{+YYYY-MM-dd} to the index name; for a per-month index, use -%{+YYYY-MM} instead. Different kinds of content deserve different schemes: operating-system logs, which change little from day to day, can be indexed by month, while the business systems' own application logs produce far more volume per day and are better indexed by day. For Elasticsearch, an index that is too large hurts performance, and so does having too many indices; the main performance bottleneck of Elasticsearch is the CPU.

While operating the ELK project, I found that oversized index files and an excessive number of indices, combined with the low CPU configuration of our ES data nodes, caused the ES cluster to crash. There are several ways to address this: the first is to delete useless indices on a schedule; the second is to tune the ES index parameters. I have not yet put the second into practice and will summarize it in a later document, so here I first write up the methods for deleting indices on a schedule and by hand.
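To make the naming scheme concrete, here is how the -%{+YYYY-MM-dd} and -%{+YYYY-MM} suffixes map to actual index names; a small illustrative sketch using GNU date, with the topic name taken from the config above:

```shell
# Illustrate how Logstash's date suffix turns into concrete index names.
topic="elk-tst-tst-info"
day=$(date -d "2019-06-24" +%Y-%m-%d)    # per-day suffix
month=$(date -d "2019-06-24" +%Y-%m)     # per-month suffix
echo "${topic}-${day}"     # elk-tst-tst-info-2019-06-24
echo "${topic}-${month}"   # elk-tst-tst-info-2019-06
```

With the per-day scheme, one day's logs can be dropped by deleting a single index, which is exactly what the cleanup script below relies on.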

#!/bin/bash
#Cutoff date (7 days ago)
DATA=`date -d "1 week ago" +%Y-%m-%d`

#Current date
time=`date`

#Delete the indices from 7 days ago
curl -u elastic:654321 -XGET "http://192.168.1.19:9200/_cat/indices/?v"|grep $DATA
if [ $? == 0 ];then
  curl -u elastic:654321 -XDELETE "http://127.0.0.1:9200/*-${DATA}"
  echo "$time: cleared the $DATA indices!"
fi

#Delete indices manually: write the index names to a text file, then delete them in a loop
curl -u elastic:654321 -XGET "http://192.168.1.19:9200/_cat/indices/?v"|awk '{print $3}'|grep elk >> /tmp/es.txt
for i in `cat /tmp/es.txt`;do curl -u elastic:654321 -X DELETE "192.168.1.19:9200/$i";done
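Before wiring the deletion into cron, it is worth previewing what the computed cutoff looks like; a minimal sketch of the date arithmetic the script relies on (GNU date assumed):

```shell
# Compute the 7-day-old cutoff the same way the cron script does, and
# build the wildcard path that would be handed to curl -XDELETE.
cutoff=$(date -d "1 week ago" +%Y-%m-%d)
delete_path="*-${cutoff}"
echo "$delete_path"
```

Echoing the path first (or hitting _cat/indices with the same wildcard) is a cheap dry run: a malformed date would otherwise silently match nothing, or worse, the wrong indices.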

OK, that's all for now. I've been very busy at work recently, so it's hard to find time to update this technology blog; I mostly write late in the evening or early in the morning. Thank you for your continued attention.

For more details, follow my personal WeChat public account, "IT Operations and Maintenance in the Cloud Age". It shares new technologies and trends in Internet operations and maintenance, including industry news and technical documents, and focuses on DevOps, Jenkins, Zabbix monitoring, Kubernetes, ELK, middleware such as Redis and MQ, and operations-oriented languages such as shell and Python. I have worked in IT operations for more than 10 years, doing Linux/Unix system operations since 2008, and have a solid understanding of the related technologies. The posts on the account are summaries of my practical work experience and essentially all original; I would like to share that accumulated experience with you, and I hope we grow and progress together along the IT operations career path.


Added by new7media on Thu, 27 Jun 2019 00:34:23 +0300