GoAccess: a lightweight nginx log analysis tool

What is GoAccess

GoAccess is an open-source, real-time web log analyzer that runs in a command-line terminal.

It produces fast and varied HTTP statistics.

Results can be viewed in the terminal through a client such as XShell, and HTML reports can also be generated.

GitHub address: https://github.com/allinurl/goaccess

Official website address: http://goaccess.io/

Install GoAccess

Test environment: CentOS 7

#  yum -y install glib2 glib2-devel ncurses ncurses-devel GeoIP GeoIP-devel
#  wget http://tar.goaccess.io/goaccess-1.2.tar.gz
#  tar -xzvf goaccess-1.2.tar.gz
#  cd goaccess-1.2/
#  ./configure --enable-geoip --enable-utf8
#  make && make install

The default configuration file is:

vi /usr/local/etc/goaccess.conf

time-format %H:%M:%S
date-format %d/%b/%Y
log-format

Next, test it. The GoAccess log-format rule has to be written to match the log format used by nginx.

goaccess  -f /usr/local/nginx/logs/access.log -a > /root/test/report.html

Most articles online only cover the default, unmodified nginx log format and say little about custom formats. If you use a custom nginx log format, pay special attention here: when log-format is not configured correctly, the GoAccess results will be poor or simply wrong.

Take my nginx log format as an example:

log_format main      '$server_name $remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" "$http_x_forwarded_for" $upstream_addr $request_time $upstream_response_time';

None of the log formats predefined by GoAccess can parse such logs, so a custom log-format is needed.
Mine is:

log-format %^ %h %^ %^ [%d:%t %^] "%r" %s %b "%R" "%u" "%^" %^ %T %^
$server_name             --->  %^          --->  Skipped
$remote_addr             --->  %h          --->  Host (client IP address, IPv4 or IPv6)
[$time_local]            --->  [%d:%t %^]  --->  Date and time; the trailing %^ skips the time zone
"$request"               --->  "%r"        --->  Request line from the client. It must be enclosed in delimiters (single quotes, double quotes, or similar) to be parsed; without them, use the combination %m %U %H instead.
$status                  --->  %s          --->  Status code sent to the client
$body_bytes_sent         --->  %b          --->  Size of the response returned to the client
"$http_referer"          --->  "%R"        --->  "Referer" HTTP request header
"$http_user_agent"       --->  "%u"        --->  User agent
"$http_x_forwarded_for"  --->  "%^"        --->  Skipped
$request_time            --->  %T          --->  Time taken to serve the request, in seconds with millisecond resolution. Note: %D takes precedence over %T if both are used.
Everything else (-, $remote_user, $upstream_addr, $upstream_response_time)  --->  %^  --->  Skipped

Getting the log format right took a lot of trial and error. The pitfalls I ran into are listed here so that others can avoid them.
(1) By default, log-format treats a space as the field separator. Fields that can contain spaces or other special characters, such as $request and $http_user_agent, must therefore be enclosed in "".
(2) The nginx log format is also space-separated, but there must be exactly one space between fields. I once had two spaces in one place, which led directly to wrong GoAccess results.
(3) Every field in the nginx log must have a counterpart in log-format. If a field is not needed, skip it with %^.
(4) Each literal - in the nginx log needs its own %^ to skip it. If it appears as "-", use "%^".
(5) If the nginx log contains a literal :, it must also appear in log-format. For example, $time_local contains a : between the date and the time, so the corresponding position in log-format is [%d:%t %^].
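
Pitfall (2) can be checked before running GoAccess. The following is only a rough sketch: the helper name count_double_space_lines and the two sample log lines are invented for illustration, and the sample file is a temporary one. A user agent can legitimately contain consecutive spaces, so a hit is a hint to inspect the line, not proof of a broken format.

```shell
# Hypothetical helper: print how many lines of a file contain two or more
# consecutive spaces. grep exits non-zero when the count is 0, hence "|| true".
count_double_space_lines() {
    grep -cE ' {2,}' "$1" || true
}

sample=$(mktemp)
# Two invented lines in the "main" format shown earlier; the second one has a
# stray double space between the status code (200) and the byte count (612).
cat > "$sample" <<'EOF'
example.com 1.2.3.4 - - [28/Jul/2017:16:04:15 +0800] "GET / HTTP/1.1" 200 612 "-" "curl/7.29.0" "-" 10.0.0.1:80 0.004 0.004
example.com 1.2.3.4 - - [28/Jul/2017:16:04:16 +0800] "GET / HTTP/1.1" 200  612 "-" "curl/7.29.0" "-" 10.0.0.1:80 0.004 0.004
EOF

bad_lines=$(count_double_space_lines "$sample")
echo "lines with consecutive spaces: $bad_lines"

# Show the offending lines together with their line numbers.
grep -nE ' {2,}' "$sample" || true
```

To check a real log, pass its path to the helper instead of the sample file.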

I hope these notes help others who use GoAccess.

goaccess -f log [-c][-r][-m][-h][-q][-d][-g][-a][-o csv|json][-e IP_ADDRESS][...] 

Format specifiers and the corresponding variables in the nginx access log

%x Date and time field matching both time-format and date-format; used when the log carries a single timestamp instead of separate date and time fields
%t Time field matching the time-format setting
%d Date field matching the date-format setting
%h Client IP, $remote_addr
%r Full request line, $request
%m Request method, the POST or GET part of $request
%U Requested URL path (including any query string), the URL part of $request
%H Request protocol, the HTTP/1.1 part of $request
%s Status code the server returns to the client, $status
%b Size of the body returned to the client, $body_bytes_sent
%R Referer, $http_referer
%u User agent, $http_user_agent
%D Time taken to serve the request, in microseconds
%T Time taken to serve the request, in seconds, $request_time (nginx records $request_time in seconds with millisecond resolution, so %T is the specifier that matches it)
%L Time taken to serve the request, in milliseconds
%^ Ignore this field

These are the main format specifiers given in the official documentation. For the original, see:

http://www.goaccess.io/man

The following is my customized nginx log format:

log_format  main_zdy  '$request_time - IP:$remote_addr - RealIP:$http_x_forwarded_for - [$time_local] $request - $status - $http_user_agent - $host - from:$http_referer';
Sample log lines:
0.000 - IP:3.3.3.3 - RealIP:1.1.1.1, 2.2.2.2 - [28/Jul/2017:16:04:15 +0800] POST /site/index.html HTTP/1.1 - 200 - Apache-HttpClient/UNAVAILABLE (java 1.4) - www.111111111.com - from:http://www.111111111.com
0.216 - IP:4.4.4.4 - RealIP:5.5.5.5, 6.6.6.6 - [28/Jul/2017:15:53:04 +0800] GET /client/serverlist?jsonpCallback=jQuery18206177038959697163_1501228347875&gid=163&wid=196&_=1501228353156 HTTP/1.1 - 200 - Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727) - www.111111111.com - from:http://www.1111111111111.com/

GoAccess format:
log-format %T %^ IP:%^ %^ RealIP:~h{," } %^ [%d:%t %^] %m %U %H %^ %s %^ %u %^ %^ %^ from:%R
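
Because $request is not wrapped in quotes in this nginx format, the rule uses %m %U %H rather than %r. As a rough illustration of what those three specifiers correspond to, the shell sketch below splits a request line by hand. It is not how GoAccess parses internally; the request is taken from the second sample log line with the query string shortened, and the variable names are chosen for this example only.

```shell
# Request line modelled on the second sample log line (query string shortened).
request='GET /client/serverlist?gid=163&wid=196 HTTP/1.1'

# Split on spaces into the three parts that %m, %U and %H stand for.
read -r method url protocol <<EOF
$request
EOF

# %U may already include the query string; %q is the query string on its own.
path=${url%%\?*}                 # everything before the first "?"
case $url in
    *\?*) query=${url#*\?} ;;    # everything after the first "?"
    *)    query= ;;
esac

printf '%%m=%s\n%%U=%s\n%%H=%s\n' "$method" "$url" "$protocol"
printf 'path=%s\nquery=%s\n' "$path" "$query"
```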

The man page describes each specifier as follows:
%x A date and time field matching the time-format and date-format variables. This is used when a timestamp is given instead of the date and time being in two separate variables.
%t Time field matching the time-format variable.
%d Date field matching the date-format variable.
%v The server name according to the canonical name setting (server block or virtual host).
%e The user ID of the person requesting the document, as determined by HTTP authentication.
%h Host (the client IP address, IPv4 or IPv6).
%r The request line from the client. It needs specific delimiters around the request (single quotes, double quotes, etc.) to be parsable. Otherwise, use a combination of special format specifiers such as %m %U %q and %H to parse the individual fields. Note: use either %r to get the full request or %m %U %q and %H to form the request, not both at the same time.
%m The request method.
%U The URL path requested. Note: if the query string is already part of %U, there is no need to use %q. However, if the URL path does not include any query string, you can use %q and the query string will be appended to the request.
%q The query string.
%H The request protocol.
%s The status code that the server sends back to the client.
%b The size of the object the server returns to the client.
%R The "Referer" HTTP request header.
%u The user-agent HTTP request header.
%D The time taken to serve the request, in microseconds.
%T The time taken to serve the request, in seconds with millisecond resolution, $request_time
%L The time taken to serve the request, in milliseconds, as a decimal number.
%^ Ignore this field.
%~ Move forward through the log string until a non-space (!isspace) character is found.
~h Host (the client IP address, IPv4 or IPv6) in an X-Forwarded-For (XFF) field.

For XFF, GoAccess uses a special specifier which consists of a tilde before the host specifier, followed by the character(s) that delimit the XFF field, which are enclosed by curly braces (i.e., ~h{,"}).
For example, ~h{," } is used in order to parse "11.25.11.53, 17.68.33.17" field which is delimited by a double quote, a comma, and a space.
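
To make the delimiter set in ~h{," } concrete, here is a small shell sketch that breaks the sample field apart on the same three characters (comma, double quote, space). It only mimics the idea of the delimiters; it makes no claim about how GoAccess processes the field internally, and the variable names are chosen for this example.

```shell
# Sample XFF field from the text above, including its surrounding double quotes.
xff='"11.25.11.53, 17.68.33.17"'

# Turn each delimiter from the {," } set into a newline, then drop empty lines.
hosts=$(printf '%s\n' "$xff" | tr '," ' '\n\n\n' | grep .)

count=$(printf '%s\n' "$hosts" | grep -c .)
first=$(printf '%s\n' "$hosts" | head -n 1)

echo "addresses found: $count"
printf '%s\n' "$hosts"
```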

goaccess  -f /www/logs/nginx.log  -a > /data/wwwroot/web/test/report1.html


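-f Specify the nginx log file to analyze
-p Specify a custom configuration file (where log-format is set)
-o Write the report to the specified file, for example an HTML file
--real-time-html Enable real-time HTML output, so the report refreshes as new log lines arrive
--ws-url URL the report page uses to reach GoAccess's WebSocket server (for example, your domain name)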
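
As a sketch of how -f, -p and -o fit together, the format settings for the "main" nginx format can be kept in their own file and passed with -p. The temporary config file is an assumption made for this example, and the log and report paths are the ones used in the first test command earlier; adjust them to your setup. The goaccess call is guarded so the snippet does nothing harmful on a machine without GoAccess or without that log.

```shell
# Keep the format settings in their own file (a temporary file in this sketch).
conf=$(mktemp)
cat > "$conf" <<'EOF'
time-format %H:%M:%S
date-format %d/%b/%Y
log-format %^ %h %^ %^ [%d:%t %^] "%r" %s %b "%R" "%u" "%^" %^ %T %^
EOF

log=/usr/local/nginx/logs/access.log   # log path from the earlier test command
report=/root/test/report.html          # report path from the earlier test command

# Run only when goaccess is installed and the log is readable.
if command -v goaccess >/dev/null 2>&1 && [ -r "$log" ]; then
    goaccess -f "$log" -p "$conf" -a -o "$report"
else
    echo "goaccess or $log not available; nothing generated"
fi
```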

Generate an HTML report

Serve the generated file with a web server and open it in a browser.

goaccess -f /root/www.7477.com-access1000.log -a > /data/wwwroot/web/zabbix/1111/reporta1.html

goaccess -f /root/test.log -a > /data/wwwroot/web/zabbix/1111/reporta6.html

goaccess -f /root/www.7477.com-access181.log -a > /data/wwwroot/web/zabbix/1111/report.html
goaccess -f /root/www.7477.com-access1000.log -a > /data/wwwroot/web/zabbix/1111/report1.html

goaccess -f /root/www.funet8.com-access.log -a > /data/wwwroot/web/zabbix/1111/funet1.html

goaccess -f /root/1000.log -a > /data/wwwroot/web/test/report1.html

Keywords: Linux Operation & Maintenance Nginx

Added by Nomaad on Sun, 20 Feb 2022 06:37:29 +0200