Brief introduction
Introduction to CAT
- CAT is a real-time application monitoring platform written in Java. It provides comprehensive real-time monitoring and alerting services for Meituan-Dianping.
- As a basic server-side component, CAT offers clients for several languages (Java, C/C++, Node.js, Python, Go, and others). It is deeply integrated into Meituan-Dianping's infrastructure middleware (MVC framework, RPC framework, database framework, cache framework, message queue, configuration system, etc.) and provides rich performance metrics, health status, and real-time alerting for every business line.
- CAT's biggest advantage is that it is a real-time system. Most CAT reports are aggregated at minute granularity, but the delay from data generation to the end of server-side processing is only a few seconds: for example, at 48 min 40 s you can already see data up to about 48 min 38 s. The second advantage is that monitoring data is collected in full and pre-aggregated on the client, while trace (link) data is sampled.
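To make the "full statistics at minute granularity" idea concrete, here is a minimal sketch of how a client could pre-aggregate every event into one-minute buckets. It is not CAT's actual implementation; the class and method names (`MinuteAggregator`, `record`, `countFor`) are hypothetical and exist only for this illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: this is not CAT's real aggregation code.
class MinuteAggregator {
    private static final long MINUTE_MS = 60_000L;

    // Start of minute (epoch millis) -> number of events recorded in that minute.
    private final Map<Long, Long> counts = new TreeMap<>();

    // Truncate a non-negative timestamp to the start of its minute.
    static long minuteOf(long epochMillis) {
        return epochMillis - (epochMillis % MINUTE_MS);
    }

    // Every event is counted (full statistics, no sampling).
    void record(long epochMillis) {
        counts.merge(minuteOf(epochMillis), 1L, Long::sum);
    }

    // Number of events in the minute that contains the given timestamp.
    long countFor(long epochMillis) {
        return counts.getOrDefault(minuteOf(epochMillis), 0L);
    }

    public static void main(String[] args) {
        MinuteAggregator agg = new MinuteAggregator();
        long minute48 = 48 * MINUTE_MS;
        agg.record(minute48 + 38_000);             // 48 min 38 s
        agg.record(minute48 + 40_000);             // 48 min 40 s
        agg.record(minute48 + MINUTE_MS + 1_000);  // 49 min 01 s
        System.out.println("events in minute 48: " + agg.countFor(minute48));
    }
}
```

The two events at 48:38 and 48:40 land in the same bucket, which is why a report can have minute granularity even though the data itself arrives within seconds.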
Github address: https://github.com/dianping/cat/wiki
CAT product value
- Reduce fault discovery time
- Reduce the cost of fault location
- Assist application optimization
CAT advantages
- Real time processing: the value of information will decrease sharply over time, especially in the process of accident handling
- Full data: metric data is collected in full, which allows in-depth analysis of fault cases
- High availability: fault recovery and troubleshooting depend on a highly available monitoring system
- Fault tolerance: a failure of the monitoring system does not affect normal business operation and is transparent to the business
- High throughput: the collection of massive monitoring data requires high throughput
- Scalability: supports distributed, cross-IDC deployment and horizontal scaling of the monitoring system
Compared with SkyWalking, which is oriented toward APM and distributed tracing, CAT feels more like APM plus metrics.
Multi-language clients: Java, C/C++, Node.js, Python, Go
It supports four message models: Transaction, Event, Heartbeat and Metric.
Server deployment
Preparation stage
CAT installation environment
- Linux 2.6 or later (epoll requires kernel 2.6 or later). Use Linux for production server deployment; Mac and Windows can be used as development environments. Meituan-Dianping uses CentOS 6.5 internally
- Java 6, 7, or 8. JDK 7 is recommended for the server; the client supports JDK 6, 7, and 8
- Maven 3 or later
- MySQL 5.6 or 5.7. Later versions are not recommended because their compatibility is unknown
- Tomcat is the recommended J2EE container, version 7.* or 8.0
- A Hadoop environment is optional. Smaller companies are generally advised to use disk mode directly: provision the CAT servers with disks of 500 GB or more, mounted at /data/
Obviously, when we directly use the disk mode, the deployment becomes very easy, which is one of the reasons why I like CAT.
Overview of the steps to install the CAT cluster
- Initialize the MySQL database. One CAT cluster needs one database; the schema script is script/CatApplication.sql
- Several CAT servers are prepared to build clusters. Suppose there are three servers with IP addresses of 10.1.1.1, 10.1.1.2 and 10.1.1.3. The following deployment methods will take these IP addresses as examples
- Initialize the /data/ directory and create the configuration files /data/appdatas/cat/*.xml, which are described in detail below
- Build the package, rename it to cat.war, put it in the webapps root directory of the Tomcat container, and start Tomcat
- Modify the server configuration, routing configuration, and restart tomcat
Specific practice
1. Initialize the database
Create a database named cat in MySQL, then execute the SQL statements in script/CatApplication.sql. Address: https://github.com/dianping/cat/blob/master/script/CatApplication.sql
I have also posted the script here.
Note 1: an independent CAT cluster needs only one database (I have met people who installed a database on every CAT server node)
Note 2: use utf8mb4 as the database character set; otherwise Chinese text may be garbled
Note 3: change the MySQL system parameter max_allowed_packet from its default value of 1048576 (1M) to 1000M, then restart MySQL for the change to take effect
Alternatively, run set global max_allowed_packet = 1000 * 1024 * 1024, but this setting is lost when MySQL restarts
CREATE TABLE `dailyreport` ( `id` int(11) NOT NULL AUTO_INCREMENT, `name` varchar(20) NOT NULL COMMENT 'Report name, transaction, problem...', `ip` varchar(50) NOT NULL COMMENT 'Which machine does the report come from cat-consumer machine', `domain` varchar(50) NOT NULL COMMENT 'Report processing Domain information', `period` datetime NOT NULL COMMENT 'Report period', `type` tinyint(4) NOT NULL COMMENT 'Report data format, 1/xml, 2/json, Default 1', `creation_date` datetime NOT NULL COMMENT 'Report creation time', PRIMARY KEY (`id`), UNIQUE KEY `period` (`period`,`domain`,`name`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Day report'; CREATE TABLE `weeklyreport` ( `id` int(11) NOT NULL AUTO_INCREMENT, `name` varchar(20) NOT NULL COMMENT 'Report name, transaction, problem...', `ip` varchar(50) NOT NULL COMMENT 'Which machine does the report come from cat-consumer machine', `domain` varchar(50) NOT NULL COMMENT 'Report processing Domain information', `period` datetime NOT NULL COMMENT 'Report period', `type` tinyint(4) NOT NULL COMMENT 'Report data format, 1/xml, 2/json, Default 1', `creation_date` datetime NOT NULL COMMENT 'Report creation time', PRIMARY KEY (`id`), UNIQUE KEY `period` (`period`,`domain`,`name`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Weekly report'; CREATE TABLE `monthreport` ( `id` int(11) NOT NULL AUTO_INCREMENT, `name` varchar(20) NOT NULL COMMENT 'Report name, transaction, problem...', `ip` varchar(50) NOT NULL COMMENT 'Which machine does the report come from cat-consumer machine', `domain` varchar(50) NOT NULL COMMENT 'Report processing Domain information', `period` datetime NOT NULL COMMENT 'Report period', `type` tinyint(4) NOT NULL COMMENT 'Report data format, 1/xml, 2/json, Default 1', `creation_date` datetime NOT NULL COMMENT 'Report creation time', PRIMARY KEY (`id`), UNIQUE KEY `period` (`period`,`domain`,`name`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Monthly report'; CREATE TABLE `hostinfo` ( `id` int(11) NOT NULL 
AUTO_INCREMENT, `ip` varchar(50) NOT NULL COMMENT 'Deploy machine IP', `domain` varchar(200) NOT NULL COMMENT 'Project name corresponding to the deployment machine', `hostname` varchar(200) DEFAULT NULL COMMENT 'Machine domain name', `creation_date` datetime NOT NULL, `last_modified_date` datetime NOT NULL, PRIMARY KEY (`id`), UNIQUE KEY `ip_index` (`ip`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='IP Correspondence with project name'; CREATE TABLE `hourlyreport` ( `id` int(11) NOT NULL AUTO_INCREMENT, `type` tinyint(4) NOT NULL COMMENT 'Report type, 1/xml, 9/binary Default 1', `name` varchar(20) NOT NULL COMMENT 'Report name', `ip` varchar(50) DEFAULT NULL COMMENT 'Which machine does the report come from', `domain` varchar(50) NOT NULL COMMENT 'Report item', `period` datetime NOT NULL COMMENT 'Report period', `creation_date` datetime NOT NULL COMMENT 'Report creation time', PRIMARY KEY (`id`), KEY `IX_Domain_Name_Period` (`domain`,`name`,`period`), KEY `IX_Name_Period` (`name`,`period`), KEY `IX_Period` (`period`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED COMMENT='It is used to store real-time report information and processing results'; CREATE TABLE `hourly_report_content` ( `report_id` int(11) NOT NULL COMMENT 'report form ID', `content` longblob NOT NULL COMMENT 'Binary report content', `period` datetime NOT NULL COMMENT 'Report period', `creation_date` datetime NOT NULL COMMENT 'Creation time', PRIMARY KEY (`report_id`), KEY `IX_Period` (`period`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED COMMENT='Binary content of hourly Report'; CREATE TABLE `daily_report_content` ( `report_id` int(11) NOT NULL COMMENT 'report form ID', `content` longblob NOT NULL COMMENT 'Binary report content', `period` datetime COMMENT 'Report period', `creation_date` datetime NOT NULL COMMENT 'Creation time', PRIMARY KEY (`report_id`), KEY `IX_Period` (`period`) )ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED COMMENT='Binary content of daily 
report'; CREATE TABLE `weekly_report_content` ( `report_id` int(11) NOT NULL COMMENT 'report form ID', `content` longblob NOT NULL COMMENT 'Binary report content', `period` datetime COMMENT 'Report period', `creation_date` datetime NOT NULL COMMENT 'Creation time', PRIMARY KEY (`report_id`), KEY `IX_Period` (`period`) )ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED COMMENT='Binary content of weekly report'; CREATE TABLE `monthly_report_content` ( `report_id` int(11) NOT NULL COMMENT 'report form ID', `content` longblob NOT NULL COMMENT 'Binary report content', `period` datetime COMMENT 'Report period', `creation_date` datetime NOT NULL COMMENT 'Creation time', PRIMARY KEY (`report_id`), KEY `IX_Period` (`period`) )ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED COMMENT='Binary content of monthly report'; CREATE TABLE `businessReport` ( `id` int(11) NOT NULL AUTO_INCREMENT, `type` tinyint(4) NOT NULL COMMENT 'Report type report data format, 1/Binary, 2/xml , 3/json', `name` varchar(20) NOT NULL COMMENT 'Report name', `ip` varchar(50) NOT NULL COMMENT 'Which machine does the report come from', `productLine` varchar(50) NOT NULL COMMENT 'Which product group does the indicator come from', `period` datetime NOT NULL COMMENT 'Report period', `content` longblob COMMENT 'It is used to store the specific contents of the report', `creation_date` datetime NOT NULL COMMENT 'Report creation time', PRIMARY KEY (`id`), KEY `IX_Period_productLine_name` (`period`,`productLine`,`name`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED COMMENT='It is used to store business monitoring real-time report information and processing results'; CREATE TABLE `task` ( `id` int(11) NOT NULL AUTO_INCREMENT, `producer` varchar(20) NOT NULL COMMENT 'Task Creator ip', `consumer` varchar(20) NULL COMMENT 'Task executor ip', `failure_count` tinyint(4) NOT NULL COMMENT 'Number of task failures', `report_name` varchar(20) NOT NULL COMMENT 'Report name, transaction, 
problem...', `report_domain` varchar(50) NOT NULL COMMENT 'Report processing Domain information', `report_period` datetime NOT NULL COMMENT 'Report time', `status` tinyint(4) NOT NULL COMMENT 'Execution status: 1/todo, 2/doing, 3/done 4/failed', `task_type` tinyint(4) NOT NULL DEFAULT '1' COMMENT '0 Represents an hour task, and 1 represents a day task', `creation_date` datetime NOT NULL COMMENT 'Task creation time', `start_date` datetime NULL COMMENT 'start time, Start time of this execution', `end_date` datetime NULL COMMENT 'End time, End time of this execution', PRIMARY KEY (`id`), UNIQUE KEY `task_period_domain_name_type` (`report_period`,`report_domain`,`report_name`,`task_type`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Background task'; CREATE TABLE `project` ( `id` int(11) NOT NULL AUTO_INCREMENT, `domain` varchar(200) NOT NULL COMMENT 'entry name', `cmdb_domain` varchar(200) DEFAULT NULL COMMENT 'cmdb entry name', `level` int(5) DEFAULT NULL COMMENT 'Project level', `bu` varchar(50) DEFAULT NULL COMMENT 'CMDB Division', `cmdb_productline` varchar(50) DEFAULT NULL COMMENT 'CMDB Product line', `owner` varchar(50) DEFAULT NULL COMMENT 'Project Leader', `email` longtext DEFAULT NULL COMMENT 'Project group mail', `phone` longtext DEFAULT NULL COMMENT 'contact number', `creation_date` datetime DEFAULT NULL COMMENT 'Creation time', `modify_date` datetime DEFAULT NULL COMMENT 'Modification time', PRIMARY KEY (`id`), UNIQUE KEY `domain` (`domain`) )ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Basic information of the project'; CREATE TABLE `topologyGraph` ( `id` int(11) NOT NULL AUTO_INCREMENT, `ip` varchar(50) NOT NULL COMMENT 'Which machine does the report come from cat-client machine ip', `period` datetime NOT NULL COMMENT 'Report period,Accurate to minutes', `type` tinyint(4) NOT NULL COMMENT 'Report data format, 1/xml, 2/json, 3/binary', `content` longblob COMMENT 'It is used to store the specific contents of the report', `creation_date` datetime NOT NULL 
COMMENT 'Report creation time', PRIMARY KEY (`id`), KEY `period` (`period`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Topology curves for storing history'; CREATE TABLE `config` ( `id` int(11) NOT NULL AUTO_INCREMENT, `name` varchar(50) NOT NULL COMMENT 'Configuration name', `content` longtext COMMENT 'Details of configuration', `creation_date` datetime NOT NULL COMMENT 'Configure creation time', `modify_date` datetime NOT NULL COMMENT 'Configuration modification time', PRIMARY KEY (`id`), UNIQUE KEY `name` (`name`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Global configuration information for the storage system'; CREATE TABLE `baseline` ( `id` int(11) NOT NULL AUTO_INCREMENT, `report_name` varchar(100) DEFAULT NULL, `index_key` varchar(100) DEFAULT NULL, `report_period` datetime DEFAULT NULL, `data` blob, `creation_date` datetime DEFAULT NULL, PRIMARY KEY (`id`), KEY `period_name_key` (`report_period`,`report_name`,`index_key`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8; CREATE TABLE `alteration` ( `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'Self growth ID', `type` varchar(64) NOT NULL COMMENT 'classification', `title` varchar(128) NOT NULL COMMENT 'Change title', `domain` varchar(128) NOT NULL COMMENT 'Change item', `hostname` varchar(128) NOT NULL COMMENT 'Change machine name', `ip` varchar(128) DEFAULT NULL COMMENT 'Change machine IP', `date` datetime NOT NULL COMMENT 'Change time', `user` varchar(45) NOT NULL COMMENT 'Change user', `alt_group` varchar(45) DEFAULT NULL COMMENT 'Change group', `content` longtext NOT NULL COMMENT 'Change content', `url` varchar(200) DEFAULT NULL COMMENT 'Change link', `status` tinyint(4) DEFAULT '0' COMMENT 'Change status', `creation_date` datetime NOT NULL COMMENT 'Database creation time', PRIMARY KEY (`id`), KEY `ind_date_domain_host` (`date`,`domain`,`hostname`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Change form'; CREATE TABLE `alert` ( `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'Self growth ID', `domain` 
varchar(128) NOT NULL COMMENT 'Alarm items', `alert_time` datetime NOT NULL COMMENT 'Alarm time', `category` varchar(64) NOT NULL COMMENT 'Alarm classification:network/business/system/exception -alert', `type` varchar(64) NOT NULL COMMENT 'Alarm Type :error/warning', `content` longtext NOT NULL COMMENT 'Alarm content', `metric` varchar(128) NOT NULL COMMENT 'Alarm index', `creation_date` datetime NOT NULL COMMENT 'Data insertion time', PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Store alarm information'; CREATE TABLE `alert_summary` ( `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'Self growth ID', `domain` varchar(128) NOT NULL COMMENT 'Alarm items', `alert_time` datetime NOT NULL COMMENT 'Alarm time', `content` longtext NOT NULL COMMENT 'Unified alarm content', `creation_date` datetime NOT NULL COMMENT 'Data insertion time', PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Unified alarm information'; CREATE TABLE `operation` ( `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'Self growth ID', `user` varchar(128) NOT NULL COMMENT 'user name', `module` varchar(128) NOT NULL COMMENT 'modular', `operation` varchar(128) NOT NULL COMMENT 'operation', `time` datetime NOT NULL COMMENT 'Modification time', `content` longtext NOT NULL COMMENT 'Modification content', `creation_date` datetime NOT NULL COMMENT 'Data insertion time', PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='User action log'; CREATE TABLE `overload` ( `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'Self growth ID', `report_id` int(11) NOT NULL COMMENT 'report id', `report_type` tinyint(4) NOT NULL COMMENT 'Report type 1:hourly 2:daily 3:weekly 4:monthly', `report_size` double NOT NULL COMMENT 'Report size unit MB', `period` datetime NOT NULL COMMENT 'Report time', `creation_date` datetime NOT NULL COMMENT 'Creation time', PRIMARY KEY (`id`), KEY `period` (`period`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Excess capacity meter'; CREATE TABLE 
`config_modification` ( `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'Self growth ID', `user_name` varchar(64) NOT NULL COMMENT 'user name', `account_name` varchar(64) NOT NULL COMMENT 'Account name', `action_name` varchar(64) NOT NULL COMMENT 'action name', `argument` longtext COMMENT 'Parameter content', `date` datetime NOT NULL COMMENT 'Modification time', `creation_date` datetime NOT NULL COMMENT 'Creation time', PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Configuration modification record table'; CREATE TABLE `user_define_rule` ( `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'Self growth ID', `content` text NOT NULL COMMENT 'User defined rules', `creation_date` datetime NOT NULL COMMENT 'Creation time', PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='User defined rule table'; CREATE TABLE `business_config` ( `id` int(11) NOT NULL AUTO_INCREMENT, `name` varchar(20) NOT NULL DEFAULT '' COMMENT 'Configuration name', `domain` varchar(50) NOT NULL DEFAULT '' COMMENT 'project', `content` longtext COMMENT 'Configuration content', `updatetime` datetime NOT NULL, PRIMARY KEY (`id`), KEY `updatetime` (`updatetime`), KEY `name_domain` (`name`,`domain`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8; CREATE TABLE `metric_screen` ( `id` int(11) NOT NULL AUTO_INCREMENT, `name` varchar(50) NOT NULL COMMENT 'Configuration name', `graph_name` varchar(50) NOT NULL DEFAULT '' COMMENT 'Graph name', `view` varchar(50) NOT NULL DEFAULT '' COMMENT 'visual angle', `endPoints` longtext NOT NULL, `measurements` longtext NOT NULL COMMENT 'Configured indicators', `content` longtext NOT NULL COMMENT 'Details of configuration', `creation_date` datetime NOT NULL COMMENT 'Configure creation time', `updatetime` datetime NOT NULL COMMENT 'Configuration modification time', PRIMARY KEY (`id`), UNIQUE KEY `name_graph` (`name`,`graph_name`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='System monitored screen to configure'; CREATE TABLE `metric_graph` ( `id` int(11) NOT 
NULL AUTO_INCREMENT, `graph_id` int(11) NOT NULL COMMENT 'Market ID', `name` varchar(50) NOT NULL COMMENT 'to configure ID', `content` longtext COMMENT 'Details of configuration', `creation_date` datetime NOT NULL COMMENT 'Configure creation time', `updatetime` datetime NOT NULL COMMENT 'Configuration modification time', PRIMARY KEY (`id`), UNIQUE `name` (`name`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='System monitored graph to configure'; CREATE TABLE `server_alarm_rule` ( `id` int(11) NOT NULL AUTO_INCREMENT, `category` varchar(50) NOT NULL COMMENT 'Monitoring classification', `endPoint` varchar(200) NOT NULL COMMENT 'Monitoring object ID', `measurement` varchar(200) NOT NULL COMMENT 'Monitoring indicators', `tags` varchar(200) NOT NULL DEFAULT '' COMMENT 'Monitoring indicator label', `content` longtext NOT NULL COMMENT 'Details of configuration', `type` varchar(20) NOT NULL DEFAULT '' COMMENT 'Data aggregation method', `creator` varchar(100) DEFAULT '' COMMENT 'Creator', `creation_date` datetime NOT NULL COMMENT 'Configure creation time', `updatetime` datetime NOT NULL COMMENT 'Configuration modification time', PRIMARY KEY (`id`), KEY `updatetime` (`updatetime`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Configuration of system alarm';
2. Create directories and configuration files
Create the directory and set its permissions
    mkdir /data
    chmod -R 777 /data/
This directory will store some necessary CAT configuration files and run-time data storage directories.
Create the client configuration file on the server
    mkdir -p /data/appdatas/cat/
    vim /data/appdatas/cat/client.xml
Create the /data/appdatas/cat folder on the disk where Tomcat is located, then create the /data/appdatas/cat/client.xml file with the following contents:
    <?xml version="1.0" encoding="utf-8"?>
    <config mode="client">
        <servers>
            <server ip="192.168.1.111" port="2280" http-port="8080"/>
        </servers>
    </config>
ip is the IP address of the CAT server. 2280 is the default port on which the CAT server receives data and cannot be changed. http-port is the port Tomcat listens on; the default is 8080, and keeping the default is recommended
If you change http-port, you must also change Tomcat's port to match
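As a quick sanity check of a client.xml file, the sketch below reads the `ip`, `port`, and `http-port` attributes of each `<server>` element with the JDK's built-in DOM parser. This is only an illustration of the file's structure under the attribute names shown above; it is not how CAT itself loads the file, and `ClientConfigReader` is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for inspecting a client.xml; not part of CAT.
class ClientConfigReader {

    // Returns one "ip:port (http http-port)" string per <server> element.
    static String[] describeServers(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList servers = doc.getElementsByTagName("server");
            String[] result = new String[servers.getLength()];
            for (int i = 0; i < servers.getLength(); i++) {
                Element server = (Element) servers.item(i);
                result[i] = server.getAttribute("ip") + ":" + server.getAttribute("port")
                        + " (http " + server.getAttribute("http-port") + ")";
            }
            return result;
        } catch (Exception e) {
            // Malformed XML or parser setup problems end up here.
            throw new IllegalArgumentException("Could not parse client.xml content", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"utf-8\"?>"
                + "<config mode=\"client\"><servers>"
                + "<server ip=\"192.168.1.111\" port=\"2280\" http-port=\"8080\"/>"
                + "</servers></config>";
        for (String line : describeServers(xml)) {
            System.out.println(line);
        }
    }
}
```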
Create the database configuration file
Configure /data/appdatas/cat/datasources.xml ($CAT_HOME/datasources.xml)
    vim /data/appdatas/cat/datasources.xml
Write the following contents:
    <?xml version="1.0" encoding="utf-8"?>
    <data-sources>
        <data-source id="cat">
            <maximum-pool-size>3</maximum-pool-size>
            <connection-timeout>1s</connection-timeout>
            <idle-timeout>10m</idle-timeout>
            <statement-cache-size>1000</statement-cache-size>
            <properties>
                <driver>com.mysql.jdbc.Driver</driver>
                <url><![CDATA[jdbc:mysql://127.0.0.1:3306/cat]]></url> <!-- Replace with the real database host and port -->
                <user>root</user> <!-- Replace with the real database user name -->
                <password>root</password> <!-- Replace with the real database password -->
                <connectionProperties><![CDATA[useUnicode=true&characterEncoding=UTF-8&autoReconnect=true&socketTimeout=120000]]></connectionProperties>
            </properties>
        </data-source>
    </data-sources>
Replace the database IP, port, user name, and password with your own values.
3. Deploy the CAT war
There are two ways to get the war package.
- Build from source: in the cat source directory, run mvn clean install -DskipTests
- Download it using the link below:
http://unidal.org/nexus/service/local/repositories/releases/content/com/dianping/cat/cat-home/3.0.0/cat-home-3.0.0.war
Deploy the war built from the official CAT master branch, renamed to cat.war. Note that this war is built with JDK 8, so use JDK 8 on the server
Copy it to Tomcat's webapps directory and start Tomcat.
To avoid garbled Chinese characters, modify server.xml in Tomcat's conf directory:
    <!-- Add URIEncoding="utf-8" to the connector -->
    <Connector port="8080" protocol="HTTP/1.1"
               URIEncoding="utf-8"
               connectionTimeout="20000"
               redirectPort="8443" />
4. Modify client routing configuration
- Open the console at http://ip:port/cat/s/config?op=routerConfigUpdate
- Default user name: admin; default password: admin. CAT itself has no login or permission verification; customize it as needed
- An example of the updated configuration is as follows:
    <?xml version="1.0" encoding="utf-8"?>
    <router-config backup-server="192.168.1.111" backup-server-port="2280">
        <default-server id="192.168.1.111" weight="1.0" port="2280" enable="true"/>
        <network-policy id="default" title="default" block="false" server-group="default_group">
        </network-policy>
        <server-group id="default_group" title="default-group">
            <group-server id="192.168.1.111"/>
        </server-group>
        <domain id="cat">
            <group id="default">
                <server id="192.168.1.111" port="2280" weight="1.0"/>
            </group>
        </domain>
    </router-config>
Configuration Description:
- backup-server attribute: set it to the external IP address of the current server; the port is fixed at 2280
- default-server attribute: defines the routable server addresses; more than one can be set. The id attribute of default-server is the IP address of a routable cat-home service, and the port is fixed at 2280. To disable a routing address, set enable to false
- network-policy: multiple network segments can be configured here, each meaning that clients in that segment use the CAT nodes of the given server-group. This is mainly used when CAT is deployed across multiple data centers: split CAT into several sub-clusters, each handling different clients, to avoid traffic across dedicated inter-datacenter lines
- domain id=cat: used for custom routing. When some projects produce a large volume of data, or in other special scenarios, you can isolate the monitoring requests of those domains
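The relationship between default-server entries and per-domain routing can be sketched as follows. This is a simplified model written for this article, not CAT's routing code: it assumes a domain-specific server list takes precedence over the default servers and that default servers with enable="false" are skipped. It ignores network-policy, weights, and backup-server entirely, and `RouterSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical model of the routing configuration; not CAT's real logic.
class RouterSketch {
    static final class Server {
        final String id;
        final int port;
        final boolean enabled;

        Server(String id, int port, boolean enabled) {
            this.id = id;
            this.port = port;
            this.enabled = enabled;
        }
    }

    private final List<Server> defaultServers = new ArrayList<>();
    private final Map<String, List<Server>> domainServers = new HashMap<>();

    // Corresponds to a <default-server id=... port=... enable=.../> entry.
    void addDefault(String id, int port, boolean enabled) {
        defaultServers.add(new Server(id, port, enabled));
    }

    // Corresponds to a <server> entry inside <domain id=...>.
    void addForDomain(String domain, String id, int port) {
        domainServers.computeIfAbsent(domain, k -> new ArrayList<>())
                .add(new Server(id, port, true));
    }

    // Assumption: domain-specific servers win; otherwise use enabled default servers.
    List<String> route(String domain) {
        List<Server> candidates = domainServers.getOrDefault(domain, defaultServers);
        List<String> addresses = new ArrayList<>();
        for (Server s : candidates) {
            if (s.enabled) {
                addresses.add(s.id + ":" + s.port);
            }
        }
        return addresses;
    }

    public static void main(String[] args) {
        RouterSketch router = new RouterSketch();
        router.addDefault("10.1.1.1", 2280, true);
        router.addDefault("10.1.1.2", 2280, false); // enable="false"
        router.addForDomain("cat", "10.1.1.3", 2280);
        System.out.println("cat -> " + router.route("cat"));
        System.out.println("other -> " + router.route("some-other-app"));
    }
}
```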
5. Modify server configuration
Configuration link: http://{ip:port}/cat/s/config?op=serverConfigUpdate
Note: this only needs to be updated once. The configuration is saved in the mysql database.
    <?xml version="1.0" encoding="utf-8"?>
    <server-config>
        <server id="default">
            <properties>
                <property name="local-mode" value="false"/>
                <property name="job-machine" value="true"/>
                <property name="send-machine" value="false"/>
                <property name="alarm-machine" value="true"/>
                <property name="hdfs-enabled" value="false"/>
                <property name="remote-servers" value="192.168.1.111:8080"/>
            </properties>
            <storage local-base-dir="/data/appdatas/cat/bucket/" max-hdfs-storage-time="15" local-report-storage-time="2" local-logivew-storage-time="1" har-mode="true" upload-thread="5">
                <hdfs id="dump" max-size="128M" server-uri="hdfs://127.0.0.1/" base-dir="/user/cat/dump"/>
                <harfs id="dump" max-size="128M" server-uri="har://127.0.0.1/" base-dir="/user/cat/dump"/>
                <properties>
                    <property name="hadoop.security.authentication" value="false"/>
                    <property name="dfs.namenode.kerberos.principal" value="hadoop/dev80.hadoop@testserver.com"/>
                    <property name="dfs.cat.kerberos.principal" value="cat@testserver.com"/>
                    <property name="dfs.cat.keytab.file" value="/data/appdatas/cat/cat.keytab"/>
                    <property name="java.security.krb5.realm" value="value1"/>
                    <property name="java.security.krb5.kdc" value="value2"/>
                </properties>
            </storage>
            <consumer>
                <long-config default-url-threshold="1000" default-sql-threshold="100" default-service-threshold="50">
                    <domain name="cat" url-threshold="500" sql-threshold="500"/>
                    <domain name="OpenPlatformWeb" url-threshold="100" sql-threshold="500"/>
                </long-config>
            </consumer>
        </server>
        <server id="192.168.1.111">
            <properties>
                <property name="job-machine" value="true"/>
                <property name="send-machine" value="false"/>
                <property name="alarm-machine" value="true"/>
            </properties>
        </server>
    </server-config>
Configuration Description:
Server node: represents the configuration of one machine. If the id is default, it is the default configuration; if the id is an IP address, it is the configuration of that server
- local-mode: defines whether the service runs in local mode (development mode). In production, set it to false to start the remote listening mode. The default is false
- hdfs-machine: defines whether HDFS storage is enabled. The default is false
- job-machine: defines whether the current server is the report machine (only one server needs to run the tasks that generate summary and statistical reports). The default is false
- alarm-machine: defines whether the current server is the alarm machine (only one server needs to run the various alarm monitors). The default is false
- send-machine: defines whether alarms from the current server are actually sent. It was introduced so that a test environment could start the alarm thread without sending notifications, and it will gradually be removed. When alarm-machine is true, it is recommended to set this to true as well
Storage node: defines the data storage configuration
- local-report-storage-time: defines the retention time of local reports, in days
- local-logivew-storage-time: defines the retention time of local logs, in days
- local-base-dir: defines the local data storage directory
- hdfs: defines the HDFS configuration, which makes it easy to log in to the system directly
- server-uri: defines the HDFS service address
- console: defines the service console information
- remote-servers: defines the HTTP server list (the remote listener reads this value when it synchronizes server information)
- ldap: defines the LDAP configuration (this can be ignored)
- ldapUrl: defines the LDAP service address (this can be ignored)
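One way to read the "default" versus per-IP server nodes is as a lookup with fallback: a property set for a specific IP applies to that machine, and anything not set there comes from the default node. The sketch below models that reading. The fallback behavior is an assumption made for illustration rather than something confirmed from CAT's source, and `ServerPropertiesSketch` is a hypothetical name.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of <server id="default"> plus per-IP <server> overrides.
class ServerPropertiesSketch {
    // server id ("default" or an IP) -> property name -> value
    private final Map<String, Map<String, String>> byServerId = new HashMap<>();

    void set(String serverId, String name, String value) {
        byServerId.computeIfAbsent(serverId, k -> new HashMap<>()).put(name, value);
    }

    // Assumption: look in the server's own node first, then fall back to "default".
    String get(String serverIp, String name) {
        Map<String, String> own = byServerId.get(serverIp);
        if (own != null && own.containsKey(name)) {
            return own.get(name);
        }
        Map<String, String> defaults = byServerId.get("default");
        return defaults == null ? null : defaults.get(name);
    }

    public static void main(String[] args) {
        ServerPropertiesSketch config = new ServerPropertiesSketch();
        config.set("default", "job-machine", "false");
        config.set("default", "alarm-machine", "false");
        config.set("10.1.1.1", "job-machine", "true"); // only one report machine
        System.out.println("10.1.1.1 job-machine = " + config.get("10.1.1.1", "job-machine"));
        System.out.println("10.1.1.2 job-machine = " + config.get("10.1.1.2", "job-machine"));
    }
}
```

In a three-node cluster this matches the advice above: turn job-machine and alarm-machine on for exactly one IP and leave the default at false.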
Important: restart Tomcat.
6. Verification
Visit http://192.168.1.111:8080/cat/r and click "State"; you should see "CAT server is normal" and some basic CAT status information.
If you can see these transactions, the deployment succeeded.
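If the page does not load, a first step is to check that the two ports mentioned earlier (2280 for data and 8080 for Tomcat) accept TCP connections. The sketch below does only that, using plain JDK sockets; the host in `main` is the example IP from this article, and `PortCheck` is a hypothetical helper, not a CAT tool.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks TCP reachability only, not whether CAT is healthy.
class PortCheck {

    // Returns true if a TCP connection to host:port succeeds within the timeout.
    static boolean isOpen(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Pass your CAT server address as the first argument; the default is the example IP.
        String host = args.length > 0 ? args[0] : "192.168.1.111";
        System.out.println("2280 (data port) open: " + isOpen(host, 2280, 2000));
        System.out.println("8080 (Tomcat) open:    " + isOpen(host, 8080, 2000));
    }
}
```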
Client integration
1. Maven dependency
    <dependency>
        <groupId>com.dianping.cat</groupId>
        <artifactId>cat-client</artifactId>
        <version>3.0.0</version>
    </dependency>
The dependency above is not available in the Maven repository. Use this one instead:
    <dependency>
        <groupId>com.alidaodao.app</groupId>
        <artifactId>cat-client</artifactId>
        <version>3.0.0</version>
    </dependency>
2. Configure client.xml
Create the /data/appdatas/cat/ directory and create the client.xml file in it:
    <?xml version="1.0" encoding="utf-8"?>
    <config xmlns:xsi="http://www.w3.org/2001/XMLSchema" xsi:noNamespaceSchemaLocation="config.xsd">
        <servers>
            <server ip="127.0.0.1" port="2280" http-port="8080" />
        </servers>
    </config>
Note: replace 127.0.0.1 with the IP of CAT server.
3. Configure the project name
Create the following file in each project:
src/main/resources/META-INF/app.properties
Then add the following content to each file:
app.name=buy-buy-buy-checkout
Note: the project name can contain only English letters (a-z, A-Z), digits (0-9), underscores (_), and hyphens (-)
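The naming rule above is easy to check before deployment. The following sketch reads `app.name` from the text of an app.properties file and validates it against the allowed characters. `AppNameCheck` is a hypothetical helper written for this article; CAT's own validation, if any, may behave differently.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;
import java.util.regex.Pattern;

// Hypothetical pre-deployment check for the app.name rule described above.
class AppNameCheck {
    // Letters, digits, underscore, and hyphen only; at least one character.
    private static final Pattern VALID = Pattern.compile("[A-Za-z0-9_-]+");

    static boolean isValid(String name) {
        return name != null && VALID.matcher(name).matches();
    }

    // Extracts app.name from the text content of an app.properties file.
    static String readAppName(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new IllegalArgumentException("Could not read properties text", e);
        }
        return props.getProperty("app.name");
    }

    public static void main(String[] args) {
        String name = readAppName("app.name=buy-buy-buy-checkout\n");
        System.out.println(name + " valid: " + isValid(name));
        System.out.println("order.service valid: " + isValid("order.service"));
    }
}
```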
4. Test verification
Defining a monitoring API requires the designer to understand monitoring and performance analysis in depth. The scenarios for monitoring and performance analysis are as follows
- The execution time of a piece of code, for example URL execution time or SQL execution time
- The number of times a piece of code is executed, such as the number of exception records thrown by the program, or the number of times a piece of logic is executed
- Code that is executed periodically, such as regularly reporting core metrics like JVM memory and GC
- Key business monitoring indicators, such as monitoring the number of orders, transaction volume, payment success rate, etc
Based on the above domain model, CAT designs several core monitoring objects: Transaction, Event, Heartbeat and Metric
    import com.dianping.cat.Cat;
    import com.dianping.cat.message.Event;
    import com.dianping.cat.message.Transaction;

    // Inside a method of your own class:
    String serverIp = "";
    Transaction t = Cat.newTransaction("URL", "pageName");
    try {
        // Record an event
        Cat.logEvent("URL.Server", serverIp, Event.SUCCESS, "ip=" + serverIp + "&...");
        // Record a business metric that counts occurrences per unit of time
        Cat.logMetricForCount("order.count");
        // Record a duration business metric that is averaged per unit of time
        Cat.logMetricForDuration("order.avg", 5);
        yourBusiness(); // Your own business code
        t.setStatus(Transaction.SUCCESS); // Set success status
    } catch (Exception e) {
        t.setStatus(e); // Set error status
        Cat.logError(e); // Report the error
    } finally {
        t.complete(); // End the transaction
    }
If you run into problems, check the logs in the /data/applogs/cat directory.
Function introduction
CAT also has an alarm function, which I have yet to study.
Core functions
CAT provides the following reports:
- Transaction report: the running time and number of executions of a piece of code, such as URL, Cache, and SQL execution counts and response times
- Event report: the number of times a line of code runs, such as the number of exceptions
- Problem report: analyzes possible system problems based on Transaction/Event data, including slow requests
- Heartbeat report: status information inside the JVM, such as memory and threads
- Business report: business monitoring reports, such as order metrics, payments, and other business metrics
Transaction
Monitors the running status of a piece of code: number of executions, QPS, number of errors, failure rate, and response time statistics (average response time, TP percentile values), etc.
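To show what these figures mean, here is a minimal sketch that computes them from a list of recorded durations. It uses the nearest-rank method for percentiles, which is an assumption on my part: the algorithm CAT actually uses for TP values is not described in this article. `TransactionStats` is a hypothetical class.

```java
import java.util.Arrays;

// Hypothetical illustration of the statistics a Transaction report shows.
class TransactionStats {
    private final long[] sortedDurationsMs;
    private final int failures;

    TransactionStats(long[] durationsMs, int failures) {
        this.sortedDurationsMs = durationsMs.clone();
        Arrays.sort(this.sortedDurationsMs);
        this.failures = failures;
    }

    int count() {
        return sortedDurationsMs.length;
    }

    double failureRate() {
        return count() == 0 ? 0.0 : (double) failures / count();
    }

    double averageMs() {
        return Arrays.stream(sortedDurationsMs).average().orElse(0.0);
    }

    // Executions per second over the given observation window.
    double qps(long windowSeconds) {
        return (double) count() / windowSeconds;
    }

    // Nearest-rank percentile: the smallest value with at least p% of samples at or below it.
    long percentileMs(double p) {
        if (count() == 0) {
            return 0L;
        }
        int rank = (int) Math.ceil(p / 100.0 * count());
        return sortedDurationsMs[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        TransactionStats stats = new TransactionStats(new long[] {10, 20, 30, 40, 1300}, 1);
        System.out.println("count=" + stats.count()
                + " avg=" + stats.averageMs()
                + " failureRate=" + stats.failureRate()
                + " tp50=" + stats.percentileMs(50)
                + " tp90=" + stats.percentileMs(90));
    }
}
```

The single 1300 ms outlier pulls the average far above the median, which is why percentile values are reported alongside the average.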
Instrumentation points enabled by default after the application starts:
Instrumentation point | Source component | Description |
---|---|---|
System | cat-client | Instrumentation for reporting monitoring data |
URL | Requires the cat-filter to be configured | Instrumentation for URL access |
    Transaction transaction = Cat.newTransaction("ShopService", "Service3getByUrl");
    try {
        TimeUnit.MILLISECONDS.sleep(1300);
        transaction.setStatus(Transaction.SUCCESS);
    } catch (Exception e) {
        transaction.setStatus(e); // catch the exception and set the status, which marks the request as failed
        Cat.logError(e); // report the exception to CAT
        // You can also rethrow it: throw e;
    } finally {
        transaction.complete();
    }
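To make the statistics listed for the Transaction report more concrete, here is a minimal sketch of how they could be derived from raw durations. `TransactionStatsSketch` is a hypothetical helper, not part of the CAT client, and the nearest-rank percentile used here is an assumption; CAT's own TP calculation may differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the CAT client: it only illustrates the
// statistics a Transaction report shows for one type/name pair.
final class TransactionStatsSketch {
    private final List<Long> durationsMs = new ArrayList<>();
    private int failures;

    // Record one finished transaction.
    void record(long durationMs, boolean success) {
        durationsMs.add(durationMs);
        if (!success) {
            failures++;
        }
    }

    int count() {
        return durationsMs.size();
    }

    // Requests per second over the given window, e.g. 60 for one minute.
    double qps(int windowSeconds) {
        return (double) count() / windowSeconds;
    }

    double failureRate() {
        return count() == 0 ? 0.0 : (double) failures / count();
    }

    double averageMs() {
        if (durationsMs.isEmpty()) {
            return 0.0;
        }
        long sum = 0;
        for (long d : durationsMs) {
            sum += d;
        }
        return (double) sum / durationsMs.size();
    }

    // Nearest-rank percentile (an assumption): the smallest recorded duration
    // such that at least `percent` percent of all durations are <= it.
    long percentileMs(double percent) {
        if (durationsMs.isEmpty()) {
            return 0;
        }
        List<Long> sorted = new ArrayList<>(durationsMs);
        Collections.sort(sorted);
        int rank = (int) Math.ceil(percent / 100.0 * sorted.size());
        return sorted.get(Math.max(rank, 1) - 1);
    }
}
```

Feeding it one minute of samples and calling `qps(60)`, `failureRate()`, and `percentileMs(95)` gives the kind of row the report displays per Transaction name.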
Event
Monitors the number of times a piece of code runs: for example, how many times an Event was recorded in the program and how many of those failed. The overall structure of the Event report is almost the same as that of the Transaction report, except that it has no response time statistics.
Instrumentation points that are recorded by default after the application starts:
Type | Source component | Description |
---|---|---|
System | cat-client | Instrumentation for the reporting of monitoring data, Reboot, etc. |
I have not fully figured this part out yet, so I am leaving it here for now.
Problem
Problem records the problems that occur while the project is running, including exceptions, errors, and slow requests. The Problem report builds on the existing logview feature, which makes it easier to locate problems. Sources:
- Business code explicitly calls the Cat.logError(e) API to instrument errors. See the instrumentation documentation for details.
- Integration with a logging framework captures log entries that carry an exception stack trace.
- Long URL: slow requests in Transactions instrumented as URL
- Long SQL: slow requests in Transactions instrumented as SQL
- Long Service: slow requests in Transactions instrumented as Service or PigeonService
- Long Call: slow requests in Transactions instrumented as Call or PigeonCall
- Long Cache: slow requests in Transactions whose type starts with Cache
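The "Long ..." sources above amount to a mapping from Transaction type to a slow-request category. The sketch below shows one way to express that mapping. `LongTransactionSketch` and every threshold value in it are hypothetical; the real thresholds come from CAT's configuration and are not given in this article.

```java
// Hypothetical sketch, not CAT's implementation: maps a Transaction type and
// duration to one of the "Long ..." problem categories listed above.
final class LongTransactionSketch {

    // Returns the category name, or null if the transaction is not slow
    // or its type has no "Long ..." category.
    static String classify(String type, long durationMs) {
        String category;
        long thresholdMs; // all thresholds below are made-up example values
        if (type.equals("URL")) {
            category = "Long URL";
            thresholdMs = 1000;
        } else if (type.equals("SQL")) {
            category = "Long SQL";
            thresholdMs = 100;
        } else if (type.equals("Service") || type.equals("PigeonService")) {
            category = "Long Service";
            thresholdMs = 50;
        } else if (type.equals("Call") || type.equals("PigeonCall")) {
            category = "Long Call";
            thresholdMs = 100;
        } else if (type.startsWith("Cache")) {
            category = "Long Cache";
            thresholdMs = 10;
        } else {
            return null;
        }
        return durationMs > thresholdMs ? category : null;
    }
}
```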
Heartbeat
The Heartbeat report shows the running status that the CAT client reports to the server periodically, once every minute.
JVM related indicators
All of the following indicators are measured over 1 minute; the minimum statistical granularity of CAT is one minute.
JVM GC indicator | Description |
---|---|
NewGc Count / PS Scavenge Count | Young-generation GC count |
NewGc Time / PS Scavenge Time | Young-generation GC time |
OldGc Count | Old-generation GC count |
PS MarkSweepTime | Old-generation GC time |
Heap Usage | Java virtual machine heap usage |
None Heap Usage | Java virtual machine non-heap (Perm) usage |
JVM thread indicator | Description |
---|---|
Active Thread | Threads currently active in the system |
Daemon Thread | Daemon (background) threads in the system |
Total Started Thread | Total number of threads the system has started |
Started Thread | New threads started by the system per minute |
CAT Started Thread | Threads started by the CAT client |
For the definitions, refer to java.lang.management.ThreadInfo.
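As a rough illustration of where such JVM numbers can come from, the sketch below reads comparable values through the standard `java.lang.management` MXBeans. The mapping from each MXBean getter to a CAT indicator name is my assumption, and `HeartbeatSnapshotSketch` is a hypothetical class, not the CAT client's collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.ThreadMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: collects one snapshot of JVM values similar to those
// in the Heartbeat tables. The indicator names are assumed mappings.
final class HeartbeatSnapshotSketch {

    static Map<String, Long> snapshot() {
        Map<String, Long> values = new LinkedHashMap<>();

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        values.put("Active Thread", (long) threads.getThreadCount());
        values.put("Daemon Thread", (long) threads.getDaemonThreadCount());
        // "Started Thread" per minute would be the difference between two
        // successive readings of this running total.
        values.put("Total Started Thread", threads.getTotalStartedThreadCount());

        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        values.put("Heap Usage", memory.getHeapMemoryUsage().getUsed());
        values.put("None Heap Usage", memory.getNonHeapMemoryUsage().getUsed());

        // Collector names depend on the JVM and GC in use (e.g. "PS Scavenge").
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            values.put(gc.getName() + " Count", gc.getCollectionCount());
            values.put(gc.getName() + " Time", gc.getCollectionTime());
        }
        return values;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Long> entry : snapshot().entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}
```

The GC counts and times returned by the MXBeans are cumulative since JVM start, so a per-minute figure again requires subtracting the previous reading.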
System indicators
System indicator | Description |
---|---|
System Load Average | System load average |
Memory Free | Free system memory |
FreePhysicalMemory | Remaining physical memory |
/ Free | Usage of the root (/) partition |
/data Free | Usage of the /data disk |
Business
Business reports correspond to business indicators, such as order indicators. Unlike Transaction, Event, and Problem, which focus on code execution at the micro level, Business focuses on macro-level indicators.
Scenario example:
1. I want to monitor the order quantity.
2. I want to monitor the order time.
    Cat.logMetricForCount("metric.key");
    Cat.logMetricForDuration("metric.key", 5);
- Metric keys should use plain English and avoid special symbols such as space ( ), colon (:), vertical bar (|), slash (/), comma (,), ampersand (&), asterisk (*), angle brackets (< >), and other unusual characters
- If a separator is needed, use an underscore (_), a hyphen (-), a dot (.), or similar
- Since the database is not case sensitive, keep the case of a key consistent and do not vary it
- Values can be fractional: each point in the trend chart represents the value for one minute. If the monitoring interval is 10 minutes and a total of 5 reports arrive within those 10 minutes, the value of that point in the trend chart is 5 / 10 = 0.5
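The naming rules and the averaging arithmetic above can be checked with a small helper. This is a sketch under assumptions: `MetricKeySketch` is hypothetical and not part of CAT, and its pattern is deliberately stricter than the rules (ASCII letters and digits only, separated by single underscores, hyphens, or dots).

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper, not part of CAT: validates a metric key before it is
// passed to Cat.logMetricForCount / Cat.logMetricForDuration.
final class MetricKeySketch {
    // Runs of letters and digits, optionally separated by "_", "." or "-".
    private static final Pattern SAFE_KEY =
            Pattern.compile("[A-Za-z0-9]+([_.-][A-Za-z0-9]+)*");

    static boolean isSafeKey(String key) {
        return key != null && SAFE_KEY.matcher(key).matches();
    }

    // One way to keep the case consistent: always use lower case.
    static String normalizeCase(String key) {
        return key.toLowerCase(Locale.ROOT);
    }

    // Value of one trend-chart point: reports divided by minutes in the interval.
    static double pointValue(int reportCount, int intervalMinutes) {
        return (double) reportCount / intervalMinutes;
    }
}
```

For the example in the last bullet, `MetricKeySketch.pointValue(5, 10)` evaluates to 0.5.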
Hands-on integration
See the official integration examples: https://github.com/dianping/cat/tree/master/integration
Integrating CatFilter with Spring Boot
- Once an HTTP request arrives, it is instrumented automatically. The filter records each URL access and links the request's subsequent calls into one chain, which you can view in logview on CAT
- Both the Transaction and Event pages in CAT show two types of data: URL and URL.Forward (the latter only if a request is forwarded). Clicking into URL on the Transaction page shows the specific URLs that were accessed (the path portion, with the parameters removed)
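    import com.dianping.cat.servlet.CatFilter;
    import org.springframework.boot.web.servlet.FilterRegistrationBean;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    public class CatFilterConfigure {

        // Register CatFilter for every URL so each request is instrumented
        @Bean
        public FilterRegistrationBean catFilter() {
            FilterRegistrationBean registration = new FilterRegistrationBean();
            CatFilter filter = new CatFilter();
            registration.setFilter(filter);
            registration.addUrlPatterns("/*");
            registration.setName("cat-filter");
            registration.setOrder(1);
            return registration;
        }
    }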
Logback configuration
Add the following configuration to logback.xml:

    <appender name="CatAppender" class="com.dianping.cat.logback.CatLogbackAppender"></appender>
    <root level="info">
        <appender-ref ref="CatAppender" />
    </root>

With this appender in place, Error logs are uploaded to CAT.
Message properties
- type: the category of a message, such as SQL, RPC, or HTTP.
- name: a specific behavior within that type, for example:
  - If the type is SQL, the name can be select <?> from user where id = <?>, which represents an SQL template.
  - If the type is RPC, the name can be QueryOrderByUserId(string, int), which represents the function signature of an API.
  - If the type is HTTP, the name can be /api/v8/{int}/orders, which represents a base URI.

  More detailed information, such as API parameters, is best recorded in the data field.
- status: the status of the message. When the status is not "0", the message is marked as a "problem". Whatever its type, once a message is marked as a "problem" its message tree is not aggregated, which means you can always retrieve its complete log information.
- data: the details of a message.
  - If the type is SQL, the data can be id=75442432
  - If the type is RPC, the data can be userType=dianping&userId=9987
  - If the type is HTTP, the data can be orderId=75442432

  In some cases, the data field contains error stack information (for example, when the message represents an exception or error).
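The four properties can be pictured as a small value object. The sketch below is only a model for illustration: `MessageSketch` is a hypothetical class, not the CAT client's `Message` interface, and the `httpName` helper that turns a concrete path into a template is my own assumption about how such names might be produced.

```java
// Hypothetical model of the four message properties described above.
final class MessageSketch {
    static final String SUCCESS = "0";

    final String type;   // category, e.g. "SQL", "RPC", "HTTP"
    final String name;   // templated behavior, e.g. "/api/v8/{int}/orders"
    final String status; // "0" means success; anything else is a problem
    final String data;   // details, e.g. "orderId=75442432"

    MessageSketch(String type, String name, String status, String data) {
        this.type = type;
        this.name = name;
        this.status = status;
        this.data = data;
    }

    // A message whose status is not "0" is marked as a "problem".
    boolean isProblem() {
        return !SUCCESS.equals(status);
    }

    // Assumed helper: replace purely numeric path segments with {int} so that
    // many concrete URLs share one name, keeping concrete values for `data`.
    static String httpName(String path) {
        return path.replaceAll("/\\d+(?=/|$)", "/{int}");
    }
}
```

Keeping the name templated and moving the variable parts into data is what allows messages with the same type and name to be aggregated in the reports.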