WeChat official account: Operation and Development Story.
Summary: alerts are sent to enterprise WeChat mainly through webhooks. Multiple sinks and multiple ConfigMaps are used to alert several WeChat groups and @ the corresponding people.
Offline event alarm
kube-eventer is a Kubernetes offline event collector open-sourced by Alibaba. The documentation for its webhook sink is at
https://github.com/AliyunContainerService/kube-eventer/blob/master/docs/en/webhook-sink.md
In Kubernetes there are two kinds of events. A Warning event indicates that the state transition that produced it happened between unexpected states. A Normal event indicates that the expected state is consistent with the current state.
We use NPD (node-problem-detector) events as an example. Such events concern transient problems on a node, but they are valuable for system diagnosis. NPD watches the system logs (such as the journal on CentOS) and uses the Kubernetes reporting mechanism to report errors to the Kubernetes node. These logs (kernel logs, for example) contain a lot of noise, so NPD extracts the valuable information and generates offline events. In this way we learn about problems on the node promptly and can handle them in time.
A standard Kubernetes event has the following important attributes, which help in diagnosing problems and raising alarms.
Namespace: the namespace of the object that generated the event.
Kind: the type of object the event is bound to, such as Node, Pod, Namespace, Component, and so on.
Timestamp: the time when the event occurred.
Reason: the reason for this event.
Message: a specific description of the event.
The currently supported sinks are as follows:
Sink Name | Description |
---|---|
dingtalk | sink to dingtalk bot |
sls | sink to alibaba cloud sls service |
elasticsearch | sink to elasticsearch |
honeycomb | sink to honeycomb |
influxdb | sink to influxdb |
kafka | sink to kafka |
mysql | sink to mysql database |
wechat | sink to wechat |
webhook | sink to webhook |
Today we focus on some tricks for the webhook sink. First, look at the supported parameters:
- level - Level of event (optional. default: Warning. Options: Warning and Normal)
- namespaces - Namespaces to filter (optional. default: all namespaces, use commas to separate multiple namespaces, namespace filter doesn't support regexp)
- kinds - Kinds to filter (optional. default: all kinds, use commas to separate multiple kinds. Options: Node, Pod and so on.)
- reason - Reason to filter (optional. default: empty, regexp pattern supported). You can use multiple reason fields in the query.
- method - Method to send request (optional. default: GET)
- header - Header in request (optional. default: empty). You can use multiple header fields in the query.
- custom_body_configmap - The configmap name of the request body template. You can use a template to customize the request body. (optional.)
- custom_body_configmap_namespace - The configmap namespace of the request body template.
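To make the filter parameters above more concrete, here is a minimal Python sketch of how such filters could be applied to an event. It is only an illustration of the documented options, not kube-eventer's actual implementation. The `Event` class and `matches` function are hypothetical names, and two behaviours are assumptions: that `level=Warning` lets only Warning events through while `level=Normal` lets both through, and that a `reason` pattern matches when it is found anywhere in the event reason.

```python
import re
from dataclasses import dataclass


@dataclass
class Event:
    """The event attributes that the webhook filters look at (hypothetical model)."""
    type: str       # "Warning" or "Normal"
    namespace: str
    kind: str       # e.g. "Node", "Pod"
    reason: str     # e.g. "BackOff", "Failed"


def matches(event, level="Warning", namespaces=(), kinds=(), reasons=()):
    """Return True if the event passes every configured filter.

    Empty namespaces/kinds/reasons mean "no filter", mirroring the
    documented defaults.
    """
    # Assumption: level acts as a minimum severity.
    if level == "Warning" and event.type != "Warning":
        return False
    # Namespaces and kinds are plain string comparisons (no regexp).
    if namespaces and event.namespace not in namespaces:
        return False
    if kinds and event.kind not in kinds:
        return False
    # Assumption: a reason pattern matches if found anywhere in the reason.
    if reasons and not any(re.search(p, event.reason) for p in reasons):
        return False
    return True


backoff = Event("Warning", "shop", "Pod", "BackOff")
pulled = Event("Normal", "shop", "Pod", "Pulled")

print(matches(backoff, namespaces=("shop",), reasons=("BackOff", "Failed")))  # True
print(matches(pulled))                                                        # False
print(matches(backoff, kinds=("Node",)))                                      # False
```

The namespace `shop` and the sample events are made up for the example.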
If each project namespace maps one-to-one to a person in charge, each namespace can be tied to its own sink and ConfigMap. Events are most likely to appear when an online deployment is changed, and they quickly expose problems such as wrong image tags or image configuration errors.
First, the ConfigMap. The value of custom_body_configmap selects which ConfigMap, and therefore which body template, a sink uses. The template can be modified slightly to make the alert clearer.
Add Cluster:name to show which cluster the event came from.
Add "mentioned_list": ["wangqing", "@all"] to @ the corresponding person in charge.
    ---
    apiVersion: v1
    data:
      content: >-
        {"msgtype": "text","text": {"content": "Cluster:name\nEventType:{{ .Type }}\nEventNamespace:{{ .InvolvedObject.Namespace }}\nEventKind:{{ .InvolvedObject.Kind }}\nEventObject:{{ .InvolvedObject.Name }}\nEventReason:{{ .Reason }}\nEventTime:{{ .LastTimestamp }}\nEventMessage:{{ .Message }}","mentioned_list":["wangqing","@all"]}}
    kind: ConfigMap
    metadata:
      name: custom-webhook-body
      namespace: namespace
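Because the template is JSON written by hand, a typo can easily break the message. The following Python sketch is one way to sanity-check the template before deploying it: it fills the placeholders with sample values and parses the result as JSON. The `render` helper and the sample values are hypothetical, and the regular-expression substitution is only a crude stand-in for the real template engine, so treat it as a quick check rather than an exact reproduction.

```python
import json
import re

# The ConfigMap content from above, kept as a raw string so that the
# two-character sequence \n stays a JSON escape.
TEMPLATE = (
    r'{"msgtype": "text","text": {"content": "Cluster:name\nEventType:{{ .Type }}'
    r'\nEventNamespace:{{ .InvolvedObject.Namespace }}'
    r'\nEventKind:{{ .InvolvedObject.Kind }}'
    r'\nEventObject:{{ .InvolvedObject.Name }}'
    r'\nEventReason:{{ .Reason }}'
    r'\nEventTime:{{ .LastTimestamp }}'
    r'\nEventMessage:{{ .Message }}",'
    r'"mentioned_list":["wangqing","@all"]}}'
)

# Made-up sample values, keyed by the placeholder path without the leading dot.
SAMPLE = {
    "Type": "Warning",
    "InvolvedObject.Namespace": "shop",
    "InvolvedObject.Kind": "Pod",
    "InvolvedObject.Name": "cart-7d9f",
    "Reason": "BackOff",
    "LastTimestamp": "2021-01-01 12:00:00",
    "Message": "Back-off restarting failed container",
}


def render(template, values):
    """Replace each {{ .Field }} placeholder and parse the result as JSON."""
    filled = re.sub(
        r"\{\{\s*\.([\w.]+)\s*\}\}",
        lambda m: values[m.group(1)],
        template,
    )
    return json.loads(filled)  # raises ValueError if the template is not valid JSON


payload = render(TEMPLATE, SAMPLE)
print(payload["text"]["content"])
print(payload["text"]["mentioned_list"])
```

If `json.loads` raises an error here, the robot would most likely reject the real message as well.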
Command tips
--sink can be given multiple times, so any number of sink lines can be added.
The example below sends notices to enterprise WeChat through webhooks. Note that reason supports regular expressions. Each sink is paired with its own ConfigMap, which completes the event alarms for the Kubernetes machines.
    command:
      - "/kube-eventer"
      - "--source=kubernetes:https://kubernetes.default"
      ## e.g. enterprise WeChat webhook sinks
      - --sink=webhook:https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=xxxxxxx&level=Warning&reason=[^Unhealthy]&namespaces=xxxx&header=Content-Type=application/json&custom_body_configmap=custom-webhook-body0&custom_body_configmap_namespace=xxxx&method=POST
      - --sink=webhook:https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=xxxxxxx&level=Warning&reason=BackOff&namespaces=xxxx&header=Content-Type=application/json&custom_body_configmap=custom-webhook-body1&custom_body_configmap_namespace=xxxx&method=POST
      - --sink=webhook:https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=xxxxxxx&level=Warning&reason=Failed&namespaces=xxxx&header=Content-Type=application/json&custom_body_configmap=custom-webhook-body2&custom_body_configmap_namespace=xxxx&method=POST
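A sink line packs many options into one URL, so it can help to take one apart and look at the pieces. The Python sketch below splits the first sink above into its parameters and then tries its `reason` pattern against a few event reasons. The `parse_sink` helper is a hypothetical name, Python's `re` is used as a stand-in for the collector's own regexp engine, and the pattern is assumed to be applied as an unanchored search. Worth noting: `[^Unhealthy]` is a negated character class, so it matches any reason that contains at least one character outside the letters of "Unhealthy", which is not the same as excluding the word itself.

```python
import re
from urllib.parse import parse_qs, urlsplit

# The first sink from the command above, split across lines for readability.
SINK = (
    "webhook:https://qyapi.weixin.qq.com/cgi-bin/webhook/send"
    "?key=xxxxxxx&level=Warning&reason=[^Unhealthy]&namespaces=xxxx"
    "&header=Content-Type=application/json"
    "&custom_body_configmap=custom-webhook-body0"
    "&custom_body_configmap_namespace=xxxx&method=POST"
)


def parse_sink(sink):
    """Split "<type>:<url>?<params>" into (type, endpoint, params)."""
    sink_type, _, url = sink.partition(":")
    parts = urlsplit(url)
    endpoint = f"{parts.scheme}://{parts.netloc}{parts.path}"
    # parse_qs returns a list per key because keys such as reason may repeat.
    return sink_type, endpoint, parse_qs(parts.query)


sink_type, endpoint, params = parse_sink(SINK)
print(sink_type, endpoint)
for name, values in params.items():
    print(f"  {name} = {values}")

# Try the reason pattern against some event reasons.
pattern = params["reason"][0]
for reason in ["Unhealthy", "BackOff", "Failed", "healthy"]:
    print(reason, bool(re.search(pattern, reason)))
# Unhealthy False
# BackOff True
# Failed True
# healthy False
```

Under these assumptions the first sink happens to skip Unhealthy events, but it would also skip any other reason spelled only with those letters, such as the made-up "healthy" above.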
Complete example:
Create a robot in the enterprise WeChat group and note its webhook address, for example: https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=xxxxxxx .
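Before deploying anything, it can be useful to confirm that the robot accepts a message of the same shape as the template. The following Python sketch builds such a request with the standard library only. The `build_request` and `send` helpers are hypothetical names, the key `xxxxxxx` is the placeholder from the text, and the sketch assumes the robot answers with a JSON body. The actual send is left commented out so nothing is posted until a real key is filled in.

```python
import json
import urllib.request

# Placeholder address from the text; replace the key with the real one.
WEBHOOK_URL = "https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=xxxxxxx"


def build_request(url, content, mentioned=()):
    """Build a POST request carrying a text message with an optional @ list."""
    payload = {
        "msgtype": "text",
        "text": {"content": content, "mentioned_list": list(mentioned)},
    }
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=5):
    """Send the request and return the parsed reply (assumed to be JSON)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


request = build_request(WEBHOOK_URL, "kube-eventer test message", ["@all"])
print(request.get_method(), request.full_url)
# print(send(request))  # uncomment after replacing the placeholder key
```

The method and header here mirror the `method=POST` and `header=Content-Type=application/json` options used in the sink lines.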
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        name: kube-eventer
      name: kube-eventer
      namespace: namespace
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: kube-eventer
      template:
        metadata:
          labels:
            app: kube-eventer
          annotations:
            scheduler.alpha.kubernetes.io/critical-pod: ''
        spec:
          dnsPolicy: ClusterFirstWithHostNet
          serviceAccount: kube-eventer
          containers:
            - image: registry.aliyuncs.com/acs/kube-eventer-amd64:v1.2.0-484d9cd-aliyun
              name: kube-eventer
              command:
                - "/kube-eventer"
                - "--source=kubernetes:https://kubernetes.default"
                ## e.g. enterprise WeChat webhook sinks
                - --sink=webhook:https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=xxxxxxx&level=Warning&reason=[^Unhealthy]&namespaces=xxxx&header=Content-Type=application/json&custom_body_configmap=custom-webhook-body0&custom_body_configmap_namespace=xxxx&method=POST
                #- --sink=webhook:https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=xxxxxxx&level=Warning&reason=BackOff&namespaces=xxxx&header=Content-Type=application/json&custom_body_configmap=custom-webhook-body1&custom_body_configmap_namespace=xxxx&method=POST
                #- --sink=webhook:https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=xxxxxxx&level=Warning&reason=Failed&namespaces=xxxx&header=Content-Type=application/json&custom_body_configmap=custom-webhook-body2&custom_body_configmap_namespace=xxxxx&method=POST
              env:
                # If TZ is assigned, set the TZ value as the time zone
                - name: TZ
                  value: "Asia/Shanghai"
              volumeMounts:
                - name: localtime
                  mountPath: /etc/localtime
                  readOnly: true
                - name: zoneinfo
                  mountPath: /usr/share/zoneinfo
                  readOnly: true
              resources:
                requests:
                  cpu: 200m
                  memory: 100Mi
                limits:
                  cpu: 500m
                  memory: 250Mi
          volumes:
            - name: localtime
              hostPath:
                path: /etc/localtime
            - name: zoneinfo
              hostPath:
                path: /usr/share/zoneinfo
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: kube-eventer
    rules:
      - apiGroups:
          - ""
        resources:
          - events
          - configmaps
        verbs:
          - get
          - list
          - watch
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: kube-eventer
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: kube-eventer
    subjects:
      - kind: ServiceAccount
        name: kube-eventer
        namespace: namespace
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: kube-eventer
      namespace: namespace
    ---
    apiVersion: v1
    data:
      content: >-
        {"msgtype": "text","text": {"content": "Cluster:name\nEventType:{{ .Type }}\nEventNamespace:{{ .InvolvedObject.Namespace }}\nEventKind:{{ .InvolvedObject.Kind }}\nEventObject:{{ .InvolvedObject.Name }}\nEventReason:{{ .Reason }}\nEventTime:{{ .LastTimestamp }}\nEventMessage:{{ .Message }}","mentioned_list":["wangqing","@all"]}}
    kind: ConfigMap
    metadata:
      # The name and namespace must match custom_body_configmap and
      # custom_body_configmap_namespace in the --sink argument above.
      name: custom-webhook-body0
      namespace: namespace
In this way, a simple division of which alarms go to whom can be achieved. With event alarms in place, service problems and cluster problems can be found and repaired in time.
github: https://github.com/orgs/sunsharing-note/dashboard