k8s集群部署vmalert和prometheusalert实现钉钉告警

先决条件

安装以下软件包:git, kubectl, helm, helm-docs,请参阅本教程。

1、安装 helm

wget https://xxx-xx.oss-cn-xxx.aliyuncs.com/helm-v3.8.1-linux-amd64.tar.gz
tar xvzf helm-v3.8.1-linux-amd64.tar.gz
mv linux-amd64/helm /usr/local/bin
rm -rf linux-amd64

2、安装victoria-metrics-alert

(1)使用以下命令添加 helm chart存储库

helm repo add vm https://victoriametrics.github.io/helm-charts/

helm repo update

(2)列出vm/victoria-metrics-alert可供安装的helm版本

helm search repo vm/victoria-metrics-alert -l

(3)victoria-metrics-alert将图表的默认值导出到文件values.yaml

helm show values vm/victoria-metrics-alert > values.yaml

(4)根据环境需要更改values.yaml文件中的值,完整配置参考如下

# Default values for victoria-metrics-alert.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

serviceAccount:
  # Specifies whether a service account should be created
  create: true
  # Annotations to add to the service account
  annotations: {}
  # The name of the service account to use.
  # If not set and create is true, a name is generated using the fullname template
  name:
  # mount API token to pod directly
  automountToken: true

imagePullSecrets: []

rbac:
  create: true
  pspEnabled: true
  namespaced: false
  extraLabels: {}
  annotations: {}

server:
  name: server
  enabled: true
  image:
    repository: victoriametrics/vmalert
    tag: "" # rewrites Chart.AppVersion
    pullPolicy: IfNotPresent
  nameOverride: ""
  fullnameOverride: ""

  ## See `kubectl explain poddisruptionbudget.spec` for more
  ## ref: https://kubernetes.io/docs/tasks/run-application/configure-pdb/
  podDisruptionBudget:
    enabled: false
    # minAvailable: 1
    # maxUnavailable: 1
    labels: {}

  # -- Additional environment variables (ex.: secret tokens, flags) https://github.com/VictoriaMetrics/VictoriaMetrics#environment-variables
  env:
    []
    # - name: VM_remoteWrite_basicAuth_password
    #   valueFrom:
    #     secretKeyRef:
    #       name: auth_secret
    #       key: password

  replicaCount: 1

  # deployment strategy, set to standard k8s default
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%

  # specifies the minimum number of seconds for which a newly created Pod should be ready without any of its containers crashing/terminating
  # 0 is the standard k8s default
  minReadySeconds: 0

  # vmalert reads metrics from source, next section represents its configuration. It can be any service which supports
  # MetricsQL or PromQL.
  datasource:
    url: "http://192.168.47.9:8481/select/0/prometheus/"
    basicAuth:
      username: ""
      password: ""

  remote:
    write:
      url: ""
    read:
      url: ""

  notifier:
    alertmanager:
      url: "http://192.168.112.68:9093"

  extraArgs:
    envflag.enable: "true"
    envflag.prefix: VM_
    loggerFormat: json

  # Additional hostPath mounts
  extraHostPathMounts:
    []
    # - name: certs-dir
    #   mountPath: /etc/kubernetes/certs
    #   subPath: ""
    #   hostPath: /etc/kubernetes/certs
  #   readOnly: true

  # Extra Volumes for the pod
  extraVolumes:
    []
     #- name: example
     #  configMap:
     #    name: example

  # Extra Volume Mounts for the container
  extraVolumeMounts:
    []
    # - name: example
    #   mountPath: /example

  extraContainers:
    []
    #- name: config-reloader
    #  image: reloader-image

  service:
    annotations: {}
    labels: {}
    clusterIP: ""
    ## Ref: https://kubernetes.io/docs/user-guide/services/#external-ips
    ##
    externalIPs: []
    loadBalancerIP: ""
    loadBalancerSourceRanges: []
    servicePort: 8880
    type: ClusterIP
    # Ref: https://kubernetes.io/docs/tasks/access-application-cluster/create-external-load-balancer/#preserving-the-client-source-ip
    # externalTrafficPolicy: "local"
    # healthCheckNodePort: 0

  ingress:
    enabled: false
    annotations: {}
    #   kubernetes.io/ingress.class: nginx
    #   kubernetes.io/tls-acme: 'true'

    extraLabels: {}
    hosts: []
    #   - name: vmselect.local
    #     path: /select
    #     port: http

    tls: []
    #   - secretName: vmselect-ingress-tls
    #     hosts:
    #       - vmselect.local

    # For Kubernetes >= 1.18 you should specify the ingress-controller via the field ingressClassName
    # See https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/#specifying-the-class-of-an-ingress
    # ingressClassName: nginx
    # -- pathType is only for k8s >= 1.1=
    pathType: Prefix

  podSecurityContext: {}
  # fsGroup: 2000

  securityContext:
    {}
    # capabilities:
    #   drop:
    #   - ALL
    # readOnlyRootFilesystem: true
    # runAsNonRoot: true
  # runAsUser: 1000

  resources:
    {}
    # We usually recommend not to specify default resources and to leave this as a conscious
    # choice for the user. This also increases chances charts run on environments with little
    # resources, such as Minikube. If you do want to specify resources, uncomment the following
    # lines, adjust them as necessary, and remove the curly braces after 'resources:'.
    # limits:
    #   cpu: 100m
    #   memory: 128Mi
    # requests:
    #   cpu: 100m
  #   memory: 128Mi

  # Annotations to be added to the deployment
  annotations: {}
  # labels to be added to the deployment
  labels: {}

  # Annotations to be added to pod
  podAnnotations: {}

  podLabels: {}

  nodeSelector: {}

  priorityClassName: ""

  tolerations: []

  affinity: {}

  # vmalert alert rules configuration configuration:
  # use existing configmap if specified
  # otherwise .config values will be used
  configMap: ""
  config:
    alerts:
      groups:
          - name: 磁盘挂载错误
            rules:
            - alert: 磁盘挂载错误
              expr: mount_error == 1
              for: 1m
              labels:
                level: 1
                severity: warning
              annotations:
                description: "{{$labels.job}}链{{$labels.instance}}节点磁盘挂载错误!"

serviceMonitor:
  enabled: false
  extraLabels: {}
  annotations: {}
#    interval: 15s
#    scrapeTimeout: 5s
  # -- Commented. HTTP scheme to use for scraping.
#    scheme: https
  # -- Commented. TLS configuration to use when scraping the endpoint
#    tlsConfig:
#      insecureSkipVerify: true

alertmanager:
  enabled: true
  replicaCount: 1
  podMetadata:
    labels: {}
    annotations: {}
  image: prom/alertmanager
  tag: v0.20.0
  retention: 120h
  nodeSelector: {}
  priorityClassName: ""
  resources: {}
  tolerations: []
  imagePullSecrets: []
  podSecurityContext: {}
  extraArgs: {}
  # key: value

  # external URL, that alertmanager will expose to receivers
  baseURL: ""
  # use existing configmap if specified
  # otherwise .config values will be used
  configMap: ""
  config:
    global:
      resolve_timeout: 5m
    route:
      # default receiver
      receiver: ops_notify
      # tag to group by
      group_by: [alertname]
      # How long to initially wait to send a notification for a group of alerts
      group_wait: 30s
      # How long to wait before sending a notification about new alerts that are added to a group
      group_interval: 60s
      # How long to wait before sending a notification again if it has already been sent successfully for an alert
      repeat_interval: 1h
    receivers:
      - name: ops_notify
        webhook_configs:
        - url: http://192.168.157.59:8080/prometheusalert?type=dd&tpl=prometheus-dd&split=false
          send_resolved: true
    inhibit_rules:
      - source_match:
          severity: 'warning'
        target_match:
          severity: 'warning'
        equal: ['alertname', 'job']

  templates: {}
  #  alertmanager.tmpl: |-
  service:
    annotations: {}
    type: ClusterIP
    port: 9093
    # if you want to force a specific nodePort. Must be use with service.type=NodePort
    # nodePort:
  ingress:
    enabled: false
    annotations: {}
    #   kubernetes.io/ingress.class: nginx
    #   kubernetes.io/tls-acme: 'true'
    extraLabels: {}
    hosts: []
    #   - name: alertmanager.local
    #     path: /
    #     port: web

    tls: []
    #   - secretName: alertmanager-ingress-tls
    #     hosts:
    #       - alertmanager.local

    # For Kubernetes >= 1.18 you should specify the ingress-controller via the field ingressClassName
    # See https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/#specifying-the-class-of-an-ingress
    # ingressClassName: nginx
    # -- pathType is only for k8s >= 1.1=
    pathType: Prefix
  persistentVolume:
    # -- Create/use Persistent Volume Claim for alertmanager component. Empty dir if false
    enabled: false
    # -- Array of access modes. Must match those of existing PV or dynamic provisioner. Ref: [http://kubernetes.io/docs/user-guide/persistent-volumes/](http://kubernetes.io/docs/user-guide/persistent-volumes/)
    accessModes:
      - ReadWriteOnce
    # -- Persistant volume annotations
    annotations: {}
    # -- StorageClass to use for persistent volume. Requires alertmanager.persistentVolume.enabled: true. If defined, PVC created automatically
    storageClass: ""
    # -- Existing Claim name. If defined, PVC must be created manually before volume will be bound
    existingClaim: ""
    # -- Mount path. Alertmanager data Persistent Volume mount root path.
    mountPath: /data
    # -- Mount subpath
    subPath: ""
    # -- Size of the volume. Better to set the same as resource limit memory property.
    size: 50Mi

(5)使用命令测试安装:

helm install vmalert vm/victoria-metrics-alert -f values.yaml -n victoria-metrics --debug --dry-run

(6)使用以下命令安装

helm install vmalert vm/victoria-metrics-alert -f values.yaml -n victoria-metrics

(7)通过运行以下命令获取 pod 列表

kubectl get pods -A | grep 'alert'

(8)通过运行以下命令获取应用程序

helm list -f vmalert -n victoria-metrics

(9)使用命令查看应用程序版本的历史记录vmalert

helm history vmalert -n victoria-metrics

(10)更新配置

cd /root/vmalert
#修改value.yaml文件
helm upgrade vmalert vm/victoria-metrics-alert -f values.yaml -n victoria-metrics

(11)查看service

kubectl get svc -n victoria-metrics

3、安装prometheusalert

(1)使用helm部署

git clone https://github.com/feiyu563/PrometheusAlert.git
cd PrometheusAlert/example/helm/prometheusalert
#如需修改配置文件,请更新config中的app.conf
helm install -n victoria-metrics prometheus-alert .

(2)values.yaml配置文件参考

cat /root/PrometheusAlert/example/helm/prometheusalert/values.yaml

# Default values for prometheusalert.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

global:
  imagePullSecrets: []
    # - name: "registry-secret"

replicaCount: 1

image:
  # 支持配置自定义模版需要重出镜像,或者使用本人构建镜像:lusson/prometheusalert:v1.0
  repository: feiyu563/prometheus-alert:v4.8
  pullPolicy: IfNotPresent

nameOverride: ""
fullnameOverride: ""

service:
  type: ClusterIP
  port: 8080

ingress:
  enabled: true
  annotations: {}
    # kubernetes.io/ingress.class: nginx
    # kubernetes.io/tls-acme: "true"
  hosts:
    - host: prometheusalert.xxxxx.com
      paths: ["/"]

  tls: []

resources:
  limits:
   cpu: 1000m
   memory: 1024Mi
  requests:
   cpu: 100m
   memory: 128Mi

nodeSelector: {}

tolerations: []

affinity: {}

(3)app.conf配置参考

cat /root/PrometheusAlert/example/helm/prometheusalert/config/app.conf

(4)ingress.yaml配置参考

cat /root/PrometheusAlert/example/helm/prometheusalert/templates/ingress.yaml

{{- if .Values.ingress.enabled -}}
{{- $fullName := include "prometheusalert.fullname" . -}}
{{- $svcPort := .Values.service.port -}}
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: {{ $fullName }}
  labels:
{{ include "prometheusalert.labels" . | indent 4 }}
  {{- with .Values.ingress.annotations }}
  annotations:
    {{- toYaml . | nindent 4 }}
  {{- end }}
spec:
{{- if .Values.ingress.tls }}
  tls:
  {{- range .Values.ingress.tls }}
    - hosts:
      {{- range .hosts }}
        - {{ . | quote }}
      {{- end }}
      secretName: {{ .secretName }}
  {{- end }}
{{- end }}
  rules:
  {{- range .Values.ingress.hosts }}
    - host: {{ .host | quote }}
      http:
        paths:
        {{- range .paths }}
          - path: {{ . }}
            pathType: Prefix
            backend:
              service:
                name: {{ $fullName }}
                port:
                  number: {{ $svcPort }}
        {{- end }}
  {{- end }}
{{- end }}

(5)更新配置

cd /root/PrometheusAlert/example/helm/prometheusalert
helm upgrade -n victoria-metrics prometheus-alert .

(6)重启pod

删除Pod
helm delete prometheus-alert -n victoria-metrics

查看pods和service
kubectl get pods -n victoria-metrics
kubectl get svc -n victoria-metrics

重新安装
helm install -n victoria-metrics prometheus-alert .

查看pods和service
kubectl get pods -n victoria-metrics
kubectl get svc -n victoria-metrics

(1)告警测试

模板内容:

{{ $var := .externalURL}}{{ range $k,$v:=.alerts }}
{{if eq $v.status "resolved"}}
## [巡检恢复信息]({{$v.generatorURL}})
#### [{{$v.labels.alertname}}]({{$var}})
###### 告警级别:{{$v.labels.level}}
###### 开始时间:{{$v.startsAt}}
###### 故障主机:{{$v.labels.instance}}
##### {{$v.annotations.description}}
{{else}}
## [巡检告警信息]({{$v.generatorURL}})
#### [{{$v.labels.alertname}}]({{$var}})
###### 告警级别:{{$v.labels.level}}
###### 开始时间:{{$v.startsAt}}
###### 故障主机:{{$v.labels.instance}}
##### {{$v.annotations.description}}
{{end}}
{{ end }}

(2)查看日志

(3)查看钉钉告警

参考文档:https://github.com/VictoriaMetrics/helm-charts/tree/master/charts/victoria-metrics-alert

参考文档:https://github.com/feiyu563/PrometheusAlert/tree/master/example/helm/prometheusalert

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:/a/82908.html

如若内容造成侵权/违法违规/事实不符,请联系我们进行投诉反馈qq邮箱809451989@qq.com,一经查实,立即删除!

相关文章

网络编程(TCP和UDP的基础模型)

一、TCP基础模型&#xff1a; tcp Ser&#xff1a; #include <stdio.h> #include <sys/types.h> #include <sys/socket.h> #include <arpa/inet.h> #include <netinet/in.h> #include <string.h> #include <head.h>#define PORT 88…

无涯教程-Perl - wait函数

描述 该函数等待子进程终止,返回已故进程的进程ID。进程的退出状态包含在$?中。 语法 以下是此函数的简单语法- wait返回值 如果没有子进程,则此函数返回-1,否则将显示已故进程的进程ID Perl 中的 wait函数 - 无涯教程网无涯教程网提供描述该函数等待子进程终止,返回已故…

Java 项目日志实例:LogBack

点击下方关注我&#xff0c;然后右上角点击...“设为星标”&#xff0c;就能第一时间收到更新推送啦~~~ LogBack 和 Log4j 都是开源日记工具库&#xff0c;LogBack 是 Log4j 的改良版本&#xff0c;比 Log4j 拥有更多的特性&#xff0c;同时也带来很大性能提升。LogBack 官方建…

Pytorch建立MyDataLoader过程详解

简介 torch.utils.data.DataLoader(dataset, batch_size1, shuffleNone, samplerNone, batch_samplerNone, num_workers0, collate_fnNone, pin_memoryFalse, drop_lastFalse, timeout0, worker_init_fnNone, multiprocessing_contextNone, generatorNone, *, prefetch_factorN…

TensorFlow2.1 模型训练使用

文章目录 1、环境安装搭建2、神经网络2.1、解决线性问题2.2、FAshion MNIST数据集使用 3、卷积神经网络3.1、卷积神经网络使用3.2、ImageDataGenerator使用3.3、猫狗识别案例3.4、参数优化 3.5、石头剪刀布案例4、词条化4.1、讽刺数据集的词条化和序列化4.2、词嵌入 1、环境安装…

[保研/考研机试] KY11 二叉树遍历 清华大学复试上机题 C++实现

题目链接&#xff1a; 二叉树遍历_牛客题霸_牛客网编一个程序&#xff0c;读入用户输入的一串先序遍历字符串&#xff0c;根据此字符串建立一个二叉树&#xff08;以指针方式存储&#xff09;。题目来自【牛客题霸】https://www.nowcoder.com/share/jump/43719512169254700747…

实数信号的傅里叶级数研究(Matlab代码实现)

&#x1f4a5;&#x1f4a5;&#x1f49e;&#x1f49e;欢迎来到本博客❤️❤️&#x1f4a5;&#x1f4a5; &#x1f3c6;博主优势&#xff1a;&#x1f31e;&#x1f31e;&#x1f31e;博客内容尽量做到思维缜密&#xff0c;逻辑清晰&#xff0c;为了方便读者。 ⛳️座右铭&a…

(2)、将SpringCache扩展功能封装为starter

(2)、将SpringCache扩展功能封装为starter 1、准备工作 前面我们写了一个common-cache模块,尽可能的将自定义的RedisConnectionFactory, RedisTemplate, RedisCacheManager等Bean封装了起来。 就是为了方便我们将其封装为一个Starter。 我们这里直接《SpringCache+Redis实…

微信个人号开发,实现机器人辅助社群操作

微信 iPad 协议是指用于在 iPad 设备上使用微信应用的技术协议。一般来说&#xff0c;通过该协议可以将微信账号同步到 iPad 设备上&#xff0c;并且可以在 iPad 上发送和接收微信消息&#xff0c;查看好友列表、聊天记录等功能。微信 iPad 协议是通过私有API实现的。 需要一定…

嵌入式:ARM Day6

作业:完成cortex-A7核UART总线实验 目的&#xff1a;1.输入a,显示b&#xff0c;将输入的字符的ASCII码下一位字符输出 2.原样输出输入的字符串 源码&#xff1a; uart4.h #ifndef __UART4_H__ #define __UART4_H__#include "stm32mp1xx_rcc.h" #incl…

九、Linux下,如何在命令行进入文本编辑页面?

1、文本编辑基础 说到文本编辑页面&#xff0c;那就必须提到vi和vim&#xff0c;两者都是Linux系统中&#xff0c;常用的文本编辑器 2、三种工作模式 3、使用方法 &#xff08;1&#xff09;在进入Linux系统&#xff0c;在输入vim text.txt之后&#xff0c;会进入文本编辑中&…

Python多组数据三维绘图系统

文章目录 增添和删除坐标数据更改绘图逻辑源代码 Python绘图系统&#xff1a; 基础&#xff1a;将matplotlib嵌入到tkinter &#x1f4c8;简单的绘图系统 &#x1f4c8;数据导入&#x1f4c8;三维绘图系统自定义控件&#xff1a;坐标设置控件&#x1f4c9;坐标列表控件 增添和…

【力扣】84. 柱状图中最大的矩形 <模拟、双指针、单调栈>

目录 【力扣】84. 柱状图中最大的矩形题解暴力求解双指针单调栈 【力扣】84. 柱状图中最大的矩形 给定 n 个非负整数&#xff0c;用来表示柱状图中各个柱子的高度。每个柱子彼此相邻&#xff0c;且宽度为 1 。求在该柱状图中&#xff0c;能够勾勒出来的矩形的最大面积。 示例…

使用ChatGPT进行创意写作的缺点

Open AI警告ChatGPT的使用者要明白此工具的局限性&#xff0c;更不应完全依赖。作为一位创作者&#xff0c;这一点非常重要&#xff0c;应尽可能地避免让版权问题或不必要的文体问题出现在自己的作品中。[1] 毕竟使用ChatGPT进行创意写作目前还有以下种种局限或缺点[2]&#xf…

5.5.webrtc的线程管理

今天呢&#xff0c;我们来介绍一下线程的管理与绑定&#xff0c;首先我们来看一下web rtc中的线程管理类&#xff0c;也就是thread manager。对于这个类来说呢&#xff0c;其实实现非常简单&#xff0c;对吧&#xff1f; 包括了几个重要的成员&#xff0c;第一个成员呢就是ins…

学习笔记230804---逻辑跳转this.$router.push在写法上的优化

今天和资深前端代码写重&#xff0c;同时写页面带参跳转&#xff0c;组长觉得他写的方式比我高端一点&#xff0c;我觉得确实是&#xff0c;像资深大佬学习。 我的写法&#xff1a; this.$router.push(/bdesign?applicationId${this.data.id}&appName${this.data.name})…

[VS/C++]如何更好的配置DLL项目中的成品输出

注意&#xff0c;解决方案与项目不放在同一个文件夹中&#xff0c;即不选中图中选项 直入主题 首先右键项目选择属性&#xff0c;或者选中项目然后AltEnter 选择配置属性下的常规 分别在四种配置中编辑输出目录如下 注意&#xff0c;四种配置要分别配置&#xff0c;一个个来…

一百六十一、Kettle——Linux上安装的kettle9.2开启carte服务(亲测、附流程截图)

一、目的 在Linux上安装好kettle9.2并且连接好各个数据库后&#xff0c;下面开启carte服务 二、实施步骤 &#xff08;一&#xff09;carte服务文件路径 kettle的Linux运行的carte服务文件是carte.sh &#xff08;二&#xff09;修改kettle安装路径下的pwd文件夹里的服务器…

Netty+springboot开发即时通讯系统笔记(四)终

实时性 1.线程池多线程&#xff0c;把消息同步给其他端和对方用户&#xff0c;其中数据持久化往往是最浪费时间的操作&#xff0c;可以使用mq异步存储&#xff0c;因为其他业务不需要拿着整条数据&#xff0c;只需要这条数据的id进行操作。 2。消息校验前置&#xff0c;放在t…

谷歌推出首款量子弹性 FIDO2 安全密钥

谷歌在本周二宣布推出首个量子弹性 FIDO2 安全密钥&#xff0c;作为其 OpenSK 安全密钥计划的一部分。 Elie Bursztein和Fabian Kaczmarczyck表示&#xff1a;这一开源硬件优化的实现采用了一种新颖的ECC/Dilithium混合签名模式&#xff0c;它结合了ECC抵御标准攻击的安全性和…