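Preface
As part of organizing my technical notes, this article sets up Grafana + Prometheus + cAdvisor to monitor containers, imports a popular community dashboard, and uses it to watch container performance metrics.
Dashboard preview
This panel shows data collected by node-exporter. I have not installed node-exporter, and it is outside the scope of this article, so the panel is empty.
This panel shows container performance metrics.
There is a lot more to see in this set of metrics.
Prepare the configuration files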
docker-compose.yaml
version: "3"
services:
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    environment:
      - TZ=Asia/Shanghai
    ports:
      - 3000:3000
    volumes:
      - ./grafana-data:/var/lib/grafana
    networks:
      custom-bridge:
    restart: unless-stopped
    logging:
      options:
        max-size: "10m"
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: unless-stopped
    networks:
      custom-bridge:
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - ./prometheus_data:/prometheus
    ports:
      # Prometheus listens on 9090 inside the container; the host sees it on 19090
      - 19090:9090
    logging:
      options:
        max-size: "10m"
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    container_name: cadvisor
    restart: unless-stopped
    networks:
      custom-bridge:
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run/:ro
      - /sys:/sys:ro
      - /var/lib/docker:/var/lib/docker:ro
    ports:
      # cAdvisor listens on 8080 inside the container (the port prometheus.yml scrapes)
      - 8080:8080
    logging:
      options:
        max-size: "10m"
networks:
  custom-bridge:
    external: true
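Because the compose file declares custom-bridge as an external network, Compose will not create it and will refuse to start if it is missing. A minimal sketch of a guard, assuming the Docker CLI is installed and the current user can reach the daemon; the helper name ensure_network is made up for illustration:

```shell
# ensure_network NAME: create the Docker network NAME only if it is missing.
# (ensure_network is a hypothetical helper name, not a Docker command.)
ensure_network() {
  if docker network inspect "$1" >/dev/null 2>&1; then
    echo "network $1 already exists"
  else
    docker network create "$1"
  fi
}
```

Calling ensure_network custom-bridge once before docker-compose up -d should be enough; running it again is harmless.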
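Pulling the cAdvisor image (hosted on gcr.io) may fail because of network problems. If it does, see this article: Configuring a network proxy for the Docker daemon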
prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# Scrape configurations: Prometheus itself and cAdvisor.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: cadvisor
    scrape_interval: 5s
    static_configs:
      - targets:
          - cadvisor:8080
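Indentation mistakes in prometheus.yml are easy to make, so it can help to validate the file before starting the stack. A sketch that runs promtool check config from the prom/prometheus image, assuming Docker is available and the file path passed in is absolute (bind mounts need one); check_prometheus_config is a hypothetical helper name:

```shell
# check_prometheus_config FILE: validate FILE with promtool from the
# prom/prometheus image, without starting the Prometheus server.
# (check_prometheus_config is a hypothetical helper name.)
check_prometheus_config() {
  docker run --rm \
    --entrypoint promtool \
    -v "$1:/etc/prometheus/prometheus.yml:ro" \
    prom/prometheus:latest \
    check config /etc/prometheus/prometheus.yml
}
```

From the directory that holds the file, it could be called as check_prometheus_config "$PWD/prometheus.yml".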
Create the data directories and set their permissions
mkdir grafana-data
mkdir prometheus_data
chmod 777 grafana-data
chmod 777 prometheus_data
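chmod 777 works, but it makes both directories writable by every user on the host. A narrower sketch is to give each directory to the user its container runs as. The UIDs below are assumptions (472 for grafana/grafana, 65534 "nobody" for prom/prometheus) and should be checked against the images you actually pull, for example with docker run --rm --entrypoint id followed by the image name. own_data_dirs is a hypothetical helper name:

```shell
# own_data_dirs: hand each data directory to the UID its container runs as.
# Assumed UIDs: 472 for grafana/grafana, 65534 (nobody) for prom/prometheus;
# verify them against the image versions you actually use.
own_data_dirs() {
  mkdir -p grafana-data prometheus_data
  chown -R 472:472 grafana-data
  chown -R 65534:65534 prometheus_data
}
```

Changing ownership needs root, so this would be run from a root shell in the directory that contains docker-compose.yaml.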
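Start the stack and configure the data source in Grafana
Run docker-compose up -d to start the stack.
Once it is up, open the Grafana web UI: http://pet.anarckk.me:3000/
The default username and password are admin/admin.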
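The containers can take a few seconds to become reachable after docker-compose up -d returns. A small polling sketch, assuming curl is installed; wait_for_url is a hypothetical helper name, and the one-second interval and default of 30 attempts are arbitrary choices:

```shell
# wait_for_url URL [TRIES]: poll URL once per second until it answers with a
# successful HTTP status, giving up after TRIES attempts (default 30).
# (wait_for_url is a hypothetical helper name.)
wait_for_url() {
  url="$1"
  tries="${2:-30}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -fsS -o /dev/null "$url"; then
      echo "ready: $url"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "timed out: $url" >&2
  return 1
}
```

For this setup it could be pointed at Grafana's health endpoint, wait_for_url http://pet.anarckk.me:3000/api/health, or at Prometheus' readiness endpoint, wait_for_url http://pet.anarckk.me:19090/-/ready.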
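Click Add new connection.
Search for Prometheus and select it.
Change the connection URL. Because Grafana and Prometheus share the custom-bridge network, http://prometheus:9090 (container name plus in-container port) should work here; host port 19090 is only for access from outside Docker.
Finally, click Save & test.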
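Import a popular dashboard
First, create a dashboard.
Then pick a popular dashboard. I used https://grafana.com/grafana/dashboards/16314-docker-container-os-node-node-exporter-cadvisor/ , whose dashboard ID is 16314.
Choose Import dashboard.
Paste the ID and click Load.
Select the Prometheus data source, then click Import.
That's everything. At this point you can see the dashboards shown in the preview above.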
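Prometheus can also be queried directly for specific metrics
Open the Prometheus web UI: http://pet.anarckk.me:19090/graph
# rate() needs at least two samples inside the window; [10s] is enough here only because the cadvisor job is scraped every 5s
# Download (receive) rate of the container named alist, in bytes per second
rate(container_network_receive_bytes_total{name="alist"}[10s])
# Upload (transmit) rate of the container named alist, in bytes per second
rate(container_network_transmit_bytes_total{name="alist"}[10s])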
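The same expressions can be run from a terminal through Prometheus' HTTP API (the /api/v1/query endpoint) instead of the web UI. A minimal sketch, assuming curl is installed; prom_query is a hypothetical helper name:

```shell
# prom_query BASE_URL EXPR: run an instant query through Prometheus'
# HTTP API (/api/v1/query) and print the raw JSON response.
# (prom_query is a hypothetical helper name.)
prom_query() {
  curl -fsS --get --data-urlencode "query=$2" "$1/api/v1/query"
}
```

For example: prom_query http://pet.anarckk.me:19090 'rate(container_network_receive_bytes_total{name="alist"}[10s])'. Letting curl URL-encode the expression avoids having to escape the braces and quotes by hand.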