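Grafana is an open-source data visualization and monitoring platform created by Torkel Ödegaard in 2014, and a core tool in cloud-native observability. It connects to dozens of data sources and, through a rich set of charts and alerting features, turns large volumes of data into intuitive dashboards.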
Grafana positions itself as a unified data visualization platform.
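Torkel Ödegaard first built Grafana in 2014 as a visualization tool for Graphite and InfluxDB. It soon added support for more data sources and became one of the most widely used monitoring dashboard tools.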
# Configure the Prometheus data source
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    url: http://prometheus:9090
    access: proxy
    isDefault: true
    jsonData:
      timeInterval: 15s
  # Configure the MySQL data source
  - name: MySQL
    type: mysql
    url: mysql:3306
    database: mydb
    user: root
    secureJsonData:
      password: password
    jsonData:
      maxOpenConns: 100
      maxIdleConns: 100
  # Configure the Elasticsearch data source
  - name: Elasticsearch
    type: elasticsearch
    url: http://elasticsearch:9200
    access: proxy
    database: "logs-*"
    jsonData:
      timeField: "@timestamp"
      esVersion: "7.0.0"  # quoted so YAML keeps it a string instead of the float 7.0
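
Before handing a provisioning file like the one above to Grafana, a quick structural check can catch simple mistakes. The sketch below is only an illustration: `check_datasources` is a hypothetical helper, not part of Grafana, and it assumes the YAML has already been loaded into a plain Python dict. It checks two things: every entry has a `name` and a `type`, and at most one entry sets `isDefault`.

```python
def check_datasources(config):
    """Return a list of problems found in a provisioning-style dict."""
    problems = []
    datasources = config.get("datasources", [])
    for index, ds in enumerate(datasources):
        # Every data source entry needs at least a name and a type
        for key in ("name", "type"):
            if key not in ds:
                problems.append(f"datasource #{index} is missing '{key}'")
    # Collect the names of all entries flagged as the default
    defaults = [ds.get("name", "?") for ds in datasources if ds.get("isDefault")]
    if len(defaults) > 1:
        problems.append(f"more than one default datasource: {defaults}")
    return problems


# The three data sources from the configuration above, reduced to the checked keys
config = {
    "apiVersion": 1,
    "datasources": [
        {"name": "Prometheus", "type": "prometheus", "isDefault": True},
        {"name": "MySQL", "type": "mysql"},
        {"name": "Elasticsearch", "type": "elasticsearch"},
    ],
}
print(check_datasources(config))
```

An empty list means no problems were found by these two checks; it does not guarantee that Grafana will accept the file.
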
{
  "title": "System Monitoring",
  "panels": [
    {
      "title": "CPU Usage",
      "type": "graph",
      "gridPos": {"h": 8, "w": 12, "x": 0, "y": 0},
      "targets": [
        {
          "expr": "100 - (avg(rate(node_cpu_seconds_total{mode=\"idle\"}[5m])) * 100)",
          "legendFormat": "CPU usage"
        }
      ]
    },
    {
      "title": "Memory Usage",
      "type": "graph",
      "gridPos": {"h": 8, "w": 12, "x": 12, "y": 0},
      "targets": [
        {
          "expr": "(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100",
          "legendFormat": "Memory usage"
        }
      ]
    },
    {
      "title": "Disk Space",
      "type": "graph",
      "gridPos": {"h": 8, "w": 12, "x": 0, "y": 8},
      "targets": [
        {
          "expr": "(1 - (node_filesystem_avail_bytes{mountpoint=\"/\"} / node_filesystem_size_bytes{mountpoint=\"/\"})) * 100",
          "legendFormat": "Disk usage"
        }
      ]
    },
    {
      "title": "Network Traffic",
      "type": "graph",
      "gridPos": {"h": 8, "w": 12, "x": 12, "y": 8},
      "targets": [
        {
          "expr": "rate(node_network_receive_bytes_total[5m])",
          "legendFormat": "Receive"
        },
        {
          "expr": "rate(node_network_transmit_bytes_total[5m])",
          "legendFormat": "Transmit"
        }
      ]
    }
  ],
  "time": {"from": "now-6h", "to": "now"},
  "refresh": "30s"
}
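
Each panel's `gridPos` places it on Grafana's 24-column dashboard grid: `w` and `h` give the size, and `x` and `y` give the top-left corner. As a rough sketch of how those coordinates could be computed rather than typed by hand, the hypothetical helper `layout_panels` below fills rows left to right and wraps when a panel no longer fits. The helper and its defaults are assumptions for illustration, not a Grafana API.

```python
GRID_COLUMNS = 24  # Grafana's dashboard grid is 24 columns wide


def layout_panels(titles, width=12, height=8):
    """Assign gridPos coordinates, filling each row left to right."""
    panels = []
    x = y = 0
    for title in titles:
        # Wrap to the next row when the panel would overflow the grid
        if x + width > GRID_COLUMNS:
            x = 0
            y += height
        panels.append(
            {"title": title, "gridPos": {"h": height, "w": width, "x": x, "y": y}}
        )
        x += width
    return panels


for panel in layout_panels(["CPU Usage", "Memory Usage", "Disk Space", "Network Traffic"]):
    print(panel["title"], panel["gridPos"])
```

With the default width of 12, the four panels end up in a two-by-two arrangement.
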
# CPU usage (percent)
100 - (avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
# Memory usage (percent)
(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100
# Disk usage (percent)
(1 - (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"})) * 100
# Network receive throughput (bytes per second)
rate(node_network_receive_bytes_total[5m])
# Network transmit throughput (bytes per second)
rate(node_network_transmit_bytes_total[5m])
# Pod CPU usage (CPU cores, per pod)
sum(rate(container_cpu_usage_seconds_total{container!=""}[5m])) by (pod)
# Pod memory usage (working-set bytes, per pod)
sum(container_memory_working_set_bytes{container!=""}) by (pod)
# QPS (requests per second)
sum(rate(http_requests_total[1m]))
# Error rate (fraction of requests returning 5xx)
sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m]))
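
Most of the queries above rely on `rate()`, which turns an ever-increasing counter into a per-second value. The sketch below gives a rough idea of that calculation on hand-made samples. It is a simplification: `simple_rate` is a hypothetical helper that handles counter resets but skips the extrapolation to window boundaries that Prometheus performs, so its results will not match Prometheus exactly.

```python
def simple_rate(samples):
    """Per-second increase of a counter over (timestamp, value) samples."""
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A drop in value means the counter was reset, so count up from zero
        increase += (cur - prev) if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


# Made-up request counter scraped every 15 s, with a reset before t=45
total = [(0, 100), (15, 130), (30, 160), (45, 30), (60, 60)]
errors = [(0, 0), (60, 6)]

total_rate = simple_rate(total)
error_rate = simple_rate(errors)
print(total_rate)               # requests per second
print(error_rate / total_rate)  # error fraction, as in the last query above
```

The final division mirrors the error-rate query: the rate of failing requests divided by the rate of all requests.
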
# Alertmanager endpoint (Prometheus side, prometheus.yml)
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager:9093
# Alert contact points (Alertmanager receivers, alertmanager.yml)
receivers:
  - name: email
    email_configs:
      - to: admin@example.com
        from: alert@example.com
        smarthost: smtp.example.com:587
        auth_username: alert@example.com
        auth_password: password
  - name: webhook
    webhook_configs:
      - url: http://webhook.example.com/alert
        send_resolved: true
  - name: slack
    slack_configs:
      - api_url: https://hooks.slack.com/services/xxx/yyy/zzz
        channel: "#alerts"  # quoted, otherwise YAML reads "#alerts" as a comment
        text: "{{ .CommonAnnotations.description }}"
# Alert routing (alertmanager.yml)
route:
  group_by: ["alertname", "cluster"]
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  receiver: email
  routes:
    - match:
        severity: critical
      receiver: webhook
    - match:
        severity: warning
      receiver: email
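
The routing tree above can be read as: try each child route in order, use the first one whose `match` labels all equal the alert's labels, and fall back to the top-level receiver otherwise. The sketch below models just that rule in plain Python. `pick_receiver` is a hypothetical helper; it deliberately ignores grouping, timing, regex matchers and the `continue` option, so treat it as a reading aid rather than a description of Alertmanager's full behavior.

```python
def pick_receiver(route, labels):
    """Return the receiver name a simplified routing tree selects for an alert."""
    for child in route.get("routes", []):
        match = child.get("match", {})
        if all(labels.get(key) == value for key, value in match.items()):
            # A child without its own receiver inherits the parent's
            inherited = {"receiver": route["receiver"], **child}
            return pick_receiver(inherited, labels)
    # No child matched: the current route's receiver handles the alert
    return route["receiver"]


# The route section from the configuration above, as a Python dict
route = {
    "receiver": "email",
    "routes": [
        {"match": {"severity": "critical"}, "receiver": "webhook"},
        {"match": {"severity": "warning"}, "receiver": "email"},
    ],
}

print(pick_receiver(route, {"alertname": "HighCPU", "severity": "critical"}))
print(pick_receiver(route, {"alertname": "DiskFull", "severity": "info"}))
```

Under this model a critical alert goes to `webhook`, while an alert with an unlisted severity falls through to the default `email` receiver.
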
# Variable definitions
{
  "templating": {
    "list": [
      {
        "name": "instance",
        "type": "query",
        "query": "label_values(node_boot_time_seconds, instance)",
        "includeAll": true
      },
      {
        "name": "job",
        "type": "query",
        "query": "label_values(node_boot_time_seconds, job)"
      },
      {
        "name": "interval",
        "type": "interval",
        "query": "1m,5m,15m,1h",
        "auto": true
      }
    ]
  }
}
# Using variables in a query
100 - (avg(rate(node_cpu_seconds_total{mode="idle", instance=~"$instance"}[$interval])) * 100)
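
To make the substitution step concrete, the sketch below imitates how `$instance` and `$interval` could be filled in before the query is sent to Prometheus. It is an assumption-laden simplification: `interpolate` is a hypothetical helper, multi-value selections are joined into a `(a|b)` alternation so they work with the `=~` matcher, and the regex escaping that Grafana applies to values is omitted.

```python
import re


def interpolate(query, variables):
    """Replace $name placeholders with values from a dict of variables."""
    def replace(match):
        name = match.group(1)
        if name not in variables:
            return match.group(0)  # leave unknown variables untouched
        value = variables[name]
        if isinstance(value, (list, tuple)):
            # Multi-value selection becomes a regex alternation
            return "(" + "|".join(value) + ")"
        return str(value)

    return re.sub(r"\$(\w+)", replace, query)


query = 'rate(node_cpu_seconds_total{mode="idle", instance=~"$instance"}[$interval])'
selected = {"instance": ["node1:9100", "node2:9100"], "interval": "5m"}
print(interpolate(query, selected))
```

The host names `node1:9100` and `node2:9100` are made up for the example; in a real dashboard they would come from the `label_values` query.
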
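| Aspect | Grafana | Kibana | Tableau | Looker Studio (formerly Data Studio) |
|---|---|---|---|---|
| Data sources | 40+ | Elasticsearch | 60+ | Google ecosystem |
| Real-time monitoring | ✅ Native | ✅ | ❌ | ✅ |
| Alerting | ✅ Native | ✅ | ❌ | ❌ |
| Open source | ✅ | ✅ | ❌ | ❌ |
| Typical use | Monitoring / operations | Log analysis | BI reporting | Marketing analytics |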
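Grafana installation, data source configuration, and basic dashboard creation
PromQL queries, choosing chart types, and dashboard design
Alert configuration, variables, templated dashboards, and plugin development
Production monitoring, business metric dashboards, and multi-data-source integration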
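Grafana is a master of the art of data visualization.
With unified data sources, rich charts, and flexible dashboards, it turns large volumes of data into an intuitive visual language. Grafana + Prometheus is a go-to combination in the observability space.
"Grafana makes data beautiful and useful." 📊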