Grafana 开源了一款 eBPF 采集器 Beyla

news/2024/7/6 5:17:15 标签: ebpf

eBPF 的发展如火如荼,在可观测性领域大放异彩,Grafana 近期也发布了一款 eBPF 采集器,可以采集服务的 RED 指标,本文做一个尝鲜介绍,让读者有个大概了解。

eBPF 基础介绍可以参考我之前的文章《eBPF Hello world》。理论上,eBPF 可以拿到服务收到的请求信息,比如QPS、延迟、成功率等,这些数据对于应用级监控至关重要,Grafana Beyla 就是为此而生的。

要测试使用 Beyla 采集服务的 RED(Rate-Errors-Duration) 指标,那首先得有个服务,这里我用的是 answer( https://answer.flashcat.cloud ) 论坛,你也可以自己搞一个简单的 http 服务,比如:

package main

import (
    "net/http"
    "strconv"
    "time"
)

func handleRequest(rw http.ResponseWriter, req *http.Request) {
    status := 200
    for k, v := range req.URL.Query() {
        if len(v) == 0 {
            continue
        }
        switch k {
        case "status":
            if s, err := strconv.Atoi(v[0]); err == nil {
                status = s
            }
        case "delay":
            if d, err := time.ParseDuration(v[0]); err == nil {
                time.Sleep(d)
            }
        }
    }
    rw.WriteHeader(status)
}

func main() {
    http.ListenAndServe(":8080",
                 http.HandlerFunc(handleRequest))
}

上面这个代码,保存成 server.go,然后用 go run server.go 即可运行,当然,前提是你机器上有 go 开发环境。这个小服务,可以接收两个参数,一个是 status,用来指定返回的 http 状态码,另一个是 delay,用来指定延迟多久返回,比如:

curl -v "http://localhost:8080/foo?status=404"

上面的命令,会返回 404 状态码,如果想延迟 1 秒返回,可以这样:

curl -v "http://localhost:8080/foo?delay=1s"

接下来,我们就可以使用 Beyla 采集这个服务的 RED 指标了。

下载 Beyla

我的机器上有 go 开发环境,所以我直接使用 go install 安装了,你也可以去 Beyla 的 release 页面下载二进制包,然后解压缩使用。

go install github.com/grafana/beyla/cmd/beyla@latest
运行 Beyla

使用下面的命令运行 Beyla:

$ BEYLA_PROMETHEUS_PORT=8999 PRINT_TRACES=true OPEN_PORT=8080 sudo -E beyla

或者直接使用 root 账号运行,比如我是这么跑的:

$ BEYLA_PROMETHEUS_PORT=8999 PRINT_TRACES=true OPEN_PORT=8080 beyla

解释一下这几个参数:

  • BEYLA_PROMETHEUS_PORT: Beyla 要监听的端口,通过这个端口暴露 metrics 指标数据
  • PRINT_TRACES: 是否打印 trace 日志
  • OPEN_PORT: Beyla 采集的目标服务监听的端口,这里是 8080,上面给出的那段 go server 的代码就是监听在 8080,我的机器上 answer 论坛程序也是监听在 8080,你要监控的程序如果不是监听在 8080,可以在换成你自己的端口
查看指标

运行之后,可以通过 curl 查看指标:

curl http://localhost:8999/metrics

返回的内容如下:

# HELP http_client_duration_seconds duration of HTTP service calls from the client side, in seconds
# TYPE http_client_duration_seconds histogram
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0"} 0
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.005"} 0
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.01"} 0
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.025"} 0
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.05"} 0
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.075"} 0
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.1"} 0
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.25"} 0
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.5"} 0
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.75"} 0
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="1"} 0
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="2.5"} 1
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="5"} 1
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="7.5"} 1
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="10"} 1
http_client_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="+Inf"} 1
http_client_duration_seconds_sum{http_method="GET",http_status_code="200",service_name="answer"} 1.668771575
http_client_duration_seconds_count{http_method="GET",http_status_code="200",service_name="answer"} 1
# HELP http_client_request_size_bytes size, in bytes, of the HTTP request body as sent from the client side
# TYPE http_client_request_size_bytes histogram
http_client_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0"} 1
http_client_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="32"} 1
http_client_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="64"} 1
http_client_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="128"} 1
http_client_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="256"} 1
http_client_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="512"} 1
http_client_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="1024"} 1
http_client_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="2048"} 1
http_client_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="4096"} 1
http_client_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="8192"} 1
http_client_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="+Inf"} 1
http_client_request_size_bytes_sum{http_method="GET",http_status_code="200",service_name="answer"} 0
http_client_request_size_bytes_count{http_method="GET",http_status_code="200",service_name="answer"} 1
# HELP http_server_duration_seconds duration of HTTP service calls from the server side, in seconds
# TYPE http_server_duration_seconds histogram
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0"} 0
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.005"} 201
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.01"} 789
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.025"} 799
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.05"} 799
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.075"} 799
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.1"} 799
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.25"} 799
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.5"} 799
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0.75"} 799
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="1"} 799
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="2.5"} 800
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="5"} 800
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="7.5"} 800
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="10"} 800
http_server_duration_seconds_bucket{http_method="GET",http_status_code="200",service_name="answer",le="+Inf"} 800
http_server_duration_seconds_sum{http_method="GET",http_status_code="200",service_name="answer"} 5.752096697000003
http_server_duration_seconds_count{http_method="GET",http_status_code="200",service_name="answer"} 800
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="0"} 0
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="0.005"} 1
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="0.01"} 1
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="0.025"} 1
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="0.05"} 1
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="0.075"} 1
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="0.1"} 1
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="0.25"} 1
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="0.5"} 1
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="0.75"} 1
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="1"} 1
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="2.5"} 1
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="5"} 1
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="7.5"} 1
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="10"} 1
http_server_duration_seconds_bucket{http_method="GET",http_status_code="302",service_name="answer",le="+Inf"} 1
http_server_duration_seconds_sum{http_method="GET",http_status_code="302",service_name="answer"} 0.001523002
http_server_duration_seconds_count{http_method="GET",http_status_code="302",service_name="answer"} 1
# HELP http_server_request_size_bytes size, in bytes, of the HTTP request body as received at the server side
# TYPE http_server_request_size_bytes histogram
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="0"} 800
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="32"} 800
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="64"} 800
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="128"} 800
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="256"} 800
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="512"} 800
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="1024"} 800
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="2048"} 800
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="4096"} 800
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="8192"} 800
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="200",service_name="answer",le="+Inf"} 800
http_server_request_size_bytes_sum{http_method="GET",http_status_code="200",service_name="answer"} 0
http_server_request_size_bytes_count{http_method="GET",http_status_code="200",service_name="answer"} 800
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="302",service_name="answer",le="0"} 1
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="302",service_name="answer",le="32"} 1
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="302",service_name="answer",le="64"} 1
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="302",service_name="answer",le="128"} 1
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="302",service_name="answer",le="256"} 1
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="302",service_name="answer",le="512"} 1
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="302",service_name="answer",le="1024"} 1
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="302",service_name="answer",le="2048"} 1
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="302",service_name="answer",le="4096"} 1
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="302",service_name="answer",le="8192"} 1
http_server_request_size_bytes_bucket{http_method="GET",http_status_code="302",service_name="answer",le="+Inf"} 1
http_server_request_size_bytes_sum{http_method="GET",http_status_code="302",service_name="answer"} 0
http_server_request_size_bytes_count{http_method="GET",http_status_code="302",service_name="answer"} 1
# HELP promhttp_metric_handler_errors_total Total number of internal errors encountered by the promhttp metric handler.
# TYPE promhttp_metric_handler_errors_total counter
promhttp_metric_handler_errors_total{cause="encoding"} 0
promhttp_metric_handler_errors_total{cause="gathering"} 0

这些指标就可以用采集器来抓了,比如 vmagent、categraf、prometheus 等,完事之后入库,使用 Grafana 展示分析即可,经常关注本公众号的读者对于这些知识应该比较熟悉了,这里不再赘述。Beyla 默认提供了一个 Grafana Dashboard,可以导入测试:https://github.com/grafana/beyla/tree/main/grafana。

结语

Beyla 目前还不太稳定,还有很多功能没有完成。不过可以尝鲜研究了。可观测性整套技术栈搞起来还挺费劲的,如果您想建设这套技术栈,欢迎来和我们聊聊,我们提供这方面的咨询和商业产品,详情了解:

快猫星云 Flashcat | 为了无法度量的价值 | 开源监控 | 夜莺监控 | 可观测平台 | 运维监控 | IT监控快猫星云(官网),支持云原生监控、混合云监控、多云统一监控,解决云原生架构、混合云架构下统一监控难、故障定位慢的问题icon-default.png?t=N7T8https://flashcat.cloud/


http://www.niftyadmin.cn/n/5054007.html

相关文章

HZOJ-838:2020年数据结构41题

题目描述 定义三元组(a,b, c)(a,b,c 均为正数)的距离 D|a-b||b-c||c-a|。给定 3 个非空整数集合 S1, S2 ,S3, 按升序分别存储在 3 个数组中。请设计一个尽可能高效的算法,计算并输出所有可能的三元组(a, b…

Java线程池与ExecutorService教程:原理、应用与配置选项

多线程是Java中常用的技术,它可以提高程序的性能和并发处理能力。然而,直接使用Java中的Thread类创建和管理线程存在一些缺点,例如线程创建和销毁的开销很大,线程池的数量不易控制,易出现内存泄漏等问题。因此&#xf…

只需一个简单操作,保障企业电力供应安全性!

随着现代社会对电力供应的不断依赖,不间断电源(UPS)系统已经成为各种行业的关键基础设施之一。然而,UPS系统的性能监控和管理同样至关重要,以确保它们在需要时能够如期发挥作用。 客户案例 广东某公司是一家制造企业&a…

对比接口测试工具在自动化测试优缺点:Jmeter、Python、Postman

一、JMeter 总结:适合对代码不敏感的使用人员,不会代码也可以完成接口自动化,设计框架。适合紧急迭代的项目。 JMeter接口测试的优势 小巧轻量级,并且开源免费,社区接受度高,比较容易入门支持多协议&…

【机器学习 | 非线性拟合】梯度下降 vs SLSQP算法,谁更胜一筹? 解决六个数据点的非线性拟合难题,挑战非线性拟合问题

🤵‍♂️ 个人主页: AI_magician 📡主页地址: 作者简介:CSDN内容合伙人,全栈领域优质创作者。 👨‍💻景愿:旨在于能和更多的热爱计算机的伙伴一起成长!!&…

防雷接地施工方案和行业应用

防雷接地是指将建筑物或设备与地面电气连接,以防止雷电对其造成损坏或危害的措施。防雷接地的主要目的是将雷电流引入地下,消除雷电引起的危险电位差,保护人身和设备安全。 防雷接地的施工方案应根据建筑物或设备的特点、功能、位置、周围环…

新媒体运营的未来:ChatGPT的智能助手

💂 个人网站:【工具大全】【游戏大全】【神级源码资源网】🤟 前端学习课程:👉【28个案例趣学前端】【400个JS面试题】💅 寻找学习交流、摸鱼划水的小伙伴,请点击【摸鱼学习交流群】 新媒体运营是数字时代的…

微服务调用没有返回值,无法组成对象,但是会有feign的信息

事件起因 还是那个项目,至少对于我来说要学习的东西其实还是挺多的。 需求 员工信息管理,员工简历,导出功能,需要去联查员工的各项信息,其中,涉及到微服务的之间的操作出现了问题,目前主要的…