Update ai statistics (#1303)

2026-06-09 04:37:31 +08:00 · 2024-09-24 19:42:10 +08:00
parent bef9139753
commit b82853c653
5 changed files with 679 additions and 223 deletions
--- a/plugins/wasm-go/extensions/ai-statistics/README.md
+++ b/plugins/wasm-go/extensions/ai-statistics/README.md
@@ -1,69 +1,178 @@
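-# Introduction
-Provides basic AI observability capabilities. It must be followed by the ai-proxy plugin; without the ai-proxy plugin, only the OpenAI protocol is supported.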
+---
+title: AI Observability
+keywords: [higress, AI, observability]
+description: AI observability configuration reference
+---

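-# Configuration
+## Introduction
+Provides basic AI observability capabilities, including metrics, logs, and traces. It should be followed by the ai-proxy plugin; without the ai-proxy plugin, the user must configure this plugin accordingly for it to take effect.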
+
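+## Runtime Properties
+
+Plugin execution phase: `default phase`
+Plugin execution priority: `200`
+
+## Configuration
+By default, the plugin assumes that requests follow the OpenAI protocol format and provides the following basic observability values without any special configuration:
+
+- metric: provides input tokens, output tokens, response time (RT) of the first token (streaming requests), total request RT, and other metrics, observable across four dimensions: gateway, route, service, and model
+- log: provides fields such as input_token, output_token, model, llm_service_duration, and llm_first_token_duration
+
+Users can also extend the observed values through configuration: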

 | Name           | Data type | Requirement | Default | Description            |
 |----------------|-------|------|-----|------------------------|
-| `enable`       | bool  | Required | -   | Whether to enable the AI statistics feature |
-| `tracing_span` | array | Optional | -   | Custom tracing span tag configuration |
+| `attributes` | []Attribute | Optional | -   | Information the user wants to record in the log/span |
+
+Attribute configuration:

-## tracing_span configuration
 | Name           | Data type | Requirement | Default | Description            |
 |----------------|-------|-----|-----|------------------------|
-| `key`         | string | Required | -   | Tracing tag name |
-| `value_source`        | string | Required | -   | Source of the tag value |
-| `value`      | string | Required | -   | Key, value, or path used to obtain the tag value |
+| `key`         | string | Required | -   | Attribute name |
+| `value_source` | string | Required | -   | Source of the attribute value; one of `fixed_value`, `request_header`, `request_body`, `response_header`, `response_body`, `response_streaming_body` |
+| `value`      | string | Required | -   | Key, value, or path used to obtain the attribute value |
+| `rule`      | string | Optional | -   | Rule for extracting the attribute from a streaming response; one of `first`, `replace`, `append` |
+| `apply_to_log`      | bool | Optional | false  | Whether to record the extracted information in the log |
+| `apply_to_span`      | bool | Optional | false  | Whether to record the extracted information in the tracing span |

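-value_source is the source of the tag value; there are 4 possible values:
- property : the tag value is obtained via proxywasm.GetProperty(); value is the key name that GetProperty() should extract
- requeset_header : the tag value is obtained from the HTTP request header; value is the header key
- request_body : the tag value is obtained from the request body; value uses GJSON path syntax
- response_header : the tag value is obtained from the HTTP response header; value is the header key
+The possible values of `value_source` have the following meanings: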
+
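+- `fixed_value`: a fixed value
+- `request_header`: the attribute value is obtained from the HTTP request header; `value` is the header key
+- `request_body`: the attribute value is obtained from the request body; `value` uses GJSON path syntax
+- `response_header`: the attribute value is obtained from the HTTP response header; `value` is the header key
+- `response_body`: the attribute value is obtained from the response body; `value` uses GJSON path syntax
+- `response_streaming_body`: the attribute value is obtained from the streaming response body; `value` uses GJSON path syntax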
+
+
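+When `value_source` is `response_streaming_body`, `rule` should be configured to specify how the value is obtained from the streaming body. Its values have the following meanings:
+
+- `first`: take the value from the first valid chunk among multiple chunks
+- `replace`: take the value from the last valid chunk among multiple chunks
+- `append`: concatenate the values from all valid chunks; this can be used to capture the answer content
+
+## Configuration Example
+To record ai-statistics values in the gateway access log, modify log_format by adding a new field to the existing log_format, for example: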

-For example:
 ```yaml
-tracing_label:
- key: "session_id"
-  value_source: "requeset_header"
-  value: "session_id"
- key: "user_content"
-  value_source: "request_body"
-  value: "input.messages.1.content"
+'{"ai_log":"%FILTER_STATE(wasm.ai_log:PLAIN)%"}'
 ```

-Example metrics after enabling:
+### Empty configuration
+#### Metrics
 ```
-route_upstream_model_input_token{ai_route="openai",ai_cluster="qwen",ai_model="qwen-max"} 21
-route_upstream_model_output_token{ai_route="openai",ai_cluster="qwen",ai_model="qwen-max"} 17
+route_upstream_model_metric_input_token{ai_route="llm",ai_cluster="outbound|443||qwen.dns",ai_model="qwen-turbo"} 10
+route_upstream_model_metric_llm_duration_count{ai_route="llm",ai_cluster="outbound|443||qwen.dns",ai_model="qwen-turbo"} 1
+route_upstream_model_metric_llm_first_token_duration{ai_route="llm",ai_cluster="outbound|443||qwen.dns",ai_model="qwen-turbo"} 309
+route_upstream_model_metric_llm_service_duration{ai_route="llm",ai_cluster="outbound|443||qwen.dns",ai_model="qwen-turbo"} 1955
+route_upstream_model_metric_output_token{ai_route="llm",ai_cluster="outbound|443||qwen.dns",ai_model="qwen-turbo"} 69
 ```

-Log example:
-
+#### Logs
 ```json
 {
-    "model": "qwen-max",
-    "input_token": "21",
-    "output_token": "17",
-    "authority": "dashscope.aliyuncs.com",
-    "bytes_received": "336",
-    "bytes_sent": "1675",
-    "duration": "1590",
-    "istio_policy_status": "-",
-    "method": "POST",
-    "path": "/v1/chat/completions",
-    "protocol": "HTTP/1.1",
-    "request_id": "5895f5a9-e4e3-425b-98db-6c6a926195b7",
-    "requested_server_name": "-",
-    "response_code": "200",
-    "response_flags": "-",
-    "route_name": "openai",
-    "start_time": "2024-06-18T09:37:14.078Z",
-    "trace_id": "-",
-    "upstream_cluster": "qwen",
-    "upstream_service_time": "496",
-    "upstream_transport_failure_reason": "-",
-    "user_agent": "PostmanRuntime/7.37.3",
-    "x_forwarded_for": "-"
+  "ai_log":"{\"model\":\"qwen-turbo\",\"input_token\":\"10\",\"output_token\":\"69\",\"llm_first_token_duration\":\"309\",\"llm_service_duration\":\"1955\"}"
 }
+```
+
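+#### Tracing
+With an empty configuration, no additional attributes are added to the span.
+
+### Extracting token usage from non-OpenAI protocols
+When the protocol is set to original in ai-proxy, the following configuration (using Bailian as an example) specifies how to extract model, input_token, and output_token: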
+
+```yaml
+attributes:
+  - key: model
+    value_source: response_body
+    value: usage.models.0.model_id
+    apply_to_log: true
+    apply_to_span: false
+  - key: input_token
+    value_source: response_body
+    value: usage.models.0.input_tokens
+    apply_to_log: true
+    apply_to_span: false
+  - key: output_token
+    value_source: response_body
+    value: usage.models.0.output_tokens
+    apply_to_log: true
+    apply_to_span: false
+```
+#### Metrics
+```
+route_upstream_model_metric_input_token{ai_route="bailian",ai_cluster="qwen",ai_model="qwen-max"} 343
+route_upstream_model_metric_output_token{ai_route="bailian",ai_cluster="qwen",ai_model="qwen-max"} 153
+route_upstream_model_metric_llm_service_duration{ai_route="bailian",ai_cluster="qwen",ai_model="qwen-max"} 3725
+route_upstream_model_metric_llm_duration_count{ai_route="bailian",ai_cluster="qwen",ai_model="qwen-max"} 1
+```
+
+#### Logs
+With this configuration, the log looks as follows:
+```json
+{
+  "ai_log": "{\"model\":\"qwen-max\",\"input_token\":\"343\",\"output_token\":\"153\",\"llm_service_duration\":\"19110\"}"  
+}
+```
+
+#### Tracing
+The tracing span shows three additional attributes: model, input_token, and output_token.
+
+### Recording the consumer together with authentication
+For example:
+```yaml
+attributes:
+  - key: consumer # record the consumer together with authentication
+    value_source: request_header
+    value: x-mse-consumer
+    apply_to_log: true
+```
+
+### Recording questions and answers
+```yaml
+attributes:
+  - key: question # record the question
+    value_source: request_body
+    value: messages.@reverse.0.content
+    apply_to_log: true
+  - key: answer   # extract the LLM answer from a streaming response
+    value_source: response_streaming_body
+    value: choices.0.delta.content
+    rule: append
+    apply_to_log: true
+  - key: answer   # extract the LLM answer from a non-streaming response
+    value_source: response_body
+    value: choices.0.message.content
+    apply_to_log: true
+```
+
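+## Advanced
+With Alibaba Cloud SLS data processing, AI-related fields can be extracted and transformed. For example, given the raw log:
+
+```
+ai_log:{"question":"Calculate 2 to the power of 3 in Python","answer":"You can use Python's exponentiation operator `**` to raise a number to a power. To calculate 2 to the power of 3, that is, 2 multiplied by itself twice, use the following code:\n\n```python\nresult = 2 ** 3\nprint(result)\n```\n\nRunning this code outputs 8, because 2 multiplied by itself twice equals 8.","model":"qwen-max","input_token":"16","output_token":"76","llm_service_duration":"5913"}
+```
+
+The following data processing script extracts question and answer:
+
+```
+e_regex("ai_log", grok("%{EXTRACTJSON}"))
+e_set("question", json_select(v("json"), "question", default="-"))
+e_set("answer", json_select(v("json"), "answer", default="-"))
+```
+
+After extraction, two fields, question and answer, are added in SLS, for example:
+
+```
+ai_log:{"question":"Calculate 2 to the power of 3 in Python","answer":"You can use Python's exponentiation operator `**` to raise a number to a power. To calculate 2 to the power of 3, that is, 2 multiplied by itself twice, use the following code:\n\n```python\nresult = 2 ** 3\nprint(result)\n```\n\nRunning this code outputs 8, because 2 multiplied by itself twice equals 8.","model":"qwen-max","input_token":"16","output_token":"76","llm_service_duration":"5913"}
+
+question:Calculate 2 to the power of 3 in Python
+
+answer:You can use Python's exponentiation operator `**` to raise a number to a power. To calculate 2 to the power of 3, that is, 2 multiplied by itself twice, use the following code:
+
+result = 2 ** 3
+print(result)
+
+Running this code outputs 8, because 2 multiplied by itself twice equals 8.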
+
 ```
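
As an illustration outside the diff: when the access log uses the `{"ai_log":"%FILTER_STATE(wasm.ai_log:PLAIN)%"}` format shown in the README, the `ai_log` value is a JSON document serialized into a string, so a consumer has to decode it twice. The following is a minimal Python sketch of the same question/answer extraction the SLS script performs. The function name `extract_ai_fields` and the sample log line are hypothetical, and falling back to `"-"` when the field is missing or not valid JSON is an assumption that mirrors the `default="-"` in the script above.

```python
import json


def extract_ai_fields(access_log_line, keys=("question", "answer"), default="-"):
    """Decode the ai_log string inside one JSON access-log line and pick fields.

    Hypothetical helper: it only mirrors the SLS script in the README and is
    not part of the plugin.
    """
    entry = json.loads(access_log_line)
    try:
        # ai_log is JSON embedded in a string, so it needs a second decode.
        ai_log = json.loads(entry.get("ai_log", "{}"))
    except (json.JSONDecodeError, TypeError):
        ai_log = {}
    if not isinstance(ai_log, dict):
        ai_log = {}
    return {key: ai_log.get(key, default) for key in keys}


# Hypothetical sample line, built with json.dumps so the inner escaping is correct.
inner = {
    "question": "Calculate 2 to the power of 3 in Python",
    "answer": "Use 2 ** 3.",
    "model": "qwen-max",
    "input_token": "16",
    "output_token": "76",
    "llm_service_duration": "5913",
}
sample_line = json.dumps({"ai_log": json.dumps(inner)})

fields = extract_ai_fields(sample_line)
print(fields["question"])
print(fields["answer"])
```

Passing a different `keys` tuple, for example `("model", "input_token")`, picks other fields from the same decoded object.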