feat: improve ai statistic plugin (#2671)

2026-06-09 20:57:32 +08:00 · 2025-08-11 13:43:00 +08:00
parent f7813df1d7
commit a76808171f
3 changed files with 294 additions and 51 deletions
--- a/plugins/wasm-go/extensions/ai-statistics/README.md
+++ b/plugins/wasm-go/extensions/ai-statistics/README.md
@@ -5,6 +5,7 @@ description: AI可观测配置参考
 ---
 ## 介绍
 提供 AI 可观测基础能力，包括 metric, log, trace，其后需接 ai-proxy 插件，如果不接 ai-proxy 插件的话，则需要用户进行相应配置才可生效。
 ## 运行属性
@@ -13,6 +14,7 @@ description: AI可观测配置参考
 插件执行优先级：`200`
 ## 配置说明
 插件默认请求符合 openai 协议格式，并提供了以下基础可观测值，用户无需特殊配置：
 - metric：提供了输入 token、输出 token、首个 token 的 rt（流式请求）、请求总 rt 等指标，支持在网关、路由、服务、模型四个维度上进行观测
@@ -25,11 +27,13 @@ description: AI可观测配置参考
 | `attributes` | []Attribute | 非必填  | -   | 用户希望记录在log/span中的信息 |
 | `disable_openai_usage` | bool | 非必填  | false   | 非openai兼容协议时，model、token的支持非标，配置为true时可以避免报错 |
 | `value_length_limit` | int | 非必填  | 4000   | 记录的单个value的长度限制 |
 | `enable_path_suffixes` | []string    | 非必填   | []     | 只对这些特定路径后缀的请求生效，可以配置为 "\*" 以匹配所有路径（通配符检查会优先进行以提高性能）。如果为空数组，则对所有路径生效 |
 | `enable_content_types` | []string    | 非必填   | []     | 只对这些内容类型的响应进行缓冲处理。如果为空数组，则对所有内容类型生效                                                           |
 Attribute 配置说明:
 | 名称                    | 数据类型 | 填写要求 | 默认值 | 描述                                                                                                                                        |
-|----------------|-------|-----|-----|------------------------|
+| ----------------------- | -------- | -------- | ------ | ------------------------------------------------------------------------------------------------------------------------------------------- |
 | `key`                   | string   | 必填     | -      | attribute 名称                                                                                                                              |
 | `value_source`          | string   | 必填     | -      | attribute 取值来源，可选值为 `fixed_value`, `request_header`, `request_body`, `response_header`, `response_body`, `response_streaming_body` |
 | `value`                 | string   | 必填     | -      | attribute 取值 key value/path                                                                                                               |
@@ -49,7 +53,6 @@ Attribute 配置说明:
 - `response_body` ：attribute 值通过响应 body 获取，value 配置格式为 gjson 的 jsonpath
 - `response_streaming_body` ：attribute 值通过流式响应 body 获取，value 配置格式为 gjson 的 jsonpath
 当 `value_source` 为 `response_streaming_body` 时，应当配置 `rule`，用于指定如何从流式 body 中获取指定值，取值含义如下：
 - `first`：多个 chunk 中取第一个有效 chunk 的值
@@ -57,6 +60,7 @@ Attribute 配置说明:
 - `append`：拼接多个有效 chunk 中的值，可用于获取回答内容
 ## 配置示例
 如果希望在网关访问日志中记录 ai-statistic 相关的统计值，需要修改 log_format，在原 log_format 基础上添加一个新字段，示例如下：
 ```yaml
@@ -64,6 +68,7 @@ Attribute 配置说明:
 ```
 如果字段设置了 `as_separate_log_field`，例如：
 ```yaml
 attributes:
  - key: consumer
@@ -74,11 +79,13 @@ attributes:
 ```
 那么要在日志中打印，需要额外设置 log_format：
 ```
 '{"consumer":"%FILTER_STATE(wasm.consumer:PLAIN)%"}'
 ```
 ### 空配置
 #### 监控
 ```
@@ -120,6 +127,7 @@ irate(route_upstream_model_consumer_metric_llm_duration_count[2m])
 ```
 #### 日志
 ```json
 {
  "ai_log": "{\"model\":\"qwen-turbo\",\"input_token\":\"10\",\"output_token\":\"69\",\"llm_first_token_duration\":\"309\",\"llm_service_duration\":\"1955\"}"
@@ -127,9 +135,11 @@ irate(route_upstream_model_consumer_metric_llm_duration_count[2m])
 ```
 #### 链路追踪
 配置为空时，不会在 span 中添加额外的 attribute
 ### 从非 openai 协议提取 token 使用信息
 在 ai-proxy 中设置协议为 original 时，以百炼为例，可作如下配置指定如何提取 model, input_token, output_token
 ```yaml
@@ -150,6 +160,7 @@ attributes:
    apply_to_log: true
    apply_to_span: false
 ```
 #### 监控
 ```
@@ -160,7 +171,9 @@ route_upstream_model_consumer_metric_llm_duration_count{ai_route="bailian",ai_cl
 ```
 #### 日志
 此配置下日志效果如下：
 ```json
 {
  "ai_log": "{\"model\":\"qwen-max\",\"input_token\":\"343\",\"output_token\":\"153\",\"llm_service_duration\":\"19110\"}"
@@ -168,10 +181,13 @@ route_upstream_model_consumer_metric_llm_duration_count{ai_route="bailian",ai_cl
 ```
 #### 链路追踪
 链路追踪的 span 中可以看到 model, input_token, output_token 三个额外的 attribute
 ### 配合认证鉴权记录 consumer
 举例如下：
 ```yaml
 attributes:
  - key: consumer # 配合认证鉴权记录consumer
@@ -181,6 +197,7 @@ attributes:
 ```
 ### 记录问题与回答
 ```yaml
 attributes:
  - key: question # 记录问题
@@ -199,11 +216,12 @@ attributes:
 ```
 ## 进阶
 配合阿里云 SLS 数据加工，可以将 ai 相关的字段进行提取加工，例如原始日志为：
-```
+````
 ai_log:{"question":"用python计算2的3次方","answer":"你可以使用 Python 的乘方运算符 `**` 来计算一个数的次方。计算2的3次方，即2乘以自己2次，可以用以下代码表示：\n\n```python\nresult = 2 ** 3\nprint(result)\n```\n\n运行这段代码，你会得到输出结果为8，因为2乘以自己两次等于8。","model":"qwen-max","input_token":"16","output_token":"76","llm_service_duration":"5913"}
-```
+````
 使用如下数据加工脚本，可以提取出 question 和 answer：
@@ -215,7 +233,7 @@ e_set("answer", json_select(v("json"), "answer", default="-"))
 提取后，SLS 中会添加 question 和 answer 两个字段，示例如下：
-```
+````
 ai_log:{"question":"用python计算2的3次方","answer":"你可以使用 Python 的乘方运算符 `**` 来计算一个数的次方。计算2的3次方，即2乘以自己2次，可以用以下代码表示：\n\n```python\nresult = 2 ** 3\nprint(result)\n```\n\n运行这段代码，你会得到输出结果为8，因为2乘以自己两次等于8。","model":"qwen-max","input_token":"16","output_token":"76","llm_service_duration":"5913"}
 question:用python计算2的3次方
@@ -227,4 +245,57 @@ print(result)
 运行这段代码，你会得到输出结果为8，因为2乘以自己两次等于8。
 ````
 ### 路径和内容类型过滤配置示例
 #### 只处理特定 AI 路径
 ```yaml
 enable_path_suffixes:
  - "/v1/chat/completions"
  - "/v1/embeddings"
  - "/generateContent"
 ```
 #### 只处理特定内容类型
 ```yaml
 enable_content_types:
  - "text/event-stream"
  - "application/json"
 ```
 #### 处理所有路径（通配符）
 ```yaml
 enable_path_suffixes:
  - "*"
 ```
 #### 处理所有内容类型（空数组）
 ```yaml
 enable_content_types: []
 ```
 #### 完整配置示例
 ```yaml
 enable_path_suffixes:
  - "/v1/chat/completions"
  - "/v1/embeddings"
  - "/generateContent"
 enable_content_types:
  - "text/event-stream"
  - "application/json"
 attributes:
  - key: model
    value_source: request_body
    value: model
    apply_to_log: true
  - key: consumer
    value_source: request_header
    value: x-mse-consumer
    apply_to_log: true
 ```
--- a/plugins/wasm-go/extensions/ai-statistics/README_EN.md
+++ b/plugins/wasm-go/extensions/ai-statistics/README_EN.md
@@ -5,6 +5,7 @@ description: AI Statistics plugin configuration reference
 ---
 ## Introduction
 Provides basic AI observability capabilities, including metrics, logs, and traces. It should be followed by the ai-proxy plugin; without ai-proxy, users must add the corresponding configuration themselves for the plugin to take effect.
 ## Runtime Properties
@@ -13,6 +14,7 @@ Plugin Phase: `CUSTOM`
 Plugin Priority: `200`
 ## Configuration instructions
 By default, the plugin assumes that requests follow the OpenAI protocol format and provides the following basic observability values without any special configuration:
 - metric: provides input tokens, output tokens, response time of the first token (streaming requests), total request response time, and similar metrics, observable across four dimensions: gateway, route, service, and model.
@@ -25,12 +27,13 @@ Users can also expand observable values through configuration:
 | `attributes` | []Attribute | optional  | -   | Information that the user wants to record in log/span |
 | `disable_openai_usage` | bool | optional  | false   | When using a non-OpenAI-compatible protocol, the support for model and token is non-standard. Setting the configuration to true can prevent errors. |
 | `value_length_limit` | int | optional  | 4000   | length limit for each value |
-
+| `enable_path_suffixes`   | []string    | optional | []      | Only takes effect for requests whose path ends with one of these suffixes; can be configured as "\*" to match all paths. If the array is empty, the plugin applies to all paths |
 | `enable_content_types` | []string    | optional | []      | Only buffer the response body for these content types. If the array is empty, all content types are processed |
 Attribute Configuration instructions:
 | Name                    | Type   | Required | Default | Description                                                                                                                                                  |
-|----------------|-------|-----|-----|------------------------|
+| ----------------------- | ------ | -------- | ------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------ |
 | `key`                   | string | required | -       | attribute key                                                                                                                                                |
 | `value_source`          | string | required | -       | attribute value source, optional values are `fixed_value`, `request_header`, `request_body`, `response_header`, `response_body`, `response_streaming_body` |
 | `value`                 | string | required | -       | how to get attribute value                                                                                                                                   |
@@ -50,7 +53,6 @@ The meanings of various values for `value_source` are as follows:
 - `response_body`: The attribute is obtained through the http response body
 - `response_streaming_body`: The attribute is obtained through the http streaming response body
 When `value_source` is `response_streaming_body`, `rule` should be configured to specify how to obtain the specified value from the streaming body. The meaning of the value is as follows:
 - `first`: extract value from the first valid chunk
@@ -58,6 +60,7 @@ When `value_source` is `response_streaming_body`, `rule` should be configured to
 - `append`: join value pieces from all valid chunks
 ## Configuration example
 If you want to record ai-statistic related statistical values in the gateway access log, you need to modify log_format and add a new field based on the original log_format. The example is as follows:
 ```yaml
@@ -65,6 +68,7 @@ If you want to record ai-statistic related statistical values in the gateway acc
 ```
 If the field is set with `as_separate_log_field`, for example:
 ```yaml
 attributes:
  - key: consumer
@@ -75,11 +79,13 @@ attributes:
 ```
 Then to print in the log, you need to set log_format additionally:
 ```
 '{"consumer":"%FILTER_STATE(wasm.consumer:PLAIN)%"}'
 ```
 ### Empty
 #### Metric
 ```
@@ -121,6 +127,7 @@ irate(route_upstream_model_consumer_metric_llm_duration_count[2m])
 ```
 #### Log
 ```json
 {
  "ai_log": "{\"model\":\"qwen-turbo\",\"input_token\":\"10\",\"output_token\":\"69\",\"llm_first_token_duration\":\"309\",\"llm_service_duration\":\"1955\"}"
@@ -128,9 +135,11 @@ irate(route_upstream_model_consumer_metric_llm_duration_count[2m])
 ```
 #### Trace
 When the configuration is empty, no additional attributes will be added to the span.
 ### Extract token usage information from non-openai protocols
 When setting the protocol to original in ai-proxy, taking Alibaba Cloud Bailian as an example, you can make the following configuration to specify how to extract `model`, `input_token`, `output_token`
 ```yaml
@@ -151,6 +160,7 @@ attributes:
    apply_to_log: true
    apply_to_span: false
 ```
 #### Metric
 ```
@@ -161,6 +171,7 @@ route_upstream_model_consumer_metric_llm_duration_count{ai_route="bailian",ai_cl
 ```
 #### Log
 ```json
 {
  "ai_log": "{\"model\":\"qwen-max\",\"input_token\":\"343\",\"output_token\":\"153\",\"llm_service_duration\":\"19110\"}"
@@ -168,9 +179,11 @@ route_upstream_model_consumer_metric_llm_duration_count{ai_route="bailian",ai_cl
 ```
 #### Trace
 Three additional attributes `model`, `input_token`, and `output_token` can be seen in the trace spans.
 ### Record the consumer together with authentication and authorization
 ```yaml
 attributes:
  - key: consumer
@@ -180,6 +193,7 @@ attributes:
 ```
 ### Record questions and answers
 ```yaml
 attributes:
  - key: question
@@ -196,3 +210,50 @@ attributes:
    value: choices.0.message.content
    apply_to_log: true
 ```
 ### Path and Content Type Filtering Configuration Examples
 #### Process Only Specific AI Paths
 ```yaml
 enable_path_suffixes:
  - "/v1/chat/completions"
  - "/v1/embeddings"
  - "/generateContent"
 ```
 #### Process Only Specific Content Types
 ```yaml
 enable_content_types:
  - "text/event-stream"
  - "application/json"
 ```
 #### Process All Paths (Wildcard)
 ```yaml
 enable_path_suffixes:
  - "*"
 ```
 #### Complete Configuration Example
 ```yaml
 enable_path_suffixes:
  - "/v1/chat/completions"
  - "/v1/embeddings"
  - "/generateContent"
 enable_content_types:
  - "text/event-stream"
  - "application/json"
 attributes:
  - key: model
    value_source: request_body
    value: model
    apply_to_log: true
  - key: consumer
    value_source: request_header
    value: x-mse-consumer
    apply_to_log: true
 ```
--- a/plugins/wasm-go/extensions/ai-statistics/main.go
+++ b/plugins/wasm-go/extensions/ai-statistics/main.go
@@ -44,6 +44,15 @@ const (
 	APIName                    = "api"
 	ConsumerKey                = "x-mse-consumer"
 	RequestPath                = "request_path"
 	SkipProcessing             = "skip_processing"
 	// AI API Paths
 	PathOpenAIChatCompletions       = "/v1/chat/completions"
 	PathOpenAICompletions           = "/v1/completions"
 	PathOpenAIEmbeddings            = "/v1/embeddings"
 	PathOpenAIModels                = "/v1/models"
 	PathGeminiGenerateContent       = "/generateContent"
 	PathGeminiStreamGenerateContent = "/streamGenerateContent"
 	// Source Type
 	FixedValue            = "fixed_value"
@@ -100,6 +109,10 @@ type AIStatisticsConfig struct {
 	// If disableOpenaiUsage is true, model/input_token/output_token logs will be skipped
 	disableOpenaiUsage bool
 	valueLengthLimit   int
 	// Path suffixes to enable the plugin on
 	enablePathSuffixes []string
 	// Content types to enable response body buffering
 	enableContentTypes []string
 }
 func generateMetricName(route, cluster, model, consumer, metricName string) string {
@@ -147,6 +160,41 @@ func (config *AIStatisticsConfig) incrementCounter(metricName string, inc uint64
 	counter.Increment(inc)
 }
 // isPathEnabled checks if the request path matches any of the enabled path suffixes
 func isPathEnabled(requestPath string, enabledSuffixes []string) bool {
 	if len(enabledSuffixes) == 0 {
 		return true // If no path suffixes configured, enable for all
 	}
 	// Remove query parameters from path
 	pathWithoutQuery := requestPath
 	if queryPos := strings.Index(requestPath, "?"); queryPos != -1 {
 		pathWithoutQuery = requestPath[:queryPos]
 	}
 	// Check if path ends with any enabled suffix
 	for _, suffix := range enabledSuffixes {
 		if strings.HasSuffix(pathWithoutQuery, suffix) {
 			return true
 		}
 	}
 	return false
 }
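
To make the matching rules of `isPathEnabled` easier to review, here is a minimal standalone sketch that copies the helper from this hunk and runs it against a few inputs. The sample paths are hypothetical and chosen only for illustration; in particular, the colon-style path is an assumed input, not something this commit claims the gateway will receive.

```go
package main

import (
	"fmt"
	"strings"
)

// isPathEnabled is copied from the diff: an empty suffix list enables every
// path, and the query string is stripped before the suffix comparison.
func isPathEnabled(requestPath string, enabledSuffixes []string) bool {
	if len(enabledSuffixes) == 0 {
		return true
	}
	path := requestPath
	if i := strings.Index(requestPath, "?"); i != -1 {
		path = requestPath[:i]
	}
	for _, suffix := range enabledSuffixes {
		if strings.HasSuffix(path, suffix) {
			return true
		}
	}
	return false
}

func main() {
	suffixes := []string{"/v1/chat/completions", "/generateContent"}
	paths := []string{
		"/api/v1/chat/completions?stream=true", // matches: query string is ignored
		"/v1/models",                           // no configured suffix matches
		"/v1beta/models/demo:generateContent",  // ends with ":generateContent", not "/generateContent"
	}
	for _, p := range paths {
		fmt.Printf("%s -> %v\n", p, isPathEnabled(p, suffixes))
	}
}
```

Because the check is a plain `strings.HasSuffix`, a configured suffix has to match the tail of the path character for character, including the leading `/`.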
 // isContentTypeEnabled checks if the content type matches any of the enabled content types
 func isContentTypeEnabled(contentType string, enabledContentTypes []string) bool {
 	if len(enabledContentTypes) == 0 {
 		return true // If no content types configured, enable for all
 	}
 	for _, enabledType := range enabledContentTypes {
 		if strings.Contains(contentType, enabledType) {
 			return true
 		}
 	}
 	return false
 }
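
The content-type filter is a substring check, which has a few consequences worth seeing in isolation. The sketch below copies `isContentTypeEnabled` from this hunk; the header values passed to it are made-up samples, not values taken from the plugin's tests.

```go
package main

import (
	"fmt"
	"strings"
)

// isContentTypeEnabled is copied from the diff: an empty list enables every
// content type, otherwise any configured entry must appear as a substring.
func isContentTypeEnabled(contentType string, enabledContentTypes []string) bool {
	if len(enabledContentTypes) == 0 {
		return true
	}
	for _, enabledType := range enabledContentTypes {
		if strings.Contains(contentType, enabledType) {
			return true
		}
	}
	return false
}

func main() {
	enabled := []string{"text/event-stream", "application/json"}
	samples := []string{
		"application/json; charset=utf-8", // matches: parameters after the media type are tolerated
		"Application/JSON",                // no match: the comparison is case-sensitive
		"text/html",                       // no match
		"",                                // no match: a missing header is filtered out when the list is non-empty
	}
	for _, ct := range samples {
		fmt.Printf("%q -> %v\n", ct, isContentTypeEnabled(ct, enabled))
	}
}
```

If upstreams may send mixed-case media types, lowercasing both sides before comparing would be one possible adjustment; that is a suggestion, not something this commit does.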
 func parseConfig(configJson gjson.Result, config *AIStatisticsConfig) error {
 	// Parse tracing span attributes setting.
 	attributeConfigs := configJson.Get("attributes").Array()
@@ -177,10 +225,49 @@ func parseConfig(configJson gjson.Result, config *AIStatisticsConfig) error {
 	// Parse openai usage config setting.
 	config.disableOpenaiUsage = configJson.Get("disable_openai_usage").Bool()
 	// Parse path suffix configuration
 	pathSuffixes := configJson.Get("enable_path_suffixes").Array()
 	config.enablePathSuffixes = make([]string, 0, len(pathSuffixes))
 	for _, suffix := range pathSuffixes {
 		suffixStr := suffix.String()
 		if suffixStr == "*" {
 			// Clear the suffixes list since * means all paths are enabled
 			config.enablePathSuffixes = make([]string, 0)
 			break
 		}
 		config.enablePathSuffixes = append(config.enablePathSuffixes, suffixStr)
 	}
 	// Parse content type configuration
 	contentTypes := configJson.Get("enable_content_types").Array()
 	config.enableContentTypes = make([]string, 0, len(contentTypes))
 	for _, contentType := range contentTypes {
 		contentTypeStr := contentType.String()
 		if contentTypeStr == "*" {
 			// Clear the content types list since * means all content types are enabled
 			config.enableContentTypes = make([]string, 0)
 			break
 		}
 		config.enableContentTypes = append(config.enableContentTypes, contentTypeStr)
 	}
 	return nil
 }
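
The two parsing loops above share one pattern: a `"*"` entry anywhere in the list discards everything collected so far and leaves an empty slice, which the matching helpers then treat as "enable all". A minimal sketch of that normalization follows, using a hypothetical helper `normalizeFilter` that takes plain strings instead of `gjson.Result` values so it runs with the standard library only.

```go
package main

import "fmt"

// normalizeFilter is a hypothetical stand-in for the loops in parseConfig.
// It returns the configured values unchanged unless one of them is "*",
// in which case it returns an empty slice (meaning "no restriction").
func normalizeFilter(values []string) []string {
	out := make([]string, 0, len(values))
	for _, v := range values {
		if v == "*" {
			// Wildcard wins: drop anything gathered before it.
			return []string{}
		}
		out = append(out, v)
	}
	return out
}

func main() {
	fmt.Println(len(normalizeFilter([]string{"/v1/embeddings", "*"}))) // wildcard clears the list
	fmt.Println(normalizeFilter([]string{"/v1/embeddings"}))           // explicit entries are kept
	fmt.Println(len(normalizeFilter(nil)))                             // key not configured: also empty
}
```

One consequence of this design is that an unset key, an empty array, and a list containing `"*"` all normalize to the same empty slice and therefore behave identically at request time.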
 func onHttpRequestHeaders(ctx wrapper.HttpContext, config AIStatisticsConfig) types.Action {
 	// Check if request path matches enabled suffixes
 	requestPath, _ := proxywasm.GetHttpRequestHeader(":path")
 	if !isPathEnabled(requestPath, config.enablePathSuffixes) {
 		log.Debugf("ai-statistics: skipping request for path %s (not in enabled suffixes)", requestPath)
 		// Set skip processing flag and avoid reading request/response body
 		ctx.SetContext(SkipProcessing, true)
 		ctx.DontReadRequestBody()
 		ctx.DontReadResponseBody()
 		return types.ActionContinue
 	}
 	ctx.DisableReroute()
 	route, _ := getRouteName()
 	cluster, _ := getClusterName()
@@ -212,6 +299,11 @@ func onHttpRequestHeaders(ctx wrapper.HttpContext, config AIStatisticsConfig) ty
 }
 func onHttpRequestBody(ctx wrapper.HttpContext, config AIStatisticsConfig, body []byte) types.Action {
 	// Check if processing should be skipped
 	if ctx.GetBoolContext(SkipProcessing, false) {
 		return types.ActionContinue
 	}
 	// Set user defined log & span attributes.
 	setAttributeBySource(ctx, config, RequestBody, body)
 	// Set span attributes for ARMS.
@@ -254,6 +346,15 @@ func onHttpRequestBody(ctx wrapper.HttpContext, config AIStatisticsConfig, body
 func onHttpResponseHeaders(ctx wrapper.HttpContext, config AIStatisticsConfig) types.Action {
 	contentType, _ := proxywasm.GetHttpResponseHeader("content-type")
 	if !isContentTypeEnabled(contentType, config.enableContentTypes) {
 		log.Debugf("ai-statistics: skipping response for content type %s (not in enabled content types)", contentType)
 		// Set skip processing flag and avoid reading response body
 		ctx.SetContext(SkipProcessing, true)
 		ctx.DontReadResponseBody()
 		return types.ActionContinue
 	}
 	if !strings.Contains(contentType, "text/event-stream") {
 		ctx.BufferResponseBody()
 	}
@@ -265,6 +366,11 @@ func onHttpResponseHeaders(ctx wrapper.HttpContext, config AIStatisticsConfig) t
 }
 func onHttpStreamingBody(ctx wrapper.HttpContext, config AIStatisticsConfig, data []byte, endOfStream bool) []byte {
 	// Check if processing should be skipped
 	if ctx.GetBoolContext(SkipProcessing, false) {
 		return data
 	}
 	// Buffer stream body for record log & span attributes
 	if config.shouldBufferStreamingBody {
 		streamingBodyBuffer, ok := ctx.GetContext(CtxStreamingBodyBuffer).([]byte)
@@ -334,6 +440,11 @@ func onHttpStreamingBody(ctx wrapper.HttpContext, config AIStatisticsConfig, dat
 }
 func onHttpResponseBody(ctx wrapper.HttpContext, config AIStatisticsConfig, body []byte) types.Action {
 	// Check if processing should be skipped
 	if ctx.GetBoolContext(SkipProcessing, false) {
 		return types.ActionContinue
 	}
 	// Get requestStartTime from http context
 	requestStartTime, _ := ctx.GetContext(StatisticsRequestStartTime).(int64)