feat: improve ai statistic plugin (#2671)

This commit is contained in:
aias00
2025-08-11 13:43:00 +08:00
committed by GitHub
parent f7813df1d7
commit a76808171f
3 changed files with 294 additions and 51 deletions

View File

@@ -5,6 +5,7 @@ description: AI Statistics plugin configuration reference
---
## Introduction
Provides basic AI observability capabilities, including metric, log, and trace. The ai-proxy plug-in needs to be connected afterwards. If the ai-proxy plug-in is not connected, the user needs to configure it accordingly to take effect.
## Runtime Properties
@@ -13,6 +14,7 @@ Plugin Phase: `CUSTOM`
Plugin Priority: `200`
## Configuration instructions
The default request of the plug-in conforms to the openai protocol format and provides the following basic observable values. Users do not need special configuration:
- metric: It provides indicators such as input token, output token, rt of the first token (streaming request), total request rt, etc., and supports observation in the four dimensions of gateway, routing, service, and model.
@@ -25,21 +27,22 @@ Users can also expand observable values through configuration:
| `attributes` | []Attribute | optional | - | Information that the user wants to record in log/span |
| `disable_openai_usage` | bool | optional | false | When using a non-OpenAI-compatible protocol, the support for model and token is non-standard. Setting the configuration to true can prevent errors. |
| `value_length_limit` | int | optional | 4000 | length limit for each value |
| `enable_path_suffixes` | []string | optional | ["/v1/chat/completions","/v1/completions","/v1/embeddings","/v1/models","/generateContent","/streamGenerateContent"] | Only effective for requests with these specific path suffixes, can be configured as "\*" to match all paths |
| `enable_content_types` | []string | optional | ["text/event-stream","application/json"] | Only buffer response body for these content types |
Attribute Configuration instructions:
| Name | Type | Required | Default | Description |
|----------------|-------|-----|-----|------------------------|
| `key` | string | required | - | attribute key |
| `value_source` | string | required | - | attribute value source, optional values are `fixed_value`, `request_header`, `request_body`, `response_header`, `response_body`, `response_streaming_body` |
| `value` | string | required | - | how to get attribute value |
| `default_value` | string | optional | - | default value for attribute |
| `rule` | string | optional | - | Rule to extract attribute from streaming response, optional values are `first`, `replace`, `append`|
| `apply_to_log` | bool | optional | false | Whether to record the extracted information in the log |
| `apply_to_span` | bool | optional | false | Whether to record the extracted information in the link tracking span |
| `trace_span_key` | string | optional | - | span attribute key, default is the value of `key` |
| `as_separate_log_field` | bool | optional | false | Whether to use a separate log field, the field name is equal to the value of `key` |
| Name | Type | Required | Default | Description |
| ----------------------- | ------ | -------- | ------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `key` | string | required | - | attribute key |
| `value_source` | string | required | - | attribute value source, optional values are `fixed_value`, `request_header`, `request_body`, `response_header`, `response_body`, `response_streaming_body` |
| `value` | string | required | - | how to get attribute value |
| `default_value` | string | optional | - | default value for attribute |
| `rule` | string | optional | - | Rule to extract attribute from streaming response, optional values are `first`, `replace`, `append` |
| `apply_to_log` | bool | optional | false | Whether to record the extracted information in the log |
| `apply_to_span` | bool | optional | false | Whether to record the extracted information in the link tracking span |
| `trace_span_key` | string | optional | - | span attribute key, default is the value of `key` |
| `as_separate_log_field` | bool | optional | false | Whether to use a separate log field, the field name is equal to the value of `key` |
The meanings of various values for `value_source` are as follows:
@@ -50,7 +53,6 @@ The meanings of various values for `value_source` are as follows:
- `response_body`: The attribute is obtained through the http response body
- `response_streaming_body`: The attribute is obtained through the http streaming response body
When `value_source` is `response_streaming_body`, `rule` should be configured to specify how to obtain the specified value from the streaming body. The meaning of the value is as follows:
- `first`: extract value from the first valid chunk
@@ -58,6 +60,7 @@ When `value_source` is `response_streaming_body`, `rule` should be configured to
- `append`: join value pieces from all valid chunks
## Configuration example
If you want to record ai-statistic related statistical values in the gateway access log, you need to modify log_format and add a new field based on the original log_format. The example is as follows:
```yaml
@@ -65,6 +68,7 @@ If you want to record ai-statistic related statistical values in the gateway acc
```
If the field is set with `as_separate_log_field`, for example:
```yaml
attributes:
- key: consumer
@@ -75,11 +79,13 @@ attributes:
```
Then to print in the log, you need to set log_format additionally:
```
'{"consumer":"%FILTER_STATE(wasm.consumer:PLAIN)%"}'
```
### Empty
#### Metric
```
@@ -121,16 +127,19 @@ irate(route_upstream_model_consumer_metric_llm_duration_count[2m])
```
#### Log
```json
{
"ai_log":"{\"model\":\"qwen-turbo\",\"input_token\":\"10\",\"output_token\":\"69\",\"llm_first_token_duration\":\"309\",\"llm_service_duration\":\"1955\"}"
"ai_log": "{\"model\":\"qwen-turbo\",\"input_token\":\"10\",\"output_token\":\"69\",\"llm_first_token_duration\":\"309\",\"llm_service_duration\":\"1955\"}"
}
```
#### Trace
When the configuration is empty, no additional attributes will be added to the span.
### Extract token usage information from non-openai protocols
When setting the protocol to original in ai-proxy, taking Alibaba Cloud Bailian as an example, you can make the following configuration to specify how to extract `model`, `input_token`, `output_token`
```yaml
@@ -151,6 +160,7 @@ attributes:
apply_to_log: true
apply_to_span: false
```
#### Metric
```
@@ -161,6 +171,7 @@ route_upstream_model_consumer_metric_llm_duration_count{ai_route="bailian",ai_cl
```
#### Log
```json
{
"ai_log": "{\"model\":\"qwen-max\",\"input_token\":\"343\",\"output_token\":\"153\",\"llm_service_duration\":\"19110\"}"
@@ -168,9 +179,11 @@ route_upstream_model_consumer_metric_llm_duration_count{ai_route="bailian",ai_cl
```
#### Trace
Three additional attributes `model`, `input_token`, and `output_token` can be seen in the trace spans.
### Cooperate with authentication and authentication record consumer
```yaml
attributes:
- key: consumer
@@ -180,6 +193,7 @@ attributes:
```
### Record questions and answers
```yaml
attributes:
- key: question
@@ -195,4 +209,51 @@ attributes:
value_source: response_body
value: choices.0.message.content
apply_to_log: true
```
```
### Path and Content Type Filtering Configuration Examples
#### Process Only Specific AI Paths
```yaml
enable_path_suffixes:
- "/v1/chat/completions"
- "/v1/embeddings"
- "/generateContent"
```
#### Process Only Specific Content Types
```yaml
enable_content_types:
- "text/event-stream"
- "application/json"
```
#### Process All Paths (Wildcard)
```yaml
enable_path_suffixes:
- "*"
```
#### Complete Configuration Example
```yaml
enable_path_suffixes:
- "/v1/chat/completions"
- "/v1/embeddings"
- "/generateContent"
enable_content_types:
- "text/event-stream"
- "application/json"
attributes:
- key: model
value_source: request_body
value: model
apply_to_log: true
- key: consumer
value_source: request_header
value: x-mse-consumer
apply_to_log: true
```