feat(ai-statistics): support token details and builtin keys for reasoning_tokens/cached_tokens (#3424)

2026-04-21 20:17:29 +08:00 · 2026-02-01 11:54:52 +08:00
parent c0ab271370
commit 0c0ec53a50
4 changed files with 349 additions and 11 deletions
--- a/plugins/wasm-go/extensions/ai-statistics/README_EN.md
+++ b/plugins/wasm-go/extensions/ai-statistics/README_EN.md
@@ -60,6 +60,27 @@ When `value_source` is `response_streaming_body`, `rule` should be configured to
 - `replace`: extract value from the last valid chunk
 - `append`: join value pieces from all valid chunks

+### Built-in Attributes
+
+The plugin provides built-in attribute keys that work without configuring `value_source` or `value`. Each built-in key automatically extracts its value from the request or response:
+
+| Built-in Key | Description | Applies To |
+|--------------|-------------|----------|
+| `question` | User's question content | Supports OpenAI/Claude message formats |
+| `answer` | AI's answer content | Supports OpenAI/Claude message formats, both streaming and non-streaming |
+| `tool_calls` | Tool call information | OpenAI/Claude tool calls |
+| `reasoning` | Reasoning process | OpenAI o1 and other reasoning models |
+| `reasoning_tokens` | Number of reasoning tokens (e.g., o1 model) | OpenAI Chat Completions, extracted from `output_token_details.reasoning_tokens` |
+| `cached_tokens` | Number of cached tokens | OpenAI Chat Completions, extracted from `input_token_details.cached_tokens` |
+| `input_token_details` | Complete input token details (object) | OpenAI/Gemini/Anthropic, includes cache, tool usage, etc. |
+| `output_token_details` | Complete output token details (object) | OpenAI/Gemini/Anthropic, includes reasoning tokens, generated images, etc. |
+
+When using a built-in attribute, set only `key` and the output options such as `apply_to_log`; leave `value_source` and `value` unset.
+
+**Notes**:
+- `reasoning_tokens` and `cached_tokens` are convenience fields extracted from the token details; they apply to the OpenAI Chat Completions API
+- `input_token_details` and `output_token_details` record the complete token details object as a JSON string
+
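The relationship between the convenience keys and the details objects can be sketched as follows. This is an illustrative Python sketch, not the plugin's Go implementation (which this README hunk does not show). The function names `convenience_fields` and `details_as_log_value` are hypothetical, and the input shapes are assumed from the paths given in the table above (`input_token_details.cached_tokens`, `output_token_details.reasoning_tokens`).

```python
import json


def convenience_fields(input_details: dict, output_details: dict) -> dict:
    """Pick the two convenience values out of the token details objects.

    Hypothetical helper: a field is simply omitted when the corresponding
    details object does not carry it.
    """
    fields = {}
    if "cached_tokens" in input_details:
        fields["cached_tokens"] = input_details["cached_tokens"]
    if "reasoning_tokens" in output_details:
        fields["reasoning_tokens"] = output_details["reasoning_tokens"]
    return fields


def details_as_log_value(details: dict) -> str:
    """Serialize a details object to a compact JSON string, the form in
    which the notes above say the complete objects are recorded."""
    return json.dumps(details, separators=(",", ":"))
```

Under these assumptions, `convenience_fields({"cached_tokens": 80}, {"reasoning_tokens": 25})` yields both convenience values, while a response without token details yields an empty dict.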
 ## Configuration example

 If you want to record ai-statistic related statistical values in the gateway access log, you need to modify log_format and add a new field based on the original log_format. The example is as follows:
@@ -147,6 +168,45 @@ If the request contains a session ID header, the log will automatically include

 When the configuration is empty, no additional attributes will be added to the span.

+### Record Token Details
+
+Use built-in attributes to record token details for OpenAI Chat Completions:
+
+```yaml
+attributes:
+  # Use convenient built-in attributes to extract specific fields
+  - key: reasoning_tokens  # Reasoning tokens (o1 and other reasoning models)
+    apply_to_log: true
+  - key: cached_tokens  # Cached tokens from prompt caching
+    apply_to_log: true
+  # Record complete token details objects
+  - key: input_token_details
+    apply_to_log: true
+  - key: output_token_details
+    apply_to_log: true
+```
+
+#### Log Example
+
+For a request that uses prompt caching and a reasoning model, the log might look like:
+
+```json
+{
+  "ai_log": "{\"model\":\"gpt-4o\",\"input_token\":\"100\",\"output_token\":\"50\",\"reasoning_tokens\":\"25\",\"cached_tokens\":\"80\",\"input_token_details\":\"{\\\"cached_tokens\\\":80}\",\"output_token_details\":\"{\\\"reasoning_tokens\\\":25}\",\"llm_service_duration\":\"2000\"}"
+}
+```
+
+Where:
+- `reasoning_tokens` (25): number of tokens generated during reasoning
+- `cached_tokens` (80): number of input tokens read from cache
+- `input_token_details`: complete input token details, serialized as a JSON string
+- `output_token_details`: complete output token details, serialized as a JSON string
+
+These details are useful for:
+1. **Cost optimization**: use cache hit rates to tune the prompt caching strategy
+2. **Performance analysis**: use the share of reasoning tokens to gauge the actual overhead of reasoning models
+3. **Usage statistics**: collect fine-grained statistics for each token type
+
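A log consumer could derive the cache and reasoning ratios mentioned above roughly as follows. This is a minimal Python sketch under stated assumptions: the entry has the shape of the log example (an `ai_log` string whose `input_token_details` and `output_token_details` values are themselves JSON strings), `cached_tokens` is counted within `input_token`, and `reasoning_tokens` within `output_token`. The helper names `parse_ai_log` and `token_ratios` are hypothetical and not part of the plugin.

```python
import json


def parse_ai_log(entry: dict) -> dict:
    """Decode the ai_log string, then the nested token details strings."""
    ai_log = json.loads(entry["ai_log"])
    for key in ("input_token_details", "output_token_details"):
        if key in ai_log:
            # The details objects are stored as JSON strings inside ai_log.
            ai_log[key] = json.loads(ai_log[key])
    return ai_log


def token_ratios(ai_log: dict) -> dict:
    """Compute cache hit and reasoning ratios; values are logged as strings."""
    input_tokens = int(ai_log.get("input_token", 0))
    output_tokens = int(ai_log.get("output_token", 0))
    cached = int(ai_log.get("cached_tokens", 0))
    reasoning = int(ai_log.get("reasoning_tokens", 0))
    return {
        # Guard against division by zero when a count is missing or zero.
        "cache_hit_ratio": cached / input_tokens if input_tokens else 0.0,
        "reasoning_ratio": reasoning / output_tokens if output_tokens else 0.0,
    }


# Sample entry rebuilt from the log example, using json.dumps to avoid
# hand-written escaping.
sample_entry = {
    "ai_log": json.dumps({
        "model": "gpt-4o",
        "input_token": "100",
        "output_token": "50",
        "reasoning_tokens": "25",
        "cached_tokens": "80",
        "input_token_details": json.dumps({"cached_tokens": 80}),
        "output_token_details": json.dumps({"reasoning_tokens": 25}),
        "llm_service_duration": "2000",
    })
}

parsed = parse_ai_log(sample_entry)
ratios = token_ratios(parsed)
```

For the sample entry, 80 of 100 input tokens come from cache and 25 of 50 output tokens are reasoning tokens. Decoding in two passes is needed because the details fields are double-encoded in the log line.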
 ## Debugging

 ### Verifying ai_log Content