Contributor

@kawayiYokami kawayiYokami commented Dec 24, 2025

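🎯 Motivation

Problems addressed:

  • No multi-text-block support: AstrBot currently supports only a single user message and cannot include multiple text blocks in one request
  • System information pollutes user input: system reminders such as the user identifier, time, and group information are inserted directly into the user input, which goes against OpenAI best practices
  • Inelegant format: system reminders are scattered across multiple text blocks, which wastes tokens and makes them hard to parse

Features added:

  • Multi-text-block message support: multiple text blocks are supported through the extra_content_blocks attribute
  • Improved system reminder format: reminders are uniformly wrapped in <system_reminder>, following OpenAI best practices
  • Semantic tags: distinguish system reminders, image captions, and quoted messages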
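📝 Modifications

Core file changes:

  1. core/provider/entities.py

    • Added the extra_content_blocks attribute (defaults to an empty list)
    • Refactored assemble_context() to support multiple text blocks
    • Graceful downgrade: stays backward compatible when there is only a single text block
  2. core/provider/sources/openai_source.py

    • Updated assemble_context() to support multiple text blocks and extra content
  3. core/provider/sources/gemini_source.py

    • Updated assemble_context() to support multiple text blocks and extra content
  4. core/provider/sources/anthropic_source.py

    • Updated assemble_context() to support multiple text blocks and extra content
  5. packages/astrbot/process_llm_request.py

    • Reworked how system reminders are collected: collect first, then wrap them all at once
    • Image captions now use the <image_caption> tag
    • System reminders are unified into the <system_reminder> format

Implemented features:

  • ✅ Multi-text-block support: a single user message can contain multiple text blocks
  • ✅ OpenAI best practice: the user's utterance comes first, system reminders after
  • ✅ Token optimization: redundant line breaks removed, system reminder format unified
  • ✅ Backward compatible: existing code needs no changes
  • ✅ Clear semantics: different kinds of content are explicitly distinguished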
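
As a rough illustration of the behavior described above, here is a minimal sketch of a request object that assembles a multi-block user message and falls back to the legacy string form. `RequestSketch` is a hypothetical stand-in, not the real `ProviderRequest`, and the `image_url` block shape is assumed from the OpenAI chat format; the actual code in core/provider/entities.py may differ.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class RequestSketch:
    """Hypothetical, simplified stand-in for ProviderRequest."""

    prompt: str = ""
    image_urls: list[str] = field(default_factory=list)
    extra_content_blocks: list[dict[str, Any]] = field(default_factory=list)

    def assemble_context(self) -> dict[str, Any]:
        # Legacy shape: a bare prompt with no extras and no images
        # is still sent as a plain string.
        if not self.extra_content_blocks and not self.image_urls:
            return {"role": "user", "content": self.prompt}

        blocks: list[dict[str, Any]] = []
        if self.prompt:
            # The user's own utterance always comes first.
            blocks.append({"type": "text", "text": self.prompt})
        # Extra blocks (reminders, captions, quotes) follow the utterance.
        blocks.extend(self.extra_content_blocks)
        for url in self.image_urls:
            blocks.append({"type": "image_url", "image_url": {"url": url}})
        return {"role": "user", "content": blocks}
```

With only `prompt` set, `assemble_context()` returns the old `{"role": "user", "content": "<text>"}` shape, so callers that never touch `extra_content_blocks` should see no change.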

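🖼️ Test Results

Test scenarios:

  1. Backward compatibility: with only a prompt, the message correctly downgrades to the simple format
  2. Multiple text blocks: prompt + extra_content_blocks correctly produces the array format
  3. System reminder format: verifies the <system_reminder> wrapper
  4. Image caption: verifies the <image_caption> tag format

Sample test output: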

   {
     "role": "user",
     "content": [
       {"type": "text", "text": "Hello world"},
       {"type": "text", "text": "<system_reminder>User ID: 123, Nickname: TestUserGroup name: Test GroupCurrent datetime: 2025-12-25 10:30</system_reminder>"},
       {"type": "text", "text": "<image_caption>A beautiful landscape picture</image_caption>"}
     ]
   }

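✅ Checklist

  • 😊 Feature discussion: multi-text-block support is an architectural improvement and needs no further discussion
  • 👀 Well tested: full functional tests passed, verifying backward compatibility and the new feature
  • 🤓 No new dependencies: no new dependency libraries were introduced
  • 😮 Code safety: the change only touches message formatting and poses no security risk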

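🚀 Usage Example

   # Using the new feature in a plugin
   req = event.request_llm(prompt="Hello")
   req.extra_content_blocks.extend([
       {"type": "text", "text": "<system_reminder>Please answer in a friendly tone</system_reminder>"}
   ])
   yield req

Note that this PR does not optimize any feature I have never used myself,
such as the knowledge base.

✦ This PR lays the groundwork for multimodal message handling in AstrBot while remaining fully backward compatible! 🎯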

Summary by Sourcery

Add structured support for multiple content blocks in user messages while keeping existing single-text behavior backward compatible.

New Features:

  • Introduce an extra_content_blocks field on ProviderRequest to carry additional user message segments such as system reminders, image captions, and quoted messages.
  • Support multi-block user content in OpenAI, Gemini, and Anthropic providers, combining primary text, extra content blocks, and images into a unified message payload.

Enhancements:

  • Refine context assembly across providers to prioritize the main user utterance, add a text placeholder when only images are present, and fall back to the legacy single-text format when applicable.
  • Standardize system reminder formatting by collecting metadata (user, group, time) and wrapping it in a single <system_reminder> text block instead of injecting it directly into the prompt.
  • Change image captions and quoted message handling to emit semantically tagged text blocks (e.g., <image_caption>, <Quoted Message>) appended after the user message instead of prefixing the prompt.
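
The "collect metadata, then wrap it in a single block" step can be sketched as follows. This is a minimal sketch under stated assumptions: `build_system_reminder` and its `sep` parameter are hypothetical names, and the separator is left configurable because the summary does not say how the collected parts are joined.

```python
from typing import Optional


def build_system_reminder(parts: list[str], sep: str = "\n") -> Optional[dict]:
    """Wrap collected metadata in one <system_reminder> text block."""
    if not parts:
        return None  # nothing collected, so emit no block at all
    body = sep.join(parts)
    return {"type": "text", "text": f"<system_reminder>{body}</system_reminder>"}


# Collect first, wrap once at the end (sample values only).
system_parts: list[str] = []
system_parts.append("User ID: 123, Nickname: TestUser")
system_parts.append("Group name: TestGroup")
system_parts.append("Current datetime: 2025-12-25 10:30")

block = build_system_reminder(system_parts)
```

The resulting block would be appended to the request's extra content blocks so that it lands after the user's own text rather than inside the prompt string.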

@auto-assign auto-assign bot requested review from Fridemn and Raven95676 December 24, 2025 19:57
Contributor

@sourcery-ai sourcery-ai bot left a comment

Hey - I've found 3 issues, and left some high level feedback:

  • The backward‑compatibility downgrade logic in assemble_context is inconsistent between ProviderRequest and the provider sources: OpenAI/Gemini/Anthropic downgrade to a plain text string whenever there is a single text block (even if it came from extra_content_blocks), whereas ProviderRequest.assemble_context only downgrades when there are no extra blocks or images; consider aligning these conditions so that multi‑block usage behaves predictably across providers.
  • When building system_content in process_llm_request, system_parts are concatenated with "".join(...), which will produce a single run‑on string (e.g., User ID...Nickname...Group name...Current datetime...) without separators; consider joining with newlines or a clear delimiter (e.g., "\n".join(system_parts)) to match the intended readable format.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The backward‑compatibility downgrade logic in `assemble_context` is inconsistent between `ProviderRequest` and the provider sources: OpenAI/Gemini/Anthropic downgrade to a plain text string whenever there is a single text block (even if it came from `extra_content_blocks`), whereas `ProviderRequest.assemble_context` only downgrades when there are no extra blocks or images; consider aligning these conditions so that multi‑block usage behaves predictably across providers.
- When building `system_content` in `process_llm_request`, `system_parts` are concatenated with `"".join(...)`, which will produce a single run‑on string (e.g., `User ID...Nickname...Group name...Current datetime...`) without separators; consider joining with newlines or a clear delimiter (e.g., `"\n".join(system_parts)`) to match the intended readable format.

## Individual Comments

### Comment 1
<location> `packages/astrbot/process_llm_request.py:243-244` </location>
<code_context>
+            req.extra_content_blocks.append({"type": "text", "text": quoted_text})
+
+        # Wrap all system reminders together
+        if system_parts:
+            system_content = (
+                "<system_reminder>" + "".join(system_parts) + "</system_reminder>"
+            )
</code_context>

<issue_to_address>
**issue (bug_risk):** System reminder pieces are concatenated without separators, which harms readability and may change semantics.

Previously these system details were separated by newlines, but `"".join(system_parts)` now produces `<system_reminder>User ID: ...Group name: ...Current datetime: ...</system_reminder>` with no delimiters. This reduces readability and may break existing prompts that depend on line breaks. Consider joining with newlines (e.g. `"\n".join(system_parts)` or prefixing each part with `"\n"`) to preserve the prior structure.
</issue_to_address>

### Comment 2
<location> `astrbot/core/provider/sources/openai_source.py:665-667` </location>
<code_context>
-            return user_content
-        return {"role": "user", "content": text}
+
+        # If there is only text and no extra content blocks, return the simple format for backward compatibility
+        if len(content_blocks) == 1 and content_blocks[0]["type"] == "text":
+            return {"role": "user", "content": content_blocks[0]["text"]}
+
</code_context>

<issue_to_address>
**suggestion (bug_risk):** The downgrade-to-plain-text condition ignores `extra_content_blocks` and may misclassify messages.

The comment says “只有文本且没有额外内容块” ("only text and no extra content blocks"), but the code only checks `len(content_blocks) == 1` and `type == "text"`. As a result:
- A message with empty `text` and a single text `extra_content_block` (no images) will be downgraded to `{"content": <extra_text>}`, losing the original block structure.
- This differs from `ProviderRequest.assemble_context`, which only downgrades when `not self.extra_content_blocks and not self.image_urls`.
To keep block structure when content comes from `extra_content_blocks` or multimodal input, consider applying the same guard (also ensuring there are no images and no extra blocks beyond the main prompt text).

```suggestion
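        # If there is only the main text, with no extra content blocks or images, return the simple format for backward compatibility
        # Note: this matches ProviderRequest.assemble_context,
        # downgrading to plain text only when there are no extra_content_blocks and no image_urls,
        # so that content from extra_content_blocks or multimodal messages is not mistaken for a simple text message.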
        if (
            not extra_content_blocks
            and not image_urls
            and len(content_blocks) == 1
            and content_blocks[0]["type"] == "text"
        ):
            return {"role": "user", "content": content_blocks[0]["text"]}
```
</issue_to_address>

### Comment 3
<location> `astrbot/core/provider/sources/gemini_source.py:842-843` </location>
<code_context>
-            return user_content
-        return {"role": "user", "content": text}
+
+        # If there is only text and no extra content blocks, return the simple format for backward compatibility
+        if len(content_blocks) == 1 and content_blocks[0]["type"] == "text":
+            return {"role": "user", "content": content_blocks[0]["text"]}
+
</code_context>

<issue_to_address>
**suggestion:** Same downgrade condition issue as OpenAI: extra-only content can be collapsed to plain text unexpectedly.

This uses the same overly-broad downgrade rule as `openai_source.assemble_context`: any single text `content_block` is converted to `{"content": <text>}`, even when it comes solely from `extra_content_blocks`. In cases where callers only pass an extra block (e.g., system reminder, quoted message) and no prompt/images, this collapses the structure and loses semantics. Please tighten the condition (as in `ProviderRequest.assemble_context`) so that only a plain user prompt with no extras/images is downgraded.

Suggested implementation:

```python
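        # If there is only text, with no images or extra content blocks, return the simple format for backward compatibility
        # Note: downgrade only when the caller provided no extra_content_blocks and no image_urls,
        # so that an "extra blocks only" message (e.g. system reminder, quoted message) is not wrongly collapsed into plain text.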
        if (
            len(content_blocks) == 1
            and content_blocks[0]["type"] == "text"
            and not image_urls
            and not extra_content_blocks
        ):
            return {"role": "user", "content": content_blocks[0]["text"]}

```

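1. Confirm that when callers of `assemble_context` pass only extra content blocks and no main `text`, those blocks still go into `extra_content_blocks` rather than being mixed into the main text block; otherwise the logic that builds `content_blocks` needs a split similar to `ProviderRequest.assemble_context` (for example, distinguishing whether a block came from the main user prompt or from the extra blocks).
2. Compare against the current downgrade condition in `ProviderRequest.assemble_context` and make sure the downgrade semantics match in both places (return the simplified structure only for a single main text with no images and no extra blocks).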
</issue_to_address>
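
The tightened guard suggested in Comments 2 and 3 can be tried in isolation. The sketch below is an assumption-laden illustration: `maybe_downgrade` is a hypothetical free function, while the real check sits inside each provider's `assemble_context` and may use different variable names.

```python
from typing import Any


def maybe_downgrade(
    content_blocks: list[dict[str, Any]],
    extra_content_blocks: list[dict[str, Any]],
    image_urls: list[str],
) -> dict[str, Any]:
    """Return the legacy string form only for a bare user prompt."""
    is_bare_prompt = (
        not extra_content_blocks
        and not image_urls
        and len(content_blocks) == 1
        and content_blocks[0]["type"] == "text"
    )
    if is_bare_prompt:
        return {"role": "user", "content": content_blocks[0]["text"]}
    # Anything else keeps its block structure.
    return {"role": "user", "content": content_blocks}


reminder = {"type": "text", "text": "<system_reminder>be brief</system_reminder>"}

# Extra-only message: a single text block, but it came from the extras,
# so it must stay a list.
extra_only = maybe_downgrade([reminder], [reminder], [])

# Bare prompt: downgraded to a plain string.
bare = maybe_downgrade([{"type": "text", "text": "hello"}], [], [])
```

Under the original condition (`len(content_blocks) == 1` and a text type only), `extra_only` would have been collapsed to a string, which is the mismatch with `ProviderRequest.assemble_context` that the review points out.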


@kawayiYokami
Contributor Author

This PR will be the cornerstone for optimizing other modules.

Member

@Soulter Soulter left a comment

It looks like assemble_context gained extra_content_blocks, but the call sites do not pass the new parameter?

req.prompt = prefix + req.prompt

# Collect system reminder information
system_parts = []
Member

Will system_parts be placed in the role=user part?

Contributor Author

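Yes. This content can change at any time, so putting it at the front would break cache hits. The usual practice is to put it in the user message, after the text block that holds the user's utterance.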
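
To make the caching argument concrete, here is a small sketch assuming that prompt caches match on a shared leading prefix. The flat-string prompt, `HISTORY`, `QUESTION`, and both layout helpers are simplifications invented for illustration; real provider caches operate on tokens and structured messages.

```python
import os


def shared_prefix_len(a: str, b: str) -> int:
    """Length of the common leading substring of two prompts."""
    return len(os.path.commonprefix([a, b]))


HISTORY = (
    "system: You are a helpful bot.\n"
    "user: earlier question\n"
    "assistant: earlier answer\n"
)
QUESTION = "user: what's the weather?\n"


def reminder_first(now: str) -> str:
    # Volatile metadata at the very front of the prompt.
    return f"<system_reminder>Current datetime: {now}</system_reminder>\n" + HISTORY + QUESTION


def reminder_last(now: str) -> str:
    # Volatile metadata after the user's latest utterance.
    return HISTORY + QUESTION + f"<system_reminder>Current datetime: {now}</system_reminder>\n"


# Same conversation, one minute apart.
first = shared_prefix_len(reminder_first("2025-12-25 10:30"), reminder_first("2025-12-25 10:31"))
last = shared_prefix_len(reminder_last("2025-12-25 10:30"), reminder_last("2025-12-25 10:31"))
```

In this toy setup `last` covers the whole history and question, while `first` stops inside the timestamp, which is the effect the reply describes.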

@kawayiYokami
Contributor Author

It looks like assemble_context gained extra_content_blocks, but the call sites do not pass the new parameter?

Indeed.

@dosubot dosubot bot added the size:L label (This PR changes 100-499 lines, ignoring generated files.) Dec 26, 2025
@dosubot dosubot bot added the lgtm label (This PR has been approved by a maintainer) Dec 26, 2025
@Soulter Soulter changed the title from "feat: extra text block" to "feat: add extra text block support" Dec 26, 2025
Member

@Soulter Soulter left a comment

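Looks good to me now 👍 I suggest the following additional changes:

  1. Rename extra_content_blocks to extra_user_content_parts to make it easier to understand;
  2. Have extra_content_blocks also accept list[ContentPart], and call model_dump where appropriate.
  3. Use ContentPart to construct the content blocks in packages/astrbot/process_llm_request.py.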
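
Suggestion 2 can be sketched as a normalization step that accepts both plain dicts and model objects. This is only an illustration: `TextPartSketch` is a hypothetical dataclass standing in for `ContentPart`, and its `model_dump()` merely mirrors the method name that pydantic v2 models expose; the real class in AstrBot may look different.

```python
from dataclasses import asdict, dataclass
from typing import Any, Union


@dataclass
class TextPartSketch:
    """Hypothetical stand-in for a ContentPart model."""

    text: str
    type: str = "text"

    def model_dump(self) -> dict[str, Any]:
        # Same method name as pydantic v2 models, implemented with asdict here.
        return asdict(self)


Part = Union[dict, TextPartSketch]


def dump_parts(parts: list[Part]) -> list[dict[str, Any]]:
    """Normalize a mixed list into plain dicts ready for a provider payload."""
    return [p if isinstance(p, dict) else p.model_dump() for p in parts]


mixed = [
    {"type": "text", "text": "<image_caption>a landscape</image_caption>"},
    TextPartSketch(text="<system_reminder>be friendly</system_reminder>"),
]
payload = dump_parts(mixed)
```

Doing the dump in one place, just before the payload is built, would let plugins keep passing raw dicts while internal code moves to typed parts.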

@dosubot dosubot bot removed the lgtm label (This PR has been approved by a maintainer) Dec 26, 2025
@kawayiYokami
Contributor Author

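Looks good to me now 👍 I suggest the following additional changes:

  1. Rename extra_content_blocks to extra_user_content_parts to make it easier to understand;
  2. Have extra_content_blocks also accept list[ContentPart], and call model_dump where appropriate.
  3. Use ContentPart to construct the content blocks in packages/astrbot/process_llm_request.py.

Should be done now.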

@dosubot dosubot bot added the lgtm label (This PR has been approved by a maintainer) Dec 26, 2025
@Soulter Soulter changed the title from "feat: add extra text block support" to "feat: add extra user content block support" Dec 26, 2025
@Soulter Soulter merged commit fbdd60b into AstrBotDevs:master Dec 26, 2025
5 checks passed