feat: table and equation quality-check prompts by shijinpjlab · Pull Request #377 · MigoXLab/dingo

shijinpjlab · 2026-03-31T03:06:41Z

No description provided.

gemini-code-assist

Code Review

This pull request introduces two new LLM-based text quality assessment models: LLMTextEquation for evaluating mathematical formulas and LLMTextTable for evaluating HTML tables. The review feedback highlights that the metric_name and description metadata are currently too generic and should be specialized for each class to improve documentation. Furthermore, there are suggestions to maintain case consistency for the 'none' error type in the prompts and to correct a copy-paste error in the table quality prompt where 'equation' was incorrectly referenced instead of 'table'.

gemini-code-assist · 2026-03-31T03:08:09Z

dingo/model/llm/text_quality/llm_text_equation.py

+        "metric_name": "LLMTextQualityV5",
+        "description": "Impact-driven text quality evaluation for LLM pretraining, focusing on structural completeness, readability, diversity, and safety with quantitative thresholds",


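The metric_name and description are too generic and do not show that this class is dedicated to equation quality checks. Consider making them more descriptive, for example by changing metric_name to LLMTextEquation and updating description to cover dimensions such as equation syntax, recognition, and semantics. This helps with documentation generation and identification.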
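
As a rough illustration of the suggestion above, the specialized metadata could look something like the following. This is only a sketch: the constant `EQUATION_METRIC_INFO`, the description wording, and the helper `find_shared_metric_names` are hypothetical names introduced here and are not taken from the PR or from dingo itself.

```python
from collections import defaultdict

# Hypothetical metadata for the equation checker; the wording is illustrative only.
EQUATION_METRIC_INFO = {
    "metric_name": "LLMTextEquation",
    "description": "Quality evaluation of mathematical formulas, covering "
                   "formula syntax, recognition, and semantics",
}


def find_shared_metric_names(metric_infos):
    """Return metric names that are used by more than one class.

    `metric_infos` maps a class name to its metadata dict (an assumed layout).
    """
    owners = defaultdict(list)
    for class_name, info in metric_infos.items():
        owners[info["metric_name"]].append(class_name)
    # Keep only the names claimed by two or more classes.
    return {name: classes for name, classes in owners.items() if len(classes) > 1}
```

A check of this kind, run over all model classes, would flag the case reported here, where both new classes reuse the name LLMTextQualityV5.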

gemini-code-assist · 2026-03-31T03:08:09Z

dingo/model/llm/text_quality/llm_text_equation.py

+Return JSON only: {"score": 0/1, "type": "", "name": "", "reason": ""}
+
+score 类型必须为int；
+score 为 1 表示通过，type 填 "Good"，name 填 "None"，reason 说明公式正常的依据；


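Consider changing the value filled into name from "None" to "none" so that its case matches the second-level error type none defined on line 55. This avoids inconsistent model output.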

Suggested change

-score 为 1 表示通过，type 填 "Good"，name 填 "None"，reason 说明公式正常的依据；
+score 为 1 表示通过，type 填 "Good"，name 填 "none"，reason 说明公式正常的依据；
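
Independently of the prompt wording, the consumer of the model reply can guard against case drift. The sketch below assumes the reply is the JSON object quoted in the diff (`score`, `type`, `name`, `reason`); the function name `parse_verdict` is hypothetical and does not reflect how dingo actually parses responses.

```python
import json

ALLOWED_SCORES = {0, 1}


def parse_verdict(raw: str) -> dict:
    """Parse the model's JSON reply and normalize the case of `name`."""
    verdict = json.loads(raw)
    score = verdict.get("score")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, int) or score not in ALLOWED_SCORES:
        raise ValueError(f"score must be the int 0 or 1, got {score!r}")
    name = str(verdict.get("name", ""))
    # Map "None", "NONE", and similar variants onto the lowercase error type.
    if name.lower() == "none":
        verdict["name"] = "none"
    return verdict
```

With this in place, a reply such as `{"score": 1, "type": "Good", "name": "None", "reason": "..."}` would be read back with `name` equal to `"none"`, while a string score like `"1"` would be rejected.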

gemini-code-assist · 2026-03-31T03:08:09Z

dingo/model/llm/text_quality/llm_text_table.py

+        "metric_name": "LLMTextQualityV5",
+        "description": "Impact-driven text quality evaluation for LLM pretraining, focusing on structural completeness, readability, diversity, and safety with quantitative thresholds",


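The metric_name and description are too generic. Consider updating them to describe table quality checks specifically, which helps with documentation generation and with telling the different quality-check dimensions apart.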

gemini-code-assist · 2026-03-31T03:08:10Z

dingo/model/llm/text_quality/llm_text_table.py

+Return JSON only: {"score": 0/1, "type": "", "name": "", "reason": ""}
+
+score 类型必须为int；
+score 为 1 表示通过，type 填 "Good"，name 填 "None"，reason 说明公式正常的依据；


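This prompt contains a copy-paste error: “公式” (equation) should be “表格” (table). Also consider changing the value filled into name from "None" to "none" to match the second-level error type defined on line 57.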

Suggested change

-score 为 1 表示通过，type 填 "Good"，name 填 "None"，reason 说明公式正常的依据；
+score 为 1 表示通过，type 填 "Good"，name 填 "none"，reason 说明表格正常的依据；
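
Copy-paste slips like this one can be caught mechanically. The following is a minimal sketch, assuming each prompt is available as a plain string; `find_leftover_terms` is a hypothetical helper, not part of dingo, and a table prompt may legitimately mention formulas elsewhere, so any hit would still need a human look.

```python
def find_leftover_terms(prompt: str, forbidden_terms):
    """Return (line_number, term) pairs for each forbidden term found in the prompt."""
    hits = []
    for line_number, line in enumerate(prompt.splitlines(), start=1):
        for term in forbidden_terms:
            if term in line:
                hits.append((line_number, term))
    return hits


# The table prompt line as originally submitted in this PR.
table_prompt_line = 'score 为 1 表示通过，type 填 "Good"，name 填 "None"，reason 说明公式正常的依据；'
print(find_leftover_terms(table_prompt_line, ["公式"]))  # [(1, '公式')]
```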

shijinpjlab added 2 commits March 31, 2026 11:00

feat: table and equation quality-check prompts

da0b464

feat: fix lint

40fbd67

gemini-code-assist bot reviewed Mar 31, 2026

shijinpjlab merged commit 69ea31d into MigoXLab:dev Mar 31, 2026
2 checks passed

shijinpjlab merged 2 commits into MigoXLab:dev from shijinpjlab:dev_0331
