feat: quality-check prompts for table and equation #377

Merged
shijinpjlab merged 2 commits into MigoXLab:dev from shijinpjlab:dev_0331
Mar 31, 2026
Conversation

@shijinpjlab
Collaborator

No description provided.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces two new LLM-based text quality assessment models: LLMTextEquation for evaluating mathematical formulas and LLMTextTable for evaluating HTML tables. The review feedback highlights that the metric_name and description metadata are currently too generic and should be specialized for each class to improve documentation. Furthermore, there are suggestions to maintain case consistency for the 'none' error type in the prompts and to correct a copy-paste error in the table quality prompt where 'equation' was incorrectly referenced instead of 'table'.

Comment on lines +11 to +12
"metric_name": "LLMTextQualityV5",
"description": "Impact-driven text quality evaluation for LLM pretraining, focusing on structural completeness, readability, diversity, and safety with quantitative thresholds",

medium

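metric_name and description are too generic and do not show that this class is dedicated to equation (公式) quality checks. Consider making them more descriptive, for example changing metric_name to LLMTextEquation and updating description to cover dimensions such as formula syntax, recognition, and semantics, so that the class is easier to document and identify.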
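As an illustration of this suggestion, the metadata could be specialized roughly as follows. This is only a sketch: the dictionary name `EQUATION_METRIC_INFO`, the helper `doc_line`, and the description wording are assumptions made for the example, and only the `LLMTextEquation` name and the three dimensions come from the review comment.

```python
# Hypothetical metadata for the equation checker. The variable name and the
# description wording are illustrative, not taken from the PR.
EQUATION_METRIC_INFO = {
    "metric_name": "LLMTextEquation",
    "description": (
        "LLM-based quality check for mathematical formulas, covering "
        "formula syntax, recognition accuracy, and semantic correctness"
    ),
}


def doc_line(info: dict) -> str:
    """Render one metadata entry the way a generated doc listing might."""
    return f'{info["metric_name"]}: {info["description"]}'


print(doc_line(EQUATION_METRIC_INFO))
```

With a specialized name, a generated metric listing shows at a glance that the entry is the equation checker rather than the general text quality metric.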

Return JSON only: {"score": 0/1, "type": "", "name": "", "reason": ""}

score 类型必须为int;
score 为 1 表示通过,type 填 "Good",name 填 "None",reason 说明公式正常的依据;

medium

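Consider changing the value filled into name from "None" to "none" so that its case matches the second-level error type none defined on line 55, which avoids inconsistent model output.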

Suggested change
- score 为 1 表示通过,type 填 "Good",name 填 "None",reason 说明公式正常的依据;
+ score 为 1 表示通过,type 填 "Good",name 填 "none",reason 说明公式正常的依据;
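To show why the case of this value matters downstream, here is a minimal sketch of a consumer that validates a reply in the `{"score", "type", "name", "reason"}` shape and tolerates either spelling of the sentinel. The function `parse_verdict` and its error messages are hypothetical and are not part of the PR or of the project's API.

```python
import json

REQUIRED_KEYS = {"score", "type", "name", "reason"}


def parse_verdict(raw: str) -> dict:
    """Validate a JSON verdict and normalize the 'none' sentinel (hypothetical helper)."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("verdict must be a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")

    # The prompt requires an int score of 0 or 1. bool is a subclass of int
    # in Python, so it is rejected explicitly.
    score = data["score"]
    if isinstance(score, bool) or not isinstance(score, int) or score not in (0, 1):
        raise ValueError("score must be the int 0 or 1")

    # Accept "None", "NONE", etc. and map them to the lowercase error type.
    name = str(data["name"]).strip()
    if name.lower() == "none":
        name = "none"
    return {**data, "name": name}


verdict = parse_verdict('{"score": 1, "type": "Good", "name": "None", "reason": "ok"}')
print(verdict["name"])
```

Normalizing on the consumer side is a fallback. Making the prompt itself consistent, as suggested here, removes the ambiguity at the source.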

Comment on lines +11 to +12
"metric_name": "LLMTextQualityV5",
"description": "Impact-driven text quality evaluation for LLM pretraining, focusing on structural completeness, readability, diversity, and safety with quantitative thresholds",

medium

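metric_name and description are too generic. Consider updating them to describe table (表格) quality checks specifically, which helps documentation generation and distinguishes the different quality-check dimensions.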
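One way to catch this kind of shared metadata automatically is a small uniqueness check over the registered classes. The sketch below assumes the metadata can be collected into a plain mapping from class name to metadata dict; the function `find_shared_metric_names` and that mapping are hypothetical, not something the project is known to provide.

```python
from collections import defaultdict
from typing import Dict, List


def find_shared_metric_names(
    metadata_by_class: Dict[str, Dict[str, str]]
) -> Dict[str, List[str]]:
    """Return each metric_name used by more than one class, with those classes."""
    classes_by_metric = defaultdict(list)
    for class_name, info in metadata_by_class.items():
        classes_by_metric[info["metric_name"]].append(class_name)
    return {
        metric: sorted(classes)
        for metric, classes in classes_by_metric.items()
        if len(classes) > 1
    }


# Both new classes currently carry the generic name quoted in the review.
metadata = {
    "LLMTextEquation": {"metric_name": "LLMTextQualityV5"},
    "LLMTextTable": {"metric_name": "LLMTextQualityV5"},
}
print(find_shared_metric_names(metadata))
```

Once each class has its own metric_name, the check returns an empty dict.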

Return JSON only: {"score": 0/1, "type": "", "name": "", "reason": ""}

score 类型必须为int;
score 为 1 表示通过,type 填 "Good",name 填 "None",reason 说明公式正常的依据;

medium

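This prompt contains a copy-paste error: "公式" (equation) should be "表格" (table). Also consider changing the value filled into name from "None" to "none" to stay consistent with the second-level error type defined on line 57.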

Suggested change
- score 为 1 表示通过,type 填 "Good",name 填 "None",reason 说明公式正常的依据;
+ score 为 1 表示通过,type 填 "Good",name 填 "none",reason 说明表格正常的依据;
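Copy-paste slips like this one can be flagged with a simple scan for terms that belong to a sibling prompt. The following is a minimal sketch under that assumption: `find_foreign_terms` is a hypothetical helper, and the excerpt is limited to the two prompt lines quoted in this review.

```python
from typing import List, Tuple


def find_foreign_terms(prompt: str, foreign_terms: List[str]) -> List[Tuple[int, str]]:
    """Return (line_number, term) for each term from another prompt found here."""
    hits = []
    for line_number, line in enumerate(prompt.splitlines(), start=1):
        for term in foreign_terms:
            if term in line:
                hits.append((line_number, term))
    return hits


# Excerpt of the table prompt as quoted in the review, before the fix.
table_prompt_excerpt = (
    "score 类型必须为int;\n"
    'score 为 1 表示通过,type 填 "Good",name 填 "None",reason 说明公式正常的依据;'
)

# "公式" (equation) should not appear in the table prompt.
print(find_foreign_terms(table_prompt_excerpt, ["公式"]))  # [(2, '公式')]
```

A plain substring scan can produce false positives if a table prompt legitimately mentions formulas inside table cells, so the hits are best treated as items for review rather than hard failures.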

@shijinpjlab shijinpjlab merged commit 69ea31d into MigoXLab:dev Mar 31, 2026
2 checks passed