[Issue #6631] fix normalize comments eats single line comments in code blocks#6924
Conversation
There was a problem hiding this comment.
A few comments, mostly subjective. The one that seems least subjective to me is how code fences are detected when they do not appear at the beginning of the line.
I think this change would benefit from carrying more of its rationale with it, rather than leaving that context mostly in the issue thread.
Concretely, I think that could look like a few things:
- Expand the PR body to describe the exact failure mode being fixed, especially that this is about
normalize_commentsstripping inner// ...lines from fenced examples in certain block-comment shapes, and that the non-*multi-line block-comment case is the important one. - Add a bit more implementation rationale either in the PR body or in the relevant commit message(s): why tracking code-block state line-by-line is the chosen approach here, and what cases that logic is intended to distinguish.
- Add more targeted doc comments near the new logic itself where the behavior is non-obvious. In particular,
CodeBlockTracker::next_line()seems like it should document more clearly what counts as a fence transition, and therewrite_comment_inner()call site could use a brief note explaining why lines inside fenced blocks are trimmed differently from ordinary comment lines.
I don’t have a strong preference whether that context lives mostly in the PR body, in one stronger commit message, or in a few focused doc comments, but I do think the change would be easier to review and maintain if more of the “what exact problem is being solved here, and why this approach?” were written down next to the implementation.
I’m also fairly new to this part of the codebase, and more generally comment.rs reads to me like it already has state-machine-like behavior, but that structure is not especially explicit in the code. This PR adds another piece of line-to-line state on top of that. A good follow-up might be to make that broader stateful structure more obvious in a similar way to how CodeBlockTracker makes this particular part explicit.
For context, I’m not a maintainer here, so take the maintainability/style parts of this as one reviewer’s perspective rather than project policy.
My perspective is probably preserve this. |
| (line, true) | ||
| } else if let Some(line) = line.strip_prefix(opener.trim_end()) { | ||
| (line, false) | ||
| } else if let Some(line) = line.strip_prefix(style.line_start()) { |
There was a problem hiding this comment.
This misses the bare * continuation-line form for block comments.
For example, a comment like:
/*
*
* text
*/should normalize to:
// textbut with the current checks the blank separator line falls through as comment text, so it gets treated like // * instead. The current logic handles * via style.line_start().trim_start(), but not a line whose full trimmed content is just *.
There was a problem hiding this comment.
Thanks for the catch! I added your example as a test case with the fix.
Fix for Issue 6631
Issue to Fix
normalize_commentsstrips comments in code blocks that do not begin with a*in block comments.Changes for Fix
Rewrote
left_trim_comment_line. It now only strips the opener or line start of the comment style from thestyleparameter. Before it went down anifelsebranch to see if it could strip different types of comment openers and line starters, regardless of the comment style.The previous doc comment of
left_trim_comment_linedid not match the behavior of the actual function. For example, the lastelseblock has:(line, line.starts_with(' '))as its return type. It simply returns the original string, and a boolean indicating if it starts with a whitespace. Yet, the function comment says it will return true if a single whitespace has been trimmed, but in this case it was not. Same goes for the second branch, where theCommentStyle::Customis checked. If we trim it successfully, we don't check if the custom opener ends with a whitespace, we just return true assuming we trimmed a whitespace.With the new function, the description is accurate.
Question about Fix
How do we want a comment like this to be formatted in normalize mode?
With this change, it is formatted like this:
Should we remove the extra
//or preserve it because it was in the original comment?