Prevent pytest fail with non unicode chars by aseembits93 · Pull Request #1384 · codeflash-ai/codeflash

aseembits93 · 2026-02-04T16:48:33Z

No description provided.

codeflash/verification/parse_test_output.py

codeflash-ai · 2026-02-04T17:03:41Z

⚡️ Codeflash found optimizations for this PR

📄 32% (0.32x) speedup for `get_analyzer_for_file` in `codeflash/languages/treesitter_utils.py`

⏱️ Runtime : 1.10 milliseconds → 834 microseconds (best of 235 runs)

A dependent PR with the suggested changes has been created. Please review:

⚡️ Speed up function get_analyzer_for_file by 32% in PR #1384 (non-unicode-pytest-fail) #1385

If you approve, it will be merged into this PR (branch non-unicode-pytest-fail).

claude · 2026-02-04T17:08:01Z

PR Review Summary

✅ Prek Checks

All prek checks (ruff check, ruff format) passed successfully. No linting or formatting issues detected.

⚠️ Type Checking (mypy)

Status: Multiple type errors detected (162 errors across 5 files)

Critical Issue: The mypy errors indicate that this PR adds # type: ignore comments to suppress type errors rather than fixing the root causes. This violates the project's type safety standards per CLAUDE.md which states: "DO NOT add type: ignore comments - always fix the root cause."

Files with type: ignore suppressions:

codeflash/github/PrComment.py: Lines 34, 48, 49, 51 - suppressing no-redef, assignment, and return-value errors
Other files use # noqa comments but don't affect type safety

Main functional change:

codeflash/code_utils/shell_utils.py:253 - Added errors='replace' parameter to handle non-UTF-8 bytes in subprocess output. This is the core fix for the PR's goal (preventing pytest failures with non-unicode chars).

Recommendation: The errors='replace' change in shell_utils.py is valid and addresses the PR's purpose. However, the # type: ignore comments in PrComment.py should be removed and the underlying type issues should be fixed properly.

📊 Test Coverage

Coverage for changed files (PR branch):

File	Statements	Missed	Coverage
shell_utils.py	149	60	60%
PrComment.py	34	10	71%
find_references.py	324	46	86%
parse.py	191	90	53%
vitest_runner.py	188	44	77%
treesitter_utils.py	729	56	92%
function_optimizer.py	1152	940	18%
TOTAL	2767	1246	55%

Analysis:

The core fix (shell_utils.py line 253) is in a function with 60% coverage
Most changes are cosmetic (noqa comments, formatting) and don't affect coverage
No new files added
Test results: 8 failed, 2195 passed (failures appear unrelated to this PR - they're in test_tracer.py)

Note: Cannot compare with main branch coverage as git operations require approval. The changes are minimal and primarily add error handling for non-UTF-8 output.

🎯 Code Review Issues

1. Type ignore comments violate project standards (HIGH PRIORITY)

Location: codeflash/github/PrComment.py:34, 48, 49, 51
Issue: Using # type: ignore to suppress type errors instead of fixing root causes
Fix: Remove type: ignore comments and properly type the method return value

2. Core fix looks correct

Location: codeflash/code_utils/shell_utils.py:253
The addition of errors='replace' to subprocess.run() args is a valid approach to handle non-UTF-8 bytes
This prevents UnicodeDecodeError when pytest output contains non-unicode characters

🤖 Pending Optimization PRs

Found 30 open optimization PRs from codeflash-ai[bot]. However, most have failing CI checks:

PR ⚡️ Speed up function get_analyzer_for_file by 32% in PR #1384 (non-unicode-pytest-fail) #1385 (related to this PR) - FAILING (multiple CI checks fail)
PR ⚡️ Speed up method PrComment.to_json by 329% in PR #1318 (fix/js-jest30-loop-runner) #1383 - FAILING (prek, pr-review, windows-unit-tests fail)
Most other PRs also have failing checks

Recommendation: Do not merge any optimization PRs until CI checks pass.

Last updated: 2026-02-04

codeflash/github/PrComment.py

aseembits93 · 2026-02-04T20:44:49Z

@claude review this

claude · 2026-02-04T20:45:28Z

Claude finished @aseembits93's task in 1m 27s —— View job

Code Review Complete ✓

Overview

This PR addresses non-UTF-8 character handling in pytest output and adds linter suppressions. The main fix is solid, but there are critical issues that must be addressed before merging.

Critical Issues ❌

1. Type Ignore Comments Violate Project Standards (BLOCKING)

Location: codeflash/github/PrComment.py:34, 48, 49, 51

Issue: The PR adds # type: ignore comments to suppress mypy errors instead of fixing the underlying type issues. This directly violates CLAUDE.md project standards:

"DO NOT add type: ignore comments - always fix the root cause."

Problem Analysis:

result: dict[str, Union[str, int, dict[str, dict[str, int]], list[BenchmarkDetail], None]] = {  # type: ignore[no-redef]
    ...
}

The type error occurs because:

Line 27: Variable result is first used as a dict[str, dict[str, int]]
Line 34: The same name result is redefined with a different type annotation
Lines 48-49: Assigning str values to fields that are typed as int in the Union
Line 51: Return type doesn't match the declared return type

Required Fix:

Rename the first result variable (line 27-32) to report_table to avoid shadowing
Update the return type annotation of to_json() to accurately reflect that throughput fields are str
Remove all four # type: ignore comments

Fix this →

Approved Changes ✓

Core Fix: Non-UTF-8 Handling

Location: codeflash/code_utils/shell_utils.py:250-253

Change:

if text:
    run_args["errors"] = "replace"

Assessment: ✅ EXCELLENT

This is the correct approach to handle non-UTF-8 bytes in subprocess output
Setting errors='replace' prevents UnicodeDecodeError by replacing invalid bytes with � (replacement character)
Well-documented with clear comment explaining the rationale
Follows Python best practices for subprocess error handling
Aligns with the PR's stated goal

Minor Issues (Non-blocking)

2. Linter Suppressions

Locations: Multiple files

Changes:

# noqa: RUF059 - Unused assignment suppressions (find_references.py:216, function_optimizer.py:2096)
# noqa: FURB171 - Single-item tuple suppressions (find_references.py:407, treesitter_utils.py:1583, 1585)
# noqa: PLW0108 - Unnecessary lambda suppressions (function_optimizer.py:318, 333)

Assessment: ⚠️ ACCEPTABLE BUT QUESTIONABLE

These suppressions are valid for existing code patterns, but:

RUF059 warnings indicate unused assignments - the variables original_import and concolic_tests are assigned but never used. Consider removing the unused assignments entirely rather than suppressing.
FURB171 warnings are stylistic (single-item tuples ("x",) could be ["x"]). Suppressing is fine if the tuple is intentional.
PLW0108 warnings suggest using the function directly instead of wrapping in lambda. These are marginal cases where the lambda might be more readable.

Recommendation: These are acceptable but ideally should be addressed in a separate refactoring PR.

3. Import Formatting

Location: codeflash/languages/javascript/parse.py:16-19

Change: Multi-line import consolidated to single line.

Assessment: ✅ Style change, non-functional. Acceptable.

4. String Concatenation

Location: codeflash/languages/javascript/vitest_runner.py:291

Change: Multi-line f-string consolidated to single line.

Assessment: ✅ Style change, non-functional. Acceptable.

Testing Status

Based on the previous review:

✅ 2195 tests passing
❌ 8 tests failing (test_tracer.py - appears unrelated to this PR)
✅ Coverage: 55% for changed files (reasonable)
⚠️ Mypy: 162 type errors (mostly from the # type: ignore additions)

Recommendations

Must Do Before Merge:

Remove all # type: ignore comments from PrComment.py and fix the underlying type issues properly
Run uv run mypy codeflash/ to verify no new type errors are introduced after fixes
Run uv run prek run --from-ref origin/main to ensure all checks pass

Optional Improvements:

Consider removing unused assignments instead of suppressing RUF059 warnings
Investigate the 8 failing tests in test_tracer.py (though they appear unrelated)

Summary

Verdict: ⚠️ CHANGES REQUESTED

The core fix (shell_utils.py) is excellent and solves the stated problem correctly. However, the # type: ignore suppressions in PrComment.py violate project standards and must be fixed before this PR can be merged.

Impact: Once the type ignore comments are fixed, this PR will:

✅ Prevent pytest failures from non-UTF-8 characters in output
✅ Improve robustness of subprocess handling
✅ Maintain type safety without suppressions

Branch: non-unicode-pytest-fail

prevent pytest fail with non unicode chars

b59882e

claude bot reviewed Feb 4, 2026

View reviewed changes

codeflash/verification/parse_test_output.py Outdated Show resolved Hide resolved

fix: re-export jest patterns for backward compatibility

8be10e2

codeflash-ai bot mentioned this pull request Feb 4, 2026

⚡️ Speed up function get_analyzer_for_file by 32% in PR #1384 (non-unicode-pytest-fail) #1385

Open

aseembits93 added 2 commits February 4, 2026 09:19

undo non relevant changes

1a0bb05

mypy fix

ff4c365

claude bot reviewed Feb 4, 2026

View reviewed changes

codeflash/github/PrComment.py Show resolved Hide resolved

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Prevent pytest fail with non unicode chars#1384

Prevent pytest fail with non unicode chars#1384
aseembits93 wants to merge 4 commits intomainfrom
non-unicode-pytest-fail

aseembits93 commented Feb 4, 2026

Uh oh!

Uh oh!

codeflash-ai bot commented Feb 4, 2026

⚡️ Speed up function `get_analyzer_for_file` by 32% in PR #1384 (`non-unicode-pytest-fail`) #1385

Uh oh!

claude bot commented Feb 4, 2026 •

edited

Loading

Uh oh!

Uh oh!

aseembits93 commented Feb 4, 2026

Uh oh!

claude bot commented Feb 4, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

aseembits93 commented Feb 4, 2026

Uh oh!

Uh oh!

codeflash-ai bot commented Feb 4, 2026

⚡️ Codeflash found optimizations for this PR

📄 32% (0.32x) speedup for get_analyzer_for_file in codeflash/languages/treesitter_utils.py

A dependent PR with the suggested changes has been created. Please review:

⚡️ Speed up function get_analyzer_for_file by 32% in PR #1384 (non-unicode-pytest-fail) #1385

Uh oh!

claude bot commented Feb 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review Summary

✅ Prek Checks

⚠️ Type Checking (mypy)

📊 Test Coverage

🎯 Code Review Issues

🤖 Pending Optimization PRs

Uh oh!

Uh oh!

aseembits93 commented Feb 4, 2026

Uh oh!

claude bot commented Feb 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review Complete ✓

Overview

Critical Issues ❌

1. Type Ignore Comments Violate Project Standards (BLOCKING)

Approved Changes ✓

Core Fix: Non-UTF-8 Handling

Minor Issues (Non-blocking)

2. Linter Suppressions

3. Import Formatting

4. String Concatenation

Testing Status

Recommendations

Must Do Before Merge:

Optional Improvements:

Summary

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

📄 32% (0.32x) speedup for `get_analyzer_for_file` in `codeflash/languages/treesitter_utils.py`

⚡️ Speed up function `get_analyzer_for_file` by 32% in PR #1384 (`non-unicode-pytest-fail`) #1385

claude bot commented Feb 4, 2026 •

edited

Loading

claude bot commented Feb 4, 2026 •

edited

Loading