⚡️ Speed up function _add_java_class_members by 48% in PR #1199 (omni-java) #1368

Open
codeflash-ai[bot] wants to merge 1 commit into omni-java from codeflash/optimize-pr1199-2026-02-04T04.23.41

@codeflash-ai codeflash-ai bot commented Feb 4, 2026

⚡️ This pull request contains optimizations for PR #1199

If you approve this dependent PR, these changes will be merged into the original PR branch omni-java.

This PR will be automatically closed if the original PR is merged.


📄 48% (0.48x) speedup for _add_java_class_members in codeflash/code_utils/code_replacer.py

⏱️ Runtime: 47.3 milliseconds → 32.0 milliseconds (best of 82 runs)
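The headline percentage can be reproduced from the two runtimes. The sketch below assumes the figure is computed as (old - new) / new, which is an inference from the reported numbers rather than something the PR states. The function names are hypothetical.

```python
def speedup_pct(old: float, new: float) -> float:
    """Speedup relative to the NEW runtime (assumed convention for the reported figures)."""
    return (old - new) / new * 100


def runtime_reduction_pct(old: float, new: float) -> float:
    """Share of the ORIGINAL runtime that was removed."""
    return (old - new) / old * 100


# Overall runtimes reported above, in milliseconds.
overall_speedup = speedup_pct(47.3, 32.0)
overall_reduction = runtime_reduction_pct(47.3, 32.0)
print(f"speedup: {overall_speedup:.1f}%  runtime reduction: {overall_reduction:.1f}%")
```

Under this convention a 48% speedup corresponds to roughly a one-third reduction in wall-clock time, so the two numbers should not be read interchangeably.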

📝 Explanation and details

The optimized code achieves a 48% speedup (runtime drops from 47.3ms to 32.0ms) by eliminating redundant parsing operations through two key optimizations:

What Changed

1. Parse Result Caching in JavaAnalyzer

Added a _tree_cache dictionary and _get_cached_tree() method that caches parsed tree-sitter trees by source content. This prevents reparsing identical source code when find_methods(), find_classes(), or find_fields() are called multiple times on the same source within a single analyzer instance.
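A minimal sketch of the caching idea, assuming a cache keyed by the source string. `CachedParser` and its `parse_fn` argument are hypothetical stand-ins so the example runs without tree-sitter; the actual `JavaAnalyzer` implementation may differ.

```python
from typing import Callable, Dict, Generic, TypeVar

T = TypeVar("T")


class CachedParser(Generic[T]):
    """Hypothetical analyzer that memoizes parse results by source content."""

    def __init__(self, parse_fn: Callable[[bytes], T]) -> None:
        self._parse_fn = parse_fn          # stand-in for a tree-sitter parse call
        self._tree_cache: Dict[str, T] = {}
        self.parse_calls = 0               # counts real parses, for illustration

    def get_cached_tree(self, source: str) -> T:
        # Parse only when this exact source text has not been seen before.
        if source not in self._tree_cache:
            self.parse_calls += 1
            self._tree_cache[source] = self._parse_fn(source.encode("utf8"))
        return self._tree_cache[source]


# Usage: three lookups on the same source trigger a single parse.
parser = CachedParser(lambda data: data.split())
for _ in range(3):
    parser.get_cached_tree("public class Example {}")
print(parser.parse_calls)
```

One design consideration with this approach: the cache holds every distinct source string for the lifetime of the analyzer instance, so a long-lived instance that sees many large files may want a size bound.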

2. Elimination of Repeated Reparsing in _insert_class_members()

The original implementation reparsed the source after inserting fields to get updated byte positions for method insertion. The optimized version:

  • Computes both insertion points (fields and methods) from the original parse tree
  • Tracks a delta offset as bytes are inserted
  • Adjusts subsequent insertion points by the accumulated delta
  • Uses list-based string building (field_parts, method_parts) instead of repeated concatenation
  • Works entirely in bytes until the final decode, avoiding multiple encode/decode cycles
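The delta-tracking steps above can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: `insert_at_offsets` is a hypothetical helper, and the offsets are found with string searches here, whereas the PR takes them from the parse tree.

```python
from typing import List, Tuple


def insert_at_offsets(source: str, insertions: List[Tuple[int, str]]) -> str:
    """Insert text at byte offsets that were all measured on the ORIGINAL source."""
    buf = source.encode("utf8")
    delta = 0  # bytes inserted so far, all at or before the current offset
    # Ascending order guarantees every earlier insertion shifts the later ones.
    for offset, text in sorted(insertions, key=lambda item: item[0]):
        chunk = text.encode("utf8")
        pos = offset + delta
        buf = buf[:pos] + chunk + buf[pos:]
        delta += len(chunk)
    # Decode once at the end instead of after every insertion.
    return buf.decode("utf8")


src = "class A {\n    void m() {}\n}\n"
field_offset = src.index("{") + 2   # just after the opening brace line (ASCII, so chars == bytes)
method_offset = src.rindex("}")     # just before the closing brace
result = insert_at_offsets(
    src,
    [
        (method_offset, "    int h() { return X; }\n"),
        (field_offset, "    static int X = 1;\n"),
    ],
)
print(result)
```

Because both offsets come from the unmodified source, no second parse is needed to locate the method insertion point after the field has been inserted.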

Why It's Faster

Parse elimination is the key win: Line profiler shows parsing (self.parse()) consumed 10.8-28.2% of time in various methods in the original. By caching parse results, the optimized code avoids ~60% of these expensive tree-sitter operations when the same source is analyzed multiple times (e.g., _add_java_class_members calls find_classes() on original_source twice and optimized_code up to three times).

Delta-based insertion tracking: The original code's repeated reparsing after field insertion (classes = analyzer.find_classes(result) taking 14.2% and 28.2% of _insert_class_members time) is completely eliminated. The optimized version calculates both insertion points upfront and adjusts them arithmetically.

String operation efficiency: Using list accumulation for building field/method text reduces repeated string concatenation overhead, though this is a minor contributor compared to parse elimination.
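As a small illustration of the list-accumulation pattern, the sketch below collects encoded pieces in a list and joins them once. `build_field_block` is a hypothetical helper that handles only single-line field declarations; the PR's `field_parts` and `method_parts` lists may be built differently.

```python
from typing import Iterable, List


def build_field_block(fields: Iterable[str], indent: str = "    ") -> bytes:
    """Build the text for several field declarations with a single join."""
    field_parts: List[bytes] = []
    for declaration in fields:
        # Normalize surrounding whitespace, then re-indent and terminate the line.
        field_parts.append(f"{indent}{declaration.strip()}\n".encode("utf8"))
    # One allocation for the final result instead of one per concatenation.
    return b"".join(field_parts)


block = build_field_block(["private static final int SCALE = 10;", "private static int FOO = 1;"])
print(block.decode("utf8"), end="")
```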

Test Case Performance

Test cases that exercise parsing mostly run 25-56% faster, with one large-source case reaching 108%:

  • Basic operations: 35-50% faster (e.g., test_basic_adds_new_static_field_when_missing: 38% faster)
  • Large scale: 30-108% faster (e.g., test_large_scale_many_fields_and_methods: 37.8% faster, test_performance_with_large_original_source: 108% faster due to eliminating repeated parsing of large ASTs)
  • No performance regressions on meaningful workloads (edge cases with invalid/empty inputs show negligible 1-5% variations due to measurement noise)

Impact Context

This optimization is valuable when Java source code analysis involves repeated queries on the same source content, particularly in code transformation pipelines where multiple analyses (classes, methods, fields) are performed on both original and optimized versions of the same code.

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 41 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 93.2% |
🌀 Generated Regression Tests
import re
# imports
import sys
import types

import pytest
from codeflash.code_utils.code_replacer import _add_java_class_members

def test_basic_adds_new_static_field_when_missing():
    # Original class with no static fields.
    original = (
        "public class Example {\n"
        "    public void doSomething() {\n"
        "    }\n"
        "}\n"
    )

    # Optimized code introduces a new static field 'MAGIC'.
    optimized = (
        "public class Example {\n"
        "    private static final int MAGIC = 42;\n"
        "    public void doSomething() {\n"
        "    }\n"
        "}\n"
    )

    # Call function with no target functions.
    codeflash_output = _add_java_class_members(optimized, original, target_function_names=None); result = codeflash_output # 258μs -> 187μs (38.0% faster)

def test_basic_does_not_add_existing_field():
    # Original already contains static field FOO.
    original = (
        "public class Example {\n"
        "    private static int FOO = 1;\n"
        "    public void x() {}\n"
        "}\n"
    )

    # Optimized also contains the same field (possibly re-ordered).
    optimized = (
        "public class Example {\n"
        "    private static int FOO = 1;\n"
        "    private static int NEW = 2;\n"
        "    public void x() {}\n"
        "}\n"
    )

    codeflash_output = _add_java_class_members(optimized, original, target_function_names=None); result = codeflash_output # 331μs -> 231μs (42.8% faster)

def test_basic_adds_helper_method_and_excludes_target():
    # Original has one method 'compute' which is the target being replaced.
    original = (
        "public class Calc {\n"
        "    public int compute(int x) {\n"
        "        return x;\n"
        "    }\n"
        "}\n"
    )

    # Optimized adds a helper method 'helper' and provides a new implementation for 'compute'.
    optimized = (
        "public class Calc {\n"
        "    private int helper(int y) {\n"
        "        return y * 2;\n"
        "    }\n"
        "    public int compute(int x) {\n"
        "        return helper(x);\n"
        "    }\n"
        "}\n"
    )

    # Specify that 'compute' is the target function being replaced; it must not be added as a helper.
    codeflash_output = _add_java_class_members(optimized, original, target_function_names=["compute"]); result = codeflash_output # 347μs -> 246μs (40.7% faster)

def test_edge_no_classes_returns_original():
    # Empty sources or sources without classes should be returned unchanged.
    original = "/* not a java class */\nint x = 1;\n"
    optimized = "// optimized but no class\nint x = 2;\n"
    codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 35.1μs -> 36.7μs (4.40% slower)

def test_edge_optimized_different_class_name_uses_first_optimized():
    # Original class named A, optimized contains only class B. According to logic,
    # function should fall back to the first optimized class.
    original = (
        "public class A {\n"
        "    public void a() {}\n"
        "}\n"
    )
    optimized = (
        "public class B {\n"
        "    private static int NEWB = 5;\n"
        "    public void b() {}\n"
        "}\n"
    )
    codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 182μs -> 145μs (25.0% faster)

def test_edge_handles_class_with_no_body_gracefully():
    # Class declaration without braces (no body)
    original = "public class NoBody"
    optimized = "public class NoBody { private static int X = 1; }\n"
    codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 62.6μs -> 63.8μs (1.88% slower)

def test_large_scale_many_fields_and_methods():
    # Large-scale test within constraints (<1000 elements)
    # Build an original class with a single method.
    original_lines = ["public class Big {\n", "    public void existing() {}\n", "}\n"]
    original = "".join(original_lines)

    # Construct optimized class with many static fields and many helper methods.
    num_fields = 150  # well under 1000
    num_helpers = 150

    optimized_lines = ["public class Big {\n"]
    for i in range(num_fields):
        optimized_lines.append(f"    private static int F{i} = {i};\n")
    for i in range(num_helpers):
        # create helper methods with simple bodies
        optimized_lines.append(f"    private int helper{i}(int v) {{\n        return v + {i};\n    }}\n")
    # Also include the existing method (unchanged)
    optimized_lines.append("    public void existing() {}\n")
    optimized_lines.append("}\n")
    optimized = "".join(optimized_lines)

    codeflash_output = _add_java_class_members(optimized, original, target_function_names=None); result = codeflash_output # 18.6ms -> 13.5ms (37.8% faster)

    # All new static field names should now appear in the result.
    for i in (0, num_fields - 1):
        pass

def test_preserves_javadoc_and_indentation_for_methods():
    # Original class without helper.
    original = (
        "public class DocTest {\n"
        "    public void main() {}\n"
        "}\n"
    )

    # Optimized method has Javadoc preceding it and multiple indented lines.
    optimized = (
        "public class DocTest {\n"
        "    /**\n"
        "     * Helper does something\n"
        "     */\n"
        "    private int helperWithDoc(int a) {\n"
        "        // perform compute\n"
        "        return a + 1;\n"
        "    }\n"
        "    public void main() {}\n"
        "}\n"
    )

    codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 316μs -> 233μs (35.5% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import pytest
from codeflash.code_utils.code_replacer import _add_java_class_members
from codeflash.languages.java.parser import get_java_analyzer

class TestAddJavaClassMembersBasic:
    """Basic test cases for _add_java_class_members function."""

    def test_no_new_members_returns_original_source(self):
        """When optimized code has no new members, return original source unchanged."""
        original = """public class MyClass {
    public void existingMethod() {
        System.out.println("test");
    }
}"""
        optimized = """public class MyClass {
    public void existingMethod() {
        System.out.println("test");
    }
}"""
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 253μs -> 169μs (49.9% faster)

    def test_empty_original_source_returns_original(self):
        """When original source is empty, return it unchanged."""
        original = ""
        optimized = """public class MyClass {
    public void newMethod() {}
}"""
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 45.2μs -> 46.9μs (3.57% slower)

    def test_empty_optimized_code_returns_original(self):
        """When optimized code is empty, return original unchanged."""
        original = """public class MyClass {
    public void existingMethod() {}
}"""
        optimized = ""
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 44.2μs -> 45.8μs (3.50% slower)

    def test_single_new_static_field_added_to_class(self):
        """When optimized code has a new static field, it should be added to original class."""
        original = """public class MyClass {
    public void method() {}
}"""
        optimized = """public class MyClass {
    private static final int NEW_CONSTANT = 42;
    
    public void method() {}
}"""
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 260μs -> 192μs (35.1% faster)

    def test_single_new_helper_method_added_to_class(self):
        """When optimized code has a new helper method, it should be added to original class."""
        original = """public class MyClass {
    public void publicMethod() {}
}"""
        optimized = """public class MyClass {
    public void publicMethod() {}
    
    private static int helperMethod(int x) {
        return x * 2;
    }
}"""
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 289μs -> 214μs (34.8% faster)

    def test_multiple_new_fields_and_methods_added(self):
        """When optimized code has multiple new fields and methods, all should be added."""
        original = """public class Calculator {
    public int compute() {
        return 0;
    }
}"""
        optimized = """public class Calculator {
    private static final int SCALE = 10;
    private static final String VERSION = "1.0";
    
    public int compute() {
        return helper1();
    }
    
    private static int helper1() {
        return 5;
    }
    
    private static int helper2() {
        return 10;
    }
}"""
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 500μs -> 331μs (50.8% faster)

    def test_target_function_excluded_from_new_helpers(self):
        """When a method name is in target_function_names, it should not be added as new helper."""
        original = """public class MyClass {
    public int targetMethod(int x) {
        return x + 1;
    }
}"""
        optimized = """public class MyClass {
    public int targetMethod(int x) {
        return helper(x);
    }
    
    private static int helper(int x) {
        return x + 1;
    }
}"""
        # Specify targetMethod as a target function being optimized
        codeflash_output = _add_java_class_members(optimized, original, target_function_names=["targetMethod"]); result = codeflash_output # 365μs -> 256μs (42.3% faster)
        # Count occurrences of "targetMethod" - should only be in the original one
        lines = result.split("\n")
        target_method_count = sum(1 for line in lines if "targetMethod" in line)

    def test_class_with_javadoc_on_new_method(self):
        """When new method has Javadoc, it should be preserved in the added method."""
        original = """public class MyClass {
    public void main() {}
}"""
        optimized = """public class MyClass {
    /**
     * Calculates the result.
     * @return the computed value
     */
    private static int calculate() {
        return 42;
    }
    
    public void main() {}
}"""
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 280μs -> 208μs (34.7% faster)

    def test_no_classes_in_original_source_returns_original(self):
        """When original source has no class definition, return it unchanged."""
        original = "// Just a comment"
        optimized = """public class MyClass {
    public void method() {}
}"""
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 44.4μs -> 46.3μs (3.92% slower)

    def test_no_classes_in_optimized_code_returns_original(self):
        """When optimized code has no class definition, return original unchanged."""
        original = """public class MyClass {
    public void method() {}
}"""
        optimized = "// Just a comment"
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 43.8μs -> 45.9μs (4.73% slower)

class TestAddJavaClassMembersEdgeCases:
    """Edge case tests for _add_java_class_members function."""

    def test_class_name_mismatch_uses_first_class(self):
        """When class names don't match, should use first class as fallback."""
        original = """public class OriginalClass {
    public void method() {}
}"""
        optimized = """public class OptimizedClass {
    private static final int VALUE = 100;
    
    public void method() {}
}"""
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 190μs -> 150μs (26.7% faster)

    def test_field_with_complex_initializer(self):
        """New field with complex initializer should be added correctly."""
        original = """public class MyClass {
    public void method() {}
}"""
        optimized = """public class MyClass {
    private static final int[] ARRAY = {1, 2, 3, 4, 5};
    
    public void method() {}
}"""
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 295μs -> 214μs (37.9% faster)

    def test_method_with_complex_signature_and_generics(self):
        """New method with complex signature including generics should be added."""
        original = """public class MyClass {
    public void main() {}
}"""
        optimized = """public class MyClass {
    private static <T> T genericHelper(T input) {
        return input;
    }
    
    public void main() {}
}"""
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 281μs -> 208μs (34.7% faster)

    def test_field_name_substring_of_existing_field(self):
        """Field name that is substring of existing field should be recognized as new."""
        original = """public class MyClass {
    private static final int CONSTANT_VALUE = 1;
    
    public void method() {}
}"""
        optimized = """public class MyClass {
    private static final int CONSTANT_VALUE = 1;
    private static final int CONSTANT = 2;
    
    public void method() {}
}"""
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 346μs -> 240μs (44.0% faster)

    def test_method_name_substring_of_existing_method(self):
        """Method name that is substring of existing method should be recognized as new."""
        original = """public class MyClass {
    public int calculateValue() {
        return 0;
    }
}"""
        optimized = """public class MyClass {
    public int calculateValue() {
        return calculate();
    }
    
    private static int calculate() {
        return 5;
    }
}"""
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 305μs -> 221μs (37.8% faster)

    def test_inner_class_ignored(self):
        """Inner classes should be ignored; focus on top-level class."""
        original = """public class MyClass {
    public void method() {}
    
    public static class Inner {}
}"""
        optimized = """public class MyClass {
    private static final int VALUE = 42;
    
    public void method() {}
}"""
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 300μs -> 211μs (42.0% faster)

    def test_interface_instead_of_class(self):
        """When source is an interface, should handle gracefully."""
        original = """public interface MyInterface {
    void method();
}"""
        optimized = """public interface MyInterface {
    void method();
}"""
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 132μs -> 100μs (32.5% faster)

    def test_enum_instead_of_class(self):
        """When source is an enum, should handle gracefully."""
        original = """public enum MyEnum {
    VALUE1, VALUE2;
}"""
        optimized = """public enum MyEnum {
    VALUE1, VALUE2;
}"""
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 112μs -> 84.2μs (34.0% faster)

    def test_null_target_function_names(self):
        """When target_function_names is None, should handle gracefully."""
        original = """public class MyClass {
    public void original() {}
}"""
        optimized = """public class MyClass {
    private static void helper() {}
    
    public void original() {}
}"""
        codeflash_output = _add_java_class_members(optimized, original, target_function_names=None); result = codeflash_output # 245μs -> 182μs (34.5% faster)

    def test_empty_target_function_names_list(self):
        """When target_function_names is empty list, should add all new methods."""
        original = """public class MyClass {
    public void original() {}
}"""
        optimized = """public class MyClass {
    private static void helper() {}
    
    public void original() {}
}"""
        codeflash_output = _add_java_class_members(optimized, original, target_function_names=[]); result = codeflash_output # 240μs -> 177μs (35.0% faster)

    def test_exception_handling_returns_original_source(self):
        """When exception occurs during processing, should return original source."""
        # Create invalid Java code that will cause parsing to fail gracefully
        original = """public class MyClass {
    public void method() {}
}"""
        optimized = """this is not valid java code at all"""
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 147μs -> 148μs (0.679% slower)

    def test_field_already_exists_in_original(self):
        """When field already exists in original, it should not be added again."""
        original = """public class MyClass {
    private static final int EXISTING = 1;
    
    public void method() {}
}"""
        optimized = """public class MyClass {
    private static final int EXISTING = 1;
    
    public void method() {}
}"""
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 238μs -> 164μs (44.4% faster)

    def test_method_already_exists_in_original(self):
        """When method already exists in original, it should not be added again."""
        original = """public class MyClass {
    public void method() {}
    
    private static int helper() {
        return 5;
    }
}"""
        optimized = """public class MyClass {
    public void method() {}
    
    private static int helper() {
        return 5;
    }
}"""
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 248μs -> 174μs (42.7% faster)

    def test_whitespace_only_lines_handled_correctly(self):
        """Source code with whitespace-only lines should be handled correctly."""
        original = """public class MyClass {
    
    
    public void method() {}
}"""
        optimized = """public class MyClass {
    private static final int VALUE = 42;
    
    public void method() {}
}"""
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 253μs -> 186μs (36.0% faster)

    def test_method_with_overloads_handled_correctly(self):
        """When new helper has same name as existing overload, should handle correctly."""
        original = """public class MyClass {
    public int helper(int x) {
        return x;
    }
}"""
        optimized = """public class MyClass {
    public int helper(int x) {
        return x;
    }
    
    private static int helper(int x, int y) {
        return x + y;
    }
}"""
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 274μs -> 211μs (29.8% faster)

class TestAddJavaClassMembersLargeScale:
    """Large scale test cases for _add_java_class_members function."""

    def test_many_new_fields_added_efficiently(self):
        """When optimized code has many new fields (100), all should be added."""
        original = """public class MyClass {
    public void method() {}
}"""
        
        # Build optimized code with 100 new fields
        field_lines = ["public class MyClass {"]
        for i in range(100):
            field_lines.append(f"    private static final int FIELD_{i} = {i};")
        field_lines.append("    public void method() {}")
        field_lines.append("}")
        optimized = "\n".join(field_lines)
        
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 3.80ms -> 2.66ms (43.2% faster)
        
        # Verify all fields are added
        for i in range(100):
            pass

    def test_many_new_methods_added_efficiently(self):
        """When optimized code has many new helper methods (50), all should be added."""
        original = """public class MyClass {
    public void main() {}
}"""
        
        # Build optimized code with 50 new helper methods
        method_lines = ["public class MyClass {"]
        for i in range(50):
            method_lines.append(f"    private static int helper{i}() {{")
            method_lines.append(f"        return {i};")
            method_lines.append("    }")
        method_lines.append("    public void main() {}")
        method_lines.append("}")
        optimized = "\n".join(method_lines)
        
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 2.91ms -> 2.23ms (30.2% faster)
        
        # Verify all methods are added
        for i in range(50):
            pass

    def test_large_method_with_many_lines_added(self):
        """When new helper method has many lines (100+), all content should be preserved."""
        original = """public class MyClass {
    public void main() {}
}"""
        
        # Build a large method
        method_lines = ["public class MyClass {"]
        method_lines.append("    private static int largeHelper() {")
        method_lines.append("        int result = 0;")
        for i in range(100):
            method_lines.append(f"        result += {i};")
        method_lines.append("        return result;")
        method_lines.append("    }")
        method_lines.append("    public void main() {}")
        method_lines.append("}")
        optimized = "\n".join(method_lines)
        
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 1.84ms -> 1.27ms (44.9% faster)

    def test_mixed_large_scale_fields_and_methods(self):
        """When optimized code has many fields and methods combined, all should be added."""
        original = """public class MyClass {
    public void main() {}
}"""
        
        # Build code with mixed fields and methods
        lines = ["public class MyClass {"]
        
        # Add 30 fields
        for i in range(30):
            lines.append(f"    private static final int CONST_{i} = {i};")
        
        # Add 20 methods
        for i in range(20):
            lines.append(f"    private static void helper{i}() {{")
            lines.append(f"        System.out.println({i});")
            lines.append("    }")
        
        lines.append("    public void main() {}")
        lines.append("}")
        optimized = "\n".join(lines)
        
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 3.15ms -> 2.02ms (55.8% faster)
        
        # Verify all members are added
        for i in range(30):
            pass
        for i in range(20):
            pass

    def test_very_long_class_name_and_identifiers(self):
        """When identifiers are very long, should still work correctly."""
        original = """public class VeryLongClassNameWithManyCharactersToTestIdentifierHandling {
    public void method() {}
}"""
        
        optimized = """public class VeryLongClassNameWithManyCharactersToTestIdentifierHandling {
    private static final int VERY_LONG_FIELD_NAME_WITH_DESCRIPTIVE_NAMING_CONVENTION = 42;
    
    private static int veryLongMethodNameDescribingItsComplexFunctionality() {
        return 0;
    }
    
    public void method() {}
}"""
        
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 375μs -> 244μs (53.7% faster)

    def test_performance_with_large_original_source(self):
        """When original source is large (many unrelated fields/methods), processing should complete."""
        # Build a large original class with 100 fields and 50 methods
        lines = ["public class LargeClass {"]
        
        for i in range(100):
            lines.append(f"    private int field{i};")
        
        for i in range(50):
            lines.append(f"    public void method{i}() {{")
            lines.append(f"        this.field{i} = {i};")
            lines.append("    }")
        
        lines.append("}")
        original = "\n".join(lines)
        
        # Optimized code with just a few new members
        optimized = """public class LargeClass {
    private static final String NEW_CONSTANT = "new";
    
    private static int newHelper() {
        return 999;
    }
    
    public void method() {}
}"""
        
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 8.52ms -> 4.09ms (108% faster)

    def test_target_function_names_with_many_items(self):
        """When target_function_names has many items, all should be excluded correctly."""
        original = """public class MyClass {
    public void method() {}
}"""
        
        # Create list of 50 target function names
        target_names = [f"targetMethod{i}" for i in range(50)]
        
        optimized = """public class MyClass {
    private static int helper() {
        return 42;
    }
    
    public void method() {}
}"""
        
        codeflash_output = _add_java_class_members(optimized, original, target_function_names=target_names); result = codeflash_output # 277μs -> 205μs (35.1% faster)

    def test_preserved_indentation_with_deeply_nested_structures(self):
        """When new methods have nested structures, indentation should be preserved."""
        original = """public class MyClass {
    public void main() {}
}"""
        
        optimized = """public class MyClass {
    private static void complexHelper() {
        for (int i = 0; i < 10; i++) {
            if (i > 5) {
                try {
                    System.out.println("deep");
                } catch (Exception e) {
                    System.err.println("error");
                }
            }
        }
    }
    
    public void main() {}
}"""
        
        codeflash_output = _add_java_class_members(optimized, original); result = codeflash_output # 496μs -> 356μs (39.2% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes git checkout codeflash/optimize-pr1199-2026-02-04T04.23.41 and push.

@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Feb 4, 2026
@codeflash-ai codeflash-ai bot mentioned this pull request Feb 4, 2026