⚡️ Speed up function _rel_to_abs_instr by 19% #109

Open
codeflash-ai[bot] wants to merge 1 commit into main from codeflash/optimize-_rel_to_abs_instr-mlccl8m1

codeflash-ai bot commented Feb 7, 2026

📄 19% (0.19x) speedup for _rel_to_abs_instr in src/datasets/arrow_reader.py

⏱️ Runtime: 474 microseconds → 397 microseconds (best of 117 runs)

📝 Explanation and details

This optimization achieves a 19% runtime improvement by restructuring control flow to eliminate redundant operations in the hot path of the _rel_to_abs_instr function.

Key Optimizations:

  1. Lazy pct_to_abs Function Selection (Primary Speedup)

    • Original: Selected pct_to_abs function unconditionally at the start, even when not needed for non-percentage units
    • Optimized: Moved selection inside the if rel_instr.unit == "%" block, eliminating unnecessary attribute lookup and conditional evaluation for ~84% of test cases (227 absolute vs 43 percentage calls)
    • Impact: Saves ~180-190ns per call in the absolute unit path (visible in line profiler: 185825ns → 22406ns for pct_to_abs selection)
  2. Streamlined Boundary Clamping Logic

    • Original: Used min() operations for all values regardless of whether they exceeded bounds
    • Optimized: Replaced with elif chains that only apply clamping when needed (if from_ < 0: ... elif from_ > num_examples: ...)
    • Impact: Avoids the min() call overhead when values are already in the valid range; only a plain comparison remains on that path
  3. Reduced min() Function Overhead

    • Original: 4 min()/max() calls per invocation (2 for from_, 2 for to)
    • Optimized: Conditional assignment eliminates 2 min() calls when values don't exceed boundaries
    • Impact: Particularly beneficial for absolute unit instructions where indices are typically pre-validated
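
The control-flow change described in the three points above can be sketched as follows. This is a reconstruction from the description, not the actual diff: `Rel`, `Abs`, `pct_closest`, `pct_pct1`, and both function names are hypothetical stand-ins, and the error messages are placeholders.

```python
import math
from collections import namedtuple

# Hypothetical stand-ins; the real classes live in src/datasets/arrow_reader.py.
Rel = namedtuple("Rel", "splitname from_ to unit rounding")
Abs = namedtuple("Abs", "splitname from_ to")


def pct_closest(boundary, num_examples):
    return int(round(boundary * num_examples / 100.0))


def pct_pct1(boundary, num_examples):
    if num_examples < 100:
        raise ValueError("pct1 rounding needs at least 100 examples")
    return boundary * math.trunc(num_examples / 100.0)


def rel_to_abs_original(rel, name2len):
    # Helper is chosen up front, even for absolute units that never use it.
    pct_to_abs = pct_closest if rel.rounding == "closest" else pct_pct1
    if rel.splitname not in name2len:
        raise ValueError(f'Unknown split "{rel.splitname}".')
    n = name2len[rel.splitname]
    from_, to = rel.from_, rel.to
    if rel.unit == "%":
        from_ = 0 if from_ is None else pct_to_abs(from_, n)
        to = n if to is None else pct_to_abs(to, n)
    else:
        from_ = 0 if from_ is None else from_
        to = n if to is None else to
    if from_ < 0:
        from_ = max(n + from_, 0)
    if to < 0:
        to = max(n + to, 0)
    # min() runs on every call, whether or not the value is out of range.
    return Abs(rel.splitname, min(from_, n), min(to, n))


def rel_to_abs_optimized(rel, name2len):
    if rel.splitname not in name2len:
        raise ValueError(f'Unknown split "{rel.splitname}".')
    n = name2len[rel.splitname]
    from_, to = rel.from_, rel.to
    if rel.unit == "%":
        # Helper is chosen only on the percentage path.
        pct_to_abs = pct_closest if rel.rounding == "closest" else pct_pct1
        from_ = 0 if from_ is None else pct_to_abs(from_, n)
        to = n if to is None else pct_to_abs(to, n)
    else:
        from_ = 0 if from_ is None else from_
        to = n if to is None else to
    # A negative value wraps from the end and so cannot exceed n afterwards,
    # which is why the upper clamp can live in an elif branch.
    if from_ < 0:
        from_ = max(n + from_, 0)
    elif from_ > n:
        from_ = n
    if to < 0:
        to = max(n + to, 0)
    elif to > n:
        to = n
    return Abs(rel.splitname, from_, to)
```

Under these assumptions both versions return the same result for every input; only the work done on the absolute-unit, in-range path differs.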

Performance Context from function_references:

The function is called within to_absolute() via list comprehension, iterating over _relative_instructions. This means:

  • The optimization compounds when processing multiple split instructions
  • The 19% per-call speedup translates directly to dataset loading performance
  • Since dataset splits are processed during initialization, this optimization reduces startup latency
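
The call pattern described above can be illustrated with a small sketch. The `Instructions` class and the `convert` function are hypothetical stand-ins: `convert` is a simplified placeholder for `_rel_to_abs_instr` that handles absolute units only and does not wrap negative indices, and only the names `to_absolute` and `_relative_instructions` come from the description.

```python
from collections import namedtuple

# Hypothetical stand-ins for the real instruction classes.
Rel = namedtuple("Rel", "splitname from_ to")
Abs = namedtuple("Abs", "splitname from_ to")


def convert(rel, name2len):
    # Simplified placeholder for _rel_to_abs_instr (absolute units only).
    n = name2len[rel.splitname]
    from_ = 0 if rel.from_ is None else rel.from_
    to = n if rel.to is None else rel.to
    return Abs(rel.splitname, max(0, min(from_, n)), max(0, min(to, n)))


class Instructions:
    def __init__(self, relative_instructions):
        self._relative_instructions = list(relative_instructions)

    def to_absolute(self, name2len):
        # One converter call per relative instruction, so any per-call
        # saving is paid back once for each instruction in the list.
        return [convert(rel, name2len) for rel in self._relative_instructions]


instructions = Instructions([Rel("train", None, 100), Rel("test", 50, None)])
absolute = instructions.to_absolute({"train": 1000, "test": 200})
```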

Test Case Analysis:

Every generated test ran faster with the optimized code. By scenario:

  • Absolute unit cases (most common): 22-35% faster due to eliminated pct_to_abs selection
  • Percentage cases: 9-16% faster from streamlined clamping logic
  • Boundary handling (negative indices, exceeding bounds): 13-32% faster from elif optimization
  • Sequential operations (100-iteration tests): 21% faster, showing consistent performance gain

The optimization preserves behavior (all 273 generated regression tests pass) and improves runtime by ordering the conditional checks to match the call mix seen in the tests, where absolute units are more common than percentages (227 vs 43 calls).
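
The percentages quoted in this PR are consistent with the formula (original − optimized) / optimized; for the headline figures, (474 − 397) / 397 ≈ 0.194. The sketch below shows that arithmetic together with a "best of N runs" timing helper. Using `timeit` here is an assumption for illustration; the PR does not say how Codeflash measures runtime, and `best_of` and `speedup` are hypothetical names.

```python
import timeit


def best_of(func, runs=117, number=1000):
    """Return the fastest per-call time in seconds across `runs` repeats."""
    timings = timeit.repeat(func, repeat=runs, number=number)
    return min(timings) / number


def speedup(original, optimized):
    """Relative speedup: (original - optimized) / optimized."""
    return (original - optimized) / optimized


# Headline figures reported in this PR: 474 microseconds -> 397 microseconds.
print(f"{speedup(474, 397):.1%}")  # 19.4%
```

Taking the minimum over repeats, rather than the mean, reduces the influence of scheduling noise on microsecond-scale measurements.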

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 273 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests
import math
from dataclasses import dataclass
from typing import Dict, Optional

# imports
import pytest  # used for our unit tests
from src.datasets.arrow_reader import _rel_to_abs_instr

# function to test
# file: src/datasets/arrow_reader.py
# (The function under test is imported above; the minimal stand-in classes
# below provide the attributes it reads so that it can run.)
@dataclass
class _AbsoluteInstruction:
    # Simple container to mirror the expected structure returned by _rel_to_abs_instr
    split: str
    from_: int
    to: int

@dataclass
class RelativeInstruction:
    # Minimal stand-in for the RelativeInstruction used by _rel_to_abs_instr.
    # Fields are named to match what the function accesses.
    splitname: str
    from_: Optional[int]
    to: Optional[int]
    unit: str  # "%" or other (for absolute)
    rounding: str  # "closest" or other (triggers pct1 behavior)

def test_pct_closest_basic_conversion():
    # Test closest rounding with percentages: simple case
    name2len = {"train": 1000}
    rel = RelativeInstruction(splitname="train", from_=10, to=20, unit="%", rounding="closest")
    # 10% of 1000 rounds to 100, 20% to 200
    codeflash_output = _rel_to_abs_instr(rel, name2len); res = codeflash_output # 5.13μs -> 4.60μs (11.5% faster)

def test_pct_pct1_truncation_behavior():
    # Test percent rounding using pct1-style truncation (any rounding != "closest")
    # math.trunc(350/100.0) == 3, so boundary * 3
    name2len = {"validation": 350}
    rel = RelativeInstruction(splitname="validation", from_=10, to=50, unit="%", rounding="pct1")
    codeflash_output = _rel_to_abs_instr(rel, name2len); res = codeflash_output # 4.83μs -> 4.63μs (4.30% faster)

def test_unit_absolute_none_handling():
    # When unit is absolute (not "%"), None maps to 0 and num_examples
    name2len = {"test": 123}
    rel = RelativeInstruction(splitname="test", from_=None, to=None, unit="abs", rounding="closest")
    codeflash_output = _rel_to_abs_instr(rel, name2len); res = codeflash_output # 3.07μs -> 2.50μs (22.9% faster)

def test_unknown_split_raises_value_error():
    # If split not present in name2len, a ValueError is raised with helpful message
    name2len = {"a": 10, "b": 20}
    rel = RelativeInstruction(splitname="missing", from_=0, to=1, unit="abs", rounding="closest")
    with pytest.raises(ValueError) as excinfo:
        _rel_to_abs_instr(rel, name2len) # 4.72μs -> 4.48μs (5.47% faster)

def test_pct1_with_small_split_raises_specific_error():
    # pct1 rounding is forbidden for num_examples < 100 and should raise ValueError with the correct message fragment
    name2len = {"s": 50}
    rel = RelativeInstruction(splitname="s", from_=10, to=20, unit="%", rounding="pct1")
    with pytest.raises(ValueError) as excinfo:
        _rel_to_abs_instr(rel, name2len) # 2.48μs -> 2.25μs (10.3% faster)
    msg = str(excinfo.value)

def test_negative_indices_wrap_and_clamp():
    # Negative indices should be interpreted as offsets from the end and clamped between 0 and num_examples
    name2len = {"set": 100}
    # from_ = -10 => 100 + (-10) = 90 ; to = -1 => 99
    rel = RelativeInstruction(splitname="set", from_=-10, to=-1, unit="abs", rounding="closest")
    codeflash_output = _rel_to_abs_instr(rel, name2len); res = codeflash_output # 3.72μs -> 3.27μs (13.7% faster)

def test_from_to_clamped_to_num_examples():
    # If from_ or to exceed num_examples, they should be clamped to num_examples
    name2len = {"s": 10}
    rel = RelativeInstruction(splitname="s", from_=1000, to=2000, unit="abs", rounding="closest")
    codeflash_output = _rel_to_abs_instr(rel, name2len); res = codeflash_output # 3.14μs -> 2.37μs (32.4% faster)

def test_percent_none_bounds_closest():
    # When using percentages with rounding closest and None for bounds, mapping should be [0, num_examples]
    name2len = {"x": 999}
    rel = RelativeInstruction(splitname="x", from_=None, to=None, unit="%", rounding="closest")
    codeflash_output = _rel_to_abs_instr(rel, name2len); res = codeflash_output # 3.20μs -> 2.51μs (27.3% faster)

def test_close_rounding_ties_behavior():
    # Validate tie/rounding behavior for 'closest' using a scenario with potential rounding ties.
    # Use numbers where the product is near .5 to observe Python's round() behavior (ties to even).
    name2len = {"small": 3}
    # boundary = 33 => 33 * 3 / 100 = 0.99 -> rounds to 1
    rel = RelativeInstruction(splitname="small", from_=33, to=66, unit="%", rounding="closest")
    codeflash_output = _rel_to_abs_instr(rel, name2len); res = codeflash_output # 5.14μs -> 4.71μs (9.20% faster)

def test_many_small_instructions_loop():
    # Create a moderate number of splits (100) and verify the function remains correct across them.
    # This checks scalability without exceeding the 1000-iteration limit.
    count = 100
    name2len: Dict[str, int] = {f"split_{i}": (i + 1) * 3 for i in range(count)}  # sizes 3,6,9,...
    for i in range(count):
        split_name = f"split_{i}"
        size = name2len[split_name]
        # Create a RelativeInstruction that asks for an absolute range [1,2]
        rel = RelativeInstruction(splitname=split_name, from_=1, to=2, unit="abs", rounding="closest")
        codeflash_output = _rel_to_abs_instr(rel, name2len); res = codeflash_output # 113μs -> 94.0μs (21.1% faster)

def test_percent_conversions_across_various_sizes():
    # Test percent conversions across a variety of sizes for both rounding modes.
    # Keeps total checks reasonable (<1000) while covering diverse sizes.
    sizes = [100, 101, 250, 999]
    for size in sizes:
        name2len = {"a": size}
        # closest rounding
        rel_closest = RelativeInstruction(splitname="a", from_=1, to=50, unit="%", rounding="closest")
        codeflash_output = _rel_to_abs_instr(rel_closest, name2len); res_closest = codeflash_output # 10.9μs -> 9.75μs (11.4% faster)
        # pct1 rounding (allowed since sizes >= 100)
        rel_pct1 = RelativeInstruction(splitname="a", from_=1, to=50, unit="%", rounding="pct1")
        codeflash_output = _rel_to_abs_instr(rel_pct1, name2len); res_pct1 = codeflash_output # 8.00μs -> 7.22μs (10.9% faster)
        # For size 100 specifically, truncation factor is exactly 1, so values should equal boundaries
        if size == 100:
            pass
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import math

import pytest
from src.datasets.arrow_reader import _rel_to_abs_instr

# Helper classes and functions from the source code
class RelativeInstruction:
    """Represents a relative instruction for splitting data."""
    def __init__(self, splitname, unit, from_=None, to=None, rounding="closest"):
        self.splitname = splitname
        self.unit = unit
        self.from_ = from_
        self.to = to
        self.rounding = rounding

class TestBasicFunctionality:
    """Basic tests for normal operation scenarios."""

    def test_absolute_instruction_with_none_from_and_to(self):
        """Test absolute instruction with None values for from_ and to (should use full range)."""
        rel_instr = RelativeInstruction(splitname="train", unit="abs", from_=None, to=None)
        name2len = {"train": 1000}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 3.39μs -> 2.58μs (31.5% faster)

    def test_absolute_instruction_with_explicit_values(self):
        """Test absolute instruction with explicit from_ and to values."""
        rel_instr = RelativeInstruction(splitname="train", unit="abs", from_=100, to=500)
        name2len = {"train": 1000}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 3.26μs -> 2.41μs (34.9% faster)

    def test_absolute_instruction_only_from_specified(self):
        """Test absolute instruction with only from_ specified."""
        rel_instr = RelativeInstruction(splitname="train", unit="abs", from_=200, to=None)
        name2len = {"train": 1000}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 3.32μs -> 2.48μs (33.9% faster)

    def test_absolute_instruction_only_to_specified(self):
        """Test absolute instruction with only to specified."""
        rel_instr = RelativeInstruction(splitname="train", unit="abs", from_=None, to=700)
        name2len = {"train": 1000}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 3.29μs -> 2.56μs (28.5% faster)

    def test_percentage_instruction_with_none_from_and_to(self):
        """Test percentage instruction with None values (should use full range)."""
        rel_instr = RelativeInstruction(splitname="train", unit="%", from_=None, to=None, rounding="closest")
        name2len = {"train": 1000}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 3.25μs -> 2.65μs (22.7% faster)

    def test_percentage_instruction_with_explicit_values(self):
        """Test percentage instruction with explicit percentage values."""
        rel_instr = RelativeInstruction(splitname="train", unit="%", from_=10, to=50, rounding="closest")
        name2len = {"train": 1000}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 5.33μs -> 4.56μs (16.8% faster)

    def test_percentage_instruction_rounding_closest(self):
        """Test percentage instruction with closest rounding method."""
        rel_instr = RelativeInstruction(splitname="train", unit="%", from_=33, to=67, rounding="closest")
        name2len = {"train": 1000}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 5.35μs -> 4.68μs (14.3% faster)

    def test_percentage_instruction_rounding_pct1(self):
        """Test percentage instruction with pct1_dropremainder rounding method."""
        rel_instr = RelativeInstruction(splitname="train", unit="%", from_=10, to=50, rounding="pct1_dropremainder")
        name2len = {"train": 1000}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 5.09μs -> 4.38μs (16.4% faster)

    def test_multiple_splits_specified(self):
        """Test with multiple splits in name2len dictionary."""
        rel_instr = RelativeInstruction(splitname="validation", unit="abs", from_=10, to=90)
        name2len = {"train": 1000, "validation": 200, "test": 300}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 3.25μs -> 2.39μs (35.8% faster)

class TestEdgeCases:
    """Edge case tests for unusual or extreme conditions."""

    def test_unknown_split_name_raises_error(self):
        """Test that ValueError is raised for unknown split name."""
        rel_instr = RelativeInstruction(splitname="unknown", unit="abs", from_=0, to=100)
        name2len = {"train": 1000, "test": 200}
        with pytest.raises(ValueError) as exc_info:
            _rel_to_abs_instr(rel_instr, name2len) # 4.98μs -> 4.81μs (3.43% faster)

    def test_zero_examples_in_split(self):
        """Test behavior when split has zero examples."""
        rel_instr = RelativeInstruction(splitname="train", unit="abs", from_=None, to=None)
        name2len = {"train": 0}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 3.29μs -> 2.54μs (29.6% faster)

    def test_single_example_in_split(self):
        """Test behavior with a single example in split."""
        rel_instr = RelativeInstruction(splitname="train", unit="abs", from_=None, to=None)
        name2len = {"train": 1}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 3.25μs -> 2.54μs (28.4% faster)

    def test_negative_absolute_index_from(self):
        """Test negative indexing for from_ with absolute units (like Python negative indices)."""
        rel_instr = RelativeInstruction(splitname="train", unit="abs", from_=-100, to=None)
        name2len = {"train": 1000}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 3.64μs -> 3.18μs (14.4% faster)

    def test_negative_absolute_index_to(self):
        """Test negative indexing for to with absolute units."""
        rel_instr = RelativeInstruction(splitname="train", unit="abs", from_=None, to=-200)
        name2len = {"train": 1000}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 3.55μs -> 3.03μs (17.1% faster)

    def test_negative_absolute_indices_both_from_and_to(self):
        """Test negative indexing for both from_ and to."""
        rel_instr = RelativeInstruction(splitname="train", unit="abs", from_=-500, to=-100)
        name2len = {"train": 1000}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 3.70μs -> 3.06μs (20.7% faster)

    def test_from_greater_than_to_absolute(self):
        """Test when from_ is greater than to (should result in valid bounds)."""
        rel_instr = RelativeInstruction(splitname="train", unit="abs", from_=700, to=300)
        name2len = {"train": 1000}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 3.22μs -> 2.45μs (31.6% faster)

    def test_from_exceeds_num_examples_absolute(self):
        """Test when from_ exceeds number of examples (should be clipped)."""
        rel_instr = RelativeInstruction(splitname="train", unit="abs", from_=1500, to=None)
        name2len = {"train": 1000}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 3.31μs -> 2.50μs (32.5% faster)

    def test_to_exceeds_num_examples_absolute(self):
        """Test when to exceeds number of examples (should be clipped)."""
        rel_instr = RelativeInstruction(splitname="train", unit="abs", from_=None, to=2000)
        name2len = {"train": 1000}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 3.19μs -> 2.57μs (24.3% faster)

    def test_percentage_zero_percent(self):
        """Test percentage instruction with zero percent values."""
        rel_instr = RelativeInstruction(splitname="train", unit="%", from_=0, to=0, rounding="closest")
        name2len = {"train": 1000}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 5.30μs -> 4.60μs (15.3% faster)

    def test_percentage_100_percent(self):
        """Test percentage instruction with 100 percent values."""
        rel_instr = RelativeInstruction(splitname="train", unit="%", from_=0, to=100, rounding="closest")
        name2len = {"train": 1000}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 5.44μs -> 4.73μs (14.9% faster)

    def test_percentage_rounding_pct1_with_small_split_raises_error(self):
        """Test that pct1_dropremainder raises error on split with < 100 examples."""
        rel_instr = RelativeInstruction(splitname="train", unit="%", from_=10, to=50, rounding="pct1_dropremainder")
        name2len = {"train": 99}
        with pytest.raises(ValueError) as exc_info:
            _rel_to_abs_instr(rel_instr, name2len) # 2.65μs -> 2.34μs (13.1% faster)

    def test_percentage_rounding_pct1_with_exactly_100_examples(self):
        """Test pct1_dropremainder with exactly 100 examples (boundary condition)."""
        rel_instr = RelativeInstruction(splitname="train", unit="%", from_=50, to=100, rounding="pct1_dropremainder")
        name2len = {"train": 100}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 5.22μs -> 4.38μs (19.0% faster)

    def test_negative_percentage_closest_rounding(self):
        """Test negative percentage values with closest rounding."""
        rel_instr = RelativeInstruction(splitname="train", unit="%", from_=-50, to=-10, rounding="closest")
        name2len = {"train": 1000}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 5.92μs -> 5.41μs (9.50% faster)

    def test_from_negative_index_exceeds_split_size(self):
        """Test when negative index from_ exceeds split size."""
        rel_instr = RelativeInstruction(splitname="train", unit="abs", from_=-2000, to=None)
        name2len = {"train": 1000}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 3.66μs -> 3.18μs (15.0% faster)

    def test_to_negative_index_exceeds_split_size(self):
        """Test when negative index to exceeds split size."""
        rel_instr = RelativeInstruction(splitname="train", unit="abs", from_=None, to=-2000)
        name2len = {"train": 1000}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 3.61μs -> 3.13μs (15.1% faster)

    def test_empty_name2len_dict(self):
        """Test with empty name2len dictionary."""
        rel_instr = RelativeInstruction(splitname="train", unit="abs", from_=0, to=100)
        name2len = {}
        with pytest.raises(ValueError) as exc_info:
            _rel_to_abs_instr(rel_instr, name2len) # 2.87μs -> 2.58μs (11.5% faster)

    def test_percentage_non_integer_result_closest(self):
        """Test percentage calculation that results in non-integer with closest rounding."""
        rel_instr = RelativeInstruction(splitname="train", unit="%", from_=33.3, to=66.7, rounding="closest")
        name2len = {"train": 1000}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 5.26μs -> 4.51μs (16.7% faster)

    def test_very_large_split_size(self):
        """Test with very large split size."""
        rel_instr = RelativeInstruction(splitname="train", unit="abs", from_=1000000, to=5000000)
        name2len = {"train": 10000000}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 3.16μs -> 2.46μs (28.6% faster)

    def test_percentage_with_very_large_split(self):
        """Test percentage calculation with very large split size."""
        rel_instr = RelativeInstruction(splitname="train", unit="%", from_=10, to=90, rounding="closest")
        name2len = {"train": 10000000}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 5.44μs -> 4.60μs (18.2% faster)

class TestLargeScale:
    """Large scale tests to assess performance and scalability."""

    def test_large_number_of_splits(self):
        """Test with a large number of splits in name2len."""
        # Create name2len with 500 different splits
        name2len = {f"split_{i}": 10000 + i * 100 for i in range(500)}
        rel_instr = RelativeInstruction(splitname="split_250", unit="abs", from_=100, to=500)
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 3.37μs -> 2.71μs (24.4% faster)

    def test_large_absolute_indices(self):
        """Test with very large absolute indices."""
        rel_instr = RelativeInstruction(splitname="train", unit="abs", from_=500000, to=900000)
        name2len = {"train": 1000000}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 3.25μs -> 2.44μs (33.2% faster)

    def test_percentage_with_large_numbers(self):
        """Test percentage calculation with large dataset sizes."""
        rel_instr = RelativeInstruction(splitname="train", unit="%", from_=25, to=75, rounding="closest")
        name2len = {"train": 1000000}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 5.23μs -> 4.55μs (14.9% faster)

    def test_many_sequential_calls(self):
        """Test multiple sequential calls to _rel_to_abs_instr."""
        name2len = {"train": 1000, "test": 500}
        results = []
        # Call function 100 times with different parameters
        for i in range(100):
            rel_instr = RelativeInstruction(
                splitname="train",
                unit="abs",
                from_=i * 5,
                to=(i + 1) * 5
            )
            codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 113μs -> 92.7μs (21.9% faster)
            results.append(result)
        for i, result in enumerate(results):
            pass

    def test_large_percentage_calculations(self):
        """Test percentage calculations across many different values."""
        name2len = {"train": 100000}
        percentages = [i for i in range(0, 101, 10)]
        results = []
        
        for pct in percentages:
            rel_instr = RelativeInstruction(
                splitname="train",
                unit="%",
                from_=pct,
                to=100,
                rounding="closest"
            )
            codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 22.9μs -> 20.0μs (14.7% faster)
            results.append(result)
        for i in range(len(results) - 1):
            pass

    def test_extreme_negative_indices(self):
        """Test with extreme negative indices on large dataset."""
        rel_instr = RelativeInstruction(splitname="train", unit="abs", from_=-900000, to=-100000)
        name2len = {"train": 1000000}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 3.68μs -> 3.16μs (16.3% faster)

class TestIntegrationAndConsistency:
    """Tests for consistency and integration scenarios."""

    def test_absolute_and_percentage_equivalence_full_range(self):
        """Test that absolute and percentage instructions can produce equivalent results."""
        name2len = {"train": 1000}
        
        # Using percentage for full range
        rel_instr_pct = RelativeInstruction(splitname="train", unit="%", from_=0, to=100, rounding="closest")
        codeflash_output = _rel_to_abs_instr(rel_instr_pct, name2len); result_pct = codeflash_output # 5.36μs -> 4.64μs (15.5% faster)
        
        # Using absolute for full range
        rel_instr_abs = RelativeInstruction(splitname="train", unit="abs", from_=0, to=1000)
        codeflash_output = _rel_to_abs_instr(rel_instr_abs, name2len); result_abs = codeflash_output # 1.64μs -> 1.31μs (25.3% faster)

    def test_consistency_across_rounding_methods_large_percentages(self):
        """Test that both rounding methods produce results for large enough splits."""
        name2len = {"train": 1000}
        
        rel_instr_closest = RelativeInstruction(splitname="train", unit="%", from_=50, to=100, rounding="closest")
        codeflash_output = _rel_to_abs_instr(rel_instr_closest, name2len); result_closest = codeflash_output # 5.29μs -> 4.51μs (17.2% faster)
        
        rel_instr_pct1 = RelativeInstruction(splitname="train", unit="%", from_=50, to=100, rounding="pct1_dropremainder")
        codeflash_output = _rel_to_abs_instr(rel_instr_pct1, name2len); result_pct1 = codeflash_output # 3.18μs -> 2.74μs (15.8% faster)

    def test_from_and_to_clipping_behavior(self):
        """Test that from_ and to are properly clipped to valid range."""
        rel_instr = RelativeInstruction(splitname="train", unit="abs", from_=-100, to=1500)
        name2len = {"train": 1000}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 3.60μs -> 3.14μs (14.7% faster)

    def test_percentage_calculation_precision(self):
        """Test that percentage calculations maintain precision for various split sizes."""
        test_cases = [
            (100, 25, 0),    # 25% of 100 = 25
            (200, 50, 0),    # 50% of 200 = 100
            (333, 33, 0),    # 33% of 333 ≈ 110
            (1000, 10, 0),   # 10% of 1000 = 100
        ]
        
        for num_examples, percentage, _ in test_cases:
            rel_instr = RelativeInstruction(splitname="train", unit="%", from_=0, to=percentage, rounding="closest")
            name2len = {"train": num_examples}
            codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 11.3μs -> 9.80μs (15.1% faster)

    def test_none_values_behave_consistently(self):
        """Test that None values consistently map to split boundaries."""
        rel_instr = RelativeInstruction(splitname="train", unit="abs", from_=None, to=None)
        name2len = {"train": 500}
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 3.42μs -> 2.54μs (34.8% faster)

    def test_absolute_zero_and_max_boundaries(self):
        """Test behavior at zero and maximum boundaries."""
        name2len = {"train": 1000}
        
        # Test from_ = 0
        rel_instr = RelativeInstruction(splitname="train", unit="abs", from_=0, to=500)
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 3.23μs -> 2.41μs (34.4% faster)
        
        # Test to = max
        rel_instr = RelativeInstruction(splitname="train", unit="abs", from_=500, to=1000)
        codeflash_output = _rel_to_abs_instr(rel_instr, name2len); result = codeflash_output # 1.59μs -> 1.23μs (30.1% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
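
The comment above says the captured outputs are compared between the original and optimized code. A minimal sketch of such a comparison, assuming two implementations count as equivalent when each input yields either equal return values or the same exception type. The helper names and the toy `clamp_*` functions are hypothetical and are not part of Codeflash.

```python
def outcome(func, *args):
    """Capture either the return value or the exception type of a call."""
    try:
        return ("ok", func(*args))
    except Exception as exc:
        return ("raised", type(exc))


def same_behavior(original, optimized, cases):
    """True when both callables agree on every argument tuple in `cases`."""
    return all(outcome(original, *c) == outcome(optimized, *c) for c in cases)


# Toy pair mirroring the clamping change: min()/max() versus if/elif.
def clamp_minmax(value, upper):
    if upper < 0:
        raise ValueError("upper must be non-negative")
    return max(0, min(value, upper))


def clamp_branch(value, upper):
    if upper < 0:
        raise ValueError("upper must be non-negative")
    if value < 0:
        return 0
    elif value > upper:
        return upper
    return value


cases = [(5, 10), (-3, 10), (42, 10), (0, 0), (1, -1)]
print(same_behavior(clamp_minmax, clamp_branch, cases))
```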

To edit these changes git checkout codeflash/optimize-_rel_to_abs_instr-mlccl8m1 and push.


codeflash-ai bot requested a review from aseembits93 on February 7, 2026, 13:26
codeflash-ai bot added the labels ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: High (Optimization Quality according to Codeflash) on Feb 7, 2026