Skip to content

feat: add import profiler tool with dynamic loaded lines code volume …#17467

Merged
hebaalazzeh merged 32 commits into
mainfrom
feat/import-profiler-tool
Jun 30, 2026
Merged

feat: add import profiler tool with dynamic loaded lines code volume …#17467
hebaalazzeh merged 32 commits into
mainfrom
feat/import-profiler-tool

Conversation

@hebaalazzeh

@hebaalazzeh hebaalazzeh commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

Overview

This PR introduces profiler.py, a zero-dependency, process-isolated import profiling tool designed to benchmark Python SDK modules with extreme precision and zero dynamic cache interference.

To evaluate authentic import initialization overhead without the "observer effect", the profiler leverages a Master/Worker architecture. The master orchestrator executes each benchmark iteration inside a completely isolated, dynamically spawned subprocess. By bypassing Python's sys.modules cache and sweeping disk-level __pycache__ directories, the tool guarantees a 100% authentic "cold-start" evaluation every time.

Key Telemetry Vectors & Features

  • Initialization Latency (ms): Measured at high resolution natively, securely omitting VM startup skew.
  • Physical Memory Deltas (RAM): Employs a dual-strategy approach tracking both internal block allocator shifts via tracemalloc (cross-platform), and pure OS-level physical memory footprints via /proc/self/statm deltas (Linux RSS) to catch heavy C/C++ backend extensions like protobuf.
  • Deterministic Code Volume (LOC): Tracks the physical delta of sys.modules across the import window, traversing disk paths to aggregate the exact number of physical Lines of Code (LOC) injected into the interpreter.
  • CPU Pinning: Binds worker processes to a single CPU core via taskset on Linux to eliminate OS-level context-switching jitter across iterations.
  • Diagnostic Tooling: Supports one-off diagnostic tracing for bottleneck discovery via --trace (-X importtime) and --cprofile.

A comprehensive breakdown of the architecture, data pipeline, and trace log analysis is documented in scripts/import_profiler/documentation.md.

Related Links

@hebaalazzeh hebaalazzeh self-assigned this Jun 15, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a process-isolated Python SDK Import Profiler tool (profiler.py) along with its comprehensive documentation (documentation.md). The profiler measures initialization latency, peak memory usage, and dynamic code volume across multiple cold-start iterations. Feedback on the implementation highlights several key improvements: adding a guard to prevent crashes during single-iteration runs, replacing absolute local file paths in the documentation with relative links, handling .pyc files and logging exceptions during line counting, and exposing the worker process's stderr to facilitate debugging when subprocesses fail.

Comment thread scripts/import_profiler/profiler.py Outdated
Comment thread scripts/import_profiler/documentation.md Outdated
Comment thread scripts/import_profiler/profiler.py Outdated
Comment thread scripts/import_profiler/profiler.py Outdated
- Add fallback to handle single-iteration quantile calculation crashes
- Normalize .pyc file paths during line counting and log exceptions
- Expose worker process stderr to facilitate debugging CalledProcessError
- Fix absolute paths in documentation.md to use relative directory paths

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a Python SDK import profiler tool along with its documentation. The profiler is designed to run process-isolated benchmarks to measure import latency, peak memory usage, and dynamic code volume. The feedback provided aims to improve the robustness of the tool. Key recommendations include using importlib.util.source_from_cache to correctly resolve source paths from compiled .pyc files, moving imports to the module level, parsing only the last line of subprocess stdout to handle noisy module initializations safely, and explicitly specifying encoding="utf-8" when writing files to ensure platform-independent behavior.

Comment thread scripts/import_profiler/profiler.py
Comment on lines +7 to +9
import importlib
import csv
import os

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Import importlib.util and logging at the module level to support robust .pyc path resolution and avoid importing logging inside a loop.

import importlib
import importlib.util
import csv
import os
import logging

Comment thread scripts/import_profiler/profiler.py Outdated
Comment thread scripts/import_profiler/profiler.py Outdated
Comment thread scripts/import_profiler/profiler.py Outdated
Comment thread scripts/import_profiler/profiler.py Outdated
Comment thread scripts/import_profiler/profiler.py Outdated
…g, and encodings

- Use importlib.util.source_from_cache for robust .pyc resolution
- Move importlib.util and logging imports to module level
- Refine json.loads to parse only the last line of stdout
- Specify UTF-8 encoding when opening files for writing
@hebaalazzeh

Copy link
Copy Markdown
Contributor Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a Python SDK import profiling tool (profiler.py) along with its comprehensive documentation (documentation.md). The profiler uses a process-isolated master/worker architecture to accurately measure cold-start import latency, peak memory usage, and dynamic code volume. The review feedback highlights two key areas for improvement: robustly parsing the worker's stdout in run_master to prevent crashes from unexpected output (such as warnings) and validating the iteration count, and checking the subprocess return code in run_trace to warn users if an import fails instead of silently writing the error log.

Comment thread scripts/import_profiler/profiler.py Outdated
Comment on lines +153 to +167
def run_trace(target_module):
"""Generates importtime trace log and writes it to a file."""
trace_file = f"import_trace_{target_module.replace('.', '_')}.log"
print(f"Generating importtime trace log for {target_module} -> {trace_file}...")

# We run: python -X importtime -c "import <module>"
result = subprocess.run(
[sys.executable, "-X", "importtime", "-c", f"import {target_module}"],
capture_output=True, text=True
)

with open(trace_file, "w", encoding="utf-8") as f:
f.write(result.stderr)

print(f"Trace log successfully written to {trace_file}")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

In run_trace, if the import fails, subprocess.run will complete with a non-zero exit code, but the script will silently write the stderr to the trace file and print a success message without warning the user that the import actually failed. Adding a check for result.returncode and printing a warning with the worker's stdout/stderr will significantly improve usability and prevent users from analyzing incomplete or failed traces.

def run_trace(target_module):
    """Generates importtime trace log and writes it to a file."""
    trace_file = f"import_trace_{target_module.replace('.', '_')}.log"
    print(f"Generating importtime trace log for {target_module} -> {trace_file}...")
    
    # We run: python -X importtime -c "import <module>"
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {target_module}"],
        capture_output=True, text=True
    )
    
    if result.returncode != 0:
        print(f"WARNING: Import failed with exit code {result.returncode}. The trace log may be incomplete or contain errors.", file=sys.stderr)
        if result.stdout:
            print(f"Worker stdout:\\n{result.stdout}", file=sys.stderr)
        if result.stderr:
            print(f"Worker stderr:\\n{result.stderr}", file=sys.stderr)
            
    with open(trace_file, "w", encoding="utf-8") as f:
        f.write(result.stderr)
        
    print(f"Trace log successfully written to {trace_file}")

- Extract worker logic to _run_worker_and_parse for robust JSON parsing
- Add early validation for iterations parameter in run_master
- Add non-zero exit code check and warning for run_trace failures
@hebaalazzeh

Copy link
Copy Markdown
Contributor Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a Python SDK Import Profiler, which includes a comprehensive documentation guide and a self-spawning profiling script (profiler.py) designed to benchmark import latency, peak memory, and dynamic code volume using a process-isolated master/worker architecture. The feedback recommends enhancing the robustness of the worker's stdout parsing by prefixing metrics with a unique identifier to prevent interference from module initialization logs, and replacing the manual CLI argument parsing with the standard argparse library for improved usability and maintainability.

Comment thread scripts/import_profiler/profiler.py Outdated
Comment thread scripts/import_profiler/profiler.py Outdated
Comment thread scripts/import_profiler/profiler.py Outdated
hebaalazzeh and others added 2 commits June 15, 2026 15:18
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@hebaalazzeh

Copy link
Copy Markdown
Contributor Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a Python SDK Import Profiler tool, consisting of a detailed documentation guide and a self-spawning, process-isolated benchmarking script (profiler.py) designed to measure cold-start import latency, peak memory usage, and dynamic code volume. The reviewer's feedback focuses on improving the robustness and usability of the script. Key recommendations include converting the cpu parameter to a string to prevent potential runtime errors, automatically creating parent directories for the CSV export path to avoid FileNotFoundError, and replacing manual CLI argument parsing with Python's standard argparse module for a more robust and user-friendly interface.

I am having trouble creating individual review comments. Click here to see my feedback.

scripts/import_profiler/profiler.py (78-81)

medium

If cpu is passed as an integer (e.g., cpu=0), calling cpu.lower() will raise an AttributeError. Additionally, passing an integer to the subprocess command list will cause a TypeError during execution. Converting cpu to a string at the beginning of run_master ensures robust handling of both integer and string inputs.

def run_master(iterations, target_module, cpu="0", csv_path=None):
    cpu = str(cpu) if cpu is not None else "none"
    if iterations < 1:
        raise ValueError("Number of iterations must be at least 1.")

scripts/import_profiler/profiler.py (127-128)

medium

If the provided csv_path contains nested directories that do not exist yet, open(csv_path, "w") will raise a FileNotFoundError. Automatically creating the parent directories using os.makedirs makes the tool more robust and utilizes the otherwise unused os import.

    if csv_path:
        os.makedirs(os.path.dirname(os.path.abspath(csv_path)), exist_ok=True)
        with open(csv_path, "w", newline="", encoding="utf-8") as f:

scripts/import_profiler/profiler.py (232-267)

medium

Using manual argument parsing is error-prone, doesn't support standard --help / -h flags (which currently falls back to running the full benchmark), and only supports specific formats like --module=value instead of --module value. Replacing this with Python's standard argparse module provides robust validation, automatic help documentation, and a much better user experience.

if __name__ == "__main__":
    import argparse
    
    parser = argparse.ArgumentParser(
        description="Python SDK Import Profiler: benchmark and profile module imports."
    )
    parser.add_argument(
        "--module",
        default="google.cloud.compute",
        help="Target module to profile (e.g., google.cloud.aiplatform)"
    )
    parser.add_argument(
        "--iterations",
        type=int,
        default=50,
        help="Number of cold-start iterations to run"
    )
    parser.add_argument(
        "--cpu",
        default="0",
        help="CPU core to pin processes to using taskset (set to 'none' to disable)"
    )
    parser.add_argument(
        "--csv",
        help="Path to export raw iteration metrics as a CSV file"
    )
    parser.add_argument(
        "--trace",
        action="store_true",
        help="Generate an importtime trace log file"
    )
    parser.add_argument(
        "--cprofile",
        action="store_true",
        help="Run cProfile to capture latency stack traces"
    )
    parser.add_argument(
        "--mprofile",
        action="store_true",
        help="Run tracemalloc memory snapshot"
    )
    parser.add_argument(
        "--worker",
        action="store_true",
        help="Run in worker mode (internal use)"
    )
    
    args = parser.parse_args()
            
    if args.worker:
        run_worker(args.module)
    elif args.trace:
        run_trace(args.module)
    elif args.cprofile:
        run_cprofile(args.module)
    elif args.mprofile:
        run_mprofile(args.module)
    else: 
        run_master(args.iterations, args.module, args.cpu, args.csv)

@hebaalazzeh

Copy link
Copy Markdown
Contributor Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a Python SDK Import Profiler tool, consisting of a master/worker profiling script (profiler.py) and its comprehensive documentation (documentation.md). The profiler measures import latency, peak memory, and dynamic code volume in a process-isolated manner to avoid caching issues. The feedback suggests a platform-aware improvement to default the CPU pinning option to 'none' on non-Linux systems, preventing unnecessary warnings and fallback attempts on macOS or Windows.

Comment thread scripts/import_profiler/profiler.py Outdated
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@hebaalazzeh

Copy link
Copy Markdown
Contributor Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a Python SDK Import Profiler tool, consisting of a comprehensive documentation guide and a master/worker execution script (profiler.py) designed to benchmark module import performance under cold-start conditions. Feedback on the implementation highlights opportunities to preserve the cold-start design by running both cProfile and tracemalloc profiling in isolated subprocesses rather than directly within the master process. Additionally, it is recommended to separate FileNotFoundError and subprocess.CalledProcessError handling in the master process to avoid misleading warnings about taskset availability when a worker process crashes.

Comment on lines +195 to +216
def run_cprofile(target_module):
"""Runs cProfile to capture stack traces for latency."""
import cProfile
import pstats

prof_file = f"cprofile_{target_module.replace('.', '_')}.prof"
print(f"Generating cProfile data for {target_module} -> {prof_file}...")

# Run profiling
pr = cProfile.Profile()
pr.enable()
importlib.import_module(target_module)
pr.disable()

# Save for flame charts (e.g. via snakeviz)
pr.dump_stats(prof_file)
print(f"cProfile stats successfully written to {prof_file}")

# Print top bottlenecks
print("\n--- Top 15 functions by cumulative time ---")
ps = pstats.Stats(pr).sort_stats(pstats.SortKey.CUMULATIVE)
ps.print_stats(15)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Running cProfile directly in the master process defeats the "cold start" design of the profiler. Since profiler.py already imports heavy standard library modules (like subprocess, json, csv, logging, tracemalloc, statistics), any imports of these modules by the target module will be resolved instantly from Python's sys.modules cache and won't be captured in the profile.

To ensure a true cold-start profile, cProfile should be executed in an isolated subprocess.

def run_cprofile(target_module):
    """Runs cProfile in a clean subprocess to capture stack traces for latency."""
    import pstats
    
    prof_file = f"cprofile_{target_module.replace('.', '_')}.prof"
    print(f"Generating cProfile data for {target_module} -> {prof_file}...")
    
    # Run profiling in a clean subprocess to ensure cold-start
    result = subprocess.run(
        [sys.executable, "-m", "cProfile", "-o", prof_file, "-c", f"import importlib; importlib.import_module('{target_module}')"],
        capture_output=True, text=True
    )
    if result.returncode != 0:
        print(f"Error generating cProfile data:\n{result.stderr}", file=sys.stderr)
        return

    print(f"cProfile stats successfully written to {prof_file}")
    
    # Print top bottlenecks
    print("\n--- Top 15 functions by cumulative time ---")
    ps = pstats.Stats(prof_file).sort_stats(pstats.SortKey.CUMULATIVE)
    ps.print_stats(15)

Comment thread scripts/import_profiler/profiler.py Outdated
Comment thread scripts/import_profiler/profiler.py Outdated
- Execute cProfile in a fresh subprocess to ensure cold-start accuracy
- Execute tracemalloc in a fresh subprocess to capture all memory allocations
- Clean up master process exception handling for missing taskset vs crashed workers
@hebaalazzeh

Copy link
Copy Markdown
Contributor Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a Python SDK Import Profiler tool, which includes a comprehensive documentation file (documentation.md) and a core profiling script (profiler.py) designed to measure import latency, peak memory, and dynamic code volume using process-isolated worker runs. The review feedback highlights several critical security and robustness improvements: addressing potential code injection vulnerabilities in run_trace, run_cprofile, and run_mprofile by safely escaping the target_module parameter using json.dumps(), and preventing potential UnicodeDecodeError failures in run_worker when reading source files by adding errors='ignore' to the file-opening logic.

Comment thread scripts/import_profiler/profiler.py
Comment thread scripts/import_profiler/profiler.py
Comment thread scripts/import_profiler/profiler.py Outdated
Comment on lines +222 to +232
"import tracemalloc\n"
"import importlib\n"
"tracemalloc.start()\n"
f"importlib.import_module('{target_module}')\n"
"snapshot = tracemalloc.take_snapshot()\n"
"tracemalloc.stop()\n"
"top_stats = snapshot.statistics('lineno')\n"
"for stat in top_stats[:15]:\n"
" print(stat)\n"
)
result = subprocess.run([sys.executable, "-c", code], capture_output=True, text=True)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

Directly interpolating target_module into the code string is vulnerable to code injection. Use json.dumps() to safely escape the module name.

    code = (
        "import tracemalloc\n"
        "import importlib\n"
        "tracemalloc.start()\n"
        f"importlib.import_module({json.dumps(target_module)})\n"
        "snapshot = tracemalloc.take_snapshot()\n"
        "tracemalloc.stop()\n"
        "top_stats = snapshot.statistics('lineno')\n"
        "for stat in top_stats[:15]:\n"
        "    print(stat)\n"
    )

Comment thread scripts/import_profiler/profiler.py Outdated
hebaalazzeh and others added 4 commits June 15, 2026 16:00
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
- Use json.dumps to safely escape target_module string in all subprocess commands (mprofile, cprofile, trace)
@hebaalazzeh

Copy link
Copy Markdown
Contributor Author

/gemini review

Comment on lines +11 to +12
* **[plan.md](./plan.md)**: The current project phases and roadmap checklist.
* **[status.md](./status.md)**: Tracks the active task state and hosts recorded baseline performance metrics.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where are these files?

- Recursively wipes __pycache__ and .pyc files via pathlib prior to iteration loops.
- Employs Python's -B flag on worker subprocesses to prevent re-caching during the benchmark, enforcing raw .py parsing for every execution.
- Replaces --clear-pycache with an opt-out --keep-pycache flag
- The profiler now automatically sweeps and enforces disk-level cold starts without requiring explicit flags from the user.
Address PR review comment by moving setup tracking (rss_before, modules_before) outside of the time.perf_counter() window to prevent artificial latency inflation.
Addresses PR feedback to avoid broad exception catching when parsing physical python files for line counts.
Addresses PR review by explicitly documenting the expected PEP 3147/488 error path for irregular .pyc files.
…ssing modules

Addresses PR review by replacing null-checks with a cleaner try/except block that assumes normal module behavior and explicitly logs when a C-extension or built-in is skipped due to a missing __file__ attribute.
Addresses PR review by replacing the 'none' string check with a typed NO_CPU_PINNING = -1 constant for cleaner argument parsing and state tracking.
…askset

Addresses PR review by removing the fallback unpinned execution logic when taskset is not installed. The script now fails immediately with a clear error message, preventing code duplication.
Addresses PR review by collapsing the repeated statistics.quantiles logic into a reusable _calculate_percentiles helper method.
…terations

Addresses PR review by validating that 'loaded_modules' and 'loaded_lines' remain deterministic across cold-start iterations, warning the user if non-deterministic import paths are triggered.
… tracemalloc profiling

Addresses PR review by replacing the hard-to-maintain dynamic code string inside run_mprofile with a properly structured multiprocessing 'spawn' context.
@hebaalazzeh hebaalazzeh assigned parthea and unassigned daniel-sanche Jun 22, 2026
@hebaalazzeh hebaalazzeh marked this pull request as ready for review June 22, 2026 20:15
@hebaalazzeh hebaalazzeh requested a review from a team as a code owner June 22, 2026 20:15
Comment thread scripts/import_profiler/profiler.py Outdated
"import tracemalloc\n"
"import importlib\n"
"tracemalloc.start()\n"
f"importlib.import_module({json.dumps(target_module)})\n"

@daniel-sanche daniel-sanche Jun 30, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still feels a little risky, since it could result in arbitrary code execution, depending on how the environment is set up

How many packages do you plan to test against? If it's a small number, can we hard-code a whitelist? If you wanted to run against all our libraries, Gemini suggested this code to dynamically build one from all installed google.cloud packages (we'd probably have to modify it a bit for non-cloud libraries too though)

import pkgutil
import google.cloud

# Dynamically find all installed legitimate google.cloud submodules
VALID_GCP_MODULES = {
    f"google.cloud.{name}" 
    for _, name, _ in pkgutil.iter_modules(google.cloud.__path__)
}

def generate_benchmark_code(target_module):
    if target_module not in VALID_GCP_MODULES:
        raise ValueError("Invalid or uninstalled module requested.")

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed

hebaalazzeh added a commit that referenced this pull request Jun 30, 2026
## Overview
This is an initiative to resolve severe initialization bottlenecks
(~10s-13s) in generated client libraries by adopting **Native Python
3.15 Explicit Lazy Imports (PEP 0810)**.
To avoid introducing maintenance burdens or subtle breaking changes via
custom import hooks, this PR focuses *exclusively* on utilizing native
interpreter standards.

### Implementation Architecture
We modify the upstream GAPIC Generator to emit a native
`__lazy_modules__` set at the top of the `__init__.py` files,
immediately followed by the standard eager imports.
* **Python 3.15+ (Native Acceleration):** The Python 3.15 runtime
natively intercepts the standard imports referenced in the
`__lazy_modules__` set and treats them as fast C-level lazy proxies.
Startup times drop from ~13s to ~200ms, and peak RAM consumption drops
by up to 85%.
* **Python 3.14 and Older (Zero Blast Radius):** Older Python
environments safely ignore the `__lazy_modules__` variable and process
the standard eager imports exactly as they do today. This guarantees
100% backwards compatibility with zero risk.
* **Perfect Static Typing:** Because standard imports are still present
in the file, IDEs (IntelliSense) and static analyzers (MyPy) maintain
zero-friction support.

**Example Structure:**
```python
# 1. Native C-Level Lazy Proxying for Python 3.15+ (PEP 0810)
__lazy_modules__ = {
   f"{__name__}.services.accelerator_types",
   f"{__name__}.services.addresses",
   f"{__name__}.types.compute",
}
# 2. Standard eager imports
# Evaluated instantly by older Python versions. Intercepted natively by Python 3.15.
from .services.accelerator_types import AcceleratorTypesClient
from .services.addresses import AddressesClient
from .types.compute import Address
__all__ = (
    "AcceleratorTypesClient",
    "AddressesClient",
    "Address",
)
```

## 🚀 Performance Benchmarking Results (Disk-Level Cold Start)

We utilized our custom `profiler.py` script #17467 to run 5 iterations
measuring true disk-level cold starts (with `__pycache__` sweeps) on the
`google-cloud-compute` library. The results validate this Phase 1 PEP
810 Native Lazy Loading implementation:

### Python 3.14 (Current Eager Import)
*   **Time (Median):** `23,246.97 ms` (~23.2 seconds)
*   **Memory (Physical RSS):** `224.87 MB`
*   **Code Volume:** Loaded `1,407` modules (`1,040,992` lines of code)

### Python 3.15.0b2 (Phase 1 PEP 810 Lazy Proxy)
*   **Time (Median):** `1,062.87 ms` (~1.06 seconds)
*   **Memory (Physical RSS):** `7.21 MB`
*   **Code Volume:** Loaded `15` modules (`8,064` lines of code)

**Impact:** By natively deferring the 120+ submodules at the C-level, we
bypassed parsing over **1 million lines of code**. This resulted in an
**approximately 22x reduction in startup latency (~95% speedup)**, and a
staggering **96% reduction in peak RAM consumption** (dropping from
~225MB down to just ~7MB).


Related Links:
see #17594 for a demonstration of this generator change applied to
google-cloud-compute package.
Design Doc: go/sdk-performance-design
Towards googleapis/python-aiplatform#4749
…ecution

Addresses PR review feedback regarding the string-based dynamic code execution in run_trace and run_cprofile. By ensuring the --module argument strictly consists of valid Python identifiers via argparse type validation, we prevent any possibility of arbitrary code injection without requiring a hardcoded whitelist or polluting the cold-start cache.
Comment thread scripts/import_profiler/profiler.py
Comment on lines +168 to +171
if cpu != NO_CPU_PINNING:
print(f"CPU Pinning enabled: Pinning processes to core {cpu} using taskset.")
else:
print("CPU Pinning disabled.")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running taskset is Linux-specific. If CPU pinning is explicitly requested on macOS or Windows, the script may throw a FileNotFoundError and crashes.

Fallback gracefully to NO_CPU_PINNING on non-Linux platforms.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed

Comment thread scripts/import_profiler/profiler.py
Comment thread scripts/import_profiler/profiler.py Outdated
rss_memories, p50_rss, p90_rss, p99_rss):
"""Helper method to format and print the final benchmark results."""
def _format_stats(title, data, p50, p90, p99, fmt):
if not data: return ""

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python's style guide discourages placing the body of a compound statement on the same line.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed

Comment thread scripts/import_profiler/profiler.py Outdated
Comment on lines +89 to +92
except KeyError:
logging.debug(f"Module {m} disappeared from sys.modules during execution.")
except AttributeError:
logging.debug(f"Module {m} has no __file__ attribute (likely a C-extension or built-in). Skipping.")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The script uses logging.debug() but logging is never configured (via logging.basicConfig()).

Depending on how simple you want to keep the codebase, here are two options:

  • If these messages are only meant for developers reading the code to understand why a module is skipped, replace the dead logging.debug calls with regular inline comments.
  • If you want developers to be able to toggle these logs on from the command line:
    • Add the --verbose flag to argparse.
    • Configure logging at startup and pass it to the worker command construction.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed

if not args.keep_pycache: clean_bytecode()
run_mprofile(args.module)
else:
run_master(args.iterations, args.module, args.cpu, args.csv, not args.keep_pycache) No newline at end of file

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add newline at the end.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed

- Optimized `clean_bytecode` to use `os.walk` and skip hidden directories (e.g. .git, .venv), preventing slow scans and unwanted cache eviction in developer venvs.
- Added graceful fallback to disable CPU pinning on non-Linux platforms instead of crashing.
- Ensured `clean_bytecode()` runs consistently for `--trace` and `--cprofile` modes for true cold-starts.
- Replaced dead `logging.debug` calls with inline comments to keep the script dependency-free.
- Fixed Python style guideline infractions and added a trailing newline.
Comment thread scripts/import_profiler/profiler.py Outdated
with open(file_path, 'r', encoding='utf-8', errors='ignore') as f:
loaded_lines += sum(1 for _ in f)
except OSError as e:
logging.warning(f"Failed to read lines from {file_path}: {e}")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this reference but it seems like you've removed the logging module.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed

Resolves a potential NameError crash caused by a leftover `logging` reference after the module was removed in the previous commit.
Removes the pathlib import which became obsolete after the clean_bytecode logic was refactored to use os.walk.
@hebaalazzeh hebaalazzeh merged commit 1c24692 into main Jun 30, 2026
30 checks passed
@hebaalazzeh hebaalazzeh deleted the feat/import-profiler-tool branch June 30, 2026 22:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants