Merged
32 commits
7e325db
feat: add import profiler tool with dynamic loaded lines code volume …
hebaalazzeh Jun 15, 2026
5c0dfcb
fix: address PR review feedback on profiler tool
hebaalazzeh Jun 15, 2026
0cb3d1c
fix: address further PR feedback on imports, pyc caching, json loadin…
hebaalazzeh Jun 15, 2026
9340dc9
fix: refactor profiler worker execution and handling
hebaalazzeh Jun 15, 2026
d8c26a2
Update scripts/import_profiler/profiler.py
hebaalazzeh Jun 15, 2026
f2c998c
Update scripts/import_profiler/profiler.py
hebaalazzeh Jun 15, 2026
d7ae73f
refactor: replace manual sys.argv parsing with argparse
hebaalazzeh Jun 15, 2026
6a94207
Update scripts/import_profiler/profiler.py
hebaalazzeh Jun 15, 2026
1015b2c
fix: refactor cprofile and mprofile to use isolated subprocesses
hebaalazzeh Jun 15, 2026
e15d972
Update scripts/import_profiler/profiler.py
hebaalazzeh Jun 15, 2026
051b455
Update scripts/import_profiler/profiler.py
hebaalazzeh Jun 15, 2026
9f55f1c
Update scripts/import_profiler/profiler.py
hebaalazzeh Jun 15, 2026
fb35fd3
fix: resolve code injection vulnerability when importing target_module
hebaalazzeh Jun 15, 2026
727d44e
docs: simplify profiler mechanism section
hebaalazzeh Jun 15, 2026
70c2a24
update documenation md
hebaalazzeh Jun 15, 2026
40ac359
docs: add inline comments explaining isolation of interpreter boot-ti…
hebaalazzeh Jun 18, 2026
ae00eda
feat: add physical RSS memory tracking alongside tracemalloc
hebaalazzeh Jun 18, 2026
8488200
feat: add --clear-pycache flag for total disk-level cold starts
hebaalazzeh Jun 18, 2026
88c2e7f
feat: make pycache wiping automatic by default
hebaalazzeh Jun 18, 2026
fa00b05
fix: tighten performance timer to strictly wrap import_module
hebaalazzeh Jun 22, 2026
137085a
fix: replace bare Exception with OSError during LOC parsing
hebaalazzeh Jun 22, 2026
68465e5
docs: explain why ValueError is silently passed during source_from_cache
hebaalazzeh Jun 22, 2026
479b39d
refactor: simplify sys.modules traversal and add debug logging for mi…
hebaalazzeh Jun 22, 2026
9a3e981
refactor: replace magic string for CPU pinning with integer constant
hebaalazzeh Jun 22, 2026
aaaa1a4
refactor: remove code duplication and enforce fail-fast for missing t…
hebaalazzeh Jun 22, 2026
352188f
refactor: extract percentile calculation into a helper method
hebaalazzeh Jun 22, 2026
c082ba7
refactor: emit warnings if dynamic import volume fluctuates between i…
hebaalazzeh Jun 22, 2026
3cc3c9d
refactor: use multiprocessing instead of dynamic string execution for…
hebaalazzeh Jun 22, 2026
ccb8a3a
security: strictly validate module names to prevent arbitrary code ex…
hebaalazzeh Jun 30, 2026
9b6399f
fix: implement PR feedback from ohmayr
hebaalazzeh Jun 30, 2026
2c983fb
fix: replace leftover logging.warning with print
hebaalazzeh Jun 30, 2026
4b7bc78
chore: remove unused pathlib import
hebaalazzeh Jun 30, 2026
72 changes: 72 additions & 0 deletions scripts/import_profiler/documentation.md
@@ -0,0 +1,72 @@
# Python SDK Import Profiler: Documentation & Breakdown

This document describes the files in the `import_profiler` directory, explains how to run the profiler, and shows how to read the generated import trace logs to find optimization targets.

---

## 1. File Guide & Directory Structure
The profiling tool is located in the [scripts/import_profiler/](./) directory:

* **[profiler.py](./profiler.py)**: The core executable script. It is a single-file, self-spawning harness that runs process-isolated import benchmarks and generates trace logs.

---

## 2. Profiler Mechanism (`profiler.py`)

**Objective**
The profiler is a process-isolated harness that captures before-and-after metrics along three dimensions: initialization latency (ms), peak memory usage (MB), and dynamic code volume (loaded modules and lines of code).
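
To illustrate what "process-isolated" means for the latency metric, here is a minimal sketch that times a single import in a fresh interpreter. It is not the code in `profiler.py`; the names `time_import_ms` and `_CHILD_SNIPPET` are hypothetical, and the sketch assumes that wrapping only `importlib.import_module` with a timer is an acceptable definition of import latency.

```python
import subprocess
import sys

# Hypothetical child program: it runs in a fresh interpreter so that nothing
# is cached in sys.modules from an earlier measurement. The module name is
# passed as an argument rather than interpolated into the source string.
_CHILD_SNIPPET = (
    "import importlib, sys, time\n"
    "start = time.perf_counter()\n"
    "importlib.import_module(sys.argv[1])\n"
    "print((time.perf_counter() - start) * 1000.0)\n"
)


def time_import_ms(module_name: str) -> float:
    """Return the wall-clock time, in milliseconds, of one cold import."""
    result = subprocess.run(
        [sys.executable, "-c", _CHILD_SNIPPET, module_name],
        capture_output=True,
        text=True,
        check=True,  # raise if the child fails, e.g. the module is missing
    )
    # The timing is the last line printed, even if the import itself printed.
    return float(result.stdout.strip().splitlines()[-1])


if __name__ == "__main__":
    print(f"json imported in {time_import_ms('json'):.2f} ms")
```

Because each measurement starts a new process, interpreter boot time is excluded from the timed region and one iteration cannot warm up the next.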

**Usage**

Run this command to collect the metrics:
```bash
python profiler.py --module <target_module> --iterations <N>
```

**Expected Output**
```text
--- Results for <target_module> (<N> iterations) ---
Code Volume (Deterministic):
Loaded Modules: <count>
Loaded Lines: <count>
Time (ms):
P50 (Median): <time>
P90: <time>
P99: <time>
RAM (MB):
P50 (Median): <memory>
P90: <memory>
P99: <memory>
```
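
The P50, P90, and P99 rows summarize the per-iteration samples. The sketch below shows one common way to compute such values, the nearest-rank method. Whether `profiler.py` uses this method or an interpolating one is an assumption, and the function name `percentile` is hypothetical.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Return the nearest-rank percentile of samples (pct in 0..100)."""
    if not samples:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples)
    # Nearest rank: the smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


if __name__ == "__main__":
    timings_ms = [10.0, 20.0, 30.0, 40.0, 50.0]  # illustrative values only
    for pct in (50, 90, 99):
        print(f"P{pct}: {percentile(timings_ms, pct)}")
```

With only a handful of iterations, P90 and P99 both collapse to the slowest sample, so the tail rows become meaningful only at higher `--iterations` values.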

---

## 3. How to Interpret Python `-X importtime` Trace Logs
When running with the `--trace` flag, the script captures the raw stderr trace produced by Python's `-X importtime` option. The trace looks like this:

```text
import time: self [us] | cumulative | imported package
import time: 536 | 536 | _io
import time: 1077 | 2385 | _frozen_importlib_external
import time: 773659 | 793010 | google.cloud.compute_v1.types.compute
```

### Explaining the Fields:
1. **`self [us]` (Microseconds):** The time spent importing the module itself, *excluding* any time spent importing its child dependencies.
2. **`cumulative` (Microseconds):** The total time spent loading the module *including* all nested imports. This is the total delay that importing this module adds.
3. **Hierarchy Indentation:** Indented packages are sub-imports triggered by a parent module. The deeper the indentation, the deeper the package sits in the import chain. Each line is printed when its import finishes, so sub-imports appear *above* the parent that triggered them.
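
The fields described above can be extracted mechanically. The following sketch parses a raw trace and ranks modules by `self` time. It is not part of `profiler.py`; the names `ImportRecord`, `parse_importtime`, and `slowest_by_self` are hypothetical, and the depth calculation assumes CPython's two-space indentation per nesting level.

```python
import re
from typing import NamedTuple


class ImportRecord(NamedTuple):
    self_us: int        # time in the module itself, microseconds
    cumulative_us: int  # time including nested imports, microseconds
    depth: int          # nesting level derived from indentation
    name: str           # imported module name


# Matches data lines only; the header line has no digits in the first column.
_LINE = re.compile(r"^import time:\s+(\d+)\s+\|\s+(\d+)\s+\| ( *)(\S+)\s*$")


def parse_importtime(trace: str) -> list[ImportRecord]:
    """Parse the stderr text produced by `python -X importtime`."""
    records = []
    for line in trace.splitlines():
        match = _LINE.match(line)
        if match is None:
            continue  # header line or unrelated stderr output
        self_us, cumulative_us, indent, name = match.groups()
        # Assumption: two spaces of indentation per nesting level.
        records.append(
            ImportRecord(int(self_us), int(cumulative_us), len(indent) // 2, name)
        )
    return records


def slowest_by_self(records: list[ImportRecord], limit: int = 10) -> list[ImportRecord]:
    """Return the modules with the largest self time, slowest first."""
    return sorted(records, key=lambda r: r.self_us, reverse=True)[:limit]
```

Sorting by `self_us` rather than `cumulative_us` points at the modules whose own top-level code is expensive, which is usually where lazy loading pays off.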

---

## 4. Execution Reference
Run these commands from the pyenv virtual environment in which the target packages are installed in editable mode:

```bash
# 1. Run the profiler to get baseline/optimized outcomes (e.g. 5 iterations)
PYENV_VERSION=py312 python profiler.py --module=google.cloud.compute --iterations=5
PYENV_VERSION=py312 python profiler.py --module=google.cloud.aiplatform --iterations=5

# 2. Run the profiler to generate trace logs
PYENV_VERSION=py312 python profiler.py --module=google.cloud.compute --trace
PYENV_VERSION=py312 python profiler.py --module=google.cloud.aiplatform --trace
```