Commit 3a81f94
Optimize unpack, str.__add__ and fastlocals (RustPython#7293)
* Remove intermediate Vec allocation in unpack_sequence fast path
Push elements directly from tuple/list slice in reverse order
instead of cloning into a temporary Vec first.
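
A minimal sketch of the idea above, not the actual RustPython code: the value stack is modeled as a plain `Vec<i64>`, and both function names are hypothetical. It contrasts cloning into a temporary `Vec` with pushing straight from the borrowed slice in reverse order.

```rust
/// Old shape (illustrative): clone the elements into a temporary Vec,
/// then push them onto the stack in reverse order.
fn unpack_via_temp(stack: &mut Vec<i64>, items: &[i64]) {
    let tmp: Vec<i64> = items.to_vec(); // extra heap allocation
    stack.extend(tmp.into_iter().rev());
}

/// New shape (illustrative): walk the borrowed slice in reverse and push
/// each element directly, with no intermediate allocation.
fn unpack_direct(stack: &mut Vec<i64>, items: &[i64]) {
    stack.extend(items.iter().rev().copied());
}
```

Both variants leave the stack in the same state; pushing in reverse means the first element of the sequence ends up on top and is popped first.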
* Use read-only atomic load before swap in check_signals
Add Relaxed load guard before the Acquire swap to avoid cache-line
invalidation on every instruction dispatch when no signal is pending.
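
The load-before-swap pattern can be sketched as follows. This is an illustration under assumptions: `take_pending_signal` and the bare `AtomicBool` flag are hypothetical stand-ins, not the real `check_signals` implementation.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Returns true if a signal was pending, clearing the flag in that case.
/// `flag` is a hypothetical stand-in for the VM's pending-signal flag.
fn take_pending_signal(flag: &AtomicBool) -> bool {
    // Read-only guard: a plain load lets the cache line stay shared,
    // so the common "nothing pending" case performs no write.
    if !flag.load(Ordering::Relaxed) {
        return false;
    }
    // Slow path: only now do the read-modify-write that takes the
    // cache line exclusively and clears the flag.
    flag.swap(false, Ordering::Acquire)
}
```

A swap is a write even when it stores the value already present, so calling it on every instruction dispatch would keep invalidating the line on other cores. The guard trades one extra load on the rare slow path for a write-free fast path.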
* Cache builtins downcast in ExecutingFrame for LOAD_GLOBAL
Pre-compute builtins.downcast_ref::<PyDict>() at frame entry and reuse
the cached reference in load_global_or_builtin and LoadBuildClass.
Also add get_chain_exact to skip redundant exact_dict type checks.
* Add number Add slot to PyStr for direct str+str dispatch
binary_op1 can now resolve str+str addition directly via the number
slot instead of falling through to the sequence concat path.
* Guard FastLocals access in locals() with try_lock on state mutex
Address CodeRabbit review: f_locals() could access fastlocals without
synchronization when called from another thread. Use try_lock on the
state mutex so concurrent access is properly serialized.
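
A rough sketch of guarding access with `try_lock`, using `std::sync::Mutex` as a stand-in for the frame's state mutex. `FrameState`, `snapshot_locals`, and the choice to return `None` when the lock is unavailable are assumptions for illustration; the commit message does not say what the real code does on contention.

```rust
use std::sync::Mutex;

/// Hypothetical frame state; the real fastlocals hold object references.
struct FrameState {
    fastlocals: Vec<Option<i64>>,
}

/// Copies the fast locals only if the state lock can be taken without
/// blocking. Returning `None` on contention is an assumption of this sketch.
fn snapshot_locals(state: &Mutex<FrameState>) -> Option<Vec<Option<i64>>> {
    match state.try_lock() {
        Ok(guard) => Some(guard.fastlocals.clone()),
        // Lock is held elsewhere (or poisoned): do not touch fastlocals.
        Err(_) => None,
    }
}
```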
* Use exact type check for builtins_dict cache
downcast_ref::<PyDict>() matches dict subclasses, causing
get_chain_exact to bypass custom __getitem__ overrides.
Use downcast_ref_if_exact to only fast-path exact dict types.
* Consolidate with_recursion in _cmp to single guard
Move the recursion depth check to wrap the entire _cmp body
instead of each individual call_cmp direction, reducing Cell
read/write pairs and scopeguard overhead per comparison.
* Add opcode-level fast paths for FOR_ITER, COMPARE_OP, BINARY_OP
- FOR_ITER: detect PyRangeIterator and bypass generic iterator
protocol (atomic slot load + indirect call)
- COMPARE_OP: inline int/float comparison for exact types,
skip rich_compare dispatch and with_recursion overhead
- BINARY_OP: inline int add/sub with i64 checked arithmetic
to avoid BigInt heap allocation and binary_op1 dispatch
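
The BINARY_OP fast path can be illustrated with `i64` checked arithmetic. This is a sketch under assumptions: `AddResult`, `add_small`, and `sub_small` are hypothetical names, and the fallback to the generic BigInt path is represented only by an enum variant.

```rust
/// Outcome of the inline fast path (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum AddResult {
    /// The result fits in an i64, so no BigInt allocation is needed.
    Fast(i64),
    /// Overflow: the caller should fall back to the generic BigInt path.
    NeedsBigInt,
}

fn add_small(a: i64, b: i64) -> AddResult {
    match a.checked_add(b) {
        Some(v) => AddResult::Fast(v),
        None => AddResult::NeedsBigInt,
    }
}

fn sub_small(a: i64, b: i64) -> AddResult {
    match a.checked_sub(b) {
        Some(v) => AddResult::Fast(v),
        None => AddResult::NeedsBigInt,
    }
}
```

`checked_add` and `checked_sub` return `None` on overflow instead of wrapping, so the fast path never produces a wrong result; it simply declines and lets the slower path handle large values.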
* Also check globals is exact dict for LOAD_GLOBAL fast path
get_chain_exact bypasses __missing__ on dict subclasses.
Move get_chain_exact to PyExact<PyDict> impl with debug_assert,
and have get_chain delegate to it. Store builtins_dict as
Option<&PyExact<PyDict>> to enforce exact type at compile time.
Use PyRangeIterator::next_fast() instead of pub(crate) fields.
Fix comment style issues.

1 parent 7b89d82 commit 3a81f94
6 files changed
Lines changed: 271 additions & 102 deletions