⚡️ Speed up method ReferenceFinder._find_references_in_file by 313% in PR #1335 (gpu-flag)
#1356
⚡️ This pull request contains optimizations for PR #1335
If you approve this dependent PR, these changes will be merged into the original PR branch `gpu-flag`.
📄 313% (3.13x) speedup for `ReferenceFinder._find_references_in_file` in `codeflash/languages/javascript/find_references.py`
⏱️ Runtime: 5.05 milliseconds → 1.22 milliseconds (best of 8 runs)
📝 Explanation and details
This optimization achieves a 313% speedup (from 5.05ms to 1.22ms) by eliminating redundant string decoding operations during AST traversal. The key improvements are:
What was optimized:
- Added `_node_text_cache` and `_node_bytes_cache` dictionaries to store decoded text and byte slices for each tree-sitter node, keyed by node ID
- Added `_get_node_text()` and `_get_node_bytes()` helper methods that cache results on first access
- Switched name checks from string equality (`name == search_name`) to byte equality (`node_bytes == search_bytes`), avoiding UTF-8 decoding unless necessary
- `search_name` is encoded once per file as `search_bytes` rather than repeatedly during comparisons
Why this is faster:
The original code repeatedly sliced and decoded the same AST node text during recursive traversal. Line profiler shows `_find_identifier_references` spent 52.1% of its time in `child_by_field_name("function")` and 13.9% checking node types, with additional time spent decoding node text multiple times. The optimization eliminates this redundancy: each node's text is decoded at most once and cached. Byte comparisons are faster than string comparisons in Python and skip decoding entirely when names don't match.
Impact:
- `_find_references_in_file` total time dropped from 21.5ms to 6.6ms (69% reduction)
- `_find_identifier_references` becomes dramatically faster by avoiding repeated decode operations on the same nodes
The caching strategy is safe because tree-sitter nodes are immutable within a parse tree, and the caches are explicitly cleared between files to prevent memory leaks or cross-file contamination.
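For reviewers who want a feel for the technique without opening the diff, here is a minimal sketch of per-node caching combined with byte-level comparison. It is not the code from this PR: `StubNode`, `NodeTextCache`, `find_matches`, and the `decode_calls` counter are hypothetical names invented for illustration, and `StubNode` only imitates the `id`, `start_byte`, and `end_byte` attributes of a tree-sitter node so the example runs without tree-sitter installed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StubNode:
    """Hypothetical stand-in for a tree-sitter node: an id plus byte offsets."""
    id: int
    start_byte: int
    end_byte: int


class NodeTextCache:
    """Caches byte slices and decoded text per node for a single source file."""

    def __init__(self, source: bytes) -> None:
        self.source = source
        self._bytes_cache = {}  # node id -> bytes slice
        self._text_cache = {}   # node id -> decoded str
        self.decode_calls = 0   # demo-only counter, not part of the technique

    def get_bytes(self, node: StubNode) -> bytes:
        cached = self._bytes_cache.get(node.id)
        if cached is None:  # `is None` so an empty slice still counts as cached
            cached = self.source[node.start_byte:node.end_byte]
            self._bytes_cache[node.id] = cached
        return cached

    def get_text(self, node: StubNode) -> str:
        cached = self._text_cache.get(node.id)
        if cached is None:
            self.decode_calls += 1
            cached = self.get_bytes(node).decode("utf-8")
            self._text_cache[node.id] = cached
        return cached

    def clear(self) -> None:
        """Drop all entries; call between files so node ids cannot collide."""
        self._bytes_cache.clear()
        self._text_cache.clear()


def find_matches(cache: NodeTextCache, nodes, search_name: str):
    """Return start offsets of nodes whose bytes equal the encoded search name."""
    search_bytes = search_name.encode("utf-8")  # encoded once, not per node
    return [n.start_byte for n in nodes if cache.get_bytes(n) == search_bytes]
```

With `source = b"foo(bar); foo(baz);"` and stub nodes covering the four identifiers, `find_matches(cache, nodes, "foo")` returns `[0, 10]` without decoding any node; text is decoded only when `get_text` is called, and at most once per node.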
✅ Correctness verification report:
To edit these changes, `git checkout codeflash/optimize-pr1335-2026-02-04T01.22.32` and push.