@ChALkeR commented Dec 18, 2025

Tracking: #61041

This builds on top of #61093 and gives an additional ~2x improvement by moving the same logic to native code.

Warning

Very crude, just a concept demonstration at this point

The current native fast path, removed in #61093, has a bunch of nested ifs and converts data from UTF-16 to UTF-8 and then back to UTF-16.
This PR instead constructs strings using direct maps, as in #61093, and returns them as raw buffers.
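
For readers unfamiliar with the direct-map approach, here is a minimal JavaScript sketch of the idea for windows-1252: a 256-entry table maps each byte straight to a UTF-16 code unit, with no UTF-8 round trip. The names `WINDOWS_1252_MAP`, `decodeToUtf16Units`, and `unitsToString` are hypothetical and not taken from this PR or #61093; the actual native code is C++ and may be structured differently.

```javascript
// Hypothetical sketch, not the code from this PR.
// WHATWG windows-1252 code points for bytes 0x80..0x9F.
// Every other byte maps to the code point with the same value.
const HIGH = [
  0x20ac, 0x0081, 0x201a, 0x0192, 0x201e, 0x2026, 0x2020, 0x2021,
  0x02c6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008d, 0x017d, 0x008f,
  0x0090, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
  0x02dc, 0x2122, 0x0161, 0x203a, 0x0153, 0x009d, 0x017e, 0x0178,
];

// Direct map: byte value -> UTF-16 code unit.
const WINDOWS_1252_MAP = new Uint16Array(256);
for (let i = 0; i < 256; i++) WINDOWS_1252_MAP[i] = i;
for (let i = 0; i < HIGH.length; i++) WINDOWS_1252_MAP[0x80 + i] = HIGH[i];

// One table lookup per input byte; the result is a raw UTF-16 buffer.
function decodeToUtf16Units(bytes, map) {
  const out = new Uint16Array(bytes.length);
  for (let i = 0; i < bytes.length; i++) out[i] = map[bytes[i]];
  return out;
}

// Turn the raw buffer into a JS string, in chunks to keep the
// argument count passed to String.fromCharCode bounded.
function unitsToString(units) {
  const CHUNK = 0x2000;
  let result = '';
  for (let i = 0; i < units.length; i += CHUNK) {
    result += String.fromCharCode.apply(null, units.subarray(i, i + CHUNK));
  }
  return result;
}
```

For example, `unitsToString(decodeToUtf16Units(Uint8Array.of(0x80, 0x41, 0xe9), WINDOWS_1252_MAP))` yields `'€Aé'`. The table works because every single-byte encoding maps each byte to exactly one code unit in the Basic Multilingual Plane, so the output length always equals the input length.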

windows-1252, main:

| Test | Size | Throughput | Mean Time |
| --- | --- | --- | --- |
| Latin lipsum (ASCII) | 84.902 KiB | 0.31 GiB/s | 0.272 ms |
| Complex 1 | 79.771 KiB | 0.06 GiB/s | 1.292 ms |

windows-1252, #61093:

| Test | Size | Throughput | Mean Time |
| --- | --- | --- | --- |
| Latin lipsum (ASCII) | 84.902 KiB | 33.41 GiB/s | 0.003 ms |
| Complex 1 | 79.771 KiB | 1.48 GiB/s | 0.056 ms |

windows-1252, this PR:

| Test | Size | Throughput | Mean Time |
| --- | --- | --- | --- |
| Latin lipsum (ASCII) | 84.902 KiB | 36.83 GiB/s | 0.002 ms |
| Complex 1 | 79.771 KiB | 3.11 GiB/s | 0.027 ms |

This similarly improves all other 1-byte encodings compared to #61093.

Only the second commit belongs to this PR; the first is #61093.

Warning

Has a lot of cleanup to do, do not merge. Reviewing anything other than the benchmarks / concept is pointless at this point.


For comparison, Bun:

| Test | Size | Throughput | Mean Time |
| --- | --- | --- | --- |
| Latin lipsum (ASCII) | 84.902 KiB | 36.89 GiB/s | 0.003 ms |
| Complex 1 | 79.771 KiB | 0.23 GiB/s | 0.329 ms |

V8/JSC is not the issue, suboptimal code is.

cc @nodejs/performance

@nodejs-github-bot

Review requested:

  • @nodejs/startup

@nodejs-github-bot added labels Dec 18, 2025: c++ (Issues and PRs that require attention from people who are familiar with C++), lib / src (Issues and PRs related to general changes in the lib or src directory), needs-ci (PRs that need a full CI run).
@ChALkeR changed the title from "perf: move all 1-byte encodings to native" to "src: move all 1-byte encodings to native" Dec 18, 2025
@ChALkeR force-pushed the chalker/decoder/single-byte/1 branch 7 times, most recently from 6130c12 to 4be8283, December 19, 2025 00:16
'windows-1257',
'windows-1258',
'x-user-defined', // Has to be last, special case
];

I guess you don’t need this array long term. Can you create it inline with the Set?
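
A rough sketch of what the inline form could look like. `singleByteEncodings` is a hypothetical name, and only the three labels visible in the diff context above are shown; the rest of the list is omitted here, not known to be empty.

```javascript
// Hypothetical name; the real list has more entries than the three
// visible in the diff context, which are the only ones repeated here.
const singleByteEncodings = new Set([
  // ...remaining labels omitted
  'windows-1257',
  'windows-1258',
  'x-user-defined', // Has to be last, special case
]);
```

A `Set` iterates in insertion order, so `'x-user-defined'` would still come last. Any code that relies on numeric indices into the array would still need the array, or some other lookup.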

@ChALkeR force-pushed the chalker/decoder/single-byte/1 branch from 4be8283 to 6a46ae3, December 19, 2025 16:58