P2728: loosen wording to admit chunked (SIMD) implementations #237

Merged: ednolan merged 1 commit into main from enolan_simdwording2 on Jun 9, 2026

Conversation

ednolan (Member) commented Jun 9, 2026

The wording previously pinned the transcoding iterator to decoding one code point per read: buf_ was sized to hold exactly one transcoded code point, and base() and iterator equality were specified directly in terms of the exposition-only members, making the buffer anchor observable and blocking an as-if chunked implementation. Loosen it so an implementation may transcode chunks of input at a time, e.g. with SIMD:

  • Make buf_'s capacity an unspecified constant, buffer-capacity, of at least 4 / sizeof(ToType), and widen buf_index_ and to_increment_ accordingly. Chunking is permitted only when the underlying range models forward_range; read-ahead on a single-pass range is destructive and therefore observable.
  • Allow read() to transcode an implementation-chosen number n >= 1 of consecutive input subsequences per invocation, provided the result fits in buf_. The choice of n is unobservable and may vary from call to call: an implementation typically transcodes a fixed-size window of code units, trimmed back to whole input subsequences. Substitution of Maximal Subparts applies per input subsequence exactly as before.
  • Allow read-reverse() to chunk symmetrically (n subsequences ending at current_), specify explicitly that it leaves current_ at the beginning of the first subsequence it transcoded, and note that the chunk's starting boundary is locatable by bounded backward scanning.
  • Respecify base() and iterator equality positionally -- in terms of the position of the current input subsequence and the offset of the current element within its transcoded code units -- instead of by memberwise comparison, since iterators denoting the same element can hold different buffer anchors once buffers are chunked.
  • Make operator-- skip an ill-formed input subsequence's three-unit replacement-character encoding as a unit in _or_error views with char8_t output, mirroring operator++: with chunked buffers those code units can sit mid-buffer, and the skip preserves the canonical first-unit position that positional equality relies on.
  • Add a "SIMD support" design discussion section: iterator size and ABI implications of the buffer capacity, base() implementation strategy, validation-plus-scalar-fallback for Substitution of Maximal Subparts (SMS) error handling, interactivity considerations, preliminary UTF-16 to UTF-8 performance numbers from the prototype std::simd kernel (enolan_simd4 branch), and why a view cannot approach bulk transcoding speed. Update the changelog.
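
As an illustration of the first two bullets, here is a minimal sketch of the minimum buffer capacity and of trimming a fixed-size window back to whole input subsequences. It is not the paper's wording or the prototype's code. The names min_buffer_capacity, utf8_expected_length, is_continuation, and trim_to_whole_subsequences are hypothetical, `char` stands in for char8_t so the sketch compiles before C++20, and the trimming step assumes well-formed UTF-8 input (a real implementation also has to handle ill-formed subsequences).

```cpp
#include <cstddef>
#include <string_view>

// Lower bound on buf_'s capacity as stated above: 4 / sizeof(ToType) code
// units, i.e. room for one code point in the output encoding.
template <class ToType>
constexpr std::size_t min_buffer_capacity() {
    return 4 / sizeof(ToType);
}

static_assert(min_buffer_capacity<char>() == 4);      // UTF-8 output
static_assert(min_buffer_capacity<char16_t>() == 2);  // UTF-16 output
static_assert(min_buffer_capacity<char32_t>() == 1);  // UTF-32 output

// Hypothetical helper: length of the UTF-8 sequence announced by a lead
// byte, or 0 if the byte cannot start a sequence.
constexpr std::size_t utf8_expected_length(unsigned char lead) {
    if (lead < 0x80) return 1;
    if (lead >= 0xC2 && lead <= 0xDF) return 2;
    if (lead >= 0xE0 && lead <= 0xEF) return 3;
    if (lead >= 0xF0 && lead <= 0xF4) return 4;
    return 0;
}

constexpr bool is_continuation(unsigned char u) {
    return (u & 0xC0) == 0x80;
}

// Hypothetical helper: given a fixed-size window of UTF-8 code units,
// return how many leading code units form whole sequences, so a chunked
// read never splits a code point. Assumes well-formed input.
constexpr std::size_t trim_to_whole_subsequences(std::string_view window) {
    const std::size_t end = window.size();
    // A UTF-8 sequence has at most 3 continuation bytes, so the scan is bounded.
    for (std::size_t back = 1; back <= 3 && back <= end; ++back) {
        const auto u = static_cast<unsigned char>(window[end - back]);
        if (is_continuation(u)) continue;
        // Found the last lead byte: drop it if its sequence is cut off.
        return utf8_expected_length(u) > back ? end - back : end;
    }
    // Empty window, or three trailing continuation bytes, which in
    // well-formed input complete a 4-byte sequence.
    return end;
}
```

The same bounded scan is what makes a chunk's starting boundary locatable when reading in reverse: at most three code units need to be inspected to find a sequence boundary in well-formed UTF-8.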

coveralls commented Jun 9, 2026

Coverage Status

coverage: 99.744%. remained the same — enolan_simdwording2 into main

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@ednolan ednolan force-pushed the enolan_simdwording2 branch from ac3cf3d to 7d9aefd June 9, 2026 23:47
@ednolan ednolan merged commit 9a87f3b into main Jun 9, 2026
53 checks passed
@ednolan ednolan deleted the enolan_simdwording2 branch June 9, 2026 23:54

2 participants