
[0083] Consolidate string->utf8 tests and move the implementation down to native C++ #881

Merged
da-liii merged 2 commits into main from da/0083/string_to_utf8
Jun 28, 2026

@da-liii da-liii commented Jun 28, 2026

Contributor

Summary

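  • Move `string->utf8` from a pure Scheme implementation down to native C++ (new file `src/scheme_base.cpp`, belonging to `scheme base`)
  • Use the public API `s7_string_length` to get the byte length (so strings containing NUL are supported), and implement UTF-8 character width detection and byte positioning ourselves, without touching s7 internal macros
  • `string->utf8` in `goldfish/scheme/base.scm` now forwards to the native implementation; the original Scheme loop is removed
  • Also consolidate the `string->utf8` tests: merge the two scattered test files into `tests/scheme/base/string-to-utf8-test.scm` (47 checks) and delete `tests/liii/unicode/string-to-utf8-test.scm`
  • Add `bench/string-to-utf8-perf.scm`, covering 1/2/3/4-byte code points × 5 length tiers, plus start/end slicing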
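
The width detection and byte positioning mentioned above can be sketched roughly as follows. This is an illustrative sketch only, not the code in `src/scheme_base.cpp`: the helper names `utf8_char_width`, `utf8_byte_offset`, and `utf8_slice_bytes` are hypothetical, and the sketch takes an explicit byte length (as `s7_string_length` would provide) so that embedded NUL bytes are not treated as terminators. It checks only lead bytes and bounds, not continuation bytes or overlong forms.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Width in bytes of the UTF-8 sequence introduced by `lead`,
// or 0 if `lead` cannot start a sequence (continuation or invalid byte).
// Hypothetical helper; only the lead byte is inspected.
inline std::size_t utf8_char_width(unsigned char lead) {
  if (lead < 0x80) return 1;            // 0xxxxxxx
  if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
  if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx
  if ((lead & 0xF8) == 0xF0) return 4;  // 11110xxx
  return 0;
}

// Byte offset of character index `char_index` inside data[0, len).
// An index equal to the character count maps to `len`.
// Returns nullopt if the index is out of range or a sequence is malformed/truncated.
inline std::optional<std::size_t> utf8_byte_offset(const char* data,
                                                   std::size_t len,
                                                   std::size_t char_index) {
  std::size_t pos = 0;
  for (std::size_t i = 0; i < char_index; ++i) {
    if (pos >= len) return std::nullopt;
    std::size_t w = utf8_char_width(static_cast<unsigned char>(data[pos]));
    if (w == 0 || pos + w > len) return std::nullopt;
    pos += w;
  }
  return pos;
}

// Bytes covering characters [start, end) of `s`, where start and end are
// character indices. Hypothetical stand-in for the start/end arguments of string->utf8.
inline std::optional<std::string> utf8_slice_bytes(const std::string& s,
                                                   std::size_t start,
                                                   std::size_t end) {
  if (start > end) return std::nullopt;
  auto begin_off = utf8_byte_offset(s.data(), s.size(), start);
  if (!begin_off) return std::nullopt;
  // Continue scanning from begin_off instead of rescanning from the front.
  auto span = utf8_byte_offset(s.data() + *begin_off, s.size() - *begin_off,
                               end - start);
  if (!span) return std::nullopt;
  return s.substr(*begin_off, *span);
}
```

Because every function works on a pointer plus an explicit length, a NUL byte is just another 1-byte character, which matches the reason given above for taking the byte length from `s7_string_length`.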

Performance comparison (debug build, average time per call, strings pre-constructed)

| Length | ASCII (1B) before → after | Speedup | Chinese (3B) before → after | Speedup |
|---|---|---|---|---|
| 1 | ~22µs → ~2.1µs | ~11x | ~29µs → ~2.2µs | ~13x |
| 10 | ~133µs → ~2.2µs | ~60x | ~198µs → ~2.3µs | ~87x |
| 100 | ~1.25ms → ~2.8µs | ~446x | ~1.88ms → ~3.1µs | ~606x |
| 1000 | ~12.3ms → ~10µs | ~1230x | ~18.7ms → ~13µs | ~1438x |
| 10000 | ~123ms → ~82µs | ~1494x | ~193ms → ~97µs | ~1990x |

start/end slicing (1000-character Chinese string): full range ~1357x, first half ~1167x, second half ~1538x, middle ~1365x. The full bench run dropped from ~40s to ~1.1s.

Test plan

  • `bin/gf tests/scheme/base/string-to-utf8-test.scm`: 47 correct, 0 failed
  • `bin/gf tests/scheme/base/utf8-to-string-test.scm`: 9 correct, 0 failed (depends on `string->utf8`)
  • `bin/gf tests/liii/base64/base64-encode-test.scm`: 10 correct, 0 failed (`string-base64-encode` calls `string->utf8` internally)
  • `bin/gf tests/liii/base64/bytevector-base64-encode-test.scm`: 22 correct, 0 failed
  • `bin/gf bench/string-to-utf8-perf.scm`: before/after data recorded

🤖 Generated with Claude Code

da-liii and others added 2 commits June 28, 2026 23:23
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@da-liii da-liii merged commit 57ddc4e into main Jun 28, 2026
5 checks passed
@da-liii da-liii deleted the da/0083/string_to_utf8 branch June 28, 2026 15:31