Commit fdf4cab
fix: remove UTF-8 fast path that could bypass encoding conversion
The UTF-8 fast path (from_utf8 + to_owned) skipped the encoding_rs
decoder when input validated as UTF-8. This is unsafe for the crate's
purpose: bytes detected as a non-UTF-8 encoding might partially validate
as UTF-8 but need proper re-encoding. The decoder also handles
replacement characters for invalid sequences, which the fast path
skipped.
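
The hazard described above can be sketched with the standard library alone. This is an illustrative sketch, not the crate's code: `fast_path` and `decode_latin1` are hypothetical names, and the hand-rolled ISO-8859-1 decoder is only a stand-in for the encoding_rs decoder that the real code path uses.

```rust
use std::str;

/// Hypothetical sketch of the removed shortcut: if the bytes validate as
/// UTF-8, copy them out unchanged and skip the decoder entirely.
fn fast_path(bytes: &[u8]) -> Option<String> {
    str::from_utf8(bytes).ok().map(|s| s.to_owned())
}

/// Hypothetical stand-in for a real decoder. ISO-8859-1 maps every byte
/// to the Unicode code point with the same numeric value.
fn decode_latin1(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| b as char).collect()
}
```

For the bytes `[0x63, 0x61, 0x66, 0xC3, 0xA9]`, `fast_path` returns `Some("café")` because `0xC3 0xA9` happens to be a valid UTF-8 sequence, while `decode_latin1` returns `"cafÃ©"`. If the input was detected as ISO-8859-1, only the second result honors that detection, which is why validating as UTF-8 is not enough to justify skipping the decoder.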
Keeps: pre-allocated output, larger decode buffers, flattened binary
detection, SIMD search via memchr.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent df28fab commit fdf4cab
2 files changed
Lines changed: 1 addition & 24 deletions
| File | Original file lines | Diff lines | Change |
|---|---|---|---|
| File 1 (name not captured) | 3 | 3 | 1 line replaced (1 deletion, 1 addition) |
| File 2 (name not captured) | 81-89 | none | 9 lines deleted |
| File 2 (name not captured) | 393-406 | none | 14 lines deleted |

The contents of the changed lines were not captured.