Commit 70e7587
perf: adaptive SIMD search, flattened binary detection, deduplicated decode
- Fix detect_encoding loop that never iterated past first <meta> tag
- Fix detect_html_metadata early-returning None on bad lang bytes
- Extract shared decode_to_string() helper (removed 130 lines of duplication)
- Extract shared extract_quoted_or_unquoted() for attribute parsing
- Remove dead dependencies: phf_codegen, percent-encoding
- Add memchr for SIMD-accelerated byte search
- Adaptive find_short(): scalar loop for <128 byte haystacks (avoids SIMD
setup overhead), memchr + verify for larger inputs
- Flatten is_binary_file: sorted static table + binary_search replaces
double PHF lookup, eliminates string hashing
- Pre-allocate output String with input capacity, 4x larger decode buffers
- Stack-based ASCII lowercasing in encoding_for_locale (no heap alloc)
- Replace .to_vec() allocations with str::from_utf8 in detection paths
- Add criterion benchmarks and 14 new tests
- Update dependencies

1 parent e9a1c80 commit 70e7587
5 files changed
Lines changed: 954 additions & 365 deletions
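
The adaptive find_short() bullet describes a two-path search. A minimal sketch of that idea follows. It is not the code from this commit: the signature and the `find_byte` helper are hypothetical, and `find_byte` is a std-only stand-in for where a call such as `memchr::memchr(byte, haystack)` would go. Only the 128-byte threshold comes from the commit message.

```rust
/// Threshold from the commit message: haystacks under 128 bytes take the scalar path.
const SCALAR_THRESHOLD: usize = 128;

/// Hypothetical std-only stand-in for `memchr::memchr(byte, haystack)`.
fn find_byte(byte: u8, haystack: &[u8]) -> Option<usize> {
    haystack.iter().position(|&b| b == byte)
}

/// Assumed signature; the real `find_short` may differ.
fn find_short(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    if needle.is_empty() {
        return Some(0);
    }
    if needle.len() > haystack.len() {
        return None;
    }
    if haystack.len() < SCALAR_THRESHOLD {
        // Small input: a plain window scan, no accelerated search involved.
        return haystack.windows(needle.len()).position(|w| w == needle);
    }
    // Larger input: jump to each occurrence of the first needle byte,
    // then verify the remaining bytes at that position.
    let first = needle[0];
    let last_start = haystack.len() - needle.len();
    let mut offset = 0;
    while offset <= last_start {
        let pos = offset + find_byte(first, &haystack[offset..=last_start])?;
        if &haystack[pos..pos + needle.len()] == needle {
            return Some(pos);
        }
        offset = pos + 1;
    }
    None
}
```

Both paths return the index of the first match, so callers do not need to know which one ran. The threshold is a tuning knob; the criterion benchmarks added in this commit are the place to validate it.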
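
The is_binary_file bullet replaces hashed lookups with a sorted static table and binary_search. The sketch below illustrates that technique under stated assumptions: the extension list, the `MAX_EXT_LEN` limit, and the function signature are all illustrative and not taken from the commit. Lowercasing into a stack buffer is borrowed from the technique the commit names for encoding_for_locale; whether is_binary_file does the same is an assumption.

```rust
/// Illustrative extensions only. The table must stay sorted for binary search.
static BINARY_EXTENSIONS: &[&str] = &["exe", "gif", "jpg", "pdf", "png", "zip"];

/// Hypothetical upper bound on extension length, so lowercasing fits on the stack.
const MAX_EXT_LEN: usize = 8;

/// Assumed signature; the real `is_binary_file` may differ.
fn is_binary_file(path: &str) -> bool {
    // Take the text after the last '.', rejecting dots that belong to a directory name.
    let ext = match path.rsplit_once('.') {
        Some((_, ext))
            if !ext.is_empty() && ext.len() <= MAX_EXT_LEN && !ext.contains('/') =>
        {
            ext
        }
        _ => return false,
    };

    // Lowercase into a fixed-size stack buffer, avoiding a heap allocation.
    let mut buf = [0u8; MAX_EXT_LEN];
    buf[..ext.len()].copy_from_slice(ext.as_bytes());
    buf[..ext.len()].make_ascii_lowercase();
    let lower = &buf[..ext.len()];

    // Byte-wise comparisons against the sorted table; no string hashing.
    BINARY_EXTENSIONS
        .binary_search_by(|probe| probe.as_bytes().cmp(lower))
        .is_ok()
}
```

A table like this only works while it stays sorted, so a test that asserts strict ordering of adjacent entries is a cheap guard against a misplaced addition.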