1
0
mirror of https://github.com/php/php-src.git synced 2026-04-11 10:03:18 +02:00
Files
archived-php-src/ext/mbstring
Alex Dowad ffbddc4848 Optimize conversion of GB18030 to Unicode
As with CP936, iterating over the PUA table and looking for matches in
it was a significant bottleneck for GB18030 decoding (though not as
severe a bottleneck as for CP936, since more is involved in GB18030
decoding than CP936 decoding).

Here are some benchmark results after optimizing out that bottleneck:

    GB18030, medium - to UTF-16BE - faster by 60.71% (0.0007 vs 0.0017)
    GB18030, medium - to UTF-8    - faster by 59.88% (0.0007 vs 0.0017)
    GB18030, long - to UTF-8      - faster by 44.91% (0.0669 vs 0.1214)
    GB18030, long - to UTF-16BE   - faster by 43.05% (0.0672 vs 0.1181)
    GB18030, short - to UTF-8     - faster by 27.22% (0.0003 vs 0.0004)
    GB18030, short - to UTF-16BE  - faster by 26.98% (0.0003 vs 0.0004)

(The 'short' test strings had 0-5 codepoints each, 'medium' ~100
codepoints, and 'long' ~10,000 codepoints. For each benchmark, the
test harness cycled through all the test strings 40,000 times.)
2023-01-04 21:58:27 +02:00
..
2022-08-05 17:06:23 +02:00
2021-09-20 09:58:20 +02:00