archived-php-src/ext/mbstring at ffbddc484884b275b7f2eb4d6cf7e0035c1c6200

php/archived-php-src

mirror of https://github.com/php/php-src.git synced 2026-04-11 10:03:18 +02:00

Alex Dowad ffbddc4848 Optimize conversion of GB18030 to Unicode

As with CP936, iterating over the PUA table and looking for matches in
it was a significant bottleneck for GB18030 decoding (though not as
severe a bottleneck as for CP936, since more is involved in GB18030
decoding than CP936 decoding).
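
The commit message does not say how the table scan was removed. As a rough illustration of why a per-character linear scan is costly and how it can be avoided, here is a minimal sketch that compares a linear scan with a binary search over a sorted mapping table. Everything in it is an assumption made for illustration: the demo_entry type, the demo_table values, and both lookup functions are hypothetical and are not taken from libmbfl.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: a legacy two-byte code and the Unicode
 * codepoint it maps to. This is NOT the table layout used in libmbfl. */
typedef struct {
    uint16_t legacy;
    uint32_t unicode;
} demo_entry;

/* Made-up values for illustration, sorted ascending by 'legacy'. */
static const demo_entry demo_table[] = {
    {0x1000, 0xE000},
    {0x1004, 0xE001},
    {0x1010, 0xE002},
    {0x2000, 0xE003},
    {0x2abc, 0xE004},
};
#define DEMO_LEN (sizeof(demo_table) / sizeof(demo_table[0]))

/* Baseline: O(n) scan over every entry. Returns 0 when not found. */
static uint32_t linear_lookup(const demo_entry *table, size_t n, uint16_t code)
{
    assert(table != NULL);
    for (size_t i = 0; i < n; i++) {
        if (table[i].legacy == code)
            return table[i].unicode;
    }
    return 0;
}

/* Alternative: O(log n) binary search. Requires a sorted table.
 * Returns 0 when not found. */
static uint32_t binary_lookup(const demo_entry *table, size_t n, uint16_t code)
{
    assert(table != NULL);
    size_t lo = 0, hi = n; /* search the half-open range [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (table[mid].legacy == code)
            return table[mid].unicode;
        if (table[mid].legacy < code)
            lo = mid + 1;
        else
            hi = mid;
    }
    return 0;
}
```

Both functions return the same result for any code. The difference is only in how many entries are examined per decoded character, which matters when the lookup runs once for every input character.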

Here are some benchmark results after optimizing out that bottleneck:

    GB18030, medium - to UTF-16BE - faster by 60.71% (0.0007 vs 0.0017)
    GB18030, medium - to UTF-8    - faster by 59.88% (0.0007 vs 0.0017)
    GB18030, long - to UTF-8      - faster by 44.91% (0.0669 vs 0.1214)
    GB18030, long - to UTF-16BE   - faster by 43.05% (0.0672 vs 0.1181)
    GB18030, short - to UTF-8     - faster by 27.22% (0.0003 vs 0.0004)
    GB18030, short - to UTF-16BE  - faster by 26.98% (0.0003 vs 0.0004)

(The 'short' test strings had 0-5 codepoints each, 'medium' ~100
codepoints, and 'long' ~10,000 codepoints. For each benchmark, the
test harness cycled through all the test strings 40,000 times.)
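
The "faster by" figures above appear to be the relative reduction in run time, (old - new) / old. That reading is an assumption: the harness itself is not shown here, and percentages recomputed from the rounded timings will differ slightly from the reported ones. A minimal sketch, with percent_faster as a hypothetical helper name:

```c
/* Relative reduction in run time, as a percentage of the old time.
 * Assumed formula; the real benchmark harness is not shown. */
static double percent_faster(double old_secs, double new_secs)
{
    return (old_secs - new_secs) / old_secs * 100.0;
}
```

For the "long, to UTF-8" row, percent_faster(0.1214, 0.0669) gives roughly 44.9, in line with the reported 44.91%. For the short and medium rows the timings are rounded to one significant digit, so the recomputed value is only a coarse approximation.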

2023-01-04 21:58:27 +02:00

Optimize conversion of GB18030 to Unicode

2023-01-04 21:58:27 +02:00

Implement mb_detect_encoding using fast text conversion filters

2023-01-03 09:10:10 +02:00

Optimize mb_str{,im}width for performance

2021-09-29 18:19:01 +02:00

common_codepoints.txt

Improve mb_detect_encoding's recognition of Turkish text

2022-12-30 14:22:46 +02:00

config.m4

Move mobile variants of SJIS into mbfilter_sjis.c

2022-12-12 16:28:49 +02:00

config.w32

Move mobile variants of SJIS into mbfilter_sjis.c

2022-12-12 16:28:49 +02:00

CREDITS

…

gen_rare_cp_bitvec.php

Improve detection accuracy of mb_detect_encoding

2021-10-19 18:05:51 +02:00

mb_gpc.c

Remove unused 'to_language' and 'from_language' struct fields

2022-08-16 16:43:26 +02:00

mb_gpc.h

Remove unused 'to_language' and 'from_language' struct fields

2022-08-16 16:43:26 +02:00

mbstring_arginfo.h

Do not generate CONST_CS when registering constants (#9439)

2022-08-28 08:27:19 +02:00

mbstring.c

Use smart_str in mb_http_input rather than mbfl_memory_device

2023-01-03 09:10:13 +02:00

mbstring.h

Implement mb_output_handler using fast text conversion filters

2023-01-03 09:02:21 +02:00

mbstring.stub.php

Fix mb_strimwidth RC info

2022-08-05 17:06:23 +02:00

php_mbregex.c

Reduce memory allocated by var_export, json_encode, serialize, and other (#8902)

2022-07-08 14:47:46 +02:00

php_mbregex.h

Declare ext/mbstring constants in stubs (#8798)

2022-06-23 17:34:08 +02:00

php_onig_compat.h

…

php_unicode.c

Speed boost for mb_stripos (when not using UTF-8)

2022-12-18 15:31:20 +02:00

php_unicode.h

Speed boost for mb_stripos (when not using UTF-8)

2022-12-18 15:31:20 +02:00

rare_cp_bitvec.h

Improve mb_detect_encoding's recognition of Turkish text

2022-12-30 14:22:46 +02:00

unicode_data.h

Update Unicode tables to 14.0.0

2021-09-20 09:58:20 +02:00