mirror of
https://github.com/php/php-src.git
synced 2026-03-24 08:12:21 +01:00
When converting text to/from wchars, mbstring makes one function call for each and every byte or wchar to be converted. Typically, each of these conversion functions contains a state machine, and its state has to be restored and then saved for every single one of these calls. It doesn't take much to see that this is grossly inefficient. Instead of converting one byte or wchar on each call, the new conversion functions will either fill up or drain a whole buffer of wchars on each call. In benchmarks, this is about 3-10× faster. Adding the new, faster conversion functions for all supported legacy text encodings still needs some work. Also, all the code which uses the old-style conversion functions needs to be converted to use the new ones. After that, the old code can be dropped. (The mailparse extension will also have to be fixed up so it will still compile.)
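The difference between the two calling conventions described above can be sketched in C. This is only an illustration under stated assumptions: `toy_conv_state`, `toy_convert_one`, and `toy_convert_buf` are hypothetical names, not mbstring's actual structures or functions, and the toy encoding is Latin-1 (each byte maps directly to the code point of the same value), so no real state machine is involved.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical converter state for illustration; not mbstring's real struct. */
typedef struct {
    const unsigned char *in; /* next unread input byte */
    size_t in_len;           /* input bytes remaining */
} toy_conv_state;

/* Old style: one call per byte. The state is loaded and stored on every call.
 * Returns 1 if a wchar was produced, 0 if the input is exhausted. */
static int toy_convert_one(toy_conv_state *st, uint32_t *out)
{
    if (st->in_len == 0) {
        return 0;
    }
    *out = *st->in; /* Latin-1: byte value equals code point */
    st->in++;
    st->in_len--;
    return 1;
}

/* New style: fill up to buf_len wchars per call. The state is loaded once
 * before the loop and stored once after it. Returns the number written. */
static size_t toy_convert_buf(toy_conv_state *st, uint32_t *buf, size_t buf_len)
{
    const unsigned char *p = st->in;
    size_t n = st->in_len < buf_len ? st->in_len : buf_len;

    for (size_t i = 0; i < n; i++) {
        buf[i] = p[i];
    }

    st->in = p + n;
    st->in_len -= n;
    return n;
}
```

A caller of the buffer-based version would loop until it returns 0, handling a whole buffer of wchars per call instead of paying the call and state overhead once per byte.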
20 lines
536 B
PHP
--TEST--
Exhaustive test of verification and conversion of CP850 text
--EXTENSIONS--
mbstring
--SKIPIF--
<?php
if (getenv("SKIP_SLOW_TESTS")) die("skip slow test");
?>
--FILE--
<?php
include('encoding_tests.inc');
testEncodingFromUTF16ConversionTable(__DIR__ . '/data/CP850.txt', 'CP850');
/* Try replacement character which cannot be encoded in CP850; ? will be used instead */
mb_substitute_character(0x1234);
convertInvalidString("\x23\x45", '?', 'UTF-16BE', 'CP850');
?>
--EXPECT--
Tested CP850 -> UTF-16BE
Tested UTF-16BE -> CP850