archived-php-src

mirror of https://github.com/php/php-src.git synced 2026-04-02 05:32:28 +02:00

Files

Alex Dowad be1a215538 Optimize (AND FIX) mb_check_encoding (cut execution time by 50%+)

Previously, `mb_check_encoding` did an awful lot of unneeded work. In order to
determine whether a string was valid or not, it would convert the whole string
into wchar (code points), which required dynamically allocating a (potentially
large) buffer. Then it would turn right around and convert that big ol' buffer
of code points back to the original encoding again. Finally, it would check
whether any invalid bytes were detected during that long and onerous process.

The thing is, mbstring _already_ has machinery for detecting whether a string
is valid in a certain encoding or not, and it doesn't require copying any data
around or allocating buffers. Better yet, it can fail fast when an invalid byte
is found. Why not use it? It's sure a lot faster!
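
To illustrate the fail-fast idea, here is a minimal sketch of an in-place validator for one encoding (UTF-8). It is not the libmbfl code: the function name `sketch_utf8_is_valid` is hypothetical, and mbstring's real machinery is organized differently. The point is only that validation can scan the input bytes directly, allocate nothing, and return at the first bad byte.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, NOT part of libmbfl: returns true if buf[0..len) is
 * well-formed UTF-8. It scans in place, allocates no buffer, and returns as
 * soon as the first invalid byte is seen. */
bool sketch_utf8_is_valid(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = buf[i];
        size_t need;        /* continuation bytes expected after the lead byte */
        unsigned long min;  /* smallest code point allowed for this length */
        unsigned long cp;   /* code point being assembled */

        if (c < 0x80) {     /* ASCII: always valid */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            need = 1; min = 0x80;    cp = c & 0x1F;
        } else if ((c & 0xF0) == 0xE0) {
            need = 2; min = 0x800;   cp = c & 0x0F;
        } else if ((c & 0xF8) == 0xF0) {
            need = 3; min = 0x10000; cp = c & 0x07;
        } else {
            return false;   /* stray continuation byte or invalid lead byte */
        }

        if (len - i - 1 < need)
            return false;   /* sequence is truncated */

        for (size_t k = 1; k <= need; k++) {
            unsigned char cc = buf[i + k];
            if ((cc & 0xC0) != 0x80)
                return false;   /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min)
            return false;   /* overlong form */
        if (cp > 0x10FFFF)
            return false;   /* beyond the Unicode range */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return false;   /* surrogate code point */

        i += need + 1;
    }
    return true;
}
```

Calling it on a string whose first byte is invalid returns after inspecting that one byte, no matter how long the rest of the string is.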

Further, the legacy code was also badly broken. Why? Because aside from
checking whether illegal characters were detected, it would also check whether
the conversion to and from wchars was lossless. But some encodings have
more than one valid byte sequence for the same character. In such cases, it is
not possible to make the conversion to and from wchars lossless for every
valid character. So `mb_check_encoding` would actually reject good strings
in a lot of encodings!
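
The round-trip problem can be shown with a toy example. Everything below is hypothetical: the "toy" encoding is invented for illustration (bytes 0x00-0x7F are ASCII, and byte 0x80 is a second, equally valid way to write 'A'), and none of these functions exist in mbstring. The sketch compares a round-trip check, like the one described above, with a check that only asks whether each byte decodes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Invented toy encoding, for illustration only:
 *   0x00-0x7F  decode to the same ASCII code point
 *   0x80       also decodes to U+0041 ('A'), a valid duplicate
 *   0x81-0xFF  are invalid */
static bool toy_decode(unsigned char b, unsigned long *cp)
{
    if (b < 0x80)  { *cp = b;    return true; }
    if (b == 0x80) { *cp = 0x41; return true; }
    return false;
}

/* The encoder has to pick ONE byte per code point, so it always emits the
 * ASCII byte and can never produce 0x80. Only called with cp < 0x80. */
static unsigned char toy_encode(unsigned long cp)
{
    return (unsigned char)cp;
}

/* Round-trip style check: decode, re-encode, and require identical bytes. */
bool toy_check_roundtrip(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        unsigned long cp;
        if (!toy_decode(buf[i], &cp))
            return false;               /* genuinely invalid byte */
        if (toy_encode(cp) != buf[i])
            return false;               /* valid byte, but not the canonical one */
    }
    return true;
}

/* Direct check: a string is valid if every byte decodes. */
bool toy_check_direct(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        unsigned long cp;
        if (!toy_decode(buf[i], &cp))
            return false;
    }
    return true;
}
```

For the two-byte input `{0x48, 0x80}`, every byte is valid in the toy encoding, so `toy_check_direct` accepts it, while `toy_check_roundtrip` rejects it because 0x80 re-encodes as 0x41.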

2020-11-02 21:31:06 +02:00

libmbfl

Remove dead code from mbfilter_koi8u.c (and do general code cleanup)

2020-11-02 21:31:06 +02:00

tests

Add test suite for KOI8-U encoding

2020-11-02 21:31:06 +02:00

ucgendat

[ci skip] Move OpenLDAP license to redistributable info file

2019-05-06 23:02:46 +02:00

config.m4

Remove redundant includes from mbstring (and make sure correct config.h is used)