mirror of https://github.com/php/php-src.git synced 2026-03-28 02:02:32 +01:00
Alex Dowad 8f84192403 Fix mangled kana output for JIS encoding
For JIS encoding, hiragana and katakana can be input in multiple forms.
One form uses JISX 0201 escape sequences. Another is called 'GR-invoked'
kana.

In the context of ISO-2022 encoding, bytes with the MSB clear are
called "GL" (or "graphics left") and those with the MSB set are
called "GR" (or "graphics right"). Regarding the variants of
ISO-2022-JP which are called "JIS7" and "JIS8", Wikipedia states:

"Other, older variants known as JIS7 and JIS8 build directly on the
7-bit and 8-bit encodings defined by JIS X 0201 and allow use of JIS X
0201 kana from G1 without escape sequences, using Shift Out and Shift
In or setting the eighth bit (GR-invoked), respectively."
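
To make the GL/GR distinction concrete, here is a minimal sketch. The
helper name iso2022_half() is hypothetical and is not part of mbstring;
it only tests the eighth bit as described above.

```php
<?php
// Hypothetical helper (not an mbstring function): classify a byte
// value by its most significant bit.
function iso2022_half(int $byte): string
{
    // MSB clear (0x00-0x7F) is GL; MSB set (0x80-0xFF) is GR.
    return ($byte & 0x80) === 0 ? 'GL' : 'GR';
}

echo iso2022_half(0x23), "\n"; // GL
echo iso2022_half(0xA3), "\n"; // GR

// Clearing the eighth bit of a GR byte gives its GL counterpart.
printf("%02x\n", 0xA3 & 0x7F); // 23
```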

In harmony with this, we have always accepted bytes from 0xA3-0xDF and
decoded them to the corresponding hiragana/katakana. However, at some
point I accidentally broke output for these kana. You can see the
problem on 3v4l.org by running this program:

    <?php
    echo bin2hex(mb_convert_encoding("\xA3", 'JIS', 'JIS'));

The results are:

    Output for 8.2rc1 - rc3
    1b244200231b2842
    Output for 7.4.0 - 7.4.33, 8.0.1 - 8.0.25, 8.1.12
    1b2849231b2842
    Output for 8.1.0 - 8.1.11
    1b284923

You can see that from 8.1.0 - 8.1.11, the escape sequence at the end
was missing. That happened because the flush functions were not being
called properly; that bug has already been fixed. However, this also
shows that the output for 8.2rc1-rc3 is completely invalid.
It is trying to output a JISX 0208 sequence, but with 0x00 as one of
the JISX 0208 bytes, which is illegal.
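
The hex dumps above can be read by splitting them at the escape
sequences. The following is only an illustrative sketch: describe_jis()
and valid_jisx0208_pair() are hypothetical helpers, not mbstring code,
and they recognize only the three escape sequences which appear in the
outputs quoted above.

```php
<?php
// Escape sequences seen in the quoted outputs.
const JIS_ESCAPES = [
    "\x1B\$B" => 'JIS X 0208',      // 1b 24 42
    "\x1B(I"  => 'JIS X 0201 kana', // 1b 28 49
    "\x1B(B"  => 'ASCII',           // 1b 28 42
];

// Hypothetical helper: split a JIS byte string into
// [charset, hex payload] pairs, one per encoded character.
function describe_jis(string $bytes): array
{
    $runs = [];
    $charset = 'ASCII'; // state before any escape sequence
    $len = strlen($bytes);
    for ($i = 0; $i < $len; ) {
        $esc = substr($bytes, $i, 3);
        if (isset(JIS_ESCAPES[$esc])) {
            $charset = JIS_ESCAPES[$esc];
            $i += 3;
            continue;
        }
        // JIS X 0208 characters take two bytes, the others one.
        $width = ($charset === 'JIS X 0208') ? 2 : 1;
        $runs[] = [$charset, bin2hex(substr($bytes, $i, $width))];
        $i += $width;
    }
    return $runs;
}

// Hypothetical helper: both bytes of a JIS X 0208 character must be
// in the range 0x21-0x7E.
function valid_jisx0208_pair(string $hex): bool
{
    if (strlen($hex) !== 4) {
        return false;
    }
    foreach (str_split($hex, 2) as $byte) {
        $v = hexdec($byte);
        if ($v < 0x21 || $v > 0x7E) {
            return false;
        }
    }
    return true;
}

$bad  = describe_jis(hex2bin('1b244200231b2842')); // 8.2rc1 - rc3
$good = describe_jis(hex2bin('1b2849231b2842'));   // 7.4, 8.0, 8.1.12

echo $bad[0][0], ' ', $bad[0][1], "\n";   // JIS X 0208 0023
echo $good[0][0], ' ', $good[0][1], "\n"; // JIS X 0201 kana 23
var_dump(valid_jisx0208_pair($bad[0][1])); // bool(false)
```

Read this way, the 8.2rc1-rc3 output switches to JISX 0208 and then
emits the byte pair 00 23, while the older output switches to JISX 0201
kana and emits the single byte 23.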

Add the missing code which will make the new text conversion filters
behave the same as the old ones when outputting hiragana/katakana in
JIS encoding.
2022-11-22 15:49:19 +02:00