If the `ICONV_MIME_DECODE_CONTINUE_ON_ERROR` flag is set, parsing
should not fail, if there are illegal characters in the headers;
instead we silently ignore these like before.
This patch adds missing newlines, trims multiple redundant final
newlines into a single one, and trims redundant leading newlines in all
*.phpt sections.
According to POSIX, a line is a sequence of zero or more non-' <newline>'
characters plus a terminating '<newline>' character. [1] Files should
normally have at least one final newline character.
C89 [2] and later standards [3] mention a final newline:
"A source file that is not empty shall end in a new-line character,
which shall not be immediately preceded by a backslash character."
Although it is not mandatory for all files to have a final newline
fixed, a more consistent and homogeneous approach brings less of commit
differences issues and a better development experience in certain text
editors and IDEs.
[1] http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_206
[2] https://port70.net/~nsz/c/c89/c89-draft.html#2.1.1.2
[3] https://port70.net/~nsz/c/c99/n1256.html#5.1.1.2
This patch adds missing newlines, trims multiple redundant final
newlines into a single one, and trims redundant leading newlines.
According to POSIX, a line is a sequence of zero or more non-' <newline>'
characters plus a terminating '<newline>' character. [1] Files should
normally have at least one final newline character.
C89 [2] and later standards [3] mention a final newline:
"A source file that is not empty shall end in a new-line character,
which shall not be immediately preceded by a backslash character."
Although it is not mandatory for all files to have a final newline
fixed, a more consistent and homogeneous approach brings less of commit
differences issues and a better development experience in certain text
editors and IDEs.
[1] http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_206
[2] https://port70.net/~nsz/c/c89/c89-draft.html#2.1.1.2
[3] https://port70.net/~nsz/c/c99/n1256.html#5.1.1.2
Before the fix for bug 48289 has been applied, the algorithm to
construct a Q-encoded-word has been optimistic, i.e. try to encode as
many bytes that *may* fit in the remaining space, calculate the actual
length of the Q-encoded word, and if it's too long, try again with a
reduced size. However, the fix for the mentioned bug replaced this by
a pessimistic algorithm, which always terminates[1] the for loop[2]
during the first iteration (which renders the following 3 lines as dead
code), and as such easily produces unnecessarily short encoded-words.
Instead the proper fix for the bug would have been to make sure that
`out_size` is always decremented, if the space isn't sufficient for the
encoded-word.
[1] <https://github.com/php/php-src/blob/php-7.3.0beta3/ext/iconv/iconv.c#L1421>
[2] <https://github.com/php/php-src/blob/php-7.3.0beta3/ext/iconv/iconv.c#L1360>
Basically, the algorithm to append a converted string to an existing
`smart_str` works by increasing the `smart_str` buffer, to let `iconv`
convert characters until there is no more space, to set the new length
of the `smart_str` and to repeat until there is no more input.
Formerly, the new length calculation has been wrong, though, since we
would have to take the old `out_len` into account (`buf_growth -
old_out_len - out_len`). However, since there is no need to take the
old `out_len` into account when increasing the `smart_str` buffer, we
can simplify the fix, avoiding an additional variable.
We must not ignore erroneous characters in mime headers, but rather let
iconv_mime_decode() fail in this case, issuing the usual notice
regarding illegal characters.
We have to cater to the possibility that `=?` is not the start of an
encoded-word, but rather a literal `=?`. If a line break is found
while we're still looking for the charset, we can safely assume that
it's a literal `=?`, and act accordingly.
If we're expecting the start of an encoded word (`=?`), but instead of
the question mark get a line break (CR or LF), we must not append it to
the `pretval`.
This only involves switching zval_dtor to zval_ptr_dtor for arrays
and making the convert_to_object for arrays a bit more generic.
All the other changes outside zend_operators.c just make use of
this new ability (use COPY instead of DUP).
What's still missing: Proper references handling. I've seen many
convert_to* calls that will break when a reference is used.
Also fixes bug #69788.