Some INI processors allow to specify empty values by just giving the
key without the equals sign, for instance MySQL and Python. It appears
to be sensible to add this possibility to our INI parser, so that it
can be used for such INI files as well. We choose NULL as the value of
empty values.
This syntactical enhancement is a (minor) BC break, though, as can be
seen by the necessary change to bug49692.ini. The “comment” formerly
has been simply ignored, but now it would be parsed as key with an
empty value.
This PR is based on Adam's former patch.
According to the “Deprecate and remove INTL_IDNA_VARIANT_2003” RFC[1],
we change the default of the $variant parameter of `idn_to_ascii()` and
`idn_to_utf8()` from `INTL_IDNA_VARIANT_2003` to
`INTL_IDNA_VARIANT_UTS46`.
[1] <https://wiki.php.net/rfc/deprecate-and-remove-intl_idna_variant_2003>
Given that ICU is a set of lively developed libraries, that ICU 50.1
has been released on 2012-11-05, and PHP 7.4 is scheduled to be
released seven years after it, we consider it appropriate to ditch
these legacy versions.
Particularly, that would be a reasonable groundwork to implement part
two of the “Deprecate and remove INTL_IDNA_VARIANT_2003” RFC[1], namely
to default idn_to_ascii()'s and idn_to_utf8()'s $variant parameter to
INTL_IDNA_VARIANT_UTS46, which is not defined in ICU < 4.6.
See also the related discussion on internals@[2].
[1] <https://wiki.php.net/rfc/deprecate-and-remove-intl_idna_variant_2003>
[2] <http://news.php.net/php.internals/101626>ff
According to the POSIX specification of `getgrnam_r()` the result of
`sysconf(_SC_GETGR_R_SIZE_MAX)` is an initial value suggested for the
size of the buffer, and `ERANGE` signals that insufficient storage was
supplied. So if we get `ERANGE`, we try again with a buffer twice as
big, and so on, instead of failing.
Basically, the algorithm to append a converted string to an existing
`smart_str` works by increasing the `smart_str` buffer, to let `iconv`
convert characters until there is no more space, to set the new length
of the `smart_str` and to repeat until there is no more input.
Formerly, the new length calculation has been wrong, though, since we
would have to take the old `out_len` into account (`buf_growth -
old_out_len - out_len`). However, since there is no need to take the
old `out_len` into account when increasing the `smart_str` buffer, we
can simplify the fix, avoiding an additional variable.
We must not ignore erroneous characters in mime headers, but rather let
iconv_mime_decode() fail in this case, issuing the usual notice
regarding illegal characters.
We have to cater to the possibility that `=?` is not the start of an
encoded-word, but rather a literal `=?`. If a line break is found
while we're still looking for the charset, we can safely assume that
it's a literal `=?`, and act accordingly.
If we're expecting the start of an encoded word (`=?`), but instead of
the question mark get a line break (CR or LF), we must not append it to
the `pretval`.
The minimum length of an encoded-word is actually the pure encoding
overhead plus the length of the `output-charset` plus the minimum unit
of encoded text, which is 4 for B-encoding and (for simplicity) 3 for
Q-encoding. We also cater to the possibility that we need further
encoded words, which would be split by the `line-break-chars` followed
by a space character. Obviously, the former `out_charset_len + 12` is
too simplistic and wrong in the given case (where the magic number
would be 13).
These simplifications are somewhat wasteful, but iconv_mime_encode()
with Q-encoding is wasteful anyway (see bug 66828[1]), and the proper
solution to convert the whole input to the desired output charset
upfront, and applying the encoding afterwards appears too much a change
for the stable releases.
[1] <https://bugs.php.net/66828>
We work around this peculiarity of libxml by using xmlNodeSetContent(),
which does not exhibit this behavior. This also saves us from manually
calculating the string length.