`tolower()` returns an `int`, so we must not convert to `char` which
may be `signed` and as such may be subject to overflow (actually,
implementation defined behavior).
Closes GH-6007
When normalizing tags to check whether they are contained in the set
of allowable tags, we must not strip slashes, unless they come
immediately after the opening `<`, or immediately before the closing
`>`.
When the strip tags state machine has been flattened, an if statement
has mistakenly been treated as else if. We fix this, and also simplify
a bit right away.
On some recent Windows systems, ext\pcre\tests\locales.phpt fails,
because 'pt_PT' is accepted by `setlocale()`, but not properly
supported by the ctype functions, which are used internally by PCRE2 to
build the localized character tables.
Since there appears to be no way to properly check whether a given
locale is fully supported, but we want to minimize BC impact, we filter
out typical Unix locale names, except for a few cases which have
already been properly supported on Windows. This way code like
setlocale(LC_ALL, 'de_DE.UTF-8', 'de_DE', 'German_Germany.1252');
should work like on older Windows systems.
It should be noted that the locale names causing trouble are not (yet)
documented as valid names anyway, see
<https://docs.microsoft.com/en-us/cpp/c-runtime-library/locale-names-languages-and-country-region-strings?view=vs-2019>.
The $Id$ keywords were used in Subversion where they can be substituted
with filename, last revision number change, last changed date, and last
user who changed it.
In Git this functionality is different and can be done with Git attribute
ident. These need to be defined manually for each file in the
.gitattributes file and are afterwards replaced with 40-character
hexadecimal blob object name which is based only on the particular file
contents.
This patch simplifies handling of $Id$ keywords by removing them since
they are not used anymore.
If the longest common substring is the leftmost common substring, there
is no need to check the string prefixes for further common substrings,
since there are none.