1
0
mirror of https://github.com/php/php-src.git synced 2026-04-17 21:11:02 +02:00
Commit Graph

9 Commits

Author SHA1 Message Date
Peter Kokot
02294f0c84 Make PHP development tools files and scripts executable
This patch makes several scripts and PHP development tools files
executable and adds more proper shebangs to the PHP scripts.

The `#!/usr/bin/env php` shebang provides running the script via
`./script.php` and uses env to find PHP script location on the system.
At the same time it still provides running the script with a user
defined PHP location using `php script.php`.
2018-08-29 20:58:17 +02:00
Nikita Popov
f4a1d9c821 Fixed bug #65544 and #71298 2017-07-28 14:57:08 +02:00
Nikita Popov
582a65b06f Implement full case mapping
Implement full case mapping according to SpecialCasing.txt and
also full case folding according to CaseFolding.txt (F). There
are a number of caveats:

* Only language-agnostic and unconditional full case mapping
  is implemented. The only language-agnostic conditional case
  mapping rule relates to Greek sigma in final position
  (Final_Sigma). Correctly handling this requires both arbitrary
  lookahead and lookbehind, which would require some larger
  changes to how the case mapping is implemented. This is a
  possible future extension.
* The only language-specific handling that is implemented is
  for Turkish dotted/undotted Is, if the ISO-8859-9 encoding
  is used. This matches the previous behavior and makes sure
  that no codepoints not supported by the encoding are
  produced. A future extension would be to also handle the
  Turkish mappings specified by SpecialCasing.txt based on
  the mbfl internal language.
* Full case folding is implemented, but case-insensitive mb_*
  operations continue to use simple case folding. The reason is
  that full case folding of the haystack string may change the
  position at which a match occurred. This would have to be
  mapped back into the position in the original string.
* mb_convert_case() exposes both the full and the simple case
  mapping / folding, where full is the default. The constants
  are:

   * MB_CASE_LOWER (used by mb_strtolower)
   * MB_CASE_UPPER (used by mb_strtolower)
   * MB_CASE_TITLE
   * MB_CASE_FOLD
   * MB_CASE_LOWER_SIMPLE
   * MB_CASE_UPPER_SIMPLE
   * MB_CASE_TITLE_SIMPLE
   * MB_CASE_FOLD_SIMPLE (used by case-insensitive operations)
2017-07-28 12:32:50 +02:00
Nikita Popov
9ac7c1e71d Use case-folding for case insensitive comparisons
Instead of using lowercasing.
2017-07-28 12:32:50 +02:00
Nikita Popov
80a0601fe5 Use MPH for case maps
Instead of performing a binary search, use a hashtable to store
the case maps. In particular a minimal perfect hash construction
is used, which does not require collision resolution (but does
use an auxiliary table for the hash perturbation).
2017-07-28 12:32:50 +02:00
Nikita Popov
eacd70f762 Don't store titlecase if same as uppercase
The totitle code already has a fallback for that case.
2017-07-28 12:32:50 +02:00
Nikita Popov
cedfc2f426 Drop implementation-specific character properties
No point in keeping around non-standard character properties if
we're not using them and most are not even being populated.
2017-07-28 12:32:50 +02:00
Nikita Popov
8ace7045e9 Handle character ranges in ucgendat generically
In particular, the previous implementation did not account for
Tangut Ideographs and CJK Ideograph extensions C through F.
2017-07-25 18:48:12 +02:00
Nikita Popov
0c0e35fedc Port ucgendat to PHP
Implemented such that the output is identical, including some
quirks that should be fixed subsequently.
2017-07-25 18:48:12 +02:00