The `php_serialize` decode function has to return `FAILURE`, if the
unserialization failed on anything but an empty string.
The `php` decode function has also to return `FAILURE`, if there is
trailing garbage in the string.
If the current character is a line break character, it cannot be a tab
or space character, so we would always fail with an invalid sequence
error. Obviously, these `scan_stat == 4` conditions are meant to be
exclusive.
Furthermore, if `in_pp == NULL || in_left_p == NULL` is true, we hit a
segfault if we are not returning right away. Obviously, the additional
constraints don't make sense, so we remove them.
This is a private property, so we are allowed to add a type.
The new declaration of the property is:
private array $trace = [];
This ensures that Exception::getTrace() does indeed return an array.
Userland code that was modifying the property through refleciton
may have to be adjusted to assign an array (instead of null,
for example).
Closes GH-5636.
If * is used for width/precision in printf, then the width/precision
is provided by a printf argument instead of being part of the format
string. Semantics generally match those of printf in C.
This can be used to easily reproduce PHP's float printing behavior:
// Locale-sensitive using precision ini setting.
// Used prior to PHP 8.0.
sprintf("%.*G", (int) ini_get('precision'), $float);
// Locale-insensitive using precision ini setting.
// Used since to PHP 8.0.
sprintf("%.*H", (int) ini_get('precision'), $float);
// Locale-insensitive using serialize_precision ini setting.
// Used in serialize(), json_encode() etc.
sprintf("%.*H", (int) ini_get('serialize_precision'), $float);
Closes GH-5432.
From now on, we always display the given object's type instead of just reporting "object".
Additionally, make the format of return type errors match the format of argument errors.
Closes GH-5625
This time a number of comments have been added to make it easy for new devs to understand
what is going on. Also adjusted error message to use colons rather than dashes.
proc_open can accept stream resources in the descriptorspec, like this:
proc_open("command", array(0 => $resource), $pipes);
Previously, if a resource which was *not* of type "stream" was passed, proc_open would
return without freeing dynamically allocated memory. It's fixed now.
Back in 2004, a feature was added to proc_open which allowed it to open a PTY,
connecting specific FDs in the child process to the slave end of the PTY and returning
the master end of the PTY (wrapped as a PHP stream) in the `$pipes` array. However,
this feature was disabled just about a month later. Little information is available
about why this was done, but from talking to the original implementer, it seems there
were portability problems with some rare flavors of Unix.
Re-enable this feature with a simplified implementation which uses openpty(). No
attempt is made to support PTYs if the platform does not have openpty(). The configure
script checks if linking with -lutil is necessary to use openpty(), but if anything
else is required, like including some special header or linking with some other library,
PTY support will be disabled.
The original PTY support for proc_open automatically daemonized the child process
(disassociating it from the TTY session and process group of the parent). However,
I don't think this is a good idea. Just because a user opens a child process in a
PTY, it doesn't mean they want it to continue running even when the parent process
is killed. Of course, if the child process is some kind of server, it will likely
daemonize itself; but we have no reason to preempt that decision.
It turns out that since 2015, there has been one test case for PTY support in
proc_open() in the test suite. This test was added in GitHub PR #1588
(https://github.com/php/php-src/pull/1588). That PR mentioned that the PHP
binary in the Debian/Ubuntu repositories is patched to *enable* PTY support. Checking
the Debian PHP repository (https://salsa.debian.org/php-team/php.git) shows that this
is still true. Debian's patch does not modify the implementation from 2004 in any
way; it just removes the #if 0 line which disables it.
Naturally, the test case is skipped if PTY support is not enabled. This means that ever
since it was added, every test run against the 'vanilla' PHP codebase has skipped it.
Interestingly, the test case which was added in 2015 fails on my Linux Mint PC... both
with this simplified implementation *and* when enabling the original implementation.
Investigation reveals the reason: when the child process using the slave end of the
PTY exits and its FDs are all closed, and all buffered data is read from the master
end of the PTY, any further attempt to read from the master end fails with EIO. The
test case seems to expect that reading from the master end will always return an
empty string if no data is available.
Likely this is because PHP's fread() was updated to report errors from the underlying
system calls only recently.
One way out of this dilemma: IF at least one FD referring to the slave end of the PTY is
kept open *in the parent process*, the failure with EIO will not occur even after the child
process exits. However, that would raise another issue: we would need a way to ensure the FD
will be closed eventually in long-running programs.
Another discovery made while testing this code is that fread() does not always return
all the data written to the slave end of the PTY in a single call, even if the data was
written with a single syscall and it is only a few bytes long.
Specifically, when the child process in the test case writes "foo\n" to the PTY, the parent
sometimes receives "foo" (3 bytes) and sometimes "foo\r\n" (5 bytes). (The "\r" is from the
TTY line discipline converting "\n" to "\r\n".) A second call to fread() does return the
remaining bytes, though sometimes all the data is read in the first call, and by the time
the second call is made, the child process has already exited. It seems that liberal use
of the @ operator is needed when using fread() on pipes.
Thanks to Nikita Popov for suggesting that we should just use openpty() rather than
grantpt(), unlockpt(), etc.
array_column() reimplements array key handling in a way that does
not match standard array key behavior in PHP. Avoid this by making
use of the standard API.
Of course, there is a minor backwards compatibilty break here,
e.g. people could be relying on objects getting cast to string
instead of throwing.
Closes GH-5487.
This is a funny one. I discovered that lstat_stat_variation10.phpt was failing every
now and again when the PHP test suite was run on my dev PC. The output from the failing
test showed that the atime (access time) of the directory created in the test was changing
between these lines:
$old_stat = stat($dirname);
clearstatcache();
sleep(1);
var_dump( is_dir($dirname) );
$new_stat = stat($dirname);
Could is_dir() be accessing the directory and changing the atime? strace showed that is_dir
was only issuing a single stat() syscall. Could stat() change the atime? No, no, that would
just be perverse. Nobody would be stupid enough to implement the kernel in that way.
Checked the kernel source, found that the function called when atime needs to be updated
appears to be touch_atime(). Broke out the BCC kernel tracing tools and ran this one
while running the flaky test case in a loop:
sudo trace -I<kernel src dir>/include/linux/path.h -I<same>/include/linux/dcache.h 'touch_atime(struct path *path) "%s", path->dentry->d_name.name'
Inspecting the results showed that something called "git_thread" was occcasionally updating
the atime on the directory in question!! What on earth...???
The PID shown by trace revealed that this was a background thread for Sublime Text 3.
Sublime now has git integration and shows when there are untracked or modified files. It
seems that it uses a background thread to regularly scan the project directory and look
for new and modified files. This was causing the atime to change.
Even though other developers may not be running ST3, there are any number of reasons why
a background process might recurse through various directories and could cause the atime
to change unexpectedly. Therefore, update the test case so it doesn't fail in such cases.
Closes GH-5553.
Test 04 and 15 are the same as 02 and 03, just for different
encodings. They don't add value, but their execution depends
on available locales, so they're easy to miss...
htmlentities() has nothing to do with mbstring and should not
depend on its ini settings. It should only respect the global
default_charset and internal_encoding settings. This is exactly
why they were introduced...
In some places, we need to make sure that no warnings are thrown
due to unknown encoding. The error reporting code tried to avoid
this by determining a "safe charset", but this introduces subtle
discrepancies in which charset is picked (normally
internal_encoding takes precedence). Avoid this by suppressing
the warning in the first place.
While here, use the fallback logic to print error messages with
substitution characters more consistently, to avoid skipping
parts of the error message entirely.
It has been observed that in rare cases, this regression test has spurious failures in CI.
Try increasing the threshold for failure a bit and see if this makes it pass consistently.
Treatment of locales in PHP is currently inconsistent: The LC_ALL
locale is set to "C", as is standard behavior on program startup.
The LC_CTYPE locale is set to "", which will inherit it from the
environment. However, the inherited LC_CTYPE locale will only be
used in some cases, while in other cases it is necessary to perform
an explicit setlocale() call in PHP first. This is the case for
the locale-sensitive handling in the PCRE extension.
Make things consistent by *never* inheriting any locales from the
environment. LC_ALL, including LC_CTYPE will be "C" on startup.
A locale can be set or inherited through an explicit setlocale()
call, at which point the behavior will be fully consistent and
predictable.
Closes GH-5488.
Always simply ignore (pass through) them. Previously the behavior
depended on where the invalid character occurred, as it messed
up the state management.
Partially reverts 846b647953: instead of
throwing, this skips uninitialized typed properties when serializing objects.
This makes serialize with __sleep() behave the same as serialize()
without __sleep().
As in the non-__sleep() case, unserialize(serialize($x)) identity
may not be preserved due to replacement of uninitialized/unset
properties with default values. Fixing this will require changes to
the serialization format.
Closes GH-5396.