This changes the signature of opcode handlers in the CALL VM so that the opline
is passed directly via arguments. This reduces the number of memory operations
on EX(opline), and makes the CALL VM considerably faster.
Additionally, this unifies the CALL and HYBRID VMs a bit, as EX(opline) is now
handled in the same way in both VMs.
This is a part of GH-17849.
Currently we have two VMs:
* HYBRID: Used when compiling with GCC. execute_data and opline are global
register variables
* CALL: Used when compiling with something else. execute_data is passed as
opcode handler arg, but opline is passed via execute_data->opline
(EX(opline)).
The Call VM looks like this:
while (1) {
ret = execute_data->opline->handler(execute_data);
if (UNEXPECTED(ret != 0)) {
if (ret > 0) { // returned by ZEND_VM_ENTER() / ZEND_VM_LEAVE()
execute_data = EG(current_execute_data);
} else { // returned by ZEND_VM_RETURN()
return;
}
}
}
// example op handler
int ZEND_INIT_FCALL_SPEC_CONST_HANDLER(zend_execute_data *execute_data) {
// load opline
const zend_op *opline = execute_data->opline;
// instruction execution
// dispatch
// ZEND_VM_NEXT_OPCODE():
execute_data->opline++;
return 0; // ZEND_VM_CONTINUE()
}
Opcode handlers return a positive value to signal that the loop must load a
new execute_data from EG(current_execute_data), typically when entering
or leaving a function.
Here I make the following changes:
* Pass opline as opcode handler argument
* Return next opline from opcode handlers
* ZEND_VM_ENTER / ZEND_VM_LEAVE return opline|(1<<0) to signal that
execute_data must be reloaded from EG(current_execute_data)
This gives us:
while (1) {
opline = opline->handler(execute_data, opline);
if (UNEXPECTED((uintptr_t) opline & ZEND_VM_ENTER_BIT) {
opline = opline & ~ZEND_VM_ENTER_BIT;
if (opline != 0) { // ZEND_VM_ENTER() / ZEND_VM_LEAVE()
execute_data = EG(current_execute_data);
} else { // ZEND_VM_RETURN()
return;
}
}
}
// example op handler
const zend_op * ZEND_INIT_FCALL_SPEC_CONST_HANDLER(zend_execute_data *execute_data, const zend_op *opline) {
// opline already loaded
// instruction execution
// dispatch
// ZEND_VM_NEXT_OPCODE():
return ++opline;
}
bench.php is 23% faster on Linux / x86_64, 18% faster on MacOS / M1.
Symfony Demo is 2.8% faster.
When using the HYBRID VM, JIT'ed code stores execute_data/opline in two fixed
callee-saved registers and rarely touches EX(opline), just like the VM.
Since the registers are callee-saved, the JIT'ed code doesn't have to
save them before calling other functions, and can assume they always
contain execute_data/opline. The code also avoids saving/restoring them in
prologue/epilogue, as execute_ex takes care of that (JIT'ed code is called
exclusively from there).
The CALL VM can now use a fixed register for execute_data/opline as well, but
we can't rely on execute_ex to save the registers for us as it may use these
registers itself. So we have to save/restore the two registers in JIT'ed code
prologue/epilogue.
Closes GH-17952
This changes make FPM always decode SCRIPT_FILENAME when Apache
ProxyPass or ProxyPassMatch is used. It also introduces a new INI
option fastcgi.script_path_encoded that allows using the previous
behavior of not decoding the path. The INI is introduced because
there is a chance that some users could use encoded file paths in
their file system as a workaround for the previous behavior.
Close GH-17896
This fixes a ZEND_RC_MOD_CHECK() assertion failure when building with
"-DZEND_RC_DEBUG=1 --enable-debug --enable-zts". php_dl() is called after
startup, and manipulates the refcount of persistent strings, which is not
allowed at this point of the lifecycle.
The dl() function disables the ZEND_RC_MOD_CHECK() assertion before calling
php_dl(). This change applies the same workaround in FPM.
Closes GH-18075
This functionality didn't actually work.
This was discussed on the mailing list [1] and no one objected.
[1] https://externals.io/message/126368
Closes GH-17883.
On Windows, the cli and phpdbg SAPIs have variants (cli-win32 and
phpdbgs, respectively) which are build by default. However, the
variants share some files, what leads to duplicate build rules in the
generated Makefile. NMake throws warning U4004[1], but proceeds
happily, ignoring the second build rule. That means that different
flags for duplicate rules are ignored, hinting at a potential problem.
We solve this by introducing an additional (optional) argument to
`SAPI()` and `ADD_SOURCES()` which can be used to avoid such duplicate
build rules. It's left to the SAPI maintainers to make sure that
appropriate rules are created. We fix this for phpdbgs right away,
which currently couldn't be build without phpdbg due to the missing
define; we remove the unused `PHP_PHPDBG_EXPORTS` flag altogether.
[1] <https://learn.microsoft.com/en-us/cpp/error-messages/tool-errors/nmake-warning-u4004>
Closes GH-17545.
The phpdbg issue is a real issue, although it's unlikely that harm can
be done due to stack alignment and little-endianess. The others seem
to be more cosmetic.
Internal function won't need their refcount increased as they outlive
the debugger session, and userland functions won't be unloaded either.
So no refcount management is necessary for registered functions.
Internal function won't need their refcount increased as they outlive
the debugger session, and userland functions won't be unloaded either.
So no refcount management is necessary for registered functions.
The reason this breaks is because of a type mismatch.
The following line uses fields of the timeval struct which are both 8 bytes on
Alpine 32-bit, which results in a computed value of also 8 bytes:
https://github.com/php/php-src/blob/b09ed9a0f25cda8c9eea9d140c01587cd50b4aa8/sapi/fpm/fpm/fpm_status.c#L611
However, it is passed to a format string which expects 4 bytes
(`unsigned long` and thus the `%lu` format specifier is 4 bytes on Alpine 32-bit),
resulting in argument corruption.
Since the value is generally small, truncating to 4 bytes is sufficient to fix this.
Closes GH-17286.