zend_runtime_jit() prevents concurrent compilation with
zend_shared_alloc_lock(), but this doesn't prevent blocked threads from
trying to compile the function again after they acquire the lock.
In the case of GH-19889, one of the function entries is compiled with
zend_jit_handler(), which fails when the op handler has already been replaced by
a JIT'ed handler.
Fix by marking compiled functions with a new flag ZEND_FUNC_JITED, and
skipping compilation of marked functions. The same fix is applied to
zend_jit_hot_func().
Fixes GH-19889
Closes GH-19971
This adds a new flag: ZEND_JIT_DEBUG_TRACE_EXIT_INFO_SRC. When the flag is set,
zend_jit_dump_exit_info() exposes the source of exit points, in debug builds.
Closes GH-19700
Introduce the TAILCALL VM, a more efficient variant of the CALL VM:
* Each opcode handler tailcalls the next opcode handler directly instead of
returning to the interpreter loop. This eliminates call and interpreter loop
overhead.
* Opcode handlers use the preserve_none calling convention to eliminate
register saving overhead.
* preserve_none uses non-volatile registers for its first arguments, so
execute_data and opline are usually kept in these registers and no code is
required to forward them to the next handlers.
Generated machine code is similar to a direct-threaded VM with register pinning,
like the HYBRID VM.
JIT+TAILCALL VM also benefits from this compared to JIT+CALL VM:
* JIT uses the registers of the execute_data and opline args as fixed regs,
eliminating the need to move them in prologue.
* Traces exit by tailcalling the next handler. No code is needed to forward
execute_data and opline.
* No register saving/restoring in epilogue/prologue.
The TAILCALL VM is used when the HYBRID VM is not supported, and the compiler
supports the musttail and preserve_none attributes: The HYBRID VM is used when
compiling with GCC, the TAILCALL VM when compiling with Clang>=19 on x86_64 or
aarch64, and the CALL VM otherwise.
This makes binaries built with Clang>=19 as fast as binaries built with GCC.
Before, these were considerably slower (by 2.8% to 44% depending on benchmark,
and by 5% to 77% before 76d7c616bb).
Closes GH-17849
Closes GH-18720
JIT used to not have a compile-time dependency on VM kind, such that a single
build of opcache could work with different VM kinds at runtime. This has been broken
over time and would be difficult to restore. Additionally, as opcache is now
built-in, this would not be useful anymore.
Remove the zend_jit_vm_kind variable.
Reuse the helper zend_foreach_op_array() that we move to the
zend_optimizer.h header to be usable in opcache.
Note that applying this to other op_array loops is not easy because they either:
- start from EG(persistent_classes_count)
- or only apply to classes
This reduces the chances of confusion between opcode handlers used by the
VM, and opcode handler functions used for tracing or debugging. Depending
on the VM, zend_vm_opcode_handler_t may not be a function. For instance in
the HYBRID VM this is a label pointer.
Closes GH-19006
Property hooks were not handled for JIT+trait+preloading.
Split the existing functions that handle op arrays, and add iterations
for property hooks.
Closes GH-18923.
ZEND_FUNC_INFO() can not be used on internal CE's. If preloading makes a
CE that's an alias of an internal class, the invalid access happens when
setting the FUNC_INFO.
While we could check the class type to be of user code, we can just skip
aliases altogether anyway which may be faster.
Closes GH-18915.
This changes the signature of opcode handlers in the CALL VM so that the opline
is passed directly via arguments. This reduces the number of memory operations
on EX(opline), and makes the CALL VM considerably faster.
Additionally, this unifies the CALL and HYBRID VMs a bit, as EX(opline) is now
handled in the same way in both VMs.
This is a part of GH-17849.
Currently we have two VMs:
* HYBRID: Used when compiling with GCC. execute_data and opline are global
register variables
* CALL: Used when compiling with something else. execute_data is passed as
opcode handler arg, but opline is passed via execute_data->opline
(EX(opline)).
The Call VM looks like this:
while (1) {
ret = execute_data->opline->handler(execute_data);
if (UNEXPECTED(ret != 0)) {
if (ret > 0) { // returned by ZEND_VM_ENTER() / ZEND_VM_LEAVE()
execute_data = EG(current_execute_data);
} else { // returned by ZEND_VM_RETURN()
return;
}
}
}
// example op handler
int ZEND_INIT_FCALL_SPEC_CONST_HANDLER(zend_execute_data *execute_data) {
// load opline
const zend_op *opline = execute_data->opline;
// instruction execution
// dispatch
// ZEND_VM_NEXT_OPCODE():
execute_data->opline++;
return 0; // ZEND_VM_CONTINUE()
}
Opcode handlers return a positive value to signal that the loop must load a
new execute_data from EG(current_execute_data), typically when entering
or leaving a function.
Here I make the following changes:
* Pass opline as opcode handler argument
* Return next opline from opcode handlers
* ZEND_VM_ENTER / ZEND_VM_LEAVE return opline|(1<<0) to signal that
execute_data must be reloaded from EG(current_execute_data)
This gives us:
while (1) {
opline = opline->handler(execute_data, opline);
if (UNEXPECTED((uintptr_t) opline & ZEND_VM_ENTER_BIT) {
opline = opline & ~ZEND_VM_ENTER_BIT;
if (opline != 0) { // ZEND_VM_ENTER() / ZEND_VM_LEAVE()
execute_data = EG(current_execute_data);
} else { // ZEND_VM_RETURN()
return;
}
}
}
// example op handler
const zend_op * ZEND_INIT_FCALL_SPEC_CONST_HANDLER(zend_execute_data *execute_data, const zend_op *opline) {
// opline already loaded
// instruction execution
// dispatch
// ZEND_VM_NEXT_OPCODE():
return ++opline;
}
bench.php is 23% faster on Linux / x86_64, 18% faster on MacOS / M1.
Symfony Demo is 2.8% faster.
When using the HYBRID VM, JIT'ed code stores execute_data/opline in two fixed
callee-saved registers and rarely touches EX(opline), just like the VM.
Since the registers are callee-saved, the JIT'ed code doesn't have to
save them before calling other functions, and can assume they always
contain execute_data/opline. The code also avoids saving/restoring them in
prologue/epilogue, as execute_ex takes care of that (JIT'ed code is called
exclusively from there).
The CALL VM can now use a fixed register for execute_data/opline as well, but
we can't rely on execute_ex to save the registers for us as it may use these
registers itself. So we have to save/restore the two registers in JIT'ed code
prologue/epilogue.
Closes GH-17952
The FETCH_OBJ_R VM handler has an optimization that directly enters into
a hook if it is a simpler getter hook. This is not compatible with the
minimal JIT because the minimal JIT will try to continue executing the
opcodes after the FETCH_OBJ_R.
To solve this, we check whether the opcode is still the expected one
after the execution of the VM handler. If it is not, we know that we are
going to execute a simple hook. In that case, exit to the VM.
Closes GH-17909.
The code to update the call_level in that case skips the opline itself,
as that's handled by the tail handler, and then wants to set the opline
to the last opline of the block because the code below the switch will
update the call_level for that opline.
However, the test has a block with a single opline (THROW). The block
after that has ZEND_INIT_FCALL, because `i` points to ZEND_INIT_FCALL
now, it erroneously causes the call_level after the switch.
Closes GH-17438.
* PHP-8.4:
Fix GH-17307: Internal closure causes JIT failure
Generate inline frameless icall handlers only if the optimization level is set to inline
Fix GH-15981: Segfault with frameless jumps and minimal JIT
Fix GH-15833: Segmentation fault (access null pointer) in ext/spl/spl_array.c
Minimal JIT shouldn't generate a call to the complex handler, but
instead rely on the VM and then check for a two-way jump.
This moves the frameless codegen under the check `JIT_G(opt_level) >=
ZEND_JIT_LEVEL_INLINE`.
This is a quick fix for the problem.
It'll work while all the JIT-ed functions have the same "fixed stack frame".
Unwinder uses hard-coded unwind data for this "fixed stack frame".
* Preallocate space for Win64 shadow args
* typo
* Setup unwinder for JIT functions
* Revert "Dynamically xfail test case which fails on CI"
This reverts commit 7cc327fd5a.
* Revert "Dynamically xfail test case which fails on CI"
This reverts commit bdde797159.
* Revert "Dynamically xfail test cases which fail on CI (GH-15710)"
This reverts commit 6d5962074f.
* Remove XFAIL sections
* Add hard-coded SEH unwind data for EXITCALL
* Fix unwind data
* Fix Windows multi-process support
* Typo
We intend to execute `MATCH_ERROR` in the VM and return to trace a hot
function in BB1. We generate a tail handler and skip all remaining
oplines of BB0. That means the `INIT_FCALL` in BB0 is missed and
`call_level` is not increased to 1. This leads to the assertion
failure.
This patch fixes the issue by updating the `call_level` for the skipped
oplines.
Closes GH-16939.
There doesn't seem to be a thread post-startup hook that runs after
zend_startup_cb() that could be used for this
this fix is similar to accel_startup_ok() as seen here: fc1db70f10/ext/opcache/ZendAccelerator.c (L2631-L2634)
Closes GH-16853.