Introduce the TAILCALL VM, a more efficient variant of the CALL VM:
* Each opcode handler tailcalls the next opcode handler directly instead of
returning to the interpreter loop. This eliminates call and interpreter loop
overhead.
* Opcode handlers use the preserve_none calling convention to eliminate
register saving overhead.
* preserve_none uses non-volatile registers for its first arguments, so
execute_data and opline are usually kept in these registers and no code is
required to forward them to the next handlers.
Generated machine code is similar to a direct-threaded VM with register pinning,
like the HYBRID VM.
JIT+TAILCALL VM also benefits from this compared to JIT+CALL VM:
* JIT uses the registers of the execute_data and opline args as fixed regs,
eliminating the need to move them in prologue.
* Traces exit by tailcalling the next handler. No code is needed to forward
execute_data and opline.
* No register saving/restoring in epilogue/prologue.
The TAILCALL VM is used when the HYBRID VM is not supported, and the compiler
supports the musttail and preserve_none attributes: The HYBRID VM is used when
compiling with GCC, the TAILCALL VM when compiling with Clang>=19 on x86_64 or
aarch64, and the CALL VM otherwise.
This makes binaries built with Clang>=19 as fast as binaries built with GCC.
Before, these were considerably slower (by 2.8% to 44% depending on benchmark,
and by 5% to 77% before 76d7c616bb).
Closes GH-17849
Closes GH-18720
JIT doesn't recognize that variables may be used after returning from a
trace due to YIELD, so some effects may never be stored to memory.
YIELD ops terminate trace recordings with ZEND_JIT_TRACE_STOP_RETURN, and are
handled mostly like RETURN. Here I change zend_jit_trace_execute() so that
YIELD terminates recordings with ZEND_JIT_TRACE_STOP_INTERPRETER instead,
to ensure that we recognize that variables may be used after returning from
the trace due to YIELD.
Fixes GH-19493
Closes GH-19515
Currently, configure fails when no SHM backend is available. Additionally,
even after bypassing the configure check, opcache emits a fatal error if no
SHM backend is available.
Make the configure check non-fatal (a warning is printed). At runtime, disable
opcache if no backend is available, in the same way we disable opcache by
default on CLI.
Closes GH-19350
Add a C function, opcache_preloading(), that returns true during
preloading. Extensions can use this to detect preloading, not only during
compilation/execution, but also in RINIT()/RSHUTDOWN().
Since opcache currently doesn't install any header, I'm adding a new one:
zend_accelerator_api.h. Header name is based on other files in ext/opcache.
Closes GH-19288
JIT used to not have a compile-time dependency on VM kind, such that a single
build of opcache could work with different VM kinds at runtime. This has been broken
over time and would be difficult to restore. Additionally, as opcache is now
built-in, this would not be useful anymore.
Remove the zend_jit_vm_kind variable.
Reuse the helper zend_foreach_op_array() that we move to the
zend_optimizer.h header to be usable in opcache.
Note that applying this to other op_array loops is not easy because they either:
- start from EG(persistent_classes_count)
- or only apply to classes
In the absence of `PHP_ARG_WITH([opcache],` the value of ext_shared is not
initialized while processing directives of ext/opcache/config.m4, causing
PHP_EVAL_LIBLINE() to add libs to OPCACHE_SHARED_LIBADD instead of LIBS.
Closes GH-19301
Introduced by GH-18541.
This path is hit when compilation fails with a compile error, rather than
bailout. If a non-fatal error is recorded, it will not be emitted nor freed.
Handle accordingly.
When opcache is enabled, error handling is altered in the following ways:
* Errors emitted during compilation bypass the user-defined error handler
* Exceptions emitted during class linking are turned into fatal errors
Changes here make the behavior consistent regardless of opcache being enabled or
not:
* Errors emitted during compilation and class linking are always delayed and
handled after compilation or class linking. During handling, user-defined
error handlers are not bypassed. Fatal errors emitted during compilation or
class linking cause any delayed errors to be handled immediately (without
calling user-defined error handlers, as it would be unsafe).
* Exceptions thrown by user-defined error handlers when handling class linking
error are not promoted to fatal errors anymore and do not prevent linking.
Fixes GH-17422.
Closes GH-18541.
Closes GH-17627.
Co-authored-by: Tim Düsterhus <tim@bastelstu.be>
This removes the --enable-opcache/--disable-opcache configure switch. OPcache
is now always builtin. The default value of opcache.enable and
opcache.enable_cli is unchanged.
RFC: https://wiki.php.net/rfc/make_opcache_required
Closes GH-18961.
Co-authored-by: Tim Düsterhus <tim@tideways-gmbh.com>
We use linker relocations to fetch the TLS index and offset of _tsrm_ls_cache.
When building Opcache statically, linkers may attempt to optimize that into a
more efficient code sequence (relaxing from "General Dynamic" to "Local Exec"
model [1]). Unfortunately, linkers will fail, rather than ignore our
relocations, when they don't recognize the exact code sequence they are
expecting.
This results in errors as reported by GH-15074:
TLS transition from R_X86_64_TLSGD to R_X86_64_GOTTPOFF against
`_tsrm_ls_cache' at 0x12fc3 in section `.text' failed"
Here I take a different approach:
* Emit the exact full code sequence expected by linkers
* Extract the TLS index/offset by inspecting the linked ASM code, rather than
executing it (execution would give us the thread-local address).
* We detect when the code was relaxed, in which case we can extract the TCB
offset instead.
* This is done in a conservative way so that if the linker did something we
didn't expect, we fallback to a safer (but slower) mechanism.
One additional benefit of that is we are now able to use the Local Exec model in
more cases, in JIT'ed code. This makes non-glibc builds faster in these cases.
Closes GH-18939.
Related RFC: https://wiki.php.net/rfc/make_opcache_required.
[1] https://www.akkadia.org/drepper/tls.pdf
This reduces the chances of confusion between opcode handlers used by the
VM, and opcode handler functions used for tracing or debugging. Depending
on the VM, zend_vm_opcode_handler_t may not be a function. For instance in
the HYBRID VM this is a label pointer.
Closes GH-19006
* opcache: Reset `accel_startup_ok` after shutting down
This is necessary for phpdbg, which runs multiple startup/shutdown cycles in
the same process.
* opcache: Disallow changing `opcache.memory_consumption` when SHM is set up
Normally changing the INI value is not possible after SHM is set up, since it
is `PHP_INI_SYSTEM`. FPM is a notable exception: SHM is set up in the master
process, but when spawning the individual pools, the `php_admin_value` config
option can be used to change `PHP_INI_SYSTEM` INIs on a per-pool basis. This
does not work for this option, since it will only be read on early start,
leading to misleading PHPInfo output, since the INI value appears to be
successfully set and since some of the calculated values are derived from the
INI value rather than the actual value.
Apple Silicon has stricter rules about rwx mmap regions. They need to be created
using the MAP_JIT flag. However, the MAP_JIT seems to be incompatible with
MAP_SHARED. ZTS requires MAP_SHARED so that some threads may execute code from a
page while another writes/appends to it. We did not find another solution, other
than completely disabling JIT for Apple Silicon + ZTS.
See discussion in https://github.com/php/php-src/pull/13351.
Co-authored-by: Peter Kokot <peterkokot@gmail.com>
Fixes GH-13400
Closes GH-13396