This changes the signature of opcode handlers in the CALL VM so that the opline
is passed directly via arguments. This reduces the number of memory operations
on EX(opline), and makes the CALL VM considerably faster.
Additionally, this unifies the CALL and HYBRID VMs a bit, as EX(opline) is now
handled in the same way in both VMs.
This is a part of GH-17849.
Currently we have two VMs:
* HYBRID: Used when compiling with GCC. execute_data and opline are global
register variables.
* CALL: Used when compiling with something else. execute_data is passed as
opcode handler arg, but opline is passed via execute_data->opline
(EX(opline)).
The CALL VM looks like this:
while (1) {
ret = execute_data->opline->handler(execute_data);
if (UNEXPECTED(ret != 0)) {
if (ret > 0) { // returned by ZEND_VM_ENTER() / ZEND_VM_LEAVE()
execute_data = EG(current_execute_data);
} else { // returned by ZEND_VM_RETURN()
return;
}
}
}
// example op handler
int ZEND_INIT_FCALL_SPEC_CONST_HANDLER(zend_execute_data *execute_data) {
// load opline
const zend_op *opline = execute_data->opline;
// instruction execution
// dispatch
// ZEND_VM_NEXT_OPCODE():
execute_data->opline++;
return 0; // ZEND_VM_CONTINUE()
}
Opcode handlers return a positive value to signal that the loop must load a
new execute_data from EG(current_execute_data), typically when entering
or leaving a function.
Here I make the following changes:
* Pass opline as opcode handler argument
* Return next opline from opcode handlers
* ZEND_VM_ENTER / ZEND_VM_LEAVE return opline|(1<<0) to signal that
execute_data must be reloaded from EG(current_execute_data)
This gives us:
while (1) {
opline = opline->handler(execute_data, opline);
if (UNEXPECTED((uintptr_t) opline & ZEND_VM_ENTER_BIT)) {
opline = (const zend_op *) ((uintptr_t) opline & ~ZEND_VM_ENTER_BIT);
if (opline != 0) { // ZEND_VM_ENTER() / ZEND_VM_LEAVE()
execute_data = EG(current_execute_data);
} else { // ZEND_VM_RETURN()
return;
}
}
}
// example op handler
const zend_op * ZEND_INIT_FCALL_SPEC_CONST_HANDLER(zend_execute_data *execute_data, const zend_op *opline) {
// opline already loaded
// instruction execution
// dispatch
// ZEND_VM_NEXT_OPCODE():
return ++opline;
}
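The tagging scheme above can be sketched as a self-contained toy program. All names here are hypothetical: toy_op, toy_frame, TOY_ENTER_BIT and toy_current_frame stand in for zend_op, zend_execute_data, ZEND_VM_ENTER_BIT and EG(current_execute_data). This is a minimal sketch of the technique, not the engine's code, and it assumes opline pointers are at least 2-byte aligned so that the low bit is free.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for zend_op / zend_execute_data; not the real engine types. */
typedef struct toy_op toy_op;
typedef struct toy_frame toy_frame;
typedef const toy_op *(*toy_handler)(toy_frame *frame, const toy_op *opline);

struct toy_op { toy_handler handler; int operand; };
struct toy_frame { int acc; };

#define TOY_ENTER_BIT ((uintptr_t)1) /* plays the role of ZEND_VM_ENTER_BIT */

static toy_frame *toy_current_frame; /* plays the role of EG(current_execute_data) */

/* Ordinary handler: do the work, then return the next opline untagged. */
static const toy_op *toy_add(toy_frame *frame, const toy_op *opline) {
    frame->acc += opline->operand;
    return opline + 1;
}

/* Frame switch: publish a new frame and tag the next opline so the loop
 * knows it has to reload the frame pointer. */
static const toy_op *toy_enter(toy_frame *frame, const toy_op *opline) {
    static toy_frame callee;
    callee.acc = frame->acc * 10;
    toy_current_frame = &callee;
    assert(((uintptr_t)(opline + 1) & TOY_ENTER_BIT) == 0); /* low bit is free */
    return (const toy_op *)((uintptr_t)(opline + 1) | TOY_ENTER_BIT);
}

/* Leave the loop: a tagged NULL, so only the low bit is set. */
static const toy_op *toy_return(toy_frame *frame, const toy_op *opline) {
    (void)frame; (void)opline;
    return (const toy_op *)TOY_ENTER_BIT;
}

static int toy_execute(toy_frame *frame, const toy_op *opline) {
    toy_current_frame = frame;
    for (;;) {
        opline = opline->handler(frame, opline);
        if ((uintptr_t)opline & TOY_ENTER_BIT) {
            opline = (const toy_op *)((uintptr_t)opline & ~TOY_ENTER_BIT);
            if (opline == NULL) {
                return toy_current_frame->acc; /* toy_return() */
            }
            frame = toy_current_frame;         /* toy_enter() */
        }
    }
}

/* Runs: add 2, switch frame (acc * 10), add 3, return. */
static int toy_demo(void) {
    static const toy_op program[] = {
        { toy_add, 2 }, { toy_enter, 0 }, { toy_add, 3 }, { toy_return, 0 },
    };
    toy_frame root = { 0 };
    return toy_execute(&root, program);
}
```

In the common case (toy_add) the loop only tests one bit of the returned pointer and never touches memory in the frame to find the next opline.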
bench.php is 23% faster on Linux / x86_64, 18% faster on macOS / M1.
Symfony Demo is 2.8% faster.
When using the HYBRID VM, JIT'ed code stores execute_data/opline in two fixed
callee-saved registers and rarely touches EX(opline), just like the VM.
Since the registers are callee-saved, the JIT'ed code doesn't have to
save them before calling other functions, and can assume they always
contain execute_data/opline. The code also avoids saving/restoring them in
prologue/epilogue, as execute_ex takes care of that (JIT'ed code is called
exclusively from there).
The CALL VM can now use a fixed register for execute_data/opline as well, but
we can't rely on execute_ex to save the registers for us as it may use these
registers itself. So we have to save/restore the two registers in JIT'ed code
prologue/epilogue.
Closes GH-17952
Zend Engine
Zend memory manager
General
The goal of the new memory manager (available since PHP 5.2) is to reduce memory allocation overhead and speed up memory management.
Debugging
Normal:
sapi/cli/php -r 'leak();'
Zend MM disabled:
USE_ZEND_ALLOC=0 valgrind --leak-check=full sapi/cli/php -r 'leak();'
Shared extensions
Since PHP 5.3.11 it is possible to prevent shared extensions from unloading so
that valgrind can correctly track the memory leaks in shared extensions. For
this there is the ZEND_DONT_UNLOAD_MODULES environment variable. If set, then
DL_UNLOAD() is skipped during the shutdown of shared extensions.
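The check described above might look roughly like the following sketch. toy_skip_unload() and toy_module_shutdown() are hypothetical names, not engine functions, and the sketch assumes that any value of the variable, even an empty string, counts as "set".

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: decides from the variable's value whether to skip
 * unloading. Assumption: any value, even an empty string, counts as set. */
static int toy_skip_unload(const char *env_value) {
    return env_value != NULL;
}

/* Hypothetical shutdown step for one shared extension. Returns 1 if the
 * library would be unloaded, 0 if it stays mapped so valgrind can still
 * resolve symbols in its leak report. */
static int toy_module_shutdown(void *handle) {
    if (toy_skip_unload(getenv("ZEND_DONT_UNLOAD_MODULES"))) {
        return 0;
    }
    (void)handle; /* a real build would call DL_UNLOAD(handle) here */
    return 1;
}
```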
ZEND_VM
ZEND_VM architecture allows specializing opcode handlers according to
op_type fields and using different execution methods (call threading, switch
threading and direct threading). As a result ZE2 got more than 20% speedup on
raw PHP code execution (with specialized executor and direct threading execution
method). As in most PHP applications raw execution speed isn't the limiting
factor but system calls and database calls are, your mileage with this patch
will vary.
Most parts of the old zend_execute.c go into zend_vm_def.h. Here you can find
opcode handlers and helpers. The typical opcode handler template looks like
this:
ZEND_VM_HANDLER(<OPCODE-NUMBER>, <OPCODE>, <OP1_TYPES>, <OP2_TYPES>)
{
<HANDLER'S CODE>
}
<OPCODE-NUMBER> is an opcode number (0, 1, ...)
<OPCODE> is an opcode name (ZEND_NOP, ZEND_ADD, ...)
<OP1_TYPES> and <OP2_TYPES> are masks for allowed operand op_types.
The specializer generates code only for the defined combinations of types. You
can use any combination of the following op_types: UNUSED, CONST, VAR, TMP and
CV. You can also use the ANY mask to disable specialization according to the
operand's op_type.
<HANDLER'S CODE> is the handler's code itself. For most handlers it is still the
same as in the old zend_execute.c, but now it uses macros to access opcode
operands and some internal executor data.
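To illustrate what specialization by op_type means, here is a toy sketch in plain C. It is not output of zend_vm_gen.php: TOY_ADD_HANDLER, the toy_* types and the two operand kinds are all made up for this example. One handler template is expanded once per operand-type combination, so each generated function fetches its operands without any run-time check of the operand type.

```c
#include <assert.h>

/* Toy operand kinds; the real engine has more (UNUSED, VAR, TMP, ...). */
enum { TOY_CONST = 0, TOY_CV = 1, TOY_KINDS = 2 };

typedef struct { int op1, op2; } toy_op;               /* operand slots */
typedef struct { int consts[4]; int cvs[4]; } toy_ctx; /* operand storage */
typedef int (*toy_handler)(const toy_ctx *ctx, const toy_op *op);

/* Operand fetchers; the choice is fixed when the handler is generated. */
#define TOY_FETCH_CONST(ctx, slot) ((ctx)->consts[(slot)])
#define TOY_FETCH_CV(ctx, slot)    ((ctx)->cvs[(slot)])

/* Handler template: one function per (op1 kind, op2 kind) combination. */
#define TOY_ADD_HANDLER(K1, K2)                                             \
    static int toy_add_##K1##_##K2(const toy_ctx *ctx, const toy_op *op) {  \
        return TOY_FETCH_##K1(ctx, op->op1) + TOY_FETCH_##K2(ctx, op->op2); \
    }

TOY_ADD_HANDLER(CONST, CONST)
TOY_ADD_HANDLER(CONST, CV)
TOY_ADD_HANDLER(CV, CONST)
TOY_ADD_HANDLER(CV, CV)

/* The specialized handler is picked once, by operand types. */
static const toy_handler toy_add_handlers[TOY_KINDS][TOY_KINDS] = {
    [TOY_CONST] = { [TOY_CONST] = toy_add_CONST_CONST, [TOY_CV] = toy_add_CONST_CV },
    [TOY_CV]    = { [TOY_CONST] = toy_add_CV_CONST,    [TOY_CV] = toy_add_CV_CV },
};

static int toy_run_add(const toy_ctx *ctx, int op1_type, int op2_type,
                       const toy_op *op) {
    return toy_add_handlers[op1_type][op2_type](ctx, op);
}

/* Small self-check: consts[1] + cvs[2] with the CONST/CV handler. */
static void toy_demo(void) {
    toy_ctx ctx = { {10, 20, 30, 40}, {1, 2, 3, 4} };
    toy_op op = { 1, 2 };
    assert(toy_run_add(&ctx, TOY_CONST, TOY_CV, &op) == 20 + 3);
}
```

An ANY-style handler would correspond to a single function that inspects the operand type at run time instead of having one copy per combination.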
You can see the conformity of new macros to old code in the following list:
EXECUTE_DATA
execute_data
ZEND_VM_DISPATCH_TO_HANDLER(<OP>)
return <OP>_helper(ZEND_OPCODE_HANDLER_ARGS_PASSTHRU)
ZEND_VM_DISPATCH_TO_HELPER(<NAME>)
return <NAME>(ZEND_OPCODE_HANDLER_ARGS_PASSTHRU)
ZEND_VM_DISPATCH_TO_HELPER_EX(<NAME>,<PARAM>,<VAL>)
return <NAME>(<VAL>, ZEND_OPCODE_HANDLER_ARGS_PASSTHRU)
ZEND_VM_CONTINUE()
return 0
ZEND_VM_NEXT_OPCODE()
NEXT_OPCODE()
ZEND_VM_SET_OPCODE(<TARGET>)
SET_OPCODE(<TARGET>)
ZEND_VM_INC_OPCODE()
INC_OPCODE()
ZEND_VM_RETURN_FROM_EXECUTE_LOOP()
RETURN_FROM_EXECUTE_LOOP()
ZEND_VM_C_LABEL(<LABEL>):
<LABEL>:
ZEND_VM_C_GOTO(<LABEL>)
goto <LABEL>
OP<X>_TYPE
opline->op<X>.op_type
GET_OP<X>_ZVAL_PTR(<TYPE>)
get_zval_ptr(&opline->op<X>, EX(Ts), &free_op<X>, <TYPE>)
GET_OP<X>_ZVAL_PTR_PTR(<TYPE>)
get_zval_ptr_ptr(&opline->op<X>, EX(Ts), &free_op<X>, <TYPE>)
GET_OP<X>_OBJ_ZVAL_PTR(<TYPE>)
get_obj_zval_ptr(&opline->op<X>, EX(Ts), &free_op<X>, <TYPE>)
GET_OP<X>_OBJ_ZVAL_PTR_PTR(<TYPE>)
get_obj_zval_ptr_ptr(&opline->op<X>, EX(Ts), &free_op<X>, <TYPE>)
IS_OP<X>_TMP_FREE()
IS_TMP_FREE(free_op<X>)
FREE_OP<X>()
FREE_OP(free_op<X>)
FREE_OP<X>_IF_VAR()
FREE_VAR(free_op<X>)
FREE_OP<X>_VAR_PTR()
FREE_VAR_PTR(free_op<X>)
Executor's helpers can be defined without parameters or with one parameter. This is done with the following constructs:
ZEND_VM_HELPER(<HELPER-NAME>, <OP1_TYPES>, <OP2_TYPES>)
{
<HELPER'S CODE>
}
ZEND_VM_HELPER_EX(<HELPER-NAME>, <OP1_TYPES>, <OP2_TYPES>, <PARAM_SPEC>)
{
<HELPER'S CODE>
}
The executor's code is generated by the PHP script zend_vm_gen.php. It uses
zend_vm_def.h and zend_vm_execute.skl as input and produces
zend_vm_opcodes.h and zend_vm_execute.h. The first file is a list of opcode
definitions. It is included from zend_compile.h. The second one is the executor
code itself. It is included from zend_execute.c.
zend_vm_gen.php can produce different kinds of executors. You can select a
different opcode threading model using --with-vm-kind=CALL|SWITCH|GOTO|HYBRID.
You can disable opcode specialization using --without-specializer.
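As a rough illustration of how two of these threading models differ, the toy stack machine below runs the same program once with a switch-based loop and once by calling one function per opcode. All names are hypothetical and the code is far simpler than the generated executor. GOTO threading is left out because it relies on the computed-goto compiler extension.

```c
#include <assert.h>

enum { TOY_PUSH, TOY_ADD, TOY_HALT };
typedef struct { int opcode; int operand; } toy_op;
typedef struct { int stack[8]; int sp; } toy_vm;

/* SWITCH threading: one loop with a switch on the opcode number. */
static int toy_run_switch(const toy_op *op) {
    toy_vm vm = { {0}, 0 };
    for (;; op++) {
        switch (op->opcode) {
        case TOY_PUSH: vm.stack[vm.sp++] = op->operand; break;
        case TOY_ADD:  vm.sp--; vm.stack[vm.sp - 1] += vm.stack[vm.sp]; break;
        case TOY_HALT: return vm.stack[vm.sp - 1];
        }
    }
}

/* CALL threading: each opcode is a function, called through a table.
 * A handler returns 0 to stop the loop and 1 to continue. */
typedef int (*toy_handler)(toy_vm *vm, const toy_op *op);

static int toy_push(toy_vm *vm, const toy_op *op) {
    vm->stack[vm->sp++] = op->operand;
    return 1;
}
static int toy_add(toy_vm *vm, const toy_op *op) {
    (void)op;
    vm->sp--;
    vm->stack[vm->sp - 1] += vm->stack[vm->sp];
    return 1;
}
static int toy_halt(toy_vm *vm, const toy_op *op) {
    (void)vm; (void)op;
    return 0;
}

/* Indexed by opcode number: TOY_PUSH, TOY_ADD, TOY_HALT. */
static const toy_handler toy_handlers[] = { toy_push, toy_add, toy_halt };

static int toy_run_call(const toy_op *op) {
    toy_vm vm = { {0}, 0 };
    while (toy_handlers[op->opcode](&vm, op)) {
        op++;
    }
    return vm.stack[vm.sp - 1];
}

/* Both executors must agree on the result of 2 + 40. */
static int toy_demo(void) {
    static const toy_op program[] = {
        { TOY_PUSH, 2 }, { TOY_PUSH, 40 }, { TOY_ADD, 0 }, { TOY_HALT, 0 },
    };
    int a = toy_run_switch(program);
    int b = toy_run_call(program);
    assert(a == b);
    return a;
}
```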
Finally, you can debug the executor using the original zend_vm_def.h or the
generated zend_vm_execute.h file. Debugging with the original file requires
the --with-lines option. By default, Zend Engine uses the following
command to generate the executor:
# Default VM kind is HYBRID
php zend_vm_gen.php --with-vm-kind=HYBRID