We implement a simplified version. Every value with exactly one bit set
can be encoded as an immediate for logical instructions, and we expect
this quick check to cover many common masks.
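A minimal sketch of such a quick check, in plain C. The function name
has_single_bit() is hypothetical and is not taken from the patch; it only
illustrates the "exactly one bit set" test.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 when exactly one bit of v is set. */
int has_single_bit(uint64_t v)
{
    /* v & (v - 1) clears the lowest set bit; the result is zero only
     * when v had a single set bit. Zero itself is rejected up front. */
    return v != 0 && (v & (v - 1)) == 0;
}
```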
Besides, add macro TST_64_WITH_ONE since it's often used to test a 64-bit
register against the constant 1.
Change-Id: I850a6ac6acbe2d12f85180e407344580ee6fea61
As explained by Dmitry, 'ucomisd' on x86 sets the 'p' flag, and it also
always sets the 'z' and 'c' flags. [1]
Besides, remove one duplicate 'break'.
[1]. https://mudongliang.github.io/x86/html/file_module_x86_id_316.html
Change-Id: I767214c7ab8db31115801a3ae96b20320757899f
LONG MUL can be optimized into a left shift if either operand is a power
of two. The conditions "IS_SIGNED_32BIT()" and "is_power_of_two()" are
used to filter out invalid candidates. However, there is one exception,
i.e. -2147483648 (that is, 0xffff,ffff,8000,0000). See the stand-alone
case [1].
Assume "a = 3; b = -2147483648;". The expected result of "a * b" is a
negative value. However, it would be optimized to "a << 31", which is
positive.
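The mismatch can be reproduced with a small C sketch. All names here are
hypothetical, and the 32-bit unsigned check is only an assumption about
how a sign-agnostic power-of-two test could accept this value; it is not
copied from the JIT sources.

```c
#include <stdint.h>

/* Assumed sign-agnostic check: only the low 32 bits are inspected. */
int is_pow2_u32(uint32_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Reference result: plain 64-bit multiplication. */
int64_t mul_ref(int64_t a, int64_t b)
{
    return a * b;
}

/* The shift that the unrefined condition would emit instead. */
int64_t mul_as_shift(int64_t a, unsigned n)
{
    /* Shift as unsigned to avoid undefined behavior in C. */
    return (int64_t)((uint64_t)a << n);
}
```

With a = 3 and b = -2147483648, the low 32 bits of b are 0x80000000,
which passes the check, yet mul_ref() is negative while mul_as_shift(a, 31)
is positive.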
The trigger condition is refined as follows.
1) For the x86 implementation, an additional check for positive numbers
is added. Note that the LONG type, i.e. zend_long, is defined as int32_t
for the x86 arch and int64_t for the x64 arch. This optimization only
accepts values which can be represented by the int32_t type by default.
See IS_SIGNED_32BIT().
2) For AArch64, we employ helper function zend_long_is_power_of_two()
since values of int64_t type are used.
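A minimal sketch of a sign-aware power-of-two test on 64-bit values. The
name long_is_pow2_sketch() is hypothetical; the actual body of
zend_long_is_power_of_two() is not shown in this message and may differ.

```c
#include <stdint.h>

/* Hypothetical helper: accepts only strictly positive powers of two. */
int long_is_pow2_sketch(int64_t n)
{
    /* The n > 0 test rejects -2147483648 and INT64_MIN before the
     * subtraction is evaluated, so n - 1 cannot overflow. */
    return n > 0 && (n & (n - 1)) == 0;
}
```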
Overflow detection for left shifting is added in this patch as well.
Note 1: bit helper functions are arch-independent and we move them into
zend_jit_internals.h.
Note 2: two test cases are added. Test case mul_003.phpt is used to
check the trigger condition and mul_004.phpt is designed to check
overflow detection.
Note 3: overflow detection for x86 is not implemented yet, as I think
another temporary register besides R0 is needed. Hence mul_004.phpt would
fail on x86 machines.
If we can use R1 as tmp_reg, the code can be updated as below.
```
| GET_ZVAL_LVAL result_reg, op1_addr
if (may_overflow) {
use_ovf_flag = 0;
/* Compare 'op' and '((op << n) >> n)' for overflow.
* Flag: jne -> overflow. je -> no overflow.
*/
tmp_reg = ZREG_R1;
| mov Ra(tmp_reg), Ra(result_reg)
| shl Ra(tmp_reg), floor_log2(Z_LVAL_P(Z_ZV(op2_addr)))
| sar Ra(tmp_reg), floor_log2(Z_LVAL_P(Z_ZV(op2_addr)))
| cmp Ra(tmp_reg), Ra(result_reg)
}
| shl Ra(result_reg), floor_log2(Z_LVAL_P(Z_ZV(op2_addr)))
```
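The same round-trip comparison can be sketched in plain C. The function
name shl_overflows() is hypothetical, and the sketch assumes that ">>" on
a negative int64_t is an arithmetic shift and that converting an
out-of-range uint64_t to int64_t wraps, as GCC and Clang do.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 when op << n does not fit in int64_t.
 * Expects 0 <= n < 64. */
int shl_overflows(int64_t op, unsigned n)
{
    /* Shift as unsigned to avoid undefined behavior on signed shifts. */
    int64_t shifted = (int64_t)((uint64_t)op << n);

    /* Compare 'op' and '((op << n) >> n)': a difference means that
     * significant bits were shifted out, i.e. overflow. */
    return (shifted >> n) != op;
}
```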
[1]. https://godbolt.org/z/1vKbfv8oG
Change-Id: Ie90e1d4e7c8b94a0c8f61386dfe650fa2c6879a1
Overflow detection for LONG MUL is added in this patch. Unlike 'subs'
and 'adds', where overflow can be easily checked via the V flag, LONG MUL
does not set the flags.
We use 'smulh' instruction to get the upper 64 bits of the 128-bit
result and check the top 65 bits to tell whether integer overflow
occurs. [1]
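The idea can be sketched in C, assuming a compiler that provides the
__int128 extension (GCC or Clang) and an arithmetic right shift on signed
values. The name mul_overflows() is hypothetical; the upper half of the
product stands in for what 'smulh' returns.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 when a * b does not fit in int64_t. */
int mul_overflows(int64_t a, int64_t b)
{
    __int128 prod = (__int128)a * (__int128)b;
    int64_t lo = (int64_t)prod;          /* low 64 bits, as from 'mul' */
    int64_t hi = (int64_t)(prod >> 64);  /* upper 64 bits, as from 'smulh' */

    /* No overflow means the top 65 bits are all equal: the upper half
     * must be the sign extension of bit 63 of the low half. */
    return hi != (lo >> 63);
}
```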
Note that LONG MUL can be substituted by 'adds' or 'lsl' in some cases.
Hence, the flag 'use_mul' is introduced in order to select the proper
overflow check instruction afterwards.
[1]. https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/detecting-overflow-from-mul
Change-Id: I67e8287e9044c2a96b188d4bf6674736713abfe9