[Bug target/18154] Inefficient max/min code for PowerPC
segher at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Tue Aug 23 21:53:00 GMT 2016
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18154
--- Comment #11 from Segher Boessenkool <segher at gcc dot gnu.org> ---
The unsigned version can be done in four insns:
subfc r5,r3,r4
subfe r6,r6,r6
and r7,r6,r5
addc r8,r7,r3
(superopt finds 16 versions, all similar).
The signed version can be done in six:
subfc r5,r3,r4
srwi r6,r4,31
srwi r7,r3,31
subfe r8,r6,r7
and r9,r8,r5
addc r10,r9,r3
(superopt finds 240 versions, many with one or two xoris ,,0x8000
which doesn't work for 64 bit, and many with srawi as well, which
can be more expensive than srwi; all remaining are similar).
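To check what the two carry-based sequences compute, here is a rough C model of them on 32-bit registers. It is only a sketch: subfc32 and subfe32 are hypothetical helpers written for this illustration (not GCC or superopt code), and the model assumes CA is set by subfc exactly when the subtraction does not borrow. Traced this way, the four-insn sequence comes out as the unsigned minimum of r3 and r4 and the six-insn sequence as the signed minimum.

```c
#include <stdint.h>

/* Hypothetical helper: models "subfc rD,rA,rB" on 32-bit registers.
   Result is rB - rA; CA is 1 when there is no borrow (rB >= rA, unsigned). */
static uint32_t subfc32(uint32_t ra, uint32_t rb, uint32_t *ca)
{
    *ca = (rb >= ra);
    return rb - ra;
}

/* Hypothetical helper: models "subfe rD,rA,rB", i.e. ~rA + rB + CA.
   The CA output of subfe is not needed here, so it is not modelled. */
static uint32_t subfe32(uint32_t ra, uint32_t rb, uint32_t ca)
{
    return ~ra + rb + ca;
}

/* Model of the four-insn sequence. */
uint32_t min_u32_carry(uint32_t r3, uint32_t r4)
{
    uint32_t ca;
    uint32_t r5 = subfc32(r3, r4, &ca); /* subfc r5,r3,r4 */
    uint32_t r6 = subfe32(0, 0, ca);    /* subfe r6,r6,r6: CA - 1, whatever r6 held */
    uint32_t r7 = r6 & r5;              /* and r7,r6,r5 */
    return r7 + r3;                     /* addc r8,r7,r3 */
}

/* Model of the six-insn sequence. */
int32_t min_s32_carry(int32_t a, int32_t b)
{
    uint32_t r3 = (uint32_t)a, r4 = (uint32_t)b, ca;
    uint32_t r5 = subfc32(r3, r4, &ca); /* subfc r5,r3,r4 */
    uint32_t r6 = r4 >> 31;             /* srwi r6,r4,31: sign bit of r4 */
    uint32_t r7 = r3 >> 31;             /* srwi r7,r3,31: sign bit of r3 */
    uint32_t r8 = subfe32(r6, r7, ca);  /* subfe r8,r6,r7 */
    uint32_t r9 = r8 & r5;              /* and r9,r8,r5 */
    /* The cast back to int32_t assumes the usual two's complement conversion. */
    return (int32_t)(r9 + r3);          /* addc r10,r9,r3 */
}
```

In both models the subfe result is a mask of all ones when r4 is the smaller operand and zero otherwise, so the final add yields either r3 + (r4 - r3) or r3.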
For 32-bit min/max on a 64-bit cpu, we can use only "cheap", non-carry
instructions:
extsw r3,r3
extsw r4,r4
subf r5,r4,r3
srdi r6,r5,32
and r7,r6,r5
add r8,r7,r4
(and zero extends instead of extsw for unsigned). Those extends often
disappear into surrounding insns, or are not needed because the ABI
requires the regs to be extended already, etc.
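The carry-free 64-bit sequence can be modelled in C along the same lines. Again this is only a sketch under stated assumptions: the function names are made up for illustration, the registers are plain uint64_t values, and only the low 32 bits of the final register are treated as the result. As written, the sequence gives the minimum of the two inputs.

```c
#include <stdint.h>

/* Shared tail of the sequence, operating on already-extended 64-bit values. */
static uint64_t min_tail64(uint64_t r3, uint64_t r4)
{
    uint64_t r5 = r3 - r4;   /* subf r5,r4,r3: r3 - r4, exact in 64 bits */
    uint64_t r6 = r5 >> 32;  /* srdi r6,r5,32: 0xFFFFFFFF if r3 < r4, else 0 */
    uint64_t r7 = r6 & r5;   /* and r7,r6,r5: low word of the difference, or 0 */
    return r7 + r4;          /* add r8,r7,r4 */
}

/* Signed 32-bit min: inputs are sign-extended first (extsw). */
int32_t min_s32_on64(int32_t a, int32_t b)
{
    uint64_t r3 = (uint64_t)(int64_t)a;  /* extsw r3,r3 */
    uint64_t r4 = (uint64_t)(int64_t)b;  /* extsw r4,r4 */
    /* Only the low 32 bits of r8 are meaningful; the cast assumes the
       usual two's complement conversion. */
    return (int32_t)(uint32_t)min_tail64(r3, r4);
}

/* Unsigned 32-bit min: inputs are zero-extended instead. */
uint32_t min_u32_on64(uint32_t a, uint32_t b)
{
    uint64_t r3 = a, r4 = b;             /* zero extension */
    return (uint32_t)min_tail64(r3, r4);
}
```

Because both inputs fit in 32 bits after extension, the 64-bit difference cannot wrap, and its upper word is all ones exactly when the first operand is smaller. That is what lets a plain shift stand in for the carry.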