[Bug target/97503] Suboptimal use of cntlzw and cntlzd
cvs-commit at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Wed Oct 21 08:54:54 GMT 2020
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97503
--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:5244b4af5e47bc98a2a9cf36f048981583a1b163
commit r11-4183-g5244b4af5e47bc98a2a9cf36f048981583a1b163
Author: Jakub Jelinek <jakub@redhat.com>
Date: Wed Oct 21 10:51:33 2020 +0200
phiopt: Optimize x ? __builtin_clz (x) : 32 in GIMPLE [PR97503]
At the RTL level we already have noce_try_ifelse_collapse combined with
simplify_cond_clz_ctz, but that optimization doesn't always trigger. For
example, on powerpc there is a define_insn that compares a reg against zero
and copies that register to another one, so the simplify_cond_clz_ctz test
sees a different pseudo and punts.
For targets that define C?Z_DEFINED_VALUE_AT_ZERO to 2 for certain modes,
we can already optimize this in phiopt; we just need to ensure that
we transform the __builtin_c?z* calls into .C?Z ifns, because my recent
VRP changes codified that the builtin calls are always undefined at zero,
while the ifns honor C?Z_DEFINED_VALUE_AT_ZERO equal to 2.
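As a rough illustration of the source-level pattern being discussed, here is a minimal sketch assuming a GCC-compatible compiler. The function names and the UINT_WIDTH_BITS macro are hypothetical and are not taken from the patch or its test. The fallback value equals the bit width of unsigned int (32 on the usual targets).

```c
#include <limits.h>

/* Hypothetical helper macro: bit width of unsigned int (32 on most targets).  */
#define UINT_WIDTH_BITS ((int) (sizeof (unsigned int) * CHAR_BIT))

/* __builtin_clz (0) is undefined, so zero is handled explicitly.
   This has the shape "x ? __builtin_clz (x) : 32" from the subject line.  */
int
clz_or_width (unsigned int x)
{
  return x ? __builtin_clz (x) : UINT_WIDTH_BITS;
}

/* The same shape for trailing zeros.  */
int
ctz_or_width (unsigned int x)
{
  return x ? __builtin_ctz (x) : UINT_WIDTH_BITS;
}
```

Whether the conditional is actually removed depends on the target defining C?Z_DEFINED_VALUE_AT_ZERO to 2 with a matching value for the mode involved; on other targets the branch or conditional move is expected to remain.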
And, in phiopt we already have popcount handling that does pretty much the
same thing, except for always using a zero value rather than the one set
by C?Z_DEFINED_VALUE_AT_ZERO.
So, this patch extends that function to handle not just popcount, but also
clz and ctz.
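For comparison, the popcount form that phiopt already handled presumably looks like the sketch below (the function name is hypothetical). Here the guarded value is always zero, which is exactly what __builtin_popcount returns for a zero argument, so the condition is redundant on any target.

```c
/* Hypothetical example: the zero check is redundant because
   __builtin_popcount (0) is well defined and equals 0.  */
int
popcount_guarded (unsigned int x)
{
  return x ? __builtin_popcount (x) : 0;
}
```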
2020-10-21 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/97503
* tree-ssa-phiopt.c: Include internal-fn.h.
(cond_removal_in_popcount_pattern): Rename to ...
(cond_removal_in_popcount_clz_ctz_pattern): ... this. Handle not just
popcount, but also clz and ctz if it has C?Z_DEFINED_VALUE_AT_ZERO 2.
* gcc.dg/tree-ssa/pr97503.c: New test.