[Bug tree-optimization/99142] New: [11 Regression] __builtin_clz match.pd transformation too greedy
hp at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Wed Feb 17 23:11:42 GMT 2021
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99142
Bug ID: 99142
Summary: [11 Regression] __builtin_clz match.pd transformation
too greedy
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: hp at gcc dot gnu.org
Target Milestone: ---
Created attachment 50215
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50215&action=edit
test-case gcc.dg/tree-ssa/prXXXXX.c
See the attached test-case, which is de-macroized from
gcc.target/cris/pr93372-31.c. That test started regressing with d2eb616a0f7b
"match.pd: Add clz(X) == 0 -> (int)X < 0 etc. simplifications [PR94802]".
In the test-case, the result *is* used more than once (twice more besides the
transformed compare) and the match.pd matching expression *does* have the s
modifier: (op (clz:s @0) INTEGER_CST@1), but since the transformation doesn't
result in "an expression with more than one operator" (cf.
doc/match-and-simplify.texi), it's still performed.
The result is that the *input* is kept alive *after* the clz instruction. This
generally causes additional register pressure and throws away any re-use of
incidentally computed condition codes. Though the original observation was for
cris-elf, where the effect is more dramatic, the effect is visible even for
x86_64 and of the same kind: losing the re-use of non-zero condition codes from
the bsrl instruction, i.e. the transformation causes an additional instruction:
--- prXXXXX.s.64good 2021-02-17 02:26:57.646183108 +0100
+++ prXXXXX.s.64bad 2021-02-17 02:27:33.124979464 +0100
@@ -9,7 +9,8 @@ f:
        bsrl    %edi, %eax
        xorl    $31, %eax
        movl    %eax, (%rsi)
-       je      .L1
+       testl   %edi, %edi
+       js      .L1
        movl    %eax, (%rdx)
 .L1:
        ret
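For readers without the attachment, a function of roughly the following
shape would be consistent with the x86_64 assembly above. This is a
hypothetical reconstruction, not the attached test-case: the parameter names,
the void return type and the exact control flow are assumptions inferred from
the register usage (%edi, %rsi, %rdx) in the diff.

```c
/* Hypothetical reconstruction; NOT the attached test-case.
   The clz result c has three uses: two stores and one compare.  */
void
f (unsigned int x, int *p, int *q)
{
  int c = __builtin_clz (x);	/* Undefined for x == 0.  */

  *p = c;			/* First use besides the compare.  */
  if (c == 0)			/* The compare match.pd rewrites to (int) x < 0.  */
    return;
  *q = c;			/* Second use besides the compare.  */
}
```

With the compare left as c == 0, x is dead after the clz; once it is rewritten
to (int) x < 0, x has to stay live until the branch.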
My conclusion is that the transformation should be gated by
single_use(clz result) *everywhere*.
Alternatively, the "s" modifier could be adjusted somehow, but I don't see how,
other than the obvious change of making it mean *exactly* single_use, and that
suggestion has been shot down before.
Maybe there should be an additional *reverse* version of the "simplification",
replacing "y = clz(x); if (x < 0) ...stuff using y but not x" -> "y = clz(x);
if (y == 0) ...stuff using y but not x"!
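As a sketch of what such a reverse rewrite would mean at the source level,
the two functions below should behave identically for every non-zero x,
because (int) x < 0 holds exactly when clz(x) == 0 for a 32-bit unsigned int.
The function names and the int return value are made up for illustration;
nothing here is an existing match.pd pattern.

```c
/* Hypothetical illustration only.  Assumes 32-bit unsigned int and
   GCC's modulo behaviour for the unsigned-to-int conversion.  */

/* Compare reads the input x, so x stays live after the clz.  */
int
store_unless_msb_via_input (unsigned int x, int *out)
{
  int y = __builtin_clz (x);	/* Undefined for x == 0.  */

  if ((int) x < 0)
    return 0;
  *out = y;
  return 1;
}

/* Compare reads only the clz result y, so x is dead after the clz.  */
int
store_unless_msb_via_result (unsigned int x, int *out)
{
  int y = __builtin_clz (x);	/* Undefined for x == 0.  */

  if (y == 0)
    return 0;
  *out = y;
  return 1;
}
```

The second form is the one that lets the branch re-use the condition codes
already set while computing y.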