I noticed this while looking through a few bug reports but it deals with <=> too. Take: int f(int i) { int t = i ? -1 : 0; return t | 1; } int f1(int i) { return i ? -1 : 1; } ------- CUT ------ These two should produce the same code generation. Filing so I don't lose this while I am going through bug reports.
So it turns out you can make this generic and don't need to handle 1 specially diff --git a/gcc/match.pd b/gcc/match.pd index beb8d27535e..2af987278af 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -3805,14 +3805,23 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (simplify (cond @0 INTEGER_CST@1 INTEGER_CST@2) (switch + /* a ? CST : -1 -> -(!a) | CST. */ + (if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@2)) + (with { + tree booltrue = constant_boolean_node (true, boolean_type_node); + } + (bit_ior (negate (convert (bit_xor (convert:boolean_type_node @0) { booltrue; } ))) @2))) + /* a ? -1 : CST -> -(a) | CST. */ + (if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@1)) + (with { + tree booltrue = constant_boolean_node (true, boolean_type_node); + } + (bit_ior (negate (convert (convert:boolean_type_node @0))) @2))) (if (integer_zerop (@2)) (switch /* a ? 1 : 0 -> a if 0 and 1 are integral types. */ (if (integer_onep (@1)) (convert (convert:boolean_type_node @0))) - /* a ? -1 : 0 -> -a. */ - (if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@1)) - (negate (convert (convert:boolean_type_node @0)))) /* a ? powerof2cst : 0 -> a << (log2(powerof2cst)) */ (if (INTEGRAL_TYPE_P (type) && integer_pow2p (@1)) (with { @@ -3827,9 +3836,6 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) /* a ? 0 : 1 -> !a. */ (if (integer_onep (@2)) (convert (bit_xor (convert:boolean_type_node @0) { booltrue; } ))) - /* a ? -1 : 0 -> -(!a). */ - (if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@2)) - (negate (convert (bit_xor (convert:boolean_type_node @0) { booltrue; } )))) /* a ? powerof2cst : 0 -> (!a) << (log2(powerof2cst)) */ (if (INTEGRAL_TYPE_P (type) && integer_pow2p (@2)) (with {
I decided that this should really go on the RTL level ....
I have the ifcvt.c patch which adds this.
Created attachment 51203 [details] ifcvt patch Patch which go into testing.
Comment on attachment 51203 [details] ifcvt patch This patch is wrong if STORE_FLAG_VALUE == -1.
Thinking about this some more, there is a canonicalization issue. We need to decide if we want to canonicalization to just a ? -1 : 1; or expand it out. a ? 1 : 0 makes sense to do (cast) a; So does "a ? 0 : 1". Does the current a ? -1 : 0 make sense or just add that to ifcvt. I am going to take a few days to think of this and such. There are other issues that deal with this. Even having a cmov existing makes it harder to decide. Even though for an example -(a == 0) can be optimized nicely on x86, it might not be nicely on other targets.
A few more canonicalization issues that need to be thought of: "a >>u (bitsize-1)" and "a <s 0" "a >>s (bitsize-1)" and "-(a <s 0)" (In reply to Andrew Pinski from comment #6) > Thinking about this some more, there is a canonicalization issue. We need to > decide if we want to canonicalization to just a ? -1 : 1; or expand it out. > a ? 1 : 0 makes sense to do (cast) a; So does "a ? 0 : 1". > > Does the current a ? -1 : 0 make sense or just add that to ifcvt. PR101339 is related to that canonicalization really. There are others. Even things like: (a == 0) + 2 Should that be: a == 0 ? 3 : 2 On the gimple level and then do the correct thing on the RTL level?
We generally want less stmts on GIMPLE, (-(type)a) | 1 is more than a ? -1 : 1 which means it should be a RTL expansion time optimization.
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>: https://gcc.gnu.org/g:f1652e3343b1ec47035370801d9b9aca1f8b613f commit r13-857-gf1652e3343b1ec47035370801d9b9aca1f8b613f Author: Roger Sayle <roger@nextmovesoftware.com> Date: Mon May 30 21:26:37 2022 +0100 PR rtl-optimization/101617: Use neg/sbb in ix86_expand_int_movcc. This patch resolves PR rtl-optimization/101617 where we should generate the exact same code for (X ? -1 : 1) as we do for ((X ? -1 : 0) | 1). The cause of the current difference on x86_64 is actually in ix86_expand_int_movcc that doesn't know that negl;sbbl can be used to create a -1/0 result depending on whether the input is zero/nonzero. So for Andrew Pinski's test case: int f1(int i) { return i ? -1 : 1; } GCC currently generates: f1: cmpl $1, %edi sbbl %eax, %eax // x ? 0 : -1 andl $2, %eax // x ? 0 : 2 subl $1, %eax // x ? -1 : 1 ret but with the attached patch, now generates: f1: negl %edi sbbl %eax, %eax // x ? -1 : 0 orl $1, %eax // x ? -1 : 1 ret To implement this I needed to add two expanders to i386.md to generate the required instructions (in both SImode and DImode) matching the pre-existing define_insns of the same name. 2022-05-30 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog PR rtl-optimization/101617 * config/i386/i386-expand.cc (ix86_expand_int_movcc): Add a special case (indicated by negate_cc_compare_p) to generate a -1/0 mask using neg;sbb. * config/i386/i386.md (x86_neg<mode>_ccc): New define_expand to generate an *x86_neg<mode>_ccc instruction. (x86_mov<mode>cc_0_m1_neg): Likewise, a new define_expand to generate a *x86_mov<mode>cc_0_m1_neg instruction. gcc/testsuite/ChangeLog PR rtl-optimization/101617 * gcc.target/i386/pr101617.c: New test case.
This should now be fixed (for x86) on mainline.