Bug 101617 - a ? -1 : 1 -> (-(type)a) | 1
Summary: a ? -1 : 1 -> (-(type)a) | 1
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization (show other bugs)
Version: 12.0
: P3 enhancement
Target Milestone: 13.0
Assignee: Andrew Pinski
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2021-07-25 05:47 UTC by Andrew Pinski
Modified: 2022-06-04 09:14 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2021-07-25 00:00:00


Attachments
ifcvt patch (655 bytes, patch)
2021-07-25 22:04 UTC, Andrew Pinski
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Andrew Pinski 2021-07-25 05:47:46 UTC
I noticed this while looking through a few bug reports but it deals with <=> too.
Take:
int f(int i)
{
  int t = i ? -1 : 0;
  return t | 1;
}

int f1(int i)
{
  return i ? -1 : 1;
}

------- CUT ------
These two should produce the same code generation.

Filing so I don't lose this while I am going through bug reports.
Comment 1 Andrew Pinski 2021-07-25 19:25:14 UTC
So it turns out you can make this generic and don't need to handle 1 specially
diff --git a/gcc/match.pd b/gcc/match.pd
index beb8d27535e..2af987278af 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3805,14 +3805,23 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (simplify
  (cond @0 INTEGER_CST@1 INTEGER_CST@2)
  (switch
+   /* a ? CST : -1 -> -(!a) | CST. */
+  (if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@2))
+   (with {
+      tree booltrue = constant_boolean_node (true, boolean_type_node);
+    }
+    (bit_ior (negate (convert (bit_xor (convert:boolean_type_node @0) { booltrue; } ))) @2)))
+   /* a ? -1 : CST -> -(a) | CST. */
+  (if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@1))
+   (with {
+      tree booltrue = constant_boolean_node (true, boolean_type_node);
+    }
+    (bit_ior (negate (convert (convert:boolean_type_node @0))) @2)))
   (if (integer_zerop (@2))
    (switch
     /* a ? 1 : 0 -> a if 0 and 1 are integral types. */
     (if (integer_onep (@1))
      (convert (convert:boolean_type_node @0)))
-    /* a ? -1 : 0 -> -a. */
-    (if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@1))
-     (negate (convert (convert:boolean_type_node @0))))
     /* a ? powerof2cst : 0 -> a << (log2(powerof2cst)) */
     (if (INTEGRAL_TYPE_P (type) && integer_pow2p (@1))
      (with {
@@ -3827,9 +3836,6 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
      /* a ? 0 : 1 -> !a. */
      (if (integer_onep (@2))
       (convert (bit_xor (convert:boolean_type_node @0) { booltrue; } )))
-     /* a ? -1 : 0 -> -(!a). */
-     (if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@2))
-      (negate (convert (bit_xor (convert:boolean_type_node @0) { booltrue; } ))))
      /* a ? powerof2cst : 0 -> (!a) << (log2(powerof2cst)) */
      (if (INTEGRAL_TYPE_P (type) &&  integer_pow2p (@2))
       (with {
Comment 2 Andrew Pinski 2021-07-25 20:47:18 UTC
I decided that this should really go on the RTL level ....
Comment 3 Andrew Pinski 2021-07-25 22:03:38 UTC
I have the ifcvt.c patch which adds this.
Comment 4 Andrew Pinski 2021-07-25 22:04:25 UTC
Created attachment 51203 [details]
ifcvt patch

Patch which go into testing.
Comment 5 Andrew Pinski 2021-07-25 22:35:57 UTC
Comment on attachment 51203 [details]
ifcvt patch

This patch is wrong if STORE_FLAG_VALUE == -1.
Comment 6 Andrew Pinski 2021-07-26 00:21:09 UTC
Thinking about this some more, there is a canonicalization issue. We need to decide if we want to canonicalization to just a ? -1 : 1; or expand it out.
a ? 1 : 0 makes sense to do (cast) a;  So does "a ? 0 : 1".

Does the current a ? -1 : 0 make sense or just add that to ifcvt.

I am going to take a few days to think of this and such.

There are other issues that deal with this.  Even having a cmov existing makes it harder to decide.  Even though for an example -(a == 0) can be optimized nicely on x86, it might not be nicely on other targets.
Comment 7 Andrew Pinski 2021-07-26 00:28:38 UTC
A few more canonicalization issues that need to be thought of:

"a >>u (bitsize-1)" and "a <s 0"
"a >>s (bitsize-1)" and "-(a <s 0)"

(In reply to Andrew Pinski from comment #6)
> Thinking about this some more, there is a canonicalization issue. We need to
> decide if we want to canonicalization to just a ? -1 : 1; or expand it out.
> a ? 1 : 0 makes sense to do (cast) a;  So does "a ? 0 : 1".
> 
> Does the current a ? -1 : 0 make sense or just add that to ifcvt.

PR101339 is related to that canonicalization really.

There are others.

Even things like:
(a == 0) + 2
Should that be:
a == 0 ? 3 : 2
On the gimple level
and then do the correct thing on the RTL level?
Comment 8 Richard Biener 2021-07-27 10:52:29 UTC
We generally want less stmts on GIMPLE, (-(type)a) | 1 is more than
a ? -1 : 1 which means it should be a RTL expansion time optimization.
Comment 9 GCC Commits 2022-05-30 20:28:28 UTC
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:

https://gcc.gnu.org/g:f1652e3343b1ec47035370801d9b9aca1f8b613f

commit r13-857-gf1652e3343b1ec47035370801d9b9aca1f8b613f
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Mon May 30 21:26:37 2022 +0100

    PR rtl-optimization/101617: Use neg/sbb in ix86_expand_int_movcc.
    
    This patch resolves PR rtl-optimization/101617 where we should generate
    the exact same code for (X ? -1 : 1) as we do for ((X ? -1 : 0) | 1).
    The cause of the current difference on x86_64 is actually in
    ix86_expand_int_movcc that doesn't know that negl;sbbl can be used
    to create a -1/0 result depending on whether the input is zero/nonzero.
    
    So for Andrew Pinski's test case:
    
    int f1(int i)
    {
      return i ? -1 : 1;
    }
    
    GCC currently generates:
    
    f1:     cmpl    $1, %edi
            sbbl    %eax, %eax      // x ? 0 : -1
            andl    $2, %eax        // x ? 0 : 2
            subl    $1, %eax        // x ? -1 : 1
            ret
    
    but with the attached patch, now generates:
    
    f1:     negl    %edi
            sbbl    %eax, %eax      // x ? -1 : 0
            orl     $1, %eax        // x ? -1 : 1
            ret
    
    To implement this I needed to add two expanders to i386.md to generate
    the required instructions (in both SImode and DImode) matching the
    pre-existing define_insns of the same name.
    
    2022-05-30  Roger Sayle  <roger@nextmovesoftware.com>
    
    gcc/ChangeLog
            PR rtl-optimization/101617
            * config/i386/i386-expand.cc (ix86_expand_int_movcc): Add a
            special case (indicated by negate_cc_compare_p) to generate a
            -1/0 mask using neg;sbb.
            * config/i386/i386.md (x86_neg<mode>_ccc): New define_expand
            to generate an *x86_neg<mode>_ccc instruction.
            (x86_mov<mode>cc_0_m1_neg): Likewise, a new define_expand to
            generate a *x86_mov<mode>cc_0_m1_neg instruction.
    
    gcc/testsuite/ChangeLog
            PR rtl-optimization/101617
            * gcc.target/i386/pr101617.c: New test case.
Comment 10 Roger Sayle 2022-06-04 09:14:33 UTC
This should now be fixed (for x86) on mainline.