101617 – a ? -1 : 1 -> (-(type)a) | 1

Bug 101617 - a ? -1 : 1 -> (-(type)a) | 1

Summary: a ? -1 : 1 -> (-(type)a) | 1

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	rtl-optimization (show other bugs)
Version:	12.0

Importance:	P3 enhancement
Target Milestone:	13.0
Assignee:	Andrew Pinski

URL:
Keywords:	missed-optimization

Depends on:
Blocks:

Reported:	2021-07-25 05:47 UTC by Andrew Pinski
Modified:	2022-06-04 09:14 UTC (History)
CC List:	2 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:	2021-07-25 00:00:00

Attachments
ifcvt patch (655 bytes, patch) 2021-07-25 22:04 UTC, Andrew Pinski	Details \| Diff
Show Obsolete (1) View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Andrew Pinski 2021-07-25 05:47:46 UTC

I noticed this while looking through a few bug reports but it deals with <=> too.
Take:
int f(int i)
{
  int t = i ? -1 : 0;
  return t | 1;
}

int f1(int i)
{
  return i ? -1 : 1;
}

------- CUT ------
These two should produce the same code generation.

Filing so I don't lose this while I am going through bug reports.

Comment 1 Andrew Pinski 2021-07-25 19:25:14 UTC

So it turns out you can make this generic and don't need to handle 1 specially
diff --git a/gcc/match.pd b/gcc/match.pd
index beb8d27535e..2af987278af 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3805,14 +3805,23 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (simplify
  (cond @0 INTEGER_CST@1 INTEGER_CST@2)
  (switch
+   /* a ? CST : -1 -> -(!a) | CST. */
+  (if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@2))
+   (with {
+      tree booltrue = constant_boolean_node (true, boolean_type_node);
+    }
+    (bit_ior (negate (convert (bit_xor (convert:boolean_type_node @0) { booltrue; } ))) @2)))
+   /* a ? -1 : CST -> -(a) | CST. */
+  (if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@1))
+   (with {
+      tree booltrue = constant_boolean_node (true, boolean_type_node);
+    }
+    (bit_ior (negate (convert (convert:boolean_type_node @0))) @2)))
   (if (integer_zerop (@2))
    (switch
     /* a ? 1 : 0 -> a if 0 and 1 are integral types. */
     (if (integer_onep (@1))
      (convert (convert:boolean_type_node @0)))
-    /* a ? -1 : 0 -> -a. */
-    (if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@1))
-     (negate (convert (convert:boolean_type_node @0))))
     /* a ? powerof2cst : 0 -> a << (log2(powerof2cst)) */
     (if (INTEGRAL_TYPE_P (type) && integer_pow2p (@1))
      (with {
@@ -3827,9 +3836,6 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
      /* a ? 0 : 1 -> !a. */
      (if (integer_onep (@2))
       (convert (bit_xor (convert:boolean_type_node @0) { booltrue; } )))
-     /* a ? -1 : 0 -> -(!a). */
-     (if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@2))
-      (negate (convert (bit_xor (convert:boolean_type_node @0) { booltrue; } ))))
      /* a ? powerof2cst : 0 -> (!a) << (log2(powerof2cst)) */
      (if (INTEGRAL_TYPE_P (type) &&  integer_pow2p (@2))
       (with {

Comment 2 Andrew Pinski 2021-07-25 20:47:18 UTC

I decided that this should really go on the RTL level ....

Comment 3 Andrew Pinski 2021-07-25 22:03:38 UTC

I have the ifcvt.c patch which adds this.

Comment 4 Andrew Pinski 2021-07-25 22:04:25 UTC

Created attachment 51203 [details]
ifcvt patch

Patch which go into testing.

Comment 5 Andrew Pinski 2021-07-25 22:35:57 UTC

Comment on attachment 51203 [details]
ifcvt patch

This patch is wrong if STORE_FLAG_VALUE == -1.

Comment 6 Andrew Pinski 2021-07-26 00:21:09 UTC

Thinking about this some more, there is a canonicalization issue. We need to decide if we want to canonicalization to just a ? -1 : 1; or expand it out.
a ? 1 : 0 makes sense to do (cast) a;  So does "a ? 0 : 1".

Does the current a ? -1 : 0 make sense or just add that to ifcvt.

I am going to take a few days to think of this and such.

There are other issues that deal with this.  Even having a cmov existing makes it harder to decide.  Even though for an example -(a == 0) can be optimized nicely on x86, it might not be nicely on other targets.

Comment 7 Andrew Pinski 2021-07-26 00:28:38 UTC

A few more canonicalization issues that need to be thought of:

"a >>u (bitsize-1)" and "a <s 0"
"a >>s (bitsize-1)" and "-(a <s 0)"

(In reply to Andrew Pinski from comment #6)
> Thinking about this some more, there is a canonicalization issue. We need to
> decide if we want to canonicalization to just a ? -1 : 1; or expand it out.
> a ? 1 : 0 makes sense to do (cast) a;  So does "a ? 0 : 1".
> 
> Does the current a ? -1 : 0 make sense or just add that to ifcvt.

PR101339 is related to that canonicalization really.

There are others.

Even things like:
(a == 0) + 2
Should that be:
a == 0 ? 3 : 2
On the gimple level
and then do the correct thing on the RTL level?

Comment 8 Richard Biener 2021-07-27 10:52:29 UTC

We generally want less stmts on GIMPLE, (-(type)a) | 1 is more than
a ? -1 : 1 which means it should be a RTL expansion time optimization.

Comment 9 GCC Commits 2022-05-30 20:28:28 UTC

The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:

https://gcc.gnu.org/g:f1652e3343b1ec47035370801d9b9aca1f8b613f

commit r13-857-gf1652e3343b1ec47035370801d9b9aca1f8b613f
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Mon May 30 21:26:37 2022 +0100

    PR rtl-optimization/101617: Use neg/sbb in ix86_expand_int_movcc.
    
    This patch resolves PR rtl-optimization/101617 where we should generate
    the exact same code for (X ? -1 : 1) as we do for ((X ? -1 : 0) | 1).
    The cause of the current difference on x86_64 is actually in
    ix86_expand_int_movcc that doesn't know that negl;sbbl can be used
    to create a -1/0 result depending on whether the input is zero/nonzero.
    
    So for Andrew Pinski's test case:
    
    int f1(int i)
    {
      return i ? -1 : 1;
    }
    
    GCC currently generates:
    
    f1:     cmpl    $1, %edi
            sbbl    %eax, %eax      // x ? 0 : -1
            andl    $2, %eax        // x ? 0 : 2
            subl    $1, %eax        // x ? -1 : 1
            ret
    
    but with the attached patch, now generates:
    
    f1:     negl    %edi
            sbbl    %eax, %eax      // x ? -1 : 0
            orl     $1, %eax        // x ? -1 : 1
            ret
    
    To implement this I needed to add two expanders to i386.md to generate
    the required instructions (in both SImode and DImode) matching the
    pre-existing define_insns of the same name.
    
    2022-05-30  Roger Sayle  <roger@nextmovesoftware.com>
    
    gcc/ChangeLog
            PR rtl-optimization/101617
            * config/i386/i386-expand.cc (ix86_expand_int_movcc): Add a
            special case (indicated by negate_cc_compare_p) to generate a
            -1/0 mask using neg;sbb.
            * config/i386/i386.md (x86_neg<mode>_ccc): New define_expand
            to generate an *x86_neg<mode>_ccc instruction.
            (x86_mov<mode>cc_0_m1_neg): Likewise, a new define_expand to
            generate a *x86_mov<mode>cc_0_m1_neg instruction.
    
    gcc/testsuite/ChangeLog
            PR rtl-optimization/101617
            * gcc.target/i386/pr101617.c: New test case.

Comment 10 Roger Sayle 2022-06-04 09:14:33 UTC

This should now be fixed (for x86) on mainline.