95034 – Failure to convert xor pattern (made out of or+and) to xor

Bug 95034 - Failure to convert xor pattern (made out of or+and) to xor

Summary: Failure to convert xor pattern (made out of or+and) to xor

Status:	NEW

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	tree-optimization (show other bugs)
Version:	11.0

Importance:	P3 enhancement
Target Milestone:	---
Assignee:	Not yet assigned to anyone

URL:
Keywords:	easyhack, missed-optimization

Depends on:
Blocks:	19987 111275
	Show dependency tree / graph

Reported:	2020-05-10 09:48 UTC by Gabriel Ravier
Modified:	2023-10-25 13:14 UTC (History)
CC List:	4 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:	2021-08-15 00:00:00

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Gabriel Ravier 2020-05-10 09:48:44 UTC

bool combine(bool a, bool b)
{
    return (a || b) && !(a && b);
}

This can be converted to `a ^ b`. LLVM does this transformation, but GCC does not.

Comment 1 Jakub Jelinek 2021-01-12 10:33:09 UTC

For
int
combine (int a, int b)
{
  return (a | b) & ~(a & b);
}
we do optimize it, for the ||s and &&s the problem is that it is split accross multiple basic blocks, so match.pd is out of the picture, but on the other side reassoc which can handle conditions split across multiple bbs will handle only a single truth operation (so the && in this case) and so we'd need to look through the |s from there manually.

Ah, an option could be to add some truth_{and,ior,*} rules in match.pd, limited to GENERIC probably, as it won't trigger afterwards.
But of course that would handle only the above testcase and not e.g.
    bool t1 = a || b;
    bool t2 = !(a && b);
    return t1 && t2;
etc.

Richard, any ideas?

Comment 2 Jakub Jelinek 2021-01-12 10:56:41 UTC

On:
int
combine (int a, int b)
{
  int c = (a | b);
  int d = ~(a & b);
  return c & d;
}
it is the
/* (x | y) & ~(x & y) -> x ^ y */
(simplify
 (bit_and:c (bit_ior @0 @1) (bit_not (bit_and @0 @1)))
 (bit_xor @0 @1))
simplification (and there are similar around it).
So, if we go to doing this in GENERIC, we'd need
variants of this for truth_{and,ior} and truth_{and,ior}if (and for the latter only under the !TREE_SIDE_EFFECTS and !generic_expr_could_trap_p conditions, (though doesn't the more than one @1 already imply !TREE_SIDE_EFFECTS).
We do have truth_xor, but that will evaluate both arguments.

Comment 3 Jakub Jelinek 2021-01-12 11:03:23 UTC

Actually in this case we are probably fine even if b can trap, the reason is that
if a evaluated to true, then b in (a && b) is evaluated, and if a evaluates to false, then b in (a || b) is evaluated.  So it is all about a and b being the same expressions with no side-effects (which ensures that it isn't just e.g. calling the same function with the same arguments twice).

Comment 4 Jakub Jelinek 2021-01-12 12:12:02 UTC

Tried
--- match.pd.jj1	2021-01-05 16:33:21.809960885 +0100
+++ match.pd	2021-01-12 12:47:11.232713918 +0100
@@ -1228,6 +1228,64 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (bit_xor (bit_ior:c (bit_not @0) @1) (bit_ior:c @0 (bit_not @1)))
  (bit_xor @0 @1))
 
+#if GENERIC
+/* Repeat the above simplifications for truth* for GENERIC.
+   It can be done even for truth_*if as both x and y are always evaluated.
+   Unfortunately :c on truth_*if aren't allowed (they aren't commutative,
+   but in this case we don't care).  */
+(for andop (truth_and truth_andif)
+     orop (truth_or truth_orif)
+ /* (x || y) && !(x && y) -> x ^ y */
+ (simplify
+  (andop (orop @0 @1) (truth_not (andop @0 @1)))
+  (truth_xor @0 @1))
+ (simplify
+  (andop (truth_not (andop @0 @1)) (orop @0 @1))
+  (truth_xor @0 @1))
+
+ /* (x || y) && (!x ^ y) -> x && y */
+ (simplify
+  (andop (orop @0 @1) (truth_xor:c @1 (truth_not @0)))
+  (truth_and @0 @1))
+ (simplify
+  (andop (orop @1 @0) (truth_xor:c @1 (truth_not @0)))
+  (truth_and @0 @1))
+ (simplify
+  (andop (truth_xor:c @1 (truth_not @0)) (orop @0 @1))
+  (truth_and @0 @1))
+ (simplify
+  (andop (truth_xor:c @1 (truth_not @0)) (orop @1 @0))
+  (truth_and @0 @1))
+
+ /* (!x || y) && (x || !y) -> !(x ^ y) */
+ (simplify
+  (andop (orop:s (truth_not @0) @1) (orop:s @0 (truth_not @1)))
+  (truth_not (truth_xor @0 @1)))
+ (simplify
+  (andop (orop:s @1 (truth_not @0)) (orop:s @0 (truth_not @1)))
+  (truth_not (truth_xor @0 @1)))
+ (simplify
+  (andop (orop:s (truth_not @0) @1) (orop:s (truth_not @1) @0))
+  (truth_not (truth_xor @0 @1)))
+ (simplify
+  (andop (orop:s @1 (truth_not @0)) (orop:s (truth_not @1) @0))
+  (truth_not (truth_xor @0 @1)))
+
+ /* (!x || y) ^ (x || !y) -> x ^ y */
+ (simplify
+  (truth_xor (orop (truth_not @0) @1) (orop @0 (truth_not @1)))
+  (truth_xor @0 @1))
+ (simplify
+  (truth_xor (orop @1 (truth_not @0)) (orop @0 (truth_not @1)))
+  (truth_xor @0 @1))
+ (simplify
+  (truth_xor (orop (truth_not @0) @1) (orop (truth_not @1) @0))
+  (truth_xor @0 @1))
+ (simplify
+  (truth_xor (orop @1 (truth_not @0)) (orop (truth_not @1) @0))
+  (truth_xor @0 @1)))
+#endif
+
 /* ((x & y) - (x | y)) - 1 -> ~(x ^ y) */
 (simplify
  (plus (nop_convert1? (minus@2 (nop_convert2? (bit_and:c @0 @1))
but in addition to it being lengthy (as one can't use :c on andop or orop because truth_andif and truth_orif aren't commutative), it doesn't really work here, because e.g. fold_truth_andor_1 optimizes that x || y into (x | y) != 0
before we can optimize it.
Another option is to code it in fold_truth_andor or fold_truth_andor_1.

Comment 5 Andrew Pinski 2021-04-17 21:03:52 UTC

Confirmed.

Comment 6 Andrew Pinski 2023-08-23 02:16:53 UTC

So after phiopt2 we end up with:
```
  _1 = a_3(D) | b_4(D);
  if (_1 != 0)
    goto <bb 3>; [50.00%]
  else
    goto <bb 4>; [50.00%]

  <bb 3> [local count: 536870912]:
  _5 = a_3(D) & b_4(D);
  _7 = ~_5;

  <bb 4> [local count: 1073741824]:
  # iftmp.0_2 = PHI <_7(3), 0(2)>
```

Phi-opt does not support more than one statement inside the middle BB so it does nothing here.

But if we rewrite it such that the 2 statements were not inside the middle BB by phiopt2, we get the xor as expected.
That is:
```
bool f(bool a, bool b)
{
  bool c = a | b;
  bool d = a & b;
  d = !d;
  return c ? d : false;
}
```
Will produce:
```
  _8 = a_3(D) ^ b_4(D);
  return _8;
```
From the phiopt2 dump:
```
Folded into the sequence:
_8 = a_3(D) ^ b_4(D);
```

I wonder if we could support more than 1 statement in the middle BBs iff the resulting simplifications only reference one SSA_NAME of the statements max ...

Comment 7 Andrew Pinski 2023-10-17 04:39:18 UTC

(In reply to Andrew Pinski from comment #6)
> So after phiopt2 we end up with:
> ```
>   _1 = a_3(D) | b_4(D);
>   if (_1 != 0)
>     goto <bb 3>; [50.00%]
>   else
>     goto <bb 4>; [50.00%]
> 
>   <bb 3> [local count: 536870912]:
>   _5 = a_3(D) & b_4(D);
>   _7 = ~_5;
> 
>   <bb 4> [local count: 1073741824]:
>   # iftmp.0_2 = PHI <_7(3), 0(2)>
> ```
> 
> Phi-opt does not support more than one statement inside the middle BB so it
> does nothing here.
> 
> But if we rewrite it such that the 2 statements were not inside the middle
> BB by phiopt2, we get the xor as expected.
> That is:
> ```
> bool f(bool a, bool b)
> {
>   bool c = a | b;
>   bool d = a & b;
>   d = !d;
>   return c ? d : false;
> }
> ```
> Will produce:
> ```
>   _8 = a_3(D) ^ b_4(D);
>   return _8;
> ```
> From the phiopt2 dump:
> ```
> Folded into the sequence:
> _8 = a_3(D) ^ b_4(D);
> ```
> 
> I wonder if we could support more than 1 statement in the middle BBs iff the
> resulting simplifications only reference one SSA_NAME of the statements max
> ...

Or we could support what we do for casts for `~` and push the ~ outside of the if statement to see if that improves the phi-opt.
That is get:
```
bool f(bool a, bool b)
{
        bool t = a | b;
        bool t1;
        if (t) t1 = a & b; else t1 = 1;
        return !t1;
}

```
Which is already known how to optimized to `t1 = a == b;`