Bug 51049 - A regression caused by "Improve handling of conditional-branches on targets with high branch costs"
Summary: A regression caused by "Improve handling of conditional-branches on targets w...
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 4.7.0
: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2011-11-09 07:37 UTC by Jiangning Liu
Modified: 2023-08-03 05:12 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2011-11-09 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Jiangning Liu 2011-11-09 07:37:12 UTC
int f(char *i, int j)
{
        if (*i && j!=2)
                return *i;
        else
                return j;
}

Before the check-in r180109, we have

  D.4710 = *i;
  D.4711 = D.4710 != 0;
  D.4712 = j != 2;
  D.4713 = D.4711 & D.4712;
  if (D.4713 != 0) goto <D.4714>; else goto <D.4715>;
  <D.4714>:
  D.4710 = *i;
  D.4716 = (int) D.4710;
  return D.4716;
  <D.4715>:
  D.4716 = j;
  return D.4716;

After check-in r180109, we have

  D.4711 = *i;
  if (D.4711 != 0) goto <D.4712>; else goto <D.4710>;
  <D.4712>:
  if (j != 2) goto <D.4713>; else goto <D.4710>;
  <D.4713>:
  D.4711 = *i;
  D.4714 = (int) D.4711;
  return D.4714;
  <D.4710>:
  D.4714 = j;
  return D.4714;

the code below in function fold_truth_andor makes difference,

      /* Transform (A AND-IF B) into (A AND B), or (A OR-IF B)	 into (A OR B).
	 For sequence point consistancy, we need to check for trapping,
	 and side-effects.  */
      else if (code == icode && simple_operand_p_2 (arg0)
               && simple_operand_p_2 (arg1))
         return fold_build2_loc (loc, ncode, type, arg0, arg1);

for "*i != 0" simple_operand_p(*i) returns false. Originally this is not checked by the code. 

Please refer to http://gcc.gnu.org/ml/gcc-patches/2011-10/msg02445.html for discussion details.

This change accidently made some benchmarks significantly improved due to some other reasons, but Michael gave the comments below.

======Michael's comment======

It's nice that it caused a benchmark to improve significantly, but that should be done via a proper analysis and patch, not as a side effect of a supposed non-change.

======End of Michael's comment======

The potential impact would be hurting other scenarios on performance.

The key point is for this small case I gave RHS doesn't have side effect at all, so the optimization of changing it to AND doesn't violate C specification. 

======Kai's comment======

As for the case that left-hand side has side-effects but right-hand not, we aren't allowed to do this AND/OR merge.  For example 'if ((f = foo ()) != 0 && f < 24)' we aren't allowed to make this transformation.

This shouldn't be that hard.  We need to provide to simple_operand_p_2 an additional argument for checking trapping or not.

======End of Kai's comment======

This optimization change is blocking some other optimizations I am working on in back-ends. For example, the problem I described at http://gcc.gnu.org/ml/gcc/2011-09/msg00175.html disappeared. But it is not a proper behavior.

Thanks,
-Jiangning
Comment 1 Andrew Pinski 2016-01-06 22:46:25 UTC
I think this has been fixed or at least for most cases.  That is the & part is redone during the combineif pass.
Comment 2 Andrew Pinski 2023-08-03 05:12:55 UTC
So at -O1 we get what we expect:
```
  _1 = *i_4(D);
  _7 = j_5(D) != 2;
  _8 = _1 != 0;
  _9 = _7 & _8;
  if (_9 != 0)
    goto <bb 3>; [43.56%]
  else
    goto <bb 4>; [56.44%]

  <bb 3> [local count: 467721933]:
  _6 = (int) _1;

  <bb 4> [local count: 1073741824]:
  # _2 = PHI <_6(3), j_5(D)(2)>
```
But at -O2 we get:
```
  if (_1 != 0)
    goto <bb 3>; [66.00%]
  else
    goto <bb 5>; [34.00%]

  <bb 3> [local count: 708669599]:
  if (j_5(D) != 2)
    goto <bb 4>; [66.00%]
  else
    goto <bb 5>; [34.00%]

  <bb 4> [local count: 467721933]:
  _6 = (int) _1;

  <bb 5> [local count: 1073741824]:
  # _2 = PHI <_6(4), j_5(D)(3), j_5(D)(2)>
```

Because VRP comes along and replaces j_5(D) with 2 along the edge `3->5` and ifcombine does not do handle that as the phi entries are different (but maybe can be proved as the same).