This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


[Bug tree-optimization/78496] New: Missed opportunities for jump threading

From: "ysrumyan at gmail dot com" <gcc-bugzilla at gcc dot gnu dot org>
To: gcc-bugs at gcc dot gnu dot org
Date: Wed, 23 Nov 2016 15:02:29 +0000
Subject: [Bug tree-optimization/78496] New: Missed opportunities for jump threading
Auto-submitted: auto-generated

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78496

            Bug ID: 78496
           Summary: Missed opportunities for jump threading
           Product: gcc
           Version: 7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ysrumyan at gmail dot com
  Target Milestone: ---

Created attachment 40131
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40131&action=edit
test-case to reproduce, compile with -O3 option.

We noticed a large performance drop on one important benchmark, caused by
hoisting and collecting the comparisons that participate in conditional branches.
Here are the comments Richard provided on it:

Note this is a general issue with PRE which tends to
see partial redundancies when it can compute an expression to a
constant on one edge.  There is nothing wrong with that but the
particular example shows the lack of a cost model with respect
to register pressure (same applies to other GIMPLE optimization passes).

In this case we have a lot of expressions anticipated from the same
blocks where on one incoming edge their value is constant.  Profitability
here really depends on the "distance" between the to-be-inserted PHI and
its use, I guess.
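
To make the register-pressure point concrete, here is a minimal C sketch of the kind of rewrite described above. It is not taken from the attached test case: the function names, the `filler` work, and the constant 839 (borrowed from the dump in this report) are illustrative assumptions, and the "after" version is hand-written to mimic what a PRE-style insertion would produce, not actual compiler output.

```c
#include <assert.h>

/* Before: the comparison is evaluated at its use, after the merge point.
   All names here are hypothetical. */
static int before_pre(int from_const_edge, int y, int filler)
{
    int x = from_const_edge ? 0 : y;   /* models x = PHI <0, y> */
    int acc = filler * 3 + 1;          /* unrelated work between merge and use */
    int cmp = x > 839;                 /* known to be 0 when x arrived as 0 */
    return cmp ? acc : -acc;
}

/* After: the comparison is folded to a constant on one edge and computed
   on the other, so its result is merged early and must stay live across
   the unrelated work.  With many such values this raises register pressure. */
static int after_pre(int from_const_edge, int y, int filler)
{
    int cmp;
    if (from_const_edge)
        cmp = 0;                       /* 0 > 839 folded on this edge */
    else
        cmp = y > 839;                 /* computation inserted on the other edge */
    /* models cmp = PHI <0, y > 839>, live from here to its use */
    int acc = filler * 3 + 1;
    return cmp ? acc : -acc;
}
```

Both functions return the same value for every input; the only difference is how long `cmp` stays live, which is the "distance" the comment above refers to.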

We're missing quite some jump-threading here as well:

  <bb 16>:
  # x1_197 = PHI <x1_261(15), x1_435(123), x1_435(105)>
  # _407 = PHI <_16(15), _16(123), 0(105)>
  # aa1_410 = PHI <aa1_185(15), aa1_185(123), aa1_216(105)>
  # d1_413 = PHI <d1_191(15), d1_191(123), d1_432(105)>
  # w1_416 = PHI <w1_260(15), w1_260(123), 0(105)>
  # v1_377 = PHI <v1_558(15), v1_558(123), 0(105)>
  # oo1_371 = PHI <oo1_567(15), oo1_567(123), oo1_194(105)>
  # ss1_376 = PHI <ss1_576(15), ss1_576(123), ss1_192(105)>
  # r1_609 = PHI <r1_585(15), r1_585(123), r1_190(105)>
  # _612 = PHI <_596(15), _596(123), _188(105)>
  # out_ind_lsm.82_322 = PHI <out_ind_lsm.82_321(15), out_ind_lsm.82_321(123), out_ind_lsm.82_532(105)>
  _549 = w1_416 <= 899;
  _548 = _407 > 839;
  _541 = _548 & _549;
  if (_541 != 0)
    goto <bb 17>;
  else
    goto <bb 124>;

here 105 -> 16 -> 124 (forwarder) -> 18 which would eventually
make PRE behave somewhat saner (avoiding the far distances).
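
The missed opportunity can be illustrated with a small C sketch. On the edge from bb 105 both `_407` and `w1_416` are 0, so `_548 = 0 > 839` is false and the branch is known to go to bb 124. The code below is a hand-written model of that shape, not the attached test case: `hot_path` and `cold_path` are hypothetical stand-ins for the work in bb 17 and bb 18, and `threaded` shows what the control flow would look like if the 105 -> 16 -> 124 path were threaded.

```c
#include <assert.h>

/* Hypothetical stand-ins for the code reached via bb 17 and bb 18. */
static int hot_path(int t, int w)  { return t + w; }
static int cold_path(int t, int w) { return t - w; }

/* Unthreaded shape: all paths merge in one block, then the combined
   condition is tested, as in bb 16 of the dump. */
static int merged(int from_105, int t_in, int w_in)
{
    int t, w;
    if (from_105) {        /* models edge 105 -> 16: PHI arguments are 0 */
        t = 0;
        w = 0;
    } else {               /* models edges 15 -> 16 and 123 -> 16 */
        t = t_in;
        w = w_in;
    }
    /* mirrors _541 = (_407 > 839) & (w1_416 <= 899) */
    if ((t > 839) & (w <= 899))
        return hot_path(t, w);
    return cold_path(t, w);
}

/* Threaded shape: the path with constant PHI arguments bypasses the test,
   because 0 > 839 is false regardless of the second operand. */
static int threaded(int from_105, int t_in, int w_in)
{
    if (from_105)
        return cold_path(0, 0);
    if ((t_in > 839) & (w_in <= 899))
        return hot_path(t_in, w_in);
    return cold_path(t_in, w_in);
}
```

The two functions are equivalent for all inputs; the threaded form simply never evaluates the comparison on the path where its outcome is already known.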

The case appears with phicprop1 (or rather DOM, itself missing
a followup transform with respect to folding a degenerate constant
PHI plus the followup secondary threading opportunities).  The
backwards threader doesn't exploit the above opportunity though.
Our forward threaders (like DOM) do.  Unfortunately it requires
quite a few iterations to get all opportunities exploited...
(inserting 9 DOM/phi-only-cprop pass pairs "helps")

I suggest opening a bug report for this.  Jeff may want to look at
the threading issue (I believe the backward threader _does_ iterate).

I attach a test case to reproduce the issue.

Follow-Ups:
- [Bug tree-optimization/78496] [7 Regression] Missed opportunities for jump threading
  - From: pinskia at gcc dot gnu.org
