[PATCH] PR tree-optimization/103254 - Limit depth for all GORI expressions.

Richard Biener richard.guenther@gmail.com
Fri Nov 19 09:21:49 GMT 2021

On Thu, Nov 18, 2021 at 8:30 PM Andrew MacLeod via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
> At issue here is the dynamic approach we currently use for outgoing edge
> calculations.  It isn't normally a problem, but once you get a very
> large number of possible outgoing values (ie very long unrolled blocks)
> with pairs of values on a statement, and individual queries for each
> one, it becomes exponential if they relate to each other.
> So.  We previously introduced a param for PR 100081 which limited the
> depth of logical expressions GORI would pursue because we always make 2
> or 4 queries for each logical expression, which accelerated the
> exponential behaviour much quicker.
> This patch reuses that to apply to any statement which has 2 ssa-names,
> as the potential for 2 queries can happen then as well..  leading to
> exponential behaviour.  Each statement we step back through with
> multiple ssa-names decreases the odds of calculating a useful outgoing
> range anyway.  This will remove ridiculous behaviour like this PR
> demonstrates.
> Next release I plan to revamp GORI to cache outgoing values, which will
> eliminate all this sort of behaviour, and we can remove all such
> restrictions.
> Bootstrapped on x86_64-pc-linux-gnu with no regressions.  OK for trunk?

I think it's reasonable.  SCEV tries to limit the overall number of SSA -> def
visits, capturing deep vs. wide in a more uniform (and single) way.
(--param scev-max-expr-size and the duplicate - huh - --param

For SCEV running into the limit means fatal fail of the analysis, for ranger
it only means less refinement?  So it might make a difference which path
you visit and which not (just thinking about reproducability of range analysis
when we say, swap operands of a stmt)?


> Andrew
> PS. This also points out something we are investigating with the
> threader.  There are no uses of _14 outside the block, so asking for its
> range is problematic and contributing to excess work and cache filling
> that we don't need to be doing.  The VRP passes do not demonstrate this
> behaviour unless, as observed, we ask for details output which queries
> *all* the names.

More information about the Gcc-patches mailing list