This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: fold() can't fold simple expressions?

On 09/15/2016 02:06 AM, Richard Biener wrote:
On Wed, Sep 14, 2016 at 9:49 PM, Jeff Law <> wrote:
On 09/14/2016 01:29 PM, Richard Biener wrote:

It's what match-and-simplify does as well.

I question the need to build GENERIC here though.  M-a-s happily
gets you a simplified expression as sequence of GIMPLE
statements.  (But does not yet provide a way to build a
simplified GENERIC expression from GIMPLE IL)

At a particular site we're going to have an expression where we
want to evaluate the operands (either to a constant or

For each operand in the expression we recursively raise the query
"what is this operand".  THe result (potentially another
expression) gets substituted in for the operand and the process
continues recursively. Recursion stops when you get a constant or
when you've walked back "too far" or the expression gets "too

Once you get something like (x+y)+z you're no longer gimple.  But
we need to store that expression in some form.

I'm not sure you absolutely need to store it -- it is present in the
IL (in unoptimized form) after all.
But we don't want to change the IL :-) We want to analyze the IL to determine properties of SSA_NAMEs on various paths through the CFG.

The current scheme of mapping range data to an SSA_NAME was an improvement and that range data is starting to be used in various places in the compiler. But that scheme loses extremely valuable data that warnings can and should be using.

We typically have ASSERT_EXPRs at branch points in the CFG. The ASSERT_EXPRs carry range information for a path. However we do not consider an ASSERT_EXPR a true copy. ie, after the merge point of the paths uses continue to refer to the original object (the RHS of the ASSERT_EXPRs).

Thus after VRP most of the path sensitive data is dropped on the floor.

Aldy explored making ASSERT_EXPRs true copies. It's a simple change to make and certainly gets PHIs at the merge point. So it seems like a good start.

The downside is there's a significant compile-time cost (via update_ssa) to this scheme. Furthermore, we believe that we only need path specific data for a subset of the SSA_NAMEs used in conditionals. So much of the time is wasted computing information we don't actually need.

Worse yet, copy-prop takes the copy created by the ASSERT_EXPR, propagates it to the RHS of the PHI and the PHI ends up as a degenerate and gets removed -- at which point we've lost the path specific range data again. Ugh.

If we tell copy-prop not to propagate those copies until we're done with the range data we run the real risk of making code worse and it's just a gross hack, similar how we have to test if an SSA_NAME is used in an abnormal PHI hack.

So the scheme we're exploring now is an on-demand generation of path specific range data using back substitution. It's primarily Andrew's ideas, but just happens to mirror parts of the predicate analysis one finds in tree-ssa-uninit.c and my thoughts on how we should be finding/exploiting path specific equivalences for threading and CSE.

That could be expressed as a vector of operators/operands.  It
could be expressed as a generic tree.  Whatever.  Generic trees as
a reasonably natural way to store the expression as its built, but
fold() those arbitrary expressions doesn't work.  m-a-s may be the
answer here, I haven't really thought about it yet.

The question is always full generality / complexity vs. working on a
production compiler where compile-time and memory use matter.  If you
can create a testcase, even if artificial, where we blow up worse
than N log N this is bad. [current VRP equivalences are such an

Oh, and if you are re-writing VRP range representation please get rid
of anti-ranges -- one "simple" extension of current VRP would get rid
of them by changing the current rep of a single range/anti-range to
an implicit union of two ranges (with that you can represent an
anti-range). Make the number of unioned ranges parametric and you end
up with sth quite general.
Andrew's scheme (among other things) is designed to give us that flexibility. One of the things we want to explore is how many ranges we really need in practice. It's at least 2 (so we can support anti-ranges), but we don't have a sense of how useful > 2 is in practice.

And symbolic ranges should be IMHO represented by maintaining a set
of active true/false predicates rather than trying to come up with a
"range" representation.  And for predicates we'd like to have a
mini-IL like tree-affine -- this would be generally useful elsewhere
in the compiler as well.
That would be possible as well -- essentially it's a representational issue. If you look in tree-ssa-uninit.c you'll see something like this. I'm still looking for an victim to rip tree-ssa-uninit.c apart so that its predicate analysis can be used in other places.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]