This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Serious performance regression -- some tree optimizer questions



but no pass recognizes that s2 == t2 ...

Uh, both of the value numbering passes (DOM and PRE) should recognize this.

That doesn't appear to be the case; from the .t44.pre file:


[snip]
Created value VH.17 for VH.15 + VH.16
Created value VH.18 for VH.14 + VH.17
[snip]
Created value VH.26 for VH.14 + VH.15
Created value VH.27 for VH.16 + VH.26
[snip]

It doesn't recognize that VH.18 and VH.27 are really the same value.
In fact, the comment at the head of tree-ssa-pre.c makes me think
this is by design:

  4. Our canonicalization of expressions during lookups don't take
     constants into account very well.  In particular, we don't fold
     anywhere, so we can get situations where we stupidly think
     something is a new value (a + 1 + 1 vs a + 2).  This is somewhat
     expensive to fix, but it does expose a lot more eliminations.
     It may or not be worth it, depending on how critical you
     consider PRE vs just plain GRE.

I have a patch to fix what i was talking about, but it may not be approriate for 4.0.

However I read your message wrong as to what you wanted optimized.
You are correct, this was deliberately not done.

The identities you are looking for were expensive (compile time wise) to implement, and nobody ever showed it was worth it.

Basically, when asked to add or lookup a value expression to the value numbering table, you can also backsubstitute the representative values (we know, given a value handle, the set of expressions that represent it) and add them to /look them up in the value numbering table as well.

IE we make VH.18 be VH.14 + VH.17 and VH.14 + VH.15 + VH.16
and VH.27 be VH.16 + VH.26 and VH.16 + VH.14 + VH.15.

This is easy to implement for the simple case, but it can cost a *lot* to fully backsubstitute and add all of the equivalences to the table.

I've attached a patch that will do the optimization you request.

The masturbation in "put_operands_in_right_order" is because both iterative_hash_expr, *and* operand_equal_p claim VH.0 + VH.1 + VH.3 is not the same as VH.1 + VH.0 + VH.3 because of the form of the expressions.

Note that this patch is just an example.
It would need a bunch of grunt work to be right and useful
i.e.

1. if tree-vn is going to use value_node, we should have a tree-vn.h that defines it instead of just copying an dpasting the definitions from tree-ssa-pre
2. We need to minimize the number of trees we build. For commutative operations, we should be able to just swap all the trees into the right places without building a new tree.


In the testident.c case, one of canonicalized forms is them is
<PLUS_EXPR <op0, <PLUS_EXPR, op1, op2>> and the other is
<PLUS_EXPR <PLUS_EXPR <op0, op1>>, op2>, so you actually need to move the PLUS_EXPR to operand 0, and change its operations.


3. All of the restrictions in backsubstitute were for convenience of demonstration (IE that the tree code of the thing we are backsubstituting has the same tree code of the base expression we are backsubstituting into. This restriction).

4. other obvious things.

It should, however, work for your testcase.

It works for the attached testcase, which is the same type of thing.


--Dan

Attachment: vnident.diff
Description: Text document

Attachment: testident.c
Description: Text document


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]