This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Serious performance regression -- some tree optimizer questions
- From: Daniel Berlin <dberlin at dberlin dot org>
- To: Ulrich Weigand <uweigand at de dot ibm dot com>
- Cc: gcc at gcc dot gnu dot org, gcc-patches at gcc dot gnu dot org
- Date: Fri, 17 Dec 2004 20:10:38 -0500 (EST)
- Subject: Re: Serious performance regression -- some tree optimizer questions
- References: <200412172249.iBHMngZH010315@53v30g16.boeblingen.de.ibm.com>
but no pass recognizes that s2 == t2 ...
Uh, both of the value numbering passes (DOM and PRE) should recognize this.
That doesn't appear to be the case; from the .t44.pre file:
[snip]
Created value VH.17 for VH.15 + VH.16
Created value VH.18 for VH.14 + VH.17
[snip]
Created value VH.26 for VH.14 + VH.15
Created value VH.27 for VH.16 + VH.26
[snip]
It doesn't recognize that VH.18 and VH.27 are really the same value.
In fact, the comment at the head of tree-ssa-pre.c makes me think
this is by design:
4. Our canonicalization of expressions during lookups don't take
constants into account very well. In particular, we don't fold
anywhere, so we can get situations where we stupidly think
something is a new value (a + 1 + 1 vs a + 2). This is somewhat
expensive to fix, but it does expose a lot more eliminations.
It may or not be worth it, depending on how critical you
consider PRE vs just plain GRE.
I have a patch to fix what i was talking about, but it may not be
approriate for 4.0.
However I read your message wrong as to what you wanted optimized.
You are correct, this was deliberately not done.
The identities you are looking for were expensive (compile time wise) to
implement, and nobody ever showed it was worth it.
Basically, when asked to add or lookup a value expression to the value
numbering table, you can also backsubstitute the representative values (we
know, given a value handle, the set of expressions that represent it) and add
them to /look them up in the value numbering table as well.
IE we make VH.18 be VH.14 + VH.17 and VH.14 + VH.15 + VH.16
and VH.27 be VH.16 + VH.26 and VH.16 + VH.14 + VH.15.
This is easy to implement for the simple case, but it can cost a *lot* to
fully backsubstitute and add all of the equivalences to the table.
I've attached a patch that will do the optimization you request.
The masturbation in "put_operands_in_right_order" is because both
iterative_hash_expr, *and* operand_equal_p claim VH.0 + VH.1 + VH.3 is not
the same as VH.1 + VH.0 + VH.3 because of the form of the expressions.
Note that this patch is just an example.
It would need a bunch of grunt work to be right and useful
i.e.
1. if tree-vn is going to use value_node, we should have a tree-vn.h that
defines it instead of just copying an dpasting the definitions from
tree-ssa-pre
2. We need to minimize the number of trees we build. For commutative
operations, we should be able to just swap all the trees into the right
places without building a new tree.
In the testident.c case, one of canonicalized forms is them is
<PLUS_EXPR <op0, <PLUS_EXPR, op1, op2>> and the other is
<PLUS_EXPR <PLUS_EXPR <op0, op1>>, op2>, so you actually need to move the
PLUS_EXPR to operand 0, and change its operations.
3. All of the restrictions in backsubstitute were for convenience of
demonstration (IE that the tree code of the thing we are backsubstituting
has the same tree code of the base expression we are backsubstituting
into. This restriction).
4. other obvious things.
It should, however, work for your testcase.
It works for the attached testcase, which is the same type of thing.
--DanAttachment:
vnident.diff
Description: Text document
Attachment:
testident.c
Description: Text document