Created attachment 22452
The problem here seems to be worse regalloc, and also that
# D.4060_6 = PHI <-1(2), -1(9), -1(11), -1(14), 0(15), -1(10)>
used to be optimized into a single set of the variable to -1 (4 bytes), while now we
produce 3 different copies.
Crossjumping would unify it, but very late in the game. The problem is that
ifcvt actually moves the set before the conditional guarding the BB in question, so
the individual sets drift earlier to different places in the program.
Doing so might also complicate regalloc.
Michael, perhaps we can tell out-of-ssa to unify such cases? They are not that
infrequent (and I think old tree based out-of-ssa did that?)
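To make the PHI shape above concrete, here is a minimal C sketch of source that is likely to produce it. The function name and conditions are hypothetical and are not taken from the attached testcase; the only point is that several distinct paths assign the same constant to one variable.

```c
/* Hypothetical illustration, not the attached testcase.
   Three paths assign -1 to r and one path assigns 0, so the merge
   point is expected to carry a PHI with several identical -1
   arguments, roughly PHI <-1, -1, -1, 0>. */
int status_of(int a, int b, int c)
{
    int r;

    if (a < 0)
        r = -1;        /* first copy of the constant */
    else if (b < 0)
        r = -1;        /* second copy */
    else if (c < 0)
        r = -1;        /* third copy */
    else
        r = 0;

    return r;
}
```

If each PHI argument is expanded as its own copy on its incoming edge, every failing path ends up with a separate set of -1, which is the duplication described above.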
Yep. That's one optimization that was removed (out-of-SSA did that), and
we thought of doing this reverse mergephi optimization as a separate pass.
# _6 = PHI <-1(17), 0(15), -1(10)>
It is not as bad on the trunk now, but I don't think it has been fixed.
I am going to try to work on this so it can be in for stage 1 of GCC 7.
PRE does some of it via tail merge:
find_duplicates: <bb 7> duplicate of <bb 8>
find_duplicates: <bb 7> duplicate of <bb 9>
Removing basic block 8
Removing basic block 9
Obviously if you have more complex code it won't do it.
This is an example where PRE does it:
int f(int a, int b, int c)
The idea is to add forwarder blocks here. Of course, doing this too aggressively may be bad (extra jumps instead of extra copies); I am not sure. Eventually the
targets will want some control over this.
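As a rough source-level picture of the forwarder block idea, the sketch below writes the transformation by hand with goto. Both functions and the label names are hypothetical, and this only mimics at the C level what a pass would do on the CFG; it is not how GCC implements anything.

```c
/* Hypothetical illustration; names are not from the bug report. */

/* Before: each failing path carries its own copy of "r = -1". */
int check_copies(int a, int b)
{
    int r;

    if (a < 0)
        r = -1;
    else if (b < 0)
        r = -1;
    else
        r = 0;
    return r;
}

/* After: the failing paths are routed through one shared block,
   so the constant is set once. The cost is an extra jump on
   those paths instead of an extra copy. */
int check_forwarder(int a, int b)
{
    int r;

    if (a < 0)
        goto set_fail;
    if (b < 0)
        goto set_fail;
    r = 0;
    goto done;

set_fail:              /* plays the role of the forwarder block */
    r = -1;
done:
    return r;
}
```

Both versions return the same values; whether the shared block is a win depends on how the target weighs the additional branch against the duplicated constant load.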
GCC 7.1 has been released.