Created attachment 22452
The problem here seems to be worse regalloc, and also that
# D.4060_6 = PHI <-1(2), -1(9), -1(11), -1(14), 0(15), -1(10)>
used to be optimized into a single set of var to -1 (4 bytes), while now we
produce 3 different copies.
Crossjumping would unify them, but only very late in the game. The problem is
that ifcvt actually moves the set before the conditional guarding the BB in
question, so the individual sets drift earlier, to different places in the
program. Doing so might also complicate regalloc.
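
To make the unification idea concrete, here is a rough sketch in Python (not GCC code; the function names and the (value, predecessor) list encoding are made up for illustration). It compares emitting one set per incoming PHI edge with grouping the edges that carry the same constant, so each distinct constant needs only one set in a shared block:

```python
from collections import defaultdict

def per_edge_sets(phi_args):
    """Naive lowering: one 'var = value' set on every incoming edge."""
    return [(pred, value) for value, pred in phi_args]

def unified_sets(phi_args):
    """Group predecessor blocks by the constant they feed into the PHI.

    Each key would need a single 'var = value' set, placed in a new block
    that all listed predecessors jump through (hypothetical placement).
    """
    groups = defaultdict(list)
    for value, pred in phi_args:
        groups[value].append(pred)
    return dict(groups)

# D.4060_6 = PHI <-1(2), -1(9), -1(11), -1(14), 0(15), -1(10)>
phi = [(-1, 2), (-1, 9), (-1, 11), (-1, 14), (0, 15), (-1, 10)]

naive = per_edge_sets(phi)
unified = unified_sets(phi)
print(len(naive), "sets when emitted per edge")
print(len(unified), "sets when equal constants are unified:", unified)
```

In this toy model the five edges carrying -1 share one set, which is the effect being asked for here; where the shared block would go and what it costs in extra jumps is left out.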
Michael, perhaps we can tell out-of-ssa to unify such cases? They are not that
infrequent (and I think the old tree-based out-of-ssa did that?)
Yep. That's one optimization that was removed (out-of-SSA did that) and
we thought of doing this reverse mergephi optimization as a separate pass