This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH][RFC] Add FRE in pass_vectorize


On 06/10/2015 08:02 AM, Richard Biener wrote:

> The following patch adds FRE after vectorization which is needed
> for IVOPTs to remove redundant PHI nodes (well, I'm testing a
> patch for FRE that will do it already there).

Redundant or degenerates which should be propagated?

I believe Alan Lawrence has run into similar issues (unpropagated degenerates) with his changes to make loop header copying more aggressive. Threading will also create them. The phi-only propagator may be the solution. It ought to be cheaper than FRE.
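
To make "degenerate" concrete, here is a minimal toy sketch of PHI-only propagation. It is not GCC's implementation: the dictionary representation, the SSA names, and the helper names `resolve` and `propagate_degenerate_phis` are all assumptions made for illustration. A PHI counts as degenerate here when all of its arguments, ignoring references to its own result, are the same value.

```python
def resolve(name, replacements):
    """Follow the replacement chain until we reach a name that was not replaced."""
    while name in replacements:
        name = replacements[name]
    return name


def propagate_degenerate_phis(phis):
    """Remove degenerate PHIs from a toy SSA description.

    phis maps a PHI result name to the list of its argument names.
    Returns (remaining_phis, replacements), where replacements maps each
    removed PHI result to the value that should replace its uses.
    """
    phis = {result: list(args) for result, args in phis.items()}
    replacements = {}
    changed = True
    while changed:  # removing one PHI can make another one degenerate
        changed = False
        for result in list(phis):
            args = {resolve(arg, replacements) for arg in phis[result]}
            args.discard(result)  # a self-reference via a back edge does not count
            if len(args) == 1:
                replacements[result] = args.pop()
                del phis[result]
                changed = True
    remaining = {
        result: [resolve(arg, replacements) for arg in args]
        for result, args in phis.items()
    }
    final = {name: resolve(name, replacements) for name in replacements}
    return remaining, final


# Hypothetical input: x_3 is degenerate, y_4 becomes degenerate once x_3 is gone.
example = {
    "x_3": ["a_1", "a_1"],
    "y_4": ["x_3", "a_1"],
    "z_5": ["a_1", "b_2"],
}
remaining, replaced = propagate_degenerate_phis(example)
```

The point of the sketch is only that this cleanup looks at PHIs and nothing else, which is why it can be expected to cost less than a full value-numbering pass.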



> The patch also makes FRE preserve loop-closed SSA form and thus
> make it suitable for use in the loop pipeline.

Loop optimizations will tend to create opportunities for redundancy elimination, so the ability to use FRE in the loop pipeline seems like a good thing. We ran into this in RTL land, so I'm not surprised to see it occurring in the gimple optimizers and thus I'm not opposed to running FRE in the loop pipeline.




> With the placement in the vectorizer sub-pass FRE will effectively
> be enabled by -O3 only (well, or if one requests loop vectorization).
> I've considered placing it after complete_unroll instead but that
> would enable it at -O1 already.  I have no strong opinion on the
> exact placement, but it should help all passes between vectorizing
> and ivopts for vectorized loops.

For -O3/vectorization it seems like a no-brainer. -O1 less so. IIRC we conditionalize -frerun-cse-after-loop on -O2 which seems more appropriate than doing it with -O1.


> Any other suggestions on pass placement?  I can of course key
> that FRE run on -O3 explicitly.  Not sure if we at this point
> want to start playing fancy games like setting a property
> when a pass (likely) generated redundancies that are worth
> fixing up and then key FRE on that one (it gets harder and
> less predictable what transforms are run on code).

RTL CSE is bloody expensive and so many times I wanted the ability to know a bit about what the loop optimizer had done (or not done) so that I could conditionally skip the second CSE pass. We never built that, but it's something I've wanted for decades.
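
As a rough illustration of the property idea, the following toy sketch gates a cleanup pass on a flag set by an earlier pass. Nothing here is GCC code: the property name `MAY_HAVE_REDUNDANCIES`, the dictionary standing in for a function, and the three pass functions are hypothetical and only borrow their names from this thread.

```python
# Toy model only: pass names mirror the thread, but nothing here is GCC code.
MAY_HAVE_REDUNDANCIES = "may_have_redundancies"  # hypothetical property name


def vectorize(fn):
    """Pretend vectorizer: flags the function when it transforms a loop."""
    if fn["vectorizable_loops"] > 0:
        fn["props"].add(MAY_HAVE_REDUNDANCIES)


def fre(fn):
    """Pretend cleanup pass: consumes the property once it has run."""
    fn["props"].discard(MAY_HAVE_REDUNDANCIES)


def ivopts(fn):
    """Pretend IVOPTs: does nothing in this model."""


def fre_gate(fn):
    """Run the cleanup only if an earlier pass asked for it."""
    return MAY_HAVE_REDUNDANCIES in fn["props"]


# Each entry is (name, pass function, optional gate).
PIPELINE = [
    ("vect", vectorize, None),
    ("fre", fre, fre_gate),
    ("ivopts", ivopts, None),
]


def run_pipeline(fn):
    """Run every pass whose gate allows it and return the names that ran."""
    executed = []
    for name, run, gate in PIPELINE:
        if gate is None or gate(fn):
            run(fn)
            executed.append(name)
    return executed


def make_function(vectorizable_loops):
    """Build the toy function record used by the passes above."""
    return {"vectorizable_loops": vectorizable_loops, "props": set()}
```

In this model `run_pipeline(make_function(1))` runs the cleanup and `run_pipeline(make_function(0))` skips it. That is also the downside Richard points out: the set of transforms applied now depends on what earlier passes happened to do.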

Jeff

