[PATCH][RFC] Add FRE in pass_vectorize
Jeff Law
law@redhat.com
Thu Jul 2 17:52:00 GMT 2015
On 07/02/2015 05:40 AM, Alan Lawrence wrote:
> Jeff Law wrote:
>> On 06/24/2015 01:59 AM, Richard Biener wrote:
>>> And then there is the possibility of making passes generate less
>>> need for cleanups after them - like in the present case
>>> with the redundant IVs, making them more apparently redundant by
>>> CSEing the initial value and step during vectorizer code generation.
>>> I'm playing with the idea of adding a simple CSE machinery to
>>> the gimple_build () interface (aka match-and-simplify). It
>>> eventually invokes (well, not currently, but that can be fixed)
>>> maybe_push_res_to_seq which is a good place to maintain a
>>> table of already generated expressions. That of course only
>>> works if you either always append to the same sequence or at least
>>> insert at the same place.
>> As you know we've gone back and forth on this in the past. It's
>> always a trade-off. I still ponder from time to time putting the
>> simple CSE and cprop bits back into the SSA rewriting phase to avoid
>> generating all kinds of garbage that just needs to be cleaned up later
>> -- particularly for incremental SSA updates.
>
> Coming to this rather late, and without the background knowledge about
> having gone back and forth, sorry! But what are the arguments against
> this? Am I right in thinking that the "SSA Rewriting" phase would not
> trigger as often as gimple_build(), or are these the same thing?
It's the into-ssa and incremental update phases. The basic idea is it
is very inexpensive to do const/copy propagation and simple CSE at that
point.
When processing an assignment, after rewriting the inputs from _DECL
nodes to SSA_NAMEs, you look up the RHS in your hash table. If you get a
hit, you replace the expression with the SSA_NAME from the hash table
and record that the destination has an equivalence.
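
As a rough illustration of that lookup, here is a minimal sketch in Python
over a made-up tuple IR. It is not GCC's into-ssa code: the statement
format, the function name rewrite_into_ssa, and the "name_N" versioning
scheme are all assumptions made for the example, and it only handles
straight-line code.

```python
# Hypothetical mini-IR: each statement is (dest, op, arg1, arg2), where
# dest/arg1/arg2 are plain variable names. Nothing here is a GCC API.

def rewrite_into_ssa(stmts):
    current = {}    # variable -> SSA name currently holding its value
    available = {}  # (op, ssa_arg1, ssa_arg2) -> SSA name holding that value
    version = {}    # variable -> last version number handed out
    out = []
    for dest, op, a, b in stmts:
        # Rewrite the inputs from variables to SSA names first.
        # Names never assigned here (e.g. parameters) are left as-is.
        a, b = current.get(a, a), current.get(b, b)
        key = (op, a, b)
        hit = available.get(key)
        if hit is not None:
            # Same RHS seen before: record that dest is equivalent to the
            # earlier SSA name and emit no new statement.
            current[dest] = hit
            continue
        # Miss: create a fresh SSA name and remember the expression.
        version[dest] = version.get(dest, 0) + 1
        name = f"{dest}_{version[dest]}"
        current[dest] = name
        available[key] = name
        out.append((name, op, a, b))
    return out


example = [
    ("x", "+", "a", "b"),
    ("y", "+", "a", "b"),  # redundant with x
    ("z", "*", "x", "y"),  # both operands become x_1
]
result = rewrite_into_ssa(example)
```

Because the operands are rewritten before the lookup, a later redefinition
of "a" yields a different key, so stale entries are never matched.
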
Diego took this out several years ago with the idea that the into-ssa &
updates should be kept separate from optimizations. With the ongoing
need for early cleanups to make IPA more effective, I think it's time to
revisit that decision as we get a lot of the obvious redundancies out of
the stream by just being smart during into-ssa. Which in turn means we
don't have to do as much in the early optimizations before IPA.
>
> Presumably when you say "simple CSE machinery" you'd have to bail out
> quickly from tricky cases like, say:
>
> if (P)
> {
> use ...expr...
> }
> ...
> if (Q)
> {
> now building a new ...expr... here
> }
Not sure I see the problem here. The simple CSE/cprop occurs as we're going
into SSA form -- because into-ssa is inherently a dominator walk and
we're rewriting operands as we go, we can trivially determine that we've
already seen a given expression earlier in the dominator tree and that
the result of that expression hasn't changed (by the nature of SSA).
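
To make the if (P) / if (Q) case concrete, here is a small Python sketch
of a dominator-tree walk with a scoped expression table. It is only an
illustration under stated assumptions: the block names, the tuple
statement format, and the function cse_dominator_walk are hypothetical,
the statements are assumed to be in SSA form already, and PHI nodes are
not modeled.

```python
def cse_dominator_walk(block, dom_children, stmts_in, available, replaced):
    """Visit `block` and its dominator-tree children, recording redundant
    statements in `replaced` (dest -> equivalent earlier SSA name)."""
    added = []
    for dest, op, a, b in stmts_in.get(block, []):
        key = (op, a, b)
        if key in available:
            # Computed in this block or in a dominating block: reuse it.
            replaced[dest] = available[key]
        else:
            available[key] = dest
            added.append(key)
    for child in dom_children.get(block, []):
        cse_dominator_walk(child, dom_children, stmts_in, available, replaced)
    # Leaving this subtree: its expressions are not available to siblings.
    for key in added:
        del available[key]


# CFG: entry -> then_P -> join -> then_Q -> exit, with the usual
# fall-through edges entry -> join and join -> exit.
dom_children = {"entry": ["then_P", "join"], "join": ["then_Q", "exit"]}
stmts_in = {
    "entry":  [("t1", "+", "a", "b")],
    "then_P": [("t2", "*", "a", "b")],
    "then_Q": [("t3", "*", "a", "b"),   # then_P does not dominate: kept
               ("t4", "+", "a", "b")],  # entry dominates: same as t1
}
replaced = {}
cse_dominator_walk("entry", dom_children, stmts_in, {}, replaced)
```

Here t3 is left alone because the block under if (P) does not dominate
the block under if (Q), while t4 is found equivalent to t1 from the
entry block.
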
Jeff