This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH]: Fix PR tree-optimization/27755, add partial anticipation


This is the remainder of the non-VN PRE changes i had planned for stage1.

It turns out that the thesis algorithm missed something very important
about ANTIC calclulation that all other PRE's don't.  In particular,
because ANTIC is an intersection problem, it needs to be initialized
to the maximal set of values, rather than nothing, or else, it won't
get the maximal fixed point possible.

This is what causes the failure to move a value in PR 27755, we end up
not noticing a value can be ANTIC.

To fix this flaw, we need to keep a set containing the maximal set of
expressions that could be ANTIC, and use it to initialize the ANTIC
problem.

This maximal set often contains a very large number of values.
hundreds on small functions, thousands on medium size functions.

Phi translation of this set often takes a very long time.  To avoid
phi translation of the maximal set, we end up deferring blocks during
dataflow calculation until their successors have been processed (in
effect, we are changing the block processing order slightly), which
means we will never have to phi translate the maximal set.  The
successor set has already been cleaned up impossible values, and thus,
is usually an order of magnitude or more smaller.

To give you some idea of how much this matters, on tramp3d-v4, without
doing this, PRE takes 26 seconds (without this patch at all, it takes
11 seconds).  With deferring, we get 11 seconds (IE no time at all).

Besides this algorithmic fix, i've also added support for partial
anticipation.  Partial anticipation is what happens when the value can
be made live on some paths leading to a block, instead of all paths
leading to a block.  Supporting this is just an optimality issue, and
because it requires calculation of another dataflow problem (IE slows
down PRE by about 20-30%), i've made it only enabled at -O3.  This is
not really a knob that is worth adding, the amount of partial
anticipation related redundancies that occurs in real programs is
small (<10%).  I've yet to find a case where it truly matters, which
is why it is relegated to -O3 :).  It is more for completeness than
anything else.

Bootstrapped and regtested on i686-darwin

Committed to mainline.

Attachment: 27755.diff.txt
Description: Text document


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]