[RFC] Improve Tree-SSA if-conversion - convergence of efforts

trevor_smigiel@playstation.sony.com trevor_smigiel@playstation.sony.com
Wed Sep 12 22:10:00 GMT 2007


Tehila asked me a while ago to comment based on my experience with the
RTL if convert pass and the discussions some of us had at the GCC
summit.  Sorry it took me so long to respond.

The target I care about (Cell SPU) has some things that make an
aggressive if convert very useful and profitable.  It has conditional
moves for every mode (there is a single, unified register file), never
traps on illegal addresses (addresses always wrap to the 256KB address
space), and branches are expensive (there is no hardware cache).

The main limitation with the RTL if-convert pass is that it only
recognizes specific patterns.  It is easy to write a complicated if
statement (just using normal C with &&/||) that would never get
converted because it ends up with basic blocks that have many in edges
that if-convert generally doesn't handle.   (In our internal tree we
modified the RTL pass to handle some cases of multiple in-edges, and can
handle any number of insns in a basic block.)

I haven't looked at the tree-SSA if-convert code yet, but based on what
was described to me at the summit it seemed to be taking the same
approach as the RTL pass.  Recognize certain patterns and convert it.

I would like to see an approach that is able to take an arbitrary flow
graph without backedges and convert it to straight line code, limited
only by the cost model and impossible cases (e.g., inline asm).  

I'm not sure how that would be achieved in a target neutral way.

Trevor


* Tehila Meyzels <TEHILA@il.ibm.com> [2007-07-31 06:50]:
> 
> Hi,
> 
> I'd like to bring up on the list a discussion that a bunch of people (most
> of those CC-ed above) started at the GCC Summit:
> 
> Lately, there were few efforts, that are not necessarily related to each
> other, but are all relevant to if-conversion.
> Each of them has its own restriction, like a specific control-flow, target
> dependent information, permission to transform speculative loads, etc.
> 
> Few patches that I'm aware of are:
> 1.  Conditional store sinking, by Michael Matz:
> http://gcc.gnu.org/ml/gcc-patches/2007-04/msg00724.html
> 
> 2. If -conversion for multiple IF_THEN_ELSE clauses, by Victor Kaplansky:
> http://gcc.gnu.org/ml/gcc-patches/2006-05/msg00265.html
> Also mentioned here:  http://gcc.gnu.org/wiki/AutovectBranchOptimizations
> (2.3.3)
> 
> 3.  (unconditional) Store sinking (4.1.1 based), by Revital Eres and Victor
> Kaplansky:
> http://gcc.gnu.org/ml/gcc-patches/2006-05/msg00265.html (same patch as
> previous)
> Also mentioned here:  http://gcc.gnu.org/wiki/AutovectBranchOptimizations
> (2.3.2)
> 
> 4. Conditional load hoisting (4.1.1 based), by myself:
> http://gcc.gnu.org/ml/gcc-patches/2007-07/msg02168.html
> 
> 5. Maybe more?
> 
> You're welcome to share your/others related works here...
> 
> 
> I'd like to suggest to converge all these efforts into a single improved
> tree-level if-conversion pass (i.e., on the top of tree-if-conv.c).
> Currently, the tree-level if-conversion pass is quite limited in several
> ways, and mostly with respect to handling of loads/stores (it basically
> doesn't handle them), but not only.
> 
> There are several reasons why to store-sinking and load-hoisting should be
> combined with the if-conversion pass:
> 1. Store-sinking/load hoisting effect one another and they both can create
> new opportunities for if-conversion (not only in vectorizable loops, for
> example).
>     Currently, load-store motion pass happens too late and thus don't help
> the (tree-ssa) if-converter.
> 2. Store-sinking/load hoisting may have an overhead and may degrade
> performance unless the relevant conditional branch gets if-converted.
> 
> Issues/Questions to be considered and discussed:
> 1. Cost model and machine dependency issues:
>     - When is it profitable to perform these motions? What is the algorithm
> to decide whether there is a good chance for if-conversion?
>     - Target dependency - What to check?
>       A. Are there scalar select/cmove/predicated instructions  (like in
> SPU)?
>       B. Are there vector select/cmove/predicated instructions (like in
> PowerPC)? + will  the loop be vectorized?
>       C. Are speculative loads allowed? Do memory accesses trap?
>       D. More?
> 
> 2. Which transformations we want to take care of in this pass?
>    A. Conditional/unconditional loads/stores.
>    B. PHI nodes with operands that are neither constants nor SSA NAMES
> (Currently, this is not supported in tree-if-conv.c).
>    C. PHIOPT transformations (i.e., merge the PHIOPT pass into this pass
> maybe)?
>    D More?
> 3. New control-flow graphs we want to support (besides the regular
> IF_THEN_ELSE, diamond-based):
>     A. Nested diamonds.
>     B. Sequential diamonds.
>     C. More?
> 4. After we complete this pass, will the RTL-level ifcvt be needed?
>     I guess the answer is yes, but I would like to hear more opinions.
> 
> Any comments/ideas/thoughts are really appreciated.
> 
> Thanks,
> Tehila.
> 
> 



More information about the Gcc-patches mailing list