[PATCH] Don't lower MIN_EXPR and MAX_EXPR [was Re: GCC Benchmarks (coybench), AMD64 and i686, 14 August 2004]

Daniel Berlin dberlin@dberlin.org
Thu Aug 19 14:40:00 GMT 2004


On Aug 19, 2004, at 10:11 AM, Paolo Bonzini wrote:

>> I've no problem with the lowering or not of MIN_EXPR, MAX_EXPR and
>> ABS_EXPR, but it seems to me that the more fundamental problem is
>> that we're failing to CSE the phi nodes themselves.
>
> Yes.  I know this is only a work around.
>
>> For example, it would be nice to optimize
>> 	x = a ? b : c;
>> 	y = a ? b : c;
>> into the equivalent
>> 	x = a ? b : c;
>> 	y = x;
>
> Not only nice -- I think I would have expected it even before 
> tree-SSA...

Doesn't GVN-PRE catch this?

Oh, wait, i never value numbered the phi node arguments as the phi node 
result, and didn't make eliminate look up phi nodes, so it won't catch 
it :).
But i could make it do so in about 2 hours if you guys want it.
:)



>
>> Alas I'm not a tree-ssa expert but the above transformations would
>> appear doable with GCC's current infrastructure.
>
> I don't know.  I'd expect that PRE/FRE do this, but these things are 
> not trivial with SSA because the basic blocks in the first phi node do 
> not dominate those in the second phi node.

Right.


>   As far as I could understand, SSA makes many optimizations easier, 
> but they are still going to require solving data flow equations, etc., 
> unless only follow dominator->dominated edges.
>
> For example, in this case
>
>    if (a1 < 0)
>      z1 = a1 + b1;
>    else
>      z2 = b1;
>    z3 = PHI (z1, z2)
>
>    if (a1 < 0)
>      w1 = a1 + b1;
>    else
>      w2 = a1;
>    w3 = PHI (w1, w2)
>
> You cannot simply transform "w1 = z1" in the assignment for w1 because 
> this does not respect SSA's dominance property (all defs must dominate 
> their uses).
>

This is a partial partial case, i think, (a1 + b1 is partially 
available, partially anticipated), which we could handle in PRE, but 
it's very expensive, and benchmarks I've done (and that Thomas Van 
Drujen had done during his thesis work on GVN-PRE) show its not worth 
it.
For example, it triggered a total of three times during bootstrap (and 
0 times during SPEC) when I had it implemented in GVN-PRE, but made 
GVN-PRE 2x slower overall (in the general case).
That's pretty much within my definition of "not worth it".
:)


> mole's testcase is simpler but, in concrete terms, it looks like SSA 
> excels at handling complex nests of if's and while's more than at 
> optimizing adjacent ones.  Again, I'm not sure if this is right, but 
> given this particular optimizer deficiency, it seems plausible.
>
> Paolo
>



More information about the Gcc-patches mailing list