0005-Search-all-dominated-blocks-for-expressions-to-hoist.patch

Fri Jul 2 16:08:00 GMT 2010

On 7/1/10 8:36 PM, Jeff Law wrote:
> On 06/30/10 13:30, Maxim Kuvyrkov wrote:
>>>
>>> It looks like you've got a bi-directional dataflow problem to clean up
>>> VBEout, in general we really want to avoid bi-directional problems. Is
>>> there some reason you're not using a lowest common ancestor algorithm
>>> here?
>>
>>
>> A dataflow problem seems simpler for this case. The problem uses
>> bi-directional links to compute a set of expressions that will be
>> subtracted from VBEout on each iteration, it never adds new
>> expressions to destination sets. I, therefore, claim that the fact
>> that this particular problem uses bi-directional links does not really
>> have any significant negative impact.
> While this specific case may be reasonably safe, in general we really
> want to avoid bi-directional dataflow solvers as it's often hard to
> prove termination, it's hard to evaluate their compile-time performance,
> they're typically more difficult to understand for anyone reading the
> code, and (personal opinion here) often they're a symptom of not solving
> the right problem.
>
>> Interesting. While working on implementing cleanup of VBEout sets I
>> oscillated several times between bottom-up dominator tree walk and
>> iterative walk. At some point I got myself convinced that dominator
>> walk would do just as fine the job as iterative walk. Then I come up
>> with the above case which needs more than a single walk to get to the
>> right solution. Now you pointed out that the above case is invalid,
>> which makes me think that a dominator walk would suffice after all.
>>
>> Still, the iterative solution looks better to me as it makes it
>> crystal clear that only expressions that definitely won't be hoisted
>> to BB will be removed from BB's VBEout.
>>
>> Does this make sense?
> A little bit. But it's still unclear why we're not using lowest common
> ancestor here.

It appears we were thinking about different approaches to using LCA to 
solve this problem, the algorithm I thought you were suggesting would've 
been bulky and slow.

Now I see that the problem can be solved reasonably fast with LCA too. 
I don't yet have all the details figured out, so if you have a clear 
picture of the algorithm in mind, please don't hold it to yourself ;)

> 1. Why precisely to do we need transpout for hoisting. We should only be
> hoisting a very busy expression into a block which dominates the
> original evaluations. We don't insert on edges, so we don't have to
> worry about splitting abnormals. The only thing I can think of is
> perhaps the hoisting might require new edges in the destination block in
> the expression potentially traps?!?

I can't give a definitive answer to if and why hoisting needs transpout. 
  It seems to me transpout can be safely removed if we just avoid 
propagating data across complex edges in compute_vbeinout and 
hoist_expr_reaches_here_p.

That said, I would not check in such a clean up in the same patch as 
improving code hoisting to look into non-immediately-dominated blocks. 
Let's keep bugs these two changes can introduce separate.

>
> 2. Assuming there's a good reason for #1, for correctness we should drop
> transpout and instead use the same method as PRE.

I don't think there is.  Anyway, we will find out once I or someone else 
implement removal of transpout.

Jeff, do you remember why transpout sets were introduced in the first place?

Thanks,

-- 
Maxim Kuvyrkov
CodeSourcery
maxim@codesourcery.com
(650) 331-3385 x724