This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: A case exposing code sink issue


On Thu, Dec 29, 2011 at 9:50 AM, Jiangning Liu <jiangning.liu@arm.com> wrote:
>
>
>> -----Original Message-----
>> From: Jiangning Liu
>> Sent: Wednesday, December 28, 2011 5:38 PM
>> To: Jiangning Liu; 'Richard Guenther'
>> Cc: Michael Matz; gcc@gcc.gnu.org
>> Subject: RE: A case exposing code sink issue
>>
>>
>>
>> > -----Original Message-----
>> > From: gcc-owner@gcc.gnu.org [mailto:gcc-owner@gcc.gnu.org] On Behalf
>> Of
>> > Jiangning Liu
>> > Sent: Tuesday, December 27, 2011 5:10 PM
>> > To: 'Richard Guenther'
>> > Cc: Michael Matz; gcc@gcc.gnu.org
>> > Subject: RE: A case exposing code sink issue
>> >
>> > >
>> > > The job to do this is final value replacement, not sinking (we do
>> not
>> > > sink non-invariant expressions - you'd have to translate them
>> through
>> > > the loop-closed SSA exit PHI node, certainly doable, patches
>> > > welcome ;)).
>> > >
>> >
>> > Richard,
>> >
>> > In final value replacement, expression "&a + D.xxxx" can be figured
>> out,
>> > while "&a[i_xxx]" failed to be CHRECed, so I'm wondering if we should
>> > lower
>> > &a[i_xxx] to "&a + unitsize(a) * i_xxx" first? It seems GCC intends
>> to
>> > keep
>> > &a[i_xxx] until cfgexpand pass. Or we have to directly modify CHREC
>> > algorithm to get it calculated?
>> >
>> > Appreciate your kindly help in advance!
>> >
>>
>> Richard,
>>
>> Now I have a patch working for the case of step "i++", by directly
>> modifying
>> scalar evolution algorithm. the following code would be generated after
>> SCCP,
>> l
>> ? # i_13 = PHI <i_6(7), k_2(D)(4)>
>> ? a_p.0_4 = &a[i_13];
>> ? MEM[(int *)&a][i_13] = 100;
>> ? i_6 = i_13 + 1;
>> ? if (i_6 <= 999)
>> ? ? goto <bb 7>;
>> ? else
>> ? ? goto <bb 6>;
>>
>> <bb 6>:
>> ? a_p_lsm.5_11 = &MEM[(void *)&a + 3996B];
>> ? a_p = a_p_lsm.5_11;
>> ? goto <bb 3>;
>>
>> It looks good, but I still have problem when the case has step "i+=k".
>>
>> For this case the value of variable i exiting loop isn't invariant, the
>> algorithm below in scalar evolution doesn't work on it,
>>
>> compute_overall_effect_of_inner_loop()
>> {
>> ? ...
>> ? ? ? ? tree nb_iter = number_of_latch_executions (inner_loop);
>>
>> ? ? ? ? if (nb_iter == chrec_dont_know)
>> ? ? ? ? ? return chrec_dont_know;
>> ? ? ? ? else
>> ? ? ? ? ? {
>> ? ? ? ? ? ? tree res;
>>
>> ? ? ? ? ? ? /* evolution_fn is the evolution function in LOOP. ?Get
>> ? ? ? ? ? ? ? ?its value in the nb_iter-th iteration. ?*/
>> ? ? ? ? ? ? res = chrec_apply (inner_loop->num, evolution_fn, nb_iter);
>>
>> ? ? ? ? ? ? if (chrec_contains_symbols_defined_in_loop (res, loop->num))
>> ? ? ? ? ? ? ? res = instantiate_parameters (loop, res);
>>
>> ? ? ? ? ? ? /* Continue the computation until ending on a parent of LOOP.
>> */
>> ? ? ? ? ? ? return compute_overall_effect_of_inner_loop (loop, res);
>> ? ? ? ? ? }
>> }
>>
>> In theory, we can still have the transformation like below even if the
>> step
>> is "i+=k",
>>
>> ? # i_13 = PHI <i_6(7), k_2(D)(4)>
>> ? i_14 = i_13,
>> ? a_p.0_4 = &a[i_13];
>> ? MEM[(int *)&a][i_13] = 100;
>> ? i_6 = i_13 + k_2(D); ? ? // i+=k
>> ? if (i_6 <= 999)
>> ? ? goto <bb 7>;
>> ? else
>> ? ? goto <bb 6>;
>>
>> <bb 6>:
>> ? a_p_lsm.5_11 = &a[i_14];
>> ? a_p = a_p_lsm.5_11;
>> ? goto <bb 3>;
>>
>> But I realize this is not a loop closed SSA form at all, because i_14
>> is
>> being used out of the loop. Where could we extend the liverange of
>> variable
>> i in GCC infrastructure and finally solve this problem?
>>
>
> It seems many people are still in the happy of the upcoming 2012 New Year.
> :-)
>
> Following my story, I find the following code in tree-ssa-copy.c
>
> ? ? ?/* Avoid copy propagation from an inner into an outer loop.
> ? ? ? ? Otherwise, this may move loop variant variables outside of
> ? ? ? ? their loops and prevent coalescing opportunities. ?If the
> ? ? ? ? value was loop invariant, it will be hoisted by LICM and
> ? ? ? ? exposed for copy propagation.
> ? ? ? ? ??? ?The value will be always loop invariant.
> ? ? ? ? In loop-closed SSA form do not copy-propagate through
> ? ? ? ? PHI nodes in blocks with a loop exit edge predecessor. ?*/
> ? ? ?if (current_loops
> ? ? ? ? ?&& TREE_CODE (arg_value) == SSA_NAME
> ? ? ? ? ?&& (loop_depth_of_name (arg_value) > loop_depth_of_name (lhs)
> ? ? ? ? ? ? ?|| (loops_state_satisfies_p (LOOP_CLOSED_SSA)
> ? ? ? ? ? ? ? ? ?&& loop_exit_edge_p (e->src->loop_father, e))))
> ? ? ? ?{
> ? ? ? ? ?phi_val.value = lhs;
> ? ? ? ? ?break;
> ? ? ? ?}
>
> Here http://gcc.gnu.org/ml/gcc-patches/2006-12/msg00066.html, Dan said
>
> "The original check was not because of coalescing, but because we would
> copy prop in-loop variables outside the loop, causing *more*
> invariantness in nested loops."
>
> Can anybody give me a concrete example to explain this statement?

I can neither make sense of the code nor of Dans comment.  But of
course the part about loop-closed-SSA is still true.  I _suppose_ that
the situation is like

 tem = stuff;
 for ()
   x = tem;
 foo = x;

and Dan expects LIM to hoist x = tem (instead of "sinking" it by copyprop).
Then we'd coalesce tem and x (but it would be still live across the loop),
instead of coalescing x and foo (keeping tem live over the loop).  LIM of
course does not hoist this kind of stuff because it appears too cheap.

> Anyway, for my simple case, I don't see bad thing would happen when
> propagate &a[i] out of the loop. Also, after this propagation, "a_p.0_4 =
> &a[i_13];" within the loop would be dead code and removed in the passes
> afterwards. In my opinion, that way the computation would be reduced in the
> loop. Did I make any mistake?
>
>> > Thanks,
>> > -Jiangning
>> >
>> >
>> >
>>
>>
>
>
>
>


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]