This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH, 1/2] shrink wrap a function with a single loop: copy propagation
- From: Zhenqiang Chen <zhenqiang dot chen at linaro dot org>
- To: Jeff Law <law at redhat dot com>
- Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Fri, 9 May 2014 15:30:21 +0800
- Subject: Re: [PATCH, 1/2] shrink wrap a function with a single loop: copy propagation
- Authentication-results: sourceware.org; auth=none
- References: <CACgzC7C5gtZKnuLoVc53NFOEbNa81e-Mh75BrEOq-DdyVNfr3A at mail dot gmail dot com> <536C6EF1 dot 6080702 at redhat dot com>
On 9 May 2014 14:00, Jeff Law <law@redhat.com> wrote:
> On 05/08/14 02:06, Zhenqiang Chen wrote:
>>
>> Hi,
>>
>> Similar issue was discussed in thread
>> http://gcc.gnu.org/ml/gcc-patches/2013-04/msg01145.html. The patches
>> are close to Jeff's suggestion: "sink just the moves out of the
>> incoming argument registers".
>>
>> The patch and following one try to shrink wrap a function with a
>> single loop, which can not be handled by
>> split_live_ranges_for_shrink_wrap and prepare_shrink_wrap, since the
>> induction variable has more than one definitions. Take the test case
>> in the patch as example, the pseudo code before shrink-wrap is like:
>>
>> p = p2
>> if (!p) goto return
>> L1: ...
>> p = ...
>> ...
>> goto L1
>> ...
>> return:
>>
>> Function prepare_shrink_wrap does PRE like optimization to sink some
>> copies from entry block to the live block. The patches enhance
>> prepare_shrink_wrap with:
>> (1) Replace the reference of p to p2 in the entry block. (This patch)
>> (2) Create a new basic block on the live edge to hold the copy "p =
>> p2". (Next patch)
>>
>> After shrink-wrap, the pseudo code would like:
>>
>> if (!p2) goto return
>> p = p2
>> L1: ...
>> p = ...
>> ...
>> goto L1
>> return:
>
> Right. Seems like a reasonably useful transformation. Not the totally
> general solution, but one that clearly has value.
>
>
>
>> 2014-05-08 Zhenqiang Chen <zhenqiang.chen@linaro.org>
>>
>> * function.c (last_or_compare_p, try_copy_prop): new functions.
>> (move_insn_for_shrink_wrap): try copy propagation.
>> (prepare_shrink_wrap): Separate last_uses from uses.
>>
>> testsuite/ChangeLog:
>> 2014-05-08 Zhenqiang Chen <zhenqiang.chen@linaro.org>
>>
>> * shrink-wrap-loop.c: New test case.
>
> So first at a high level, Steven B.'s recommendation to pull the
> shrink-wrapping bits out of function.c is a good one. I'd really like to
> see that happen as well.
I will do it.
>> +/* Check whether INSN is the last insn in BB or
>> + a COMPARE for the last insn in BB. */
>> +
>> +static bool
>> +last_or_compare_p (basic_block bb, rtx insn)
>> +{
>> + rtx x = single_set (insn);
>> +
>> + if ((insn == BB_END (bb))
>> + || ((insn == PREV_INSN (BB_END (bb)))
>> + && x && REG_P (SET_DEST (x))
>> + && GET_MODE_CLASS (GET_MODE (SET_DEST (x))) == MODE_CC))
>> + return true;
>> +
>> + return false;
>
> So you don't actually check if the insn is a compare, just that it's
> destination is MODE_CC. That's probably close enough, but please note it in
> the comment. Folks are going to read the comment first, not the
> implementation.
>
> Q. Do we have to do anything special for HAVE_cc0 targets here?
I have not think about it. I will recheck it.
>> +}
>> +
>> +/* Try to copy propagate INSN with SRC and DEST in BB to the last COMPARE
>> + or JUMP insn, which use registers in LAST_USES. */
>
> So why restrict this to just cases where we have to propagate into a COMPARE
> at the end of a block? So in your example, assume the first block looks
> like
prepare_shrink_wrap will move_insn_for_shrink_wrap in BB_INSNS_REVERSE
order. Current prepare_shrink_wrap should handle the case since there
is no data flow dependence.
> p = p2;
> if (!q) return;
> [ ... ]
>
> We ought to be able to turn that into
>
> if (!q) return
> p = p2;
> [ ... ]
>
>
>
>> +
>> +static bool
>> +try_copy_prop (basic_block bb, rtx insn, rtx src, rtx dest,
>> + HARD_REG_SET *last_uses)
>
> My first thought here was that we must have some code which does 90% of what
> you need. Did you look at any of the existing RTL optimization
> infrastructure to see if there was code you could extend to handle this?
Most the codes are from function copyprop_hardreg_forward_1 in
"cprop_hardreg" pass. Previously I had a patch to reuse it:
http://gcc.gnu.org/ml/gcc-patches/2013-06/msg00305.html. What do you
think about the old patch?
Thanks!
-Zhenqiang