This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Question about how to fix PR69052
- From: "Bin.Cheng" <amker dot cheng at gmail dot com>
- To: Bernd Schmidt <bernds_cb1 at t-online dot de>
- Cc: Jeff Law <law at redhat dot com>, "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
- Date: Tue, 26 Jan 2016 09:48:11 +0000
- Subject: Re: Question about how to fix PR69052
- References: <CAHFci280ObgaSxzCBiKKBfKONeYsXOue8i84hH=pDwynhC4PNQ at mail dot gmail dot com> <56A67CD9 dot 70405 at redhat dot com> <56A68017 dot 9020304 at t-online dot de>
On Mon, Jan 25, 2016 at 8:05 PM, Bernd Schmidt <bernds_cb1@t-online.de> wrote:
> On 01/25/2016 08:51 PM, Jeff Law wrote:
>>
>> No, the combiner works within a basic block only. There was a group, I
>> believe in Moscow, that worked on a cross-block combiner. It was
>> discussed at the Cauldron in California a few years back. I don't know
>> if they did any further work on those ideas.
>>
>> Is the issue here not knowing when the loop invariant can be forward
>> propagated back into the memory reference -- and for complex RTL like
>> you cited, you ultimately determine that no it can't be propagated and
>> thus increase its invariant cost?
>
>
> Is this just a pass ordering problem? What happens if you do loop-invariant
> after combine?
Yes, I moved the whole loop pass (and pass_web with it) to after combine,
and it worked. Running a combine pass before loop-invariant fixes this
problem.
The following passes currently run between the loop transforms and combine:
NEXT_PASS (pass_web);
NEXT_PASS (pass_rtl_cprop);
NEXT_PASS (pass_cse2);
NEXT_PASS (pass_rtl_dse1);
NEXT_PASS (pass_rtl_fwprop_addr);
NEXT_PASS (pass_inc_dec);
NEXT_PASS (pass_initialize_regs);
NEXT_PASS (pass_ud_rtl_dce);
I think pass_web needs to stay after the loop transforms because it is
used to handle the register live ranges of unrolled code.
pass_rtl_fwprop_addr and pass_inc_dec should stay where they are now.
Having pass_inc_dec run before loop unrolling may also help keep more of
the auto-increment addressing modes chosen by IVO.
We should not need to duplicate pass_initialize_regs.
So what about pass_rtl_cprop, pass_cse2 and pass_rtl_dse1? Should these
passes be duplicated after the loop transforms so that the
loop-transformed code can be cleaned up?
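To make the ordering question concrete, here is a rough toy model in
Python. It is only a sketch, not GCC code: the pass names pass_loop2 and
pass_combine are assumed from passes.def rather than taken from the list
above, and the helpers move_after, runs_before and cleanup_after_loop are
hypothetical names invented for illustration.

```python
# Toy model of the RTL pass order discussed in this mail.
# Assumption: pass_loop2 (the RTL loop pass group) comes first and
# pass_combine comes last; the passes in between are the ones listed above.
CURRENT_ORDER = [
    "pass_loop2",
    "pass_web",
    "pass_rtl_cprop",
    "pass_cse2",
    "pass_rtl_dse1",
    "pass_rtl_fwprop_addr",
    "pass_inc_dec",
    "pass_initialize_regs",
    "pass_ud_rtl_dce",
    "pass_combine",
]

CLEANUP_PASSES = ("pass_rtl_cprop", "pass_cse2", "pass_rtl_dse1")


def move_after(order, names, anchor):
    """Return a copy of `order` with `names` moved to just after `anchor`,
    keeping the relative order of the moved passes."""
    moved = [p for p in order if p in names]
    rest = [p for p in order if p not in names]
    i = rest.index(anchor) + 1
    return rest[:i] + moved + rest[i:]


def runs_before(order, first, second):
    """True if `first` is scheduled earlier than `second`."""
    return order.index(first) < order.index(second)


def cleanup_after_loop(order):
    """Cleanup passes that still run after the loop transforms."""
    return [p for p in CLEANUP_PASSES if runs_before(order, "pass_loop2", p)]


# The experiment: move the loop passes and pass_web to after combine.
EXPERIMENT_ORDER = move_after(
    CURRENT_ORDER, {"pass_loop2", "pass_web"}, "pass_combine"
)

if __name__ == "__main__":
    print("after loop (current):   ", cleanup_after_loop(CURRENT_ORDER))
    print("after loop (experiment):", cleanup_after_loop(EXPERIMENT_ORDER))
```

In this model the experiment leaves no cprop/cse2/dse1 run after the loop
transforms, which is the reason for asking whether they would need to be
duplicated.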
Thanks,
bin
>
>
> Bernd
>