This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: fwprop changes to fix pr41081
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: gcc-patches at gcc dot gnu dot org, Uros Bizjak <ubizjak at gmail dot com>, Alan Modra <amodra at gmail dot com>
- Date: Wed, 26 Jan 2011 07:45:47 -0800
- Subject: Re: fwprop changes to fix pr41081
- References: <20090822141848.GB17804@bubble.grove.modra.org> <AANLkTinXhPkrKZad6PcxNdinc_PitwVbzh1NE79+E6kK@mail.gmail.com>
On Wed, Jan 26, 2011 at 7:45 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Sat, Aug 22, 2009 at 7:18 AM, Alan Modra <amodra@bigpond.net.au> wrote:
>> This patch teaches fwprop to replace (subreg (zero_extend (reg))) with
>> (reg) when the subreg is the low part and same mode as reg.  Ditto for
>> sign_extend.  On powerpc64 this helps dramatically with the sha1
>> testcase in PR 41081 where many of the unnecessary zero_extends are
>> not removed by combine.  (I think it would be possible to make combine
>> operate on insns that feed into more than one other insn, but that's a
>> bigger change.  Also, this fwprop optimisation is very similar to that
>> already done for paradoxical subregs.)  The only slightly tricky thing
>> with this fwprop substitution is that you don't want to remove a
>> zero_extend that is free by virtue of LOAD_EXTEND_OP style memory
>> loads.  If you do, a sequence like
>>
>>  (set (reg:SI x) (mem:SI ..))
>>  (set (reg:DI y) (zero_extend:DI (reg:SI x)))
>>  (.. (subreg:SI (reg:DI y)) ..)
>>  (.. (reg:DI y) ..)
>>
>> ie. a reg loaded from mem and used in both narrow and wide modes, gets
>> replaced with
>>
>>  (set (reg:SI x) (mem:SI ..))
>>  (set (reg:DI y) (zero_extend:DI (reg:SI x)))
>>  (.. (reg:SI x) ..)
>>  (.. (reg:DI y) ..)
>>
>> and now combine won't merge the first two insns due to the later use
>> of x, meaning we are left with an explicit zero_extend.
>>
>> The change to try_fwprop_subst fixes a latent bug introduced with the
>> fix for PR34012.  Perhaps I should have gone a little further and not
>> called rtx_cost for any of the substitutions done by
>> forward_propagate_subreg?  Seems to me the new rtx ought to never
>> cost more..
>>
>> Bootstrapped and regression tested powerpc-linux, powerpc64-linux and
>> i686-linux.  OK to apply?
>>
>>         PR target/41081
>>         * fwprop.c (try_fwprop_subst): Allow multiple sets.
>>         (get_reg_use_in): New function.
>>         (forward_propagate_subreg): Propagate through subreg of zero_extend
>>         or sign_extend.
>>
>
> This patch may be bad for x86-64. There is
>
> (define_insn "*lea_2_zext"
>  [(set (match_operand:DI 0 "register_operand" "=r")
>         (zero_extend:DI
>           (subreg:SI (match_operand:DI 1 "no_seg_address_operand" "p") 0)))]
>  "TARGET_64BIT"
>  "lea{l}\t{%a1, %k0|%k0, %a1}"
>  [(set_attr "type" "lea")
>   (set_attr "mode" "SI")])
>
> This change makes this pattern not usable.
>
See:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47379
--
H.J.
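
For readers following the thread, the substitution described in the quoted patch description can be sketched on a toy expression tree. This is only an illustration, not the code from fwprop.c: the toy_rtx type, the toy_* enumerators and toy_simplify_subreg_of_extend are hypothetical names invented here, and the LOAD_EXTEND_OP exception the patch description mentions is deliberately not modelled.

```c
#include <stddef.h>

/* Toy stand-ins for RTL codes and modes.  These are NOT GCC's real
   types; they exist only to illustrate the shape of the rewrite.  */
enum toy_code { TOY_REG, TOY_ZERO_EXTEND, TOY_SIGN_EXTEND, TOY_SUBREG };
enum toy_mode { TOY_SI, TOY_DI };

struct toy_rtx {
    enum toy_code code;
    enum toy_mode mode;
    struct toy_rtx *op;  /* single operand; NULL for TOY_REG */
    int lowpart;         /* TOY_SUBREG only: nonzero if it selects the low part */
    int regno;           /* TOY_REG only */
};

/* If X is (subreg:M (zero_extend|sign_extend (reg:M))) and the subreg
   selects the low part, return the inner reg.  Otherwise return X
   unchanged.  */
static struct toy_rtx *
toy_simplify_subreg_of_extend(struct toy_rtx *x)
{
    struct toy_rtx *ext, *inner;

    if (x->code != TOY_SUBREG || !x->lowpart)
        return x;

    ext = x->op;
    if (ext == NULL
        || (ext->code != TOY_ZERO_EXTEND && ext->code != TOY_SIGN_EXTEND))
        return x;

    inner = ext->op;
    if (inner == NULL || inner->code != TOY_REG || inner->mode != x->mode)
        return x;

    /* Low part of an extension, in the same mode as the register that
       was extended: the extension is a no-op for this use.  */
    return inner;
}
```

In this toy model, a low-part SImode subreg of (zero_extend:DI (reg:SI x)) comes back as (reg:SI x) itself, while a subreg that is not the low part is returned untouched.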