This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [Patch, RTL] Eliminate redundant vec_select moves.
- From: Richard Sandiford <rdsandiford at googlemail dot com>
- To: Tejas Belagod <tbelagod at arm dot com>
- Cc: Kirill Yukhin <kirill dot yukhin at gmail dot com>, "H.J. Lu" <hjl dot tools at gmail dot com>, "Yukhin\, Kirill" <kirill dot yukhin at intel dot com>, Jeff Law <law at redhat dot com>, Bill Schmidt <wschmidt at linux dot vnet dot ibm dot com>, "gcc-patches\ at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, Uros Bizjak <ubizjak at gmail dot com>, Richard Henderson <rth at redhat dot com>, Jakub Jelinek <jakub at redhat dot com>
- Date: Mon, 09 Dec 2013 12:00:46 +0000
- Subject: Re: [Patch, RTL] Eliminate redundant vec_select moves.
Tejas Belagod <tbelagod@arm.com> writes:
> Kirill Yukhin wrote:
>> Hello,
>>
>> On 05 Dec 16:40, Kirill Yukhin wrote:
>>> On 05 Dec 05:30, H.J. Lu wrote:
>>>> Kirill, can you take a look why it doesn't work for x86?
>>> Okay, I'll look at this.
>>
>> I've looked at this. It seems that `CANNOT_CHANGE_MODE_CLASS'
>> is too conservative for x86.
>>
>> In rtlanal.c we have `simplify_subreg_regno', which calls the target
>> hook `REG_CANNOT_CHANGE_MODE_P'. It takes only 3 arguments:
>> from mode, to mode and regclass.
>>
>> The hook in x86 is called `ix86_cannot_change_mode_class', and its
>> comment says that we cannot change mode for nonzero offsets, which
>> sounds quite reasonable. That is why this hook returns `true' for the
>> tuple <V4SF, SF, FIRST_SSE_REG> and `simplify_subreg_regno'
>> prohibits simplification of this:
>> (set (reg:SF 21 xmm0 [orig:86 D.1816 ] [86])
>>     (vec_select:SF (reg:V4SF 21 xmm0 [87])
>>         (parallel [(const_int 0 [0])])))
>>
>> I think we can extend the hook and add `offset in frommode' to it.
>> We may set it to -1 for the cases where it is unknown and work
>> conservatively in the target hook.
>> For most cases offset is known and we could pass it to the hook.
>> This will require changes throughout all targets though.
>>
>> Alternatively, we may introduce another target hook, say
>> `CANNOT_CHANGE_MODE_CLASS_OFFSET', with the same args as
>> `CANNOT_CHANGE_MODE_CLASS' plus an offset, which defaults to it.
>> For x86 (and possibly other targets) we'll implement this hook, which
>> will check the offset.
>>
>> What do you think?
>>
>
> I don't think CANNOT_CHANGE_MODE_CLASS has been designed with an
> intention to consider offsets. I thought all that magic about
> BYTE_OFFSET resolution into representable hardregs was done by
> subreg_get_info() where the info.representable is set to false if the
> BYTE_OFFSET of the subreg didn't map to a full hardreg. So if your
> (subreg:ymode (reg:xmode) off) maps to a full hardreg,
> simplify_subreg_regno should be returning the yregno automatically.

I agree.  A subreg only reduces to a single hard register if the subreg
logically refers to the low part of the hard register.  That's a
target-independent requirement so the hook shouldn't need to worry about it.
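
To make the low-part rule concrete, here is a toy model in C.  It is
only a sketch: it assumes a little-endian target and a group of hard
registers that are all REG_BYTES wide, and the names toy_subreg_result
and toy_subreg_lookup are invented for illustration.  It is not GCC's
subreg_get_info.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the "low part of a hard register" rule.  Little-endian
   only; every hard register in the group is assumed to be REG_BYTES
   wide.  This is NOT GCC's subreg_get_info, just an illustration.  */
struct toy_subreg_result
{
  bool representable;    /* Starts at the low part of a hard register?  */
  unsigned regno_delta;  /* Register index relative to the first one.  */
};

static struct toy_subreg_result
toy_subreg_lookup (unsigned reg_bytes, unsigned inner_bytes,
                   unsigned outer_bytes, unsigned byte_offset)
{
  struct toy_subreg_result res = { false, 0 };

  assert (reg_bytes > 0 && outer_bytes > 0);

  /* The subreg must lie entirely inside the inner value.  */
  if (byte_offset + outer_bytes > inner_bytes)
    return res;

  /* It must begin at byte 0 (the low part) of some hard register.  */
  if (byte_offset % reg_bytes != 0)
    return res;

  res.representable = true;
  res.regno_delta = byte_offset / reg_bytes;
  return res;
}
```

Under these assumptions, toy_subreg_lookup (16, 16, 4, 0) models
(subreg:SF (reg:V4SF xmm0) 0) and is representable in the same register,
while offset 4 is rejected.  With 4-byte registers,
toy_subreg_lookup (4, 8, 4, 4) maps to the second register of the pair.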

I'm just speculating, but maybe the problem is that this was traditionally
keyed off word size.  If a subreg is smaller than a word then it must
correspond to the low part of the containing word.  So if words
are 32 bits or wider, things like (subreg:QI (reg:SI X) 1) and
(subreg:QI (reg:SI X) 2) are always invalid, even for pseudo Xs.
But that doesn't stop things like (subreg:QI (reg:DI X) 4) on 32-bit
little-endian targets.  So we can run into trouble when dealing with
wider-than-word registers, since whether the byte offset is representable
depends on the class.  And things like IRA would need this to be trapped
at the class level, rather than just for specific hard registers.
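
The word-size rule described above can be sketched as follows.  This is
a simplified little-endian model, not the actual validity check in GCC;
toy_word_rule_allows is a made-up name and WORD_BYTES stands in for the
target's word size in bytes.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified word-based validity rule, little-endian only.
   WORD_BYTES plays the role of the target's word size in bytes.  */
static bool
toy_word_rule_allows (unsigned word_bytes, unsigned inner_bytes,
                      unsigned outer_bytes, unsigned byte_offset)
{
  assert (word_bytes > 0 && outer_bytes > 0);

  /* The subreg must lie inside the inner value and be aligned to its
     own size.  */
  if (byte_offset + outer_bytes > inner_bytes)
    return false;
  if (byte_offset % outer_bytes != 0)
    return false;

  /* A subreg smaller than a word must be the low part of the word that
     contains it.  */
  if (outer_bytes < word_bytes)
    return byte_offset % word_bytes == 0;

  return true;
}
```

With 4-byte words, this model rejects (subreg:QI (reg:SI X) 1) and
(subreg:QI (reg:SI X) 2) but accepts (subreg:QI (reg:DI X) 4), because
byte 4 is the low byte of the second word.  Nothing in the rule knows
whether X ends up in a register that is wider than a word.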

If that was the problem though, it still sounds like something that could
be handled in a target-independent way, via things like class_max_nregs.
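
One possible reading of that suggestion, sketched in C purely as an
assumption: if the number of registers a class needs for the inner mode
is known (the NREGS argument below stands in for a class_max_nregs-style
value), a per-register size can be derived and the byte offset checked
against it without asking the target.  The function name and the
"inner size divided by NREGS" approximation are hypothetical, and the
model is little-endian only.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical class-level check, little-endian only.  NREGS is the
   number of registers of some class needed to hold the inner mode;
   the per-register size is approximated as INNER_BYTES / NREGS.  */
static bool
toy_class_offset_ok (unsigned nregs, unsigned inner_bytes,
                     unsigned outer_bytes, unsigned byte_offset)
{
  unsigned reg_bytes;

  assert (nregs > 0 && outer_bytes > 0 && inner_bytes % nregs == 0);
  reg_bytes = inner_bytes / nregs;

  /* The subreg must lie entirely inside the inner value.  */
  if (byte_offset + outer_bytes > inner_bytes)
    return false;

  /* The offset must land on the low part of one of the registers.  */
  return byte_offset % reg_bytes == 0;
}
```

In this model, (subreg:QI (reg:DI X) 4) is accepted for a class that
holds DI in two registers (NREGS == 2) and rejected for a class that
holds it in one (NREGS == 1), which is the class-dependent behaviour
described above.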

Thanks,
Richard