This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [Patch, RTL] Eliminate redundant vec_select moves.


Tejas Belagod <tbelagod@arm.com> writes:
> Kirill Yukhin wrote:
>> Hello,
>> 
>> On 05 Dec 16:40, Kirill Yukhin wrote:
>>> On 05 Dec 05:30, H.J. Lu wrote:
>>>> Kirill, can you take a look why it doesn't work for x86?
>>> Okay, I'll look at this.
>> 
>> I've looked at this. It seems that `CANNOT_CHANGE_MODE_CLASS'
>> is too conservative for x86.
>> 
>> In rtlanal.c we have `simplify_subreg_regno', which calls the target
>> hook `REG_CANNOT_CHANGE_MODE_P'. It takes only 3 arguments:
>> from mode, to mode and regclass.
>> 
>> The hook in x86 is called `ix86_cannot_change_mode_class', and its comment
>> says that we cannot change mode for nonzero offsets, which sounds
>> quite reasonable. That is why this hook returns `true' for this
>> tuple <V4SF, SF, FIRST_SSE_REG> and `simplify_subreg_regno'
>> prohibits simplification of that:
>>   (set (reg:SF 21 xmm0 [orig:86 D.1816 ] [86])
>>        (vec_select:SF (reg:V4SF 21 xmm0 [87])
>>           (parallel [(const_int 0 [0])])))
>> 
>> I think we can extend the hook and add `offset in frommode' to it.
>> We may set it to -1 for the cases where it is unknown and work
>> conservatively in the target hook.
>> For most cases offset is known and we could pass it to the hook.
>> This will require changes throughout all targets though.
>> 
>> Alternatively, we may introduce another target hook, say
>> `CANNOT_CHANGE_MODE_CLASS_OFFSET' with same args as
>> `CANNOT_CHANGE_MODE_CLASS' + offset and which will be defaulted to it.
>> For x86 (and possibly other targets) we'll implement this hook, which
>> will check the offset.
>> 
>> What do you think?
>> 
>
> I don't think CANNOT_CHANGE_MODE_CLASS has been designed with an
> intention to consider offsets. I thought all that magic about
> BYTE_OFFSET resolution into representable hardregs was done by
> subreg_get_info() where the info.representable is set to false if the
> BYTE_OFFSET of the subreg didn't map to a full hardreg. So if your
> (subreg:ymode (reg:xmode) off) maps to a full hardreg,
> simplify_subreg_regno should be returning the yregno automatically.

I agree.  A subreg only reduces to a single hard register if the subreg
logically refers to the low part of the hard register.  That's a target-
independent requirement so the hook shouldn't need to worry about it.

I'm just speculating, but maybe the problem is that this was traditionally
keyed off word size.  If a subreg is smaller than a word then it must
correspond to the low part of the containing word.  So if words
are 32 bits or wider, things like (subreg:QI (reg:SI X) 1) and
(subreg:QI (reg:SI X) 2) are always invalid, even for pseudo Xs.
But that doesn't stop things like (subreg:QI (reg:DI X) 4) on 32-bit
little-endian targets.  So we can run into trouble when dealing with
wider-than-word registers, since whether the byte offset is representable
depends on the class.  And things like IRA would need this to be trapped
at the class level, rather than just for specific hard registers.
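
As a rough illustration of the word-size rule described above, here is a
minimal sketch. It is a hypothetical model, not GCC code: the function
name and its parameters are made up for this example, sizes and offsets
are in bytes, and a little-endian layout is assumed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not GCC code.  All sizes and offsets are in bytes
   and a little-endian layout is assumed.  */
static bool
subword_subreg_ok_p (unsigned inner_size, unsigned outer_size,
                     unsigned byte_offset, unsigned word_size)
{
  /* The subreg must lie entirely inside the inner register.  */
  if (outer_size > inner_size || byte_offset > inner_size - outer_size)
    return false;

  /* Word-sized and wider subregs are outside the rule modelled here.  */
  if (outer_size >= word_size)
    return true;

  /* On little-endian, the low part of a word starts at the word's
     first byte, i.e. at a multiple of the word size.  */
  return byte_offset % word_size == 0;
}

/* The three examples from the paragraph above, with 4-byte words.  */
static void
check_examples (void)
{
  assert (!subword_subreg_ok_p (4, 1, 1, 4)); /* (subreg:QI (reg:SI X) 1) */
  assert (!subword_subreg_ok_p (4, 1, 2, 4)); /* (subreg:QI (reg:SI X) 2) */
  assert (subword_subreg_ok_p (8, 1, 4, 4));  /* (subreg:QI (reg:DI X) 4) */
}
```

The third case passes this word-level check, but the check says nothing
about whether byte 4 is representable once X lives in a register wider
than a word, which is the class-dependent part.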

If that was the problem though, it still sounds like something that could
be handled in a target-independent way, via things like class_max_nregs.

Thanks,
Richard

