This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [Patch, RTL] Eliminate redundant vec_select moves.
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: Tejas Belagod <tbelagod at arm dot com>
- Cc: Richard Henderson <rth at redhat dot com>, Kirill Yukhin <kirill dot yukhin at gmail dot com>, "Yukhin, Kirill" <kirill dot yukhin at intel dot com>, Jeff Law <law at redhat dot com>, Bill Schmidt <wschmidt at linux dot vnet dot ibm dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, Uros Bizjak <ubizjak at gmail dot com>, Jakub Jelinek <jakub at redhat dot com>, Richard Sandiford <rdsandiford at googlemail dot com>
- Date: Wed, 11 Dec 2013 08:34:52 -0800
- Subject: Re: [Patch, RTL] Eliminate redundant vec_select moves.
On Wed, Dec 11, 2013 at 8:26 AM, Tejas Belagod <tbelagod@arm.com> wrote:
> H.J. Lu wrote:
>>
>> On Wed, Dec 11, 2013 at 7:49 AM, Richard Sandiford
>> <rdsandiford@googlemail.com> wrote:
>>>
>>> "H.J. Lu" <hjl.tools@gmail.com> writes:
>>>>
>>>> On Wed, Dec 11, 2013 at 1:13 AM, Richard Sandiford
>>>> <rdsandiford@googlemail.com> wrote:
>>>>>
>>>>> Richard Henderson <rth@redhat.com> writes:
>>>>>>
>>>>>> On 12/10/2013 10:44 AM, Richard Sandiford wrote:
>>>>>>>
>>>>>>> Sorry, I don't understand. I never said it was invalid. I said
>>>>>>> (subreg:SF (reg:V4SF X) 1) was invalid if (reg:V4SF X) represents
>>>>>>> a single register. On a little-endian target, the offset cannot be
>>>>>>> anything other than 0 in that case.
>>>>>>>
>>>>>>> So the CANNOT_CHANGE_MODE_CLASS code above seems to be checking for
>>>>>>> something that is always invalid, regardless of the target. That kind
>>>>>>> of situation should be rejected by target-independent code instead.
>>>>>>
>>>>>> But, we want to disable the subreg before we know whether or not
>>>>>> (reg:V4SF X) will be allocated to a single hard register. That is
>>>>>> something that we can't know in target-independent code before
>>>>>> register allocation.
>>>>>
>>>>> I was thinking that if we've got a class, we've also got things like
>>>>> CLASS_MAX_NREGS. Maybe that doesn't cope with padding properly though.
>>>>> But even in the padding cases an offset-based check in C_C_M_C could
>>>>> be derived from other information.
>>>>>
>>>>> subreg_get_info handles padding with:
>>>>>
>>>>>       nregs_xmode = HARD_REGNO_NREGS_WITH_PADDING (xregno, xmode);
>>>>>       if (GET_MODE_INNER (xmode) == VOIDmode)
>>>>>         xmode_unit = xmode;
>>>>>       else
>>>>>         xmode_unit = GET_MODE_INNER (xmode);
>>>>>       gcc_assert (HARD_REGNO_NREGS_HAS_PADDING (xregno, xmode_unit));
>>>>>       gcc_assert (nregs_xmode
>>>>>                   == (GET_MODE_NUNITS (xmode)
>>>>>                       * HARD_REGNO_NREGS_WITH_PADDING (xregno,
>>>>>                                                        xmode_unit)));
>>>>>       gcc_assert (hard_regno_nregs[xregno][xmode]
>>>>>                   == (hard_regno_nregs[xregno][xmode_unit]
>>>>>                       * GET_MODE_NUNITS (xmode)));
>>>>>
>>>>>       /* You can only ask for a SUBREG of a value with holes in the
>>>>>          middle if you don't cross the holes.  (Such a SUBREG should
>>>>>          be done by picking a different register class, or doing it
>>>>>          in memory if necessary.)  An example of a value with holes
>>>>>          is XCmode on 32-bit x86 with -m128bit-long-double; it's
>>>>>          represented in 6 32-bit registers, 3 for each part, but in
>>>>>          memory it's two 128-bit parts.  Padding is assumed to be at
>>>>>          the end (not necessarily the 'high part') of each unit.  */
>>>>>       if ((offset / GET_MODE_SIZE (xmode_unit) + 1
>>>>>            < GET_MODE_NUNITS (xmode))
>>>>>           && (offset / GET_MODE_SIZE (xmode_unit)
>>>>>               != ((offset + GET_MODE_SIZE (ymode) - 1)
>>>>>                   / GET_MODE_SIZE (xmode_unit))))
>>>>>         {
>>>>>           info->representable_p = false;
>>>>>           rknown = true;
>>>>>         }
>>>>>
>>>>> and I wouldn't really want to force targets to individually reproduce
>>>>> that kind of logic at the class level. If the worst comes to the worst
>>>>> we could cache the difficult cases.
>>>>>
>>>> In my case, x86 CANNOT_CHANGE_MODE_CLASS only needs to know whether
>>>> the subreg byte is zero. It doesn't care about mode padding. You are
>>>> concerned that the information passed to CANNOT_CHANGE_MODE_CLASS is
>>>> too expensive for the target to process. That isn't the case for x86.
>>>
>>> No, I'm concerned that by going this route, we're forcing every target
>>> (or at least every target with wider-than-word registers, which is most
>>> of the common ones) to implement the same target-independent restriction.
>>> This is not an x86-specific issue.
>>>
>>
>> So you prefer a generic solution which makes CANNOT_CHANGE_MODE_CLASS
>> return true for a vector-mode subreg if the subreg byte != 0. Is this
>> correct?
>
>
> Do you mean a generic solution for C_C_M_C to return true for non-zero
> byte_offset vector subregs in the context of x86?
>
> I want to clarify because in the context of 32-bit ARM little-endian, a
> non-zero byte-offset vector subreg is still a valid full hardreg, e.g. for
>
> (subreg:DI (reg:V4SF) 8)
>
> C_C_M_C can return 'false' as this can be resolved to a full D-reg.
>
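To make the two cases in the exchange above concrete, here is a minimal sketch in C of the offset arithmetic only. It assumes a little-endian target where a vector value occupies consecutive hard registers of one fixed size. The function name `subreg_start_hardreg` and its parameters are hypothetical and are not GCC functions; the real decision is made by subreg_get_info and the target's C_C_M_C.

```c
#include <assert.h>

/* Hypothetical helper, not part of GCC.  For a subreg of OUTER_SIZE
   bytes at byte OFFSET inside a value of INNER_SIZE bytes that is held
   in hard registers of REG_SIZE bytes each, return the index (counted
   from the first register of the inner value) of the register in which
   the subreg starts, or -1 if OFFSET falls in the middle of a register.
   Assumption: little-endian, no padding between registers.  */
int
subreg_start_hardreg (unsigned inner_size, unsigned outer_size,
                      unsigned offset, unsigned reg_size)
{
  assert (reg_size != 0 && outer_size != 0);
  assert (offset + outer_size <= inner_size);

  if (offset % reg_size != 0)
    return -1;                  /* Starts part-way through a register.  */
  return (int) (offset / reg_size);
}
```

Under these assumptions, a 16-byte V4SF held in two 8-byte D registers gives index 1 for (subreg:DI (reg:V4SF) 8), while the same value held in a single 16-byte register accepts only offset 0, so an SF subreg at byte offset 4 is rejected.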
Does that mean subreg byte interpretation is endian-dependent?
Both little-endian
(subreg:DI (reg:V4SF) 0)
and big-endian
(subreg:DI (reg:V4SF) MAX_BITSIZE_MODE_ANY_MODE / BITS_PER_UNIT)
refer to the same lower 64 bits of reg:V4SF. Is this correct?
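
The question can be phrased as a small calculation. The following is a rough sketch in C, assuming that a subreg byte is a memory-order offset and that a single flag describes the target's endianness (real targets distinguish byte and word endianness). The name `lowpart_byte_offset` is hypothetical; it is loosely modelled on the idea behind GCC's subreg_lowpart_offset and is not that function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of GCC.  Return the byte offset at
   which the least significant OUTER_SIZE bytes of an INNER_SIZE-byte
   value are found, assuming offsets are counted in memory order and
   BIG_ENDIAN_P covers both byte and word order.  */
unsigned
lowpart_byte_offset (unsigned inner_size, unsigned outer_size,
                     bool big_endian_p)
{
  assert (outer_size <= inner_size);
  /* Little-endian: the low part comes first.  Big-endian: it comes
     last, so skip the bytes above it.  */
  return big_endian_p ? inner_size - outer_size : 0;
}
```

Under this simplified model, the low 64 bits of a 16-byte V4SF are at byte offset 0 on little-endian and at byte offset 8 on big-endian.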
--
H.J.