This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [Patch, RTL] Eliminate redundant vec_select moves.


H.J. Lu wrote:
On Wed, Dec 11, 2013 at 8:26 AM, Tejas Belagod <tbelagod@arm.com> wrote:
H.J. Lu wrote:
On Wed, Dec 11, 2013 at 7:49 AM, Richard Sandiford
<rdsandiford@googlemail.com> wrote:
"H.J. Lu" <hjl.tools@gmail.com> writes:
On Wed, Dec 11, 2013 at 1:13 AM, Richard Sandiford
<rdsandiford@googlemail.com> wrote:
Richard Henderson <rth@redhat.com> writes:
On 12/10/2013 10:44 AM, Richard Sandiford wrote:
Sorry, I don't understand.  I never said it was invalid.  I said
(subreg:SF (reg:V4SF X) 1) was invalid if (reg:V4SF X) represents
a single register.  On a little-endian target, the offset cannot be
anything other than 0 in that case.

So the CANNOT_CHANGE_MODE_CLASS code above seems to be checking for
something that is always invalid, regardless of the target.  That kind
of situation should be rejected by target-independent code instead.

But, we want to disable the subreg before we know whether or not
(reg:V4SF X) will be allocated to a single hard register.  That is
something that we can't know in target-independent code before
register allocation.

I was thinking that if we've got a class, we've also got things like
CLASS_MAX_NREGS.  Maybe that doesn't cope with padding properly though.
But even in the padding cases an offset-based check in C_C_M_C could
be derived from other information.

subreg_get_info handles padding with:

      nregs_xmode = HARD_REGNO_NREGS_WITH_PADDING (xregno, xmode);
      if (GET_MODE_INNER (xmode) == VOIDmode)
        xmode_unit = xmode;
      else
        xmode_unit = GET_MODE_INNER (xmode);
      gcc_assert (HARD_REGNO_NREGS_HAS_PADDING (xregno, xmode_unit));
      gcc_assert (nregs_xmode
                  == (GET_MODE_NUNITS (xmode)
                      * HARD_REGNO_NREGS_WITH_PADDING (xregno, xmode_unit)));
      gcc_assert (hard_regno_nregs[xregno][xmode]
                  == (hard_regno_nregs[xregno][xmode_unit]
                      * GET_MODE_NUNITS (xmode)));

      /* You can only ask for a SUBREG of a value with holes in the middle
         if you don't cross the holes.  (Such a SUBREG should be done by
         picking a different register class, or doing it in memory if
         necessary.)  An example of a value with holes is XCmode on 32-bit
         x86 with -m128bit-long-double; it's represented in 6 32-bit registers,
         3 for each part, but in memory it's two 128-bit parts.
         Padding is assumed to be at the end (not necessarily the 'high part')
         of each unit.  */
      if ((offset / GET_MODE_SIZE (xmode_unit) + 1
           < GET_MODE_NUNITS (xmode))
          && (offset / GET_MODE_SIZE (xmode_unit)
              != ((offset + GET_MODE_SIZE (ymode) - 1)
                  / GET_MODE_SIZE (xmode_unit))))
        {
          info->representable_p = false;
          rknown = true;
        }

and I wouldn't really want to force targets to individually reproduce
that kind of logic at the class level.  If the worst comes to the worst
we could cache the difficult cases.
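
As a rough illustration of the hole-crossing test quoted above, here is a
minimal stand-alone sketch.  The function name and its plain byte-count
parameters are hypothetical and are not GCC interfaces; they stand in for
offset, GET_MODE_SIZE (ymode), GET_MODE_SIZE (xmode_unit) and
GET_MODE_NUNITS (xmode) respectively.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the quoted condition; all sizes are in bytes.
   Returns true when an access of YSIZE bytes at byte OFFSET starts in a
   unit other than the last one and spills into the following unit, that
   is, when it would cross the padding at the end of a unit.  */
bool
subreg_crosses_hole (unsigned offset, unsigned ysize,
                     unsigned unit_size, unsigned nunits)
{
  assert (unit_size > 0 && ysize > 0);

  unsigned first_unit = offset / unit_size;
  unsigned last_unit = (offset + ysize - 1) / unit_size;

  /* As in the quoted code, accesses that start in the final unit are
     not flagged by this particular test.  */
  return first_unit + 1 < nunits && first_unit != last_unit;
}
```

Taking the XCmode example from the comment as two 16-byte units,
subreg_crosses_hole (12, 8, 16, 2) is true because the access straddles
both units, while subreg_crosses_hole (0, 16, 16, 2) is false because it
stays inside the first unit.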

In my case, x86 CANNOT_CHANGE_MODE_CLASS only needs to know whether
the subreg byte is zero or not.  It doesn't care about mode padding.
You are concerned that the information passed to
CANNOT_CHANGE_MODE_CLASS is too expensive for a target to process.
That isn't the case for x86.

No, I'm concerned that by going this route, we're forcing every target
(or at least every target with wider-than-word registers, which is most
of the common ones) to implement the same target-independent restriction.
This is not an x86-specific issue.

So you prefer a generic solution which makes CANNOT_CHANGE_MODE_CLASS
return true for a vector mode subreg if the subreg byte != 0.  Is this
correct?

Do you mean a generic solution for C_C_M_C to return true for non-zero
byte_offset vector subregs in the context of x86?

I want to clarify because in the context of 32-bit ARM little-endian, a
non-zero byte-offset vector subreg can still be a valid full hardreg, e.g. for

   (subreg:DI (reg:V4SF) 8)

C_C_M_C can return 'false' as this can be resolved to a full D-reg.
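
To make the ARM versus x86 contrast concrete, here is a small sketch of
the "resolves to a full hard register" test.  It is only an illustration:
the helper name is hypothetical, and it assumes the value occupies
consecutive hard registers of equal size with no padding (a V4SF value
living in two 8-byte D registers on ARM, or in one 16-byte register on
x86).  It does not model lowpart subregs or any other case that targets
may accept.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not a GCC interface.  Returns true when a subreg
   of YSIZE bytes at byte OFFSET starts on a hard-register boundary and
   spans a whole number of hard registers of REG_SIZE bytes each.  */
bool
subreg_is_whole_hardregs (unsigned offset, unsigned ysize, unsigned reg_size)
{
  assert (reg_size > 0 && ysize > 0);
  return offset % reg_size == 0 && ysize % reg_size == 0;
}
```

With 8-byte registers, subreg_is_whole_hardregs (8, 8, 8) is true, which
matches the (subreg:DI (reg:V4SF) 8) example.  With a single 16-byte
register, subreg_is_whole_hardregs (8, 8, 16) is false.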


Does that mean subreg byte interpretation is endian-dependent?
Both little endian

(subreg:DI (reg:V4SF) 0)

and big endian

(subreg:DI (reg:V4SF) MAX_BITSIZE_MODE_ANY_MODE / BITS_PER_UNIT)

refer to the same lower 64 bits of reg:V4SF.  Is this correct?


If my understanding of endianness representation in RTL registers is correct, yes.

I said little-endian because C_C_M_C is currently gated on TARGET_BIG_ENDIAN in arm.h.
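
For reference, a minimal sketch of how the byte offset of the low part
depends on endianness.  It is modelled on my understanding of the
arithmetic in subreg_lowpart_offset, so treat it as illustrative rather
than authoritative.  The function name and parameters are hypothetical;
the two flags and units_per_word stand in for WORDS_BIG_ENDIAN,
BYTES_BIG_ENDIAN and UNITS_PER_WORD.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: byte offset at which the low OUTER_SIZE bytes of
   an INNER_SIZE-byte value are found.  Sizes are in bytes.  */
unsigned
lowpart_byte_offset (unsigned outer_size, unsigned inner_size,
                     unsigned units_per_word,
                     bool words_big_endian, bool bytes_big_endian)
{
  assert (units_per_word > 0);

  unsigned offset = 0;
  if (inner_size > outer_size)
    {
      unsigned difference = inner_size - outer_size;

      /* Whole words that precede the low part in memory order.  */
      if (words_big_endian)
        offset += (difference / units_per_word) * units_per_word;

      /* Remaining bytes within a word that precede the low part.  */
      if (bytes_big_endian)
        offset += difference % units_per_word;
    }
  return offset;
}
```

Under these assumptions, the low 8 bytes of a 16-byte value sit at byte 0
when both flags are false and at byte 8 when both are true, so the same
bits are named by different subreg bytes on the two kinds of target.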

Thanks,
Tejas.

