This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [Patch, RTL] Eliminate redundant vec_select moves.

From: Tejas Belagod <tbelagod at arm dot com>
To: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, rdsandiford at googlemail dot com
Date: Thu, 07 Nov 2013 17:55:29 +0000
Subject: Re: [Patch, RTL] Eliminate redundant vec_select moves.
Authentication-results: sourceware.org; auth=none
References: <527A4309 dot 70209 at arm dot com> <8738n9sj8o dot fsf at talisman dot default> <527A5EF4 dot 5090505 at arm dot com> <87y551r01p dot fsf at talisman dot default> <527A7612 dot 2080406 at arm dot com> <877gcll9ht dot fsf at talisman dot default> <527BA073 dot 30900 at arm dot com> <87zjpg1d5p dot fsf at sandifor-thinkpad dot stglab dot manchester dot uk dot ibm dot com>

Richard Sandiford wrote:

Tejas Belagod <tbelagod@arm.com> writes:

Richard Sandiford wrote:

Tejas Belagod <tbelagod@arm.com> writes:

Richard Sandiford wrote:

Tejas Belagod <tbelagod@arm.com> writes:

Richard Sandiford wrote:

Tejas Belagod <tbelagod@arm.com> writes:

+  /* This is big-endian-safe because the elements are kept in target
+ memory order.  So, for eg. PARALLEL element value of 2 is the
same in
+     either endian-ness.  */
+  if (GET_CODE (src) == VEC_SELECT
+      && REG_P (XEXP (src, 0)) && REG_P (dst)
+      && REGNO (XEXP (src, 0)) == REGNO (dst))
+    {
+      rtx par = XEXP (src, 1);
+      int i;
+
+      for (i = 0; i < XVECLEN (par, 0); i++)
+	{
+	  rtx tem = XVECEXP (par, 0, i);
+	  if (!CONST_INT_P (tem) || INTVAL (tem) != i)
+	    return 0;
+	}
+      return 1;
+    }
+

I think for big endian it needs to be:

    INTVAL (tem) != i + base

where base is something like:

    int base = GET_MODE_NUNITS (GET_MODE (XEXP (src, 0))) - XVECLEN (par, 0);

E.g. a big-endian V4HI looks like:

    msb          lsb
    0000111122223333

and shortening it to say V2HI only gives the low 32 bits:

            msb  lsb
            22223333

But, in this case we want

         msb  lsb
         00001111

It depends on whether the result occupies a full register or not.
I was thinking of the case where it didn't, but I realise now you were
thinking of the case where it did.  And yeah, my suggestion doesn't
cope with that...

I was under the impression that the const vector parallel for vec_selectrepresents the element indexes of the array in memory order.
Therefore, in bigendian,

          msb             lsb
          0000 1111 2222 3333
element  a[0] a[1] a[2] a[3]

and in littleendian

          msb             lsb
          3333 2222 1111 0000
element  a[3] a[2] a[1] a[0]

Right.  But if an N-bit value is stored in a register, it's assumed to
occupy the lsb of the register and the N-1 bits above that.  The other
bits in the register are don't-care.

E.g., leaving vectors to one side, if you have:

   (set (reg:HI N) (truncate:SI (reg:SI N)))

on a 32-bit !TRULY_NOOP_TRUNCATION target, it shortens like this:

   msb  lsb
   01234567
       VVVV
   xxxx4567

rather than:

   msb  lsb
   01234567
   VVVV
   0123xxxx

for both endiannesses.  The same principle applies to vectors.
The lsb of the register is always assumed to be significant.

So maybe the original patch was correct for partial-register and
full-register results on little-endian, but only for full-register
results on big-endian.

Ah, ok! I think I get it. By eliminating
	set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 0]))))

using the check INTVAL (tem) != i, I'm essentially making subsequent
operations
use (reg:V2DI n) in DI mode which is a partial register result and this
gives me

the wrong set of lanes in bigendian. So, if I want to use (reg n) in partialregister mode, I have to make sure the correct elements coincide with

the lsb in
big-endian...

Thanks for your input, I'll apply the offset correction for big-endian yousuggested. I'll respin the patch.

Thanks.  Just for avoidance of doubt, the result might be a full or
partial register, depending on the mode and target.  I was trying to
correct myself by agreeing that your original was right and mine was
wrong for big-endian if the result is a full register.

What I had in mind when I implemented this was a partial-reg result, butobviously it was wrong.

Sorry, I'm going to take a step back - I'm trying to figure out what a fullregister result would look like. Looking at the pattern,


   set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 0]))))

the result is always a mode smaller than the src if the vec_select selects asubset of lanes. Plus if the src and dst are the same reg, because we'rere-writing the src reg, wouldn't it always end up being a partial-reg?

In this case, wouldn't (reg:DI n) always represent

      msb  lsb
      22223333


The problem is that one reg rtx can span several hard registers.
E.g. (reg:V4SI 32) might represent one 64-bit register (no. 32),
but it might instead represent two 32-bit registers (nos. 32 and 33).
Obviously the latter's not very likely for vectors this small,
but more likely for larger ones (including on NEON IIRC).

So if we had 2 32-bit registers being treated as a V4HI, it would be:

   <--32--><--33-->
   msb          lsb
   0000111122223333
   VVVVVVVV
   00001111
   msb  lsb
   <--32-->

for big endian and:

   <--33--><--32-->
   msb          lsb
   3333222211110000
           VVVVVVVV
           11110000
           msb  lsb
           <--32-->

for little endian.


Ah, ok, that makes things clearer. Thanks for that.

I can't find any helper function that figures out if we're writing partial orfull result regs. Would something like


    REGNO (src) == REGNO (dst) &&
    HARD_REGNO_NREGS (src) == HARD_REGNO_NREGS (dst) == 1

be a sane check for partial result regs?

Thanks,
Tejas.

Follow-Ups:
- Re: [Patch, RTL] Eliminate redundant vec_select moves.
  - From: Richard Sandiford

References:
- [Patch, RTL] Eliminate redundant vec_select moves.
  - From: Tejas Belagod
- Re: [Patch, RTL] Eliminate redundant vec_select moves.
  - From: Richard Sandiford
- Re: [Patch, RTL] Eliminate redundant vec_select moves.
  - From: Tejas Belagod
- Re: [Patch, RTL] Eliminate redundant vec_select moves.
  - From: Richard Sandiford
- Re: [Patch, RTL] Eliminate redundant vec_select moves.
  - From: Tejas Belagod
- Re: [Patch, RTL] Eliminate redundant vec_select moves.
  - From: Richard Sandiford
- Re: [Patch, RTL] Eliminate redundant vec_select moves.
  - From: Tejas Belagod
- Re: [Patch, RTL] Eliminate redundant vec_select moves.
  - From: Richard Sandiford

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]