This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |
Other format: | [Raw text] |
Richard Sandiford wrote:
Tejas Belagod <tbelagod@arm.com> writes:Richard Sandiford wrote:What I had in mind when I implemented this was a partial-reg result, but obviously it was wrong.Tejas Belagod <tbelagod@arm.com> writes:Richard Sandiford wrote:Tejas Belagod <tbelagod@arm.com> writes:Richard Sandiford wrote:Tejas Belagod <tbelagod@arm.com> writes:+ /* This is big-endian-safe because the elements are kept in target + memory order. So, for eg. PARALLEL element value of 2 is the same in + either endian-ness. */ + if (GET_CODE (src) == VEC_SELECT + && REG_P (XEXP (src, 0)) && REG_P (dst) + && REGNO (XEXP (src, 0)) == REGNO (dst)) + { + rtx par = XEXP (src, 1); + int i; + + for (i = 0; i < XVECLEN (par, 0); i++) + { + rtx tem = XVECEXP (par, 0, i); + if (!CONST_INT_P (tem) || INTVAL (tem) != i) + return 0; + } + return 1; + } +I think for big endian it needs to be: INTVAL (tem) != i + base where base is something like: int base = GET_MODE_NUNITS (GET_MODE (XEXP (src, 0))) - XVECLEN (par, 0); E.g. a big-endian V4HI looks like: msb lsb 0000111122223333 and shortening it to say V2HI only gives the low 32 bits: msb lsb 22223333But, in this case we want msb lsb 00001111It depends on whether the result occupies a full register or not. I was thinking of the case where it didn't, but I realise now you were thinking of the case where it did. And yeah, my suggestion doesn't cope with that...I was under the impression that the const vector parallel for vec_select represents the element indexes of the array in memory order.Therefore, in bigendian, msb lsb 0000 1111 2222 3333 element a[0] a[1] a[2] a[3] and in littleendian msb lsb 3333 2222 1111 0000 element a[3] a[2] a[1] a[0]Right. But if an N-bit value is stored in a register, it's assumed to occupy the lsb of the register and the N-1 bits above that. The other bits in the register are don't-care. E.g., leaving vectors to one side, if you have: (set (reg:HI N) (truncate:SI (reg:SI N))) on a 32-bit !TRULY_NOOP_TRUNCATION target, it shortens like this: msb lsb 01234567 VVVV xxxx4567 rather than: msb lsb 01234567 VVVV 0123xxxx for both endiannesses. The same principle applies to vectors. The lsb of the register is always assumed to be significant. So maybe the original patch was correct for partial-register and full-register results on little-endian, but only for full-register results on big-endian.Ah, ok! I think I get it. By eliminating set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 0])))) using the check INTVAL (tem) != i, I'm essentially making subsequent operations use (reg:V2DI n) in DI mode which is a partial register result and this gives methe wrong set of lanes in bigendian. So, if I want to use (reg n) in partial register mode, I have to make sure the correct elements coincide withthe lsb in big-endian...Thanks for your input, I'll apply the offset correction for big-endian you suggested. I'll respin the patch.Thanks. Just for avoidance of doubt, the result might be a full or partial register, depending on the mode and target. I was trying to correct myself by agreeing that your original was right and mine was wrong for big-endian if the result is a full register.Sorry, I'm going to take a step back - I'm trying to figure out what a full register result would look like. Looking at the pattern,set( (reg:DI n) vec_select:DI ( (reg:V2DI n) (parallel [const 0]))))the result is always a mode smaller than the src if the vec_select selects a subset of lanes. Plus if the src and dst are the same reg, because we're re-writing the src reg, wouldn't it always end up being a partial-reg?In this case, wouldn't (reg:DI n) always represent msb lsb 22223333The problem is that one reg rtx can span several hard registers. E.g. (reg:V4SI 32) might represent one 64-bit register (no. 32), but it might instead represent two 32-bit registers (nos. 32 and 33). Obviously the latter's not very likely for vectors this small, but more likely for larger ones (including on NEON IIRC). So if we had 2 32-bit registers being treated as a V4HI, it would be: <--32--><--33--> msb lsb 0000111122223333 VVVVVVVV 00001111 msb lsb <--32--> for big endian and: <--33--><--32--> msb lsb 3333222211110000 VVVVVVVV 11110000 msb lsb <--32--> for little endian.
Ah, ok, that makes things clearer. Thanks for that.I can't find any helper function that figures out if we're writing partial or full result regs. Would something like
REGNO (src) == REGNO (dst) && HARD_REGNO_NREGS (src) == HARD_REGNO_NREGS (dst) == 1 be a sane check for partial result regs? Thanks, Tejas.
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |