[Patch, AArch64] Fix shuffle for big-endian.

Alan Lawrence alan.lawrence@arm.com
Mon Mar 24 17:45:00 GMT 2014


Further to that - all looks good after one-liner 
http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01142.html has gone in. (Without 
that, enabling the code in Tejas' patch causes a regression in 
gcc.dg/torture/vshuf-v4hi.c as loading a vector constant goes wrong).

I'll send a patch to enable this and fix up the testsuite shortly...

Cheers, Alan

Alan Lawrence wrote:
> I've been doing some local testing using this patch as a basis for some of my 
> own work on NEON intrinsics, and it seems good to me. A couple of points:
> 
> (1) Re. the comment that "If two vectors, we end up with a wierd mixed-endian 
> mode on NEON": firstly "wierd" should be spelt "weird";
> secondly, if I understand right, this comment belongs with the next "if
> (!d->one_vector_p...)" rather than the "if (BYTES_BIG_ENDIAN)" before which it's
> written.
> 
> (2) as you say, this code is not exercised, unless you do something to remove 
> the 'if (BYTES_BIG_ENDIAN) return false;' earlier in that same function. Can I 
> politely suggest you do that here in this patch?
> 
> (3) In my own regression testing, with const_vec_perm enabled on big_endian, I 
> see 2*PASS->FAIL, namely
> 
> gcc.dg/vect/vect-114.c scan-tree-dump-times vect "vectorized 0 loops" 1
> 
> gcc.dg/vect/vect-114.c -flto -ffat-lto-objects  scan-tree-dump-times
> vect "vectorized 0 loops" 1
> 
> These are essentially noise, but the noise is removed and I see no other 
> problems, if (after this patch) I re-enable the testsuite's "vect_perm" target 
> selector for aarch64 big-endian (testsuite/lib/target-supports.exp). Would you 
> like a separate patch for that or roll it in here?
> 
> Cheers, Alan
> 
> Tejas Belagod wrote:
>> Hi,
>>
>> When a shuffle of more than one input happens, on NEON we end up with a 
>> 'mixed-endian' format in the register list which TBL operates on. We don't make 
>> this correction in RTL and therefore the shuffle operation gets it incorrect. 
>> Here is a patch that fixes-up the index table in the selector rtx in RTL to also 
>> be mixed-endian to reflect what's happening on NEON.
>>
>> As trunk stands, this patch will not be exercised as constant vector permute for 
>> Big-endian is disabled. I've tested this by locally enabling const vec_perm and 
>> it fixes the some regressions we have on big-endian:
>>
>> aarch64_be-none-elf:
>> FAIL->PASS: gcc.c-torture/execute/loop-11.c execution,  -O3 -fomit-frame-pointer
>> FAIL->PASS: gcc.c-torture/execute/loop-11.c execution,  -O3 -fomit-frame-pointer 
>> -funroll-all-loops -finline-functions
>> FAIL->PASS: gcc.c-torture/execute/loop-11.c execution,  -O3 -fomit-frame-pointer 
>> -funroll-loops
>> FAIL->PASS: gcc.c-torture/execute/loop-11.c execution,  -O3 -g
>> FAIL->PASS: gcc.dg/torture/vector-shuffle1.c  -O0  execution test
>> FAIL->PASS: gcc.dg/torture/vshuf-v16qi.c  -O2  execution test
>> FAIL->PASS: gcc.dg/torture/vshuf-v2df.c  -O2  execution test
>> FAIL->PASS: gcc.dg/torture/vshuf-v2di.c  -O2  execution test
>> FAIL->PASS: gcc.dg/torture/vshuf-v2sf.c  -O2  execution test
>> FAIL->PASS: gcc.dg/torture/vshuf-v2si.c  -O2  execution test
>> FAIL->PASS: gcc.dg/torture/vshuf-v4sf.c  -O2  execution test
>> FAIL->PASS: gcc.dg/torture/vshuf-v4si.c  -O2  execution test
>> FAIL->PASS: gcc.dg/torture/vshuf-v8hi.c  -O2  execution test
>> FAIL->PASS: gcc.dg/torture/vshuf-v8qi.c  -O2  execution test
>> FAIL->PASS: gcc.dg/vect/vect-114.c -flto -ffat-lto-objects execution test
>> FAIL->PASS: gcc.dg/vect/vect-114.c execution test
>> FAIL->PASS: gcc.dg/vect/vect-15.c -flto -ffat-lto-objects execution test
>> FAIL->PASS: gcc.dg/vect/vect-15.c execution test
>>
>> Also regressed on aarch64-none-elf.
>>
>> OK for stage-1?
>>
>> Thanks,
>> Tejas.
>>
>> 2014-02-21  Tejas Belagod  <tejas.belagod@arm.com>
>>
>> gcc/
>> 	* config/aarch64/aarch64.c (aarch64_evpc_tbl): Fix index vector for
>> 	big-endian when dealing with more than one input shuffle vector.
>>
> 
> 




More information about the Gcc-patches mailing list