This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [Patch, AArch64] Fix shuffle for big-endian.
- From: Alan Lawrence <alan dot lawrence at arm dot com>
- To: Tejas Belagod <tbelagod at arm dot com>
- Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, Marcus Shawcroft <Marcus dot Shawcroft at arm dot com>
- Date: Wed, 12 Mar 2014 14:52:50 +0000
- Subject: Re: [Patch, AArch64] Fix shuffle for big-endian.
- References: <53077F31 dot 8070003 at arm dot com>
I've been doing some local testing using this patch as a basis for some of my
own work on NEON intrinsics, and it seems good to me. A couple of points:
(1) Re. the comment that "If two vectors, we end up with a wierd mixed-endian
mode on NEON": firstly, "wierd" should be spelt "weird"; secondly, if I
understand correctly, this comment belongs with the following "if
(!d->one_vector_p...)" rather than with the "if (BYTES_BIG_ENDIAN)" that it
currently precedes.
(2) As you say, this code is not exercised unless you remove the
'if (BYTES_BIG_ENDIAN) return false;' earlier in that same function. Can I
politely suggest you do that here in this patch?
(3) In my own regression testing, with const_vec_perm enabled on big-endian, I
see two PASS->FAIL changes, namely
gcc.dg/vect/vect-114.c scan-tree-dump-times vect "vectorized 0 loops" 1
gcc.dg/vect/vect-114.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 0 loops" 1
These are essentially noise: they disappear, and I see no other problems, if
(after this patch) I re-enable the testsuite's "vect_perm" target selector for
aarch64 big-endian (testsuite/lib/target-supports.exp). Would you like a
separate patch for that, or to roll it into this one?
Cheers, Alan
Tejas Belagod wrote:
> > Hi,
> >
> > When a shuffle of more than one input happens, on NEON we end up with a
> > 'mixed-endian' format in the register list which TBL operates on. We don't make
> > this correction in RTL and therefore the shuffle operation gets it incorrect.
> > Here is a patch that fixes-up the index table in the selector rtx in RTL to also
> > be mixed-endian to reflect what's happening on NEON.
> >
> > As trunk stands, this patch will not be exercised as constant vector permute for
> > Big-endian is disabled. I've tested this by locally enabling const vec_perm and
> > it fixes some of the regressions we have on big-endian:
> >
> > aarch64_be-none-elf:
> > FAIL->PASS: gcc.c-torture/execute/loop-11.c execution, -O3 -fomit-frame-pointer
> > FAIL->PASS: gcc.c-torture/execute/loop-11.c execution, -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions
> > FAIL->PASS: gcc.c-torture/execute/loop-11.c execution, -O3 -fomit-frame-pointer -funroll-loops
> > FAIL->PASS: gcc.c-torture/execute/loop-11.c execution, -O3 -g
> > FAIL->PASS: gcc.dg/torture/vector-shuffle1.c -O0 execution test
> > FAIL->PASS: gcc.dg/torture/vshuf-v16qi.c -O2 execution test
> > FAIL->PASS: gcc.dg/torture/vshuf-v2df.c -O2 execution test
> > FAIL->PASS: gcc.dg/torture/vshuf-v2di.c -O2 execution test
> > FAIL->PASS: gcc.dg/torture/vshuf-v2sf.c -O2 execution test
> > FAIL->PASS: gcc.dg/torture/vshuf-v2si.c -O2 execution test
> > FAIL->PASS: gcc.dg/torture/vshuf-v4sf.c -O2 execution test
> > FAIL->PASS: gcc.dg/torture/vshuf-v4si.c -O2 execution test
> > FAIL->PASS: gcc.dg/torture/vshuf-v8hi.c -O2 execution test
> > FAIL->PASS: gcc.dg/torture/vshuf-v8qi.c -O2 execution test
> > FAIL->PASS: gcc.dg/vect/vect-114.c -flto -ffat-lto-objects execution test
> > FAIL->PASS: gcc.dg/vect/vect-114.c execution test
> > FAIL->PASS: gcc.dg/vect/vect-15.c -flto -ffat-lto-objects execution test
> > FAIL->PASS: gcc.dg/vect/vect-15.c execution test
> >
> > Also regression-tested on aarch64-none-elf.
> >
> > OK for stage-1?
> >
> > Thanks,
> > Tejas.
> >
> > 2014-02-21 Tejas Belagod <tejas.belagod@arm.com>
> >
> > gcc/
> > * config/aarch64/aarch64.c (aarch64_evpc_tbl): Fix index vector for
> > big-endian when dealing with more than one input shuffle vector.
> >
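
For readers without the patch to hand, the index fix-up described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch (which is not shown in this message): it assumes each input vector has a power-of-two number of lanes, that big-endian reverses the lane order within each register of the TBL register list while leaving the order of the registers unchanged, and the helper names mixed_endian_index and fixup_selector are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not taken from the patch): map one selector index
   from the compiler's element numbering to the "mixed-endian" numbering
   seen across a two-register list on big-endian.  NUNITS is the lane
   count of ONE input vector and is assumed to be a power of two.  */
static unsigned
mixed_endian_index (unsigned idx, unsigned nunits)
{
  assert (nunits != 0 && (nunits & (nunits - 1)) == 0);
  assert (idx < 2 * nunits);
  /* Flip the lane position within its register; the bit that selects
     which of the two registers is used is left untouched.  */
  return idx ^ (nunits - 1);
}

/* Apply the fix-up to a whole selector of NELT indices.  */
static void
fixup_selector (const unsigned *perm, unsigned *out, size_t nelt,
                unsigned nunits)
{
  for (size_t i = 0; i < nelt; ++i)
    out[i] = mixed_endian_index (perm[i], nunits);
}
```

With 16 byte lanes per vector, for instance, index 0 maps to 15 and index 17 (lane 1 of the second vector) maps to 30: the position within the register is mirrored, but the index still refers to the second register.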