Consecutive vzip, vuzp or vtrn intrinsics overwrite the destination register.

Compiler:

gcc -v
Using built-in specs.
COLLECT_GCC=/usr/local/gcc/4.6/bin/gcc
COLLECT_LTO_WRAPPER=/usr/local/gcc/4.6/libexec/gcc/armv7l-unknown-linux-gnueabi/4.6.0/lto-wrapper
Target: armv7l-unknown-linux-gnueabi
Configured with: ../gcc-4.6/configure --prefix=/usr/local/gcc/4.6 --enable-languages=c --with-arch=armv7-a --with-float=softfp --with-fpu=vfpv3-d16
Thread model: posix
gcc version 4.6.0 20110323 (prerelease) (GCC)

Test case:

#include <arm_neon.h>
#include <stdio.h>

int main(void)
{
    uint8x8_t v1 = {1, 1, 1, 1, 1, 1, 1, 1};
    uint8x8_t v2 = {2, 2, 2, 2, 2, 2, 2, 2};
    uint8x8x2_t vd1, vd2;
    union {uint8x8_t v; uint8_t buf[8];} d1, d2, d3, d4;
    int i;

    vd1 = vzip_u8(v1, vdup_n_u8(0));
    vd2 = vzip_u8(v2, vdup_n_u8(0));

    vst1_u8(d1.buf, vd1.val[0]);
    vst1_u8(d2.buf, vd1.val[1]);
    vst1_u8(d3.buf, vd2.val[0]);
    vst1_u8(d4.buf, vd2.val[1]);

    printf(" d1 d2 d3 d4\n");
    for (i = 0; i < 8; i++) {
        printf("%4d%4d%4d%4d\n", d1.buf[i], d2.buf[i], d3.buf[i], d4.buf[i]);
    }

    return 0;
}

---------------------

Compile flags: -mfloat-abi=softfp -mfpu=neon -O2

Output:

 d1 d2 d3 d4
   1   1   2   1
   0   0   0   0
   1   1   2   1
   0   0   0   0
   1   1   2   1
   0   0   0   0
   1   1   2   1
   0   0   0   0

d4 is wrong: it holds vd2.val[1], so its even rows should read 2 (as d3 does), not 1.
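For reference, the expected d4 column can be derived without NEON hardware. The following is a minimal scalar sketch of the lane semantics of vzip_u8, assuming the usual interleave definition; zip_u8_model is a hypothetical helper name and is not part of arm_neon.h.

```c
#include <stdint.h>

/* Hypothetical scalar model of vzip_u8 (not an arm_neon.h function).
   lo corresponds to val[0]: lower halves of a and b, interleaved.
   hi corresponds to val[1]: upper halves of a and b, interleaved. */
static void zip_u8_model(const uint8_t a[8], const uint8_t b[8],
                         uint8_t lo[8], uint8_t hi[8])
{
    int i;

    for (i = 0; i < 4; i++) {
        lo[2 * i]     = a[i];
        lo[2 * i + 1] = b[i];
        hi[2 * i]     = a[i + 4];
        hi[2 * i + 1] = b[i + 4];
    }
}
```

Calling the model with a filled with 2 and b filled with 0 gives 2 in the even lanes and 0 in the odd lanes of both lo and hi, which is what d3 and d4 should both show in the test case.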
Some additional info about the gcc version tested.

URL: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_6-branch
Revision: 171340
Confirmed. IIRC, this is something for which Ira had a patch. Adding her to the CC. ramana
Created attachment 23881: patch
I attached the patch. I am going to test and submit it. Ira
Author: irar
Date: Mon Apr 18 07:14:22 2011
New Revision: 172639

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=172639
Log:
    PR target/48252
    * config/arm/arm.c (neon_emit_pair_result_insn): Swap arguments
    to match neon_vzip/vuzp/vtrn_internal.
    * config/arm/neon.md (neon_vtrn<mode>_internal): Make both
    outputs explicitly dependent on both inputs.
    (neon_vzip<mode>_internal, neon_vuzp<mode>_internal): Likewise.

Added:
    trunk/gcc/testsuite/gcc.target/arm/pr48252.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/arm/arm.c
    trunk/gcc/config/arm/neon.md
    trunk/gcc/testsuite/ChangeLog
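The log entry above says both outputs are made dependent on both inputs. As an illustration only, the scalar sketch below models the lane semantics of the other two affected intrinsics, vuzp_u8 and vtrn_u8, where each result vector draws lanes from both operands. The helper names are hypothetical, and the sketch does not reproduce the actual change to neon.md.

```c
#include <stdint.h>

/* Hypothetical scalar model of vuzp_u8 (de-interleave).
   even corresponds to val[0]: even lanes of a, then even lanes of b.
   odd  corresponds to val[1]: odd lanes of a, then odd lanes of b. */
static void uzp_u8_model(const uint8_t a[8], const uint8_t b[8],
                         uint8_t even[8], uint8_t odd[8])
{
    int i;

    for (i = 0; i < 4; i++) {
        even[i]     = a[2 * i];
        even[i + 4] = b[2 * i];
        odd[i]      = a[2 * i + 1];
        odd[i + 4]  = b[2 * i + 1];
    }
}

/* Hypothetical scalar model of vtrn_u8 (2x2 transpose of lane pairs).
   t0 corresponds to val[0]: a0 b0 a2 b2 a4 b4 a6 b6.
   t1 corresponds to val[1]: a1 b1 a3 b3 a5 b5 a7 b7. */
static void trn_u8_model(const uint8_t a[8], const uint8_t b[8],
                         uint8_t t0[8], uint8_t t1[8])
{
    int i;

    for (i = 0; i < 4; i++) {
        t0[2 * i]     = a[2 * i];
        t0[2 * i + 1] = b[2 * i];
        t1[2 * i]     = a[2 * i + 1];
        t1[2 * i + 1] = b[2 * i + 1];
    }
}
```

In both models every output array reads from a and from b, so neither result can be computed from one operand alone.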
Author: irar
Date: Thu May 5 07:34:59 2011
New Revision: 173417

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=173417
Log:
    Backport from mainline:
    2011-04-18  Ulrich Weigand  <ulrich.weigand@linaro.org>
                Ira Rosen  <ira.rosen@linaro.org>

    PR target/48252
    * config/arm/arm.c (neon_emit_pair_result_insn): Swap arguments
    to match neon_vzip/vuzp/vtrn_internal.
    * config/arm/neon.md (neon_vtrn<mode>_internal): Make both
    outputs explicitly dependent on both inputs.
    (neon_vzip<mode>_internal, neon_vuzp<mode>_internal): Likewise.

Added:
    branches/gcc-4_5-branch/gcc/testsuite/gcc.target/arm/pr48252.c
Modified:
    branches/gcc-4_5-branch/gcc/ChangeLog
    branches/gcc-4_5-branch/gcc/config/arm/arm.c
    branches/gcc-4_5-branch/gcc/config/arm/neon.md
    branches/gcc-4_5-branch/gcc/testsuite/ChangeLog
Author: irar
Date: Thu May 5 08:39:40 2011
New Revision: 173418

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=173418
Log:
    Backport from mainline:
    2011-04-18  Ulrich Weigand  <ulrich.weigand@linaro.org>
                Ira Rosen  <ira.rosen@linaro.org>

    PR target/48252
    * config/arm/arm.c (neon_emit_pair_result_insn): Swap arguments
    to match neon_vzip/vuzp/vtrn_internal.
    * config/arm/neon.md (neon_vtrn<mode>_internal): Make both
    outputs explicitly dependent on both inputs.
    (neon_vzip<mode>_internal, neon_vuzp<mode>_internal): Likewise.

Added:
    branches/gcc-4_6-branch/gcc/testsuite/gcc.target/arm/pr48252.c
Modified:
    branches/gcc-4_6-branch/gcc/ChangeLog
    branches/gcc-4_6-branch/gcc/config/arm/arm.c
    branches/gcc-4_6-branch/gcc/config/arm/neon.md
    branches/gcc-4_6-branch/gcc/testsuite/ChangeLog
Fixed on 4.5, 4.6 and 4.7.
Author: ramana
Date: Fri May 6 10:21:26 2011
New Revision: 173480

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=173480
Log:
    2011-05-06  Ramana Radhakrishnan  <ramana.radhakrishnan@linaro.org>

    PR target/48252
    * config/arm/neon.md (neon_vtrn<mode>): Fix typo from earlier commit.

Modified:
    branches/gcc-4_6-branch/gcc/ChangeLog
    branches/gcc-4_6-branch/gcc/config/arm/neon.md
*** Bug 49061 has been marked as a duplicate of this bug. ***