Bug 48252 - ARM neon: problem with consecutive vzip, vuzp and vtrn
Summary: ARM neon: problem with consecutive vzip, vuzp and vtrn
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 4.6.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Duplicates: 49061
Depends on:
Blocks:
 
Reported: 2011-03-23 12:52 UTC by Johan Kristell
Modified: 2011-05-20 07:52 UTC (History)
CC List: 5 users

See Also:
Host:
Target: arm-linux-gnueabi
Build:
Known to work:
Known to fail:
Last reconfirmed: 2011-04-04 21:21:06


Attachments
patch (669 bytes, patch)
2011-04-05 09:33 UTC, Ira Rosen

Description Johan Kristell 2011-03-23 12:52:44 UTC
Consecutive vzip, vuzp or vtrn intrinsics overwrite the destination register of an earlier call.

Compiler:

gcc -v
Using built-in specs.
COLLECT_GCC=/usr/local/gcc/4.6/bin/gcc
COLLECT_LTO_WRAPPER=/usr/local/gcc/4.6/libexec/gcc/armv7l-unknown-linux-gnueabi/4.6.0/lto-wrapper
Target: armv7l-unknown-linux-gnueabi
Configured with: ../gcc-4.6/configure --prefix=/usr/local/gcc/4.6 --enable-languages=c --with-arch=armv7-a --with-float=softfp --with-fpu=vfpv3-d16
Thread model: posix
gcc version 4.6.0 20110323 (prerelease) (GCC) 


Test case:

#include <arm_neon.h>
#include <stdio.h>

int main(void)
{
    uint8x8_t v1 = {1, 1, 1, 1, 1, 1, 1, 1};
    uint8x8_t v2 = {2, 2, 2, 2, 2, 2, 2, 2};
    uint8x8x2_t vd1, vd2;
    union {uint8x8_t v; uint8_t buf[8];} d1, d2, d3, d4;
    int i;

    vd1 = vzip_u8(v1, vdup_n_u8(0));
    vd2 = vzip_u8(v2, vdup_n_u8(0));

    vst1_u8(d1.buf, vd1.val[0]);
    vst1_u8(d2.buf, vd1.val[1]);
    vst1_u8(d3.buf, vd2.val[0]);
    vst1_u8(d4.buf, vd2.val[1]);

    printf("  d1  d2  d3  d4\n");
    for (i = 0; i < 8; i++) {
        printf("%4d%4d%4d%4d\n",
        d1.buf[i],
        d2.buf[i],
        d3.buf[i],
        d4.buf[i]);
    }

    return 0;
}

---------------------

Compile flags: -mfloat-abi=softfp -mfpu=neon -O2

Output:

  d1  d2  d3  d4
   1   1   2   1
   0   0   0   0
   1   1   2   1
   0   0   0   0
   1   1   2   1
   0   0   0   0
   1   1   2   1
   0   0   0   0

d4 is wrong: it should match d3 (2, 0, 2, 0, ...), but its even lanes hold 1, the value from the first vzip.
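
For comparison, the expected lane values can be reproduced on any host with a scalar model of the 8-lane zip. This is only a sketch: zip_u8_model, expected_lanes and print_expected are hypothetical helpers, not part of arm_neon.h, and the model assumes the documented behaviour of vzip_u8 (val[0] interleaves the low halves of the two inputs, val[1] the high halves).

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical scalar model of vzip_u8 (not part of arm_neon.h):
   lo interleaves lanes 0..3 of a and b, hi interleaves lanes 4..7. */
void zip_u8_model(const uint8_t a[8], const uint8_t b[8],
                  uint8_t lo[8], uint8_t hi[8])
{
    int i;

    for (i = 0; i < 4; i++) {
        lo[2 * i]     = a[i];
        lo[2 * i + 1] = b[i];
        hi[2 * i]     = a[i + 4];
        hi[2 * i + 1] = b[i + 4];
    }
}

/* Compute the four vectors the test case stores:
   d[0]..d[3] stand for d1..d4. */
void expected_lanes(uint8_t d[4][8])
{
    uint8_t v1[8], v2[8], zero[8];
    int i;

    for (i = 0; i < 8; i++) {
        v1[i] = 1;
        v2[i] = 2;
        zero[i] = 0;
    }
    zip_u8_model(v1, zero, d[0], d[1]);
    zip_u8_model(v2, zero, d[2], d[3]);
}

/* Print the modelled values in the same layout as the test case. */
void print_expected(void)
{
    uint8_t d[4][8];
    int i;

    expected_lanes(d);
    printf("  d1  d2  d3  d4\n");
    for (i = 0; i < 8; i++)
        printf("%4d%4d%4d%4d\n", d[0][i], d[1][i], d[2][i], d[3][i]);
}
```

Calling print_expected() from a main function gives a table in which the d4 column equals the d3 column, which is what the NEON test case should print as well.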
Comment 1 Johan Kristell 2011-03-29 07:18:37 UTC
Some additional info about the gcc version tested.

URL: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_6-branch
Revision: 171340
Comment 2 Ramana Radhakrishnan 2011-04-04 21:21:06 UTC
Confirmed. IIRC, this is something for which Ira had a patch. Adding her to the CC.

ramana
Comment 3 Ira Rosen 2011-04-05 09:33:40 UTC
Created attachment 23881
patch
Comment 4 Ira Rosen 2011-04-05 09:34:37 UTC
I attached the patch. I am going to test and submit it.

Ira
Comment 5 irar 2011-04-18 07:14:26 UTC
Author: irar
Date: Mon Apr 18 07:14:22 2011
New Revision: 172639

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=172639
Log:

	PR target/48252
	* config/arm/arm.c (neon_emit_pair_result_insn): Swap arguments
	to match neon_vzip/vuzp/vtrn_internal.
	* config/arm/neon.md (neon_vtrn<mode>_internal): Make both
	outputs explicitly dependent on both inputs.
	(neon_vzip<mode>_internal, neon_vuzp<mode>_internal): Likewise.


Added:
    trunk/gcc/testsuite/gcc.target/arm/pr48252.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/arm/arm.c
    trunk/gcc/config/arm/neon.md
    trunk/gcc/testsuite/ChangeLog
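
As background for "both outputs explicitly dependent on both inputs": the VZIP instruction works in place, overwriting both of its register operands, and each result register takes lanes from both source registers. The scalar sketch below only illustrates that property. vzip8_inplace_model is a hypothetical name, and this is not the code in neon.md or arm.c.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of an in-place 8-lane zip.
   On return, d holds the interleaved low halves of the original
   d and m, and m holds the interleaved high halves. Each output
   reads lanes from BOTH original inputs, so the inputs are copied
   before anything is overwritten. */
void vzip8_inplace_model(uint8_t d[8], uint8_t m[8])
{
    uint8_t old_d[8], old_m[8];
    int i;

    memcpy(old_d, d, sizeof old_d);
    memcpy(old_m, m, sizeof old_m);

    for (i = 0; i < 4; i++) {
        d[2 * i]     = old_d[i];
        d[2 * i + 1] = old_m[i];
        m[2 * i]     = old_d[i + 4];
        m[2 * i + 1] = old_m[i + 4];
    }
}
```

A description that let either output depend on only one input would allow the compiler to treat the other register as unchanged, which is consistent with the overwritten d4 in the test case.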
Comment 6 irar 2011-05-05 07:35:03 UTC
Author: irar
Date: Thu May  5 07:34:59 2011
New Revision: 173417

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=173417
Log:

	Backport from mainline:
	2011-04-18  Ulrich Weigand  <ulrich.weigand@linaro.org>
		    Ira Rosen  <ira.rosen@linaro.org>

	PR target/48252
	* config/arm/arm.c (neon_emit_pair_result_insn): Swap arguments
	to match neon_vzip/vuzp/vtrn_internal.
	* config/arm/neon.md (neon_vtrn<mode>_internal): Make both
	outputs explicitly dependent on both inputs.
	(neon_vzip<mode>_internal, neon_vuzp<mode>_internal): Likewise.


Added:
    branches/gcc-4_5-branch/gcc/testsuite/gcc.target/arm/pr48252.c
Modified:
    branches/gcc-4_5-branch/gcc/ChangeLog
    branches/gcc-4_5-branch/gcc/config/arm/arm.c
    branches/gcc-4_5-branch/gcc/config/arm/neon.md
    branches/gcc-4_5-branch/gcc/testsuite/ChangeLog
Comment 7 irar 2011-05-05 08:39:47 UTC
Author: irar
Date: Thu May  5 08:39:40 2011
New Revision: 173418

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=173418
Log:

	Backport from mainline:
	2011-04-18  Ulrich Weigand  <ulrich.weigand@linaro.org>
		    Ira Rosen  <ira.rosen@linaro.org>

	PR target/48252
	* config/arm/arm.c (neon_emit_pair_result_insn): Swap arguments
	to match neon_vzip/vuzp/vtrn_internal.
	* config/arm/neon.md (neon_vtrn<mode>_internal): Make both
	outputs explicitly dependent on both inputs.
	(neon_vzip<mode>_internal, neon_vuzp<mode>_internal): Likewise.


Added:
    branches/gcc-4_6-branch/gcc/testsuite/gcc.target/arm/pr48252.c
Modified:
    branches/gcc-4_6-branch/gcc/ChangeLog
    branches/gcc-4_6-branch/gcc/config/arm/arm.c
    branches/gcc-4_6-branch/gcc/config/arm/neon.md
    branches/gcc-4_6-branch/gcc/testsuite/ChangeLog
Comment 8 Ira Rosen 2011-05-05 08:40:53 UTC
Fixed on 4.5, 4.6 and 4.7.
Comment 9 Ramana Radhakrishnan 2011-05-06 10:21:30 UTC
Author: ramana
Date: Fri May  6 10:21:26 2011
New Revision: 173480

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=173480
Log:

2011-05-06  Ramana Radhakrishnan  <ramana.radhakrishnan@linaro.org>

        PR target/48252
        * config/arm/neon.md (neon_vtrn<mode>): Fix typo
        from earlier commit.


Modified:
    branches/gcc-4_6-branch/gcc/ChangeLog
    branches/gcc-4_6-branch/gcc/config/arm/neon.md
Comment 10 Ira Rosen 2011-05-20 07:52:14 UTC
*** Bug 49061 has been marked as a duplicate of this bug. ***