70963 – vec_cts/vec_ctf intrinsics produce wrong results for 64-bit floating point

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	target
Version:	5.3.1

Importance:	P3 normal
Target Milestone:	---
Assignee:	Not yet assigned to anyone

URL:
Keywords:	wrong-code

Depends on:
Blocks:

Reported:	2016-05-05 16:06 UTC by Konstantinos Margaritis
Modified:	2016-05-10 17:25 UTC
CC List:	4 users

See Also:
Host:	powerpc64le-linux-gnu
Target:	powerpc64le-linux-gnu
Build:	powerpc64le-linux-gnu
Known to work:
Known to fail:	5.3.1, 7.0
Last reconfirmed:	2016-05-08 00:00:00

Attachments
small test program to verify vec_cts/vec_ctf working on doubles (482 bytes, text/x-csrc) 2016-05-05 16:06 UTC, Konstantinos Margaritis

Description Konstantinos Margaritis 2016-05-05 16:06:25 UTC

Created attachment 38420
small test program to verify vec_cts/vec_ctf working on doubles

I noticed some Eigen tests failing in the VSX code for 64-bit doubles, even though the algorithm looked perfectly normal. Further investigation, including a comparison against the output of an SSE test program given the same inputs, showed that the conversion to integer yielded all-zero results on VSX (vec_cts basically returned zero vectors). I've written a small test case that verifies this on both big- and little-endian VSX-capable systems (compiled with -m64 -mvsx). With the intrinsic the result is wrong; with the inline asm version it works as expected. I could not test a more recent gcc, so it may well be fixed already, but it would be great if the fix were backported to gcc 5.
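For readers without the attachment, the check described above can be sketched roughly as follows. This is not the attached test program: the helper names cts_ref, ctf_ref and check_scale0 are made up for illustration, and the scalar model deliberately ignores the saturation and NaN handling of the real instructions. On a target without VSX only the scalar model is exercised.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__VSX__)
#include <altivec.h>
#endif

/* Scalar model of vec_cts on one double lane: multiply by 2^scale, then
   truncate toward zero.  Saturation and NaN handling are omitted. */
static int64_t cts_ref(double x, unsigned scale)
{
    assert(scale < 32);   /* the intrinsic accepts scale factors 0..31 */
    return (int64_t)(x * (double)(UINT64_C(1) << scale));
}

/* Scalar model of vec_ctf on one 64-bit lane: convert to double, then
   divide by 2^scale. */
static double ctf_ref(int64_t x, unsigned scale)
{
    assert(scale < 32);
    return (double)x / (double)(UINT64_C(1) << scale);
}

/* Returns 1 if the conversions with scale 0 agree with the scalar model.
   With VSX the intrinsics are compared against the model; without VSX
   only the model's own round trip is checked. */
static int check_scale0(double a, double b)
{
#if defined(__VSX__)
    vector double in = { a, b };
    vector signed long long i = vec_cts(in, 0);
    vector double d = vec_ctf(i, 0);
    return i[0] == cts_ref(a, 0) && i[1] == cts_ref(b, 0)
        && d[0] == ctf_ref(i[0], 0) && d[1] == ctf_ref(i[1], 0);
#else
    return ctf_ref(cts_ref(a, 0), 0) == (double)cts_ref(a, 0)
        && ctf_ref(cts_ref(b, 0), 0) == (double)cts_ref(b, 0);
#endif
}
```

On an affected compiler one would expect check_scale0 to return 0 for nonzero inputs when built with -m64 -mvsx, since the intrinsic path then disagrees with the scalar model.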

Some asm output from the attached test program follows:

vec_cts:
    1000066c:   60 67 00 f0     xvcvdpsxds vs0,vs12
    10000670:   50 02 00 f0     xxswapd vs0,vs0


asm:
    10000674:   51 02 00 f0     xxswapd vs32,vs0
    10000678:   60 07 00 f0     xvcvdpsxds vs0,vs0
    1000067c:   56 02 00 f0     xxswapd vs0,vs32


vec_ctf:
    100006f8:   50 02 00 f0     xxswapd vs0,vs0
    100006fc:   e0 07 00 f0     xvcvsxddp vs0,vs0
    10000700:   50 02 00 f0     xxswapd vs0,vs0

asm:
    100006f8:   51 02 00 f0     xxswapd vs32,vs0
    100006fc:   e0 07 00 f0     xvcvsxddp vs0,vs0
    10000700:   56 02 00 f0     xxswapd vs0,vs32

Comment 1 David Edelsohn 2016-05-08 03:24:39 UTC

Confirmed.

Comment 2 Bill Schmidt 2016-05-09 22:05:33 UTC

The xxswapd's are a bit of a red herring.  They are part of the little-endian normalization code that is required with the funky lxvd2x and stxvd2x instructions.  The problem appears to be the register assignment on the instructions generated for vec_cts and vec_ctf.  The use of vs12 on vec_cts is an obvious problem, since vs12 doesn't contain any value assigned in the function.  The code for vec_ctf looks fine.  So we need to figure out what happened with the register number on xvcvdpsxds.

The problem still exists on trunk.

Comment 3 Bill Schmidt 2016-05-09 22:11:55 UTC

Note also that your asm constraints are wrong.  You need VSX registers, not Altivec registers, so you should be using the "wa" constraint instead of the "v" constraint.  This is why you get some apparently wrong register numbers with your asm results.
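A minimal sketch of what the corrected constraints could look like, assuming a wrapper named dp_to_sxd (a hypothetical name, not taken from the attachment).  The "wa" constraint lets the compiler pick any VSX register, and the %x operand modifier prints the full VSX register number.  The scalar branch exists only so the sketch also builds on targets without VSX.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__VSX__)
#include <altivec.h>
#endif

/* Truncate two doubles to signed 64-bit integers (round toward zero). */
static void dp_to_sxd(const double in[2], int64_t out[2])
{
    assert(in != 0 && out != 0);
#if defined(__VSX__)
    vector double v = { in[0], in[1] };
    vector signed long long r;
    /* "wa" allows any VSX register; "v" would restrict the operand to the
       Altivec half of the register file.  %x prints the VSX register
       number for whichever register was chosen. */
    __asm__ ("xvcvdpsxds %x0,%x1" : "=wa" (r) : "wa" (v));
    out[0] = r[0];
    out[1] = r[1];
#else
    /* Fallback for targets without VSX: plain C truncation. */
    out[0] = (int64_t)in[0];
    out[1] = (int64_t)in[1];
#endif
}
```

Because the conversion is lane-wise and the vector stays in a register, no xxswapd is needed around the instruction in this form.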

Comment 4 Bill Schmidt 2016-05-09 22:47:46 UTC

OK, there is an obvious bug in the define_expand for vsx_xvcvdpsxds_scale.  If the scale factor is 0, wrong code is always generated.  I'll get a patch going.
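The report does not show the define_expand itself, so the following is only a toy model of one way a zero scale factor can go wrong, consistent with the observation in comment 2 that the conversion read a register holding no assigned value.  It assumes the expander scales the input into a temporary and then converts from that temporary; the struct and function names are hypothetical and this is not the vsx.md code.

```c
#include <assert.h>

/* Where the final conversion instruction takes its input from. */
enum cvt_src { CVT_FROM_INPUT, CVT_FROM_TEMP };

/* Hypothetical summary of what an expander emits for one vec_cts call. */
struct expansion {
    int writes_temp;        /* 1 if a scaling multiply into a temp is emitted */
    enum cvt_src cvt_src;   /* operand the conversion reads */
};

/* Expansion with the zero-scale case handled explicitly: no multiply is
   emitted, so the conversion must read the input operand directly. */
static struct expansion expand_cts(int scale)
{
    struct expansion e;
    assert(scale >= 0 && scale < 32);
    if (scale == 0) {
        e.writes_temp = 0;
        e.cvt_src = CVT_FROM_INPUT;
    } else {
        e.writes_temp = 1;
        e.cvt_src = CVT_FROM_TEMP;
    }
    return e;
}

/* Assumed faulty variant: the multiply is skipped for scale 0, but the
   conversion still reads the temp. */
static struct expansion expand_cts_faulty(int scale)
{
    struct expansion e;
    assert(scale >= 0 && scale < 32);
    e.writes_temp = (scale != 0);
    e.cvt_src = CVT_FROM_TEMP;
    return e;
}

/* An expansion is sound if it never reads a temp it did not write. */
static int is_sound(struct expansion e)
{
    return e.cvt_src == CVT_FROM_INPUT || e.writes_temp;
}
```

In this model the faulty variant is unsound only for scale 0, which matches the symptom that vec_cts(x, 0) always produced wrong code while nonzero scale factors were not reported as broken.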

Comment 5 Konstantinos Margaritis 2016-05-10 06:35:48 UTC

Ack, thanks for the heads-up on the VSX registers; it does print more reasonable results now.

Comment 6 Bill Schmidt 2016-05-10 14:27:44 UTC

Author: wschmidt
Date: Tue May 10 14:27:12 2016
New Revision: 236082

URL: https://gcc.gnu.org/viewcvs?rev=236082&root=gcc&view=rev
Log:
[gcc]

2016-05-10  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	PR target/70963
	* config/rs6000/vsx.md (vsx_xvcvdpsxds_scale): Generate correct
	code for a zero scale factor.
	(vsx_xvcvdpuxds_scale): Likewise.

[gcc/testsuite]

2016-05-10  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	PR target/70963
	* gcc.target/powerpc/pr70963.c: New.


Added:
    trunk/gcc/testsuite/gcc.target/powerpc/pr70963.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/rs6000/vsx.md
    trunk/gcc/testsuite/ChangeLog

Comment 7 Bill Schmidt 2016-05-10 16:07:37 UTC

Author: wschmidt
Date: Tue May 10 16:07:04 2016
New Revision: 236089

URL: https://gcc.gnu.org/viewcvs?rev=236089&root=gcc&view=rev
Log:
[gcc]

2016-05-10  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	Backport from mainline
	2016-05-10  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	PR target/70963
	* config/rs6000/vsx.md (vsx_xvcvdpsxds_scale): Generate correct
	code for a zero scale factor.
	(vsx_xvcvdpuxds_scale): Likewise.

[gcc/testsuite]

2016-05-10  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	Backport from mainline
	2016-05-10  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	PR target/70963
	* gcc.target/powerpc/pr70963.c: New.


Added:
    branches/gcc-5-branch/gcc/testsuite/gcc.target/powerpc/pr70963.c
Modified:
    branches/gcc-5-branch/gcc/ChangeLog
    branches/gcc-5-branch/gcc/config/rs6000/vsx.md
    branches/gcc-5-branch/gcc/testsuite/ChangeLog

Comment 8 Bill Schmidt 2016-05-10 16:10:00 UTC

Author: wschmidt
Date: Tue May 10 16:09:28 2016
New Revision: 236091

URL: https://gcc.gnu.org/viewcvs?rev=236091&root=gcc&view=rev
Log:
[gcc]

2016-05-10  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	Backport from mainline
	2016-05-10  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	PR target/70963
	* config/rs6000/vsx.md (vsx_xvcvdpsxds_scale): Generate correct
	code for a zero scale factor.
	(vsx_xvcvdpuxds_scale): Likewise.

[gcc/testsuite]

2016-05-10  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	Backport from mainline
	2016-05-10  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	PR target/70963
	* gcc.target/powerpc/pr70963.c: New.


Added:
    branches/gcc-4_9-branch/gcc/testsuite/gcc.target/powerpc/pr70963.c
Modified:
    branches/gcc-4_9-branch/gcc/ChangeLog
    branches/gcc-4_9-branch/gcc/config/rs6000/vsx.md
    branches/gcc-4_9-branch/gcc/testsuite/ChangeLog

Comment 9 Bill Schmidt 2016-05-10 17:25:04 UTC

Author: wschmidt
Date: Tue May 10 17:24:32 2016
New Revision: 236097

URL: https://gcc.gnu.org/viewcvs?rev=236097&root=gcc&view=rev
Log:
[gcc]

2016-05-10  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	Backport from mainline
	2016-05-10  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	PR target/70963
	* config/rs6000/vsx.md (vsx_xvcvdpsxds_scale): Generate correct
	code for a zero scale factor.
	(vsx_xvcvdpuxds_scale): Likewise.

[gcc/testsuite]

2016-05-10  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	Backport from mainline
	2016-05-10  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	PR target/70963
	* gcc.target/powerpc/pr70963.c: New.


Added:
    branches/gcc-6-branch/gcc/testsuite/gcc.target/powerpc/pr70963.c
Modified:
    branches/gcc-6-branch/gcc/ChangeLog
    branches/gcc-6-branch/gcc/config/rs6000/vsx.md
    branches/gcc-6-branch/gcc/testsuite/ChangeLog

Comment 10 Bill Schmidt 2016-05-10 17:25:36 UTC

Fixed everywhere.