Bug 48605 - gcc.target/i386/sse4_1-insertps-2.c FAILs with -mtune=geode - instruction insertps with memory operands behaves differently
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 4.7.0
Importance: P3 normal
Target Milestone: ---
Assignee: Jakub Jelinek
URL:
Keywords: ssemmx, wrong-code
Depends on:
Blocks:
 
Reported: 2011-04-14 10:28 UTC by Zdenek Sojka
Modified: 2011-04-16 11:32 UTC

See Also:
Host: x86_64-pc-linux-gnu
Target: i686-pc-linux-gnu, x86_64-pc-linux-gnu
Build:
Known to work:
Known to fail: 4.5.3, 4.6.1, 4.7.0
Last reconfirmed: 2011-04-14 13:45:46


Attachments
reduced testcase (352 bytes, text/plain)
2011-04-14 10:28 UTC, Zdenek Sojka
simpler testcase (221 bytes, text/plain)
2011-04-14 10:43 UTC, Zdenek Sojka
gcc46-pr48605.patch (1.42 KB, patch)
2011-04-14 15:17 UTC, Jakub Jelinek

Description Zdenek Sojka 2011-04-14 10:28:14 UTC
Created attachment 23978
reduced testcase

Output:
$ gcc -m32 -msse4.1 -O testcase.c -mtune=geode
$ ./a.out 
Aborted

When comparing the asm output with and without -mtune=geode, it looks very similar.
But "insertps" behaves differently when the source operand is a memory location. In that case the COUNT_S field of the immediate is ignored, and the element offset has to be encoded in the address of the memory operand instead.

Specifically:
insertps	xmm1, XMMWORD PTR [esp+64], 78	# tmp117, val.x,
behaves the same as:
insertps	xmm1, XMMWORD PTR [esp+64], 14	# tmp114, val.x,
and this should be used instead:
insertps	xmm1, XMMWORD PTR [esp+68], 14

This is what Intel's docs say as well:
IF (SRC = REG) THEN COUNT_S ← imm8[7:6]
ELSE COUNT_S ← 0
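
To make the difference concrete, here is a small software model of insertps written from the pseudocode quoted above. It is only a sketch: the names vec4 and insertps_model are made up for illustration, and the memory form is modelled as reading a single float at the given address, with COUNT_S forced to 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative model only; nothing here is GCC or Intel code. */
typedef struct { float f[4]; } vec4;

/* src_is_reg selects the register form, where src points at the four
   elements of the source register.  Otherwise src models the 32-bit
   memory operand and only src[0] is read, because COUNT_S is 0. */
static vec4 insertps_model(vec4 dst, const float *src, bool src_is_reg,
                           uint8_t imm8)
{
    unsigned count_s = src_is_reg ? (imm8 >> 6) & 3u : 0u; /* imm8[7:6] */
    unsigned count_d = (imm8 >> 4) & 3u;                   /* imm8[5:4] */
    unsigned zmask   = imm8 & 0xFu;                        /* imm8[3:0] */

    assert(count_s < 4 && count_d < 4);

    /* Insert the selected source element into the destination slot. */
    dst.f[count_d] = src[count_s];

    /* Zero every destination element whose ZMASK bit is set. */
    for (unsigned i = 0; i < 4; i++)
        if (zmask & (1u << i))
            dst.f[i] = 0.0f;

    return dst;
}
```

Under this model, the memory form with immediate 78 gives the same result as the memory form with immediate 14. Reading one float further (the [esp+68] case) with immediate 14 matches what the register form with immediate 78 would produce.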
Comment 1 Zdenek Sojka 2011-04-14 10:43:21 UTC
Created attachment 23979
simpler testcase

$ gcc testcase2.c -msse4.1 -O
$ ./a.out 
Aborted
Comment 2 Jakub Jelinek 2011-04-14 13:45:46 UTC
Mine.
Comment 3 Jakub Jelinek 2011-04-14 15:17:17 UTC
Created attachment 23981
gcc46-pr48605.patch

Untested fix.
Comment 4 Jakub Jelinek 2011-04-14 21:30:41 UTC
Author: jakub
Date: Thu Apr 14 21:30:37 2011
New Revision: 172458

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=172458
Log:
	PR target/48605
	* config/i386/sse.md (sse4_1_insertps): If operands[2] is a MEM,
	offset it as needed based on top 2 bits in operands[3], change
	MEM mode to SFmode and mask those 2 bits away from operands[3].

	* gcc.target/i386/sse4_1-insertps-3.c: New test.
	* gcc.target/i386/sse4_1-insertps-4.c: New test.
	* gcc.target/i386/avx-insertps-3.c: New test.
	* gcc.target/i386/avx-insertps-4.c: New test.

Added:
    trunk/gcc/testsuite/gcc.target/i386/avx-vinsertps-3.c
    trunk/gcc/testsuite/gcc.target/i386/avx-vinsertps-4.c
    trunk/gcc/testsuite/gcc.target/i386/sse4_1-insertps-3.c
    trunk/gcc/testsuite/gcc.target/i386/sse4_1-insertps-4.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/sse.md
    trunk/gcc/testsuite/ChangeLog
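
The adjustment described in the log entry above can be sketched in plain C. This is not the code from sse.md; the struct and function names are hypothetical, and the 4-byte element size is assumed from the SFmode mentioned in the log. The idea is that the top 2 bits of the immediate become a byte offset on the memory operand, and those bits are then cleared.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: what to add to the memory address and
   which immediate to emit in place of the original one. */
typedef struct {
    ptrdiff_t byte_offset; /* added to the memory operand's address */
    uint8_t   imm8;        /* immediate with bits 7:6 masked away */
} insertps_mem_fixup;

/* Assumed helper mirroring the ChangeLog text for a MEM source. */
static insertps_mem_fixup fixup_insertps_mem(uint8_t imm8)
{
    insertps_mem_fixup r;
    unsigned count_s = (imm8 >> 6) & 3u;  /* top 2 bits select the element */

    r.byte_offset = (ptrdiff_t)count_s * 4; /* one SFmode element = 4 bytes */
    r.imm8 = (uint8_t)(imm8 & 0x3Fu);       /* drop COUNT_S from the immediate */

    assert((r.imm8 >> 6) == 0);
    return r;
}
```

For the immediate 78 from the description, this gives a byte offset of 4 and an immediate of 14, which corresponds to turning [esp+64], 78 into [esp+68], 14.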
Comment 5 Jakub Jelinek 2011-04-15 10:21:04 UTC
Author: jakub
Date: Fri Apr 15 10:21:00 2011
New Revision: 172483

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=172483
Log:
	PR target/48605
	* config/i386/sse.md (avx_insertps, sse4_1_insertps): If operands[2]
	is a MEM, offset it as needed based on top 2 bits in operands[3],
	change MEM mode to SFmode and mask those 2 bits away from operands[3].

	* gcc.target/i386/sse4_1-insertps-3.c: New test.
	* gcc.target/i386/sse4_1-insertps-4.c: New test.
	* gcc.target/i386/avx-insertps-3.c: New test.
	* gcc.target/i386/avx-insertps-4.c: New test.

Added:
    branches/gcc-4_6-branch/gcc/testsuite/gcc.target/i386/avx-vinsertps-3.c
    branches/gcc-4_6-branch/gcc/testsuite/gcc.target/i386/avx-vinsertps-4.c
    branches/gcc-4_6-branch/gcc/testsuite/gcc.target/i386/sse4_1-insertps-3.c
    branches/gcc-4_6-branch/gcc/testsuite/gcc.target/i386/sse4_1-insertps-4.c
Modified:
    branches/gcc-4_6-branch/gcc/ChangeLog
    branches/gcc-4_6-branch/gcc/config/i386/sse.md
    branches/gcc-4_6-branch/gcc/testsuite/ChangeLog
Comment 6 Jakub Jelinek 2011-04-15 10:34:46 UTC
Fixed for 4.6+ so far.
Comment 7 Jakub Jelinek 2011-04-16 07:53:43 UTC
Author: jakub
Date: Sat Apr 16 07:53:39 2011
New Revision: 172538

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=172538
Log:
	Backported from 4.6 branch
	2011-04-15  Jakub Jelinek  <jakub@redhat.com>

	PR target/48605
	* config/i386/sse.md (avx_insertps, sse4_1_insertps): If operands[2]
	is a MEM, offset it as needed based on top 2 bits in operands[3],
	change MEM mode to SFmode and mask those 2 bits away from operands[3].

	* gcc.target/i386/sse4_1-insertps-3.c: New test.
	* gcc.target/i386/sse4_1-insertps-4.c: New test.
	* gcc.target/i386/avx-insertps-3.c: New test.
	* gcc.target/i386/avx-insertps-4.c: New test.

Added:
    branches/gcc-4_5-branch/gcc/testsuite/gcc.target/i386/avx-vinsertps-3.c
    branches/gcc-4_5-branch/gcc/testsuite/gcc.target/i386/avx-vinsertps-4.c
    branches/gcc-4_5-branch/gcc/testsuite/gcc.target/i386/sse4_1-insertps-3.c
    branches/gcc-4_5-branch/gcc/testsuite/gcc.target/i386/sse4_1-insertps-4.c
Modified:
    branches/gcc-4_5-branch/gcc/ChangeLog
    branches/gcc-4_5-branch/gcc/config/i386/sse.md
    branches/gcc-4_5-branch/gcc/testsuite/ChangeLog
Comment 8 Jakub Jelinek 2011-04-16 10:03:57 UTC
Author: jakub
Date: Sat Apr 16 10:03:53 2011
New Revision: 172583

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=172583
Log:
	Backported from 4.6 branch
	2011-04-15  Jakub Jelinek  <jakub@redhat.com>

	PR target/48605
	* config/i386/sse.md (avx_insertps, sse4_1_insertps): If operands[2]
	is a MEM, offset it as needed based on top 2 bits in operands[3],
	change MEM mode to SFmode and mask those 2 bits away from operands[3].

	* gcc.target/i386/sse4_1-insertps-3.c: New test.
	* gcc.target/i386/sse4_1-insertps-4.c: New test.
	* gcc.target/i386/avx-insertps-3.c: New test.
	* gcc.target/i386/avx-insertps-4.c: New test.

Added:
    branches/gcc-4_4-branch/gcc/testsuite/gcc.target/i386/avx-vinsertps-3.c
    branches/gcc-4_4-branch/gcc/testsuite/gcc.target/i386/avx-vinsertps-4.c
    branches/gcc-4_4-branch/gcc/testsuite/gcc.target/i386/sse4_1-insertps-3.c
    branches/gcc-4_4-branch/gcc/testsuite/gcc.target/i386/sse4_1-insertps-4.c
Modified:
    branches/gcc-4_4-branch/gcc/ChangeLog
    branches/gcc-4_4-branch/gcc/config/i386/sse.md
    branches/gcc-4_4-branch/gcc/testsuite/ChangeLog
Comment 9 Jakub Jelinek 2011-04-16 11:32:01 UTC
Fixed.