Bug 89523 - Incorrect AVX instructions with VSIB address
Summary: Incorrect AVX instructions with VSIB address
Status: UNCONFIRMED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 7.4.1
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-02-27 17:40 UTC by H.J. Lu
Modified: 2019-03-17 09:28 UTC

See Also:
Host:
Target: x32
Build:
Known to work:
Known to fail:
Last reconfirmed:


Description H.J. Lu 2019-02-27 17:40:48 UTC
32-bit indices in a VSIB address are sign-extended to 64 bits. In x32,
when 32-bit indices are used as addresses, as in

vgatherdps %ymm7, 0(,%ymm6,1), %ymm0

an index of 0xf7fa3010 is sign-extended to 0xfffffffff7fa3010, which is
an invalid address in x32.
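
To make the arithmetic concrete, here is a minimal C sketch (not taken from GCC) of how the effective address of a VSIB operand with no base register comes out under 64-bit and under 32-bit address size. The function names are made up for illustration, and the uint32_t to int32_t cast assumes the usual two's-complement wraparound, which is implementation-defined in ISO C but is what GCC does.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative model only: effective address of disp(,index,scale) with
   the default 64-bit address size.  The 32-bit index is sign-extended
   to 64 bits before it is scaled and added to the displacement.  */
uint64_t vsib_ea_addr64(uint32_t index, unsigned scale, int32_t disp)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    int64_t extended = (int64_t)(int32_t)index;   /* sign extension */
    return (uint64_t)extended * scale + (uint64_t)(int64_t)disp;
}

/* Illustrative model only: the same operand with 32-bit address size
   (addr32 prefix).  The result is truncated to 32 bits and then
   zero-extended, so it stays inside the x32 address space.  */
uint64_t vsib_ea_addr32(uint32_t index, unsigned scale, int32_t disp)
{
    return (uint32_t)vsib_ea_addr64(index, scale, disp);
}
```

Calling vsib_ea_addr64 (0xf7fa3010, 1, 0) gives 0xfffffffff7fa3010, the bad address from this report, while vsib_ea_addr32 with the same arguments gives 0xf7fa3010. Indices below 0x80000000 are unaffected, which is why the failure only shows up when an address lands in the upper half of the 4 GB space.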
Testcase:

[hjl@gnu-4 00000001]$ cat foo.i
void foo (void);

extern float *ncost;

float
bar (int type, int num)
{
  int i;
  float cost;

  cost = 0;
  for (i = 0; i < num; i++)
    if (type)
      cost += ncost[i];
    else
      foo ();
  return (cost);
}
[hjl@gnu-4 00000001]$ gcc -S -mx32 -Ofast -funroll-loops -march=haswell foo.i 
[hjl@gnu-4 00000001]$ grep gather foo.s
	vgatherdps	%ymm7, 0(,%ymm6,1), %ymm0
	vgatherdps	%ymm11, 0(,%ymm10,1), %ymm12
	vgatherdps	%ymm15, 0(,%ymm14,1), %ymm5
	vgatherdps	%ymm7, 0(,%ymm9,1), %ymm6
	vgatherdps	%ymm12, 0(,%ymm11,1), %ymm13
	vgatherdps	%ymm5, 0(,%ymm15,1), %ymm2
	vgatherdps	%ymm7, 0(,%ymm10,1), %ymm6
	vgatherdps	%ymm14, 0(,%ymm13,1), %ymm15
	vgatherdps	%ymm7, 0(,%ymm10,1), %ymm6
	vgatherdps	%ymm13, 0(,%ymm12,1), %ymm14
	vgatherdps	%ymm10, 0(,%ymm9,1), %ymm7
	vgatherdps	%ymm12, 0(,%ymm11,1), %ymm13
	vgatherdps	%ymm9, 0(,%ymm2,1), %ymm10
	vgatherdps	%ymm12, 0(,%ymm11,1), %ymm6
	vgatherdps	%ymm5, 0(,%ymm15,1), %ymm2
[hjl@gnu-4 00000001]$
Comment 1 hjl@gcc.gnu.org 2019-03-14 08:50:26 UTC
Author: hjl
Date: Thu Mar 14 08:49:54 2019
New Revision: 269673

URL: https://gcc.gnu.org/viewcvs?rev=269673&root=gcc&view=rev
Log:
x32: Add addr32 prefix to VSIB address

32-bit indices in VSIB address are sign-extended to 64 bits.  In x32,
when 32-bit indices are used as addresses, like in

vgatherdps %ymm7, 0(,%ymm9,1), %ymm6

a 32-bit index such as 0xf7fa3010 is sign-extended to 0xfffffffff7fa3010,
which is an invalid address.  Add addr32 prefix to UNSPEC_VSIBADDR
instructions for x32 if there is neither a base register nor a symbol.

This fixes 175.vpr and 254.gap in SPEC CPU 2000 on x32 with

-Ofast -funroll-loops -march=haswell
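
The condition described in the log above can be sketched in C as follows. This is a simplified illustration, not the code in ix86_print_operand: struct vsib_operand and vsib_opcode_prefix are hypothetical names, and the "addr32 " string is assumed to be the prefix text placed in front of the opcode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a VSIB memory operand.  */
struct vsib_operand {
    bool has_base;    /* a base register is present */
    bool has_symbol;  /* the displacement refers to a symbol */
};

/* Return the text to print before the opcode.  The prefix is needed
   only on x32 and only when the vector index is the sole source of
   the address, i.e. there is neither a base register nor a symbol.  */
const char *vsib_opcode_prefix(bool target_is_x32,
                               const struct vsib_operand *op)
{
    assert(op != NULL);
    if (target_is_x32 && !op->has_base && !op->has_symbol)
        return "addr32 ";
    return "";
}
```

Under these assumptions an operand such as 0(,%ymm9,1) gets the prefix on x32, while an operand with a base register, or any operand on a non-x32 target, is printed unchanged.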

gcc/

	PR target/89523
	* config/i386/i386.c (ix86_print_operand): Handle 'M' to add
	addr32 prefix to VSIB address for X32.
	* config/i386/sse.md (*avx512pf_gatherpf<mode>sf_mask): Prepend
	"%M2" to opcode.
	(*avx512pf_gatherpf<mode>df_mask): Likewise.
	(*avx512pf_scatterpf<mode>sf_mask): Likewise.
	(*avx512pf_scatterpf<mode>df_mask): Likewise.
	(*avx2_gathersi<mode>): Prepend "%M3" to opcode.
	(*avx2_gathersi<mode>_2): Prepend "%M2" to opcode.
	(*avx2_gatherdi<mode>): Prepend "%M3" to opcode.
	(*avx2_gatherdi<mode>_2): Prepend "%M2" to opcode.
	(*avx2_gatherdi<mode>_3): Prepend "%M3" to opcode.
	(*avx2_gatherdi<mode>_4): Prepend "%M2" to opcode.
	(*avx512f_gathersi<mode>): Prepend "%M4" to opcode.
	(*avx512f_gathersi<mode>_2): Prepend "%M3" to opcode.
	(*avx512f_gatherdi<mode>): Prepend "%M4" to opcode.
	(*avx512f_gatherdi<mode>_2): Prepend "%M3" to opcode.
	(*avx512f_scattersi<mode>): Prepend "%M0" to opcode.
	(*avx512f_scatterdi<mode>): Likewise.

gcc/testsuite/

	PR target/89523
	* gcc.target/i386/pr89523-1a.c: New test.
	* gcc.target/i386/pr89523-1b.c: Likewise.
	* gcc.target/i386/pr89523-2.c: Likewise.
	* gcc.target/i386/pr89523-3.c: Likewise.
	* gcc.target/i386/pr89523-4.c: Likewise.
	* gcc.target/i386/pr89523-5.c: Likewise.
	* gcc.target/i386/pr89523-6.c: Likewise.
	* gcc.target/i386/pr89523-7.c: Likewise.
	* gcc.target/i386/pr89523-8.c: Likewise.
	* gcc.target/i386/pr89523-9.c: Likewise.

Added:
    trunk/gcc/testsuite/gcc.target/i386/pr89523-1a.c
    trunk/gcc/testsuite/gcc.target/i386/pr89523-1b.c
    trunk/gcc/testsuite/gcc.target/i386/pr89523-2.c
    trunk/gcc/testsuite/gcc.target/i386/pr89523-3.c
    trunk/gcc/testsuite/gcc.target/i386/pr89523-4.c
    trunk/gcc/testsuite/gcc.target/i386/pr89523-5.c
    trunk/gcc/testsuite/gcc.target/i386/pr89523-6.c
    trunk/gcc/testsuite/gcc.target/i386/pr89523-7.c
    trunk/gcc/testsuite/gcc.target/i386/pr89523-8.c
    trunk/gcc/testsuite/gcc.target/i386/pr89523-9.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/config/i386/sse.md
    trunk/gcc/testsuite/ChangeLog
Comment 2 hjl@gcc.gnu.org 2019-03-17 09:11:54 UTC
Author: hjl
Date: Sun Mar 17 09:11:22 2019
New Revision: 269738

URL: https://gcc.gnu.org/viewcvs?rev=269738&root=gcc&view=rev
Log:
x32: Add addr32 prefix to VSIB address

32-bit indices in VSIB address are sign-extended to 64 bits.  In x32,
when 32-bit indices are used as addresses, like in

vgatherdps %ymm7, 0(,%ymm9,1), %ymm6

a 32-bit index such as 0xf7fa3010 is sign-extended to 0xfffffffff7fa3010,
which is an invalid address.  Add addr32 prefix to UNSPEC_VSIBADDR
instructions for x32 if there is neither a base register nor a symbol.

This fixes 175.vpr and 254.gap in SPEC CPU 2000 on x32 with

-Ofast -funroll-loops -march=haswell

gcc/

	Backport from mainline
	2019-03-14  H.J. Lu  <hongjiu.lu@intel.com>

	PR target/89523
	* config/i386/i386.c (ix86_print_operand): Handle 'M' to add
	addr32 prefix to VSIB address for X32.
	* config/i386/sse.md (*avx512pf_gatherpf<mode>sf_mask): Prepend
	"%M2" to opcode.
	(*avx512pf_gatherpf<mode>df_mask): Likewise.
	(*avx512pf_scatterpf<mode>sf_mask): Likewise.
	(*avx512pf_scatterpf<mode>df_mask): Likewise.
	(*avx2_gathersi<mode>): Prepend "%M3" to opcode.
	(*avx2_gathersi<mode>_2): Prepend "%M2" to opcode.
	(*avx2_gatherdi<mode>): Prepend "%M3" to opcode.
	(*avx2_gatherdi<mode>_2): Prepend "%M2" to opcode.
	(*avx2_gatherdi<mode>_3): Prepend "%M3" to opcode.
	(*avx2_gatherdi<mode>_4): Prepend "%M2" to opcode.
	(*avx512f_gathersi<mode>): Prepend "%M4" to opcode.
	(*avx512f_gathersi<mode>_2): Prepend "%M3" to opcode.
	(*avx512f_gatherdi<mode>): Prepend "%M4" to opcode.
	(*avx512f_gatherdi<mode>_2): Prepend "%M3" to opcode.
	(*avx512f_scattersi<mode>): Prepend "%M0" to opcode.
	(*avx512f_scatterdi<mode>): Likewise.

gcc/testsuite/

	Backport from mainline
	2019-03-14  H.J. Lu  <hongjiu.lu@intel.com>

	PR target/89523
	* gcc.target/i386/pr89523-1a.c: New test.
	* gcc.target/i386/pr89523-1b.c: Likewise.
	* gcc.target/i386/pr89523-2.c: Likewise.
	* gcc.target/i386/pr89523-3.c: Likewise.
	* gcc.target/i386/pr89523-4.c: Likewise.
	* gcc.target/i386/pr89523-5.c: Likewise.
	* gcc.target/i386/pr89523-6.c: Likewise.
	* gcc.target/i386/pr89523-7.c: Likewise.
	* gcc.target/i386/pr89523-8.c: Likewise.
	* gcc.target/i386/pr89523-9.c: Likewise.

Added:
    branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr89523-1a.c
    branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr89523-1b.c
    branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr89523-2.c
    branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr89523-3.c
    branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr89523-4.c
    branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr89523-5.c
    branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr89523-6.c
    branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr89523-7.c
    branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr89523-8.c
    branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr89523-9.c
Modified:
    branches/gcc-8-branch/gcc/ChangeLog
    branches/gcc-8-branch/gcc/config/i386/i386.c
    branches/gcc-8-branch/gcc/config/i386/sse.md
    branches/gcc-8-branch/gcc/testsuite/ChangeLog
Comment 3 hjl@gcc.gnu.org 2019-03-17 09:28:27 UTC
Author: hjl
Date: Sun Mar 17 09:27:56 2019
New Revision: 269739

URL: https://gcc.gnu.org/viewcvs?rev=269739&root=gcc&view=rev
Log:
x32: Add addr32 prefix to VSIB address

32-bit indices in VSIB address are sign-extended to 64 bits.  In x32,
when 32-bit indices are used as addresses, like in

vgatherdps %ymm7, 0(,%ymm9,1), %ymm6

a 32-bit index such as 0xf7fa3010 is sign-extended to 0xfffffffff7fa3010,
which is an invalid address.  Add addr32 prefix to UNSPEC_VSIBADDR
instructions for x32 if there is neither a base register nor a symbol.

This fixes 175.vpr and 254.gap in SPEC CPU 2000 on x32 with

-Ofast -funroll-loops -march=haswell

gcc/

	Backport from mainline
	2019-03-14  H.J. Lu  <hongjiu.lu@intel.com>

	PR target/89523
	* config/i386/i386.c (ix86_print_operand): Handle 'M' to add
	addr32 prefix to VSIB address for X32.
	* config/i386/sse.md (*avx512pf_gatherpf<mode>sf_mask): Prepend
	"%M2" to opcode.
	(*avx512pf_gatherpf<mode>df_mask): Likewise.
	(*avx512pf_scatterpf<mode>sf_mask): Likewise.
	(*avx512pf_scatterpf<mode>df_mask): Likewise.
	(*avx2_gathersi<mode>): Prepend "%M3" to opcode.
	(*avx2_gathersi<mode>_2): Prepend "%M2" to opcode.
	(*avx2_gatherdi<mode>): Prepend "%M3" to opcode.
	(*avx2_gatherdi<mode>_2): Prepend "%M2" to opcode.
	(*avx2_gatherdi<mode>_3): Prepend "%M3" to opcode.
	(*avx2_gatherdi<mode>_4): Prepend "%M2" to opcode.
	(*avx512f_gathersi<mode>): Prepend "%M4" to opcode.
	(*avx512f_gathersi<mode>_2): Prepend "%M3" to opcode.
	(*avx512f_gatherdi<mode>): Prepend "%M4" to opcode.
	(*avx512f_gatherdi<mode>_2): Prepend "%M3" to opcode.
	(*avx512f_scattersi<mode>): Prepend "%M0" to opcode.
	(*avx512f_scatterdi<mode>): Likewise.

gcc/testsuite/

	Backport from mainline
	2019-03-14  H.J. Lu  <hongjiu.lu@intel.com>

	PR target/89523
	* gcc.target/i386/pr89523-1a.c: New test.
	* gcc.target/i386/pr89523-1b.c: Likewise.
	* gcc.target/i386/pr89523-2.c: Likewise.
	* gcc.target/i386/pr89523-3.c: Likewise.
	* gcc.target/i386/pr89523-4.c: Likewise.
	* gcc.target/i386/pr89523-5.c: Likewise.
	* gcc.target/i386/pr89523-6.c: Likewise.
	* gcc.target/i386/pr89523-7.c: Likewise.
	* gcc.target/i386/pr89523-8.c: Likewise.
	* gcc.target/i386/pr89523-9.c: Likewise.

Added:
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr89523-1a.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr89523-1b.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr89523-2.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr89523-3.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr89523-4.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr89523-5.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr89523-6.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr89523-7.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr89523-8.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr89523-9.c
Modified:
    branches/gcc-7-branch/gcc/ChangeLog
    branches/gcc-7-branch/gcc/config/i386/i386.c
    branches/gcc-7-branch/gcc/config/i386/sse.md
    branches/gcc-7-branch/gcc/testsuite/ChangeLog