32-bit indices in a VSIB address are sign-extended to 64 bits. In x32, when
32-bit indices are used as addresses, as in

	vgatherdps %ymm7, 0(,%ymm6,1), %ymm0

the index 0xf7fa3010 is sign-extended to 0xfffffffff7fa3010, which is an
invalid address in x32.

Testcase:

[hjl@gnu-4 00000001]$ cat foo.i
void foo (void);

extern float *ncost;

float
bar (int type, int num)
{
  int i;
  float cost;

  cost = 0;
  for (i = 0; i < num; i++)
    if (type)
      cost += ncost[i];
    else
      foo ();
  return (cost);
}
[hjl@gnu-4 00000001]$ gcc -S -mx32 -Ofast -funroll-loops -march=haswell foo.i
[hjl@gnu-4 00000001]$ grep gather foo.s
	vgatherdps	%ymm7, 0(,%ymm6,1), %ymm0
	vgatherdps	%ymm11, 0(,%ymm10,1), %ymm12
	vgatherdps	%ymm15, 0(,%ymm14,1), %ymm5
	vgatherdps	%ymm7, 0(,%ymm9,1), %ymm6
	vgatherdps	%ymm12, 0(,%ymm11,1), %ymm13
	vgatherdps	%ymm5, 0(,%ymm15,1), %ymm2
	vgatherdps	%ymm7, 0(,%ymm10,1), %ymm6
	vgatherdps	%ymm14, 0(,%ymm13,1), %ymm15
	vgatherdps	%ymm7, 0(,%ymm10,1), %ymm6
	vgatherdps	%ymm13, 0(,%ymm12,1), %ymm14
	vgatherdps	%ymm10, 0(,%ymm9,1), %ymm7
	vgatherdps	%ymm12, 0(,%ymm11,1), %ymm13
	vgatherdps	%ymm9, 0(,%ymm2,1), %ymm10
	vgatherdps	%ymm12, 0(,%ymm11,1), %ymm6
	vgatherdps	%ymm5, 0(,%ymm15,1), %ymm2
[hjl@gnu-4 00000001]$
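
For illustration only, the address arithmetic described above can be modeled in plain C. This is a minimal sketch, not code from GCC or from the testcase: the function names vsib_addr_default, vsib_addr_addr32 and demo are made up here, the scale is assumed to be 1 as in the gather above, and the second function assumes that a 32-bit address size truncates the effective address to 32 bits and zero-extends it.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Model of the default 64-bit address computation: the 32-bit index
   element is sign-extended to 64 bits, then the (sign-extended)
   displacement is added.  Scale is assumed to be 1.  */
uint64_t vsib_addr_default(uint32_t index, int32_t disp)
{
    uint64_t ext = index;
    if (index & 0x80000000u)
        ext |= 0xffffffff00000000ull;   /* sign extension */
    return ext + (uint64_t)(int64_t)disp;
}

/* Model of the same computation under a 32-bit address size
   (assumption: the result is truncated to 32 bits and
   zero-extended).  */
uint64_t vsib_addr_addr32(uint32_t index, int32_t disp)
{
    return (uint32_t)vsib_addr_default(index, disp);
}

/* Hypothetical helper: prints both results for the index value
   quoted in the report.  */
void demo(void)
{
    uint32_t index = 0xf7fa3010u;

    assert(vsib_addr_default(index, 0) != vsib_addr_addr32(index, 0));
    printf("default: 0x%016" PRIx64 "\n", vsib_addr_default(index, 0));
    printf("addr32:  0x%016" PRIx64 "\n", vsib_addr_addr32(index, 0));
}
```

Calling demo() from a main function prints the sign-extended value from the report, 0xfffffffff7fa3010, next to the truncated one, 0x00000000f7fa3010. Indices below 0x80000000 give the same address either way, so only pointers in the upper half of the 32-bit address space are affected.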
Author: hjl
Date: Thu Mar 14 08:49:54 2019
New Revision: 269673

URL: https://gcc.gnu.org/viewcvs?rev=269673&root=gcc&view=rev
Log:
x32: Add addr32 prefix to VSIB address

32-bit indices in VSIB address are sign-extended to 64 bits.  In x32,
when 32-bit indices are used as addresses, like in

	vgatherdps %ymm7, 0(,%ymm9,1), %ymm6

32-bit indices, 0xf7fa3010, is sign-extended to 0xfffffffff7fa3010 which
is invalid address.  Add addr32 prefix to UNSPEC_VSIBADDR instructions
for x32 if there is no base register nor symbol.

This fixes 175.vpr and 254.gap in SPEC CPU 2000 on x32 with

	-Ofast -funroll-loops -march=haswell

gcc/

	PR target/89523
	* config/i386/i386.c (ix86_print_operand): Handle 'M' to add
	addr32 prefix to VSIB address for X32.
	* config/i386/sse.md (*avx512pf_gatherpf<mode>sf_mask): Prepend
	"%M2" to opcode.
	(*avx512pf_gatherpf<mode>df_mask): Likewise.
	(*avx512pf_scatterpf<mode>sf_mask): Likewise.
	(*avx512pf_scatterpf<mode>df_mask): Likewise.
	(*avx2_gathersi<mode>): Prepend "%M3" to opcode.
	(*avx2_gathersi<mode>_2): Prepend "%M2" to opcode.
	(*avx2_gatherdi<mode>): Prepend "%M3" to opcode.
	(*avx2_gatherdi<mode>_2): Prepend "%M2" to opcode.
	(*avx2_gatherdi<mode>_3): Prepend "%M3" to opcode.
	(*avx2_gatherdi<mode>_4): Prepend "%M2" to opcode.
	(*avx512f_gathersi<mode>): Prepend "%M4" to opcode.
	(*avx512f_gathersi<mode>_2): Prepend "%M3" to opcode.
	(*avx512f_gatherdi<mode>): Prepend "%M4" to opcode.
	(*avx512f_gatherdi<mode>_2): Prepend "%M3" to opcode.
	(*avx512f_scattersi<mode>): Prepend "%M0" to opcode.
	(*avx512f_scatterdi<mode>): Likewise.

gcc/testsuite/

	PR target/89523
	* gcc.target/i386/pr89523-1a.c: New test.
	* gcc.target/i386/pr89523-1b.c: Likewise.
	* gcc.target/i386/pr89523-2.c: Likewise.
	* gcc.target/i386/pr89523-3.c: Likewise.
	* gcc.target/i386/pr89523-4.c: Likewise.
	* gcc.target/i386/pr89523-5.c: Likewise.
	* gcc.target/i386/pr89523-6.c: Likewise.
	* gcc.target/i386/pr89523-7.c: Likewise.
	* gcc.target/i386/pr89523-8.c: Likewise.
	* gcc.target/i386/pr89523-9.c: Likewise.

Added:
    trunk/gcc/testsuite/gcc.target/i386/pr89523-1a.c
    trunk/gcc/testsuite/gcc.target/i386/pr89523-1b.c
    trunk/gcc/testsuite/gcc.target/i386/pr89523-2.c
    trunk/gcc/testsuite/gcc.target/i386/pr89523-3.c
    trunk/gcc/testsuite/gcc.target/i386/pr89523-4.c
    trunk/gcc/testsuite/gcc.target/i386/pr89523-5.c
    trunk/gcc/testsuite/gcc.target/i386/pr89523-6.c
    trunk/gcc/testsuite/gcc.target/i386/pr89523-7.c
    trunk/gcc/testsuite/gcc.target/i386/pr89523-8.c
    trunk/gcc/testsuite/gcc.target/i386/pr89523-9.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/config/i386/sse.md
    trunk/gcc/testsuite/ChangeLog
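
As a rough illustration of the rule stated in the log (add addr32 for x32 only when the VSIB address has no base register and no symbol), here is a small stand-alone C sketch. It is not the code added to ix86_print_operand: the function name vsib_prefix and its three boolean parameters are hypothetical, and the real patch works on GCC's RTL address representation rather than on flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of the decision described in the commit log.
   Returns the text to print in front of the gather/scatter opcode:
   "addr32 " when targeting x32 and the VSIB address has neither a
   base register nor a symbol, otherwise the empty string.  */
const char *vsib_prefix(bool target_x32, bool has_base, bool has_symbol)
{
    if (target_x32 && !has_base && !has_symbol)
        return "addr32 ";
    return "";
}
```

Under this model, an address such as 0(,%ymm9,1) compiled for x32 gets the prefix, while the same address on a non-x32 target, or an x32 address that has a base register or a symbol, is printed unchanged.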
Author: hjl
Date: Sun Mar 17 09:11:22 2019
New Revision: 269738

URL: https://gcc.gnu.org/viewcvs?rev=269738&root=gcc&view=rev
Log:
x32: Add addr32 prefix to VSIB address

32-bit indices in VSIB address are sign-extended to 64 bits.  In x32,
when 32-bit indices are used as addresses, like in

	vgatherdps %ymm7, 0(,%ymm9,1), %ymm6

32-bit indices, 0xf7fa3010, is sign-extended to 0xfffffffff7fa3010 which
is invalid address.  Add addr32 prefix to UNSPEC_VSIBADDR instructions
for x32 if there is no base register nor symbol.

This fixes 175.vpr and 254.gap in SPEC CPU 2000 on x32 with

	-Ofast -funroll-loops -march=haswell

gcc/

	Backport from mainline
	2019-03-14  H.J. Lu  <hongjiu.lu@intel.com>

	PR target/89523
	* config/i386/i386.c (ix86_print_operand): Handle 'M' to add
	addr32 prefix to VSIB address for X32.
	* config/i386/sse.md (*avx512pf_gatherpf<mode>sf_mask): Prepend
	"%M2" to opcode.
	(*avx512pf_gatherpf<mode>df_mask): Likewise.
	(*avx512pf_scatterpf<mode>sf_mask): Likewise.
	(*avx512pf_scatterpf<mode>df_mask): Likewise.
	(*avx2_gathersi<mode>): Prepend "%M3" to opcode.
	(*avx2_gathersi<mode>_2): Prepend "%M2" to opcode.
	(*avx2_gatherdi<mode>): Prepend "%M3" to opcode.
	(*avx2_gatherdi<mode>_2): Prepend "%M2" to opcode.
	(*avx2_gatherdi<mode>_3): Prepend "%M3" to opcode.
	(*avx2_gatherdi<mode>_4): Prepend "%M2" to opcode.
	(*avx512f_gathersi<mode>): Prepend "%M4" to opcode.
	(*avx512f_gathersi<mode>_2): Prepend "%M3" to opcode.
	(*avx512f_gatherdi<mode>): Prepend "%M4" to opcode.
	(*avx512f_gatherdi<mode>_2): Prepend "%M3" to opcode.
	(*avx512f_scattersi<mode>): Prepend "%M0" to opcode.
	(*avx512f_scatterdi<mode>): Likewise.

gcc/testsuite/

	Backport from mainline
	2019-03-14  H.J. Lu  <hongjiu.lu@intel.com>

	PR target/89523
	* gcc.target/i386/pr89523-1a.c: New test.
	* gcc.target/i386/pr89523-1b.c: Likewise.
	* gcc.target/i386/pr89523-2.c: Likewise.
	* gcc.target/i386/pr89523-3.c: Likewise.
	* gcc.target/i386/pr89523-4.c: Likewise.
	* gcc.target/i386/pr89523-5.c: Likewise.
	* gcc.target/i386/pr89523-6.c: Likewise.
	* gcc.target/i386/pr89523-7.c: Likewise.
	* gcc.target/i386/pr89523-8.c: Likewise.
	* gcc.target/i386/pr89523-9.c: Likewise.

Added:
    branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr89523-1a.c
    branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr89523-1b.c
    branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr89523-2.c
    branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr89523-3.c
    branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr89523-4.c
    branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr89523-5.c
    branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr89523-6.c
    branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr89523-7.c
    branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr89523-8.c
    branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/pr89523-9.c
Modified:
    branches/gcc-8-branch/gcc/ChangeLog
    branches/gcc-8-branch/gcc/config/i386/i386.c
    branches/gcc-8-branch/gcc/config/i386/sse.md
    branches/gcc-8-branch/gcc/testsuite/ChangeLog
Author: hjl
Date: Sun Mar 17 09:27:56 2019
New Revision: 269739

URL: https://gcc.gnu.org/viewcvs?rev=269739&root=gcc&view=rev
Log:
x32: Add addr32 prefix to VSIB address

32-bit indices in VSIB address are sign-extended to 64 bits.  In x32,
when 32-bit indices are used as addresses, like in

	vgatherdps %ymm7, 0(,%ymm9,1), %ymm6

32-bit indices, 0xf7fa3010, is sign-extended to 0xfffffffff7fa3010 which
is invalid address.  Add addr32 prefix to UNSPEC_VSIBADDR instructions
for x32 if there is no base register nor symbol.

This fixes 175.vpr and 254.gap in SPEC CPU 2000 on x32 with

	-Ofast -funroll-loops -march=haswell

gcc/

	Backport from mainline
	2019-03-14  H.J. Lu  <hongjiu.lu@intel.com>

	PR target/89523
	* config/i386/i386.c (ix86_print_operand): Handle 'M' to add
	addr32 prefix to VSIB address for X32.
	* config/i386/sse.md (*avx512pf_gatherpf<mode>sf_mask): Prepend
	"%M2" to opcode.
	(*avx512pf_gatherpf<mode>df_mask): Likewise.
	(*avx512pf_scatterpf<mode>sf_mask): Likewise.
	(*avx512pf_scatterpf<mode>df_mask): Likewise.
	(*avx2_gathersi<mode>): Prepend "%M3" to opcode.
	(*avx2_gathersi<mode>_2): Prepend "%M2" to opcode.
	(*avx2_gatherdi<mode>): Prepend "%M3" to opcode.
	(*avx2_gatherdi<mode>_2): Prepend "%M2" to opcode.
	(*avx2_gatherdi<mode>_3): Prepend "%M3" to opcode.
	(*avx2_gatherdi<mode>_4): Prepend "%M2" to opcode.
	(*avx512f_gathersi<mode>): Prepend "%M4" to opcode.
	(*avx512f_gathersi<mode>_2): Prepend "%M3" to opcode.
	(*avx512f_gatherdi<mode>): Prepend "%M4" to opcode.
	(*avx512f_gatherdi<mode>_2): Prepend "%M3" to opcode.
	(*avx512f_scattersi<mode>): Prepend "%M0" to opcode.
	(*avx512f_scatterdi<mode>): Likewise.

gcc/testsuite/

	Backport from mainline
	2019-03-14  H.J. Lu  <hongjiu.lu@intel.com>

	PR target/89523
	* gcc.target/i386/pr89523-1a.c: New test.
	* gcc.target/i386/pr89523-1b.c: Likewise.
	* gcc.target/i386/pr89523-2.c: Likewise.
	* gcc.target/i386/pr89523-3.c: Likewise.
	* gcc.target/i386/pr89523-4.c: Likewise.
	* gcc.target/i386/pr89523-5.c: Likewise.
	* gcc.target/i386/pr89523-6.c: Likewise.
	* gcc.target/i386/pr89523-7.c: Likewise.
	* gcc.target/i386/pr89523-8.c: Likewise.
	* gcc.target/i386/pr89523-9.c: Likewise.

Added:
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr89523-1a.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr89523-1b.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr89523-2.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr89523-3.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr89523-4.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr89523-5.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr89523-6.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr89523-7.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr89523-8.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr89523-9.c
Modified:
    branches/gcc-7-branch/gcc/ChangeLog
    branches/gcc-7-branch/gcc/config/i386/i386.c
    branches/gcc-7-branch/gcc/config/i386/sse.md
    branches/gcc-7-branch/gcc/testsuite/ChangeLog