Created attachment 24249
test case for this bug

The attached foo.c program contains two 128-bit AVX loads from locations of type __m128d. However, gcc-4.6 converts the first load into a 256-bit load. While this is semantically correct, because the upper 128 bits are ignored, the vmovapd instruction has different alignment requirements in 128-bit and 256-bit mode (16 bytes versus 32 bytes). The conversion therefore causes spurious segfaults when the data is not 32-byte aligned.

Compile the attached program as follows:

    x86_64-linux-gnu-gcc-4.6 -mavx -O -S foo.c

The generated assembly contains the incorrect load:

    vmovapd (%rdi), %ymm0

By contrast, gcc-4.5 generates the correct 128-bit load instruction:

    vmovapd (%rdi), %xmm0
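To make the alignment point concrete, here is a minimal portable C sketch. It is not the attached foo.c and uses no AVX intrinsics; it only shows that an address can satisfy the 16-byte requirement of a 128-bit vmovapd while failing the 32-byte requirement of the 256-bit form. The names `is_aligned`, `slots` and `slot_addr` are hypothetical, chosen for illustration, and the byte buffer merely stands in for an array of __m128d.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if p is a multiple of `alignment` bytes. */
static int is_aligned(const void *p, size_t alignment)
{
    /* The mask trick below only works for nonzero powers of two. */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Assumption: a 32-byte aligned buffer of four 16-byte slots,
 * standing in for an array of __m128d values. */
static _Alignas(32) unsigned char slots[4][16];

/* Hypothetical helper: address of the i-th 16-byte slot (i < 4). */
static const void *slot_addr(size_t i)
{
    return slots[i];
}
```

Under these assumptions, `slot_addr(1)` is 16-byte aligned but not 32-byte aligned. A 128-bit aligned load from that address is valid, whereas widening it to a 256-bit aligned load, as described in this report, would fault.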
Confirmed, caused by r161279 [1],[2].

2010-06-23  H.J. Lu  <hongjiu.lu@intel.com>

    * config/i386/i386.c (bdesc_args): Replace CODE_FOR_avx_si_si256,
    CODE_FOR_avx_ps_ps256 and CODE_FOR_avx_pd_pd256 with
    CODE_FOR_vec_extract_lo_v8si, CODE_FOR_vec_extract_lo_v8sf and
    CODE_FOR_vec_extract_lo_v4df.

    * config/i386/sse.md (vec_extract_lo_<AVX256MODE4P:mode>): Changed
    to define_insn_and_split.
    (vec_extract_lo_<AVX256MODE8P:mode>): Likewise.
    (vec_extract_lo_v16hi): Likewise.
    (vec_extract_lo_v32qi): Likewise.
    (avx_<avxmodesuffixp><avxmodesuffix>_<avxmodesuffixp>): Likewise.
    (avx_<avxmodesuffixp>_<avxmodesuffixp><avxmodesuffix>): Removed.

[1] http://gcc.gnu.org/ml/gcc-cvs/2010-06/msg01197.html
[2] http://gcc.gnu.org/ml/gcc-patches/2010-06/msg02216.html
Created attachment 24278
The patch

Hi,

Here is a fix for the bug. I ran bootstrap and make check on 4.6.
BTW, it also has to be committed to trunk, since the problem is there as well.

K
(In reply to comment #2)
> Created attachment 24278
> The patch

> Here is a fix for the bug. I ran bootstrap and make check on 4.6.
> BTW, it also has to be committed to trunk, since the problem is there as well.

Please post the patch to the gcc-patches@ mailing list for approval or further discussion. Please follow the procedure, as explained in detail in [1].

[1] http://gcc.gnu.org/contribute.html
A patch is posted at http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01325.html
Author: hjl
Date: Wed May 18 22:12:28 2011
New Revision: 173880

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=173880
Log:
Properly handle 256bit load cast.

gcc/

2011-05-18  H.J. Lu  <hongjiu.lu@intel.com>

    PR target/49002
    * config/i386/sse.md (avx_<ssemodesuffix><avxsizesuffix>_<ssemodesuffix>):
    Properly handle load cast.

gcc/testsuite/

2011-05-18  H.J. Lu  <hongjiu.lu@intel.com>

    PR target/49002
    * gcc.target/i386/pr49002-1.c: New test.
    * gcc.target/i386/pr49002-2.c: Likewise.

Added:
    trunk/gcc/testsuite/gcc.target/i386/pr49002-1.c
    trunk/gcc/testsuite/gcc.target/i386/pr49002-2.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/sse.md
    trunk/gcc/testsuite/ChangeLog
Author: hjl
Date: Wed May 18 22:56:35 2011
New Revision: 173881

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=173881
Log:
Properly handle 256bit load cast.

gcc/

2011-05-18  H.J. Lu  <hongjiu.lu@intel.com>

    Backport from mainline
    2011-05-18  H.J. Lu  <hongjiu.lu@intel.com>

    PR target/49002
    * config/i386/sse.md (avx_<avxmodesuffixp><avxmodesuffix>_<avxmodesuffixp>):
    Properly handle load cast.

gcc/testsuite/

2011-05-18  H.J. Lu  <hongjiu.lu@intel.com>

    Backport from mainline
    2011-05-18  H.J. Lu  <hongjiu.lu@intel.com>

    PR target/49002
    * gcc.target/i386/pr49002-1.c: New test.
    * gcc.target/i386/pr49002-2.c: Likewise.

Added:
    branches/gcc-4_6-branch/gcc/testsuite/gcc.target/i386/pr49002-1.c
    branches/gcc-4_6-branch/gcc/testsuite/gcc.target/i386/pr49002-2.c
Modified:
    branches/gcc-4_6-branch/gcc/ChangeLog
    branches/gcc-4_6-branch/gcc/config/i386/sse.md
    branches/gcc-4_6-branch/gcc/testsuite/ChangeLog
Fixed.