bash-3.2$ cat x.c
typedef float __m128 __attribute__ ((__vector_size__ (16), __may_alias__));

__m128 __attribute__((noinline))
iszero (__m128 x)
{
  return x;
}

typedef __m128 __attribute__((aligned(1))) unaligned;

__m128 __attribute__((noinline))
foo (__m128 a1, __m128 a2, __m128 a3, __m128 a4,
     __m128 a5, __m128 a6, __m128 a7, __m128 a8,
     int b1, int b2, int b3, int b4,
     int b5, int b6, int b7, unaligned y)
{
  return iszero (y);
}

int
main (void)
{
  unaligned x;
  __m128 y, x0 = { 0 };
  x = x0;
  y = foo (x0, x0, x0, x0, x0, x0, x0, x0,
           1, 2, 3, 4, 5, 6, 7, x);
  return !__builtin_memcmp (&y, &x0, sizeof (y));
}
bash-3.2$ /export/build/gnu/gcc/build-x86_64-linux/stage1-gcc/xgcc -B/export/build/gnu/gcc/build-x86_64-linux/stage1-gcc/ -O x.c -o x
bash-3.2$ ./x
Segmentation fault
bash-3.2$

The issue here is that V4SFmode may not always be properly aligned. This is very similar to PR 32000; the difference is that there TDmode is passed as TImode on the stack, whereas here V4SFmode is used. The same problem exists for all other SSE modes.
*** Bug 35771 has been marked as a duplicate of this bug. ***
The middle end uses the canonical type when passing parameters in function calls. ix86_function_arg_boundary should do the same; otherwise there will be a mismatch between the alignment the caller assumes and the alignment the callee expects.
Subject: Bug 35767

Author: hjl
Date: Tue May 27 20:18:33 2008
New Revision: 136054

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=136054
Log:
gcc/

2008-05-27  H.J. Lu  <hongjiu.lu@intel.com>

	PR target/35767
	PR target/35771
	* config/i386/i386.c (ix86_function_arg_boundary): Use
	alignment of canonical type.
	(ix86_expand_vector_move): Check unaligned memory access for
	all SSE modes.

gcc/testsuite/

2008-05-27  H.J. Lu  <hongjiu.lu@intel.com>

	PR target/35767
	PR target/35771
	* gcc.target/i386/pr35767-1.c: New.
	* gcc.target/i386/pr35767-1d.c: Likewise.
	* gcc.target/i386/pr35767-1i.c: Likewise.
	* gcc.target/i386/pr35767-2.c: Likewise.
	* gcc.target/i386/pr35767-2d.c: Likewise.
	* gcc.target/i386/pr35767-2i.c: Likewise.
	* gcc.target/i386/pr35767-3.c: Likewise.
	* gcc.target/i386/pr35767-4.c: Likewise.
	* gcc.target/i386/pr35767-5.c: Likewise.

Added:
    trunk/gcc/testsuite/gcc.target/i386/pr35767-1.c
    trunk/gcc/testsuite/gcc.target/i386/pr35767-1d.c
    trunk/gcc/testsuite/gcc.target/i386/pr35767-1i.c
    trunk/gcc/testsuite/gcc.target/i386/pr35767-2.c
    trunk/gcc/testsuite/gcc.target/i386/pr35767-2d.c
    trunk/gcc/testsuite/gcc.target/i386/pr35767-2i.c
    trunk/gcc/testsuite/gcc.target/i386/pr35767-3.c
    trunk/gcc/testsuite/gcc.target/i386/pr35767-4.c
    trunk/gcc/testsuite/gcc.target/i386/pr35767-5.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/testsuite/ChangeLog
Fixed.
gcc.target/i386/pr35767-5.c is failing for me in both -m32 and -m64 mode on trunk:

xgcc (GCC) 4.9.0 20140204 (experimental)

The assembly produced:

test:
	subq	$24, %rsp
	movaps	.LC0(%rip), %xmm0
	movups	%xmm0, (%rsp)
	movaps	%xmm0, %xmm7
	movaps	%xmm0, %xmm6
	movaps	%xmm0, %xmm5
	movaps	%xmm0, %xmm4
	movaps	%xmm0, %xmm3
	movaps	%xmm0, %xmm2
	movaps	%xmm0, %xmm1
	call	foo
	movl	$0, %eax
	addq	$24, %rsp
	ret

The movups appears to be especially bogus, since it is moving to 0(%rsp), which is guaranteed to be 16-byte aligned by the ABI.