Bug 50569 - [4.6/4.7 regression] unaligned memory accesses generated for memcpy
Summary: [4.6/4.7 regression] unaligned memory accesses generated for memcpy
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 4.6.1
: P1 major
Target Milestone: 4.6.3
Assignee: Eric Botcazou
URL: http://gcc.gnu.org/ml/gcc-patches/201...
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2011-09-29 18:45 UTC by Paul Koning
Modified: 2012-06-12 09:24 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Known to work: 4.4.6, 4.5.4
Known to fail: 4.6.1, 4.7.0
Last reconfirmed: 2011-12-09 00:00:00


Attachments
Test code that shows the issue (944 bytes, application/octet-stream)
2011-09-29 18:45 UTC, Paul Koning
Details
Test case with main() (1.02 KB, application/octet-stream)
2011-09-29 19:52 UTC, Paul Koning
Details
reduced test case (293 bytes, text/plain)
2011-11-04 15:31 UTC, Mikael Pettersson
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Paul Koning 2011-09-29 18:45:11 UTC
Created attachment 25382 [details]
Test code that shows the issue

This issue shows up in 4.6.1 but is not present in 4.5.1.

The attached sample code picks up a buffer pointer from external data structures, copies a block of data to a temporary variable, and then references the temporary.  GCC 4.6.1 optimized away the memcpy and instead directly references the original data, which is fine... except that the pointer is "char *" meaning that the data structure in that buffer is not necessarily aligned.  And the references through that pointer are written as if the data IS aligned, so on my strict-alignment target (MIPS) I get an alignment exception.

GCC 4.5.1 does it right -- it also optimizes away the memcpy but it accesses the original data using unaligned accesses.
Comment 1 Mikael Pettersson 2011-09-29 18:57:17 UTC
Can you add a suitable main() to the test case to make it executable?  I'd like to check it on two strict-alignment platforms (ARMv5 and SPARC).  A cursory inspection of the generated assembly code hints that gcc-4.6-20110923 and gcc-4.7-20110924 keep the memcpy() calls when targeting armv5tel-linux-gnueabi.
Comment 2 Andrew Pinski 2011-09-29 19:00:01 UTC
I could not reproduce this in our internal version of gcc 4.6.1 though I do have a patch that might fix this.
Comment 3 Paul Koning 2011-09-29 19:52:47 UTC
Created attachment 25383 [details]
Test case with main()

Here is an updated testcase.  This one runs to completion with 4.5.1, aborts with a bus error exception on 4.6.1.

I compiled with -mabi=n32 -O1, but as far as I can tell from inspecting the generated code, the choice of ABI is not a factor.
Comment 4 Mikael Pettersson 2011-09-29 20:47:09 UTC
The updated test case caused no alignment exceptions on armv5tel-linux-gnueabi with gcc 4.6-20110923 or 4.7-20110924.  I'll check it on sparc64-linux tomorrow evening.
Comment 5 Paul Koning 2011-09-29 20:55:15 UTC
If the memcpy actually happens, that is the expected result.  The issue in the MIPS case is that the memcpy is optimized away, and the source data accessed instead, which would be ok if GCC hadn't lost track of the fact that it's a char* pointer (and, in addition, that all the fields of the struct in question are defined as "packed").
Comment 6 Mikael Pettersson 2011-10-01 10:29:07 UTC
This test case fails with a SIGBUS on sparc64-linux when compiled with -O1 -m32 by gcc 4.6-20110923 or 4.7-20110910.  gcc 4.5 and 4.4 generate correct code.
Comment 7 Mikael Pettersson 2011-11-04 15:31:25 UTC
Created attachment 25716 [details]
reduced test case

This reduced test case causes an alignment fault when compiled with gcc 4.7 or 4.6 on sparc64-linux and armv5tel-linux-gnueabi, but not when compiled with gcc 4.5 or older.  On sparc64-linux the alignment fault results in a fatal SIGBUS, on armv5tel-linux-gnueabi the fault can be fixed up by the kernel but that incurs massive runtime overhead.

The core of the test case is the following:

struct event {
    struct {
        unsigned int sec;
    } sent __attribute__((packed));
};

void __attribute__((noinline,noclone)) frob_entry(char *buf)
{
    struct event event;

    __builtin_memcpy(&event, buf, sizeof(event));
    if (event.sent.sec < 64) {
        event.sent.sec = -1U;
        __builtin_memcpy(buf, &event, sizeof(event));
    }
}

With gcc 4.6/4.7 the __builtin_memcpy() calls result in plain int-sized loads and stores on a misaligned address.  Removing '__attribute__((packed))' from the 'sent' struct avoids the bug.  Removing the 'struct {...} sent' wrapper around the 'unsigned int sec' but keeping '__attribute__((packed))' also avoids the bug.
Comment 8 Mikael Pettersson 2011-11-04 20:57:51 UTC
The regression started with r164136:
http://gcc.gnu.org/ml/gcc-cvs/2010-09/msg00428.html

Comparing the ARM assembly code from r164135 and r164136 on the reduced test case clearly shows the error:

--- pr50569.s-r164135   2011-11-04 21:06:58.000000000 +0100
+++ pr50569.s-r164136   2011-11-04 21:10:48.000000000 +0100
@@ -17,19 +17,10 @@
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
-       ldrb    r2, [r0, #1]    @ zero_extendqisi2
-       ldrb    r3, [r0, #0]    @ zero_extendqisi2
-       ldrb    r1, [r0, #2]    @ zero_extendqisi2
-       orr     r3, r3, r2, asl #8
-       ldrb    r2, [r0, #3]    @ zero_extendqisi2
-       orr     r3, r3, r1, asl #16
-       orr     r3, r3, r2, asl #24
+       ldr     r3, [r0, #0]
        cmp     r3, #63
        mvnls   r3, #0
-       strlsb  r3, [r0, #0]
-       strlsb  r3, [r0, #1]
-       strlsb  r3, [r0, #2]
-       strlsb  r3, [r0, #3]
+       strls   r3, [r0, #0]
        bx      lr
        .size   frob_entry, .-frob_entry
        .align  2
Comment 9 Eric Botcazou 2011-12-09 08:45:18 UTC
Please try the 4.6 branch (or a snapshot with the fix for PR tree-opt/51315).
Comment 10 Mikael Pettersson 2011-12-10 16:39:57 UTC
(In reply to comment #9)
> Please try the 4.6 branch (or a snapshot with the fix for PR tree-opt/51315).

Unfortunately gcc-4.6-20111209 (which includes the PR51315 fix) still causes alignment faults in this PR's test cases, on both sparc-linux and armv5tel-linux-gnueabi.
Comment 11 Eric Botcazou 2011-12-10 16:46:29 UTC
> Unfortunately gcc-4.6-20111209 (which includes the PR51315 fix) still causes
> alignment faults in this PR's test cases, on both sparc-linux and
> armv5tel-linux-gnueabi.

Bummer.  Investigating then.
Comment 12 Mikael Pettersson 2011-12-11 20:11:14 UTC
Eric's proposed patch applied on top of gcc-4.7-20111210 fixes both test cases for me on sparc-linux and arm-linux-gnueabi.
Comment 13 Eric Botcazou 2011-12-12 18:22:17 UTC
Author: ebotcazou
Date: Mon Dec 12 18:22:13 2011
New Revision: 182252

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=182252
Log:
	PR tree-optimization/50569
	* tree-sra.c (build_ref_for_model): Replicate a chain of COMPONENT_REFs
	in the expression of MODEL instead of just the last one.

Added:
    trunk/gcc/testsuite/gcc.c-torture/execute/20111212-1.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-sra.c
Comment 14 Eric Botcazou 2011-12-12 18:24:34 UTC
Author: ebotcazou
Date: Mon Dec 12 18:24:31 2011
New Revision: 182253

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=182253
Log:
	PR tree-optimization/50569
	* tree-sra.c (build_ref_for_model): Replicate a chain of COMPONENT_REFs
	in the expression of MODEL instead of just the last one.

Added:
    branches/gcc-4_6-branch/gcc/testsuite/gcc.c-torture/execute/20111212-1.c
      - copied unchanged from r182252, trunk/gcc/testsuite/gcc.c-torture/execute/20111212-1.c
Modified:
    branches/gcc-4_6-branch/gcc/ChangeLog
    branches/gcc-4_6-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_6-branch/gcc/tree-sra.c
Comment 15 Eric Botcazou 2011-12-12 18:26:40 UTC
Patch installed.
Comment 16 Jiangning Liu 2012-06-12 09:24:21 UTC
Author: liujiangning
Date: Tue Jun 12 09:24:11 2012
New Revision: 188431

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=188431
Log:
2011-06-12  Jiangning Liu  <jiangning.liu@arm.com>                                                        

	Backport r182252 from mainline
	2011-12-12  Eric Botcazou  <ebotcazou@adacore.com>

	PR tree-optimization/50569
	* tree-sra.c (build_ref_for_model): Replicate a chain of
	* COMPONENT_REFs
	in the expression of MODEL instead of just the last one.

2011-06-12  Jiangning Liu  <jiangning.liu@arm.com>                                                        

	Backport r182252 from mainline
	2011-12-12  Eric Botcazou  <ebotcazou@adacore.com>

	PR tree-optimization/50569
	* gcc.c-torture/execute/20111212-1.c: New test.


Added:
    branches/ARM/embedded-4_6-branch/gcc/testsuite/gcc.c-torture/execute/20111212-1.c
Modified:
    branches/ARM/embedded-4_6-branch/gcc/ChangeLog.arm
    branches/ARM/embedded-4_6-branch/gcc/testsuite/ChangeLog.arm
    branches/ARM/embedded-4_6-branch/gcc/tree-sra.c