Bug 64516 - [5 Regression] arm: wrong unaligned load generated
Summary: [5 Regression] arm: wrong unaligned load generated
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 4.9.2
: P2 normal
Target Milestone: 5.5
Assignee: Richard Biener
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2015-01-07 02:07 UTC by Markus F.X.J. Oberhumer
Modified: 2017-09-18 13:16 UTC (History)
2 users (show)

See Also:
Host:
Target: arm-linux-gnueabi
Build:
Known to work: 4.3.3, 5.4.1, 6.2.0, 7.0
Known to fail: 4.9.0, 4.9.1, 4.9.2, 5.3.0, 5.4.0, 6.1.0
Last reconfirmed: 2015-01-16 00:00:00


Attachments
unaligned_load_bug.c to reproduce the problem. (464 bytes, text/plain)
2015-01-07 02:07 UTC, Markus F.X.J. Oberhumer
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Markus F.X.J. Oberhumer 2015-01-07 02:07:49 UTC
Created attachment 34391 [details]
unaligned_load_bug.c to reproduce the problem.

arm: bad code generated => run time crash

- gcc 4.9.2 seems to forget the attribute(packed) in some cases.

- please see the disassembly for get16_unaligned().

- put16_unaligned() is correct !

- FWIW, gcc 4.3.5 and clang 3.5 work

~Markus


$ cat unaligned_load_bug.c
typedef struct { char a[2]; } __attribute__((__packed__)) TU2;

// OK: correct (but poor) code generated for the store
void put16_unaligned(void *p, unsigned short v) {
    if (sizeof(TU2) == sizeof(v) && __alignof__(TU2) == 1) {
        *(TU2 *)p = *(const TU2 *)(const void *)(&v);
    }
}

// BUG: incorrect transformation into an aligned load => run time crash !!
unsigned short get16_unaligned(const void *p) {
    unsigned short v;
    if (sizeof(TU2) == sizeof(v) && __alignof__(TU2) == 1) {
        *(TU2 *)(void *)(&v) = *(const TU2 *)p;
    }
    return v;
}

// aligned versions - just for comparison
void put16_aligned(void *p, unsigned short v) {
    *(unsigned short *)p = *(&v);
}
unsigned short get16_aligned(const void *p) {
    unsigned short v;
    *(&v) = *(const unsigned short *)p;
    return v;
}

// EOF

$ arm-linux-gnueabi-gcc-4.9 -v
gcc version 4.9.2 (Ubuntu/Linaro 4.9.2-7ubuntu3)

$ arm-linux-gnueabi-gcc-4.9 -march=armv4 -marm -O2 -Wall -W -Wcast-align -c unaligned_load_bug.c

$ arm-linux-gnueabi-objdump -d unaligned_load_bug.o

unaligned_load_bug.o:     file format elf32-littlearm

Disassembly of section .text:

00000000 <put16_unaligned>:
   0:   e52de004        push    {lr}            ; (str lr, [sp, #-4]!)
   4:   e24dd00c        sub     sp, sp, #12
   8:   e28d3008        add     r3, sp, #8
   c:   e16310b2        strh    r1, [r3, #-2]!
  10:   e3a02002        mov     r2, #2
  14:   e1a01003        mov     r1, r3
  18:   ebfffffe        bl      0 <memcpy>
  1c:   e28dd00c        add     sp, sp, #12
  20:   e49de004        pop     {lr}            ; (ldr lr, [sp], #4)
  24:   e12fff1e        bx      lr

00000028 <get16_unaligned>:
  28:   e1d000b0        ldrh    r0, [r0]
  2c:   e12fff1e        bx      lr

00000030 <put16_aligned>:
  30:   e1c010b0        strh    r1, [r0]
  34:   e12fff1e        bx      lr

00000038 <get16_aligned>:
  38:   e1d000b0        ldrh    r0, [r0]
  3c:   e12fff1e        bx      lr
Comment 1 Ramana Radhakrishnan 2015-01-16 14:24:12 UTC
Hmmm, I'm not sure if such type punning pushes the attributes through. We seem to think that the alignment will be 16 bits and not 8 bits at expand time.

(insn 6 3 7 2 (set (reg:SI 115)
        (zero_extend:SI (mem:HI (reg/v/f:SI 112 [ p ]) [1 MEM[(const struct TU2 *)p_2(D)]+0 S2 A16]))) /tmp/al.c:17 -1
     (nil))
Comment 2 Markus F.X.J. Oberhumer 2015-02-05 12:11:49 UTC
This is a wrong code regression against gcc 4.3.3.

I've booted an older ARM Ubuntu box just for testing:

$ gcc -v
[...]
gcc version 4.3.3 (Ubuntu 4.3.3-5ubuntu4)

$ gcc -O3 -Wall -W -Wcast-align -c unaligned_load_bug.c

$ objdump -d unaligned_load_bug.o

unaligned_load_bug.o:     file format elf32-littlearm

Disassembly of section .text:

00000000 <put16_unaligned>:
   0:   e52de004        push    {lr}            ; (str lr, [sp, #-4]!)
   4:   e24dd00c        sub     sp, sp, #12     ; 0xc
   8:   e28d3008        add     r3, sp, #8      ; 0x8
   c:   e3a02002        mov     r2, #2  ; 0x2
  10:   e16310b2        strh    r1, [r3, #-2]!
  14:   e1a01003        mov     r1, r3
  18:   ebfffffe        bl      0 <memcpy>
  1c:   e28dd00c        add     sp, sp, #12     ; 0xc
  20:   e8bd8000        pop     {pc}

00000024 <get16_unaligned>:
  24:   e52de004        push    {lr}            ; (str lr, [sp, #-4]!)
  28:   e24dd00c        sub     sp, sp, #12     ; 0xc
  2c:   e1a01000        mov     r1, r0
  30:   e3a02002        mov     r2, #2  ; 0x2
  34:   e28d0006        add     r0, sp, #6      ; 0x6
  38:   ebfffffe        bl      0 <memcpy>
  3c:   e1dd00b6        ldrh    r0, [sp, #6]
  40:   e28dd00c        add     sp, sp, #12     ; 0xc
  44:   e8bd8000        pop     {pc}

00000048 <put16_aligned>:
  48:   e1c010b0        strh    r1, [r0]
  4c:   e12fff1e        bx      lr

00000050 <get16_aligned>:
  50:   e1d000b0        ldrh    r0, [r0]
  54:   e12fff1e        bx      lr
Comment 3 Markus F.X.J. Oberhumer 2015-02-05 12:34:33 UTC
Just for reference, here is the expected result:

$ clang -target armv5-unknown-linux-gnu -marm -mfloat-abi=soft --version
clang version 3.5.1 (tags/RELEASE_351/final)
Target: armv5-unknown-linux-gnu
Thread model: posix

$ clang -target armv5-unknown-linux-gnu -marm -mfloat-abi=soft -O2 -Wall -W -Wcast-align -c unaligned_load_bug.c

$ arm-linux-gnueabihf-objdump -d unaligned_load_bug.o

unaligned_load_bug.o:     file format elf32-littlearm

Disassembly of section .text:

00000000 <put16_unaligned>:
   0:   e5c01000        strb    r1, [r0]
   4:   e1a01421        lsr     r1, r1, #8
   8:   e5c01001        strb    r1, [r0, #1]
   c:   e12fff1e        bx      lr

00000010 <get16_unaligned>:
  10:   e5d01000        ldrb    r1, [r0]
  14:   e5d00001        ldrb    r0, [r0, #1]
  18:   e1810400        orr     r0, r1, r0, lsl #8
  1c:   e12fff1e        bx      lr

00000020 <put16_aligned>:
  20:   e1c010b0        strh    r1, [r0]
  24:   e12fff1e        bx      lr

00000028 <get16_aligned>:
  28:   e1d000b0        ldrh    r0, [r0]
  2c:   e12fff1e        bx      lr
Comment 4 Richard Biener 2016-06-10 12:17:21 UTC
get16_unaligned should work.  Confirmed on trunk.

(insn 6 5 7 (set (reg:SI 114)
        (zero_extend:SI (mem:HI (reg/v/f:SI 111 [ p ]) [1 MEM[(const struct TU2 *)p_2(D)]+0 S2 A16]))) t.c:7 -1
     (nil))

This is because the MEM_REF has a wrong type/alignment on it.  Ah, and it's
one folding I know is wrong for a long time ... :/

    case VIEW_CONVERT_EXPR:
      if (TREE_CODE (op0) == MEM_REF)
        {
          tem = fold_build2_loc (loc, MEM_REF, type,
                                 TREE_OPERAND (op0, 0), TREE_OPERAND (op0, 1));
          REF_REVERSE_STORAGE_ORDER (tem) = REF_REVERSE_STORAGE_ORDER (op0);
          return tem;
        }

this drops alignment info.

Index: gcc/fold-const.c
===================================================================
--- gcc/fold-const.c    (revision 237286)
+++ gcc/fold-const.c    (working copy)
@@ -7964,6 +7974,8 @@ fold_unary_loc (location_t loc, enum tre
     case VIEW_CONVERT_EXPR:
       if (TREE_CODE (op0) == MEM_REF)
         {
+         if (TYPE_ALIGN (TREE_TYPE (op0)) != TYPE_ALIGN (type))
+           type = build_aligned_type (type, TYPE_ALIGN (TREE_TYPE (op0)));
          tem = fold_build2_loc (loc, MEM_REF, type,
                                 TREE_OPERAND (op0, 0), TREE_OPERAND (op0, 1));
          REF_REVERSE_STORAGE_ORDER (tem) = REF_REVERSE_STORAGE_ORDER (op0);

then generates

get16_unaligned:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        ldrb    r3, [r0]        @ zero_extendqisi2
        ldrb    r0, [r0, #1]    @ zero_extendqisi2
        orr     r0, r3, r0, lsl #8
        mov     pc, lr
Comment 5 Richard Biener 2016-06-13 07:35:17 UTC
Author: rguenth
Date: Mon Jun 13 07:34:45 2016
New Revision: 237355

URL: https://gcc.gnu.org/viewcvs?rev=237355&root=gcc&view=rev
Log:
2016-06-13  Richard Biener  <rguenther@suse.de>

	PR middle-end/64516
	* fold-const.c (fold_unary_loc): Preserve alignment when
	folding a VIEW_CONVERT_EXPR into a MEM_REF.

	* gcc.dg/align-3.c: New testcase.

Added:
    trunk/gcc/testsuite/gcc.dg/align-3.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/fold-const.c
    trunk/gcc/testsuite/ChangeLog
Comment 6 Richard Biener 2016-06-13 07:41:17 UTC
Fixed on trunk sofar.
Comment 7 Richard Biener 2016-07-07 11:37:56 UTC
Author: rguenth
Date: Thu Jul  7 11:37:24 2016
New Revision: 238085

URL: https://gcc.gnu.org/viewcvs?rev=238085&root=gcc&view=rev
Log:
2016-07-07  Richard Biener  <rguenther@suse.de>

        Backport from mainline
        2016-06-13  Richard Biener  <rguenther@suse.de>

	PR middle-end/64516
	* fold-const.c (fold_unary_loc): Preserve alignment when
	folding a VIEW_CONVERT_EXPR into a MEM_REF.

	* gcc.dg/align-3.c: New testcase.

Added:
    branches/gcc-6-branch/gcc/testsuite/gcc.dg/align-3.c
Modified:
    branches/gcc-6-branch/gcc/ChangeLog
    branches/gcc-6-branch/gcc/fold-const.c
    branches/gcc-6-branch/gcc/testsuite/ChangeLog
Comment 8 Markus F.X.J. Oberhumer 2016-08-02 07:19:08 UTC
Richard, many thanks for finally fixing this issue.

Will there be a backport to 4.9.4 and/or 5.4 ?
Comment 9 rguenther@suse.de 2016-08-02 07:30:57 UTC
On Tue, 2 Aug 2016, markus at oberhumer dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64516
> 
> --- Comment #8 from Markus F.X.J. Oberhumer <markus at oberhumer dot com> ---
> Richard, many thanks for finally fixing this issue.
> 
> Will there be a backport to 4.9.4 and/or 5.4 ?

It's too late for 4.9.4 but I will backport for 5.5.

Richard.
Comment 10 Richard Biener 2016-08-03 11:43:43 UTC
GCC 4.9 branch is being closed
Comment 11 Markus F.X.J. Oberhumer 2016-10-03 10:21:13 UTC
Did you have a chance to backport this for 5.5 yet ?

Thanks,
Markus
Comment 12 rguenther@suse.de 2016-10-03 16:31:30 UTC
On October 3, 2016 12:21:13 PM GMT+02:00, markus at oberhumer dot com <gcc-bugzilla@gcc.gnu.org> wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64516
>
>--- Comment #11 from Markus F.X.J. Oberhumer <markus at oberhumer dot
>com> ---
>Did you have a chance to backport this for 5.5 yet ?

Not yet, I tried but the patch didn't trivially apply.

>Thanks,
>Markus
Comment 13 Richard Biener 2017-09-18 13:15:18 UTC
Author: rguenth
Date: Mon Sep 18 13:14:45 2017
New Revision: 252926

URL: https://gcc.gnu.org/viewcvs?rev=252926&root=gcc&view=rev
Log:
2017-09-18  Richard Biener  <rguenther@suse.de>

	Backport from mainline
	2017-04-10  Richard Biener  <rguenther@suse.de>

	PR middle-end/80362
	* fold-const.c (fold_binary_loc): Look at unstripped ops when
	looking for NEGATE_EXPR in -A / -B to A / B folding.

	* gcc.dg/torture/pr80362.c: New testcase.

	2015-11-25  Richard Biener  <rguenther@suse.de>

	PR middle-end/68528
	* fold-const.c (fold_binary_loc): Do not call negate_expr_p
	on stripped operands.

	* gcc.dg/torture/pr68528.c: New testcase.

	2017-03-27  Richard Biener  <rguenther@suse.de>

	PR middle-end/80171
	* gimple-fold.c (fold_ctor_reference): Properly guard against
	NULL return value from canonicalize_constructor_val.

	* g++.dg/torture/pr80171.C: New testcase.

	2016-06-13  Richard Biener  <rguenther@suse.de>

	PR middle-end/64516
	* fold-const.c (fold_unary_loc): Preserve alignment when
	folding a VIEW_CONVERT_EXPR into a MEM_REF.

	* gcc.dg/align-3.c: New testcase.

Added:
    branches/gcc-5-branch/gcc/testsuite/g++.dg/torture/pr80171.C
    branches/gcc-5-branch/gcc/testsuite/gcc.dg/align-3.c
    branches/gcc-5-branch/gcc/testsuite/gcc.dg/torture/pr68528.c
    branches/gcc-5-branch/gcc/testsuite/gcc.dg/torture/pr80362.c
Modified:
    branches/gcc-5-branch/gcc/ChangeLog
    branches/gcc-5-branch/gcc/fold-const.c
    branches/gcc-5-branch/gcc/gimple-fold.c
    branches/gcc-5-branch/gcc/testsuite/ChangeLog
Comment 14 Richard Biener 2017-09-18 13:16:33 UTC
Fixed.