An unexpected unaligned access instructions for volatile int are generated on gcc 4.5 or 4.6 with this test case. struct st { int ptr; } __attribute__ ((packed)); int foo(struct st *st) { int v = *(volatile int *)&st->ptr; return v; } This test case is derived from something like this in ARM linux kernel. struct st { unsigned int ptr; } __attribute__ ((packed)); unsigned int foo(struct st *st) { return readl(&st->ptr); } mipsel-linux-gcc-4.5.1 -O2 foo.c -S output: lwl $2,3($4) nop lwr $2,0($4) j $31 nop arm-linux-gnueabi-gcc-4.5.1 -O2 foo.c -S output: ldrb r3, [r0, #0] @ zero_extendqisi2 ldrb r1, [r0, #1] @ zero_extendqisi2 ldrb r2, [r0, #2] @ zero_extendqisi2 orr r3, r3, r1, asl #8 ldrb r0, [r0, #3] @ zero_extendqisi2 orr r3, r3, r2, asl #16 orr r0, r3, r0, asl #24 bx lr It seems the cast is ignored and unaligned access is assumed. gcc 4.4.4 works fine. mipsel-linux-gcc-4.4.4 -O2 foo.c -S output: lw $2,0($4) j $31 nop arm-linux-gnueabi-gcc-4.4.4 -O2 foo.c -S output: ldr r0, [r0, #0] bx lr The similar problem was found as PR 45704. git-bisect tell me same commit 0d9f1189f3df5ce5c0efc3ecadc7c0a4f75b202d is the first bad commit. Like PR 45704, reverting this change fixes this problme too. - STRIP_USELESS_TYPE_CONVERSION (sub); + STRIP_NOPS (sub); But the final fix for PR 45704 does not fix this problem. mipsel-linux-gcc (4.6.0 20100927): lbu $5,0($4) lbu $6,1($4) lbu $3,2($4) andi $6,$6,0x00ff lbu $2,3($4) andi $5,$5,0x00ff sll $6,$6,8 andi $3,$3,0x00ff or $4,$6,$5 sll $3,$3,16 or $3,$3,$4 sll $2,$2,24 j $31 or $2,$2,$3 arm-linux-gnueabi-gcc (4.6.0 20100927): ldrb r3, [r0, #0] @ zero_extendqisi2 ldrb r1, [r0, #1] @ zero_extendqisi2 ldrb r2, [r0, #2] @ zero_extendqisi2 orr r3, r3, r1, asl #8 ldrb r0, [r0, #3] @ zero_extendqisi2 orr r3, r3, r2, asl #16 orr r0, r3, r0, asl #24 bx lr
As a matter of clean implementation I suggest to do struct st { int ptr; } __attribute__ ((packed,aligned(__alignof__(int)))); (why use packed if the int is always aligned?)
(In reply to comment #1) > (why use packed if the int is always aligned?) The original problem was found with this structure in linux ehci_def.h: struct ehci_caps { u32 hc_capbase; u32 hcs_params; /* HCSPARAMS - offset 0x4 */ u32 hcc_params; /* HCCPARAMS - offset 0x8 */ u8 portroute [8]; /* nibbles for routing - offset 0xC */ } __attribute__ ((packed)); In this case maybe the "packed" is not needed, but there will be other cases which require "packed" attribute on struct for memory-mapped I/O.
(In reply to comment #1) > As a matter of clean implementation I suggest to do > > struct st { > int ptr; > } __attribute__ ((packed,aligned(__alignof__(int)))); I confirmed this fixes the problem. But fixing all packed structure in linux kernel like this would be so hard, I think.
Can someone check if 4.6 is really not affected?
(In reply to comment #4) > Can someone check if 4.6 is really not affected? arm-linux-gnueabi-gcc-4.6.0-20101124 works fine (generates ldr instruction), but mipsel-linux-gcc-4.6.0-20101124 still generates lbu (load-byte-unsigned) instructions.
GCC 4.5.2 is being released, adjusting target milestone.
(In reply to comment #5) > arm-linux-gnueabi-gcc-4.6.0-20101124 works fine (generates ldr instruction), It seems that was a side-effect of -fstrict-volatile-bitfields which was enabled by default on ARM EABI. With -fno-strict-volatile-bitfields, arm-linux-gnueabi-gcc-4.6.0-20110222 generates four ldrb instructions.
GCC 4.5.3 is being released, adjusting target milestone.
here is another testcase which generates loadbytes with gcc 4.6 but works as expected with gcc 4.5 on arm. gcc are latest from respective branches. and it fails irrespective of having -fstrict-volatile-bitfields or not. struct ehci_regs { char x; unsigned int port_status[0]; } __attribute__ ((packed)); //} __attribute__ ((packed,aligned(__alignof__(int)))); struct ehci_hcd{ struct ehci_regs *regs; }; int ehci_hub_control ( struct ehci_hcd *ehci, int wIndex ) { unsigned int *status_reg = &ehci->regs->port_status[wIndex]; return *(volatile unsigned int *)status_reg; }
Testing a fix for the volatile issue in comment #9. I can't reproduce anything wrong with the testcase from the initial comment, that looks like a target / expander issue. The issues probably should have had different bugs instead of lumping them together here.
Author: rguenth Date: Fri Jul 22 11:55:30 2011 New Revision: 176623 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=176623 Log: 2011-07-22 Richard Guenther <rguenther@suse.de> PR tree-optimization/45819 * tree-ssa-forwprop.c (forward_propagate_addr_expr_1): Properly preserve volatile and notrap flags. * gcc.dg/pr45819.c: New testcase. Added: trunk/gcc/testsuite/gcc.dg/pr45819.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-ssa-forwprop.c
Author: rguenth Date: Fri Jul 22 12:19:21 2011 New Revision: 176624 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=176624 Log: 2011-07-22 Richard Guenther <rguenther@suse.de> PR tree-optimization/45819 * tree-ssa-forwprop.c (forward_propagate_addr_expr_1): Properly preserve volatile and notrap flags. * gcc.dg/pr45819.c: New testcase. Added: branches/gcc-4_6-branch/gcc/testsuite/gcc.dg/pr45819.c Modified: branches/gcc-4_6-branch/gcc/ChangeLog branches/gcc-4_6-branch/gcc/testsuite/ChangeLog branches/gcc-4_6-branch/gcc/tree-ssa-forwprop.c
Comment #9 should be fixed now. That leaves the initial report which I can't reproduce - thus, not mine anymore. Please someone verify comment #9 on arm.
Oh, the initial testcase was invalid anyway.
struct ehci_regs { char x; unsigned int port_status[0]; } __attribute__ ((packed)); //} __attribute__ ((packed,aligned(__alignof__(int)))); struct ehci_hcd{ struct ehci_regs *regs; }; int ehci_hub_control ( struct ehci_hcd *ehci, int wIndex ) { unsigned int *status_reg = &ehci->regs->port_status[wIndex]; return *(volatile unsigned int *)status_reg; } this one is invalid as well, with or without the aligned attribute. ehci->regs->port_status[wIndex] _is_ unaligned. I don't think there is currently a way to tell GCC that the struct layout of ehci_regs is packed but port_status is properly aligned to the natural alignment of int.