Bug 45819 - [4.5 Regression] unexpected unaligned access to volatile int
Summary: [4.5 Regression] unexpected unaligned access to volatile int
Status: RESOLVED INVALID
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 4.5.1
: P2 normal
Target Milestone: 4.5.4
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-09-28 15:30 UTC by Atsushi Nemoto
Modified: 2011-07-22 12:25 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2011-07-22 09:49:13


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Atsushi Nemoto 2010-09-28 15:30:34 UTC
An unexpected unaligned access instructions for volatile int are
generated on gcc 4.5 or 4.6 with this test case.

struct st {
	int ptr;
} __attribute__ ((packed));

int foo(struct st *st)
{
	int v = *(volatile int *)&st->ptr;
	return v;
}

This test case is derived from something like this in ARM linux kernel.

struct st {
	unsigned int ptr;
} __attribute__ ((packed));
unsigned int foo(struct st *st)
{
	return readl(&st->ptr);
}

mipsel-linux-gcc-4.5.1 -O2 foo.c -S output:
	lwl	$2,3($4)
	nop
	lwr	$2,0($4)
	j	$31
	nop
arm-linux-gnueabi-gcc-4.5.1 -O2 foo.c -S output:
	ldrb	r3, [r0, #0]	@ zero_extendqisi2
	ldrb	r1, [r0, #1]	@ zero_extendqisi2
	ldrb	r2, [r0, #2]	@ zero_extendqisi2
	orr	r3, r3, r1, asl #8
	ldrb	r0, [r0, #3]	@ zero_extendqisi2
	orr	r3, r3, r2, asl #16
	orr	r0, r3, r0, asl #24
	bx	lr

It seems the cast is ignored and unaligned access is assumed.
gcc 4.4.4 works fine.
mipsel-linux-gcc-4.4.4 -O2 foo.c -S output:
	lw	$2,0($4)
	j	$31
	nop
arm-linux-gnueabi-gcc-4.4.4 -O2 foo.c -S output:
	ldr	r0, [r0, #0]
	bx	lr

The similar problem was found as PR 45704.

git-bisect tell me same commit
0d9f1189f3df5ce5c0efc3ecadc7c0a4f75b202d is the first bad commit.

Like PR 45704, reverting this change fixes this problme too.

-  STRIP_USELESS_TYPE_CONVERSION (sub);
+  STRIP_NOPS (sub);

But the final fix for PR 45704 does not fix this problem.

mipsel-linux-gcc (4.6.0 20100927):
	lbu	$5,0($4)
	lbu	$6,1($4)
	lbu	$3,2($4)
	andi	$6,$6,0x00ff
	lbu	$2,3($4)
	andi	$5,$5,0x00ff
	sll	$6,$6,8
	andi	$3,$3,0x00ff
	or	$4,$6,$5
	sll	$3,$3,16
	or	$3,$3,$4
	sll	$2,$2,24
	j	$31
	or	$2,$2,$3
arm-linux-gnueabi-gcc (4.6.0 20100927):
	ldrb	r3, [r0, #0]	@ zero_extendqisi2
	ldrb	r1, [r0, #1]	@ zero_extendqisi2
	ldrb	r2, [r0, #2]	@ zero_extendqisi2
	orr	r3, r3, r1, asl #8
	ldrb	r0, [r0, #3]	@ zero_extendqisi2
	orr	r3, r3, r2, asl #16
	orr	r0, r3, r0, asl #24
	bx	lr
Comment 1 Richard Biener 2010-09-28 16:04:38 UTC
As a matter of clean implementation I suggest to do

struct st {
    int ptr;
} __attribute__ ((packed,aligned(__alignof__(int))));

(why use packed if the int is always aligned?)
Comment 2 Atsushi Nemoto 2010-09-28 16:26:17 UTC
(In reply to comment #1)
> (why use packed if the int is always aligned?)

The original problem was found with this structure in linux ehci_def.h:
struct ehci_caps {
	u32		hc_capbase;
	u32		hcs_params;     /* HCSPARAMS - offset 0x4 */
	u32		hcc_params;      /* HCCPARAMS - offset 0x8 */
	u8		portroute [8];	 /* nibbles for routing - offset 0xC */
} __attribute__ ((packed));

In this case maybe the "packed" is not needed, but there will be other cases
which require "packed" attribute on struct for memory-mapped I/O.
Comment 3 Atsushi Nemoto 2010-09-30 02:06:18 UTC
(In reply to comment #1)
> As a matter of clean implementation I suggest to do
> 
> struct st {
>     int ptr;
> } __attribute__ ((packed,aligned(__alignof__(int))));

I confirmed this fixes the problem.
But fixing all packed structure in linux kernel like this would be so hard, I think.
Comment 4 Richard Biener 2010-11-12 13:57:45 UTC
Can someone check if 4.6 is really not affected?
Comment 5 Atsushi Nemoto 2010-11-24 14:34:00 UTC
(In reply to comment #4)
> Can someone check if 4.6 is really not affected?

arm-linux-gnueabi-gcc-4.6.0-20101124 works fine (generates ldr instruction),
but mipsel-linux-gcc-4.6.0-20101124 still generates lbu (load-byte-unsigned) instructions.
Comment 6 Richard Biener 2010-12-16 13:02:50 UTC
GCC 4.5.2 is being released, adjusting target milestone.
Comment 7 Atsushi Nemoto 2011-02-22 13:04:10 UTC
(In reply to comment #5)
> arm-linux-gnueabi-gcc-4.6.0-20101124 works fine (generates ldr instruction),

It seems that was a side-effect of -fstrict-volatile-bitfields which was enabled
by default on ARM EABI.
With -fno-strict-volatile-bitfields, arm-linux-gnueabi-gcc-4.6.0-20110222 generates
four ldrb instructions.
Comment 8 Richard Biener 2011-04-28 14:51:01 UTC
GCC 4.5.3 is being released, adjusting target milestone.
Comment 9 Khem Raj 2011-06-09 06:47:01 UTC
here is another testcase which generates loadbytes with gcc 4.6 but works as expected with gcc 4.5 on arm. gcc are latest from respective branches. and it fails irrespective of having -fstrict-volatile-bitfields or not.

struct ehci_regs {                                                              
char x;                                                                         
unsigned int port_status[0];                                                                                                                                    
} __attribute__ ((packed));                                                     
//} __attribute__ ((packed,aligned(__alignof__(int))));                         
                                                                                
struct ehci_hcd{                                                                
struct ehci_regs *regs;                                                         
};                                                                              
                                                                                
int ehci_hub_control (                                                          
 struct ehci_hcd *ehci,                                                         
 int wIndex                                                                     
) {                                                                             
 unsigned int *status_reg = &ehci->regs->port_status[wIndex];                   
 return *(volatile unsigned int *)status_reg;                                   
}
Comment 10 Richard Biener 2011-07-22 09:49:13 UTC
Testing a fix for the volatile issue in comment #9.

I can't reproduce anything wrong with the testcase from the initial comment,
that looks like a target / expander issue.

The issues probably should have had different bugs instead of lumping them
together here.
Comment 11 Richard Biener 2011-07-22 11:55:33 UTC
Author: rguenth
Date: Fri Jul 22 11:55:30 2011
New Revision: 176623

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=176623
Log:
2011-07-22  Richard Guenther  <rguenther@suse.de>

	PR tree-optimization/45819
	* tree-ssa-forwprop.c (forward_propagate_addr_expr_1): Properly
	preserve volatile and notrap flags.

	* gcc.dg/pr45819.c: New testcase.

Added:
    trunk/gcc/testsuite/gcc.dg/pr45819.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-forwprop.c
Comment 12 Richard Biener 2011-07-22 12:19:25 UTC
Author: rguenth
Date: Fri Jul 22 12:19:21 2011
New Revision: 176624

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=176624
Log:
2011-07-22  Richard Guenther  <rguenther@suse.de>

	PR tree-optimization/45819
	* tree-ssa-forwprop.c (forward_propagate_addr_expr_1): Properly
	preserve volatile and notrap flags.

	* gcc.dg/pr45819.c: New testcase.

Added:
    branches/gcc-4_6-branch/gcc/testsuite/gcc.dg/pr45819.c
Modified:
    branches/gcc-4_6-branch/gcc/ChangeLog
    branches/gcc-4_6-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_6-branch/gcc/tree-ssa-forwprop.c
Comment 13 Richard Biener 2011-07-22 12:20:30 UTC
Comment #9 should be fixed now.  That leaves the initial report which I
can't reproduce - thus, not mine anymore.

Please someone verify comment #9 on arm.
Comment 14 Richard Biener 2011-07-22 12:21:30 UTC
Oh, the initial testcase was invalid anyway.
Comment 15 Richard Biener 2011-07-22 12:25:01 UTC
struct ehci_regs {                                                              
char x;                                                                         
unsigned int port_status[0];                                                    
} __attribute__ ((packed));                                                     
//} __attribute__ ((packed,aligned(__alignof__(int))));                         

struct ehci_hcd{                                                                
struct ehci_regs *regs;                                                         
};                                                                              

int ehci_hub_control (                                                          
 struct ehci_hcd *ehci,                                                         
 int wIndex                                                                     
) {                                                                             
 unsigned int *status_reg = &ehci->regs->port_status[wIndex];                   
 return *(volatile unsigned int *)status_reg;                                   
}

this one is invalid as well, with or without the aligned attribute.

ehci->regs->port_status[wIndex] _is_ unaligned.  I don't think there
is currently a way to tell GCC that the struct layout of ehci_regs is
packed but port_status is properly aligned to the natural alignment of int.