The attached code, when compiled with gcc-3.3 or 3.4 at -O2, generates a "ldd -23 (%r3),%r19" sequence, which is wrong. Compiling with -O1 or with -O2 -fno-gcse works around the problem.

The offending asm seems to be generated from the if-statement in loop_unregister_transfer(). Removing the up/down calls makes the problem go away.

This test case was derived from the Linux kernel, drivers/block/loop.c.

thanks,
randolph

-------- 8< cut here 8< -------------
#define __PA_LDCW_ALIGNMENT 16
#define __ldcw_align(a) ({ \
	unsigned long __ret = (unsigned long) a; \
	__ret = (__ret + __PA_LDCW_ALIGNMENT - 1) & ~(__PA_LDCW_ALIGNMENT - 1); \
	(volatile unsigned int *) __ret; \
})

#define __ldcw(a) ({ \
	unsigned __ret; \
	__asm__ __volatile__("ldcw 0(%1),%0" : "=r" (__ret) : "r" (a)); \
	__ret; \
})

typedef struct {
	volatile unsigned int lock[4];
} spinlock_t;

struct semaphore {
	spinlock_t sentry;
	int count;
};

struct loop_device {
	struct loop_func_table *lo_encryption;
	struct semaphore lo_ctl_mutex;
};

static int max_loop = 8;
static struct loop_device *loop_dev;

void __down (struct semaphore *sem);
void __up (struct semaphore *sem);

static inline void _raw_spin_lock (spinlock_t * x)
{
	volatile unsigned int *a = __ldcw_align(&x->lock[0]);
	while (__ldcw(a) == 0)
		while (*a == 0);
}

static inline void _raw_spin_unlock (spinlock_t * x)
{
	volatile unsigned int *a = __ldcw_align(&x->lock[0]);
	*a = 1;
}

inline void down (struct semaphore *sem)
{
	_raw_spin_lock (&sem->sentry);
	if (sem->count > 0) {
		sem->count--;
	} else {
		__down (sem);
	}
	_raw_spin_unlock (&sem->sentry);
}

inline void up (struct semaphore *sem)
{
	_raw_spin_lock (&sem->sentry);
	if (sem->count < 0) {
		__up (sem);
	} else {
		sem->count++;
	}
	_raw_spin_unlock (&sem->sentry);
}

int loop_unregister_transfer (int number)
{
	struct loop_device *lo = &loop_dev[0];

	for (lo = &loop_dev[0]; lo < &loop_dev[max_loop]; lo++) {
		down (&lo->lo_ctl_mutex);
		if (lo->lo_encryption == 0)
			loop_release_xfer (lo);
		up (&lo->lo_ctl_mutex);
	}
	return 0;
}
-------- 8< cut here 8< -------------
Does -fno-strict-aliasing help?
Nope, sorry

legolas[20:55] /tmp% hppa64-linux-gcc -v
Reading specs from /usr/lib/gcc-lib/hppa64-linux/3.3.3/specs
Configured with: ../src/configure --enable-languages=c --prefix=/usr --disable-shared --disable-nls --enable-sjlj-exceptions --disable-threads --includedir=/usr/hppa64-linux/include --with-as=/usr/bin/hppa64-linux-as --with-ld=/usr/bin/hppa64-linux-ld --host=hppa-linux --build=hppa-linux --target=hppa64-linux
Thread model: single
gcc version 3.3.3 (Debian 20040306)
legolas[20:55] /tmp% hppa64-linux-gcc -O2 -fno-strict-aliasing -c bug.c
/tmp/cc91TVyq.s: Assembler messages:
/tmp/cc91TVyq.s:67: Error: Field not properly aligned [8] (-23).
/tmp/cc91TVyq.s:67: Error: Invalid operands
Retargeting as this is a regression for 3.4.0 also.
Created attachment 6515
Another test case

One more test case that triggers the bug. Hope it helps ;-)
[Adding off-topic information: the bug was also submitted to the Debian BTS, as http://bugs.debian.org/253883 ]
Created attachment 6576
Triggers code generation of unaligned load.

Attached is a simpler testcase than the original; I'm triggering the bug in 3.3. I'm using the compiler right now for 64-bit kernel compilations and some initial 64-bit userspace bring-up. If there is anything I can do to help, please ask. This bug is a pain for hppa64 users :(

Thanks for all the help!

Cheers,
Carlos.
With Carlos's testcase at -O2, the initial insns in the function bug are:

	ldi 0,%r20
	ldo 1(%r26),%r26
	ldo 1(%r20),%r20
	ldd 23(%r26),%r21

This appears to be a bug in the PA GO_IF_LEGITIMATE_ADDRESS macro. It should reject 14-bit offsets for DImode loads that aren't properly aligned, as ldd doesn't support them. ldd differs from ldb/ldh/ldw in this respect: it needs both the base and the displacement aligned.
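To make the proposed check concrete, here is a minimal sketch of the rule described above. It is not the text of GO_IF_LEGITIMATE_ADDRESS from pa.h. The helper names `fits_signed_14` and `disp_ok_for_access` are hypothetical, and the sketch assumes the displacement field is a signed 14-bit value and that only 8-byte (DImode) accesses need a multiple-of-8 displacement, as the assembler message "Field not properly aligned [8] (-23)" suggests.

```c
#include <stdbool.h>

/* Hypothetical helper: does disp fit in a signed 14-bit field?
   (Assumption: the short displacement form is signed 14-bit.) */
static bool fits_signed_14(long disp)
{
    return disp >= -8192 && disp <= 8191;
}

/* Hypothetical model of the check discussed in this report:
   access_size is the size of the load or store in bytes.
   Per the comment above, ldb/ldh/ldw are treated as accepting any
   in-range displacement, while an 8-byte ldd/std additionally
   requires the displacement to be a multiple of 8. */
static bool disp_ok_for_access(long disp, int access_size)
{
    if (!fits_signed_14(disp))
        return false;
    if (access_size == 8)
        return (disp % 8) == 0;  /* remainder is 0 only for multiples of 8, negative or not */
    return true;
}
```

Under this model, `disp_ok_for_access(23, 8)` and `disp_ok_for_access(-23, 8)` are both rejected, matching the offsets seen in the bad `ldd` insns, while `disp_ok_for_access(24, 8)` is accepted. A rejected address would have to be formed some other way, for example by computing it into a register first.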
Created attachment 6577
Possible fix. Applies against 3.5.
Confirmed.
Created attachment 6581
patch for 3.3 branch
Bootstrapped the 3.3 branch for hpux-linux with no regressions compared to a bootstrap without this patch.

Matthias
This is OK for 3.4.1; please apply.
Subject: Bug 14782

CVSROOT: /cvs/gcc
Module name: gcc
Changes by: danglin@gcc.gnu.org 2004-06-21 23:49:05

Modified files:
	gcc : ChangeLog
	gcc/config/pa : pa.c pa.h

Log message:
	PR rtl-optimization/14782
	* pa.c (emit_move_sequence): Use SFmode for 4-byte modes when doing
	the address checks for secondary reloads for loads from and stores to
	floating-point registers.
	* pa.h (EXTRA_CONSTRAINT, case T): Use SFmode for 4-byte modes in the
	address check. Move work around for ELF32 targets to
	GO_IF_LEGITIMATE_ADDRESS.
	(GO_IF_LEGITIMATE_ADDRESS): Require constant offsets to be correctly
	aligned for DImode loads and stores. Don't allow long SFmode
	displacements on ELF32.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.4071&r2=2.4072
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/pa/pa.c.diff?cvsroot=gcc&r1=1.251&r2=1.252
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/pa/pa.h.diff?cvsroot=gcc&r1=1.218&r2=1.219
Subject: Bug 14782

CVSROOT: /cvs/gcc
Module name: gcc
Branch: gcc-3_4-branch
Changes by: danglin@gcc.gnu.org 2004-06-22 00:18:09

Modified files:
	gcc : ChangeLog
	gcc/config/pa : pa.c pa.h

Log message:
	PR rtl-optimization/14782
	* pa.c (emit_move_sequence): Use SFmode for 4-byte modes when doing
	the address checks for secondary reloads for loads from and stores to
	floating-point registers.
	* pa.h (EXTRA_CONSTRAINT, case T): Use SFmode for 4-byte modes in the
	address check. Move work around for ELF32 targets to
	GO_IF_LEGITIMATE_ADDRESS.
	(GO_IF_LEGITIMATE_ADDRESS): Require constant offsets to be correctly
	aligned for DImode loads and stores. Don't allow long SFmode
	displacements on ELF32.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=2.2326.2.517&r2=2.2326.2.518
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/pa/pa.c.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.235.4.4&r2=1.235.4.5
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/pa/pa.h.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.208.4.5&r2=1.208.4.6
Fixed in 3.4 and 3.5 branches.
Created attachment 6600
Fix for 3.3 branch
(In reply to comment #16)
> Created an attachment (id=6600)
> Fix for 3.3 branch

OK for 3.3.x.
Subject: Bug 14782

CVSROOT: /cvs/gcc
Module name: gcc
Branch: gcc-3_3-branch
Changes by: doko@gcc.gnu.org 2004-07-25 18:53:32

Modified files:
	gcc : ChangeLog
	gcc/config/pa : pa.c pa.h

Log message:
	2004-06-21 John David Anglin <dave.anglin@nrc-cnrc.gc.ca>

	PR rtl-optimization/14782
	* pa.c (emit_move_sequence): Use SFmode for 4-byte modes when doing
	the address checks for secondary reloads for loads from and stores to
	floating-point registers.
	* pa.h (EXTRA_CONSTRAINT, case T): Use SFmode for 4-byte modes in the
	address check. Move work around for ELF32 targets to
	GO_IF_LEGITIMATE_ADDRESS.
	(GO_IF_LEGITIMATE_ADDRESS): Require constant offsets to be correctly
	aligned for DImode loads and stores. Don't allow long SFmode
	displacements on ELF32.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-3_3-branch&r1=1.16114.2.1004&r2=1.16114.2.1005
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/pa/pa.c.diff?cvsroot=gcc&only_with_tag=gcc-3_3-branch&r1=1.188.2.15&r2=1.188.2.16
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/pa/pa.h.diff?cvsroot=gcc&only_with_tag=gcc-3_3-branch&r1=1.178.2.6&r2=1.178.2.7
Fixed in 3.3.5 also.