Bug 14782 - [3.3/3.4/4.0 Regression] produces an unaligned data access at -O2
Summary: [3.3/3.4/4.0 Regression] produces an unaligned data access at -O2
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization
Version: 3.4.0
Importance: P2 normal
Target Milestone: 3.3.5
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2004-03-30 07:38 UTC by Randolph Chung
Modified: 2004-10-30 21:10 UTC
CC List: 4 users

See Also:
Host: hppa-linux
Target: hppa64-linux
Build: hppa-linux
Known to work: 3.3.5 3.4.1 4.0.0
Known to fail:
Last reconfirmed: 2004-06-20 06:07:18


Attachments
Another test case (276 bytes, text/plain)
2004-06-11 05:36 UTC, Randolph Chung
Triggers code generation of unaligned load. (272 bytes, text/x-csrc)
2004-06-19 20:20 UTC, Carlos O'Donell
Possible fix. (593 bytes, patch)
2004-06-20 01:20 UTC, John David Anglin
patch for 3.3 branch (520 bytes, patch)
2004-06-20 10:19 UTC, Debian GCC Maintainers
Fix for 3.3 branch (2.11 KB, patch)
2004-06-22 01:49 UTC, John David Anglin

Description Randolph Chung 2004-03-30 07:38:35 UTC
The attached code, when compiled with gcc-3.3 or 3.4 at -O2, generates an
"ldd -23(%r3),%r19" instruction, which is wrong. Compiling with -O1 or with
-O2 -fno-gcse works around the problem. The offending asm seems to be generated
from the if-statement in loop_unregister_transfer(). Removing the up/down calls
makes the problem go away. This test case was derived from the Linux kernel,
drivers/block/loop.c.

thanks, randolph

-------- 8< cut here 8< -------------
#define __PA_LDCW_ALIGNMENT 16
#define __ldcw_align(a) ({ \
  unsigned long __ret = (unsigned long) a;                              \
  __ret = (__ret + __PA_LDCW_ALIGNMENT - 1) & ~(__PA_LDCW_ALIGNMENT - 1); \
  (volatile unsigned int *) __ret;                                      \
})


#define __ldcw(a) ({ \
        unsigned __ret; \
        __asm__ __volatile__("ldcw 0(%1),%0" : "=r" (__ret) : "r" (a)); \
        __ret; \
})

typedef struct {
  volatile unsigned int lock[4];
} spinlock_t;

struct semaphore
{
  spinlock_t sentry;
  int count;
};

struct loop_device
{
  struct loop_func_table *lo_encryption;
  struct semaphore lo_ctl_mutex;
};

static int max_loop = 8;
static struct loop_device *loop_dev;

void __down (struct semaphore *sem);
void __up (struct semaphore *sem);

static inline void _raw_spin_lock (spinlock_t * x)
{
  volatile unsigned int *a = __ldcw_align(&x->lock[0]);

  while (__ldcw(a) == 0)
    while (*a == 0);
}

static inline void _raw_spin_unlock (spinlock_t * x)
{
  volatile unsigned int *a = __ldcw_align(&x->lock[0]);
  *a = 1;
}

inline void down (struct semaphore *sem)
{
  _raw_spin_lock (&sem->sentry);
  if (sem->count > 0) { sem->count--; }
  else { __down (sem); }
  _raw_spin_unlock (&sem->sentry);
}

inline void up (struct semaphore *sem)
{
  _raw_spin_lock (&sem->sentry);
  if (sem->count < 0) { __up (sem); }
  else { sem->count++; }
  _raw_spin_unlock (&sem->sentry);
}

int
loop_unregister_transfer (int number)
{
  struct loop_device *lo = &loop_dev[0];

  for (lo = &loop_dev[0]; lo < &loop_dev[max_loop]; lo++)
    {
      down (&lo->lo_ctl_mutex);

      if (lo->lo_encryption == 0)
        loop_release_xfer (lo);

      up (&lo->lo_ctl_mutex);
    }

  return 0;
}
-------- 8< cut here 8< -------------
Comment 1 Andrew Pinski 2004-04-07 03:22:22 UTC
Does -fno-strict-aliasing help?
Comment 2 Randolph Chung 2004-04-07 04:19:40 UTC
Nope, sorry

legolas[20:55] /tmp% hppa64-linux-gcc -v
Reading specs from /usr/lib/gcc-lib/hppa64-linux/3.3.3/specs
Configured with: ../src/configure --enable-languages=c --prefix=/usr
--disable-shared --disable-nls --enable-sjlj-exceptions --disable-threads
--includedir=/usr/hppa64-linux/include --with-as=/usr/bin/hppa64-linux-as
--with-ld=/usr/bin/hppa64-linux-ld --host=hppa-linux --build=hppa-linux
--target=hppa64-linux
Thread model: single
gcc version 3.3.3 (Debian 20040306)
legolas[20:55] /tmp% hppa64-linux-gcc -O2 -fno-strict-aliasing -c bug.c
/tmp/cc91TVyq.s: Assembler messages:
/tmp/cc91TVyq.s:67: Error: Field not properly aligned [8] (-23).
/tmp/cc91TVyq.s:67: Error: Invalid operands
Comment 3 Andrew Pinski 2004-06-08 19:25:02 UTC
Retargeting as this is a regression for 3.4.0 also.
Comment 4 Randolph Chung 2004-06-11 05:36:42 UTC
Created attachment 6515 [details]
Another test case

One more test case that triggers the bug. Hope it helps ;-)
Comment 5 Debian GCC Maintainers 2004-06-13 10:26:30 UTC
[Off-topic note: this bug was also submitted to the Debian BTS as
http://bugs.debian.org/253883 ]
Comment 6 Carlos O'Donell 2004-06-19 20:20:16 UTC
Created attachment 6576 [details]
Triggers code generation of unaligned load.

Attached is a simpler testcase than the original; it triggers the bug in 3.3.
I'm using the compiler right now for 64-bit kernel compilations and some
initial 64-bit userspace bring-up. If there is anything I can do to help, please
ask. This bug is a pain for hppa64 users :(

Thanks for all the help!
Cheers,
Carlos.
Comment 7 John David Anglin 2004-06-20 00:08:23 UTC
With Carlos's testcase at -O2, the initial insns in the function bug are:

        ldi 0,%r20
        ldo 1(%r26),%r26
        ldo 1(%r20),%r20
        ldd 23(%r26),%r21

This appears to be a bug in the PA GO_IF_LEGITIMATE_ADDRESS macro.  It should
reject 14-bit offsets for DImode loads that aren't properly aligned as ldd
doesn't support them.  ldd differs from ldb/ldh/ldw in this respect.  It
needs both the base and displacement aligned.
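
To make the rule above concrete, here is a minimal sketch in C of the kind of
displacement check being described. It is only an illustration: the helper
names are hypothetical and this is not the code in pa.h. It assumes a signed
14-bit displacement field and the 8-byte alignment reported by the assembler
error in comment 2. Note that in the insns above the effective address happens
to be aligned (the base was bumped by 1, then 23 is added), but the
displacement field itself is not, which is what ldd cannot encode.

```c
#include <stdbool.h>

/* Hypothetical helpers for illustration only; not the code in pa.h. */

/* True if disp fits in a signed 14-bit displacement field. */
static bool fits_14_bits(long disp)
{
  return disp >= -8192 && disp <= 8191;
}

/* 8-byte (DImode) access via ldd: the displacement must be in range
   and also a multiple of 8.  */
static bool dimode_disp_ok(long disp)
{
  return fits_14_bits(disp) && disp % 8 == 0;
}

/* Byte access via ldb: only the range matters, since the displacement
   itself carries no alignment requirement.  */
static bool qimode_disp_ok(long disp)
{
  return fits_14_bits(disp);
}
```

Under these assumptions, dimode_disp_ok rejects both 23 (comment 7) and -23
(the original report) while accepting 24, and qimode_disp_ok accepts 23.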
Comment 8 John David Anglin 2004-06-20 01:20:43 UTC
Created attachment 6577 [details]
Possible fix.

Applies against 3.5.
Comment 9 Andrew Pinski 2004-06-20 06:07:18 UTC
Confirmed.
Comment 10 Debian GCC Maintainers 2004-06-20 10:19:02 UTC
Created attachment 6581 [details]
patch for 3.3 branch
Comment 11 Debian GCC Maintainers 2004-06-20 19:23:13 UTC
Bootstrapped the 3.3 branch for hpux-linux with no regressions compared to a
bootstrap without this patch.

    Matthias
Comment 12 Mark Mitchell 2004-06-21 21:13:19 UTC
This is OK for 3.4.1; please apply.
Comment 13 GCC Commits 2004-06-21 23:49:11 UTC
Subject: Bug 14782

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	danglin@gcc.gnu.org	2004-06-21 23:49:05

Modified files:
	gcc            : ChangeLog 
	gcc/config/pa  : pa.c pa.h 

Log message:
	PR rtl-optimization/14782
	* pa.c (emit_move_sequence): Use SFmode for 4-byte modes when doing
	the address checks for secondary reloads for loads from and stores
	to floating-point registers.
	* pa.h (EXTRA_CONSTRAINT, case T): Use SFmode for 4-byte modes
	in the address check.  Move work around for ELF32 targets to
	GO_IF_LEGITIMATE_ADDRESS.
	(GO_IF_LEGITIMATE_ADDRESS): Require constant offsets to be
	correctly aligned for DImode loads and stores.  Don't allow long
	SFmode displacements on ELF32.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.4071&r2=2.4072
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/pa/pa.c.diff?cvsroot=gcc&r1=1.251&r2=1.252
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/pa/pa.h.diff?cvsroot=gcc&r1=1.218&r2=1.219

Comment 14 GCC Commits 2004-06-22 00:18:13 UTC
Subject: Bug 14782

CVSROOT:	/cvs/gcc
Module name:	gcc
Branch: 	gcc-3_4-branch
Changes by:	danglin@gcc.gnu.org	2004-06-22 00:18:09

Modified files:
	gcc            : ChangeLog 
	gcc/config/pa  : pa.c pa.h 

Log message:
	PR rtl-optimization/14782
	* pa.c (emit_move_sequence): Use SFmode for 4-byte modes when doing
	the address checks for secondary reloads for loads from and stores
	to floating-point registers.
	* pa.h (EXTRA_CONSTRAINT, case T): Use SFmode for 4-byte modes
	in the address check.  Move work around for ELF32 targets to
	GO_IF_LEGITIMATE_ADDRESS.
	(GO_IF_LEGITIMATE_ADDRESS): Require constant offsets to be
	correctly aligned for DImode loads and stores.  Don't allow long
	SFmode displacements on ELF32.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=2.2326.2.517&r2=2.2326.2.518
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/pa/pa.c.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.235.4.4&r2=1.235.4.5
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/pa/pa.h.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.208.4.5&r2=1.208.4.6

Comment 15 John David Anglin 2004-06-22 01:45:00 UTC
Fixed in 3.4 and 3.5 branches.
Comment 16 John David Anglin 2004-06-22 01:49:08 UTC
Created attachment 6600 [details]
Fix for 3.3 branch
Comment 17 Gabriel Dos Reis 2004-07-25 17:35:09 UTC
(In reply to comment #16)
> Created an attachment (id=6600)
> Fix for 3.3 branch

OK for 3.3.x.
Comment 18 GCC Commits 2004-07-25 18:53:35 UTC
Subject: Bug 14782

CVSROOT:	/cvs/gcc
Module name:	gcc
Branch: 	gcc-3_3-branch
Changes by:	doko@gcc.gnu.org	2004-07-25 18:53:32

Modified files:
	gcc            : ChangeLog 
	gcc/config/pa  : pa.c pa.h 

Log message:
	2004-06-21  John David Anglin  <dave.anglin@nrc-cnrc.gc.ca>
	
	PR rtl-optimization/14782
	* pa.c (emit_move_sequence): Use SFmode for 4-byte modes when doing
	the address checks for secondary reloads for loads from and stores
	to floating-point registers.
	* pa.h (EXTRA_CONSTRAINT, case T): Use SFmode for 4-byte modes
	in the address check.  Move work around for ELF32 targets to
	GO_IF_LEGITIMATE_ADDRESS.
	(GO_IF_LEGITIMATE_ADDRESS): Require constant offsets to be
	correctly aligned for DImode loads and stores.  Don't allow long
	SFmode displacements on ELF32.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-3_3-branch&r1=1.16114.2.1004&r2=1.16114.2.1005
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/pa/pa.c.diff?cvsroot=gcc&only_with_tag=gcc-3_3-branch&r1=1.188.2.15&r2=1.188.2.16
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/pa/pa.h.diff?cvsroot=gcc&only_with_tag=gcc-3_3-branch&r1=1.178.2.6&r2=1.178.2.7

Comment 19 Andrew Pinski 2004-07-25 19:05:11 UTC
Fixed in 3.3.5 also.