Inlined assembly instruction pushed out of loop in GCC 4.4.1/ARM

Simon Kagstrom
Tue Aug 18 14:45:00 GMT 2009


I'm having a compile issue with the Linux kernel for ARM and GCC 4.4.1
(also occurs in 4.3.3 at least). The code (from orion_nand.c in the
Linux kernel, cooked down) looks like this:

  #include <stdint.h>

  void *vobb = (void*)0x12345678;

  void orion_nand_read_buf(uint8_t *buf, int len)
  {
  	void *io_base = vobb;
  	uint64_t *buf64;
  	int i = 0;

  	buf64 = (uint64_t *)buf;
  	while (i < len/8) {
  		uint64_t x;
  		asm ("ldrd\t%0, [%1]" : "=r" (x) : "r" (io_base));
  		buf64[i++] = x;
  	}
  }

and the problem is that the ldrd instruction is moved out of the loop:

  00000000 <orion_nand_read_buf>:
     0:	e2813007 	add	r3, r1, #7
     4:	e3510000 	cmp	r1, #0
     8:	b1a01003 	movlt	r1, r3
     c:	e59f302c 	ldr	r3, [pc, #44]	
    10:	e92d4010 	push	{r4, lr}
    14:	e5932000 	ldr	r2, [r3]
    18:	e1c220d0 	ldrd	r2, [r2]                     # Here is the ldrd instruction!
    1c:	e1a011c1 	asr	r1, r1, #3
    20:	e3a0c000 	mov	ip, #0
    24:	ea000000 	b	2c
    28:	e18020f4 	strd	r2, [r0, r4]                 # But the loop starts here
    2c:	e15c0001 	cmp	ip, r1
    30:	e1a0418c 	lsl	r4, ip, #3
    34:	e28cc001 	add	ip, ip, #1
    38:	bafffffa 	blt	28                           # Loop back
    3c:	e8bd8010 	pop	{r4, pc}
    40:	00000000 	.word	0x00000000
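
Read back into C, the listing above behaves roughly like the sketch
below: the device is read once before the loop and the same value is
then stored len/8 times. The function name read_buf_as_compiled and
the plain pointer dereference standing in for ldrd are my own
illustration, not kernel code.

```c
#include <stdint.h>

/* Hypothetical C rendering of the optimized output: the read that the
 * source placed inside the loop has been hoisted in front of it. */
void read_buf_as_compiled(const volatile uint64_t *io_base,
                          uint64_t *buf64, int len)
{
	uint64_t x = *io_base;	/* single read, like the ldrd at 0x18 */
	int i;

	for (i = 0; i < len / 8; i++)
		buf64[i] = x;	/* repeated store, like the strd at 0x28 */
}
```

For a NAND data register that returns new data on every access, this
fills the buffer with copies of the first 8 bytes only.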

GCC is built with crosstool-ng and is called like this:

  arm-unknown-linux-gnueabi-gcc -mcpu=arm926ej-s -c tst.c -o /tmp/tst.o -Os

The problem does not occur without optimization, but it does occur at
every -O level. It can be solved by making the inline asm statement
volatile, but the Linux developers are not happy with that change.
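
For reference, a minimal sketch of the volatile variant. The helper
name read_io64 and the function name orion_nand_read_buf_volatile are
made up for this example, and the non-ARM branch is only a stand-in so
that the sketch also builds on other hosts. It is not the kernel's
actual code.

```c
#include <stdint.h>

/* Sketch only: read one 64-bit word from the device on every call. */
static inline uint64_t read_io64(void *io_base)
{
	uint64_t x;
#if defined(__arm__)
	/* volatile tells GCC the asm has side effects, so it must stay
	 * inside the loop and be executed on every iteration. */
	__asm__ volatile ("ldrd\t%0, [%1]" : "=r" (x) : "r" (io_base));
#else
	/* Stand-in for non-ARM hosts (assumption): a volatile load. */
	x = *(volatile uint64_t *)io_base;
#endif
	return x;
}

void orion_nand_read_buf_volatile(void *io_base, uint8_t *buf, int len)
{
	uint64_t *buf64 = (uint64_t *)buf;
	int i = 0;

	while (i < len / 8)
		buf64[i++] = read_io64(io_base);
}
```

Without volatile, GCC is free to treat an asm that has outputs as a
pure function of its inputs, and io_base never changes inside the loop.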

Now, I'm not sure that the inline assembly is completely correct
either. The ldrd instruction requires that the destination register
should be even (I presume since it puts the result in a register
pair), and maybe that should be encoded in the output operand
constraints? At least GCC 3.4.4 gives me

  Error: destination register must be even -- `ldrd r1,[r2]'

when compiling this code.
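
One possible way to get an even destination register, assuming GCC's
explicit register variables are acceptable here, is to pin x to r2 so
that the 64-bit value occupies the pair r2/r3. This is only a sketch:
the function name read_io64_even is made up, and the non-ARM branch is
a stand-in so that it builds on other hosts.

```c
#include <stdint.h>

uint64_t read_io64_even(void *io_base)
{
#if defined(__arm__)
	/* Local register variable: x lives in r2, so a 64-bit value
	 * takes r2/r3 and ldrd sees an even first register. */
	register uint64_t x __asm__ ("r2");

	__asm__ volatile ("ldrd\t%0, [%1]" : "=r" (x) : "r" (io_base));
	return x;
#else
	/* Stand-in for non-ARM hosts (assumption): a volatile load. */
	return *(volatile uint64_t *)io_base;
#endif
}
```

This does not express the requirement as an operand constraint; it
just forces one particular register pair.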

So is this a GCC bug or should I head back to the Linux people with it?

// Simon
