memcpy() inlining and alignment

Jason Thorpe thorpej@netbsd.org
Fri Mar 3 16:25:00 GMT 2000


Hi folks...

I've recently been bitten by some unfortunate compiler behavior, and I
would like your opinion as to whether or not this is a compiler bug or
an application bug (in this case, code in the BSD TCP/IP stack).

Background: certain platforms cannot perform unaligned accesses.
Specifically, the Alpha and SPARC will fault, and the ARM will silently
return incorrect data.  Because of this, we generally copy data which may
be unaligned into buffers which have known alignment before accessing it.
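
To make the idiom concrete, here is a minimal sketch of the kind of copy
I mean.  The struct layout and the function name are hypothetical and
exist only for illustration; they are not taken from the BSD sources.
The source is kept as a byte pointer, so nothing in the code claims it
is aligned, and only the local copy is accessed through the struct type.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical header layout, for illustration only. */
struct hdr {
	uint32_t seq;
	uint32_t ack;
};

/*
 * Copy a header out of a byte buffer at an arbitrary offset and
 * return one field from the aligned local copy.  The caller must
 * ensure that off + sizeof(struct hdr) bytes are available in buf.
 */
uint32_t
hdr_seq_at(const unsigned char *buf, size_t off)
{
	struct hdr h;

	/* Byte-wise copy; the source may sit at any address. */
	memcpy(&h, buf + off, sizeof(h));

	/* h is an ordinary local, so this access is aligned. */
	return (h.seq);
}
```

A caller would pass the raw packet buffer and the offset of the header
within it, and never dereference the unaligned address directly.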

Now, GCC can be configured to inline certain calls to memcpy().  However,
we're encountering a case where this inlining of a memcpy() is causing
unaligned accesses, because GCC apparently isn't taking into consideration
that the source address (in this case) may not be aligned.

The following code fragment illustrates the behavior.  Note that in the
first case, the compiler thinks it is copying a struct (this particular
struct has an alignment constraint of sizeof(int) on the Alpha), and in the
second case, it thinks it's copying a series of chars.

     ----- snip -----

#include <string.h>

struct s {
	unsigned int a;
	unsigned int b;
	unsigned int i;
};

unsigned int
foo(unsigned char *cp)
{
	struct s *src, dst;

	src = (struct s *)(cp + 1);

	memcpy(&dst, src, sizeof(struct s));

	return (dst.i);
}

unsigned int
bar(unsigned char *cp)
{
	struct s *src, dst;
	unsigned char *cp0;

	src = (struct s *)(cp + 1);
	cp0 = (unsigned char *) &dst;

	memcpy(cp0, src, sizeof(struct s));

	return (dst.i);
}

     ----- snip -----

When this code is compiled with -O0, it simply emits calls to memcpy(),
and the code fragment functions correctly (as the libc/libkern memcpy()
handles unaligned addresses).

However, when this code is compiled with -O1 or -O2, the memcpy() calls
are inlined by the compiler.  The result is unaligned access in the first
case and correct access in the second case.

     ----- snip -----

	.file	1 "gccbug.c"
	.version	"01.01"
	.set noat
.text
	.align 5
	.globl foo
	.ent foo
foo:
	.frame $30,16,$26,0
$foo..ng:
	subq $30,16,$30
	.prologue 0
	addq $16,1,$1
	lds $f10,1($16)
	ldl $2,8($1)
	lds $f11,4($1)
	sts $f10,0($30)
	addl $2,$31,$0
	stl $2,8($30)
	sts $f11,4($30)
	addq $30,16,$30
	ret $31,($26),1
	.end foo
	.align 5
	.globl bar
	.ent bar
bar:
	.frame $30,16,$26,0
$bar..ng:
	subq $30,16,$30
	.prologue 0
	addq $16,1,$7
	ldq_u $1,3($30)
	addq $30,4,$5
	ldl $2,1($16)
	ldq_u $4,0($30)
	msklh $1,$30,$1
	ldl $8,8($7)
	inslh $2,$30,$3
	ldl $6,4($7)
	insll $2,$30,$2
	mskll $4,$30,$4
	bis $1,$3,$1
	stq_u $1,3($30)
	bis $4,$2,$4
	stq_u $4,0($30)
	inslh $6,$5,$7
	ldq_u $1,7($30)
	insll $6,$5,$6
	ldq_u $2,4($30)
	addq $30,8,$3
	inslh $8,$3,$4
	msklh $1,$5,$1
	mskll $2,$5,$2
	bis $1,$7,$1
	stq_u $1,7($30)
	bis $2,$6,$2
	stq_u $2,4($30)
	insll $8,$3,$8
	ldq_u $1,11($30)
	ldq_u $2,8($30)
	msklh $1,$3,$1
	mskll $2,$3,$2
	bis $1,$4,$1
	stq_u $1,11($30)
	bis $2,$8,$2
	stq_u $2,8($30)
	ldl $0,8($30)
	addq $30,16,$30
	ret $31,($26),1
	.end bar
	.ident	"GCC: (GNU) egcs-2.91.66 19990314 (egcs-1.1.2 release)"

     ----- snip -----

Now, I realize that we're using a fairly dated compiler, so I went poking
around in egcs-current, having remembered that David Edelsohn had made
some changes regarding block copies and alignment (in this particular case,
it was some application code that was being unkind to the PowerPC).  However,
I'm not sure his change to SLOW_UNALIGNED_ACCESS will affect my problem,
since the Alpha GCC configuration already unconditionally defines
SLOW_UNALIGNED_ACCESS.

So, my questions are:

	(1) Is this a compiler bug, or is the code making the memcpy()
	    call broken?  I'm really hoping it's not the latter, since
	    there is a LOT of code out there which assumes it can
	    copy data from unaligned buffers into aligned buffers before
	    accessing it.

	(2) If it is a compiler bug, is it fixed in a later GCC version?

	(3) If the answer to (2) is "it has not been fixed", can someone
	    suggest a reasonable workaround (besides aliasing the pointer
	    in the code)?  I'm currently thinking "disable memcpy()
	    inlining" in the GCC configuration as a reasonable short-term
	    fix, but that isn't a good long-term solution.

	(4) If the answer to (2) is "yo, it's fixed!", can someone who
	    tracks GCC activity a little closer than I do give me a hint
	    as to where to start looking?  This is something I'll likely
	    have to pull into the current version of the compiler we're
	    using.
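
For reference, a per-call-site way of getting an out-of-line copy might
look like the sketch below.  This is not the GCC configuration change
mentioned in (3), and the names memcpy_outofline and baz are
hypothetical.  It assumes that calling through a volatile function
pointer is enough to stop the compiler from expanding the copy inline,
and it passes the source as a plain byte pointer so that no struct s *
is ever formed for the unaligned address.

```c
#include <string.h>

struct s {
	unsigned int a;
	unsigned int b;
	unsigned int i;
};

/*
 * Hypothetical helper.  The pointer is volatile, so the compiler must
 * load it at each call and cannot assume the target is memcpy(); the
 * copy is therefore done by the library routine, not inline code.
 */
static void *(*volatile memcpy_outofline)(void *, const void *, size_t) =
    memcpy;

unsigned int
baz(unsigned char *cp)
{
	struct s dst;

	/* The source stays a byte pointer; only dst has struct type. */
	memcpy_outofline(&dst, cp + 1, sizeof(dst));

	return (dst.i);
}
```

The cost is an indirect call for every copy, so this only makes sense
for the few call sites known to handle unaligned data.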

Thanks for your help!

	-- Jason R. Thorpe <thorpej@netbsd.org>


