Inline asm for ARM

Pavel Pavlov pavel@summit-tech.ca
Wed Jun 16 16:51:00 GMT 2010


> -----Original Message-----
> On 06/16/2010 01:15 PM, Andrew Haley wrote:
> > On 06/16/2010 11:23 AM, Pavel Pavlov wrote:
...
> inline uint64_t smlalbb(uint64_t acc, unsigned int lo, unsigned int hi) {
>   union
>   {
>     uint64_t ll;
>     struct
>     {
>       unsigned int l;
>       unsigned int h;
>     } s;
>   } retval;
> 
>   retval.ll = acc;
> 
>   __asm__("smlalbb %0, %1, %2, %3"
> 	  : "+r"(retval.s.l), "+r"(retval.s.h)
> 	  : "r"(lo), "r"(hi));
> 
>   return retval.ll;
> }
> 

[Pavel Pavlov] 
Later on I found out that I had to use the "+r" constraint. But then, when I use that function, for example like this:
int64_t rsmlalbb64(int64_t i, int x, int y)
{
	return smlalbb64(i, x, y);
}

Gcc generates this asm:
<rsmlalbb64>:
push	{r4, r5}
mov	r4, r0
mov	ip, r1
smlalbb	r4, ip, r2, r3
mov	r5, ip
mov	r0, r4
mov	r1, ip
pop	{r4, r5}
bx	lr

What gcc is doing in that function is bizarre. I would understand if it couldn't optimize this and use r0 and r1 directly, but from that listing it looks as if gcc got drunk and decided to touch r5 for absolutely no reason!

The expected output should have been:
<rsmlalbb64>:
smlalbb	r0, r1, r2, r3
bx	lr

I'm using cegcc 4.1.0 and I compile with:
arm-mingw32ce-g++ -O3 -mcpu=arm1136j-s -c ARM_TEST.cpp -o ARM_TEST_GCC.obj

Is there a way to access the individual parts of that 64-bit input integer, or is there a way to specify that two 32-bit integers should be treated as the Hi:Lo parts of a 64-bit variable? It's commonly done with a temporary, but the result is that gcc generates too much junk.
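
To make the question concrete, here is a minimal sketch of the two things being asked about: splitting a 64-bit value into its Hi:Lo halves with shifts instead of a union, and what the wrapper is meant to compute (acc plus the signed product of the bottom 16 bits of each operand). The names lo32, hi32, join64, smlalbb64_sketch and the USE_SMLALBB_ASM macro are made up for illustration. The asm branch assumes that the ARM back end's %Q0/%R0 operand modifiers (low and high register of a 64-bit operand) are available in the compiler in use; I have not checked that against cegcc 4.1.0. The default branch is plain C and makes no claim about which instructions gcc will pick.

```c
#include <stdint.h>

/* Low 32 bits of a 64-bit value. */
static inline uint32_t lo32(uint64_t v) { return (uint32_t)v; }

/* High 32 bits of a 64-bit value. */
static inline uint32_t hi32(uint64_t v) { return (uint32_t)(v >> 32); }

/* Rebuild a 64-bit value from its Hi:Lo halves. */
static inline uint64_t join64(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* acc + (signed low halfword of x) * (signed low halfword of y). */
static inline int64_t smlalbb64_sketch(int64_t acc, int x, int y)
{
#if defined(USE_SMLALBB_ASM)
    /* Assumption: %Q0 and %R0 print the low and high registers of the
       64-bit operand 0, so no union or temporary pair is needed.
       Only meaningful when targeting an ARM core that has smlalbb. */
    __asm__("smlalbb %Q0, %R0, %1, %2"
            : "+r"(acc)
            : "r"(x), "r"(y));
    return acc;
#else
    /* Portable model. The narrowing casts to int16_t rely on gcc's
       modular conversion for out-of-range values. */
    int32_t prod = (int32_t)(int16_t)x * (int32_t)(int16_t)y;

    /* Add in unsigned arithmetic so the sum wraps modulo 2^64, as the
       instruction does, instead of overflowing a signed type. */
    uint64_t sum = (uint64_t)acc + (uint64_t)(int64_t)prod;

    /* Split and rejoin only to show the Hi:Lo round trip. */
    return (int64_t)join64(hi32(sum), lo32(sum));
#endif
}
```

With the default branch, smlalbb64_sketch(10, 3, 4) gives 22, and the upper halfwords of x and y are ignored.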


