Inline asm for ARM

Pavel Pavlov pavel@summit-tech.ca
Wed Jun 16 16:55:00 GMT 2010


> -----Original Message-----
> From: Andrew Haley [mailto:aph@redhat.com]
> On 06/16/2010 05:11 PM, Pavel Pavlov wrote:
> >> -----Original Message-----
> >> On 06/16/2010 01:15 PM, Andrew Haley wrote:
> >>> On 06/16/2010 11:23 AM, Pavel Pavlov wrote:
> > ...
> >> inline uint64_t smlalbb(uint64_t acc, unsigned int lo, unsigned int hi) {
> >>   union
> >>   {
> >>     uint64_t ll;
> >>     struct
> >>     {
> >>       unsigned int l;
> >>       unsigned int h;
> >>     } s;
> >>   } retval;
> >>
> >>   retval.ll = acc;
> >>
> >>   __asm__("smlalbb %0, %1, %2, %3"
> >> 	  : "+r"(retval.s.l), "+r"(retval.s.h)
> >> 	  : "r"(lo), "r"(hi));
> >>
> >>   return retval.ll;
> >> }
> >>
> >
> > [Pavel Pavlov]
> > Later on I found out that I had to use the +r constraint, but then, when I use that
> > function, for example like this:
> > int64_t rsmlalbb64(int64_t i, int x, int y) {
> > 	return smlalbb64(i, x, y);
> > }
> >
> > Gcc generates this asm:
> > <rsmlalbb64>:
> > push	{r4, r5}
> > mov	r4, r0
> > mov	ip, r1
> > smlalbb	r4, ip, r2, r3
> > mov	r5, ip
> > mov	r0, r4
> > mov	r1, ip
> > pop	{r4, r5}
> > bx	lr
> >
> > It's bizarre what gcc is doing in that function. I would understand if it
> > couldn't optimize and use r0 and r1 directly, but from that
> > listing it looks as if gcc got drunk and decided to touch r5 for
> > absolutely no reason!
> >
> > The expected output should have been:
> > <rsmlalbb64>:
> > smlalbb	r0, r1, r2, r3
> > bx	lr
> >
> > I'm using cegcc 4.1.0 and I compile with
> > arm-mingw32ce-g++ -O3 -mcpu=arm1136j-s -c ARM_TEST.cpp -o ARM_TEST_GCC.obj
> >
> > Is there a way to access the individual parts of that 64-bit input integer,
> > or is there a way to specify that two 32-bit integers should be
> > treated as the Hi:Lo parts of a 64-bit variable? It's commonly done with a
> > temporary, but the result is that gcc generates too much junk.
> 
> Why don't you just use the function I sent above?  It generates
> 
> smlalbb:
> 	smlalbb r0, r1, r2, r3
> 	mov	pc, lr
> 
> smlalXX64:
> 	smlalbb r0, r1, r2, r3
> 	smlalbt r0, r1, r2, r3
> 	smlaltb r0, r1, r2, r3
> 	smlaltt r0, r1, r2, r3
> 	mov	pc, lr
> 

[Pavel Pavlov] 
What's your gcc -v? The output I posted comes from your function.
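
When comparing what different compiler versions generate for the wrapper, it can help to have a portable C model of what the instruction is supposed to compute, plus a way to split and rejoin the 64-bit accumulator without a union. The following is only a minimal sketch and is not from the thread: the names lo32, hi32, join64 and smlalbb_ref are hypothetical, and it assumes the usual two's-complement narrowing that gcc performs when converting to int16_t. It models SMLALBB as "sign-extend the bottom halfword of each operand, multiply, add to the 64-bit accumulator".

```c
#include <stdint.h>

/* Hypothetical helpers (not from the thread): split a 64-bit value into
   32-bit halves with shifts, so no union or byte order is involved. */
static inline uint32_t lo32(uint64_t v) { return (uint32_t)v; }
static inline uint32_t hi32(uint64_t v) { return (uint32_t)(v >> 32); }

/* Rebuild the 64-bit value from its Hi:Lo halves. */
static inline uint64_t join64(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Portable model of SMLALBB (assumed semantics: signed 16x16 multiply of
   the bottom halfwords, accumulated into a 64-bit value). */
static inline int64_t smlalbb_ref(int64_t acc, uint32_t x, uint32_t y)
{
    /* Bottom halfwords, reinterpreted as signed 16-bit values.
       Assumes two's-complement narrowing, as gcc does. */
    int16_t xb = (int16_t)(uint16_t)x;
    int16_t yb = (int16_t)(uint16_t)y;
    int64_t prod = (int64_t)xb * yb;

    /* Add in unsigned arithmetic so that wraparound is well defined,
       matching the modulo-2^64 behaviour of the accumulate. */
    return (int64_t)((uint64_t)acc + (uint64_t)prod);
}
```

For instance, smlalbb_ref(10, 0x0001FFFF, 3) evaluates to 7, because the bottom halfword 0xFFFF is read as -1 and the top halfword is ignored. Whether a given gcc turns the shift-based split into plain register moves is a separate question and would need checking against the generated asm.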



