This is the mail archive of the gcc-help@gcc.gnu.org mailing list for the GCC project.
RE: Inline asm for ARM
> -----Original Message-----
> From: Andrew Haley [mailto:aph@redhat.com]
> Sent: Wednesday, June 16, 2010 13:23
> To: Pavel Pavlov
> Cc: gcc-help@gcc.gnu.org
> Subject: Re: Inline asm for ARM
>
> On 06/16/2010 06:12 PM, Pavel Pavlov wrote:
> >> From: gcc-help-owner@gcc.gnu.org [mailto:gcc-help-owner@gcc.gnu.org]
> >> On Behalf Of Andrew Haley
> >>
> >>> By the way, the version that takes hi:lo for the first int64 works fine:
> >>>
> >>> static __inline void smlalbb(int * lo, int * hi, int x, int y) {
> >>> #if defined(__CC_ARM)
> >>> __asm { smlalbb *lo, *hi, x, y; }
> >>> #elif defined(__GNUC__)
> >>> __asm__ __volatile__("smlalbb %0, %1, %2, %3" : "+r"(*lo), "+r"(*hi)
> >>> : "r"(x), "r"(y));
> >>> #endif
> >>> }
> >>>
> >>>
> >>> void test_smlalXX(int hi, int lo, int a, int b) {
> >>> smlalbb(&hi, &lo, a, b);
> >>> smlalbt(&hi, &lo, a, b);
> >>> smlaltb(&hi, &lo, a, b);
> >>> smlaltt(&hi, &lo, a, b);
> >>> }
> >>>
> >>> Translates directly into four asm opcodes
> >>
> >> Mmmm, but the volatile is wrong. If you need volatile to stop gcc
> >> from deleting your asm, you have a mistake somewhere.
> >
> > I had to add volatile when I had that mess with "=&r" and "0", now I
> > think it might be removed.
>
> > Just tested, and I still need that. The reason I needed that was
> > because my test function was a noop:
>
> > void test_smlalXX(int lo, int hi, int a, int b) {
> > smlalbb(&lo, &hi, a, b);
> > smlalbt(&lo, &hi, a, b);
> > smlaltb(&lo, &hi, a, b);
> > smlaltt(&lo, &hi, a, b);
> > }
>
> > Gcc correctly deduces that the function has no side effects
> > if I don't use volatile. So I removed volatile and made the
> > function return a value:
> >
> > uint64_t test_smlalXX(int lo, int hi, int a, int b) {
> > smlalbb(&lo, &hi, a, b);
> > smlalbt(&lo, &hi, a, b);
> > smlaltb(&lo, &hi, a, b);
> > smlaltt(&lo, &hi, a, b);
> >
> > T64 retval;
> >
> > retval.s.hi = hi;
> > retval.s.lo = lo;
> > return retval.i64;
> > }
> >
> > The output becomes:
> > 000000e4 <_Z12test_smlalXXiiii>:
> >   e4:   e92d0030    push    {r4, r5}
> >   e8:   e1410382    smlalbb r0, r1, r2, r3
> >   ec:   e14103c2    smlalbt r0, r1, r2, r3
> >   f0:   e14103a2    smlaltb r0, r1, r2, r3
> >   f4:   e1a05001    mov     r5, r1
> >   f8:   e14503e2    smlaltt r0, r5, r2, r3
> >   fc:   e1a04000    mov     r4, r0
> >  100:   e1a01005    mov     r1, r5
> >  104:   e8bd0030    pop     {r4, r5}
> >  108:   e12fff1e    bx      lr
> >
> > Basically, gcc gets confused about the return variable and generates
> > useless gunk at the end for the last call (smlaltt). I tried commenting
> > out smlaltt(&lo, &hi, a, b); in test_smlalXX, and gcc still generates
> > the same useless code around smlaltb.
>
> I have seen something similar with higher optimization levels, where some pass
> messes things up a bit. Your
>
> mov r4, r0
>
> is very weird, though. I can't explain that.
>
> -O1 generates perfect code for me, though.
>
> Andrew.
[Pavel Pavlov]
That's similar to the bizarre listing I sent previously. I can't explain what's happening; it just emits some code that has no meaning at all. -O1, -O2 and -O3 generate identical results for me.
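
For reference, here is a minimal sketch of the volatile-free variant discussed in this thread, reduced to a single smlalbb call. Several parts are assumptions and not from the original messages: the T64 union is not shown in the thread, so the 64-bit result is assembled with shifts instead; the __arm__ / __ARM_FEATURE_DSP guard and the portable C fallback are added only so the file also builds on non-ARM hosts; and test_smlalbb and self_check are hypothetical names. It is not claimed that this changes the extra mov instructions seen in the listing.

```c
#include <assert.h>
#include <stdint.h>

/* Accumulate (bottom halfword of x) * (bottom halfword of y) into hi:lo. */
static inline void smlalbb(int *lo, int *hi, int x, int y)
{
#if defined(__GNUC__) && defined(__arm__) && defined(__ARM_FEATURE_DSP)
    /* No volatile here: the "+r" operands tell GCC that *lo and *hi are
       both read and written, so the asm is kept whenever the caller
       actually uses the results. */
    __asm__("smlalbb %0, %1, %2, %3"
            : "+r"(*lo), "+r"(*hi)
            : "r"(x), "r"(y));
#else
    /* Portable fallback (an assumption, not from the thread) so the
       sketch can be compiled and checked on a non-ARM host. */
    uint64_t acc = ((uint64_t)(uint32_t)*hi << 32) | (uint32_t)*lo;
    acc += (uint64_t)((int64_t)(int16_t)x * (int16_t)y);
    *lo = (int)(uint32_t)acc;
    *hi = (int)(uint32_t)(acc >> 32);
#endif
}

/* Hypothetical test function: returning the accumulator gives the asm a
   visible effect, so GCC has no reason to delete it. */
uint64_t test_smlalbb(int lo, int hi, int a, int b)
{
    smlalbb(&lo, &hi, a, b);
    /* Shifts stand in for the T64 union, whose definition is not shown. */
    return ((uint64_t)(uint32_t)hi << 32) | (uint32_t)lo;
}

/* Hypothetical self-check with a few hand-computed values. */
void self_check(void)
{
    assert(test_smlalbb(0, 0, 3, 4) == 12u);
    assert(test_smlalbb(0, 0, 0x10002, 3) == 6u);      /* top halfword ignored */
    assert(test_smlalbb(-1, 0, 1, 1) == 0x100000000u); /* carry from lo into hi */
}
```

Comparing the output of objdump -d for this version at -O1 and -O2 may help show whether the stray moves come from the asm operands or from the way the 64-bit return value is built.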