This is the mail archive of the
gcc-help@gcc.gnu.org
mailing list for the GCC project.
Re: problems with gcc inline assembly using xmm registers
- From: David Palao <david dot palao at uv dot es>
- To: Nathan Sidwell <nathan at codesourcery dot com>
- Cc: gcc-help at gcc dot gnu dot org
- Date: Fri, 3 Dec 2004 18:08:02 +0100
- Subject: Re: problems with gcc inline assembly using xmm registers
- Organization: Universidad de Valencia
- References: <200412031628.53453.david.palao@uv.es> <41B09632.8040005@codesourcery.com>
On Friday, 3 December 2004 at 17:37, Nathan Sidwell wrote:
> David Palao wrote:
> > __asm__ __volatile__ ("movsd %0, %%xmm3 \n\t" \
> > "movsd %1, %%xmm6 \n\t" \
> > "movsd %2, %%xmm4 \n\t" \
> > "movsd %3, %%xmm7 \n\t" \
> > "movsd %4, %%xmm5 \n\t" \
> > "unpcklpd %%xmm3, %%xmm3 \n\t" \
> > "unpcklpd %%xmm6, %%xmm6 \n\t" \
> > "unpcklpd %%xmm4, %%xmm4 \n\t" \
> > "mulpd %%xmm0, %%xmm3 \n\t" \
>
> ....
>
> > "addpd %%xmm6, %%xmm5 \n\t" \
> > "addpd %%xmm7, %%xmm3 \n\t" \
> > "movsd %7, %%xmm6 \n\t" \
> > "movsd %8, %%xmm7 \n\t" \
> > "unpcklpd %%xmm6, %%xmm6 \n\t" \
> > "unpcklpd %%xmm7, %%xmm7 \n\t" \
> > "mulpd %%xmm1, %%xmm6 \n\t" \
> > "mulpd %%xmm2, %%xmm7 \n\t" \
> > "addpd %%xmm6, %%xmm4 \n\t" \
> > "addpd %%xmm7, %%xmm5" \
>
> don't write it this way, use the mmx builtins directly and then the
> compiler can handle all the register allocation for you. You'll
> have to be careful to arrange for no more than 8 mmx things
> to be live at one time though. That's not too hard to achieve
> if you're careful. I had success using this technique to do some
> 2D FFTs, it was way simpler than writing assembly directly.
>
> nathan
First, thanks for the reply.
What do you mean by "use the mmx builtins directly"?
(Remember, I'm only learning this stuff right now...)
If I understand correctly, it is possible to tell the compiler that it must
use ONLY xmm registers (you mean xmm, right?) for certain part(s) of
the code (and only those parts), and that it will then do so in the most
efficient way. Is that right?
How can I do that?
Regards
David
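
For readers following the thread, the advice quoted above can be sketched roughly as follows. This is only an illustration, not the code from either message: it assumes the SSE2 intrinsics from <emmintrin.h> are what "builtins" refers to, and the helper names scale_add3 and scale_add3_arrays are hypothetical. Because part of the original asm is elided, the sketch only mirrors the visible pattern (load a scalar, duplicate it into both halves, multiply, accumulate). On 32-bit x86 it would likely need -msse2.

```c
#include <emmintrin.h>  /* SSE2 intrinsics: __m128d, _mm_set1_pd, _mm_mul_pd, ... */

/* Hypothetical helper (not from the original post): computes
 * a*x + b*y + c*z on two doubles at once. No register names appear;
 * the compiler decides which xmm registers hold each value. */
static inline __m128d scale_add3(double a, double b, double c,
                                 __m128d x, __m128d y, __m128d z)
{
    /* Broadcast each scalar into both lanes, which is the effect of the
     * movsd + unpcklpd pairs in the hand-written asm. */
    __m128d va = _mm_set1_pd(a);
    __m128d vb = _mm_set1_pd(b);
    __m128d vc = _mm_set1_pd(c);

    /* Packed multiplies (mulpd) followed by packed adds (addpd). */
    __m128d acc = _mm_mul_pd(va, x);
    acc = _mm_add_pd(acc, _mm_mul_pd(vb, y));
    acc = _mm_add_pd(acc, _mm_mul_pd(vc, z));
    return acc;
}

/* Hypothetical array wrapper, using unaligned loads and stores so the
 * caller does not have to guarantee 16-byte alignment. */
void scale_add3_arrays(const double coef[3],
                       const double x[2], const double y[2],
                       const double z[2], double out[2])
{
    __m128d r = scale_add3(coef[0], coef[1], coef[2],
                           _mm_loadu_pd(x), _mm_loadu_pd(y), _mm_loadu_pd(z));
    _mm_storeu_pd(out, r);
}
```

Only code written with these intrinsics is forced into xmm registers; the rest of the program is compiled as usual. If more vector values are live at once than there are xmm registers (eight in 32-bit mode), the compiler spills some to the stack, which is presumably why the quoted advice suggests keeping the live set small.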