This is the mail archive of the
gcc-help@gcc.gnu.org
mailing list for the GCC project.
Re: problems with gcc inline assembly using xmm registers
- From: David Palao <david dot palao at uv dot es>
- To: Thorsten Reinecke <thre at thorstenreinecke dot de>
- Cc: gcc-help at gcc dot gnu dot org
- Date: Fri, 3 Dec 2004 18:41:20 +0100
- Subject: Re: problems with gcc inline assembly using xmm registers
- Organization: Universidad de Valencia
- References: <200412031628.53453.david.palao@uv.es> <Pine.LNX.4.61.0412031705390.2211@tripper.tr69.homelinux.net>
Thank you for the answer!
>
> I had some trouble using mmx and xmm registers, too. But my code works
> now. See the attached code snippet. You can get an idea of how to use the
> input list, output list and clobber list and also how to share xmm
> registers between different asm inline blocks.
>
I will read it, but it looks hardcore to me (I'm new to assembly).
> You use only memory operands, so this shouldn't be a problem. But you
> haven't declared any output. You're clobbering xmm registers, but you
> don't tell the compiler that you do so. Maybe that's the problem.
What is the problem if I don't need output operands?
As for the clobber list: I have tried clobbering the xmm registers as
well (I hope I did it right). For instance:
__asm__ __volatile__ ("movsd %0, %%xmm3 \n\t" \
"movsd %1, %%xmm6 \n\t" \
"movsd %2, %%xmm4 \n\t" \
"movsd %3, %%xmm7 \n\t" \
"movsd %4, %%xmm5 \n\t" \
"unpcklpd %%xmm3, %%xmm3 \n\t" \
"unpcklpd %%xmm6, %%xmm6 \n\t" \
"unpcklpd %%xmm4, %%xmm4 \n\t" \
"mulpd %%xmm0, %%xmm3 \n\t" \
"unpcklpd %%xmm7, %%xmm7 \n\t" \
"mulpd %%xmm1, %%xmm6 \n\t" \
"unpcklpd %%xmm5, %%xmm5 \n\t" \
"mulpd %%xmm0, %%xmm4 \n\t" \
"addpd %%xmm6, %%xmm3 \n\t" \
"mulpd %%xmm2, %%xmm7 \n\t" \
"mulpd %%xmm0, %%xmm5 \n\t" \
"addpd %%xmm7, %%xmm4 \n\t" \
"movsd %5, %%xmm6 \n\t" \
"movsd %6, %%xmm7 \n\t" \
"unpcklpd %%xmm6, %%xmm6 \n\t" \
"unpcklpd %%xmm7, %%xmm7 \n\t" \
"mulpd %%xmm1, %%xmm6 \n\t" \
"mulpd %%xmm2, %%xmm7 \n\t" \
"addpd %%xmm6, %%xmm5 \n\t" \
"addpd %%xmm7, %%xmm3 \n\t" \
"movsd %7, %%xmm6 \n\t" \
"movsd %8, %%xmm7 \n\t" \
"unpcklpd %%xmm6, %%xmm6 \n\t" \
"unpcklpd %%xmm7, %%xmm7 \n\t" \
"mulpd %%xmm1, %%xmm6 \n\t" \
"mulpd %%xmm2, %%xmm7 \n\t" \
"addpd %%xmm6, %%xmm4 \n\t" \
"addpd %%xmm7, %%xmm5" \
: \
: \
"m" ((u).c11.real()), \
"m" ((u).c12.real()), \
"m" ((u).c21.real()), \
"m" ((u).c23.real()), \
"m" ((u).c31.real()), \
"m" ((u).c32.real()), \
"m" ((u).c13.real()), \
"m" ((u).c22.real()), \
"m" ((u).c33.real()) \
: \
"%xmm0", \
"%xmm1", \
"%xmm2", \
"%xmm3", \
"%xmm4", \
"%xmm5", \
"%xmm6", \
"%xmm7" );
But it doesn't work either way (with or without the clobber list).
Any ideas?
Regards
David
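
For comparison, here is a minimal sketch of a self-contained asm block in which every register the block touches is declared to the compiler. It is not the code from Thorsten's attachment, and it does not reproduce the matrix product above. The helper names scale_pd and scale_pd_fixed are hypothetical, and the sketch assumes GCC (or a compatible compiler) targeting x86 with SSE2 enabled. Note that the block quoted above reads xmm0-xmm2 without loading them inside the block and leaves its results in xmm3-xmm5 without handing them back through an output operand. The sketch passes such values as "x" operands instead, so the compiler knows where they live.

```cpp
#include <emmintrin.h>  // __m128d, the SSE2 type holding two doubles

// Hypothetical helper: multiplies both lanes of v by the scalar s.
// All registers are chosen by the compiler, so no clobber list is needed.
static inline __m128d scale_pd(__m128d v, const double &s)
{
    __m128d t;                            // scratch value, lives in an xmm register
    __asm__ ("movsd    %2, %1 \n\t"       // t = (s, 0): load the scalar from memory
             "unpcklpd %1, %1 \n\t"       // t = (s, s): duplicate the low lane
             "mulpd    %1, %0"            // v = v * t, lane by lane
             : "+x" (v),                  // %0: read and written, any xmm register
               "=&x" (t)                  // %1: written before the block ends (early clobber)
             : "m" (s));                  // %2: memory input
    return v;
}

// Same operation with a hard-coded scratch register. Because xmm7 is named
// in the template, it has to appear in the clobber list so that the compiler
// does not keep a live value (or operand %0) there.
static inline __m128d scale_pd_fixed(__m128d v, const double &s)
{
    __asm__ ("movsd    %1, %%xmm7     \n\t"
             "unpcklpd %%xmm7, %%xmm7 \n\t"
             "mulpd    %%xmm7, %0"
             : "+x" (v)
             : "m" (s)
             : "xmm7");
    return v;
}
```

Called as scale_pd(_mm_set_pd(3.0, 2.0), 1.5), either helper should return the pair (2.0 * 1.5, 3.0 * 1.5). Since the result leaves the block through an output operand, the compiler can track it. A value left in a hard-coded register between two separate asm statements is not tracked.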