This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: help interpreting gcc 4.1.1 optimisation bug
- From: Etienne Lorrain <etienne_lorrain at yahoo dot fr>
- To: gcc at gcc dot gnu dot org
- Date: Wed, 14 Jun 2006 11:48:33 +0200 (CEST)
- Subject: Re: help interpreting gcc 4.1.1 optimisation bug
> The correct version is I think,
>
> void longcpy(long* _dst, long* _src, unsigned _numwords)
> {
> asm volatile (
> "cld \n\t"
> "rep \n\t"
> "movsl \n\t"
> // Outputs (read/write)
> : "=S" (_src), "=D" (_dst), "=c" (_numwords)
> // Inputs - specify same registers as outputs
> : "0" (_src), "1" (_dst), "2" (_numwords)
> // Clobbers: direction flag, so "cc", and "memory"
> : "cc", "memory"
> );
> }
I did not re-check with GCC-4.1.1, but I have noticed problems with this
kind of "memory" clobber when the source you are copying from is
not in memory but is (a structure) on the stack. I have to say that
I tend to use a form without "volatile" after the asm (one of the
results then has to be used).
The usual symptom is that the memcopy is done, but the *content* of the
source structure is not updated *before* the memcopy: nothing in your
asm says that the content behind your pointer has to be up to date.
The "memory" says that main memory will be changed, not that it will be
read, and if you are memcopy-ing from a structure on the stack - for
instance a structure which fits in a register - you may have problems.
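One way to say explicitly that the source content is read is to add it
as an "m" input operand (and the destination as an "=m" output), so the
compiler has to store the structure to memory before the asm runs. The
following is only a sketch under assumptions: struct triple and
copy_triple() are hypothetical names, not from the original code, and it
has not been checked against GCC-4.1.1. Non-x86 targets fall back to
plain assignment.

```c
#include <stddef.h>

/* Hypothetical structure, mirroring the {0,1,1} example in this thread. */
struct triple { int a, b, c; };

/* Copy one struct triple with "rep movsl" on x86; plain assignment elsewhere. */
static void copy_triple(struct triple *dst, const struct triple *src)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    size_t nwords = sizeof *src / 4;   /* movsl moves 4 bytes per iteration */
    __asm__ volatile (
        "cld\n\t"
        "rep movsl"
        /* Registers are read and modified by the instruction: "+" */
        : "+S" (src), "+D" (dst), "+c" (nwords),
          "=m" (*dst)                  /* the destination object is written */
        : "m" (*src)                   /* the source object is read */
        : "cc"
    );
#else
    *dst = *src;
#endif
}
```

Because both memory objects are named as operands, this sketch does not
rely on the "memory" clobber at all; keeping the clobber as well would be
the more conservative choice.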
That is why IMHO it is better to do type copying by directly assigning
the structure (mostly when using -fstrict-aliasing) instead of using
memcpy() - like: struct {int a,b,c;} x, y = {0,1,1}; x = y;
The main disadvantage of type copying is the relatively bad code
that previous compilers can generate for it, and that bugs may appear
(correct me if I am wrong) because, by not calling an external function
called memcpy(), you are again not forcing the external memory to be
updated - but it should be quicker for exactly the same reason.
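To make the comparison concrete, here are the two ways of copying side by
side. This is a minimal sketch: struct triple and both function names are
hypothetical, chosen only for illustration. Both give the same result; the
difference discussed above is only in what the compiler is allowed to
assume and which code it generates.

```c
#include <string.h>

/* Hypothetical structure, mirroring the inline example above. */
struct triple { int a, b, c; };

/* Type copying: the compiler sees a typed structure assignment. */
static struct triple copy_by_assignment(const struct triple *src)
{
    struct triple dst = *src;
    return dst;
}

/* Byte copying: the same result through memcpy(). */
static struct triple copy_by_memcpy(const struct triple *src)
{
    struct triple dst;
    memcpy(&dst, src, sizeof dst);
    return dst;
}
```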
I did not really experiment with __builtin_memcpy(), so I do not know
whether it is treated specially or like a standard function call.
For instance, in:
int globint;
int fct (short *a, short *b) {
    globint = 3;
    __builtin_memcpy(a, b, sizeof(*a));
    if (globint == 3)
        return 1;
    else
        return 0;
}
Is the test present or optimised away like in:
int globint;
int fct (short *a, short *b) {
    globint = 3;
    *a = *b;
    if (globint == 3)
        return 1;
    else
        return 0;
}
Etienne.