This is the mail archive of the
mailing list for the GCC project.
Re: Is there any other optimization for memory?
Parmenides <email@example.com> writes:
> To understand some of gcc's features without knowing the details of its
> internals, I have to write small examples in C, compile them into
> assembly code, and study the output. Caching memory values in registers
> is one optimization gcc performs; reordering instructions is another. A
> "memory" clobber in an inline assembly statement may affect both. I have
> written an example in C to try to understand the former.
> int s = 0;
> int tst(int lim)
> {
>     int i;
>     for (i = 1; i < lim; i++)
>         s = s + i;
>     asm volatile(
>     s = s * 10;
>     return s;
> }
> To compile the C source, the following command is executed:
> gcc -S -O tst.c
> The corresponding assembly code is as follows:
> pushl %ebp
> movl %esp, %ebp
> movl 8(%ebp), %ecx
> cmpl $1, %ecx
> jle .L2
> movl s, %edx
> movl $1, %eax
> addl %eax, %edx
> incl %eax
> cmpl %eax, %ecx
> jne .L4
> movl %edx, s <--- After the loop, s is written back to memory.
> movl s, %eax <--- Before evaluating 's = s * 10', s is
> reloaded into a register.
> leal (%eax,%eax,4), %eax
> addl %eax, %eax
> movl %eax, s
> popl %ebp
> So, the "memory" clobber has prevented the optimization. But for the
> latter case, namely reordering instructions, I cannot come up with an
> example like the one above to illustrate how a "memory" clobber prevents
> instruction reordering. I don't know the circumstances under which
> gcc will reorder instructions. Without them, I cannot observe the effect
> of the "memory" clobber.
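For anyone who wants to reproduce the register-caching experiment, here is a minimal self-contained sketch of the test. The body of the asm statement did not survive in the quoted text, so the empty template "" used here is an assumption; only the "memory" clobber is taken from the description above.

```c
/* Sketch only: the asm template in the quoted post was lost, so the
   empty template "" below is an assumption, not the original code.  */
int s = 0;

int
tst (int lim)
{
  int i;

  for (i = 1; i < lim; i++)
    s = s + i;

  /* The "memory" clobber tells the compiler that the asm may read or
     write any memory, so s has to be stored before this point and
     loaded again afterwards.  */
  __asm__ volatile ("" : : : "memory");

  s = s * 10;
  return s;
}
```

Compiling this with gcc -S -O, once as written and once with the asm statement removed, should make the extra store and reload of s easy to spot in the output.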
Instruction reordering is easier to observe on a machine other than the
x86, one with long load latencies. Here is an example, though:
f (int *a, int *b, int c)
{
  int i, j;
  for (i = 0; i < c; i++)
    {
      int a0, a1, a2, a3;
      asm ("nop" : "=r" (j) : "r" (i));
      a0 = a;
      a1 = a;
      a2 = a;
      a3 = a;
      b = a0;
      b = a1;
      b = a2;
      b = a3;
    }
}
When optimizing, the memory load instructions will be reordered to occur
before the asm. At least, that's what I see with current mainline gcc
on x86_64. This isn't a case of memory caching; it's reordering of the
load instructions across the asm.
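The array subscripts in the example above appear to have been lost in the archive, so here is a rough, compilable sketch of the same idea. The subscripts (a[i + k] for the loads, b[j + k] for the stores) are assumptions, chosen so that the loads do not depend on the asm output while the stores do. The "0" matching constraint is also a substitution for "r", so that j has a well-defined value (equal to i) when the code is actually run. The second function, f_barrier, is a hypothetical variant that adds a "memory" clobber for comparison.

```c
/* Sketch under stated assumptions: subscripts and the "0" constraint
   are not from the original example.  Both arrays must hold at least
   c + 3 elements.  */
void
f (int *a, int *b, int c)
{
  int i, j;

  for (i = 0; i < c; i++)
    {
      int a0, a1, a2, a3;

      /* No "memory" clobber: the compiler is free to schedule the
	 loads from a[] below ahead of this asm.  */
      __asm__ ("" : "=r" (j) : "0" (i));
      a0 = a[i];
      a1 = a[i + 1];
      a2 = a[i + 2];
      a3 = a[i + 3];
      b[j] = a0;
      b[j + 1] = a1;
      b[j + 2] = a2;
      b[j + 3] = a3;
    }
}

/* Hypothetical variant for comparison.  */
void
f_barrier (int *a, int *b, int c)
{
  int i, j;

  for (i = 0; i < c; i++)
    {
      int a0, a1, a2, a3;

      /* With the "memory" clobber the compiler may not move memory
	 accesses across the asm, so the loads stay after it.  */
      __asm__ volatile ("" : "=r" (j) : "0" (i) : "memory");
      a0 = a[i];
      a1 = a[i + 1];
      a2 = a[i + 2];
      a3 = a[i + 3];
      b[j] = a0;
      b[j + 1] = a1;
      b[j + 2] = a2;
      b[j + 3] = a3;
    }
}
```

Comparing the gcc -S -O2 output of the two functions is one way to look for the difference: in f the loads may appear before the asm, while in f_barrier they should not. Whether the scheduler actually moves them in f depends on the target and the gcc version.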