This is the mail archive of the mailing list for the GCC project.


Re: Is there any other optimization for memory?

On 14/07/2011 16:24, Parmenides wrote:
2011/7/14 Ian Lance Taylor <>:
Parmenides <> writes:

2.  "and not optimize stores or loads to that memory"
Apart from caching memory values in registers, is there any other
optimization for stores or loads to memory?

Not in this case, I think it's just another way of saying the same thing.

I think that reordering of instructions involving memory operations (not just the stores or loads themselves) might count as another such optimization. It seems that a "memory" clobber will prevent gcc from doing this kind of optimization. If so, it would be good if the manual said so explicitly.

I'm sorry, I don't understand what you mean.

To understand some of gcc's features without knowing the details of gcc's internals, I have to write small examples in C, compile them to assembly code, and study the output to get some idea of what is happening. Caching memory values in registers is one optimization gcc performs; reordering instructions is another. A "memory" clobber in an inline assembly may influence both. I have written a C example to try to understand the former.

int s = 0;

int tst(int lim)
{
      int i;

      for (i = 1; i < lim; i++)
           s = s + i;

      asm volatile("" ::: "memory");

      s = s * 10;

      return s;
}
To compile the C source, the following command is executed:
gcc -S -O tst.c

The corresponding assembly code is as follows:
         pushl   %ebp
         movl    %esp, %ebp
         movl    8(%ebp), %ecx
         cmpl    $1, %ecx
         jle     .L2
         movl    s, %edx
         movl    $1, %eax
.L4:
         addl    %eax, %edx
         incl    %eax
         cmpl    %eax, %ecx
         jne     .L4
         movl    %edx, s      <--- After the loop, s is written back to memory.
.L2:
         movl    s, %eax      <--- Before evaluating 's = s * 10',
                                   s is reloaded into a register.
         leal    (%eax,%eax,4), %eax
         addl    %eax, %eax
         movl    %eax, s
         popl    %ebp
         ret

So, the "memory" clobber has prevented the optimization. But for the
latter case, namely reordering of instructions, I cannot construct an
example like the one above to illustrate how a "memory" clobber
prevents reordering. I don't know under what circumstances gcc will
do reordering, so I cannot observe the effect of the "memory" clobber.

I don't think you're going to find a suitable example, because I don't think a memory barrier will interact much with other memory optimisations, such as re-ordered loads and stores, speculative loads, etc. The barrier gives you a point in the code that says "all memory operations before this point should be completed, and no memory operations after this point should be started".

If you've got code like this:

extern int data[32];
void foo(void) {
	int a = data[0];
	int b = data[1];
	data[2] = b;
	data[3] = a;
	asm volatile ("" ::: "memory");
	int c = data[0];
	int d = data[1];
	data[4] = c;
	data[5] = d;
}

The compiler is still free to re-arrange the loads and stores of data /before/ the memory barrier. On a superscalar cpu, it might choose to read data[1] before data[0], to better hide the read latencies. Or it might issue a speculative load or a cache pre-load instruction first. Or it might store data[3] before data[2], to improve the pipelining. Or it might load data[0] and data[1] with a double-register load instruction. There are lots of potential "memory optimisations" available, and the compiler can do any of them. The only thing the barrier does is separate the function into two halves, and the compiler can't re-order memory operations across the barrier.
