better load/store scheduling

Vladimir Makarov vmakarov@redhat.com
Thu Mar 1 22:12:00 GMT 2007


Ben Cheng wrote:

>Well, I guess the real question is how to make gcc schedule better code
>if loop unrolling is enabled?
>
>My original code is actually 
>
>    for (i = 0; i < 4096; i++) {
>        g[i]   = h[i] + 10;
>    }
>
>After gcc unrolls the loop, the loop bodies from different iterations
>aren't overlapping with each other because the load from later
>iterations is not scheduled across earlier stores. I thought this might
>be due to phase ordering issues of optimization stages so I manually
>unroll the loop. But unfortunately I still cannot get gcc to schedule
>loads/stores more aggressively.
>
>Since I want gcc to unroll the loop for me, I cannot create temporaries
>for h[i]. Therefore I am still hoping for some magic command line
>options to make gcc produce better scheduling.
>
>  
>
There is no such magic option.  The problem is not in the scheduler 
itself: it cannot move a load above an earlier store unless it knows 
the two do not alias.  That can be done when/if we have more accurate 
aliasing info at the RTL level.
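
To make the aliasing point concrete, here is a minimal hand-written 
sketch of the schedule being asked for: the loop unrolled by 4 with all 
four loads issued before any store.  The function names, the int 
element type, and the restrict qualifiers are my assumptions for 
illustration, not something from the original code.  The restrict 
qualifiers state the no-overlap fact the compiler would otherwise have 
to prove, and hoisting the loads by hand is only valid if g and h 
really do not overlap.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: g[i] = h[i] + 10, unrolled by 4 by hand.
   ASSUMPTION: g and h never overlap (stated via restrict) and the
   element type is int; neither is given in the original post. */
void add10_unrolled(int *restrict g, const int *restrict h, size_t n)
{
    assert(n % 4 == 0);   /* this sketch has no remainder loop */
    for (size_t i = 0; i < n; i += 4) {
        /* All four loads are issued first ... */
        int t0 = h[i];
        int t1 = h[i + 1];
        int t2 = h[i + 2];
        int t3 = h[i + 3];
        /* ... then the four stores.  Moving a load above an earlier
           store like this is only legal when the two cannot alias. */
        g[i]     = t0 + 10;
        g[i + 1] = t1 + 10;
        g[i + 2] = t2 + 10;
        g[i + 3] = t3 + 10;
    }
}

/* Straightforward version of the loop, for comparison. */
void add10_reference(int *g, const int *h, size_t n)
{
    for (size_t i = 0; i < n; i++)
        g[i] = h[i] + 10;
}
```

Whether a given gcc version actually keeps this ordering, or produces 
it by itself from the rolled loop, depends on the target and the 
options, so check the generated assembly rather than assume it.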

Another problem is that even with more accurate alias analysis, it 
might still be impossible to move loads/stores after register 
allocation (RA) has run.  Insn scheduling before RA is switched off for 
x86 and x86_64 because of a bug that eventually shows up in reload, 
when reload cannot find a hard register for an insn operand.  To get 
rid of this bug, the insn scheduler should be register-pressure 
sensitive.

Also, it is better to use software pipelining for this loop.  You can 
try -fmodulo-sched and see what happens.  It might work.
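
For readers unfamiliar with the term, software pipelining overlaps the 
load of a later iteration with the store of the current one.  The 
following is a hand-written sketch of that idea for this loop, not what 
-fmodulo-sched is guaranteed to generate.  The function name, the int 
element type, and the restrict qualifiers (g and h assumed not to 
overlap) are my assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hand-pipelined version of g[i] = h[i] + 10.
   ASSUMPTION: g and h do not overlap; element type int. */
void add10_pipelined(int *restrict g, const int *restrict h, size_t n)
{
    assert(g != NULL && h != NULL);
    if (n == 0)
        return;

    int cur = h[0];                /* prologue: first load */
    for (size_t i = 0; i + 1 < n; i++) {
        int next = h[i + 1];       /* load for iteration i + 1 ... */
        g[i] = cur + 10;           /* ... overlaps store of iteration i */
        cur = next;
    }
    g[n - 1] = cur + 10;           /* epilogue: last store */
}
```

In the kernel, the load and the store belong to different iterations, 
so they have no data dependence on each other and the hardware can 
overlap them.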
