This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



[RFC] Addressing Mode Selection Issues.


Hi,

GCC chooses the addressing mode early, during RTL
generation. A large number of target macros, such as LEGITIMIZE_ADDRESS,
GO_IF_LEGITIMATE_ADDRESS and GO_IF_LEGITIMATE_INDEX, form GCC's
interface for choosing addressing modes at different points.
This is fine for targets where the cost of different addressing
modes doesn't vary, but it is not optimal for others such as SH.
In short, there is no good mechanism to choose an
addressing mode such that address arithmetic and spills are
minimized.

Consider the following problems:

1. Related addresses are each computed separately.

     void foo ()
     {
       int i, a[500];
       i = 234;
       a[i] = 12123;
       i = 236;
       a[i] = 12123;
       i = 238;
       ....
     }

   GCC generates this code at -O2:
    mov.w   .L3,r1      ! L3 is 12123
    mov.w   .L4,r0      ! L4 is 234*4
    mov.l  r1,@(r0,r14) ! Always using (r0, r14) 
    mov.w   .L5,r0      ! L5 is 236 *4
    mov.l  r1,@(r0,r14)
    mov.w   .L6,r0
The addresses in this snippet are related, so it would be better to
generate something like
    mov.w   .L3,r1
    mov.w   .L4,r11
    add     r14,r11
    mov.l   r1,@r11
    mov.l   r1,@(8,r11)  ! a[236]: (236 - 234) * 4 = 8
This avoids excess r0 usage and gives the scheduler more freedom.
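
A minimal C sketch of this grouping, not GCC code. It assumes the SH
mov.l displacement range of 0..60 bytes (a 4-bit field scaled by 4). The
helper assign_bases and the names MAX_DISP and base_out are hypothetical,
invented for illustration. The helper walks sorted byte offsets from one
frame register and starts a new base only when the next offset no longer
fits in the displacement field.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed limit: SH "mov.l Rm,@(disp,Rn)" reaches 0..60 bytes. */
#define MAX_DISP 60

/* Hypothetical helper, not a GCC interface.
   offsets[] holds byte offsets from one frame register, sorted ascending.
   base_out[i] receives the base offset that access i is expressed against.
   Returns the number of distinct bases; each one costs a constant load
   plus an add, while the other accesses use a small displacement. */
size_t assign_bases(const long *offsets, size_t n, long *base_out)
{
    size_t bases = 0;
    long cur = 0;

    for (size_t i = 0; i < n; i++) {
        /* The greedy walk below relies on ascending order. */
        assert(i == 0 || offsets[i] >= offsets[i - 1]);

        /* Start a new base when the current one cannot reach this offset. */
        if (bases == 0 || offsets[i] - cur > MAX_DISP) {
            cur = offsets[i];
            bases++;
        }
        base_out[i] = cur;
    }
    return bases;
}
```

For the byte offsets 936, 944 and 952 of a[234], a[236] and a[238], this
returns 1 with every access expressed against base 936, so one add to the
frame register serves all three stores.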

2. Address inheritance is not propagated. The regmove pass incidentally
   does something useful for address arithmetic within basic blocks:

      pX <= pA + N
      ...
      pX <= pA + M
         |
         |
         v
      pX <= pA + N
      ...
      pX <= pX + (M - N)

   But this is not done across basic blocks, and it does not handle all
   modes.
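
   The rewrite in the diagram can be sketched in C as follows. This is an
   illustration only and does not reflect how regmove is implemented. The
   struct addr_set and the function inherit_delta are hypothetical names,
   and the sketch assumes the caller has already established that neither
   pA nor pX changes between the two assignments.

```c
#include <assert.h>

/* Hypothetical record for one assignment "pX <= base_reg + offset". */
struct addr_set {
    int  base_reg;  /* register number of pA */
    long offset;    /* the constant N or M */
};

/* If both assignments use the same base register, store M - N in *delta
   and return 1, meaning the second assignment may be rewritten as
   "pX <= pX + delta".  Return 0 and leave *delta untouched otherwise.
   Assumption: liveness of pA and pX was checked by the caller. */
int inherit_delta(const struct addr_set *prev, const struct addr_set *next,
                  long *delta)
{
    assert(prev && next && delta);

    if (prev->base_reg != next->base_reg)
        return 0;

    *delta = next->offset - prev->offset;
    return 1;
}
```

   Extending this across basic blocks would additionally need the value of
   pX to be known on every incoming edge, which this sketch does not model.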

3. On a related note, we also lose alias information when references
   are broken down into addressing modes
   (we expect things to be better in tree-ssa, though).

   int foo (short *a, short *b)
   {
     a[17] = a[0] + a[18];
     b[17] = b[0] + a[18];
   }

   GCC generates (-O2 -ml -m4 -fno-argument-alias):

        add     #34,r3
        mov     r4,r2
        add     #36,r2
        mov.w   @r4,r0  ! r0 = A[0]
        mov.w   @r2,r1  ! r1 = A[18]
        add     r1,r0
        mov.w   r0,@r3  ! A[17] = r0
        mov     r5,r3
        add     #34,r3
        mov.w   @r2,r1  ! Load A[18] again.
        ....

IDEAS:
------
One solution could be to fake the standard addressing modes that the RTL
optimizers are comfortable with, and to change to the target's addressing
modes in a separate pass after sched1. (After sched1, because the scheduler
can possibly do better load/store scheduling with abstract modes.)
We can modify the front end to generate large virtual displacements.
There can be a new macro called "TARGET_USE_LARGE_VIRTUAL_OFFSETS".
If it is nonzero, it forces the front end to generate large virtual
offsets.
Two abstractions would be provided at the RTL level:

   1. The target machine supports an unlimited displacement in
      displacement + register indirect mode.
   2. The target machine supports a dual register indirect mode in which
      both registers are arbitrary, i.e. @(rm, rn) is supported.

The new pass can map these abstractions to the target's actual addressing
modes.
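
As a rough illustration of what such a mapping might do for the first
abstraction, the C sketch below splits a large virtual displacement into
a base adjustment plus a displacement the target can encode. It is not a
proposed GCC interface. The names lower_displacement, struct lowered and
DISP_SPAN are hypothetical, and the 64-byte span assumes the SH mov.l
limit of 0..60 bytes in steps of 4.

```c
#include <assert.h>

/* Assumed encodable span: displacements 0, 4, ..., 60 cover 64 bytes. */
#define DISP_SPAN 64

/* Hypothetical result of lowering "@(offset, rn)". */
struct lowered {
    long base_adjust;  /* added to rn once, in a separate instruction */
    long disp;         /* fits the target's displacement field */
};

/* Split a non-negative, 4-byte-aligned virtual offset into a multiple of
   DISP_SPAN plus a remainder in 0..60.  Offsets that fall in the same
   64-byte span get the same base_adjust and can share one base register. */
struct lowered lower_displacement(long offset)
{
    struct lowered r;

    assert(offset >= 0 && offset % 4 == 0);

    r.disp = offset % DISP_SPAN;
    r.base_adjust = offset - r.disp;
    return r;
}
```

With these assumptions, the byte offsets 936, 944 and 952 all lower to a
base adjustment of 896 with displacements 40, 48 and 56, while an offset
of 32 needs no adjustment at all. A real pass would also have to weigh the
cost of the extra add against using the @(r0, rn) form.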
 
I suspect some of the examples above might be optimized on the tree-ssa
branch, but tree-ssa's perspective is different. I think a pass that takes
more of the target's view might be useful; hence this request for comments.

Best Regards,
Naveen Sharma.

