[Bug rtl-optimization/24815] loop unrolling ends up with too much reg+index addressing

olegendo at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Sun Jul 22 16:26:00 GMT 2012


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24815

Oleg Endo <olegendo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|sh-elf                      |sh*-*-*
   Last reconfirmed|2006-04-24 22:37:46         |2012-07-22
                 CC|                            |olegendo at gcc dot gnu.org
         Depends on|                            |50749

--- Comment #5 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-07-22 16:25:55 UTC ---
As of rev 189746 I was able to reproduce the problem with the following reduced
test case:

extern int tbl[1000];

void f (int* b, const int* a)
{
  for (int i = 0; i < 998; i++)
    b[i] = a[tbl[i]];
}

... compiled with '-O2 -m4-single -ml' (no loop unrolling):

        mov.l   .L6,r3          ! 66    movsi_ie/1    [length = 2]
        mov     #0,r7           ! 40    movsi_ie/3    [length = 2]
        mov.w   .L7,r2          ! 70    *movhi/1    [length = 2]
        .align 2
.L3:
        mov     r7,r0           ! 77    movsi_ie/2    [length = 2]
        mov.l   @(r0,r3),r1     ! 46    movsi_ie/7    [length = 2]
        dt      r2              ! 71    dect    [length = 2]
        shll2   r1              ! 47    ashlsi3_std/3    [length = 2]
        mov     r1,r0           ! 78    movsi_ie/2    [length = 2]
        mov.l   @(r0,r5),r1     ! 49    movsi_ie/7    [length = 2]
        mov     r7,r0           ! 79    movsi_ie/2    [length = 2]
        add     #4,r7           ! 51    *addsi3_compact    [length = 2]
        bf/s    .L3             ! 72    branch_false    [length = 2]
        mov.l   r1,@(r0,r4)     ! 50    movsi_ie/11    [length = 2]
        rts
        nop                     ! 83    *return_i    [length = 4]
        .align 1
.L7:
        .short    998
.L8:
        .align 2
.L6:
        .long    _tbl


... which would be better as:

        mov.l   .L6,r3
        mov.w   .L7,r2
.L3:
        mov.l   @r3+,r0
        dt      r2
        shll2   r0
        mov.l   @(r0,r5),r1
        mov.l   r1,@r4
        bf/s    .L3
        add     #4,r4

        rts
        nop

With loop unrolling enabled it looks similar to the code in comment #2.
It seems that this issue also depends on the auto-inc-dec related PR 50749.
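
For illustration only, here is a minimal C sketch of the loop shape that the
preferred assembly above corresponds to: one pointer walks 'tbl' (matching
the post-increment load '@r3+'), one pointer walks 'b', and a down-counter
replaces the index 'i' (matching 'dt r2'). The names 'f_index' and 'f_ptr'
and the local definition of 'tbl' are assumptions made so that the snippet
builds on its own; they are not part of the reduced test case. Whether GCC
actually emits the preferred code for this form has not been checked here.

```c
/* Defined here (instead of 'extern') only so the sketch links standalone. */
int tbl[1000];

/* Same loop as the reduced test case, renamed for side-by-side comparison. */
void f_index (int* b, const int* a)
{
  for (int i = 0; i < 998; i++)
    b[i] = a[tbl[i]];
}

/* Hand-written pointer form (hypothetical, for illustration).  */
void f_ptr (int* b, const int* a)
{
  const int* t = tbl;   /* walks tbl, like 'mov.l @r3+,r0'  */
  int n = 998;          /* down-counter, like 'dt r2'  */

  do
    {
      *b = a[*t++];     /* indexed load from a, plain store to b  */
      b++;              /* separate increment, like 'add #4,r4'  */
    }
  while (--n != 0);
}
```

Both functions store the same 998 values into 'b'; only the way the addresses
are formed differs.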


