[Bug tree-optimization/67213] New: When compiling for size with -Os loops can get bigger after peeling

Fri Aug 14 09:21:00 GMT 2015

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67213

            Bug ID: 67213
           Summary: When compiling for size with -Os loops can get bigger
                    after peeling
           Product: gcc
           Version: 5.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: fredrik.hederstierna@securitas-direct.com
  Target Milestone: ---

When compiling thumb1 code for size with -Os, some loops end up larger because
they are completely peeled.

Example code:

extern char data[10];

void test_iter_2(void)
{
  int i;
  for (i = 0; i < 2; i++) {
    data[i] = i;
  }
}

void test_iter_6(void)
{
  int i;
  for (i = 0; i < 6; i++) {
    data[i] = i;
  }
}

void test_iter_7(void)
{
  int i;
  for (i = 0; i < 7; i++) {
    data[i] = i;
  }
}

This compiles to:

00000000 <test_iter_2>:
   0:   e3a02000        mov     r2, #0
   4:   e59f300c        ldr     r3, [pc, #12]   ; 18 <test_iter_2+0x18>
   8:   e5c32000        strb    r2, [r3]
   c:   e3a02001        mov     r2, #1
  10:   e5c32001        strb    r2, [r3, #1]
  14:   e12fff1e        bx      lr
  18:   00000000        .word   0x00000000

0000001c <test_iter_6>:
  1c:   e3a02000        mov     r2, #0
  20:   e59f302c        ldr     r3, [pc, #44]   ; 54 <test_iter_6+0x38>
  24:   e5c32000        strb    r2, [r3]
  28:   e3a02001        mov     r2, #1
  2c:   e5c32001        strb    r2, [r3, #1]
  30:   e3a02002        mov     r2, #2
  34:   e5c32002        strb    r2, [r3, #2]
  38:   e3a02003        mov     r2, #3
  3c:   e5c32003        strb    r2, [r3, #3]
  40:   e3a02004        mov     r2, #4
  44:   e5c32004        strb    r2, [r3, #4]
  48:   e3a02005        mov     r2, #5
  4c:   e5c32005        strb    r2, [r3, #5]
  50:   e12fff1e        bx      lr
  54:   00000000        .word   0x00000000

00000058 <test_iter_7>:
  58:   e3a03000        mov     r3, #0
  5c:   e59f2010        ldr     r2, [pc, #16]   ; 74 <test_iter_7+0x1c>
  60:   e7c33002        strb    r3, [r3, r2]
  64:   e2833001        add     r3, r3, #1
  68:   e3530007        cmp     r3, #7
  6c:   1afffffb        bne     60 <test_iter_7+0x8>
  70:   e12fff1e        bx      lr
  74:   00000000        .word   0x00000000

The unrolling of test_iter_6 seems to be controlled by the default:

 --param max-completely-peel-times=5

If this is changed to

 --param max-completely-peel-times=0

the code for test_iter_6 shrinks back to a loop, but test_iter_2 then gets
larger:

00000000 <test_iter_2>:
   0:   e3a03000        mov     r3, #0
   4:   e59f2010        ldr     r2, [pc, #16]   ; 1c <test_iter_2+0x1c>
   8:   e7c33002        strb    r3, [r3, r2]
   c:   e2833001        add     r3, r3, #1
  10:   e3530002        cmp     r3, #2
  14:   1afffffb        bne     8 <test_iter_2+0x8>
  18:   e12fff1e        bx      lr
  1c:   00000000        .word   0x00000000

00000020 <test_iter_6>:
  20:   e3a03000        mov     r3, #0
  24:   e59f2010        ldr     r2, [pc, #16]   ; 3c <test_iter_6+0x1c>
  28:   e7c33002        strb    r3, [r3, r2]
  2c:   e2833001        add     r3, r3, #1
  30:   e3530006        cmp     r3, #6
  34:   1afffffb        bne     28 <test_iter_6+0x8>
  38:   e12fff1e        bx      lr
  3c:   00000000        .word   0x00000000

00000040 <test_iter_7>:
  40:   e3a03000        mov     r3, #0
  44:   e59f2010        ldr     r2, [pc, #16]   ; 5c <test_iter_7+0x1c>
  48:   e7c33002        strb    r3, [r3, r2]
  4c:   e2833001        add     r3, r3, #1
  50:   e3530007        cmp     r3, #7
  54:   1afffffb        bne     48 <test_iter_7+0x8>
  58:   e12fff1e        bx      lr
  5c:   00000000        .word   0x00000000

I guess it's a trade-off between the number of allowed unrolls and the expected
code size growth or decrease. Still, it might be possible to detect that the
code size grows in a case like this.
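
To make the trade-off concrete, here is a rough sketch of the size arithmetic
for this particular test case. The constants are only read off the listings
above (4-byte instructions, one mov plus one strb per peeled iteration, an
8-word rolled loop). They are not taken from GCC's own cost model, and the
function names are made up for illustration.

```c
#include <assert.h>

/* Sizes are counted in 4-byte words, as seen in the listings in this report.
   These numbers are assumptions derived from this one test case only. */
enum { WORD_BYTES = 4, ROLLED_WORDS = 8 };

/* Peeled form: one mov and one strb per iteration, plus the ldr of the
   address, the bx lr and the literal-pool word. */
unsigned peeled_size_bytes(unsigned iterations)
{
    assert(iterations > 0);
    return WORD_BYTES * (2 * iterations + 3);
}

/* Rolled form: mov, ldr, strb, add, cmp, bne, bx and the literal-pool word.
   The size does not depend on the trip count. */
unsigned rolled_size_bytes(void)
{
    return WORD_BYTES * ROLLED_WORDS;
}

/* Nonzero when complete peeling does not make this loop bigger. */
int peeling_saves_size(unsigned iterations)
{
    return peeled_size_bytes(iterations) <= rolled_size_bytes();
}
```

With these numbers, test_iter_2 is 28 bytes peeled against 32 bytes rolled,
while test_iter_6 is 60 bytes peeled against 32 bytes rolled, so for this loop
body peeling only pays off up to 2 iterations.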

Attaching the toolchain build script and the code.