This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


[Bug tree-optimization/58296] New: ivopts is unable to handle some loops altered by the loop header copying pass


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58296

            Bug ID: 58296
           Summary: ivopts is unable to handle some loops altered by the
                    loop header copying pass
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: uranus at tinlans dot org

$ cat test.c
void bne_loop(unsigned int val,unsigned int N)
{
  int i;

  for (i=0;i<N;++i)
    printf("%d\n",val+i);
}

Please note that the loop condition, 'i < N', compares a signed int variable
with an unsigned int variable. If we change the type of i from 'int' to
'unsigned int', the issue does not occur.
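
To illustrate the conversion involved, here is a minimal sketch (not part of
the original test case). In 'i < N' the usual arithmetic conversions turn the
signed counter into an unsigned int before comparing; the cast below only
spells that out. The function names and the summing loop body are hypothetical,
chosen so the two variants can be compared without calling printf.

```c
/* What 'a < b' means when a is int and b is unsigned int:
   a is converted to unsigned int first (cast written out explicitly). */
int signed_lt_unsigned(int a, unsigned int b)
{
  return (unsigned int) a < b;
}

/* Variant with a signed counter, as in the report. The conversion in the
   loop condition is the one that shows up later as
   'i.0_4 = (unsigned int) i_9' in the tree dump. */
unsigned int sum_signed_counter(unsigned int val, unsigned int N)
{
  unsigned int sum = 0;
  int i;

  for (i = 0; (unsigned int) i < N; ++i)
    sum += val + i;
  return sum;
}

/* Variant with an unsigned counter: no conversion is needed in the loop. */
unsigned int sum_unsigned_counter(unsigned int val, unsigned int N)
{
  unsigned int sum = 0;
  unsigned int i;

  for (i = 0; i < N; ++i)
    sum += val + i;
  return sum;
}
```

For N within the range of int, both loops visit the same values; only the
first one carries a signed-to-unsigned conversion in every iteration.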

$ arm-eabi-gcc -v
Using built-in specs.
COLLECT_GCC=arm-eabi-gcc
COLLECT_LTO_WRAPPER=/home1/lhtseng/arm/4.9/libexec/gcc/arm-eabi/4.9.0/lto-wrapper
Target: arm-eabi
Configured with: ../../../../work/4.9/src/gcc-4.9.0/configure --target=arm-eabi
--prefix=/home1/lhtseng/arm/4.9 --disable-nls --disable-shared
--enable-languages=c --enable-__cxa_atexit --enable-c99 --enable-long-long
--enable-threads=single --with-newlib --disable-multilib --disable-libssp
--disable-libgomp --disable-decimal-float --disable-libffi --disable-libmudflap
--disable-lto --with-gmp=/home1/lhtseng/work/general
--with-mpfr=/home1/lhtseng/work/general --with-mpc=/home1/lhtseng/work/general
--with-isl=/home1/lhtseng/work/general --with-cloog=/home1/lhtseng/work/general
Thread model: single
gcc version 4.9.0 20130802 (experimental) (GCC) 

$ arm-eabi-gcc -O3 -fdump-tree-all -O3 -da -S test.c
$ cat -n test.s
...
    27  .L3:
    28          add     r1, r1, r5
    29          add     r4, r4, #1
    30          ldr     r0, .L9
    31          bl      printf
    32          cmp     r4, r6
    33          mov     r1, r4
    34          bne     .L3
...

The instruction 'mov r1, r4' is redundant. The dump of the RTL generation
(expand) pass shows where it comes from:

$ cat test.c.166r.expand
...
;; i.0_4 = (unsigned int) i_9;

(insn 20 19 0 (set (reg:SI 110 [ i.0 ])
        (reg/v:SI 112 [ i ])) ../test.c:6 -1
     (nil))
...

$ cat test.c.165t.optimized
...
  <bb 4>:
  # i_13 = PHI <i_9(5), 0(3)>
  # i.0_16 = PHI <i.0_4(5), 0(3)>
  _7 = i.0_16 + val_6(D);
  printf ("%d\n", _7);
  i_9 = i_13 + 1;
  i.0_4 = (unsigned int) i_9;
  if (i_9 != _15)
    goto <bb 5>;
  else
    goto <bb 6>;
...

It is surprising that the statement 'i.0_4 = (unsigned int) i_9;' is not
cleaned up by any tree-level or RTL-level optimization pass. After some
investigation, we found that using '-Os', or adding '-fno-tree-ch' to '-O3',
generates the optimized code, with the conversion properly eliminated by
ivopts:
$ arm-eabi-gcc -O3 -fdump-tree-all -O3 -fno-tree-ch -da -S test.c
$ cat test.c.119t.ivopts
   <bb 3>:
  _7 = ivtmp.9_11;
  printf ("%d\n", _7);
  ivtmp.9_10 = ivtmp.9_11 + 1;

  <bb 4>:
  # ivtmp.9_11 = PHI <val_6(D)(2), ivtmp.9_10(3)>
  if (ivtmp.9_11 != _12)
    goto <bb 3>;
  else
    goto <bb 5>;

$ cat test.s
...
.L3:
        mov     r1, r4
        bl      printf
        add     r4, r4, #1
.L2:
        cmp     r4, r5
        ldr     r0, .L6
        bne     .L3
        ldmfd   sp!, {r3, r4, r5, lr}
        bx      lr
...
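
As a rough C-level reading of the two tree dumps above (a hand-written sketch,
not compiler output), the loop shapes can be compared as follows. The function
names and the output buffer are hypothetical, and several details are
assumptions because the dumps do not show them: the guard before the do-while
loop, '_15' being N converted to int, and '_12' being val + N.

```c
#include <stddef.h>

/* Shape of the -O3 loop after loop header copying (compare
   test.c.165t.optimized): the signed counter i and its unsigned copy iu
   are both kept live across iterations. */
size_t two_iv_form(unsigned int val, unsigned int N, unsigned int *out)
{
  size_t k = 0;

  if (N != 0)                       /* guard: assumed, not in the dump */
    {
      int i = 0;
      unsigned int iu = 0;

      do
        {
          out[k++] = iu + val;      /* _7 = i.0_16 + val_6(D) */
          i = i + 1;                /* i_9 = i_13 + 1 */
          iu = (unsigned int) i;    /* i.0_4 = (unsigned int) i_9 */
        }
      while (i != (int) N);         /* assumes _15 is N as an int */
    }
  return k;
}

/* Shape of the loop with -fno-tree-ch (compare test.c.119t.ivopts):
   a single unsigned induction variable that starts at val. */
size_t one_iv_form(unsigned int val, unsigned int N, unsigned int *out)
{
  size_t k = 0;
  unsigned int iv = val;
  unsigned int end = val + N;       /* assumes _12 is val + N */

  while (iv != end)
    {
      out[k++] = iv;                /* _7 = ivtmp.9_11 */
      iv = iv + 1;                  /* ivtmp.9_10 = ivtmp.9_11 + 1 */
    }
  return k;
}
```

For small N both functions store the same sequence val, val + 1, ...,
val + N - 1, which is why the extra unsigned copy in the first form looks
removable.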

Therefore, we believe there is something wrong with ivopts: it is unable to
handle the loop as altered by the tree-ch pass when the condition of a for
statement compares an int with an unsigned int.

