This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug tree-optimization/55731] New: Issue with complete innermost loop unrolling (cunrolli)


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55731

             Bug #: 55731
           Summary: Issue with complete innermost loop unrolling
                    (cunrolli)
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: ysrumyan@gmail.com
                CC: hubicka@ucw.cz, izamyatin@gmail.com


I attached 2 test-cases extracted from an important benchmark on which clang
and icc outperform gcc for the x86 target (Atom). For the 1st test-case (t.c)
the cunrolli phase does not perform complete loop unrolling and reports the
following (the test was compiled with the -O3 -funroll-loops options):

  Loop size: 23
  Estimated size after unrolling: 33
Not unrolling loop 1: size would grow.

but the loop is unrolled by the cunroll phase:

  Loop size: 24
  Estimated size after unrolling: 32
Unrolled loop 1 completely (duplicated 2 times).

Why was this loop not unrolled by cunrolli? As a result we lose a lot of
optimizations on the unrolled loop, such as constant (address) propagation and
dead code elimination, and get non-optimal binaries.

For comparison I added another test (t2.c) which cunrolli does unroll
completely. There we can see that all assignments to the local array 'b' were
properly propagated and deleted, but we do not get such transformations for
the 1st test-case.
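
To illustrate the kind of transformation meant here, below is a minimal
hypothetical sketch. It is NOT the attached t.c or t2.c; the function names
and the array size are assumptions made up for illustration. It shows a small
loop that fills a local array 'b', and the form the code can be reduced to
once the loops are completely unrolled early enough for propagation and dead
code elimination to remove the array.

```c
/* Hypothetical illustration only; not the attached test-cases. */

/* Rolled form: a local array 'b' is filled in one loop and read in
   another.  Both loops have a small constant trip count. */
int sum_rolled(const int *a)
{
    int b[3];
    int i, s = 0;

    for (i = 0; i < 3; i++)
        b[i] = a[i] * 2;      /* stores to the local array */
    for (i = 0; i < 3; i++)
        s += b[i];            /* loads from the local array */
    return s;
}

/* Hand-written equivalent of what complete unrolling followed by
   propagation and dead code elimination can produce: every b[i]
   has been forwarded to its use and the array is gone. */
int sum_unrolled(const int *a)
{
    return a[0] * 2 + a[1] * 2 + a[2] * 2;
}
```

Both functions compute the same value; the second form is only reachable if
the unrolling happens before the scalar cleanup passes run. To inspect what
each phase decided for a given source file, the dumps can be requested with
-fdump-tree-cunrolli-details and -fdump-tree-cunroll-details in addition to
-O3 -funroll-loops.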

