This is the mail archive of the mailing list for the GCC project.

Empty loop elimination in 4.0 ?


In the days of gcc-2.6.x the documentation of gcc said that empty loops
were not eliminated. The docs used to say that this was on purpose -
because empty loops were being used as delay loops.
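
As a point of reference, the kind of delay loop those docs had in mind might
look like the sketch below. The names delay_plain and delay_volatile and the
iteration counts are assumptions for illustration, not taken from the GCC
documentation. The first form relies on the compiler leaving an empty loop
alone; the second makes the counter volatile, so every access to it counts as
a side effect and a conforming compiler has to keep the loop.

```c
/* Hypothetical names; a sketch of the idiom, not code from the GCC docs. */

/* Classic delay-loop hack: an empty loop whose only purpose is to burn time.
   Nothing here is observable, so an optimizer is allowed to delete it. */
void delay_plain(unsigned long n)
{
    unsigned long i;
    for (i = 0; i < n; ++i)
        ;
}

/* Variant with a volatile counter: each read and write of i is a side
   effect, so the loop survives optimization. Returns the final count. */
unsigned long delay_volatile(unsigned long n)
{
    volatile unsigned long i;
    for (i = 0; i < n; ++i)
        ;
    return i;
}
```

Neither form gives a calibrated delay; the time per iteration still depends on
the target and on the optimization level.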

Does this ancient and not portable delay-loop hack still work with
gcc version 4.0.0 20050102 (experimental) ?
For example:
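      int f(int i)
      {
           int a = 0;
           int j;
           while (i-- > 0)                 /* outer loop: exact form assumed */
             for (j=0 ; j < 100 ; ++j)
               a++;                        /* body assumed; 'a' is never read */
         return 0;
      }
      int main()
      {
        return f(10*1000*1000);
      }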

For this code, gcc finds out that 'a' is dead, and it almost eliminates
the inner loop. Yet, the inner loop is not eliminated, and it is not
moved out of the outer loop.
I think that the above code should have been transformed to something
like:
      int f(int i)
      {
         return 0;
      }
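
The claim can be checked on a small scale: assuming the loop nest only updates
'a' and 'a' is never read, both versions return the same value for every input
and differ only in running time. In the sketch below f_loops and f_folded are
hypothetical names, and the outer loop and the a++ body are assumptions about
the original test case.

```c
/* f_loops: assumed shape of the test case; the nest only feeds the dead 'a'. */
int f_loops(int i)
{
    int a = 0;
    int j;
    while (i-- > 0)                 /* outer loop: form assumed */
        for (j = 0; j < 100; ++j)   /* inner loop bounds as in the post */
            a++;                    /* body assumed; result never used */
    return 0;
}

/* f_folded: what the post argues the optimizer should reduce f to. */
int f_folded(int i)
{
    (void)i;                        /* argument no longer matters */
    return 0;
}
```

Since no caller can tell the two apart except by timing, removing the loops is
a legal transformation; whether it is desirable is the delay-loop question
raised above.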

This is not a regression, because this is the way gcc-3.4.2 does it. Still
"fixing" this may result in performance improvements that may offset
one or two performance regressions.

Here is the relevant part of the asm (gcc -O2 t.c -S):
.L4:
        movl    $100, %eax
        .p2align 4,,15
.L5:
        decl    %eax
        jne     .L5
        incl    %edx
        movl    %ecx, %eax
        subl    %edx, %eax
        testl   %eax, %eax
        jg      .L4

On the bright side, at least it does not try to unroll the outer loop
like it did with gcc-3.4.2 (gcc-3.4 copies the inner loop over and over).

I came to this example by examining some tail recursion code that
was badly optimized (bad in itself, without comparing it to any other
compiler or version).
