This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Another performance regression
- From: Dale Johannesen <dalej at apple dot com>
- To: gcc-patches at gcc dot gnu dot org, gcc at gcc dot gnu dot org
- Cc: Dale Johannesen <dalej at apple dot com>
- Date: Thu, 26 Sep 2002 12:01:38 -0700
- Subject: Another performance regression
Try the program at the bottom with -O2 -funroll-loops. Don't worry
about the body of the loops; that's only important insofar as it has
the right amount of code to cause the inner loop to be unrolled the
right number of times, namely 2, with 1 left over. The unroller
generates some rather stupid code here:
    /* Calculate the difference between the final and initial values.
       Final value may be a (plus (reg x) (const_int 1)) rtx.
       Let the following cse pass simplify this if initial value is
       a constant.
(there's more to it besides the expression described above)
with the expectation that cse will clean it up. However, the second
pass of loop optimization pulls some, but not all, of this code out of
the outer loop, with the effect that cse can't eliminate it. On ppc,
for example, the beginning of the function looks like this:
        bge- cr0,L18        ; zero-trip check for outer loop
        li r0,1             ; unnecessary
        cmpwi cr1,r0,0      ; unnecessary
        cmpwi cr6,r0,25     ; unnecessary
L16:                        ; top of outer loop
        slwi r0,r6,2
        li r8,0
        add r7,r0,r28
        mr r10,r29
        bge+ cr6,L22        ; always false
        beq- cr1,L15        ; always false
L22:
        ... single copy of inner loop body...
L15:
        ... two copies of inner loop body, executed 12 times...
        ble L15
        ...
        blt L16
L18:
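For anyone who would rather read this as C, here is a rough
hand-written analogue of the control flow in the listing above. It is
only a sketch: body(), run_outer() and the two flag variables are
made-up names for illustration, the early exit (break) in the real
inner loop is left out, and nothing here is what the unroller actually
emits. The point is that both tests depend only on the constant 1, so
they could be folded away, yet they are evaluated on every trip through
the outer loop.

```c
#include <assert.h>

#define INNER_TRIPS 25   /* inner loop bound in the test case */
#define UNROLL       2   /* unroll factor mentioned in the post */

/* Hypothetical stand-in for one copy of the inner loop body;
   it only counts how many times it runs. */
static void body(int *executed)
{
    ++*executed;
}

/* Illustrative analogue of the listing, not GCC output. */
int run_outer(int outer_trips)
{
    int executed = 0;

    /* 25 iterations = 1 leftover copy + 12 passes of 2 copies. */
    assert(INNER_TRIPS == 1 + UNROLL * (INNER_TRIPS / UNROLL));

    if (outer_trips <= 0)                  /* zero-trip check for outer loop */
        return executed;

    /* Hoisted out of the outer loop; everything here is a constant. */
    int r0 = 1;                            /* li r0,1 */
    int is_zero = (r0 == 0);               /* cmpwi cr1,r0,0 */
    int ge_bound = (r0 >= INNER_TRIPS);    /* cmpwi cr6,r0,25 */

    for (int i = 0; i < outer_trips; i++) {              /* L16 */
        /* bge+ cr6,L22 then beq- cr1,L15: the single copy would be
           skipped only if the first test failed and the second
           succeeded, which never happens with r0 == 1. */
        if (ge_bound || !is_zero)
            body(&executed);                             /* L22 */

        for (int k = 0; k < INNER_TRIPS / UNROLL; k++) { /* L15 */
            body(&executed);
            body(&executed);
        }
    }
    return executed;
}
```

Calling run_outer(n) for a positive n runs the body 25 * n times, the
same as the original loop nest would without the early exit, so the
two tests inside the outer loop contribute nothing.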
I'm not entirely sure, but I think this patch was the culprit:

2002-07-21  Richard Henderson  <rth@redhat.com>

        * loop.h (LOOP_AUTO_UNROLL): Rename from LOOP_FIRST_PASS.
        * loop.c (strength_reduce): Update.
        * toplev.c (rest_of_compilation): Do unrolling in the first
        loop pass, not the second.

This didn't happen when unrolling was done last.
So should I fix this by making the unrolling code smarter, in effect
doing cse's job? It seems likely Roger Sayle's approach of running
gcse after loop opts would Just Work. Is that going to go in?
/* Test case: compile with -O2 -funroll-loops. */
int foo(char *abcd00, int abcd01, char *abcd02, int *abcd03, int *abcd04)
{
  int abcd05, abcd06, abcd07 = 0, abcd08 = 0, abcd09, abcd10, abcd11 = 0;

  for (abcd05 = 0; abcd05 < abcd01; abcd05++) {
    for (abcd06 = 0; abcd06 < 25; abcd06++) {
      if (abcd00[abcd05] == abcd06 && abcd07 < 2) {
        if (abcd07 == 0) {
          abcd09 = 26 * abcd03[abcd06];
          abcd02[abcd08++] = abcd00[abcd05];
          abcd07 = 1;
        } else if (abcd07 == 1) {
          abcd10 = abcd09 + abcd03[abcd06];
          abcd02[abcd08++] = abcd00[abcd05];
          abcd07 = 2;
        }
      }
      if (abcd07 == 2) {
        abcd04[abcd11++] = abcd10;
        abcd07 = 0;
        break;
      }
    }
  }
  return abcd07;
}