rtlopt loop unroller question
Zdenek Dvorak
rakdver@atrey.karlin.mff.cuni.cz
Wed Oct 22 07:49:00 GMT 2003
Hello,
> The following (sent on behalf of Yossi Markovich) simple loop:
>
> int * foo ()
> {
>   int A[N];
>   int B[N];
>   int i;
>   for (i=0; i<N; i++)
>     A[i] = B[i];
>   return A;
> }
>
> results in much better code when compiled using "gcc3.4 -O3
> -fold-unroll-loops", than when compiled using the rtlopt branch with "-O3
> -funroll-loops" (on powerpc-apple-darwin6.4). We are aware of the fact that
> the new loop optimizer in mainline is known to have caused regressions; we
> were wondering whether something can be done to get the better addressing
> calculation using the rtlopt branch (possibly using a different set of
> flags?)?
As I suspected, my favourite piece of CSE strikes again. With the
patch below applied, the code produced for the loop is much better:
Zdenek
.text
.align 2
.globl _foo
_foo:
lis r3,0xffff
li r0,797
ori r2,r3,14400
stmw r25,-28(r1)
mtctr r0
stwux r1,r1,r2
addi r12,r1,24
L5:
addi r4,r12,4
addi r2,r12,8
addi r29,r12,12
addi r28,r12,16
addi r27,r12,20
addi r26,r12,24
addi r25,r12,28
lwz r0,25544(r12)
lwz r9,25544(r4)
lwz r11,25544(r2)
lwz r10,25544(r29)
lwz r8,25544(r28)
lwz r7,25544(r27)
lwz r6,25544(r26)
lwz r5,25544(r25)
stw r0,8(r12)
addi r12,r12,32
stw r9,8(r4)
stw r11,8(r2)
stw r10,8(r29)
stw r8,8(r28)
stw r7,8(r27)
stw r6,8(r26)
stw r5,8(r25)
bdnz L5
lwz r1,0(r1)
lmw r25,-28(r1)
blr
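For readers who do not read PowerPC assembly fluently, the loop at L5 is an
8-way unrolled copy: one base register (r12) advances by 32 bytes per
iteration, and eight loads followed by eight stores use fixed offsets from
it. A rough C-level sketch of that shape follows. The helper name
copy_unrolled8 and the remainder loop are illustrative assumptions, not
something the compiler or the patch defines.

```c
#include <stddef.h>

/* Hypothetical helper (not from the patch): copy n ints from src to dst
   with the main loop unrolled by 8, mirroring the shape of the asm above. */
static void copy_unrolled8(int *dst, const int *src, size_t n)
{
    size_t i = 0;

    /* Main body: eight loads, then eight stores, all relative to index i. */
    for (; i + 8 <= n; i += 8) {
        int t0 = src[i],     t1 = src[i + 1];
        int t2 = src[i + 2], t3 = src[i + 3];
        int t4 = src[i + 4], t5 = src[i + 5];
        int t6 = src[i + 6], t7 = src[i + 7];

        dst[i]     = t0; dst[i + 1] = t1;
        dst[i + 2] = t2; dst[i + 3] = t3;
        dst[i + 4] = t4; dst[i + 5] = t5;
        dst[i + 6] = t6; dst[i + 7] = t7;
    }

    /* Remainder: handle the elements left when n is not a multiple of 8. */
    for (; i < n; i++)
        dst[i] = src[i];
}
```

The point of the sketch is only that every access in the unrolled body is
"base plus a small constant", which is the addressing form the original
question asked about.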
Index: cse.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/cse.c,v
retrieving revision 1.231.2.11
diff -c -3 -p -r1.231.2.11 cse.c
*** cse.c 20 Jul 2003 22:04:20 -0000 1.231.2.11
--- cse.c 22 Oct 2003 00:46:36 -0000
*************** fold_rtx (x, insn)
*** 4222,4227 ****
--- 4222,4228 ----
|| XEXP (y, 0) == folded_arg0)
break;
+ #if 0
/* Don't associate these operations if they are a PLUS with the
same constant and it is a power of two. These might be doable
with a pre- or post-increment. Similarly for two subtracts of
*************** fold_rtx (x, insn)
*** 4237,4242 ****
--- 4238,4244 ----
|| (HAVE_POST_DECREMENT
&& exact_log2 (- INTVAL (const_arg1)) >= 0)))
break;
+ #endif
/* Compute the code used to compose the constants. For example,
A-C1-C2 is A-(C1 + C2), so if CODE == MINUS, we want PLUS. */
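To make the effect of the #if 0 easier to follow, here is a toy model of
the decision the disabled block makes, based only on its comment: refuse to
rewrite (A + C1) + C2 as A + (C1 + C2) when both constants are the same
power of two, because such a chain might become a pre- or post-increment.
Everything in it (toy_exact_log2, toy_may_associate, the have_autoinc and
guard_enabled parameters) is a hypothetical stand-in, not GCC's code, and it
models only the increment case, not the decrement one.

```c
/* Toy stand-in for an exact-log2 helper: returns log2(x) when x is a
   positive power of two, and -1 otherwise. */
static int toy_exact_log2(long x)
{
    int n = 0;

    if (x <= 0 || (x & (x - 1)) != 0)
        return -1;
    while ((x >>= 1) != 0)
        n++;
    return n;
}

/* Returns 1 if (base + inner) + outer may be folded to base + (inner + outer).
   guard_enabled = 1 models the code before the patch,
   guard_enabled = 0 models the code with the block under #if 0.
   have_autoinc is an assumed flag for "target has auto-increment". */
static int toy_may_associate(long inner, long outer,
                             int have_autoinc, int guard_enabled)
{
    if (guard_enabled && have_autoinc
        && inner == outer && toy_exact_log2(outer) >= 0)
        return 0;   /* keep the +C, +C chain for a possible auto-increment */

    return 1;       /* fold the two constants into one offset */
}
```

In this model, an unrolled chain of "+4, +4" on a target with
auto-increment is left alone while the guard is active, and is folded into
a single "+8" once the guard is disabled, which is the behaviour the patch
is after.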