Consider this bit of code:
extern double a[20];
double test1 (int n)
{
double accum = 0.0;
int i;
for (i=0; i<n; i++) { accum -= a[i]; }
accum = fabs (accum);
return accum;
}
which is compiled for MIPS using
mipsisa32r2-sde-elf-gcc -O3 -fno-inline -fno-unroll-loops
-march=74kf1_1 -S abstest.c
With a GCC 4.6 compiler, this produces:
...
.L3:
mtc1 $3,$f2
ldc1 $f0,0($5)
addiu $5,$5,8
mtc1 $2,$f3
sub.d $f2,$f2,$f0
mfc1 $3,$f2
bne $5,$4,.L3
mfc1 $2,$f3
ext $5,$2,0,31
move $4,$3
.L2:
mtc1 $4,$f0
j $31
mtc1 $5,$f1
...
This is terrible code, with all that pointless register-shuffling
inside the loop -- what's gone wrong? Well, the bit-twiddling
expansion of "fabs" produced by optabs.c uses subreg expressions, and
on MIPS CANNOT_CHANGE_MODE_CLASS disallows use of FP registers for
integer operations. And, when IRA sees that, it decides it cannot
alloc "accum" to a FP reg at all, even if it obviously makes sense to
put it there for the rest of its lifetime.
On mainline trunk, things are even worse as it's spilling to memory,
not just shuffling between registers:
.L3:
ldc1 $f0,0($2)
addiu $2,$2,8
sub.d $f2,$f2,$f0
bne $2,$3,.L3
sdc1 $f2,0($sp)
lw $2,0($sp)
ext $3,$2,0,31
lw $2,4($sp)
.L2:
sw $2,4($sp)
sw $3,0($sp)
lw $3,4($sp)
lw $2,0($sp)
addiu $sp,$sp,8
mtc1 $3,$f0
j $31
mtc1 $2,$f1
I've been experimenting with a patch to the MIPS backend to add
define_insn_and_split patterns for floating-point abs -- the idea is
to attach some constraints to the insns to tell IRA it needs a GP reg
for these operations, so it can apply its usual cost analysis and
reload logic instead of giving up. Then the split to introduce the
subreg expansion happens after reload when we already have the right
register class. This seems to work well enough on 4.6; for this
particular example, I'm getting:
.L3:
ldc1 $f2,0($2)
addiu $2,$2,8
bne $2,$4,.L3
sub.d $f0,$f0,$f2
mfc1 $2,$f1
ext $2,$2,0,31
j $31
mtc1 $2,$f1
However, same patch on mainline is still giving spills to memory. :-(
So, here's my question. Is it worthwhile for me to continue this
approach of trying to make the MIPS backend smarter? Or is the way
IRA deals with CANNOT_CHANGE_MODE_CLASS fundamentally broken and in
need of fixing in a target-inspecific way? And/or is there some other
regression in IRA on mainline that's causing it to spill to memory
when it didn't used to in 4.6?