This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Optimization problem on PPC when casting integers to floats




From: Geoff Keating <geoffk@cygnus.com>
>> But in the case of casting unsigned shorts to floats, gcc goes bonkers
>> and ends up doing part of the conversion twice and throwing away the
>> results.
>Could you be more specific? I think this is fixed in the latest
>sources, but I'm not sure.


From this test function:

double test1(unsigned short *stuff)
{
    unsigned int i;
    double x = 0;
    for (i = 0; i < 1000; i++)
        x += stuff[i];
    return x;
}

I was getting the following code (under MacOS X DP4):

_test1:
lis r7,ha16(LC0)
la r7,lo16(LC0)(r7)
lfd f1,0(r7)
lis r11,0x4330
lis r9,0x4330
lis r10,0x8000
li r8,1000
mtctr r8
L8:
lhz r0,0(r3)
addi r3,r3,2
stw r9,-16(r1)
stw r10,-12(r1)
lfd f13,-16(r1)
xoris r8,r0,0x8000
stw r8,-4(r1)
stw r11,-8(r1)
lfd f0,-8(r1)
fsub f0,f0,f13
fadd f1,f1,f0
bdnz L8
blr


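Notice that the loop builds two doubles on the stack on every iteration (offsets -4, -8, -12 and -16 are all written). Only the double at offset -8 depends on the value just loaded. The one at offset -16 is the constant 0x4330000080000000, which is rewritten and reloaded each time through the loop even though it never changes, and the 0x4330 high word at offset -8 is likewise stored again on every pass.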
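For anyone not familiar with the sequence, here is a minimal C sketch of what the stw/lfd/fsub instructions in the loop compute. The function name int_to_double_via_bias is made up for illustration, and the sketch assumes IEEE 754 doubles stored with the same byte order as 64-bit integers. It is only a model of the arithmetic, not what gcc emits internally.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from gcc): converts a signed 32-bit integer
   to double with the same bias trick the assembly uses.
   Assumes IEEE 754 binary64 doubles. */
static double int_to_double_via_bias(int32_t x)
{
    /* High word 0x43300000 selects the exponent of 2^52, so the low
       word is read as an integer added to 2^52.  Flipping the sign
       bit (the xoris with 0x8000) turns x into x + 2^31, unsigned. */
    uint64_t biased_bits = ((uint64_t)0x43300000u << 32)
                         | ((uint32_t)x ^ 0x80000000u);

    /* The loop-invariant constant 0x4330000080000000 = 2^52 + 2^31. */
    uint64_t bias_bits = ((uint64_t)0x43300000u << 32) | 0x80000000u;

    double biased, bias;
    memcpy(&biased, &biased_bits, sizeof biased);  /* stw, stw, lfd */
    memcpy(&bias, &bias_bits, sizeof bias);

    /* (2^52 + 2^31 + x) - (2^52 + 2^31) = x, exactly (the fsub). */
    return biased - bias;
}
```

In the loop above, the value being converted is the zero-extended halfword from lhz, so x is always between 0 and 65535, but the same sequence works for any 32-bit signed value.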

From the test results that I forwarded from Steven G. Johnson (Linux PPC, with 2.95.2 and a recent snapshot), it looks like this case is now only as bad as the others: one extra write per iteration instead of three. The assembly that Steven mailed to me was as follows:

gcc-2.95.2 produces the following:

test1:
stwu 1,-16(1)
lis 9,.LC0@ha
la 9,.LC0@l(9)
lfd 1,0(9)
lis 9,.LC1@ha
li 0,1000
la 9,.LC1@l(9)
mtctr 0
lfd 13,0(9)
lis 11,0x4330
.L8:
lhz 0,0(3)
xoris 0,0,0x8000
stw 0,12(1)
stw 11,8(1)
lfd 0,8(1)
addi 3,3,2
fsub 0,0,13
fadd 1,1,0
bdnz .L8
la 1,16(1)
blr


The CVS gcc produces this:

test1:
li 0,1000
lis 11,.LC1@ha
mtctr 0
lis 9,.LC0@ha
la 11,.LC1@l(11)
la 9,.LC0@l(9)
lfd 13,0(11)
lfd 1,0(9)
lis 9,0x4330
stwu 1,-16(1)
.L9:
lhz 0,0(3)
addi 3,3,2
stw 9,8(1)
xoris 0,0,0x8000
stw 0,12(1)
lfd 0,8(1)
fsub 0,0,13
fadd 1,1,0
bdnz .L9
addi 1,1,16
blr
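To make the remaining extra write concrete, here is a rough C-level sketch of the loop with all of the invariant work pulled out by hand: the 0x4330 high word and the bias constant are set up once, and only the low word changes per iteration. The name test1_hoisted and the union layout are my own illustration, it assumes IEEE 754 doubles, and nothing guarantees a given compiler turns it into a single store per iteration.

```c
#include <stdint.h>

/* Hypothetical hand-hoisted version of test1 (illustration only).
   Assumes IEEE 754 binary64 doubles; reads 1000 elements like test1. */
double test1_hoisted(const unsigned short *stuff)
{
    union { double d; uint32_t w[2]; } val, bias;
    unsigned int i;
    int hi, lo;
    double x = 0;

    /* Find which 32-bit half holds the high word on this machine:
       1.0 is 0x3FF0000000000000, so only its high word is nonzero. */
    val.d = 1.0;
    hi = (val.w[0] != 0) ? 0 : 1;
    lo = 1 - hi;

    /* Loop-invariant setup: the bias 2^52 + 2^31, built once. */
    bias.w[hi] = 0x43300000u;
    bias.w[lo] = 0x80000000u;

    /* The high word of the working double never changes either. */
    val.w[hi] = 0x43300000u;

    for (i = 0; i < 1000; i++) {
        /* Only the low word is rewritten each time through. */
        val.w[lo] = (uint32_t)stuff[i] ^ 0x80000000u;
        x += val.d - bias.d;
    }
    return x;
}
```

This should return the same sum as test1 for any input, since each conversion is exact.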



-tim

