Odd register-register move issue.
Simon Dardis
Simon.Dardis@imgtec.com
Mon Aug 24 15:57:00 GMT 2015
Hello all,
I'm investigating a GCC issue in which an effective FP register-register move occurs through memory in a well-known benchmark. I've tracked it down to some interaction of the conditional store elimination pass and the ssa-dom pass. The produced assembly with -O3 is:
g:
<snip>
div.d $f9,$f6,$f4
li $8,1 # 0x1
li $2,1 # 0x1
mul.d $f2,$f9,$f9
sdc1 $f2,8($9)
.L39:
ldc1 $f1,8($9)
li $13,1 # 0x1
<snip>
.L47:
<snip>
b .L39
sdc1 $f2,8($9) # delay slot
In the above case, $9 points to some array and $f2 gets written there as expected. The basic block labelled 47 is another entry to L39 and also writes out $f2 (with a different value). From L39, though, $f1 is loaded with the value we just wrote out. GCC has duplicated the store and not reused $f2. However, if -fno-tree-dominator-opts or -fno-tree-cselim is used, GCC will generate:
div.d $f9,$f4,$f2
li $2,1 # 0x1
li $7,1 # 0x1
.L37:
li $13,1 # 0x1
move $11,$9
mul.d $f1,$f9,$f9
mul.d $f8,$f1,$f1
sdc1 $f1,8($9)
which is a great deal better, as there is no effective FP-to-FP move through memory. The C code that produces the above looks like:
if( SomeVal > 1 )
{
v = 1 / SomeVal;
inverted = true ;
}
else
{
v = SomeVal;
inverted = false ;
}
array[1] = v * v;
/* ... followed by an fmadd loop that uses array[1] ... */
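For anyone wanting to experiment, a self-contained sketch of the shape of the code might look like the following. The function signature, the parameter names (some_val, n, inverted_out) and the body of the multiply-add loop are assumptions made for illustration only; the benchmark's actual loop is not shown here, and this reduction has not been confirmed to trigger the same code generation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reduction: names and loop body are illustrative only. */
double g(double some_val, double *array, size_t n, bool *inverted_out)
{
    double v;
    bool inverted;

    if (some_val > 1.0) {
        v = 1.0 / some_val;
        inverted = true;
    } else {
        v = some_val;
        inverted = false;
    }

    /* The store whose value is later reloaded from memory. */
    array[1] = v * v;

    /* Assumed stand-in for the "fmadd loop with array[1]". */
    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += array[1] * array[i];

    *inverted_out = inverted;
    return acc;
}
```

Compiling something like this at -O3 with and without -fno-tree-cselim, and comparing the assembly around the store to array[1], would be one way to check whether the reload from memory appears.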
I have been able to reproduce this for x86_64 as well. How might I go about resolving this issue?
Thanks,
Simon