Odd register-register move issue.

Simon Dardis Simon.Dardis@imgtec.com
Mon Aug 24 15:57:00 GMT 2015


Hello all,

I'm investigating a GCC issue in which what is effectively an FP register-to-register move goes through memory in a well-known benchmark. I've tracked it down to an interaction between the conditional store elimination (cselim) pass and the SSA dominator optimization (dom) pass. The assembly produced with -O3 is:

g:
<snip>
        div.d   $f9,$f6,$f4
        li      $8,1                    # 0x1
        li      $2,1                    # 0x1
        mul.d   $f2,$f9,$f9
        sdc1    $f2,8($9)
.L39:
        ldc1    $f1,8($9)
        li      $13,1                   # 0x1
<snip>

.L47:
<snip>
        b       .L39
        sdc1    $f2,8($9)	# delay slot

In the above case, $9 points to an array and $f2 is written there as expected. The basic block labelled 47 is another entry to .L39 and also writes out $f2 (with a different value). At .L39, though, $f1 is loaded with the value that was just stored. GCC has duplicated the store and then reloads the value from memory instead of reusing $f2. However, if -fno-tree-dominator-opts or -fno-tree-cselim is used, GCC will generate:


        div.d   $f9,$f4,$f2
        li      $2,1                    # 0x1
        li      $7,1                    # 0x1
.L37:
        li      $13,1                   # 0x1
        move    $11,$9
        mul.d   $f1,$f9,$f9
        mul.d   $f8,$f1,$f1
        sdc1    $f1,8($9)

which is a great deal better, as there is no effective FP-to-FP move through memory. The C code that produces the above looks like:

        if( SomeVal > 1 )
        {
            v = 1 / SomeVal;
            inverted = true;
        }
        else
        {
            v = SomeVal;
            inverted = false;
        }

        array[1] = v * v;
        /* ... followed by an fmadd loop that reads array[1] ... */
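
For anyone wanting to try this outside the benchmark, a minimal standalone sketch of the pattern might look like the following. The function signature, the names array, x, n, some_val and inverted_out, and the loop body itself are all assumptions made for illustration; the real benchmark loop is not shown here, so this reduction may or may not trigger the same code shape.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical reduction of the pattern described above.
 * Only the if/else and the store to array[1] come from the original
 * snippet; everything else (parameters, loop body) is assumed.
 */
double g(double *array, const double *x, size_t n,
         double some_val, bool *inverted_out)
{
    double v;
    bool inverted;

    if (some_val > 1) {
        v = 1 / some_val;
        inverted = true;
    } else {
        v = some_val;
        inverted = false;
    }

    /* The store whose value is later read back from memory. */
    array[1] = v * v;

    /* Assumed stand-in for the "fmadd loop": a multiply-accumulate
     * that reads array[1] on every iteration. */
    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += array[1] * x[i];

    *inverted_out = inverted;
    return acc;
}
```

Compiling this with -O3 -S, once as-is and once with either of the two flags mentioned earlier, and comparing the output around the store to array[1] should show whether the reload is present.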

I have been able to reproduce this for x86_64 as well. How might I go about resolving this issue?

Thanks,
Simon


