[PATCH, i386]: Fix PR target/29852: Use fprem and fprem1 insns for SSE math

Uros Bizjak ubizjak@gmail.com
Wed Nov 29 22:12:00 GMT 2006


Hello!

This patch implements the fmod and remainder intrinsics with the x87 
fprem and fprem1 instructions for SSE math as well. To shorten the 
truncation sequences and the x87->SSE register reloads, the 
truncxfsf2_mixed and truncxfdf2_mixed patterns have to be enabled for 
non-mixed SSE/387 math too.

The testcase from PR:

#include <math.h>

double foo(double a, double b)
{
  double x = fmod(a, 1.1);
  return x + b;
}

compiles for the x86_64 target to (with -O2 -fno-math-errno for clarity):

        movsd   %xmm0, -16(%rsp)
        fldl    -16(%rsp)
        fldl    .LC0(%rip)
        fxch    %st(1)
.L2:
        fprem
        fnstsw  %ax
        testb   $4, %ah
        jne     .L2
        fstp    %st(1)
        fstpl   -8(%rsp)		<<- this is the truncation insn
        movsd   -8(%rsp), %xmm0
        addsd   %xmm1, %xmm0
        ret

As shown in the PR, code generated with this patch runs the synthetic 
fmod() testcase more than 4 times faster than code from unpatched gcc 
and almost 2 times faster than code from icc.

2006-11-29  Uros Bizjak  <ubizjak@gmail.com>

        PR target/29852
        * config/i386/i386.md (*truncxfsf2_mixed, *truncxfdf2_mixed): Enable
        insn patterns for TARGET_80387.
        (*truncxfsf2_i387, *truncxfdf2_i387): Remove.
        (*truncxfsf2_i387_1): Rename to *truncxfsf2_i387.
        (*truncxfdf2_i387_1): Rename to *truncxfdf2_i387.
        (fmod<mode>3, remainder<mode>3): Enable expanders for SSE math.
        Generate truncxf<mode>2 insn patterns for strict SSE math.

The patch was bootstrapped on x86_64-pc-linux-gnu and regression tested 
for C, C++ and Fortran.

OK for mainline?

Uros.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: i386-fmodsse.diff
Type: text/x-patch
Size: 4619 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20061129/27e66867/attachment.bin>

