This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[RFC PATCH, x86_64] Use -mno-sse[,2] to fall back to x87 FP argument passing convention


Hello!

I would like to present an x86_64 ABI extension for FP types, where
FP arguments are passed using the x87 argument passing convention.

In the current state of x86_64 affairs, the user can select -mfpmath=387
to instruct the compiler to use x87 FP instructions. However, FP
arguments and return values are still passed in SSE registers, as
specified by the x86_64 ABI. This means that at the beginning of the
function, FP arguments have to be spilled from SSE registers to memory
and reloaded from memory into x87 registers (and vice versa at the
end of the function).

The attached RFC patch can be used to specify where FP arguments of
different modes are passed. It is surprisingly short, as it piggybacks
on existing XFmode functionality (XFmode arguments are always passed
using the x87 convention). So, specifying -mno-sse2 now passes DFmode
arguments using the x87 passing convention, instead of emitting an error
that an SSE register is used without SSE enabled.

The benefits of this change can be seen from the following (rather silly) example:

--cut here--
#include <math.h>   /* declares atan2 */

double test(double x)
{
       double a = atan2(x, 2.0);

       a *= 4.0;
       return a;
}
--cut here--

Currently, gcc -O2 -mfpmath=387 will produce:
test:
.LFB2:
       subq    $8, %rsp
.LCFI0:
       flds    .LC0(%rip)
       fstpl   (%rsp)
       movsd   (%rsp), %xmm1
       call    atan2
       movsd   %xmm0, (%rsp)
       fldl    (%rsp)
       fmuls   .LC1(%rip)
       fstpl   (%rsp)
       movsd   (%rsp), %xmm0
       addq    $8, %rsp
       ret

whereas a patched gcc -O2 -mno-sse [-mfpmath=387] would produce:
test:
.LFB2:
       subq    $24, %rsp
.LCFI0:
       movabsq $4611686018427387904, %rax
       movq    %rax, 8(%rsp)
       movq    32(%rsp), %rax
       movq    %rax, (%rsp)
       call    atan2
       fmuls   .LC1(%rip)
       addq    $24, %rsp
       ret
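
A side note on the listing above: the movabsq immediate is simply the
IEEE 754 bit pattern of the constant 2.0, stored to the stack as the
second argument of atan2. The following is a minimal sketch that checks
this, assuming a target where double is the 64-bit IEEE 754 format. The
helper names bits_to_double and check_movabsq_immediate are made up for
illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret a 64-bit integer as a double.
   memcpy is used to avoid strict-aliasing problems. */
double bits_to_double(uint64_t bits)
{
    double d;

    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Hypothetical check: the immediate from the movabsq instruction
   (4611686018427387904 == 0x4000000000000000) decodes to 2.0. */
void check_movabsq_immediate(void)
{
    /* Assumption: double is 64 bits wide on this target. */
    assert(sizeof(double) == sizeof(uint64_t));
    assert(bits_to_double(UINT64_C(4611686018427387904)) == 2.0);
}
```

Calling check_movabsq_immediate() from main should pass silently on
such a target. The point is only that, with the x87 convention, the
constant reaches atan2 through memory as a plain 64-bit store rather
than through %xmm1.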

But the real gain comes with -O2 -mno-sse [-mfpmath=387] -ffast-math,
where x87 builtins are emitted:
test:
.LFB2:
       flds    .LC0(%rip)
       fldl    8(%rsp)
       fxch    %st(1)
       fpatan
       fmuls   .LC1(%rip)
       ret

I think that this would be a nice extension compared to the error
that is currently emitted for -mno-sse. Please note that complex
numbers and decimal FP are not supported by the attached patch.

Uros.

Attachment: x86_64-usex87.diff

