Bug 94852 - -ffloat-store on x64 target
Summary: -ffloat-store on x64 target
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 10.0
: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: documentation
Depends on:
Blocks:
 
Reported: 2020-04-29 15:57 UTC by Ivan Sorokin
Modified: 2021-10-27 19:50 UTC (History)
3 users (show)

See Also:
Host:
Target: x86_64-*-* i?86-*-*
Build:
Known to work:
Known to fail:
Last reconfirmed: 2020-04-30 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Ivan Sorokin 2020-04-29 15:57:47 UTC
At the moment -ffloat-store significantly pessimizes the code generation regardless of whether -mfpmath=sse -msse2 are used or not:

float f(float a, float b)
{
    return a + b;
}

-O2:
        addss   xmm0, xmm1
        ret

-O2 -ffloat-store:
        movss   DWORD PTR [rsp-20], xmm0
        movss   xmm0, DWORD PTR [rsp-20]
        movss   DWORD PTR [rsp-24], xmm1
        addss   xmm0, DWORD PTR [rsp-24]
        movss   DWORD PTR [rsp-4], xmm0
        movss   xmm0, DWORD PTR [rsp-4]
        ret

Note that -mfpmath=sse -msse2 are the defaults on x86-64. My understanding is that -ffloat-store doesn't affect the result of floating point operations when SSE math is used. If this is true -ffloat-store pessimizes generated code without any change in observable behavior.

Recently I have found a steam game that targets x86-64 and was compiled with -ffloat-store (presumably by mistake). For details see: https://forums.factorio.com/viewtopic.php?f=30&t=81134 . When -ffloat-store was removed a developer reported a 35% speedup of the Linux version of the game.

My guess is -ffloat-store might be used by mistake when someone tries to get reproducible results on x86 without realizing that the same flags affects the performance negatively on x86-64.

To prevent issues like this in the future I think GCC could do two things:
1. Ignore -ffloat-store when it doesn't affect the result of floating-point operations pretending that redundant loads/stores are optimized.
2. Issue a warning when -ffloat-store doesn't affect the result of floating-point operations. Because there is no point in using a flag which only effect is pessimizing code generation.
Comment 1 Richard Biener 2020-04-29 16:54:32 UTC
@item -ffloat-store
@opindex ffloat-store
Do not store floating-point variables in registers, and inhibit other
options that might change whether a floating-point value is taken from a
register or memory.

I think it does what it says?
Comment 2 Andreas Schwab 2020-04-29 17:04:25 UTC
For avoiding issues with excess precision there is -fexcess-precision=standard now.
Comment 3 Jakub Jelinek 2020-04-29 17:23:00 UTC
(In reply to Andreas Schwab from comment #2)
> For avoiding issues with excess precision there is
> -fexcess-precision=standard now.

Only for C, for C++ we don't have it.  And -fexcess-precision=standard actually doesn't avoid the excess precision, but just enforces it consistently.
Comment 4 Ivan Sorokin 2020-04-29 17:58:37 UTC
(In reply to Richard Biener from comment #1)
> @item -ffloat-store
> @opindex ffloat-store
> Do not store floating-point variables in registers, and inhibit other
> options that might change whether a floating-point value is taken from a
> register or memory.
> 
> I think it does what it says?

Yes, the behavior of the compiler and the documentation matches very well. The compiler works as intended. My report is not about a bug, but about a possible improvement.

If ignoring or implementing a warning is considered undesirable, I would suggest expanding the documentation by clarifying the interaction between -ffloat-store and -mfpmath=sse.

Something like this in the documentation would help: "If used together with -mfpmath=sse, -ffloat-store doesn't change the results of floating point operations. The only effect it has is severely pessimizing the generated code."
Comment 5 Andrew Pinski 2020-04-29 18:01:51 UTC
Just marking as a dup of bug 323 and moving on.

*** This bug has been marked as a duplicate of bug 323 ***
Comment 6 Ivan Sorokin 2020-04-29 18:15:27 UTC
(In reply to Richard Biener from comment #1)
> @item -ffloat-store
> @opindex ffloat-store
> Do not store floating-point variables in registers, and inhibit other
> options that might change whether a floating-point value is taken from a
> register or memory.
> 
> I think it does what it says?

This is a follow-up for my previous comment.

Perhaps I haven't explained myself properly, let me explain why I find the existing behavior a bit confusing.

From the documentation on -ffloat-store:
"This option prevents undesirable excess precision on machines such as the 68000 where the floating registers (of the 68881) keep more precision than a double is supposed to have. Similarly for the x86 architecture."

When a person uses -ffloat-store the desired effect is not the additional loads/stores, but the reproducible results across different compiler version/optimization options. It just happened that the cheapest way to go so is adding additional loads/stores.

I'm pretty sure most users would be in favor of removing extra loads/stores when they don't affect the results.

I understand that perhaps there are reasons why -ffloat-store should work the way it works now. If this is true, I would recommend updating the documentation by reflecting the cases (if they exists) when one might want to use -ffloat-store on x86-64. From what I understand now using -ffloat-store on x86-64 is just a mistake.
Comment 7 Richard Sandiford 2020-04-30 16:29:50 UTC
Reopening.  I don't think this is a dup of PR323.  The problem
is instead that the workaround for PR323 is silently accepted
and applied even on targets for which the workaround isn't needed.
That seems like a valid point to me, and if we don't have an
earlier PR making the same point, I think we should keep this
one open.

I agree that at the very least the documentation could be improved
(but I'm not volunteering :-))