At the moment -ffloat-store significantly pessimizes the code generation regardless of whether -mfpmath=sse -msse2 are used or not: float f(float a, float b) { return a + b; } -O2: addss xmm0, xmm1 ret -O2 -ffloat-store: movss DWORD PTR [rsp-20], xmm0 movss xmm0, DWORD PTR [rsp-20] movss DWORD PTR [rsp-24], xmm1 addss xmm0, DWORD PTR [rsp-24] movss DWORD PTR [rsp-4], xmm0 movss xmm0, DWORD PTR [rsp-4] ret Note that -mfpmath=sse -msse2 are the defaults on x86-64. My understanding is that -ffloat-store doesn't affect the result of floating point operations when SSE math is used. If this is true -ffloat-store pessimizes generated code without any change in observable behavior. Recently I have found a steam game that targets x86-64 and was compiled with -ffloat-store (presumably by mistake). For details see: https://forums.factorio.com/viewtopic.php?f=30&t=81134 . When -ffloat-store was removed a developer reported a 35% speedup of the Linux version of the game. My guess is -ffloat-store might be used by mistake when someone tries to get reproducible results on x86 without realizing that the same flags affects the performance negatively on x86-64. To prevent issues like this in the future I think GCC could do two things: 1. Ignore -ffloat-store when it doesn't affect the result of floating-point operations pretending that redundant loads/stores are optimized. 2. Issue a warning when -ffloat-store doesn't affect the result of floating-point operations. Because there is no point in using a flag which only effect is pessimizing code generation.
@item -ffloat-store @opindex ffloat-store Do not store floating-point variables in registers, and inhibit other options that might change whether a floating-point value is taken from a register or memory. I think it does what it says?
For avoiding issues with excess precision there is -fexcess-precision=standard now.
(In reply to Andreas Schwab from comment #2) > For avoiding issues with excess precision there is > -fexcess-precision=standard now. Only for C, for C++ we don't have it. And -fexcess-precision=standard actually doesn't avoid the excess precision, but just enforces it consistently.
(In reply to Richard Biener from comment #1) > @item -ffloat-store > @opindex ffloat-store > Do not store floating-point variables in registers, and inhibit other > options that might change whether a floating-point value is taken from a > register or memory. > > I think it does what it says? Yes, the behavior of the compiler and the documentation matches very well. The compiler works as intended. My report is not about a bug, but about a possible improvement. If ignoring or implementing a warning is considered undesirable, I would suggest expanding the documentation by clarifying the interaction between -ffloat-store and -mfpmath=sse. Something like this in the documentation would help: "If used together with -mfpmath=sse, -ffloat-store doesn't change the results of floating point operations. The only effect it has is severely pessimizing the generated code."
Just marking as a dup of bug 323 and moving on. *** This bug has been marked as a duplicate of bug 323 ***
(In reply to Richard Biener from comment #1) > @item -ffloat-store > @opindex ffloat-store > Do not store floating-point variables in registers, and inhibit other > options that might change whether a floating-point value is taken from a > register or memory. > > I think it does what it says? This is a follow-up for my previous comment. Perhaps I haven't explained myself properly, let me explain why I find the existing behavior a bit confusing. From the documentation on -ffloat-store: "This option prevents undesirable excess precision on machines such as the 68000 where the floating registers (of the 68881) keep more precision than a double is supposed to have. Similarly for the x86 architecture." When a person uses -ffloat-store the desired effect is not the additional loads/stores, but the reproducible results across different compiler version/optimization options. It just happened that the cheapest way to go so is adding additional loads/stores. I'm pretty sure most users would be in favor of removing extra loads/stores when they don't affect the results. I understand that perhaps there are reasons why -ffloat-store should work the way it works now. If this is true, I would recommend updating the documentation by reflecting the cases (if they exists) when one might want to use -ffloat-store on x86-64. From what I understand now using -ffloat-store on x86-64 is just a mistake.
Reopening. I don't think this is a dup of PR323. The problem is instead that the workaround for PR323 is silently accepted and applied even on targets for which the workaround isn't needed. That seems like a valid point to me, and if we don't have an earlier PR making the same point, I think we should keep this one open. I agree that at the very least the documentation could be improved (but I'm not volunteering :-))