Hello, Attached test reproduces the error: $ gcc -m32 -mmmx 1.c $ ./a.out Aborted (core dumped) Disassembly of the foo is: foo32x2_be: .LFB0: pushl %ebp # 22 *pushsi2 [length = 1] movl %esp, %ebp # 23 *movsi_internal/1 [length = 2] subl $16, %esp # 24 pro_epilogue_adjust_stack_si_add/1 [length = 3] movq %mm0, -8(%ebp) # 3 *movv2sf_internal/9 [length = 4] movl -4(%ebp), %eax # 7 *movsf_internal/4 [length = 3] movl %eax, -12(%ebp) # 14 *movsf_internal/5 [length = 3] flds -12(%ebp) # 21 *movsf_internal/1 [length = 3] leave # 27 leave [length = 1] ret # 28 simple_return_internal [length = 1] We're passing v2sf vector using MMX register, which aliased to x87 stack. Then we're trying to load FP to it, which leds to NaN. As far as I understand, we need `emms' instruction between last MMX use and before first x87 use. Reproduces everywhere, up to 4.7.2 (may be earlier, I have no such).
There is no testcase attached, but you need to *manually* insert _mm_empty (== emms) to switch from MMX to x87 state. The compiler does not automatically insert emms for you.
Created attachment 31389 [details] Testcase
(In reply to Uroš Bizjak from comment #1) > There is no testcase attached, but you need to *manually* insert _mm_empty > (== emms) to switch from MMX to x87 state. > > The compiler does not automatically insert emms for you. Well, then problem is different, test as simple as empty call. I doubt we should emit wrong code here.
(In reply to Yukhin Kirill from comment #3) > (In reply to Uroš Bizjak from comment #1) > > There is no testcase attached, but you need to *manually* insert _mm_empty > > (== emms) to switch from MMX to x87 state. > > > > The compiler does not automatically insert emms for you. > > Well, then problem is different, test as simple as empty call. > I doubt we should emit wrong code here. You need: float foo32x2_be (float32x2_t x) { __builtin_ia32_emms(); return x[1]; } This is documented somewhere in Intel's ISA manual, so the testcase is probably invalid.
I see. So, it seems like a limitation to passing vectors as arguments in 32-bit mode. We may implement something similar to `vzerroupper' autogeneration or simply close the bug as `user misunderstanding.'
I think this is a dup of PR48397.
(In reply to H.J. Lu from comment #6) > I think this is a dup of PR48397. No, this one happens due to missing interdependencies between x87 and MMX registers. We could make all MMX instructions dependant on st(0) register, but this would mean that no MMX instruction whatsoever will be reordered.
(In reply to Yukhin Kirill from comment #5) > I see. So, it seems like a limitation to passing vectors as arguments in > 32-bit mode. We may implement something similar to `vzerroupper' > autogeneration or simply close the bug as `user misunderstanding.' We already tried this. Please see PR19161. There is however, one problem - gcc warns about SSE vector argument without SSE on 32bit targets. I will fix type_natural_mode.
Author: uros Date: Fri Dec 6 17:16:52 2013 New Revision: 205753 URL: http://gcc.gnu.org/viewcvs?rev=205753&root=gcc&view=rev Log: PR target/59405 * config/i386/i386.c (type_natural_mode): Properly handle size 8 for !TARGET_64BIT. testsuite/ChangeLog: PR target/59405 * gcc.target/i386/pr59405.c: New test. Added: trunk/gcc/testsuite/gcc.target/i386/pr59405.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/i386.c trunk/gcc/testsuite/ChangeLog
The warning is fixed, the testcase without _mm_empty () is invalid.
Author: uros Date: Sun Dec 8 14:20:42 2013 New Revision: 205790 URL: http://gcc.gnu.org/viewcvs?rev=205790&root=gcc&view=rev Log: Backport from mainline 2013-12-06 Uros Bizjak <ubizjak@gmail.com> PR target/59405 * config/i386/i386.c (type_natural_mode): Properly handle size 8 for !TARGET_64BIT. testsuite/ChangeLog Backport from mainline 2013-12-06 Uros Bizjak <ubizjak@gmail.com> PR target/59405 * gcc.target/i386/pr59405.c: New test. Added: branches/gcc-4_8-branch/gcc/testsuite/gcc.target/i386/pr59405.c Modified: branches/gcc-4_8-branch/gcc/ChangeLog branches/gcc-4_8-branch/gcc/config/i386/i386.c branches/gcc-4_8-branch/gcc/testsuite/ChangeLog
The warning was a regression from 4.7, now fixed on 4.8+.