Bug 59405 - Incorrect FP<->MMX transition during call/ret
Summary: Incorrect FP<->MMX transition during call/ret
Status: RESOLVED INVALID
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 4.9.0
: P3 normal
Target Milestone: 4.8.3
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-12-06 11:05 UTC by Yukhin Kirill
Modified: 2013-12-08 14:22 UTC (History)
5 users (show)

See Also:
Host:
Target: i?86-*-*
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments
Testcase (165 bytes, text/plain)
2013-12-06 11:18 UTC, Yukhin Kirill
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Yukhin Kirill 2013-12-06 11:05:09 UTC
Hello,
Attached test reproduces the error:
  $ gcc -m32 -mmmx 1.c
  $ ./a.out
  Aborted (core dumped)

Disassembly of the foo is:
foo32x2_be:
.LFB0:
        pushl   %ebp    # 22    *pushsi2        [length = 1]
        movl    %esp, %ebp      # 23    *movsi_internal/1       [length = 2]
        subl    $16, %esp       # 24    pro_epilogue_adjust_stack_si_add/1      [length = 3]
        movq    %mm0, -8(%ebp)  # 3     *movv2sf_internal/9     [length = 4]
        movl    -4(%ebp), %eax  # 7     *movsf_internal/4       [length = 3]
        movl    %eax, -12(%ebp) # 14    *movsf_internal/5       [length = 3]
        flds    -12(%ebp)       # 21    *movsf_internal/1       [length = 3]
        leave   # 27    leave   [length = 1]
        ret     # 28    simple_return_internal  [length = 1]

We're passing v2sf vector using MMX register, which aliased to x87 stack.
Then we're trying to load FP to it, which leds to NaN.

As far as I understand, we need `emms' instruction between last MMX use and
before first x87 use.

Reproduces everywhere, up to 4.7.2 (may be earlier, I have no such).
Comment 1 Uroš Bizjak 2013-12-06 11:17:38 UTC
There is no testcase attached, but you need to *manually* insert _mm_empty (== emms) to switch from MMX to x87 state.

The compiler does not automatically insert emms for you.
Comment 2 Yukhin Kirill 2013-12-06 11:18:37 UTC
Created attachment 31389 [details]
Testcase
Comment 3 Yukhin Kirill 2013-12-06 11:19:36 UTC
(In reply to Uroš Bizjak from comment #1)
> There is no testcase attached, but you need to *manually* insert _mm_empty
> (== emms) to switch from MMX to x87 state.
> 
> The compiler does not automatically insert emms for you.

Well, then problem is different, test as simple as empty call.
I doubt we should emit wrong code here.
Comment 4 Uroš Bizjak 2013-12-06 11:25:11 UTC
(In reply to Yukhin Kirill from comment #3)
> (In reply to Uroš Bizjak from comment #1)
> > There is no testcase attached, but you need to *manually* insert _mm_empty
> > (== emms) to switch from MMX to x87 state.
> > 
> > The compiler does not automatically insert emms for you.
> 
> Well, then problem is different, test as simple as empty call.
> I doubt we should emit wrong code here.

You need:

float
foo32x2_be (float32x2_t x)
{
  __builtin_ia32_emms();
  return x[1];
}

This is documented somewhere in Intel's ISA manual, so the testcase is probably invalid.
Comment 5 Yukhin Kirill 2013-12-06 11:47:10 UTC
I see. So, it seems like a limitation to passing vectors as arguments in 32-bit mode. We may implement something similar to `vzerroupper' autogeneration or simply close the bug as `user misunderstanding.'
Comment 6 H.J. Lu 2013-12-06 12:00:36 UTC
I think this is a dup of PR48397.
Comment 7 Uroš Bizjak 2013-12-06 12:30:37 UTC
(In reply to H.J. Lu from comment #6)
> I think this is a dup of PR48397.

No, this one happens due to missing interdependencies between x87 and MMX registers. We could make all MMX instructions dependant on st(0) register, but this would mean that no MMX instruction whatsoever will be reordered.
Comment 8 Uroš Bizjak 2013-12-06 12:36:35 UTC
(In reply to Yukhin Kirill from comment #5)
> I see. So, it seems like a limitation to passing vectors as arguments in
> 32-bit mode. We may implement something similar to `vzerroupper'
> autogeneration or simply close the bug as `user misunderstanding.'

We already tried this. Please see PR19161.

There is however, one problem - gcc warns about SSE vector argument without SSE on 32bit targets. I will fix type_natural_mode.
Comment 9 uros 2013-12-06 17:16:55 UTC
Author: uros
Date: Fri Dec  6 17:16:52 2013
New Revision: 205753

URL: http://gcc.gnu.org/viewcvs?rev=205753&root=gcc&view=rev
Log:
	PR target/59405
	* config/i386/i386.c (type_natural_mode): Properly handle
	size 8 for !TARGET_64BIT.

testsuite/ChangeLog:

	PR target/59405
	* gcc.target/i386/pr59405.c: New test.


Added:
    trunk/gcc/testsuite/gcc.target/i386/pr59405.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/testsuite/ChangeLog
Comment 10 Uroš Bizjak 2013-12-06 17:27:09 UTC
The warning is fixed, the testcase without _mm_empty () is invalid.
Comment 11 uros 2013-12-08 14:20:45 UTC
Author: uros
Date: Sun Dec  8 14:20:42 2013
New Revision: 205790

URL: http://gcc.gnu.org/viewcvs?rev=205790&root=gcc&view=rev
Log:
	Backport from mainline
	2013-12-06  Uros Bizjak  <ubizjak@gmail.com>

	PR target/59405
	* config/i386/i386.c (type_natural_mode): Properly handle
	size 8 for !TARGET_64BIT.

testsuite/ChangeLog

	Backport from mainline
	2013-12-06  Uros Bizjak  <ubizjak@gmail.com>

	PR target/59405
	* gcc.target/i386/pr59405.c: New test.


Added:
    branches/gcc-4_8-branch/gcc/testsuite/gcc.target/i386/pr59405.c
Modified:
    branches/gcc-4_8-branch/gcc/ChangeLog
    branches/gcc-4_8-branch/gcc/config/i386/i386.c
    branches/gcc-4_8-branch/gcc/testsuite/ChangeLog
Comment 12 Uroš Bizjak 2013-12-08 14:22:27 UTC
The warning was a regression from 4.7, now fixed on 4.8+.