Bug 65782 - Assembly failure (invalid register for .seh_savexmm) with -O3 -mavx512f on mingw-w64
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 5.0
Importance: P3 normal
Target Milestone: 8.4
Assignee: Jakub Jelinek
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2015-04-16 05:09 UTC by James Almer
Modified: 2020-02-14 17:32 UTC
CC List: 5 users

See Also:
Host:
Target: x86_64-w64-mingw32
Build:
Known to work:
Known to fail: 4.9.2, 5.0
Last reconfirmed: 2015-09-22 00:00:00


Attachments
Assembly output (15.72 KB, text/plain)
2015-04-16 05:09 UTC, James Almer

Description James Almer 2015-04-16 05:09:51 UTC
Created attachment 35328
Assembly output

https://raw.githubusercontent.com/foo86/dcadec/4dac90072f1a0ad368430dbbb568ac71def0241f/libdcadec/idct_float.c

Cross-compiler built from GCC 5.1.0 RC and mingw-w64 v4.0.1. The failure can also be reproduced with GCC 4.9.

[jamrial@archVM dcadec]$ x86_64-w64-mingw32-gcc -O3 -mavx512f -c -o libdcadec/idct_float.o libdcadec/idct_float.c
/tmp/ccGUgVPR.s: Assembler messages:
/tmp/ccGUgVPR.s:557: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:559: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:561: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:563: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:565: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:567: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:569: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:571: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:573: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:575: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:577: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:579: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:581: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:583: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:585: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:587: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:1482: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:1484: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:1486: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:1488: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:1490: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:1492: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:1494: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:1496: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:1498: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:1500: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:1502: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:1504: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:1506: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:1508: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:1510: Error: invalid register for .seh_savexmm
/tmp/ccGUgVPR.s:1512: Error: invalid register for .seh_savexmm

[jamrial@archVM dcadec]$ x86_64-w64-mingw32-gcc -v
Using built-in specs.
COLLECT_GCC=x86_64-w64-mingw32-gcc
COLLECT_LTO_WRAPPER=/opt/mingw64/lib/gcc/x86_64-w64-mingw32/5.0.1/lto-wrapper
Target: x86_64-w64-mingw32
Configured with: /home/jamrial/gcc-5.1.0-RC-20150412/configure --host=x86_64-unknown-linux-gnu --build=x86_64-unknown-linux-gnu --target=x86_64-w64-mingw32 --disable-multilib --enable-static --disable-shared --enable-64bit --prefix=/opt/mingw64 --with-sysroot=/opt/mingw64 --enable-version-specific-runtime-libs --with-dwarf --enable-fully-dynamic-string --enable-languages=c,c++ --enable-libssp --with-host-libstdcxx='-lstdc++ -lsupc++' --enable-lto --disable-win32-registry --libexecdir=/opt/mingw64/lib --disable-nls
Thread model: win32
gcc version 5.0.1 20150412 (prerelease) (GCC)

Attached is the resulting assembly file.
Comment 1 Kai Tietz 2015-09-22 10:54:31 UTC
This issue lies in GCC's output of the SEH prologue pseudo-ops. It tries to emit registers that are not among the supported 8-byte SSE ones.

In general, AVX512 can't be supported in a 32-byte aligned way on the x64 target anyway.
Comment 2 Jakub Jelinek 2017-01-20 16:31:11 UTC
For the xmm16 to xmm31 registers, a possible workaround could be to make those registers fixed on mingw (and thus unavailable for register allocation and not call-saved).  See PR79127.
Comment 3 Daniel Fruzynski 2019-01-01 21:31:32 UTC
Cygwin (x86_64-pc-cygwin) is also affected. I have encountered this bug on gcc 7.4.0.

Could you add a new option that removes the XMM16+ registers from the pool of available registers? It could serve as an easy workaround until this is fixed properly.
Comment 4 Daniel Fruzynski 2019-01-01 22:58:56 UTC
I have found that I can use the -ffixed-reg option for this. It removes only one register at a time, so I have to pass it 16 times to remove all of xmm16..31. It would be handy to have another option that disables all registers in this group at once.
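
The sixteen flags can be generated rather than typed by hand. A minimal sketch follows; the helper name build_fixed_xmm_flags is hypothetical, and the register spelling xmm16..xmm31 after -ffixed- is assumed from the comment above, not checked against every GCC version.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write "-ffixed-xmm16 -ffixed-xmm17 ... -ffixed-xmm31" into buf.
   Returns the string length (excluding the NUL terminator), or -1 if
   buf is too small.  The "xmmN" spelling is an assumption.  */
int
build_fixed_xmm_flags (char *buf, size_t size)
{
  size_t used = 0;

  assert (buf != NULL);
  for (int reg = 16; reg <= 31; reg++)
    {
      /* Separate flags with a single space, no leading space.  */
      int n = snprintf (buf + used, size - used, "%s-ffixed-xmm%d",
                        reg == 16 ? "" : " ", reg);
      if (n < 0 || (size_t) n >= size - used)
        return -1;
      used += (size_t) n;
    }
  return (int) used;
}
```

The resulting string could then be appended to CFLAGS for the affected translation units only, since fixing sixteen registers reduces what the register allocator can use.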
Comment 5 Daniel Fruzynski 2019-01-01 23:29:29 UTC
I was pointed to the following link: https://stackoverflow.com/questions/53733624/is-xmm8-register-value-preserved-across-calls/53733767#53733767

Quote from it: "Any additional registers for newer instruction sets are volatile by default. This includes the upper parts of YMM0-15 and ZMM0-15 as well as ?MM16-31 if present.".

So it looks like gcc should not generate .seh_savexmm for xmm16..31 at all.
Comment 6 Agner Fog 2019-04-29 16:01:50 UTC
I get the same error with G++ 7.4.0 on Cygwin when compiling with the options -mavx512vl -m64.

A workaround is to use -fno-asynchronous-unwind-tables.

Registers xmm16-31 should be considered clobbered in Win64. See https://stackoverflow.com/questions/43152633/invalid-register-for-seh-savexmm-in-cygwin
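
The rule quoted in the comments above (only the low 128 bits of xmm6-15 are preserved across calls, everything else in the vector register file is volatile) can be written down as a small predicate. This is only an illustrative sketch with hypothetical function names. It is not how GCC models the ABI internally.

```c
#include <assert.h>

/* Number of low-order bytes of vector register REGNO (0..31) that a
   callee must preserve under the Microsoft x64 calling convention,
   as described in the comments above: 16 for xmm6-xmm15, 0 otherwise.
   The upper parts of ymm/zmm registers are never preserved.  */
int
win64_vector_preserved_bytes (int regno)
{
  assert (regno >= 0 && regno <= 31);
  if (regno >= 6 && regno <= 15)
    return 16;
  return 0;
}

/* A prologue save with a matching .seh_savexmm entry is only
   meaningful for a register the function uses and must preserve.  */
int
win64_needs_seh_savexmm (int regno, int used_in_function)
{
  return used_in_function && win64_vector_preserved_bytes (regno) > 0;
}
```

Under this model xmm18 never needs a save, which matches the conclusion that no .seh_savexmm should be emitted for it.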
Comment 7 GCC Commits 2020-02-08 10:05:05 UTC
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:79ab8c4321b2dc940bb706a7432a530e26f0df1a

commit r10-6522-g79ab8c4321b2dc940bb706a7432a530e26f0df1a
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Sat Feb 8 10:59:40 2020 +0100

    i386: Make xmm16-xmm31 call used even in ms ABI [PR65782]
    
    On Tue, Feb 04, 2020 at 11:16:06AM +0100, Uros Bizjak wrote:
    > I guess that Comment #9 patch from the PR should be trivially correct,
    > but although it looks obvious, I don't want to propose the patch since
    > I have no means of testing it.
    
    I don't have means of testing it either.
    https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=vs-2019
    is quite explicit that [xyz]mm16-31 are call clobbered and only xmm6-15 (low
    128-bits only) are call preserved.
    
    We are talking e.g. about
    /* { dg-options "-O2 -mabi=ms -mavx512vl" } */
    
    typedef double V __attribute__((vector_size (16)));
    void foo (void);
    V bar (void);
    void baz (V);
    void
    qux (void)
    {
      V c;
      {
        register V a __asm ("xmm18");
        V b = bar ();
        asm ("" : "=x" (a) : "0" (b));
        c = a;
      }
      foo ();
      {
        register V d __asm ("xmm18");
        V e;
        d = c;
        asm ("" : "=x" (e) : "0" (d));
        baz (e);
      }
    }
    where according to the MSDN doc gcc incorrectly holds the c value
    in xmm18 register across the foo call; if foo is compiled by some Microsoft
    compiler (or LLVM), then it could clobber %xmm18.
    If all xmm18 occurrences are changed to say xmm15, then it is valid to hold
    the 128-bit value across the foo call (though, surprisingly, LLVM saves it
    into stack anyway).
    
    The other parts are I guess mainly about SEH.  Consider e.g.
    void
    foo (void)
    {
      register double x __asm ("xmm14");
      register double y __asm ("xmm18");
      asm ("" : "=x" (x));
      asm ("" : "=v" (y));
      x += y;
      y += x;
      asm ("" : : "x" (x));
      asm ("" : : "v" (y));
    }
    looking at cross-compiler output, with -O2 -mavx512f this emits
    	.file	"abcdeq.c"
    	.text
    	.align 16
    	.globl	foo
    	.def	foo;	.scl	2;	.type	32;	.endef
    	.seh_proc	foo
    foo:
    	subq	$40, %rsp
    	.seh_stackalloc	40
    	vmovaps %xmm14,	(%rsp)
    	.seh_savexmm	%xmm14, 0
    	vmovaps %xmm18,	16(%rsp)
    	.seh_savexmm	%xmm18, 16
    	.seh_endprologue
    	vaddsd	%xmm18, %xmm14, %xmm14
    	vaddsd	%xmm18, %xmm14, %xmm18
    	vmovaps	(%rsp), %xmm14
    	vmovaps	16(%rsp), %xmm18
    	addq	$40, %rsp
    	ret
    	.seh_endproc
    	.ident	"GCC: (GNU) 10.0.1 20200207 (experimental)"
    Does whatever assembler mingw64 uses even assemble this (I mean the
    .seh_savexmm %xmm18, 16 could be problematic)?
    I can find e.g.
    https://stackoverflow.com/questions/43152633/invalid-register-for-seh-savexmm-in-cygwin/43210527
    which then links to
    https://gcc.gnu.org/PR65782
    
    2020-02-08  Uroš Bizjak  <ubizjak@gmail.com>
    	    Jakub Jelinek  <jakub@redhat.com>
    
    	PR target/65782
    	* config/i386/i386.h (CALL_USED_REGISTERS): Make
    	xmm16-xmm31 call-used even in 64-bit ms-abi.
    
    	* gcc.target/i386/pr65782.c: New test.
    
    Co-authored-by: Uroš Bizjak <ubizjak@gmail.com>
Comment 8 Kai Tietz 2020-02-09 09:18:31 UTC
Hmm, that behavior of gcc does indeed seem pretty bad. The SEH directives for xmm registers with an index above 15 (the valid range is 0..15) are undefined and, even worse, can't be encoded correctly in the SEH table.
Anything beyond the low 16 bytes of a ?mm register, and any register with an index above 15, has to be treated as call clobbered. In any case, the unwind information must not contain entries for them.
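
To illustrate why such a register can't be encoded: in the documented x64 unwind data, each unwind code stores the operation in the low 4 bits of a byte and the register number in the high 4 bits, so only indices 0..15 fit. The sketch below packs one UWOP_SAVE_XMM128 entry under that layout. The function name and the flat 4-byte output buffer are assumptions made for illustration, and the far-offset variant of the operation is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UWOP_SAVE_XMM128 8u   /* unwind operation code for a 128-bit XMM save */

/* Pack one save-XMM unwind entry into OUT:
     out[0]    offset in the prologue
     out[1]    operation (low nibble) | register number (high nibble)
     out[2..3] frame offset divided by 16, little-endian
   Returns 0 on success, or -1 if the register or offset cannot be
   represented in this form.  */
int
encode_save_xmm128 (unsigned regno, uint8_t prolog_offset,
                    uint32_t frame_offset, uint8_t out[4])
{
  assert (out != NULL);
  if (regno > 15)               /* only 4 bits are available */
    return -1;
  if (frame_offset % 16 != 0 || frame_offset / 16 > 0xFFFF)
    return -1;

  uint16_t scaled = (uint16_t) (frame_offset / 16);
  out[0] = prolog_offset;
  out[1] = (uint8_t) (UWOP_SAVE_XMM128 | (regno << 4));
  out[2] = (uint8_t) (scaled & 0xFF);
  out[3] = (uint8_t) (scaled >> 8);
  return 0;
}
```

With this layout, encoding xmm14 succeeds while xmm18 is rejected, which mirrors the assembler's "invalid register for .seh_savexmm" error.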
Comment 9 GCC Commits 2020-02-13 22:33:02 UTC
The releases/gcc-9 branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:a91e5d88970c8d865a49f2a4ed4e17ee2c58b73f

commit r9-8222-ga91e5d88970c8d865a49f2a4ed4e17ee2c58b73f
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Sat Feb 8 10:59:40 2020 +0100

    i386: Make xmm16-xmm31 call used even in ms ABI [PR65782]
    
Comment 10 GCC Commits 2020-02-14 16:39:05 UTC
The releases/gcc-8 branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:baef3efdc4992e4dcb7f4de62ff5bcb13bf05f60

commit r8-10016-gbaef3efdc4992e4dcb7f4de62ff5bcb13bf05f60
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Fri Feb 14 15:47:55 2020 +0100

    i386: Make xmm16-xmm31 call used even in ms ABI [PR65782]
    
Comment 11 Jakub Jelinek 2020-02-14 17:32:08 UTC
Fixed on the trunk and all release branches.