[Bug target/92190] [10 Regression] ICE in sp_valid_at, at config/i386/i386.c:6162 since r276648

ubizjak at gmail dot com gcc-bugzilla@gcc.gnu.org
Wed Nov 27 20:31:00 GMT 2019


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92190

--- Comment #8 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Liu Hao from comment #7)
> MSDN says 'the upper portions of YMM0-15 and ZMM0-15 are considered volatile
> and must be considered destroyed on function calls' explicitly [1].
> 
> I am not clear about the cause of OP's ICE, but I think it should conform to
> MSABI to emit VZEROUPPER in the epilog, followed by restoring XMM6 - XMM15,
> destroying their upper halves. Similar with the prolog.

The insertion of vzeroupper is not "invisible" to stack frame management code
any more, since vzeroupper is now defined as:

(insn 738 619 434 2 (parallel [
            (unspec_volatile [
                    (const_int 0 [0])
                ] UNSPECV_VZEROUPPER)
            (clobber (reg:V2DI 20 xmm0))
            (clobber (reg:V2DI 21 xmm1))
            (clobber (reg:V2DI 22 xmm2))
            (clobber (reg:V2DI 23 xmm3))
            (clobber (reg:V2DI 24 xmm4))
            (clobber (reg:V2DI 25 xmm5))
            (set (reg:V2DI 26 xmm6)
                (reg:V2DI 26 xmm6))
            (clobber (reg:V2DI 27 xmm7))
            (clobber (reg:V2DI 44 xmm8))
            (clobber (reg:V2DI 45 xmm9))
            (clobber (reg:V2DI 46 xmm10))
            (clobber (reg:V2DI 47 xmm11))
            (clobber (reg:V2DI 48 xmm12))
            (clobber (reg:V2DI 49 xmm13))
            (clobber (reg:V2DI 50 xmm14))
            (clobber (reg:V2DI 51 xmm15))
        ]) "pr92190.c":8:3 -1
     (nil))


The vzeroupper pass inserts this insn just after the reload pass, and all xmm
registers (xmm0 - xmm15) now become live. This is not a problem in the SYSV
ABI, where all xmm registers are call_used, but in the MS ABI the prologue now
tries to save xmm6 - xmm15 to the stack.
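
To make the ABI difference concrete, here is a minimal toy model in C. It is
not GCC code: the toy_* names and the bitmask representation are assumptions
made up for illustration. It only shows why touching all of xmm0 - xmm15
costs nothing in the SYSV ABI but forces ten saves in the MS ABI.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model, not GCC code: bit i of a mask stands for xmm<i>. */
typedef uint16_t toy_xmm_mask;

enum toy_abi { TOY_SYSV_ABI, TOY_MS_ABI };

/* Mask with only the bit for xmm<regno> set. */
static toy_xmm_mask toy_xmm_bit(int regno)
{
    assert(regno >= 0 && regno <= 15);
    return (toy_xmm_mask)(1u << regno);
}

/* Mask covering xmm<first> .. xmm<last>, inclusive. */
static toy_xmm_mask toy_xmm_range(int first, int last)
{
    toy_xmm_mask mask = 0;
    for (int regno = first; regno <= last; regno++)
        mask |= toy_xmm_bit(regno);
    return mask;
}

/* Registers the callee must preserve: xmm6 - xmm15 in the MS ABI,
   none of xmm0 - xmm15 in the SYSV ABI.  */
static toy_xmm_mask toy_callee_saved(enum toy_abi abi)
{
    return abi == TOY_MS_ABI ? toy_xmm_range(6, 15) : 0;
}

/* The prologue has to save every callee-saved register that the
   function body touches.  */
static toy_xmm_mask toy_prologue_saves(enum toy_abi abi, toy_xmm_mask touched)
{
    return (toy_xmm_mask)(touched & toy_callee_saved(abi));
}
```

With every register touched, i.e. toy_xmm_range (0, 15), the model returns an
empty save set for TOY_SYSV_ABI and xmm6 - xmm15 for TOY_MS_ABI.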

So, vzeroupper should be described in a way that won't trigger saves of xmm6 -
xmm15 to the stack, while still marking the high part of each register as
clobbered.

An alternative would be to consider the mode of a call_used register and save
only wide (> 128-bit) registers in the caller. I'm not sure if the current
implementation already clobbers the high part of a 256-bit register.
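
As a rough sketch of that alternative (again a toy model in C, not how GCC
implements or would implement it; the helper name and its parameters are
hypothetical), the caller-side decision in the MS ABI could depend on how
many bits of the register are live across the call:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not a GCC function.  Decide whether a caller in
   the MS ABI must save xmm<regno> around a call when LIVE_BITS bits of
   the register hold a live value.  */
static bool toy_ms_abi_caller_must_save(int regno, int live_bits)
{
    assert(regno >= 0 && regno <= 15);
    assert(live_bits == 128 || live_bits == 256 || live_bits == 512);

    /* xmm0 - xmm5 are call_used: the callee may clobber all of them.  */
    if (regno < 6)
        return true;

    /* For xmm6 - xmm15 the callee preserves only the low 128 bits, so
       only a wider live value needs a caller-side save.  */
    return live_bits > 128;
}
```

In this model a 128-bit value in xmm6 survives the call without a caller
save, while a 256-bit value in the same register does not.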

