Bug 110260 - Multiple applications misbehave at runtime when compiled with -march=znver4
Summary: Multiple applications misbehave at runtime when compiled with -march=znver4
Status: UNCONFIRMED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 13.1.1
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2023-06-14 21:22 UTC by Jimi Huotari
Modified: 2024-01-27 18:37 UTC
CC List: 6 users

See Also:
Host:
Target: x86_64
Build:
Known to work:
Known to fail:
Last reconfirmed:


Description Jimi Huotari 2023-06-14 21:22:21 UTC
While hunting for the cause of a stack alignment issue in Wine [1], it started to look more and more like '-march=znver4' is at least related to the issue.

I had mistakenly compiled a few hundred other packages with certain testing flags in between building Wine and the mingw-toolchain, and noticed a certain issue related to KWin disappear [2].

The issue came back the next day, after restarting X11, and it took me a while to realise what had happened.  My build logs showed that KWin was one of the packages I had compiled with the testing C{XX}FLAGS (-march=x86-64 -O2 -g), and that I had later compiled it again with the usual flags (-march=znver4 -O2 -fomit-frame-pointer -pipe -mindirect-branch=thunk).  I then did some test builds with -march=x86-64 and with -march=znver3, and with either of them the problem goes away.

There's at least one more recent issue with my LXQt panel task manager widget, but I'm not sure yet if this will make that go away as well.

Let me know if I can help pinpoint the issue further, and/or confirm whether or not this is a GCC issue.

Next I'm thinking of trying to see whether I can pin it on one or more of the instruction sets that znver4 enables but znver3 does not, per the x86-Options documentation [3].

(By the by, is ADCX a typo of ADX?  I see -madx as an option but only one use of it otherwise, and no -adcx as an option and lots of mentions of it... but perhaps I'm not reading them correct-like.)

Thank you!


1. https://bugs.winehq.org/show_bug.cgi?id=55007
2. https://bugs.kde.org/show_bug.cgi?id=469426
3. https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html
Comment 1 Andrew Pinski 2023-06-14 21:28:41 UTC
Could be multiple things really.
PR 109780, PR 109093, PR 109087 and PR 109982 are related to an alignment issue with -march=znver* even.
Comment 2 Sam James 2023-06-14 21:54:42 UTC
The case I've looked at the most is KWin (https://bugs.gentoo.org/895750) where we narrowed it down to https://bugs.kde.org/show_bug.cgi?id=460572#c28 with:
"""
To emphasise:
- -O2 -march=x86-64-v3 works
- -O2 -march=x86-64-v4 (or -march=rocketlake) triggers it
"""

It may not be the same as Chiitoo's bug here, but it probably is given the symptoms are the same & both likely involve AVX512.

I couldn't reproduce it on my machine with AVX512 (only one machine, Tiger Lake) and couldn't really do any more at that point. I guess a good start would be seeing what the minimal file(s) required to break kwin are w/ -march=znver4.
Comment 3 Richard Biener 2023-06-15 06:32:40 UTC
Can you try if -fno-schedule-insns helps in (all?) cases?
Comment 4 Alexander Monakov 2023-06-15 07:01:10 UTC
Um, sched1 is not enabled on x86 so -fno-schedule-insns does nothing — you probably meant -fno-schedule-insns2?

Another thing to try is -fstack-reuse=none, as indicated by comment #1.
Comment 5 Richard Biener 2023-06-15 07:04:31 UTC
(In reply to Alexander Monakov from comment #4)
> Um, sched1 is not enabled on x86 so -fno-schedule-insns does nothing — you
> probably meant -fno-schedule-insns2?

Yes, of course - thanks for correcting me.

> Another thing to try is -fstack-reuse=none, as indicated by comment #1.
Comment 6 Alexander Monakov 2023-06-15 07:08:15 UTC
(In reply to Jimi Huotari from comment #0)
> (By the by, is ADCX a typo of ADX?  I see -madx as an option but only one
> use of it otherwise, and no -adcx as an option and lots of mentions of it...
> but perhaps I'm not reading them correct-like.)

ADX is an x86 extension that adds two new instructions, ADCX and ADOX:
https://en.wikipedia.org/wiki/Intel_ADX
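
For illustration only, here is a rough portable C model of the add-with-carry step that both instructions perform. It is a sketch, not what GCC emits: the names `adc_result` and `add_with_carry` are made up for this example. On real hardware ADCX takes and produces its carry in CF while ADOX uses OF, which is what lets two independent carry chains be interleaved.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: 64-bit sum plus the carry out of bit 63. */
typedef struct {
    uint64_t sum;
    unsigned carry;
} adc_result;

/* Models one add-with-carry step: sum = a + b + carry_in (carry_in is 0 or 1).
 * ADCX and ADOX both compute this; they differ only in which flag
 * (CF or OF) holds the carry. */
adc_result add_with_carry(uint64_t a, uint64_t b, unsigned carry_in)
{
    adc_result r;
    uint64_t partial;
    unsigned c1, c2;

    assert(carry_in <= 1);
    partial = a + b;            /* may wrap around */
    c1 = partial < a;           /* wrapped on a + b */
    r.sum = partial + carry_in;
    c2 = r.sum < partial;       /* wrapped on adding the carry */
    r.carry = c1 | c2;          /* at most one of the two can wrap */
    return r;
}
```

Chaining the `carry` of one call into the `carry_in` of the next gives a multi-word addition, which is the use case the extension targets.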
Comment 7 Jimi Huotari 2023-06-15 09:53:30 UTC
(In reply to Alexander Monakov from comment #6)
> (In reply to Jimi Huotari from comment #0)
> > (By the by, is ADCX a typo of ADX?  I see -madx as an option but only one
> > use of it otherwise, and no -adcx as an option and lots of mentions of it...
> > but perhaps I'm not reading them correct-like.)
> 
> ADX is an x86 extension that adds two new instructions, ADCX and ADOX:
> https://en.wikipedia.org/wiki/Intel_ADX


Ah!  Thank you.


(In reply to Alexander Monakov from comment #4)
> Um, sched1 is not enabled on x86 so -fno-schedule-insns does nothing — you
> probably meant -fno-schedule-insns2?
> 
> Another thing to try is -fstack-reuse=none, as indicated by comment #1.


So far things look no different with either one or both of these set.  I'll test more as time permits, and see if I can find any better test-cases, and will also look into those bug reports mentioned here.

In the initial report, being in a bit of a rush, I forgot to actually describe the issues here (instead of just linking), so I'll do that here at least a bit.

KWin: This one shows up so that I'm unable to move any window past the top and left borders of the screen(s).  The windows also behave in a weird way when I try to push them past those borders and then resize them, making the adjustment 'jump' a lot at once.  This also makes the LXQt panel behave in strange ways when set to these locations; it will not animate its automatic hiding, and it acts as if it's always visible, even when hidden (so if clicking something below it while it is hidden, the click still goes onto the panel even though it's supposed to be hidden).

Wine: There are issues with stack alignment leading to application crashes triggered after a change made to the Wine code [1].

1. https://source.winehq.org/git/wine.git/commitdiff/62173699c38453777c7d5638ed2e779790506b75
Comment 8 Jimi Huotari 2023-06-15 11:17:41 UTC
Looks like with '-mno-avx512f' the issues go away for me.

As a sidey-note, this seems to also possibly affect KWin desktop effects (animated ones at least), which don't seem to be doing their thing (haven't been using them so had not noticed until trying now).  Have not yet confirmed that that is related though.
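
A quick way to double-check that a given set of flags really turns AVX-512F on or off for a translation unit might be something like the following sketch. It relies on the `__AVX512F__` macro that GCC predefines when AVX-512F code generation is enabled; the function name `avx512f_enabled_at_compile_time` is made up for this example.

```c
/* Returns 1 if this file was compiled with AVX-512F enabled
 * (e.g. via -march=znver4), 0 otherwise (e.g. with -mno-avx512f added). */
int avx512f_enabled_at_compile_time(void)
{
#ifdef __AVX512F__
    return 1;
#else
    return 0;
#endif
}
```

Building this once with the usual flags and once with '-mno-avx512f' appended should give different return values. Note that it only reports what the compiler was allowed to emit, not whether any AVX-512 instruction actually ended up in the binary.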
Comment 9 Zeb Figura 2023-06-15 18:06:26 UTC
Wine developer here—FWIW, I think these may ultimately be separate issues.

The problem we were seeing with Wine is that, on the i386-w64-mingw32 target, the stack alignment is supposed to be assumed to be only 4 (versus 16 on ELF or x86_64 targets), and so any function using SSE/AVX instructions needs to manually align the stack. In the cases where Wine was crashing, gcc was generating vmovdqa instructions without actually aligning the stack first, whereas without the -march=znver4 flag it apparently would align the stack and then generate SSE movdqa.
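
To illustrate the alignment arithmetic involved (a sketch only, not taken from Wine or from GCC's output): aligned moves such as movdqa/vmovdqa fault unless the memory operand is aligned to its size, 16 bytes for an xmm operand and 32 for ymm, so a function entered with only 4-byte stack alignment has to round the stack pointer down before spilling vectors. The helper names `is_aligned` and `align_down` below are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Returns nonzero if addr is a multiple of alignment (a power of two). */
int is_aligned(uintptr_t addr, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr & (alignment - 1)) == 0;
}

/* Rounds addr down to a multiple of alignment, which is what a
 * realigning prologue does to a downward-growing stack pointer
 * (the equivalent of "and $-alignment, %esp"). */
uintptr_t align_down(uintptr_t addr, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return addr & ~(alignment - 1);
}
```

For example, a stack pointer of 0x1ffc satisfies the 4-byte ABI alignment but not 16 or 32; `align_down(0x1ffc, 32)` gives 0x1fe0, which is safe for an aligned 32-byte access.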
Comment 10 Alexander Monakov 2023-06-15 19:48:06 UTC
Right, those are different issues. Any chance of a standalone testcase extracted from Wine? If you already see a function where stack realignment is missing, just give us preprocessed containing source, full gcc command line, and output of 'gcc -v', as described on https://gcc.gnu.org/bugs/

(please open a new bug with that, and mention the new bug # here)
Comment 11 Zeb Figura 2023-06-15 21:55:08 UTC
(In reply to Alexander Monakov from comment #10)
> Right, those are different issues. Any chance of a standalone testcase
> extracted from Wine? If you already see a function where stack realignment
> is missing, just give us preprocessed containing source, full gcc command
> line, and output of 'gcc -v', as described on https://gcc.gnu.org/bugs/
> 
> (please open a new bug with that, and mention the new bug # here)

I've filed bug 110273 for the Wine misaligned stack problem.