Bug 94400 - 531.deepsjeng_r is 7% slower at -O2 -march=znver2 than GCC 9
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 10.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks: spec
Reported: 2020-03-30 10:04 UTC by Martin Jambor
Modified: 2021-02-04 16:47 UTC

See Also:
Host: x86_64-linux
Target: x86_64-linux
Build:
Known to work:
Known to fail:
Last reconfirmed: 2020-03-30 00:00:00


Description Martin Jambor 2020-03-30 10:04:03 UTC
When compiled with -O2 -march=native and run on an AMD Zen2 CPU,
531.deepsjeng_r runs about 7% slower than when built with GCC 9.  The
regression can be bisected to a single commit:

commit a9a4edf0e71bbac9f1b5dcecdcf9250111d16889
Author: Jan Hubicka <hubicka@ucw.cz>
Date:   Sat Nov 30 22:25:24 2019 +0100

    Update max_bb_count in execute_fixup_cfg
    
            * tree-cfg.c (execute_fixup_cfg): Update also max_bb_count when
            scaling happen.
    
    From-SVN: r278879

Surprisingly, I cannot see a similar problem on an Intel Cascade Lake
server CPU, but I have confirmed the above on two different Rome
systems (one running SLES, one openSUSE Tumbleweed).
Comment 1 Martin Liška 2020-03-30 10:10:27 UTC
I can confirm this on the LNT znver2 machine, but the bisection points to a different commit:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=324.387.0&plot.1=311.387.0&plot.2=348.387.0&plot.3=280.387.0&plot.4=297.387.0&

The LNT znver1 machine, by contrast, is not affected, and its speed is similar to GCC 9:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=145.387.0&plot.1=49.387.0&plot.2=79.387.0&plot.3=259.387.0&plot.4=29.387.0&
Comment 2 Martin Jambor 2021-02-04 16:47:02 UTC
The regression has dropped to 1.9% according to my own measurements, which also match LNT (linked above).  It is peculiar to an unusual option combination and specific to Zen2 (I cannot see it on Zen3 or Cascade Lake), so I think it is unreasonable to expect that anybody will actually want to work on it.  The PR really is mostly fixed, so let me close it as such.