Bug 113641 - [13/14/15 regression] 510.parest_r with PGO at O2 slower than GCC 12 (7% on Zen 3&2, 4% on CascadeLake) since r13-4272-g8caf155a3d6e23
Status: UNCONFIRMED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 14.0
Importance: P2 normal
Target Milestone: 13.4
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks: spec
Reported: 2024-01-28 17:26 UTC by Martin Jambor
Modified: 2024-12-28 23:21 UTC

See Also:
Host: x86_64-linux-gnu
Target: x86_64-linux-gnu
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments

Description Martin Jambor 2024-01-28 17:26:24 UTC
During the development of GCC 13, the run time of 510.parest_r regressed on x86_64 when built with profile-guided optimization at plain -O2: binaries built with GCC 13 and with master are slower than those built with GCC 12.  The difference is not big but fairly clear-cut, about 7.6% on Zen3:

https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=740.457.0&plot.1=892.457.0&plot.2=694.457.0&

and about 7.2% on Zen2:

https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=777.457.0&plot.1=932.457.0&plot.2=687.457.0&

The graphs above show runs with both LTO and PGO, but LTO is not necessary to reproduce the regression.

I bisected the regression to commit r13-4272-g8caf155a3d6e23 (i386: Only enable small loop unrolling in backend [PR 107692]).  On Intel CascadeLake, parest_r is also about 4% slower when compiled with this revision than with the previous one.
Comment 1 Jakub Jelinek 2024-05-21 09:18:51 UTC
GCC 13.3 is being released, retargeting bugs to GCC 13.4.