LNT reports that 541.leela_r from the SPEC 2017 intrate suite regressed on all machines in the first week of January when compiled with both PGO and LTO at -Ofast -march=native:

zen3: https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=477.397.0
zen2: https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=286.397.0
zen1: https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=17.397.0
kaby: https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=16.397.0

On my zen2 desktop I have bisected the regression, or at least most of it, to r12-6208-gebc853deb7cc04:

ebc853deb7cc0487de9ef6e891a007ba853d1933 is the first bad commit
commit ebc853deb7cc0487de9ef6e891a007ba853d1933
Author: Richard Biener <rguenther@suse.de>
Date:   Tue Jan 4 11:59:35 2022 +0100

    tree-optimization/103690 - not up-to-date SSA and PRE DCE

    This avoids running simple_dce_from_worklist on partially not
    up-to-date SSA form (in unreachable code regions) by scheduling
    CFG cleanup manually as is done anyway when tail-merging runs.

    2022-01-04  Richard Biener  <rguenther@suse.de>

            PR tree-optimization/103690
            * tree-pass.h (tail_merge_optimize): Adjust.
            * tree-ssa-tail-merge.c (tail_merge_optimize): Pass in whether
            to re-split critical edges, move CFG cleanup ...
            * tree-ssa-pre.c (pass_pre::execute): ... here, before
            simple_dce_from_worklist and delay freeing inserted_exprs from
            ...
            (fini_pre): .. here.
OK, so the only effect I can think of is that simple_dce_from_worklist can end up removing the last stmt in a BB and thus _eventually_ expose BB merging CFG cleanup opportunities. I also notice that while tail_merge_optimize altered todo by clearing TODO_cleanup_cfg, PRE just did (and still does)

-      todo |= tail_merge_optimize (todo);
+      todo |= tail_merge_optimize (todo, need_crit_edge_split);

so it would have retained TODO_cleanup_cfg, something we now do not.

The code is all somewhat of a mess due to the embedded tail-merge, and I tried to make as few changes as possible this late in the cycle. I'll try to reproduce and see if keeping TODO_cleanup_cfg around helps.
diff --git a/gcc/tree-ssa-pre.c b/gcc/tree-ssa-pre.c
index ab24fa98a1f..2bdfae5482f 100644
--- a/gcc/tree-ssa-pre.c
+++ b/gcc/tree-ssa-pre.c
@@ -4442,7 +4442,6 @@ pass_pre::execute (function *fun)
   if (todo & TODO_cleanup_cfg)
     {
       cleanup_tree_cfg ();
-      todo &= ~TODO_cleanup_cfg;
       need_crit_edge_split = true;
     }

should fix that
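
For readers following the flag handling, here is a minimal standalone sketch of why `todo |= callee (todo)` keeps a bit that the callee clears only in its own copy, and why clearing it in the caller first loses it. This is not GCC code: the constants `kTodoCleanupCfg` and `kTodoUpdateSsa`, their values, and all function names are hypothetical stand-ins, and the callee is assumed to take the flags by value and return them with the cleanup bit cleared.

```cpp
#include <cstdint>

// Hypothetical stand-ins for GCC's TODO_* bits; the values are illustrative only.
constexpr std::uint32_t kTodoCleanupCfg = 1u << 0;
constexpr std::uint32_t kTodoUpdateSsa  = 1u << 1;

// Assumed model of the callee: it works on its own copy of the flags,
// consumes the cleanup request there and returns what is left.
std::uint32_t callee_clears_cleanup(std::uint32_t todo) {
    todo &= ~kTodoCleanupCfg;
    return todo;
}

// Flow before the bisected commit: OR-ing the result back into the
// caller's flags cannot drop a bit the caller already holds.
std::uint32_t old_flow(std::uint32_t todo) {
    todo |= callee_clears_cleanup(todo);
    return todo;
}

// Regressed flow: the caller handles the cleanup itself and clears the
// bit before the call, so nothing can bring it back afterwards.
std::uint32_t regressed_flow(std::uint32_t todo) {
    if (todo & kTodoCleanupCfg) {
        // (CFG cleanup would run here)
        todo &= ~kTodoCleanupCfg;
    }
    todo |= callee_clears_cleanup(todo);
    return todo;
}

// Flow with the one-line patch: the caller still runs the cleanup but
// no longer clears the bit, so a later cleanup remains requested.
std::uint32_t fixed_flow(std::uint32_t todo) {
    if (todo & kTodoCleanupCfg) {
        // (CFG cleanup would run here)
    }
    todo |= callee_clears_cleanup(todo);
    return todo;
}
```

Under these assumptions, calling each function with `kTodoCleanupCfg | kTodoUpdateSsa` leaves the cleanup bit set in `old_flow` and `fixed_flow` and cleared in `regressed_flow`, which matches the behaviour described in the comment above.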
(In reply to Richard Biener from comment #2)
>
> should fix that

I can confirm that it does.
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:2f62294dec1f3af59dd7505c058b0af38c2d1524

commit r12-6527-g2f62294dec1f3af59dd7505c058b0af38c2d1524
Author: Richard Biener <rguenther@suse.de>
Date:   Wed Jan 12 15:25:07 2022 +0100

    tree-optimization/103990 - fix CFG cleanup regression from PRE change

    This adjusts the CFG cleanup flow back to what it was before the
    last change which fixes the observed regression of 541.leela_r
    with LTO and FDO.

    2022-01-12  Richard Biener  <rguenther@suse.de>

            PR tree-optimization/103990
            * tree-pass.h (tail_merge_optimize): Drop unused argument.
            * tree-ssa-tail-merge.c (tail_merge_optimize): Likewise.
            * tree-ssa-pre.c (pass_pre::execute): Retain TODO_cleanup_cfg
            and adjust call to tail_merge_optimize.
Fixed.