[Bug middle-end/102997] [12 Regression] 45% 454.calculix regression with LTO+PGO -march=native -Ofast on Zen since r12-4526-gd8edfadfc7a9795b65177a50ce44fd348858e844

hubicka at kam dot mff.cuni.cz gcc-bugzilla@gcc.gnu.org
Mon Nov 8 08:33:21 GMT 2021


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102997

--- Comment #16 from hubicka at kam dot mff.cuni.cz ---
Note that it still seems to me that the crossed_loop_header handling is
overly conservative.  We have:

@ -2771,6 +2771,7 @@ jt_path_registry::cancel_invalid_paths
(vec<jump_thread_edge *> &path)
   bool seen_latch = false;
   int loops_crossed = 0;
   bool crossed_latch = false;
+  bool crossed_loop_header = false;
   // Use ->dest here instead of ->src to ignore the first block.  The
   // first block is allowed to be in a different loop, since it'll be
   // redirected.  See similar comment in profitable_path_p: "we don't
@@ -2804,6 +2805,14 @@ jt_path_registry::cancel_invalid_paths
(vec<jump_thread_edge *> &path)
          ++loops_crossed;
        }

+      // ?? Avoid threading through loop headers that remain in the
+      // loop, as such threadings tend to create sub-loops which
+      // _might_ be OK ??.
+      if (e->dest->loop_father->header == e->dest
+         && !flow_loop_nested_p (exit->dest->loop_father,
+                                 e->dest->loop_father))
+       crossed_loop_header = true;
+
       if (flag_checking && !m_backedge_threads)
        gcc_assert ((path[i]->e->flags & EDGE_DFS_BACK) == 0);
     }
@@ -2829,6 +2838,21 @@ jt_path_registry::cancel_invalid_paths
(vec<jump_thread_edge *> &path)
       cancel_thread (&path, "Path crosses loops");
       return true;
     }
+  // The path should either start and end in the same loop or exit the
+  // loop it starts in but never enter a loop.  This also catches
+  // creating irreducible loops, not only rotation.
+  if (entry->src->loop_father != exit->dest->loop_father
+      && !flow_loop_nested_p (exit->src->loop_father,
+                             entry->dest->loop_father))
+    {
+      cancel_thread (&path, "Path rotates loop");
+      return true;
+    }
+  if (crossed_loop_header)
+    {
+      cancel_thread (&path, "Path crosses loop header but does not exit it");
+      return true;
+    }
   return false;
 }

If I read it correctly, for a path that enters the loop and later leaves
it (where threading is desirable since we skip the whole loop) the logic
above will still return true (after finishing the whole walk which seems
like a waste).

This may trigger more often at -Os since we limit loop header copying.

And indeed, fixing profile updating would be nice.  Why the updating
code is not reused across different threaders?  (I wrote several thread
updating functions for varioius threaders introduced & remoed in the
past and I wonder why we need to keep reinventing it)


More information about the Gcc-bugs mailing list