[Bug rtl-optimization/93264] [10 Regression] ICE in cfg_layout_redirect_edge_and_branch_force, at cfgrtl.c:4522

rguenth at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Thu Apr 2 12:26:53 GMT 2020


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93264

--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
Pilot error.

loop->header is in the cold partition, both latch sources are as well,
the loop entry source is in the hot partition.  We're correctly
redirecting that from hot -> cold to hot -> cold state so

  if (e->flags & EDGE_CROSSING
      && BB_PARTITION (e->src) == BB_PARTITION (dest)
      && simplejump_p (BB_END (src)))
    {

doesn't apply anyways so we run into

edge
try_redirect_by_replacing_jump (edge e, basic_block target, bool in_cfglayout)
{
...
  /* If we are partitioning hot/cold basic blocks, we don't want to
     mess up unconditional or indirect jumps that cross between hot
     and cold sections.

     Basic block partitioning may result in some jumps that appear to
     be optimizable (or blocks that appear to be mergeable), but which really
     must be left untouched (they are required to make it safely across
     partition boundaries).  See  the comments at the top of
     bb-reorder.c:partition_hot_cold_basic_blocks for complete details.  */

  if (BB_PARTITION (src) != BB_PARTITION (target))
    return NULL;

where the referenced comment says

   IMPORTANT NOTE: This optimization causes some messy interactions
   with the cfg cleanup optimizations; those optimizations want to
   merge blocks wherever possible, and to collapse indirect jump
   sequences (change "A jumps to B jumps to C" directly into "A jumps
   to C").  Those optimizations can undo the jump fixes that
   partitioning is required to make (see above), in order to ensure
   that jumps attempting to cross section boundaries are really able
   to cover whatever distance the jump requires (on many architectures
   conditional or unconditional jumps are not able to reach all of
   memory).  Therefore tests have to be inserted into each such
   optimization to make sure that it does not undo stuff necessary to
   cross partition boundaries.  This would be much less of a problem
   if we could perform this optimization later in the compilation, but
   unfortunately the fact that we may need to create indirect jumps
   (through registers) requires that this optimization be performed
   before register allocation.

but any such fixup jumps would appear inside the original section
(and not crossing).  So preserving the crossing state should be a
good enough and better check?

diff --git a/gcc/cfgrtl.c b/gcc/cfgrtl.c
index fb551e3efc0..f2783b1d7ed 100644
--- a/gcc/cfgrtl.c
+++ b/gcc/cfgrtl.c
@@ -1046,7 +1046,7 @@ try_redirect_by_replacing_jump (edge e, basic_block
target, bool in_cfglayout)
      partition boundaries).  See  the comments at the top of
      bb-reorder.c:partition_hot_cold_basic_blocks for complete details.  */

-  if (BB_PARTITION (src) != BB_PARTITION (target))
+  if (BB_PARTITION (e->dest) != BB_PARTITION (target))
     return NULL;

   /* We can replace or remove a complex jump only when we have exactly

covers the testcase but will inhibit redirects allowed previously
when e->src is hot, e->dest was cold and target is now hot.  But that
should be covered by the special-case using simplejump_p.

Ah, OK, here the jump is an indirect one, so not simple.  And we'd
need to replace it with an indirect one for partitioning correctness
which we are not able to do here.

For the testcase at hand we could use sth else than make_forwarder_block,
like manually create a new BB, redirect all latches to it and then
create a new edge to the old header from it.  The redirects/creations
would be all in the same partition.  But then there may be the case
of a latch edge being crossing (a very cold latch in a hot loop) and
we'd face the very same issue.

Currently loop analysis thinks it can always make loops only have a single
latch so "failure" isn't an option here.

So we need to teach cfgrtl.c to redirect forming a similar instruction
as it was there before (create another indirect jump, but after reload
this needs a register - we don't know whether the jump target reg is
live).  Ugh.

On the testcase itself

diff --git a/gcc/modulo-sched.c b/gcc/modulo-sched.c
index 77254b31b42..66260fa34f1 100644
--- a/gcc/modulo-sched.c
+++ b/gcc/modulo-sched.c
@@ -1347,8 +1347,7 @@ sms_schedule (void)
   edge latch_edge;
   HOST_WIDE_INT trip_count, max_trip_count;

-  loop_optimizer_init (LOOPS_HAVE_PREHEADERS
-                      | LOOPS_HAVE_RECORDED_EXITS);
+  loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
   if (number_of_loops (cfun) <= 1)
     {
       loop_optimizer_finalize ();

works.


More information about the Gcc-bugs mailing list