[Bug tree-optimization/86865] [9 Regression] Wrong code w/ -O2 -floop-parallelize-all -fstack-reuse=none -fwrapv -fno-tree-ch -fno-tree-dce -fno-tree-dominator-opts -fno-tree-loop-ivcanon

rguenth at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Thu Jan 24 16:19:00 GMT 2019


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86865

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sebpop at gmail dot com

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
Hmm, actually the original AST already doesn't match the original GIMPLE:

[scheduler] original ast:
for (int c0 = 0; c0 <= 1; c0 += 1) {
  S_3(c0);
  for (int c1 = 0; c1 <= 7; c1 += 1)
    S_4(c0, c1);
  S_9(c0);
}

vs. GIMPLE

  <bb 6> [local count: 134036760]:
  # prephitmp_8 = PHI <0(2), _1(9)>
  if (prephitmp_8 >= 0)
    goto <bb 3>; [89.00%]
  else
    goto <bb 7>; [11.00%]  <exit>

  <bb 3> [local count: 119292716]:
  sa = {};

  <bb 4> [local count: 954449108]:
  # us_18 = PHI <0(3), us_11(8)>
  yt[us_18] = 0;
  us_11 = us_18 + 1;
  if (us_11 != 8)
    goto <bb 8>; [87.50%]
  else
    goto <bb 5>; [12.50%]

  <bb 8> [local count: 835156388]:
  goto <bb 4>; [100.00%]

  <bb 5> [local count: 119292717]:

  <bb 9> [local count: 119292717]:
  _1 = prephitmp_8 + -1;
  xy = _1;
  goto <bb 6>;

Here the number of latch executions is one (so BB 9, and thus xy = _1,
executes once), while the ISL AST has S_9 executed twice.  So the
bug is in how we translate GIMPLE to ISL, which assumes do {} while
style loops rather than adjusting iteration domains.  I've never
fully grokked the code there, so the "easiest" way out would be to
require do {} while style loops, similar to what the vectorizer
requires.
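
To make the count mismatch concrete, here is a minimal standalone sketch in plain C (not GCC code) that mirrors the control flow of the GIMPLE dump and of the ISL AST above. The struct and function names are hypothetical and exist only for illustration; the inner loop over yt[] is omitted because it does not affect the outer-loop counts.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical counters for illustration; nothing here is a GCC API.  */
struct loop_counts
{
  int header_runs;  /* times the exit test in bb 6 is evaluated */
  int latch_runs;   /* times bb 9 (xy = _1) is executed */
};

/* Mirror the GIMPLE shape: the exit test sits in the loop header,
   so the header runs once more than the latch.  */
struct loop_counts
run_gimple_shape (void)
{
  struct loop_counts c = { 0, 0 };
  int prephitmp = 0;   /* PHI <0(2), _1(9)> */
  int xy = 0;

  for (;;)
    {
      c.header_runs++;             /* bb 6 */
      if (!(prephitmp >= 0))
        break;                     /* exit edge to bb 7 */
      /* bb 3..5: sa = {}; inner loop over yt[0..7] omitted.  */
      prephitmp = prephitmp - 1;   /* bb 9: _1 = prephitmp_8 + -1 */
      xy = prephitmp;              /* bb 9: xy = _1 */
      c.latch_runs++;
    }
  (void) xy;
  return c;
}

/* Mirror the ISL AST shape: S_9 sits inside
   for (c0 = 0; c0 <= 1; c0 += 1), i.e. one run per header run.  */
int
run_isl_ast_shape (void)
{
  int s9_runs = 0;
  for (int c0 = 0; c0 <= 1; c0 += 1)
    s9_runs++;                     /* S_9(c0) */
  return s9_runs;
}

/* Print both counts side by side so the mismatch is visible.  */
void
report_mismatch (void)
{
  struct loop_counts g = run_gimple_shape ();
  int s9 = run_isl_ast_shape ();
  assert (g.latch_runs < g.header_runs);
  printf ("header=%d latch=%d isl_S9=%d\n",
          g.header_runs, g.latch_runs, s9);
}
```

Calling report_mismatch () from a main function shows the header running twice and the latch once, while the AST-shaped loop runs S_9 twice. That is the off-by-one between the header-based iteration domain and the statement in the latch.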

Index: gcc/graphite-scop-detection.c
===================================================================
--- gcc/graphite-scop-detection.c       (revision 268010)
+++ gcc/graphite-scop-detection.c       (working copy)
@@ -555,8 +555,15 @@ scop_detection::can_represent_loop (loop
   tree niter;
   struct tree_niter_desc niter_desc;

-  return single_exit (loop)
-    && !(loop_preheader_edge (loop)->flags & EDGE_IRREDUCIBLE_LOOP)
+  /* We can only handle do {} while () style loops correctly.  */
+  edge exit = single_exit (loop);
+  if (!exit
+      || !single_pred_p (loop->latch)
+      || exit->src != single_pred (loop->latch)
+      || !empty_block_p (loop->latch))
+    return false;
+
+  return !(loop_preheader_edge (loop)->flags & EDGE_IRREDUCIBLE_LOOP)
     && number_of_iterations_exit (loop, single_exit (loop), &niter_desc, false)
     && niter_desc.control.no_overflow
     && (niter = number_of_latch_executions (loop))

which in turn FAILs

FAIL: gcc.dg/graphite/scop-21.c scan-tree-dump-times graphite "number of SCoPs: 1" 1
FAIL: gcc.dg/graphite/pr69728.c scan-tree-dump graphite "loop nest optimized"

where loop header copying is "broken" by jump threading for scop-21.c.

