This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug tree-optimization/66642] New: transform_to_exit_first_loop_alt doesn't use result of low iteration count loop


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66642

            Bug ID: 66642
           Summary: transform_to_exit_first_loop_alt doesn't use result of
                    low iteration count loop
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vries at gcc dot gnu.org
  Target Milestone: ---

Created attachment 35831
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35831&action=edit
patch to produce test case

Using the attached patch, we exercise the low iteration count loop generated by
the parloops pass.

The libgomp.c/parloops-exit-first-loop-alt-3.c testcase fails:
...
PASS: libgomp.c/parloops-exit-first-loop-alt-2.c (test for excess errors)
PASS: libgomp.c/parloops-exit-first-loop-alt-2.c execution test
PASS: libgomp.c/parloops-exit-first-loop-alt-3.c (test for excess errors)
FAIL: libgomp.c/parloops-exit-first-loop-alt-3.c execution test
PASS: libgomp.c/parloops-exit-first-loop-alt-4.c (test for excess errors)
PASS: libgomp.c/parloops-exit-first-loop-alt-4.c execution test
PASS: libgomp.c/parloops-exit-first-loop-alt.c (test for excess errors)
PASS: libgomp.c/parloops-exit-first-loop-alt.c execution test
...

The problem is the following.

Before transform_to_exit_first_loop, we have loop header bb4, loop latch bb6,
and loop exit bb5:
...
  <bb 4>:
  # sum_17 = PHI <1(11), sum_11(6)>
  # ivtmp_24 = PHI <0(11), ivtmp_6(6)>
  i_16 = (int) ivtmp_24;
  _7 = (long unsigned int) i_16;
  _8 = _7 * 4;
  _9 = pretmp_23 + _8;
  _10 = *_9;
  sum_11 = _10 + sum_17;
  i_12 = i_16 + 1;
  i.1_3 = (unsigned int) i_12;
  if (ivtmp_24 < _19)
    goto <bb 6>;
  else
    goto <bb 5>;

  <bb 5>:
  # sum_20 = PHI <sum_11(4), sum_25(8)>
  goto <bb 7>;

  <bb 6>:
  ivtmp_6 = ivtmp_24 + 1;
  goto <bb 4>;
...

After transform_to_exit_first_loop, we still have loop header bb4 and loop
latch bb6, but the loop exit is now bb14:
...
  <bb 4>:
  # sum_27 = PHI <1(11), sum_11(6)>
  # ivtmp_28 = PHI <0(11), ivtmp_6(6)>
  if (ivtmp_28 < _19)
    goto <bb 13>;
  else
    goto <bb 14>;

  <bb 13>:
  # sum_17 = PHI <sum_27(4)>
  # ivtmp_24 = PHI <ivtmp_28(4)>
  i_16 = (int) ivtmp_24;
  _7 = (long unsigned int) i_16;
  _8 = _7 * 4;
  _9 = pretmp_23 + _8;
  _10 = *_9;
  sum_11 = _10 + sum_17;
  i_12 = i_16 + 1;
  i.1_3 = (unsigned int) i_12;
  goto <bb 6>;

  <bb 14>:
  # sum_29 = PHI <sum_27(4)>
  ivtmp_30 = _19;
  i_31 = (int) ivtmp_30;
  _32 = (long unsigned int) i_31;
  _33 = _32 * 4;
  _34 = pretmp_23 + _33;
  _35 = *_34;
  sum_36 = _35 + sum_29;
  i_37 = i_31 + 1;
  i.1_38 = (unsigned int) i_37;

  <bb 5>:
  # sum_20 = PHI <sum_36(14), sum_25(8)>
  goto <bb 7>;

  <bb 6>:
  ivtmp_6 = ivtmp_24 + 1;
  goto <bb 4>;
...

A bit later, separate_decls_in_region inserts a .paral_data_store-based load in
the new exit block, assuming that the exit block has a single predecessor (the
loop header bb4):
...
  <bb 14>:
  .paral_data_load.11_42 = &.paral_data_store.10;
  sum_29 = .paral_data_load.11_42->sum.7;
  ivtmp_30 = _19;
  i_31 = (int) ivtmp_30;
  _32 = (long unsigned int) i_31;
  _33 = _32 * 4;
  _34 = pretmp_23 + _33;
  _35 = *_34;
  sum_36 = _35 + sum_29;
  i_37 = i_31 + 1;
  i.1_38 = (unsigned int) i_37;
...


However, with transform_to_exit_first_loop_alt we keep loop latch bb6 and loop
exit bb5, but we get a new loop header bb13:
...
  <bb 11>:
  goto <bb 13>;

  <bb 4>:
  # sum_17 = PHI <sum_27(13)>
  # ivtmp_24 = PHI <ivtmp_28(13)>
  i_16 = (int) ivtmp_24;
  _7 = (long unsigned int) i_16;
  _8 = _7 * 4;
  _9 = pretmp_23 + _8;
  _10 = *_9;
  sum_11 = _10 + sum_17;
  i_12 = i_16 + 1;
  i.1_3 = (unsigned int) i_12;
  goto <bb 6>;

  <bb 13>:
  # sum_27 = PHI <sum_11(6), 1(11)>
  # ivtmp_28 = PHI <ivtmp_6(6), 0(11)>
  if (ivtmp_28 < n_4(D))
    goto <bb 4>;
  else
    goto <bb 5>;

  <bb 5>:
  # sum_20 = PHI <sum_27(13), sum_25(8)>
  goto <bb 7>;
...

The loop exit bb5 is also reached from bb8, the loop header of the low
iteration count loop.

So when separate_decls_in_region inserts a .paral_data_store-based load in the
exit block, the load replaces the PHI for sum_20 and thereby destroys the
value sum_25 coming from the low iteration count loop:
...
  <bb 5>:
  .paral_data_load.12_32 = &.paral_data_store.11;
  sum_20 = .paral_data_load.12_32->sum.7;
  goto <bb 7>;
...

The fix is probably to make sure that we split the exit edge during
transform_to_exit_first_loop_alt.

