[Bug tree-optimization/103721] [12 regression] wrong code generated for loop with conditional since r12-4790-g4b3a325f07acebf4

amacleod at redhat dot com gcc-bugzilla@gcc.gnu.org
Thu Jan 6 20:13:56 GMT 2022


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103721

Andrew Macleod <amacleod at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jeffreyalaw at gmail dot com

--- Comment #3 from Andrew Macleod <amacleod at redhat dot com> ---
After the initial loop tweaking, the IL that the threader sees in 
v.c.111t.threadfull1 is:

;;   basic block 2, loop depth 0
  goto <bb 10>; [100.00%]

;;   basic block 3, loop depth 1
  ipos.0_2 = ipos;
  if (ipos.0_2 != 0)
    goto <bb 6>; [50.00%]
  else
    goto <bb 4>; [50.00%]

;;   basic block 4, loop depth 1

;;   basic block 6, loop depth 1
  # searchVolume_11 = PHI <1(4), 0(3)>
  # currentVolume_10 = PHI <searchVolume_5(4), searchVolume_5(3)>

;;   basic block 10, loop depth 1
  # searchVolume_5 = PHI <searchVolume_11(6), 1111(2)>
  # currentVolume_6 = PHI <currentVolume_10(6), 0(2)>
  _7 = searchVolume_5 != currentVolume_6;
  _8 = searchVolume_5 != 0;
  _9 = _7 & _8;
  if (_9 != 0)
    goto <bb 3>; [89.00%]
  else
    goto <bb 7>; [11.00%]
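
To make the dump easier to follow, here is a minimal Python sketch of what this IL computes before any threading. It is a hand-written model, not GCC output: the function name `run_original`, the `max_iters` cap, and the choice to return the header PHI values plus an iteration count are all assumptions made for illustration. PHI nodes are modelled as parallel assignments on the incoming edge.

```python
def run_original(ipos, max_iters=10):
    """Hypothetical model of the pre-threading CFG.

    Returns (searchVolume_5, currentVolume_6, iterations) at loop exit.
    """
    # Entry edge 2->10: the header PHIs take 1111(2) and 0(2).
    search_5, current_6 = 1111, 0
    iters = 0
    # bb10: _7 = s != c; _8 = s != 0; _9 = _7 & _8; branch on _9.
    while (search_5 != current_6) and (search_5 != 0):
        if iters == max_iters:
            raise RuntimeError("iteration cap reached")
        iters += 1
        # bb3 branches on ipos; bb6 merges the two paths.
        search_11 = 0 if ipos != 0 else 1   # PHI <1(4), 0(3)>
        current_10 = search_5               # PHI <searchVolume_5(4), searchVolume_5(3)>
        # Back edge 6->10: both header PHIs read their bb6 arguments.
        search_5, current_6 = search_11, current_10
    return search_5, current_6, iters
```

Under this model, `run_original(0)` takes two trips through the body and returns `(1, 1, 2)`, while any nonzero `ipos` exits after one trip with `(0, 1111, 1)`.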


It looks to me like it decides to thread 2->10, which means it turns bb2 into
something like:


  # searchVolume_5 = 1111
  # currentVolume_6 = 0
  _7 = searchVolume_5 != currentVolume_6;    // folds to 1
  _8 = searchVolume_5 != 0;                  // folds to 1
  _9 = _7 & _8;                              // folds to 1
  if (_9 != 0)                               // folds to goto bb3
    goto <bb 3>; [89.00%]
  else
    goto <bb 7>; [11.00%]
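
The folding can be checked with a few lines of Python. This is only a sketch of the header test in bb10; `header_test` is a made-up name, and the back-edge values fed to it are read off the PHIs in bb6 by hand.

```python
def header_test(search_5, current_6):
    """Hypothetical model of the bb10 condition: True means 'goto bb3'."""
    _7 = search_5 != current_6
    _8 = search_5 != 0
    _9 = _7 & _8
    return _9


# On the entry edge 2->10 both PHI arguments are constants,
# so the branch is known at compile time.
entry_goes_to_bb3 = header_test(1111, 0)

# On the back edge 6->10 searchVolume_11 is 0 or 1 depending on ipos,
# so the same test has no single compile-time answer there.
back_edge_outcomes = {header_test(s, 1111) for s in (0, 1)}
```

Here `entry_goes_to_bb3` is `True`, and `back_edge_outcomes` contains both `True` and `False`, which is why only the 2->10 path is a threading candidate.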

And then it updates the PHIs in BB10 so they no longer have an edge from bb2
(note: I am doing this by hand, not actually renaming any SSA names):

;;   basic block 10, loop depth 1
  # searchVolume_5 = PHI <searchVolume_11(6)>
  # currentVolume_6 = PHI <currentVolume_10(6)>
  _7 = searchVolume_5 != currentVolume_6;
  _8 = searchVolume_5 != 0;
  _9 = _7 & _8;
  if (_9 != 0)
    goto <bb 3>; [89.00%]
  else
    goto <bb 7>; [11.00%]

The problem would seem to be that when we thread 2->10, we are actually peeling
off an iteration of the loop. Consider the PHIs in BB6:
;;   basic block 6, loop depth 1
  # searchVolume_11 = PHI <1(4), 0(3)>
  # currentVolume_10 = PHI <searchVolume_5(4), searchVolume_5(3)>

I think currentVolume_10 is picking up searchVolume_5 calculated from the
threaded entry point, which is the constant 1111... and we are "losing" the
information that it could also be the value of searchVolume_11 from the
previous iteration.
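
A small Python model of that suspicion, for illustration only. It assumes the failure mode is exactly the one guessed at here (currentVolume_10 always reading the constant 1111); it is not derived from the code GCC actually emits. The name `run_suspected_miscompile` and the `max_iters` cap are hypothetical.

```python
def run_suspected_miscompile(ipos, max_iters=10):
    """Hypothetical model: currentVolume_10 is stuck at the entry constant.

    Returns (searchVolume_5, currentVolume_6, iterations), or None if the
    loop is still running when the iteration cap is reached.
    """
    entry_search = 1111   # value of searchVolume_5 on the threaded entry
    iters = 0
    while iters < max_iters:
        iters += 1
        search_11 = 0 if ipos != 0 else 1   # PHI <1(4), 0(3)>
        current_10 = entry_search           # stale: ignores the back edge
        # bb10 with only the 6->10 edge left.
        search_5, current_6 = search_11, current_10
        if not ((search_5 != current_6) and (search_5 != 0)):
            return search_5, current_6, iters
    return None   # exit condition never became false within the cap
```

With `ipos == 0` the model never sees searchVolume_5 equal currentVolume_6 (it compares 1 against 1111 forever), so it returns `None`; tracing the original IL by hand gives an exit after two iterations. With a nonzero `ipos` the stale value is harmless and the result is `(0, 1111, 1)`.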

Threading is out of my wheelhouse, but it's not clear to me how you could even
update the PHI nodes properly if you try to thread that path...
And it's starting to give me a headache thinking about it :-)

It seems that there needs to be a new PHI inserted in BB3 which sets
searchVolume_5 = PHI <1111(2), searchVolume_11(10)>, or something to that
effect. Something is missing anyway.
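
As a sanity check on that idea, here is a Python sketch of the threaded CFG with such a merge at the top of bb3. It is one reading of the suggestion, not what the threader does: the name `run_threaded_with_merge_phi`, the variable `search_at_bb3` standing in for the new PHI result, and the `max_iters` cap are all assumptions.

```python
def run_threaded_with_merge_phi(ipos, max_iters=10):
    """Hypothetical model: bb2 jumps to bb3, and bb3 merges both entries.

    Returns (searchVolume_5, currentVolume_6, iterations) at loop exit.
    """
    # New PHI in bb3, argument on the threaded edge 2->3.
    search_at_bb3 = 1111
    iters = 0
    while True:
        if iters == max_iters:
            raise RuntimeError("iteration cap reached")
        iters += 1
        search_11 = 0 if ipos != 0 else 1   # PHI <1(4), 0(3)>
        current_10 = search_at_bb3          # uses the merged value
        # bb10 now has a single predecessor, bb6.
        search_5, current_6 = search_11, current_10
        if not ((search_5 != current_6) and (search_5 != 0)):
            return search_5, current_6, iters
        # New PHI in bb3, argument on the edge 10->3.
        search_at_bb3 = search_5
```

Under these assumptions `run_threaded_with_merge_phi(0)` returns `(1, 1, 2)` and a nonzero `ipos` gives `(0, 1111, 1)`, which are the values obtained by tracing the unthreaded IL by hand.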

