[Bug tree-optimization/103721] [12 regression] wrong code generated for loop with conditional since r12-4790-g4b3a325f07acebf4
amacleod at redhat dot com
gcc-bugzilla@gcc.gnu.org
Thu Jan 6 20:13:56 GMT 2022
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103721
Andrew Macleod <amacleod at redhat dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jeffreyalaw at gmail dot com
--- Comment #3 from Andrew Macleod <amacleod at redhat dot com> ---
After the initial loop tweaking, the IL that the threader sees in
v.c.111t.threadfull1 is:
;; basic block 2, loop depth 0
  goto <bb 10>; [100.00%]

;; basic block 3, loop depth 1
  ipos.0_2 = ipos;
  if (ipos.0_2 != 0)
    goto <bb 6>; [50.00%]
  else
    goto <bb 4>; [50.00%]

;; basic block 4, loop depth 1

;; basic block 6, loop depth 1
  # searchVolume_11 = PHI <1(4), 0(3)>
  # currentVolume_10 = PHI <searchVolume_5(4), searchVolume_5(3)>

;; basic block 10, loop depth 1
  # searchVolume_5 = PHI <searchVolume_11(6), 1111(2)>
  # currentVolume_6 = PHI <currentVolume_10(6), 0(2)>
  _7 = searchVolume_5 != currentVolume_6;
  _8 = searchVolume_5 != 0;
  _9 = _7 & _8;
  if (_9 != 0)
    goto <bb 3>; [89.00%]
  else
    goto <bb 7>; [11.00%]
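To make the dump easier to follow, here is a rough Python walk of those blocks. It is only a hand-written model of the IL shown above: `run_il` and `max_steps` are names made up for illustration, `ipos` is assumed not to change while the loop runs, and nothing is known here about bb7 beyond it being the loop exit.

```python
def run_il(ipos, max_steps=100):
    """Walk the CFG from the dump by hand; PHIs are resolved by incoming edge."""
    sv5, cv6 = 1111, 0          # bb10 PHI arguments on the edge from bb2
    seen = []                   # (searchVolume_5, currentVolume_6) per visit to bb10
    for _ in range(max_steps):
        seen.append((sv5, cv6))
        _7 = sv5 != cv6
        _8 = sv5 != 0
        if not (_7 and _8):     # _9 == 0: leave the loop towards bb7
            return seen
        # bb3: ipos != 0 goes straight to bb6, otherwise through the empty bb4
        came_from = 3 if ipos != 0 else 4
        # bb6 PHIs
        sv11 = 1 if came_from == 4 else 0
        cv10 = sv5              # same argument on both incoming edges
        # back edge 6->10: the bb10 PHIs pick their (6) arguments
        sv5, cv6 = sv11, cv10
    raise RuntimeError("no exit within max_steps")
```

In this model, `ipos == 0` visits bb10 with (1111, 0), (1, 1111) and (1, 1) and then exits; a nonzero `ipos` exits on the second visit with (0, 1111).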
It looks to me like it decides to thread 2->10, which means it turns bb2 into
something like:
  # searchVolume_5 = 1111
  # currentVolume_6 = 0
  _7 = searchVolume_5 != currentVolume_6;  // folds to 1
  _8 = searchVolume_5 != 0;                // folds to 1
  _9 = _7 & _8;                            // folds to 1
  if (_9 != 0)                             // folds to goto bb3
    goto <bb 3>; [89.00%]
  else
    goto <bb 7>; [11.00%]
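The folding in that copy can be checked by plugging in the two constants. The snippet below is only arithmetic on the values from the dump; `fold_entry_copy` is a made-up helper name.

```python
def fold_entry_copy():
    """Evaluate the threaded copy of bb10 with the PHI arguments from bb2."""
    searchVolume_5 = 1111
    currentVolume_6 = 0
    _7 = int(searchVolume_5 != currentVolume_6)   # 1111 != 0
    _8 = int(searchVolume_5 != 0)                 # 1111 != 0
    _9 = _7 & _8
    target = "bb3" if _9 != 0 else "bb7"
    return _7, _8, _9, target
```

All three temporaries come out as 1, so the branch is known to go to bb3 on the entry path, which is what makes 2->10 look threadable.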
And then it updates the PHIs in BB10 to no longer have an edge from bb2 (note I
am doing this by hand, not actually renaming any ssa_names):
;; basic block 10, loop depth 1
  # searchVolume_5 = PHI <searchVolume_11(6)>
  # currentVolume_6 = PHI <currentVolume_10(6)>
  _7 = searchVolume_5 != currentVolume_6;
  _8 = searchVolume_5 != 0;
  _9 = _7 & _8;
  if (_9 != 0)
    goto <bb 3>; [89.00%]
  else
    goto <bb 7>; [11.00%]
The problem would seem to be that when we thread 2->10, we are actually peeling
off an iteration of the loop. Look at the PHIs in BB6:
;; basic block 6, loop depth 1
  # searchVolume_11 = PHI <1(4), 0(3)>
  # currentVolume_10 = PHI <searchVolume_5(4), searchVolume_5(3)>
I think currentVolume_10 is picking up searchVolume_5 calculated from the
threaded entry point, which is the constant 1111... and we are "losing" the
information that it could also be the value of searchVolume_11 from the
previous iteration.
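A toy model of that suspicion, in Python. It assumes, purely for illustration, that bb6 keeps reading the 1111 defined in the threaded entry copy on every trip; it is not a transcription of what the threader actually emits, and `run_stale` and `max_steps` are made-up names.

```python
def run_stale(ipos, max_steps=20):
    """Loop after threading 2->10, if bb6 only ever sees the entry constant."""
    stale_sv5 = 1111                    # searchVolume_5 from the threaded copy
    # The threaded copy of bb10 folded to "goto bb3", so execution starts in bb3.
    sv11 = 1 if ipos == 0 else 0        # bb6: PHI <1(4), 0(3)>
    cv10 = stale_sv5                    # bb6: reads the stale definition
    for step in range(1, max_steps + 1):
        sv5, cv6 = sv11, cv10           # bb10: single-argument PHIs fed from bb6
        if not (sv5 != cv6 and sv5 != 0):
            return step                 # number of bb10 visits before exiting
        sv11 = 1 if ipos == 0 else 0
        cv10 = stale_sv5                # never sees searchVolume_11 from the last trip
    return None                         # still looping after max_steps
```

Under this assumption, `ipos == 0` keeps comparing 1 against 1111 and never reaches the exit within the step limit, while a nonzero `ipos` still exits on the first visit to bb10 because searchVolume_5 is 0.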
Threading is out of my wheelhouse, but it's not clear to me how you could even
update the PHI nodes properly if you try to thread that path...
And it's starting to give me a headache thinking about it :-)
It seems that there needs to be a new PHI inserted in BB3 which sets
searchVolume_5 = PHI <1111(2), searchVolume_11(10)>, or something to that effect.
Something is missing anyway.
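As a sanity check on that idea, the sketch below compares a source-level reading of the unthreaded loop with a threaded version in which bb3 merges 1111 from bb2 with the value coming around from bb10. Both functions are hand-written models with made-up names (`reference`, `threaded_with_bb3_phi`); the source-level loop is reconstructed from the dump rather than taken from the bug's testcase, and the second function is a hypothetical shape for the fix, not what GCC does.

```python
def reference(ipos):
    """Source-level reading of the unthreaded IL; returns (sv, cv) at loop exit."""
    sv, cv = 1111, 0
    while sv != cv and sv != 0:
        cv = sv
        sv = 0 if ipos != 0 else 1
    return sv, cv

def threaded_with_bb3_phi(ipos, max_steps=100):
    """Threaded loop where bb3 merges the value per incoming edge (hypothetical)."""
    sv_bb3 = 1111                       # PHI argument on the edge from bb2
    for _ in range(max_steps):
        sv11 = 0 if ipos != 0 else 1    # bb6: PHI <1(4), 0(3)>
        cv10 = sv_bb3                   # bb6: reads the merged value
        sv5, cv6 = sv11, cv10           # bb10: PHIs fed only from bb6
        if not (sv5 != cv6 and sv5 != 0):
            return sv5, cv6
        sv_bb3 = sv5                    # PHI argument on the edge from bb10
    raise RuntimeError("no exit within max_steps")
```

In this model the two agree for both zero and nonzero `ipos`, which is consistent with the missing piece being a merge of the entry constant and the loop-carried value at bb3.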