[RFC] More jump threading restrictions in the presence of loops.
Michael Matz
matz@suse.de
Mon Oct 4 16:29:05 GMT 2021
Hello,
On Mon, 4 Oct 2021, Jeff Law wrote:
> And just in case it got lost. Here's the analysis on 960218-1 on visium:
>
> We've got this going into ethread:
>
> int f (int x)
> {
> int D.1420;
> int a;
>
> ;; basic block 2, loop depth 0, maybe hot
> ;; prev block 0, next block 3, flags: (NEW, REACHABLE, VISITED)
> ;; pred: ENTRY (FALLTHRU,EXECUTABLE)
> a_4 = ~x_3(D);
> goto <bb 4>; [INV]
> ;; succ: 4 (FALLTHRU,EXECUTABLE)
>
> ;; basic block 3, loop depth 1, maybe hot
> ;; prev block 2, next block 4, flags: (NEW, REACHABLE, VISITED)
> ;; pred: 4 (TRUE_VALUE,EXECUTABLE)
> gl = a_1;
> ;; succ: 4 (FALLTHRU,DFS_BACK,EXECUTABLE)
>
> ;; basic block 4, loop depth 1, maybe hot
> ;; prev block 3, next block 5, flags: (NEW, REACHABLE, VISITED)
> ;; pred: 2 (FALLTHRU,EXECUTABLE)
> ;; 3 (FALLTHRU,DFS_BACK,EXECUTABLE)
> # a_1 = PHI <a_4(2), 0(3)>
> if (a_1 != 0)
> goto <bb 3>; [INV]
> else
> goto <bb 5>; [INV]
> ;; succ: 3 (TRUE_VALUE,EXECUTABLE)
> ;; 5 (FALSE_VALUE,EXECUTABLE)
>
> ;; basic block 5, loop depth 0, maybe hot
> ;; prev block 4, next block 1, flags: (NEW, REACHABLE, VISITED)
> ;; pred: 4 (FALSE_VALUE,EXECUTABLE)
> return;
> ;; succ: EXIT
>
> }
First notice that this doesn't have an empty latch block to start with
(in fact, there is no separate latch block at all), so the loop optimizers
haven't been initialized for simple latches at this point. Still, I would
say that even then we want to be careful with the latch edge, as putting
code onto it will most probably create a problem downstream once we _do_
want to intialize the loop optimizers for simple latches. So, that we are
careful here is okay, we are just too careful in this specific case.
> There's a pretty obvious jump thread in there. 3->4->5.
>
> We used to allow this, but it's currently being rejected and I'm not
> sure it should.
>
> In simplest terms we're threading from inside the loop, through the
> latch to a point outside the loop. At first glance, that seems safe.
Is at least the unrestricted jump threader (the one after loop optimizers)
picking this up?
Independend of that, I agree, that this specific instance of threading
through the latch block, even though followed by to-be-copied
instructions, is okay. We need to determine precisely when that is okay,
though. In this case there are several reasons I could see why this is
so:
(a) while the thread is through the latch block, it's _not_ through the
latch edge (which is 4->3). That would create the problem, so here
we're fine.
(b) even if it were through the latch edge, and would be followed by
to-be-copied instructions (and hence fill the latch edge) the ultimate
end of the path is outside the loop, so all the edges and blocks that
would now carry instructions would also be outside the loop, and hence
be no problem
Now, capture this precisely ;-)
I think something like so: we have a candidate path
S -> L -> B1 (-> B2 ... Bn)
(Start, Latch, Blocks 1 to n following the latch). (I think in our
notation that means that the jump in L is redirected to Bn, and all code
from B1..Bn be copied, right? Do we even support multiple blocks after
the to-be-redirected jump?)
Now if L is a latch block, but L->B1 is no latch/back edge : no
restrictions apply.
Otherwise, L->B1 is a latch/back edge (that means B1 is a loop header): if
all of B2..Bn-1 don't contain jumps back into the loop, and Bn is outside
the loop, then the thread is okay as well. (B2..Bn-1 can be inside the
loop, as long as they don't contain jumps back into the loop, after
copying by the threader, they don't create problems: their copies will be
placed outside the loop and won't generate side entries back into the
loop; the copied latch edge will not be a latch edge anymore, but a loop
exit edge).
It's quite possible that the full generality above is not necessary.
It's probably enough to just not deal with the (b) case above (and
continue to reject it), and simply only accept the candidate path if none
of the involved edges is a latch/back edge (despite one block being the
latch block). Especially so because I'm not convinced that I handled
the idea of case (b) correctly above ;-)
Ciao,
Michael.
More information about the Gcc-patches
mailing list