This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [committed][PATCH] Fix bogus propagation in DOM


On Mon, Nov 20, 2017 at 7:33 PM, Jeff Law <law@redhat.com> wrote:
> On 11/20/2017 03:25 AM, Richard Biener wrote:
>> On Sun, Nov 19, 2017 at 9:16 PM, Jeff Law <law@redhat.com> wrote:
>>> On my local branch gcc.dg/torture/pr56349.c fails by sending GCC into an
>>> infinite loop trying to simplify a self-referring statement. ie
>>> something like
>>>
>>> x_1 = x_1 + 10;
>>>
>>> That, of course, shouldn't be happening in SSA form.  After some digging
>>> I've found the culprit.
>>>
>>> Let's say we've got a PHI.
>>>
>>> a_1 = PHI (a_0, a_2)
>>>
>>> If DOM decides that the edge associated with a_2 is not executable, then
>>> DOM will consider the PHI a degenerate and enter a_1 = a_0 into its
>>> equivalence table.
>>>
>>> That in turn will result in propagation of a_0 into uses of a_1.
>>>
>>> That, of course, isn't right.  There's nothing that guarantees that the
>>> definition of a_0 dominates the uses of a_1.  In the testcase that bogus
>>> propagation cascades and eventually results in a self-referring node
>>> like I showed above.
>>>
>>> The solution here is to note whether or not we ignored any PHI
>>> arguments.  If we do and the equivalence we want to enter is SSA_NAME =
>>> SSA_NAME, then we must reject the equivalence.    Obviously if we wanted
>>> to enter SSA_NAME = CONST, then we can still do so.
>>>
>>> Bootstrapped and regression tested on x86_64.  Installing on the trunk.
>>
>> Hmm, but if the edge isn't executable then it will be removed and thus the
>> definition _will_ dominate.  So I think the error happens elsewhere.  With
>> the change you remove one of the advantages of tracking unexecutable
>> edges, namely that we can treat those merges optimistically resulting in
>> more CSE.
>>
>> You didn't add a testcase so I can't have a quick look myself.
>>
>> Short: I think you're papering over an issue elsehwere.
> Depends on your point of view :-)
>
> It's not something I think you can trigger on the trunk right now.  But
> the testcase is pr56349.  You need the embedded vrp bits installed into
> DOM and for DOM to also use those bits to detect branches that have a
> static destination.
>
> So I'll just walk you through it...
>
> At the start of the second DOM pass we have this:
>
> f ()
> {
>   int k__lsm.8;
>   int * k;
>   int a;
>   int _2;
>   int b.0_3;
>   int _4;
>   unsigned int ivtmp_5;
>   int _6;
>   short int iftmp.3_14;
>   int _15;
>   int b.4_25;
>   short int iftmp.3_27;
>   short int c.2_33;
>   unsigned int ivtmp_38;
>
> ;;   basic block 2, loop depth 0
> ;;    pred:       ENTRY
>   a_9 = 1;
>   ivtmp_38 = 1;
>   a_31 = a_9 + 1;
>   ivtmp_5 = ivtmp_38 + 4294967295;
>   _2 = 1;
>   b.0_3 = b;
>   _4 = b.0_3 | _2;
>   b = _4;
>   if (_4 == 0)
>     goto <bb 12>; [66.00%]
>   else
>     goto <bb 10>; [34.00%]
> ;;    succ:       12
> ;;                10
>
> ;;   basic block 3, loop depth 1
> ;;    pred:       5
> ;;                3
>   goto <bb 3>; [100.00%]
> ;;    succ:       3
>
> ;;   basic block 4, loop depth 0
> ;;    pred:       11
> ;;                12
> lbl1:
>   c.2_33 = c;
> ;;    succ:       5
>
> ;;   basic block 5, loop depth 1
> ;;    pred:       4
> ;;                5
>   if (c.2_33 != 0)
>     goto <bb 3>; [85.00%]
>   else
>     goto <bb 5>; [15.00%]
> ;;    succ:       3
> ;;                5
>
> ;;   basic block 6, loop depth 0
> ;;    pred:       11
>   b.4_25 = b;
>   if (b.4_25 != 0)
>     goto <bb 7>; [50.00%]
>   else
>     goto <bb 8>; [50.00%]
> ;;    succ:       7
> ;;                8
>
> ;;   basic block 7, loop depth 0
> ;;    pred:       6
>   iftmp.3_27 = (short int) b.4_25;
>   goto <bb 9>; [100.00%]
> ;;    succ:       9
>
> ;;   basic block 8, loop depth 0
> ;;    pred:       6
> ;;                12
>   # k_37 = PHI <k_30(6), 0B(12)>
> ;;    succ:       9
>
> ;;   basic block 9, loop depth 0
> ;;    pred:       7
> ;;                8
>   # iftmp.3_14 = PHI <iftmp.3_27(7), 2(8)>
>   # k_44 = PHI <k_30(7), k_37(8)>
>   c = iftmp.3_14;
>   if (iftmp.3_14 != 0)
>     goto <bb 10>; [50.00%]
>   else
>     goto <bb 11>; [50.00%]
> ;;    succ:       10
> ;;                11
>
> ;;   basic block 10, loop depth 0
> ;;    pred:       2
> ;;                9
>   # k_10 = PHI <0B(2), k_44(9)>
> lbl2:
>   b = 0;
> ;;    succ:       11
>
> ;;   basic block 11, loop depth 0
> ;;    pred:       10
> ;;                9
>   # k_11 = PHI <k_10(10), k_44(9)>
>   k_30 = k_11 + 4;
>   _6 = MEM[(int *)k_11 + 4B];
>   if (_6 != 0)
>     goto <bb 6>; [66.00%]
>   else
>     goto <bb 4>; [34.00%]
> ;;    succ:       6
> ;;                4
>
> ;;   basic block 12, loop depth 0
> ;;    pred:       2
>   _15 = MEM[(int *)0B];
>   if (_15 != 0)
>     goto <bb 8>; [66.00%]
>   else
>     goto <bb 4>; [34.00%]
> ;;    succ:       8
> ;;                4
>
> }
>
> The first tidbit of interest is we can statically determine that bb2
> will always transfer control to bb10.  The edge 2->12 is marked as not
> executable.  That also means that 12->4 and 12->8 are not executable as
> well.
>
> The PHI in BB8 is critical:
>
>   # k_37 = PHI <k_30(6), 0B(12)>
>
> With the edge 8->12 being unexecutable the PHI is essentially k_37 =
> k_30 and we'll try to propagate k_30 into the uses of k_37.
>
> The first use of interest is BB9.
>
>
>   # k_44 = PHI <k_30(7), k_37(8)>
>
> Which turns into
>   # k_44 = PHI <k_30(7), k_30(8)>
>
> And we've lost.  The assignment to k_30 does not currently dominate k_44
> and won't until after we've done cfg cleanups and updated the dominator
> information.
>
> How does this matter?  We have to continue to follow things through DOM.
>
>
> We can then propagate k_30 into uses of k_44, which we do for the PHIs
> in BB10 and BB11 which turn into:
>
> <bb 10> [local count: 0]:
> # k_10 = PHI <0B(2), k_30(9)>
>
>
> <bb 11> [local count: 1]:
> # k_11 = PHI <k_10(10), k_30(9)>
> k_30 = k_11 + 4;
> _6 = MEM[(int *)k_11 + 4B];
> if (_6 != 0)
>   goto <bb 6>; [66.00%]
> else
>   goto <bb 4>; [34.00%]
>
>
> Now the fun part.
>
> After we have finished optimizing bb9, we try to thread its outgoing
> edges.  Of particular interest is the edge 9->11.
>
> We do temporary propagation as part of the jump threading process.  ie,
> when we traverse 9->11 we know that k_11 will have the value k_30.  So
> we transform the assignment
>
> k_30 = k_11 + 4
>
> into
>
> k_30 = k_30 + 4
>
> Then we try to simplify that statement and infinitely recurse.
>
>
> I considered trying to key behavior on EDGE_DFS_BACK (6->8).  It'd be
> something like don't record equivalences from a degenerate PHI where the
> remaining edge(s) are EDGE_DFS_BACK.  But I just couldn't convince
> myself that was actually reasonable or sufficient in general.

The key point is that you can never record "symbolic" equivalences on
backedges.  Because in SSA you rely on PHIs to fence SSA names from
leaking cross iterations.  That is, _1 in iteration 1 is obviously not equal
to _1 in iteration 2.

Other passes like VRP and value-numbering need to take care of this
as well.

You can of course record that _1 is 2 (constant) even across a backedge.

So I think your consideration was exactly correct.

Richard.

> Jeff


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]