[PATCH] Be careful about combined chain with length == 0 (PR, tree-optimization/70754).

Richard Biener richard.guenther@gmail.com
Thu Jan 19 11:23:00 GMT 2017


On Thu, Jan 19, 2017 at 11:25 AM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Thu, Jan 19, 2017 at 9:42 AM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Wed, Jan 18, 2017 at 4:32 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>> On Wed, Jan 18, 2017 at 2:54 PM, Richard Biener
>>> <richard.guenther@gmail.com> wrote:
>>>> On Wed, Jan 18, 2017 at 11:10 AM, Martin Liška <mliska@suse.cz> wrote:
>>>>> Hello.
>>>>>
>>>>>
>>>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>>>>
>>>>> Ready to be installed?
>>>>
>>>> I'm not sure.  If we have such zero distance refs in the IL at the
>>>> time pcom runs then not handling
>>>> them will pessimize code-gen for cases where they are part of a larger
>>>> chain.  Esp. I don't like
>>> Do you mean different chains distributed because of MAX_DISTANCE by
>>> "larger chain"?  With the patch, such chain of refs would still be
>>> pred-commoned, just the arithmetic operation not combined, which could
>>> be handled by later DOM?
>>>> another stmt_dominates_stmt_p call and thus rather not handle length
>>>> == 0 at all...
>>> Not handle length == 0 chains at all may be sub-optimal.  As you said,
>>> such chain of refs at the point may simply because previous dom/cse
>>> fail to analyze the references.
>>>>
>>>> We already seem to go great length in associating stuff when combining
>>>> stuff thus isn't this
>>>> maybe an artifact of this association?  Maybe we simply need to sort
>>>> the new chain after
>>>> combining it so the root stmt comes last?
>>>>
>>>> Note that there seems to be only a single length per chain but not all
>>>> refs in a chain need to
>>>> have the same distance.  This means your fix is likely incomplete?
>>>> What prevents the situation
>>>> to arise for distance != 0?
>>> Yes, it's possible for two refs have the same distance in a chain with
>>> length > 0.  But that should not be a problem, because existing uses
>>> are replaced by newly generated PHI variables which always dominate
>>> the uses, right?
>>
>> I must admit I don't know predcom in such detail but then can we handle
>> distance == 0 by simply inserting a PHI for those as well (a degenerate
>> one of course)?  Or can for distance == 0 the ref be not loop invariant?
> Not sure if I understand the question correctly.  Distance is
> difference of niter between one ref and the root ref of the chain, so
> 0 distance/length doesn't mean a loop invariant, it's very likely two
> (exactly the same) references in each loop iteration, the address of
> reference is still a SCEV.  OTOH, invariant chain has invariant
> address, and is handled separately.  For the first question, it's
> length, rather than distance that decides how the chain is handled.
> For length > 0 chain, we have to insert PHIs to pass carried result of
> memory reference, even some refs may have 0 distance to the root ref.
>>
>> Note that for length == 0 all refs in the chain will have a dependence distance
>> of zero.  So my first argument probably doesn't hold and we could simply
>> remove handling of length == 0 chains and rely on CSE?
> I am not sure, that CSE opportunity of references exists at this point
> means previous cse pass failed for some reason.

Or a later pass introduced it (in this case, the vectorizer).

> Predcom could be the
> only pass that can handle such case as it understands data reference
> better.  Note Martin's patch is not to skip handling of length == 0
> chain, later ref will still be CSEed with result of root ref, only the
> combination operation like chain1 + chain2 is skipped.  In this case,
> following dom should be able to handle such (loop independent) cse
> opportunities.

I must admit I don't completely understand the consequences of this
disabling but of course DOM should also be able to handle the CSE
(ok, DOM is known to be quite weak with memory equivalence but
it's not that data-dependence is really better in all cases).

Can't we simply re-order refs in new_chain appropriately or handle
this case in combinable_refs_p instead?

That is, I understand the patch as a hack as it should be always
possible to find dominating refs?

In fact the point of the assert tells us a simpler fix may be

Index: tree-predcom.c
===================================================================
--- tree-predcom.c      (revision 244519)
+++ tree-predcom.c      (working copy)
@@ -2330,6 +2334,12 @@ combine_chains (chain_p ch1, chain_p ch2
          break;
        }
     }
+  if (new_chain->length == 0
+      && ! new_chain->has_max_use_after)
+    {
+      release_chain (new_chain);
+      return NULL;
+    }

   ch1->combined = true;
   ch2->combined = true;

which obviously matches the assert we run into for the testcase?  I'd
be ok with that
(no stmt_dominates_stmt_p, heh) with a suitable comment before it.

Richard.


>
> Thanks,
> bin
>>
>> Richard.
>>
>>> Thanks,
>>> bin
>>>>
>>>> Richard.
>>>>
>>>>> Martin
>>>>>



More information about the Gcc-patches mailing list