This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix for PR51879 - Missed tail merging with non-const/pure calls


On Sat, Apr 14, 2012 at 9:26 AM, Tom de Vries <Tom_deVries@mentor.com> wrote:
> On 27/01/12 21:37, Tom de Vries wrote:
>> On 24/01/12 11:40, Richard Guenther wrote:
>>> On Mon, Jan 23, 2012 at 10:27 PM, Tom de Vries <Tom_deVries@mentor.com> wrote:
>>>> Richard,
>>>> Jakub,
>>>>
>>>> the following patch fixes PR51879.
>>>>
>>>> Consider the following test-case:
>>>> ...
>>>> int bar (int);
>>>> void baz (int);
>>>>
>>>> void
>>>> foo (int y)
>>>> {
>>>> ?int a;
>>>> ?if (y == 6)
>>>> ? ?a = bar (7);
>>>> ?else
>>>> ? ?a = bar (7);
>>>> ?baz (a);
>>>> }
>>>> ...
>>>>
>>>> after compiling at -02, the representation looks like this before tail-merging:
>>>> ...
>>>> ?# BLOCK 3 freq:1991
>>>> ?# PRED: 2 [19.9%] ?(true,exec)
>>>> ?# .MEMD.1714_7 = VDEF <.MEMD.1714_6(D)>
>>>> ?# USE = nonlocal
>>>> ?# CLB = nonlocal
>>>> ?aD.1709_3 = barD.1703 (7);
>>>> ?goto <bb 5>;
>>>> ?# SUCC: 5 [100.0%] ?(fallthru,exec)
>>>>
>>>> ?# BLOCK 4 freq:8009
>>>> ?# PRED: 2 [80.1%] ?(false,exec)
>>>> ?# .MEMD.1714_8 = VDEF <.MEMD.1714_6(D)>
>>>> ?# USE = nonlocal
>>>> ?# CLB = nonlocal
>>>> ?aD.1709_4 = barD.1703 (7);
>>>> ?# SUCC: 5 [100.0%] ?(fallthru,exec)
>>>>
>>>> ?# BLOCK 5 freq:10000
>>>> ?# PRED: 3 [100.0%] ?(fallthru,exec) 4 [100.0%] ?(fallthru,exec)
>>>> ?# aD.1709_1 = PHI <aD.1709_3(3), aD.1709_4(4)>
>>>> ?# .MEMD.1714_5 = PHI <.MEMD.1714_7(3), .MEMD.1714_8(4)>
>>>> ?# .MEMD.1714_9 = VDEF <.MEMD.1714_5>
>>>> ?# USE = nonlocal
>>>> ?# CLB = nonlocal
>>>> ?bazD.1705 (aD.1709_1);
>>>> ?# VUSE <.MEMD.1714_9>
>>>> ?return;
>>>> ...
>>>>
>>>> the patch allows aD.1709_4 to be value numbered to aD.1709_3, and .MEMD.1714_8
>>>> to .MEMD.1714_7, which enables tail-merging of blocks 4 and 5.
>>>>
>>>> The patch makes sure non-const/pure call results (gimple_vdef and
>>>> gimple_call_lhs) are properly value numbered.
>>>>
>>>> Bootstrapped and reg-tested on x86_64.
>>>>
>>>> ok for stage1?
>>>
>>> The following cannot really work:
>>>
>>> @@ -2600,7 +2601,11 @@ visit_reference_op_call (tree lhs, gimpl
>>> ? ?result = vn_reference_lookup_1 (&vr1, NULL);
>>> ? ?if (result)
>>> ? ? ?{
>>> - ? ? ?changed = set_ssa_val_to (lhs, result);
>>> + ? ? ?tree result_vdef = gimple_vdef (SSA_NAME_DEF_STMT (result));
>>> + ? ? ?if (vdef)
>>> + ? ? ? changed |= set_ssa_val_to (vdef, result_vdef);
>>> + ? ? ?changed |= set_ssa_val_to (lhs, result);
>>>
>>> because 'result' may be not an SSA name. ?It might also not have
>>> a proper definition statement (if VN_INFO (result)->needs_insertion
>>> is true). ?So you at least need to guard things properly.
>>>
>>
>> Right. And that also doesn't work if the function is without lhs, such as in the
>> new test-case pr51879-6.c.
>>
>> I fixed this by storing both lhs and vdef, such that I don't have to derive
>> the vdef from the lhs.
>>
>>> (On a side-note - I _did_ want to remove value-numbering virtual operands
>>> at some point ...)
>>>
>>
>> Doing so willl hurt performance of tail-merging in its current form.
>> OTOH, http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51964#c0 shows that
>> value numbering as used in tail-merging has its limitations too.
>> Do you have any ideas how to address that one?
>>
>>> @@ -3359,8 +3366,10 @@ visit_use (tree use)
>>> ? ? ? ? ? /* ??? ?We should handle stores from calls. ?*/
>>> ? ? ? ? ? else if (TREE_CODE (lhs) == SSA_NAME)
>>> ? ? ? ? ? ? {
>>> + ? ? ? ? ? ? tree vuse = gimple_vuse (stmt);
>>> ? ? ? ? ? ? ? if (!gimple_call_internal_p (stmt)
>>> - ? ? ? ? ? ? ? ? && gimple_call_flags (stmt) & (ECF_PURE | ECF_CONST))
>>> + ? ? ? ? ? ? ? ? && (gimple_call_flags (stmt) & (ECF_PURE | ECF_CONST)
>>> + ? ? ? ? ? ? ? ? ? ? || (vuse && SSA_VAL (vuse) != VN_TOP)))
>>> ? ? ? ? ? ? ? ? changed = visit_reference_op_call (lhs, stmt);
>>> ? ? ? ? ? ? ? else
>>> ? ? ? ? ? ? ? ? changed = defs_to_varying (stmt);
>>>
>>> ... exactly because of the issue that a stmt has multiple defs. ?Btw,
>>> vuse should have been visited here or be part of our SCC, so, why do
>>> you need this check?
>>>
>>
>> Removed now, that was a workaround for a bug in an earlier version of the patch,
>> that I didn't need anymore.
>>
>> Bootstrapped and reg-tested on x86_64.
>>
>> OK for stage1?
>>
>
> Richard,
>
> quoting you in http://gcc.gnu.org/ml/gcc-patches/2012-02/msg00618.html:
> ...
> I think these fixes hint at that we should
> use "structural" equality as fallback if value-numbering doesn't equate
> two stmt effects. ?Thus, treat two stmts with exactly the same operands
> and flags as equal and using value-numbering to canonicalize operands
> (when they are SSA names) for that comparison, or use VN entirely
> if there are no side-effects on the stmt.
>
> Changing value-numbering of virtual operands, even if it looks correct in the
> simple cases you change, doesn't look like a general solution for the missed
> tail merging opportunities.
> ...
>
> The test-case pr51879-6.c shows a case where improving value numbering will help
> tail-merging, but structural equality comparison not:
> ...
> ?# BLOCK 3 freq:1991
> ?# PRED: 2 [19.9%] ?(true,exec)
> ?# .MEMD.1717_7 = VDEF <.MEMD.1717_6(D)>
> ?# USE = nonlocal
> ?# CLB = nonlocal
> ?blaD.1708 (5);
> ?# .MEMD.1717_8 = VDEF <.MEMD.1717_7>
> ?# USE = nonlocal
> ?# CLB = nonlocal
> ?aD.1712_3 = barD.1704 (7);
> ?goto <bb 5>;
> ?# SUCC: 5 [100.0%] ?(fallthru,exec)
>
> ?# BLOCK 4 freq:8009
> ?# PRED: 2 [80.1%] ?(false,exec)
> ?# .MEMD.1717_9 = VDEF <.MEMD.1717_6(D)>
> ?# USE = nonlocal
> ?# CLB = nonlocal
> ?blaD.1708 (5);
> ?# .MEMD.1717_10 = VDEF <.MEMD.1717_9>
> ?# USE = nonlocal
> ?# CLB = nonlocal
> ?aD.1712_4 = barD.1704 (7);
> ?# SUCC: 5 [100.0%] ?(fallthru,exec)
>
> ?# BLOCK 5 freq:10000
> ?# PRED: 3 [100.0%] ?(fallthru,exec) 4 [100.0%] ?(fallthru,exec)
> ?# aD.1712_1 = PHI <aD.1712_3(3), aD.1712_4(4)>
> ?# .MEMD.1717_5 = PHI <.MEMD.1717_8(3), .MEMD.1717_10(4)>
> ?# .MEMD.1717_11 = VDEF <.MEMD.1717_5>
> ?# USE = nonlocal
> ?# CLB = nonlocal
> ?bazD.1706 (aD.1712_1);
> ?# VUSE <.MEMD.1717_11>
> ?return;
> ...
> by structurally comparing the gimples in both blocks, we can see that these are
> equal.
> What we can't see, is that aD.1712_3 and aD.1712_4 have the same value. For
> that, we need value numbering to number the vuses the same. And if we get value
> numbering to number these values equal, we don't need the structural comparison.
>
> In this case structural comparison cannot be used as fallback for
> value-numbering. So, ok for trunk?
>
> And if not ok, do you have a suggestion of how to deal with this otherwise?

Hmm.  I'm not sure we can conclude that they have the same value!

+int bar (int);
+void baz (int);
+void bla (int);
+
+void
+foo (int y)
+{
+  int a;
+  if (y == 6)
+    {
+      bla (5);
+      a = bar (7);
+    }
+  else
+    {
+      bla (5);
+      a = bar (7);
+    }
+  baz (a);
+}

I can implement bla as

void bla(int) { a = random (); }

thus a would not have the same value on both paths - but that is not what
matters - because obviously we can still merge the blocks.  That is,
we cannot be sure that the side-effects on memory of bla (5) and bla (5)
are the same.  But is that not what your value-numbering changes will assert?
(not sure how you achieve this, but it looks like you insert/lookup general
calls, thus not pure/const ones - that could easily lead to miscompiles(?)).

Richard.

Richard.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]