This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix for PR51879 - Missed tail merging with non-const/pure calls


On 27/01/12 21:37, Tom de Vries wrote:
> On 24/01/12 11:40, Richard Guenther wrote:
>> On Mon, Jan 23, 2012 at 10:27 PM, Tom de Vries <Tom_deVries@mentor.com> wrote:
>>> Richard,
>>> Jakub,
>>>
>>> the following patch fixes PR51879.
>>>
>>> Consider the following test-case:
>>> ...
>>> int bar (int);
>>> void baz (int);
>>>
>>> void
>>> foo (int y)
>>> {
>>>  int a;
>>>  if (y == 6)
>>>    a = bar (7);
>>>  else
>>>    a = bar (7);
>>>  baz (a);
>>> }
>>> ...
>>>
>>> after compiling at -02, the representation looks like this before tail-merging:
>>> ...
>>>  # BLOCK 3 freq:1991
>>>  # PRED: 2 [19.9%]  (true,exec)
>>>  # .MEMD.1714_7 = VDEF <.MEMD.1714_6(D)>
>>>  # USE = nonlocal
>>>  # CLB = nonlocal
>>>  aD.1709_3 = barD.1703 (7);
>>>  goto <bb 5>;
>>>  # SUCC: 5 [100.0%]  (fallthru,exec)
>>>
>>>  # BLOCK 4 freq:8009
>>>  # PRED: 2 [80.1%]  (false,exec)
>>>  # .MEMD.1714_8 = VDEF <.MEMD.1714_6(D)>
>>>  # USE = nonlocal
>>>  # CLB = nonlocal
>>>  aD.1709_4 = barD.1703 (7);
>>>  # SUCC: 5 [100.0%]  (fallthru,exec)
>>>
>>>  # BLOCK 5 freq:10000
>>>  # PRED: 3 [100.0%]  (fallthru,exec) 4 [100.0%]  (fallthru,exec)
>>>  # aD.1709_1 = PHI <aD.1709_3(3), aD.1709_4(4)>
>>>  # .MEMD.1714_5 = PHI <.MEMD.1714_7(3), .MEMD.1714_8(4)>
>>>  # .MEMD.1714_9 = VDEF <.MEMD.1714_5>
>>>  # USE = nonlocal
>>>  # CLB = nonlocal
>>>  bazD.1705 (aD.1709_1);
>>>  # VUSE <.MEMD.1714_9>
>>>  return;
>>> ...
>>>
>>> the patch allows aD.1709_4 to be value numbered to aD.1709_3, and .MEMD.1714_8
>>> to .MEMD.1714_7, which enables tail-merging of blocks 4 and 5.
>>>
>>> The patch makes sure non-const/pure call results (gimple_vdef and
>>> gimple_call_lhs) are properly value numbered.
>>>
>>> Bootstrapped and reg-tested on x86_64.
>>>
>>> ok for stage1?
>>
>> The following cannot really work:
>>
>> @@ -2600,7 +2601,11 @@ visit_reference_op_call (tree lhs, gimpl
>>    result = vn_reference_lookup_1 (&vr1, NULL);
>>    if (result)
>>      {
>> -      changed = set_ssa_val_to (lhs, result);
>> +      tree result_vdef = gimple_vdef (SSA_NAME_DEF_STMT (result));
>> +      if (vdef)
>> +       changed |= set_ssa_val_to (vdef, result_vdef);
>> +      changed |= set_ssa_val_to (lhs, result);
>>
>> because 'result' may be not an SSA name.  It might also not have
>> a proper definition statement (if VN_INFO (result)->needs_insertion
>> is true).  So you at least need to guard things properly.
>>
> 
> Right. And that also doesn't work if the function is without lhs, such as in the
> new test-case pr51879-6.c.
> 
> I fixed this by storing both lhs and vdef, such that I don't have to derive
> the vdef from the lhs.
> 
>> (On a side-note - I _did_ want to remove value-numbering virtual operands
>> at some point ...)
>>
> 
> Doing so willl hurt performance of tail-merging in its current form.
> OTOH, http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51964#c0 shows that
> value numbering as used in tail-merging has its limitations too.
> Do you have any ideas how to address that one?
> 
>> @@ -3359,8 +3366,10 @@ visit_use (tree use)
>>           /* ???  We should handle stores from calls.  */
>>           else if (TREE_CODE (lhs) == SSA_NAME)
>>             {
>> +             tree vuse = gimple_vuse (stmt);
>>               if (!gimple_call_internal_p (stmt)
>> -                 && gimple_call_flags (stmt) & (ECF_PURE | ECF_CONST))
>> +                 && (gimple_call_flags (stmt) & (ECF_PURE | ECF_CONST)
>> +                     || (vuse && SSA_VAL (vuse) != VN_TOP)))
>>                 changed = visit_reference_op_call (lhs, stmt);
>>               else
>>                 changed = defs_to_varying (stmt);
>>
>> ... exactly because of the issue that a stmt has multiple defs.  Btw,
>> vuse should have been visited here or be part of our SCC, so, why do
>> you need this check?
>>
> 
> Removed now, that was a workaround for a bug in an earlier version of the patch,
> that I didn't need anymore.
> 
> Bootstrapped and reg-tested on x86_64.
> 
> OK for stage1?
> 

Richard,

quoting you in http://gcc.gnu.org/ml/gcc-patches/2012-02/msg00618.html:
...
I think these fixes hint at that we should
use "structural" equality as fallback if value-numbering doesn't equate
two stmt effects.  Thus, treat two stmts with exactly the same operands
and flags as equal and using value-numbering to canonicalize operands
(when they are SSA names) for that comparison, or use VN entirely
if there are no side-effects on the stmt.

Changing value-numbering of virtual operands, even if it looks correct in the
simple cases you change, doesn't look like a general solution for the missed
tail merging opportunities.
...

The test-case pr51879-6.c shows a case where improving value numbering will help
tail-merging, but structural equality comparison not:
...
  # BLOCK 3 freq:1991
  # PRED: 2 [19.9%]  (true,exec)
  # .MEMD.1717_7 = VDEF <.MEMD.1717_6(D)>
  # USE = nonlocal
  # CLB = nonlocal
  blaD.1708 (5);
  # .MEMD.1717_8 = VDEF <.MEMD.1717_7>
  # USE = nonlocal
  # CLB = nonlocal
  aD.1712_3 = barD.1704 (7);
  goto <bb 5>;
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 4 freq:8009
  # PRED: 2 [80.1%]  (false,exec)
  # .MEMD.1717_9 = VDEF <.MEMD.1717_6(D)>
  # USE = nonlocal
  # CLB = nonlocal
  blaD.1708 (5);
  # .MEMD.1717_10 = VDEF <.MEMD.1717_9>
  # USE = nonlocal
  # CLB = nonlocal
  aD.1712_4 = barD.1704 (7);
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 5 freq:10000
  # PRED: 3 [100.0%]  (fallthru,exec) 4 [100.0%]  (fallthru,exec)
  # aD.1712_1 = PHI <aD.1712_3(3), aD.1712_4(4)>
  # .MEMD.1717_5 = PHI <.MEMD.1717_8(3), .MEMD.1717_10(4)>
  # .MEMD.1717_11 = VDEF <.MEMD.1717_5>
  # USE = nonlocal
  # CLB = nonlocal
  bazD.1706 (aD.1712_1);
  # VUSE <.MEMD.1717_11>
  return;
...
by structurally comparing the gimples in both blocks, we can see that these are
equal.
What we can't see, is that aD.1712_3 and aD.1712_4 have the same value. For
that, we need value numbering to number the vuses the same. And if we get value
numbering to number these values equal, we don't need the structural comparison.

In this case structural comparison cannot be used as fallback for
value-numbering. So, ok for trunk?

And if not ok, do you have a suggestion of how to deal with this otherwise?

Thanks,
- Tom

> Thanks,
> - Tom
> 
>> Thanks,
>> Richard.
>>
> 
> 2012-01-27  Tom de Vries  <tom@codesourcery.com>
> 
> 	PR tree-optimization/51879
> 	* tree-ssa-sccvn.h (struct vn_reference_s): Add vdef field.
> 	* tree-ssa-sccvn.c (visit_reference_op_call): Handle gimple_vdef.
> 	Handle case that lhs is NULL_TREE.
> 	(visit_use): Handle non-pure/const calls and calls without result using
> 	visit_reference_op_call.
> 
> 	gcc.dg/pr51879.c: New test.
> 	gcc.dg/pr51879-2.c: Same.
> 	gcc.dg/pr51879-3.c: Same.
> 	gcc.dg/pr51879-4.c: Same.
> 	gcc.dg/pr51879-6.c: Same.
> 


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]