This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFC, Patch]: Optimized changes in the register used inside loop for LICM and IVOPTS.


On Fri, Nov 13, 2015 at 7:31 AM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Fri, Nov 13, 2015 at 2:13 PM, Jeff Law <law@redhat.com> wrote:
>> On 10/07/2015 10:32 PM, Ajit Kumar Agarwal wrote:
>>
>>>
>>> 0001-RFC-Patch-Optimized-changes-in-the-register-used-ins.patch
>>>
>>>
>>>  From f164fd80953f3cffd96a492c8424c83290cd43cc Mon Sep 17 00:00:00 2001
>>> From: Ajit Kumar Agarwal<ajitkum@xilix.com>
>>> Date: Wed, 7 Oct 2015 20:50:40 +0200
>>> Subject: [PATCH] [RFC, Patch]: Optimized changes in the register used
>>> inside
>>>   loop for LICM and IVOPTS.
>>>
>>> Changes are done in the Loop Invariant(LICM) at RTL level and also the
>>> Induction variable optimization based on SSA representation. The current
>>> logic used in LICM for register used inside the loops is changed. The
>>> Live Out of the loop latch node and the Live in of the destination of
>>> the exit nodes is used to set the Loops Liveness at the exit of the Loop.
>>> The register used is the number of live variables at the exit of the
>>> Loop calculated above.
>>>
>>> For Induction variable optimization on tree SSA representation, the
>>> register
>>> used logic is based on the number of phi nodes at the loop header to
>>> represent
>>> the liveness at the loop.  Current Logic used only the number of phi nodes
>>> at
>>> the loop header.  Changes are made to represent the phi operands also live
>>> at
>>> the loop. Thus number of phi operands also gets incremented in the number
>>> of
>>> registers used.
>>>
>>> ChangeLog:
>>> 2015-10-09  Ajit Agarwal<ajitkum@xilinx.com>
>>>
>>>         * loop-invariant.c (compute_loop_liveness): New.
>>>         (determine_regs_used): New.
>>>         (find_invariants_to_move): Use of determine_regs_used.
>>>         * tree-ssa-loop-ivopts.c (determine_set_costs): Consider the phi
>>>         arguments for register used.
>>
>> I think Bin rejected the tree-ssa-loop-ivopts change.  However, the
>> loop-invariant change is still pending, right?
> Ah, reject is a strong word, I am just being dumb and don't understand
> why it's a general better estimation yet.
> Maybe Richard have some inputs here?

Not really.  I agree with Bin that the change doesn't look like an improvement
by design (might be one by accident for some benchmarks).

Richard.

> Thanks,
> bin
>>
>>
>>>
>>> Signed-off-by:Ajit Agarwalajitkum@xilinx.com
>>> ---
>>>   gcc/loop-invariant.c       | 72
>>> +++++++++++++++++++++++++++++++++++++---------
>>>   gcc/tree-ssa-loop-ivopts.c |  4 +--
>>>   2 files changed, 60 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
>>> index 52c8ae8..e4291c9 100644
>>> --- a/gcc/loop-invariant.c
>>> +++ b/gcc/loop-invariant.c
>>> @@ -1413,6 +1413,19 @@ set_move_mark (unsigned invno, int gain)
>>>       }
>>>   }
>>>
>>> +static int
>>> +determine_regs_used()
>>> +{
>>> +  unsigned int j;
>>> +  unsigned int reg_used = 2;
>>> +  bitmap_iterator bi;
>>> +
>>> +  EXECUTE_IF_SET_IN_BITMAP (&LOOP_DATA (curr_loop)->regs_live, 0, j, bi)
>>> +    (reg_used) ++;
>>> +
>>> +  return reg_used;
>>> +}
>>
>> Isn't this just bitmap_count_bits (regs_live) + 2?
>>
>>
>>> @@ -2055,9 +2057,43 @@ calculate_loop_reg_pressure (void)
>>>       }
>>>   }
>>>
>>> -
>>> +static void
>>> +calculate_loop_liveness (void)
>>
>> Needs a function comment.
>>
>>
>>> +{
>>> +  basic_block bb;
>>> +  struct loop *loop;
>>>
>>> -/* Move the invariants out of the loops.  */
>>> +  FOR_EACH_LOOP (loop, 0)
>>> +    if (loop->aux == NULL)
>>> +      {
>>> +        loop->aux = xcalloc (1, sizeof (struct loop_data));
>>> +        bitmap_initialize (&LOOP_DATA (loop)->regs_live, &reg_obstack);
>>> +     }
>>> +
>>> +  FOR_EACH_BB_FN (bb, cfun)
>>
>> Why loop over blocks here?  Why not just iterate through all the loops in
>> the loop structure.  Order isn't particularly important AFAICT for this
>> code.
>>
>>
>>
>>> +       {
>>> +         int  i;
>>> +         edge e;
>>> +         vec<edge> edges;
>>> +         edges = get_loop_exit_edges (loop);
>>> +         FOR_EACH_VEC_ELT (edges, i, e)
>>> +         {
>>> +           bitmap_ior_into (&LOOP_DATA (loop)->regs_live,
>>> DF_LR_OUT(e->src));
>>> +           bitmap_ior_into (&LOOP_DATA (loop)->regs_live,
>>> DF_LR_IN(e->dest));
>>
>> Space before the open-paren in the previous two lines
>> DF_LR_OUT (e->src) and FD_LR_INT (e->dest))
>>
>>
>>> +         }
>>> +      }
>>> +  }
>>> +}
>>> +
>>> +/* Move the invariants  ut of the loops.  */
>>
>> Looks like you introduced a typo.
>>
>> I'd like to see testcases which show the change in # regs used computation
>> helping generate better code.
>>
>> And  I'd also like to see some background information on why you think this
>> is a more accurate measure for the number of registers used in the loop.
>> regs_used AFAICT is supposed to be an estimate of the registers live around
>> the loop.  So ISTM that you get that value by live-out set on the backedge
>> of the loop.  I guess you get somethign similar by looking at the exit
>> edge's source block's live-out set.  But I don't see any value  in including
>> stuff live at the block outside the loop.
>>
>> It also seems fairly non-intuitive.  Get the block's latch and use its
>> live-out set.  That seems more intuitive.
>>


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]