This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [RFC, Patch]: Optimized changes in the register used inside loop for LICM and IVOPTS.
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: "Bin.Cheng" <amker dot cheng at gmail dot com>
- Cc: Jeff Law <law at redhat dot com>, Ajit Kumar Agarwal <ajit dot kumar dot agarwal at xilinx dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>, Vinod Kathail <vinodk at xilinx dot com>, Shail Aditya Gupta <shailadi at xilinx dot com>, Vidhumouli Hunsigida <vidhum at xilinx dot com>, Nagaraju Mekala <nmekala at xilinx dot com>
- Date: Fri, 13 Nov 2015 11:18:20 +0100
- Subject: Re: [RFC, Patch]: Optimized changes in the register used inside loop for LICM and IVOPTS.
- Authentication-results: sourceware.org; auth=none
- References: <37378DC5BCD0EE48BA4B082E0B55DFAA4299E44D at XAP-PVEXMBX02 dot xlnx dot xilinx dot com> <56457FA0 dot 1070201 at redhat dot com> <CAHFci2-hsxg0fzcnVe5NKc6_Ff1+EkOoSm6+6u=RJXcLj59N-g at mail dot gmail dot com>
On Fri, Nov 13, 2015 at 7:31 AM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Fri, Nov 13, 2015 at 2:13 PM, Jeff Law <law@redhat.com> wrote:
>> On 10/07/2015 10:32 PM, Ajit Kumar Agarwal wrote:
>>
>>>
>>> 0001-RFC-Patch-Optimized-changes-in-the-register-used-ins.patch
>>>
>>>
>>> From f164fd80953f3cffd96a492c8424c83290cd43cc Mon Sep 17 00:00:00 2001
>>> From: Ajit Kumar Agarwal<ajitkum@xilix.com>
>>> Date: Wed, 7 Oct 2015 20:50:40 +0200
>>> Subject: [PATCH] [RFC, Patch]: Optimized changes in the register used
>>> inside
>>> loop for LICM and IVOPTS.
>>>
>>> Changes are done in the Loop Invariant(LICM) at RTL level and also the
>>> Induction variable optimization based on SSA representation. The current
>>> logic used in LICM for register used inside the loops is changed. The
>>> Live Out of the loop latch node and the Live in of the destination of
>>> the exit nodes is used to set the Loops Liveness at the exit of the Loop.
>>> The register used is the number of live variables at the exit of the
>>> Loop calculated above.
>>>
>>> For Induction variable optimization on tree SSA representation, the
>>> register
>>> used logic is based on the number of phi nodes at the loop header to
>>> represent
>>> the liveness at the loop. Current Logic used only the number of phi nodes
>>> at
>>> the loop header. Changes are made to represent the phi operands also live
>>> at
>>> the loop. Thus number of phi operands also gets incremented in the number
>>> of
>>> registers used.
>>>
>>> ChangeLog:
>>> 2015-10-09 Ajit Agarwal<ajitkum@xilinx.com>
>>>
>>> * loop-invariant.c (compute_loop_liveness): New.
>>> (determine_regs_used): New.
>>> (find_invariants_to_move): Use of determine_regs_used.
>>> * tree-ssa-loop-ivopts.c (determine_set_costs): Consider the phi
>>> arguments for register used.
>>
>> I think Bin rejected the tree-ssa-loop-ivopts change. However, the
>> loop-invariant change is still pending, right?
> Ah, reject is a strong word, I am just being dumb and don't understand
> why it's a general better estimation yet.
> Maybe Richard have some inputs here?
Not really. I agree with Bin that the change doesn't look like an improvement
by design (might be one by accident for some benchmarks).
Richard.
> Thanks,
> bin
>>
>>
>>>
>>> Signed-off-by:Ajit Agarwalajitkum@xilinx.com
>>> ---
>>> gcc/loop-invariant.c | 72
>>> +++++++++++++++++++++++++++++++++++++---------
>>> gcc/tree-ssa-loop-ivopts.c | 4 +--
>>> 2 files changed, 60 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
>>> index 52c8ae8..e4291c9 100644
>>> --- a/gcc/loop-invariant.c
>>> +++ b/gcc/loop-invariant.c
>>> @@ -1413,6 +1413,19 @@ set_move_mark (unsigned invno, int gain)
>>> }
>>> }
>>>
>>> +static int
>>> +determine_regs_used()
>>> +{
>>> + unsigned int j;
>>> + unsigned int reg_used = 2;
>>> + bitmap_iterator bi;
>>> +
>>> + EXECUTE_IF_SET_IN_BITMAP (&LOOP_DATA (curr_loop)->regs_live, 0, j, bi)
>>> + (reg_used) ++;
>>> +
>>> + return reg_used;
>>> +}
>>
>> Isn't this just bitmap_count_bits (regs_live) + 2?
>>
>>
>>> @@ -2055,9 +2057,43 @@ calculate_loop_reg_pressure (void)
>>> }
>>> }
>>>
>>> -
>>> +static void
>>> +calculate_loop_liveness (void)
>>
>> Needs a function comment.
>>
>>
>>> +{
>>> + basic_block bb;
>>> + struct loop *loop;
>>>
>>> -/* Move the invariants out of the loops. */
>>> + FOR_EACH_LOOP (loop, 0)
>>> + if (loop->aux == NULL)
>>> + {
>>> + loop->aux = xcalloc (1, sizeof (struct loop_data));
>>> + bitmap_initialize (&LOOP_DATA (loop)->regs_live, ®_obstack);
>>> + }
>>> +
>>> + FOR_EACH_BB_FN (bb, cfun)
>>
>> Why loop over blocks here? Why not just iterate through all the loops in
>> the loop structure. Order isn't particularly important AFAICT for this
>> code.
>>
>>
>>
>>> + {
>>> + int i;
>>> + edge e;
>>> + vec<edge> edges;
>>> + edges = get_loop_exit_edges (loop);
>>> + FOR_EACH_VEC_ELT (edges, i, e)
>>> + {
>>> + bitmap_ior_into (&LOOP_DATA (loop)->regs_live,
>>> DF_LR_OUT(e->src));
>>> + bitmap_ior_into (&LOOP_DATA (loop)->regs_live,
>>> DF_LR_IN(e->dest));
>>
>> Space before the open-paren in the previous two lines
>> DF_LR_OUT (e->src) and FD_LR_INT (e->dest))
>>
>>
>>> + }
>>> + }
>>> + }
>>> +}
>>> +
>>> +/* Move the invariants ut of the loops. */
>>
>> Looks like you introduced a typo.
>>
>> I'd like to see testcases which show the change in # regs used computation
>> helping generate better code.
>>
>> And I'd also like to see some background information on why you think this
>> is a more accurate measure for the number of registers used in the loop.
>> regs_used AFAICT is supposed to be an estimate of the registers live around
>> the loop. So ISTM that you get that value by live-out set on the backedge
>> of the loop. I guess you get somethign similar by looking at the exit
>> edge's source block's live-out set. But I don't see any value in including
>> stuff live at the block outside the loop.
>>
>> It also seems fairly non-intuitive. Get the block's latch and use its
>> live-out set. That seems more intuitive.
>>