This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: regs_used estimation in IVOPTS seriously flawed
- From: "Bin.Cheng" <amker dot cheng at gmail dot com>
- To: Bingfeng Mei <bmei at broadcom dot com>
- Cc: "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
- Date: Fri, 20 Jun 2014 13:25:17 +0800
- Subject: Re: regs_used estimation in IVOPTS seriously flawed
- References: <B71DF1153024A14EABB94E39368E44A6042CA205 at SJEXCHMB13 dot corp dot ad dot broadcom dot com>
On Tue, Jun 17, 2014 at 10:59 PM, Bingfeng Mei <bmei@broadcom.com> wrote:
> Hi,
> I am looking at a performance regression in our code. A big loop produces
> and uses a lot of temporary variables inside the loop body. The problem
> appears to be that the IVOPTS pass creates even more induction variables
> (from the original 2 to 27). This causes a lot of register spilling later, and performance
Do you have a simplified test case that you can post here? I guess this
affects some other targets too.
> takes a severe hit. I looked into tree-ssa-loop-ivopts.c; it does call
> the estimate_reg_pressure_cost function to take the number of registers
> into consideration. The second parameter, passed as data->regs_used, is
> supposed to represent the old register usage before IVOPTS.
>
>   return size + estimate_reg_pressure_cost (size, data->regs_used, data->speed,
>                                             data->body_includes_call);
>
> In this case, it is a mere 2 according to the following calculation. Essentially,
> it counts only loop-invariant registers, ignoring all registers produced/used inside the loop.
There are two kinds of registers produced/used inside the loop. The
first kind is unrelated to induction variables; it includes the
non-linear uses mentioned by Richard. The second kind comes from
induction variable rewriting, and one issue with this kind is that the
expressions generated when rewriting iv uses do not match the ones
ivopts estimated very well.
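To illustrate why an undercounted n_old matters, here is a minimal
standalone sketch of a threshold-style register pressure cost. It is
not the GCC implementation: the function name model_reg_pressure_cost
and all constants are hypothetical, picked only for illustration, and
the speed parameter is left out for brevity.

```c
#include <stdbool.h>

/* Hypothetical target parameters, chosen only for illustration.  */
enum
{
  AVAIL_REGS = 16,     /* registers usable inside the loop */
  CLOBBERED_REGS = 6,  /* registers lost when the body contains a call */
  RES_REGS = 3,        /* registers kept in reserve for temporaries */
  REG_COST = 1,        /* cost per new register when close to the limit */
  SPILL_COST = 8       /* cost per new register once spilling is expected */
};

/* Simplified model of a register pressure cost: free while everything
   fits with some reserve, cheap near the limit, expensive beyond it.
   N_NEW is the number of registers the candidate set would add, N_OLD
   the number assumed to be live already.  */
unsigned
model_reg_pressure_cost (unsigned n_new, unsigned n_old, bool call_p)
{
  unsigned available = AVAIL_REGS;
  unsigned needed = n_new + n_old;

  /* A call in the loop body makes the call-clobbered registers
     unavailable.  */
  if (call_p)
    available -= CLOBBERED_REGS;

  if (needed + RES_REGS <= available)
    return 0;
  if (needed <= available)
    return REG_COST * n_new;
  return SPILL_COST * n_new;
}
```

With these made-up numbers, model_reg_pressure_cost (8, 2, false)
returns 0, while model_reg_pressure_cost (8, 12, false) returns 64. In
a model of this shape, an n_old that ignores registers live in the loop
body can make a large set of new ivs look free.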
Thanks,
bin
>
>   n = 0;
>   for (psi = gsi_start_phis (loop->header); !gsi_end_p (psi); gsi_next (&psi))
>     {
>       phi = gsi_stmt (psi);
>       op = PHI_RESULT (phi);
>
>       if (virtual_operand_p (op))
>         continue;
>
>       if (get_iv (data, op))
>         continue;
>
>       n++;
>     }
>
>   EXECUTE_IF_SET_IN_BITMAP (data->relevant, 0, j, bi)
>     {
>       struct version_info *info = ver_info (data, j);
>
>       if (info->inv_id && info->has_nonlin_use)
>         n++;
>     }
>
>   data->regs_used = n;
>
> I believe the way regs_used is calculated is seriously flawed,
> or estimate_reg_pressure_cost is problematic if n_old is
> only supposed to count loop-invariant registers. Either way,
> it affects how IVOPTS makes decisions and could result in
> worse code. What do you think? Any ideas on how to improve
> this?
>
>
> Thanks,
> Bingfeng
>
--
Best Regards.