This is the mail archive of the
mailing list for the GCC project.
RE: regs_used estimation in IVOPTS seriously flawed
- From: Bingfeng Mei <bmei at broadcom dot com>
- To: Bin.Cheng <amker dot cheng at gmail dot com>
- Cc: "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
- Date: Fri, 20 Jun 2014 09:01:43 +0000
- Subject: RE: regs_used estimation in IVOPTS seriously flawed
- Authentication-results: sourceware.org; auth=none
- References: <B71DF1153024A14EABB94E39368E44A6042CA205 at SJEXCHMB13 dot corp dot ad dot broadcom dot com> <CAHFci29vqVpLZGqqi-PQ+8R0+ity+Qbp7VaO+ksQProj9BqJeA at mail dot gmail dot com>
> -----Original Message-----
> From: Bin.Cheng [mailto:firstname.lastname@example.org]
> Sent: 20 June 2014 06:25
> To: Bingfeng Mei
> Cc: email@example.com
> Subject: Re: regs_used estimation in IVOPTS seriously flawed
> On Tue, Jun 17, 2014 at 10:59 PM, Bingfeng Mei <firstname.lastname@example.org> wrote:
> > Hi,
> > I am looking at a performance regression in our code. A big loop
> > uses a lot of temporary variables inside the loop body. It
> > appears that the IVOPTS pass creates even more induction variables (from
> > 2 to 27). It causes a lot of register spilling later and performance
> Do you have a simplified case which can be posted here? I guess it
> affects some other targets too.
> > takes a severe hit. I looked into tree-ssa-loop-ivopts.c; it does call the
> > estimate_reg_pressure_cost function to take the # of registers into
> > consideration. The second parameter, passed as data->regs_used, is
> > meant to represent old register usage before IVOPTS.
> >   return size + estimate_reg_pressure_cost (size, data->regs_used, data->speed,
> >                                             data->body_includes_call);
> > In this case, it is a mere 2 by the following calculation. Essentially, it only
> > counts all loop invariant registers, ignoring all registers produced/used
> > inside the loop.
> There are two kinds of registers produced/used inside the loop. One
> kind is unrelated to induction variables; it includes non-linear uses, as
> mentioned by Richard. The other kind relates to induction variable
> rewriting, and one issue with this kind is that the expression generated during
> iv use rewriting is not reflecting the estimated one in ivopt very
As a short-term solution, I tried some simple non-linear functions, as Richard suggested,
to penalize using too many IVs. For example, the following cost in
ivopts_global_cost_for_size fixed my regression and actually improves performance
slightly over a set of benchmarks we usually use.
  return size * (1 + size * 0.2)
         + estimate_reg_pressure_cost (size, data->regs_used, data->speed,
                                       data->body_includes_call);
The trouble is that the choice of this non-linear function could be highly target-dependent
(# of registers?). I don't have a setup to prove a performance gain on other targets.
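
To make the effect of the penalty concrete, here is a minimal sketch that compares the linear and the non-linear size term in isolation. It is not GCC code: model_reg_pressure_cost is a simplified, hypothetical stand-in for estimate_reg_pressure_cost (it ignores the speed and call arguments), and the register counts and costs in the enum are made-up values, which is exactly the target-dependent part.

```c
#include <assert.h>

/* Hypothetical target parameters.  Inside GCC these come from the target
   description; the numbers below are assumptions for illustration only.  */
enum { AVAIL_REGS = 16, RESERVED_REGS = 3, REG_COST = 1, SPILL_COST = 8 };

/* Simplified stand-in for estimate_reg_pressure_cost: free while registers
   are plentiful, cheap when close to the limit, expensive once spilling.  */
static unsigned
model_reg_pressure_cost (unsigned n_new, unsigned n_old)
{
  unsigned needed = n_new + n_old;

  if (needed + RESERVED_REGS <= AVAIL_REGS)
    return 0;
  if (needed <= AVAIL_REGS)
    return REG_COST * n_new;
  return SPILL_COST * n_new;
}

/* Cost with the plain linear size term.  */
static unsigned
linear_cost (unsigned size, unsigned regs_used)
{
  return size + model_reg_pressure_cost (size, regs_used);
}

/* Cost with the non-linear size term from this mail.  */
static unsigned
nonlinear_cost (unsigned size, unsigned regs_used)
{
  assert (size < 10000);  /* Keep the quadratic term well inside unsigned.  */
  return (unsigned) (size * (1 + size * 0.2))
	 + model_reg_pressure_cost (size, regs_used);
}
```

With regs_used = 2, both variants give the same cost for 2 IVs, while for 27 IVs the non-linear size term is several times larger than the linear one. The 0.2 factor is where the tuning would have to happen per target.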
I also tried counting all SSA names and dividing the count by a factor. It does seem to work
Long term, if we had infrastructure to analyze the maximal number of live variables in a loop
at tree level, that would be great for many loop optimizations.
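
As a rough idea of what such an analysis computes, here is a minimal sketch for the simplest case: a loop whose body is a single basic block. It is not existing GCC infrastructure. The stmt structure, the bitmask representation (at most 64 variables) and the function names are all hypothetical. It does a backward liveness scan over the body and iterates to a fixed point so that values carried around the back edge are counted as live at the bottom of the loop.

```c
#include <assert.h>
#include <stdint.h>

/* One statement of a hypothetical single-block loop body.  Variables are
   numbered 0..63 and variable sets are bitmasks.  */
struct stmt
{
  int def;        /* Index of the defined variable, or -1 for none.  */
  uint64_t uses;  /* Bitmask of the variables read.  */
};

static unsigned
popcount64 (uint64_t x)
{
  unsigned n = 0;
  for (; x; x &= x - 1)
    n++;
  return n;
}

/* Return the maximal number of simultaneously live variables in the loop
   body.  EXIT_LIVE is the set of variables live after the loop.  */
static unsigned
max_live_in_loop (const struct stmt *body, int n, uint64_t exit_live)
{
  uint64_t live_out = exit_live;

  assert (n >= 0);
  for (;;)
    {
      uint64_t live = live_out;
      unsigned max_live = popcount64 (live);

      /* Backward scan: kill the definition, then add the uses.  */
      for (int i = n - 1; i >= 0; i--)
	{
	  assert (body[i].def < 64);
	  if (body[i].def >= 0)
	    live &= ~(UINT64_C (1) << body[i].def);
	  live |= body[i].uses;
	  if (popcount64 (live) > max_live)
	    max_live = popcount64 (live);
	}

      /* Whatever is live at the top is also live at the bottom, because
	 of the back edge.  Stop once the set no longer grows.  */
      uint64_t next = exit_live | live;
      if (next == live_out)
	return max_live;
      live_out = next;
    }
}
```

For a body like "t = base[i]; sum = sum + t; i = i + 1" with sum live after the loop, this reports 4 (i, sum, t and base are all live between the first two statements), whereas counting only the loop-invariant register would give 1. A real implementation would have to handle multiple basic blocks and nested loops.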