[PATCH] ira: Scale save/restore costs of callee save registers with block frequency
Vladimir Makarov
vmakarov@redhat.com
Thu Oct 5 16:16:17 GMT 2023
On 10/3/23 10:07, Surya Kumari Jangala wrote:
> ira: Scale save/restore costs of callee save registers with block frequency
>
> In assign_hard_reg(), when computing the costs of the hard registers, the
> cost of saving/restoring a callee-save hard register in prolog/epilog is
> taken into consideration. However, this cost is not scaled with the entry
> block frequency. Without scaling, the cost of saving/restoring is quite
> small and this can result in a callee-save register being chosen by
> assign_hard_reg() even though there are free caller-save registers
> available. Assigning a callee save register to a pseudo that is live
> in the entire function and across a call will cause shrink wrap to fail.
Thank you for addressing this part of the code. Changes that look
obvious sometimes have unpredictable results. I remember experimenting with
different heuristics for this code a long time ago, when 32-bit x86
was the major target, and this was the best variant I found. Since a lot
has changed since then, I decided to benchmark your change.
This change increases x86-64 spec2017 code size by 0.67% on
average. The increase is very consistent across 20 spec2017 benchmarks; only
the code for bwaves is smaller (by 0.01%). The specfp2017 performance is
the same. There is one positive impact: specint2017 improved by 0.6%
(8.59 vs 8.54), mainly because of improvements in xalancbmk (2.5%) and
exchange2 (5%).
So I propose making this change only when we are not optimizing for
code size. Also, please be prepared for possible testsuite
failures on other targets: some targets are overconstrained by tests
that expect specific generated code.
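
To make the proposal concrete, here is a rough standalone sketch of the
cost computation with the scaling applied only when optimizing for speed.
It is a toy model, not GCC code: the struct, the function name and the
optimize_for_speed flag are all hypothetical stand-ins. In ira-color.cc
the guard would presumably be a check along the lines of
optimize_function_for_speed_p (cfun), but that is an assumption, not a
tested change.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the values used in assign_hard_reg ().  */
struct reg_cost_params
{
  int mem_store_cost; /* models ira_memory_move_cost[mode][rclass][0] */
  int mem_load_cost;  /* models ira_memory_move_cost[mode][rclass][1] */
  int saved_nregs;    /* callee-save registers not yet saved */
  int nregs;          /* models hard_regno_nregs (hard_regno, mode) */
  int entry_freq;     /* models REG_FREQ_FROM_BB (ENTRY_BLOCK_PTR_FOR_FN (cfun)) */
};

/* Return the extra cost of picking a callee-save register.  The
   prologue/epilogue save/restore cost is scaled by the entry block
   frequency only when optimizing for speed; otherwise the unscaled
   cost from before the patch is kept.  */
static int
callee_save_add_cost (const struct reg_cost_params *p, bool optimize_for_speed)
{
  assert (p->nregs > 0);

  int add_cost = ((p->mem_store_cost + p->mem_load_cost)
                  * p->saved_nregs / p->nregs - 1);

  if (optimize_for_speed)
    add_cost *= p->entry_freq;

  return add_cost;
}
```

With illustrative values (store and load cost 4 each, one register to
save, one hard register per value, entry frequency 1000) this returns 7
when optimizing for size and 7000 when optimizing for speed, so the
callee-save register only becomes markedly less attractive in the
speed case.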
> 2023-10-03 Surya Kumari Jangala <jskumari@linux.ibm.com>
>
> gcc/
> PR rtl-optimization/111673
> * ira-color.cc (assign_hard_reg): Scale save/restore costs of
> callee save registers with block frequency.
>
> gcc/testsuite/
> PR rtl-optimization/111673
> * gcc.target/powerpc/pr111673.c: New test.
> ---
>
> diff --git a/gcc/ira-color.cc b/gcc/ira-color.cc
> index f2e8ea34152..eb20c52310d 100644
> --- a/gcc/ira-color.cc
> +++ b/gcc/ira-color.cc
> @@ -2175,7 +2175,8 @@ assign_hard_reg (ira_allocno_t a, bool retry_p)
> add_cost = ((ira_memory_move_cost[mode][rclass][0]
> + ira_memory_move_cost[mode][rclass][1])
> * saved_nregs / hard_regno_nregs (hard_regno,
> - mode) - 1);
> + mode) - 1)
> + * REG_FREQ_FROM_BB (ENTRY_BLOCK_PTR_FOR_FN (cfun));
> cost += add_cost;
> full_cost += add_cost;
> }
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr111673.c b/gcc/testsuite/gcc.target/powerpc/pr111673.c
> new file mode 100644
> index 00000000000..e0c0f85460a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr111673.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile { target lp64 } } */
> +/* { dg-options "-O2 -fdump-rtl-pro_and_epilogue" } */
> +
> +/* Verify there is an early return without the prolog and shrink-wrap
> + the function. */
> +
> +int f (int);
> +int
> +advance (int dz)
> +{
> + if (dz > 0)
> + return (dz + dz) * dz;
> + else
> + return dz * f (dz);
> +}
> +
> +/* { dg-final { scan-rtl-dump-times "Performing shrink-wrapping" 1 "pro_and_epilogue" } } */
>