[PATCH][AArch64] Fix aarch64_ira_change_pseudo_allocno_class
Richard Sandiford
richard.sandiford@linaro.org
Tue May 22 17:17:00 GMT 2018
Wilco Dijkstra <Wilco.Dijkstra@arm.com> writes:
> A recent commit removing '*' from the md files caused a large regression
> in h264ref.
> It turns out aarch64_ira_change_pseudo_allocno_class is no longer
> effective after the
> SVE changes, and the combination results in the regression. This patch
> fixes it by
> using the new POINTER_AND_FP_REGS register class which is now used
> instead of ALL_REGS.
> Add a missing ? to aarch64_get_lane to fix a failure in the testsuite.
>
> Passes regress, OK for commit?
>
> Since it is a regression introduced in GCC8, OK to backport to GCC8?
>
> ChangeLog:
> 2018-05-22 Wilco Dijkstra <wdijkstr@arm.com>
>
> * config/aarch64/aarch64.c (aarch64_ira_change_pseudo_allocno_class):
> Use POINTER_AND_FP_REGSinstead of ALL_REGS.
> * config/aarch64/aarch64-simd.md (aarch64_get_lane): Increase
> cost of r=w alternative.
> --
>
> diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
> index 2ebd256329c1a6a6b790d16955cbcee3feca456c..3d5fe44b53198a92afb726712c6e9dee890afe38 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -2961,7 +2961,7 @@ (define_insn "*aarch64_get_lane_zero_extendsi<mode>"
> ;; is guaranteed so upper bits should be considered undefined.
> ;; RTL uses GCC vector extension indices throughout so flip only for assembly.
> (define_insn "aarch64_get_lane<mode>"
> - [(set (match_operand:<VEL> 0 "aarch64_simd_nonimmediate_operand" "=r, w, Utv")
> + [(set (match_operand:<VEL> 0 "aarch64_simd_nonimmediate_operand" "=?r, w, Utv")
> (vec_select:<VEL>
> (match_operand:VALL_F16 1 "register_operand" "w, w, w")
> (parallel [(match_operand:SI 2 "immediate_operand" "i, i, i")])))]
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 47d98dfd095cdcd15908a86091cf2f8a4d6137b1..a119760c7f332aded200fa1b5bcfb1bbac7b6420 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -1059,16 +1059,17 @@ aarch64_err_no_fpadvsimd (machine_mode mode, const char *msg)
> }
>
> /* Implement TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS.
> - The register allocator chooses ALL_REGS if FP_REGS and GENERAL_REGS have
> - the same cost even if ALL_REGS has a much larger cost. ALL_REGS is also
> - used if the cost of both FP_REGS and GENERAL_REGS is lower than the memory
> - cost (in this case the best class is the lowest cost one). Using ALL_REGS
> - irrespectively of its cost results in bad allocations with many redundant
> - int<->FP moves which are expensive on various cores.
> - To avoid this we don't allow ALL_REGS as the allocno class, but force a
> - decision between FP_REGS and GENERAL_REGS. We use the allocno class if it
> - isn't ALL_REGS. Similarly, use the best class if it isn't ALL_REGS.
> - Otherwise set the allocno class depending on the mode.
> + The register allocator chooses POINTER_AND_FP_REGS if FP_REGS and
> + GENERAL_REGS have the same cost - even if POINTER_AND_FP_REGS has a much
> + higher cost. POINTER_AND_FP_REGS is also used if the cost of both FP_REGS
> + and GENERAL_REGS is lower than the memory cost (in this case the best class
> + is the lowest cost one). Using POINTER_AND_FP_REGS irrespectively of its
> + cost results in bad allocations with many redundant int<->FP moves which
> + are expensive on various cores.
> + To avoid this we don't allow POINTER_AND_FP_REGS as the allocno class, but
> + force a decision between FP_REGS and GENERAL_REGS. We use the allocno class
> + if it isn't POINTER_AND_FP_REGS. Similarly, use the best class if it isn't
> + POINTER_AND_FP_REGS. Otherwise set the allocno class depending on the mode.
> The result of this is that it is no longer inefficient to have a higher
> memory move cost than the register move cost.
> */
> @@ -1079,10 +1080,10 @@ aarch64_ira_change_pseudo_allocno_class (int regno, reg_class_t allocno_class,
> {
> machine_mode mode;
>
> - if (allocno_class != ALL_REGS)
> + if (allocno_class != POINTER_AND_FP_REGS)
> return allocno_class;
>
> - if (best_class != ALL_REGS)
> + if (best_class != POINTER_AND_FP_REGS)
> return best_class;
>
> mode = PSEUDO_REGNO_MODE (regno);
I think it'd be better to use !reg_class_subset_p (POINTER_AND_FP_REGS, ...)
instead of ... != POINTER_AND_FP_REGS, since this in principle still applies
to ALL_REGS too.
FWIW, the patch looks good to me with that change.
Thanks,
Richard
More information about the Gcc-patches
mailing list