[PATCH][AArch64] Fix aarch64_ira_change_pseudo_allocno_class

Richard Sandiford richard.sandiford@linaro.org
Tue May 22 17:17:00 GMT 2018


Wilco Dijkstra <Wilco.Dijkstra@arm.com> writes:

> A recent commit removing '*' from the md files caused a large regression
> in h264ref.
> It turns out aarch64_ira_change_pseudo_allocno_class is no longer
> effective after the
> SVE changes, and the combination results in the regression.  This patch
> fixes it by
> using the new POINTER_AND_FP_REGS register class which is now used
> instead of ALL_REGS.
> Add a missing ? to aarch64_get_lane to fix a failure in the testsuite.
>
> Passes regress, OK for commit?
>
> Since it is a regression introduced in GCC8, OK to backport to GCC8?
>
> ChangeLog:
> 2018-05-22  Wilco Dijkstra  <wdijkstr@arm.com>
>
> 	* config/aarch64/aarch64.c (aarch64_ira_change_pseudo_allocno_class):
> 	Use POINTER_AND_FP_REGSinstead of ALL_REGS.
> 	* config/aarch64/aarch64-simd.md (aarch64_get_lane): Increase
> cost of r=w alternative.
> --
>
> diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
> index 2ebd256329c1a6a6b790d16955cbcee3feca456c..3d5fe44b53198a92afb726712c6e9dee890afe38 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -2961,7 +2961,7 @@ (define_insn "*aarch64_get_lane_zero_extendsi<mode>"
>  ;; is guaranteed so upper bits should be considered undefined.
>  ;; RTL uses GCC vector extension indices throughout so flip only for assembly.
>  (define_insn "aarch64_get_lane<mode>"
> -  [(set (match_operand:<VEL> 0 "aarch64_simd_nonimmediate_operand" "=r, w, Utv")
> +  [(set (match_operand:<VEL> 0 "aarch64_simd_nonimmediate_operand" "=?r, w, Utv")
>  	(vec_select:<VEL>
>  	  (match_operand:VALL_F16 1 "register_operand" "w, w, w")
>  	  (parallel [(match_operand:SI 2 "immediate_operand" "i, i, i")])))]
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 47d98dfd095cdcd15908a86091cf2f8a4d6137b1..a119760c7f332aded200fa1b5bcfb1bbac7b6420 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -1059,16 +1059,17 @@ aarch64_err_no_fpadvsimd (machine_mode mode, const char *msg)
>  }
>  
>  /* Implement TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS.
> -   The register allocator chooses ALL_REGS if FP_REGS and GENERAL_REGS have
> -   the same cost even if ALL_REGS has a much larger cost.  ALL_REGS is also
> -   used if the cost of both FP_REGS and GENERAL_REGS is lower than the memory
> -   cost (in this case the best class is the lowest cost one).  Using ALL_REGS
> -   irrespectively of its cost results in bad allocations with many redundant
> -   int<->FP moves which are expensive on various cores.
> -   To avoid this we don't allow ALL_REGS as the allocno class, but force a
> -   decision between FP_REGS and GENERAL_REGS.  We use the allocno class if it
> -   isn't ALL_REGS.  Similarly, use the best class if it isn't ALL_REGS.
> -   Otherwise set the allocno class depending on the mode.
> +   The register allocator chooses POINTER_AND_FP_REGS if FP_REGS and
> +   GENERAL_REGS have the same cost - even if POINTER_AND_FP_REGS has a much
> +   higher cost.  POINTER_AND_FP_REGS is also used if the cost of both FP_REGS
> +   and GENERAL_REGS is lower than the memory cost (in this case the best class
> +   is the lowest cost one).  Using POINTER_AND_FP_REGS irrespectively of its
> +   cost results in bad allocations with many redundant int<->FP moves which
> +   are expensive on various cores.
> +   To avoid this we don't allow POINTER_AND_FP_REGS as the allocno class, but
> +   force a decision between FP_REGS and GENERAL_REGS.  We use the allocno class
> +   if it isn't POINTER_AND_FP_REGS.  Similarly, use the best class if it isn't
> +   POINTER_AND_FP_REGS.  Otherwise set the allocno class depending on the mode.
>     The result of this is that it is no longer inefficient to have a higher
>     memory move cost than the register move cost.
>  */
> @@ -1079,10 +1080,10 @@ aarch64_ira_change_pseudo_allocno_class (int regno, reg_class_t allocno_class,
>  {
>    machine_mode mode;
>  
> -  if (allocno_class != ALL_REGS)
> +  if (allocno_class != POINTER_AND_FP_REGS)
>      return allocno_class;
>  
> -  if (best_class != ALL_REGS)
> +  if (best_class != POINTER_AND_FP_REGS)
>      return best_class;
>  
>    mode = PSEUDO_REGNO_MODE (regno);

I think it'd be better to use !reg_class_subset_p (POINTER_AND_FP_REGS, ...)
instead of ... != POINTER_AND_FP_REGS, since this in principle still applies
to ALL_REGS too.

FWIW, the patch looks good to me with that change.

Thanks,
Richard



More information about the Gcc-patches mailing list