This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RS6000] PR89271, gcc.target/powerpc/vsx-simode2.c


Hi!

On Thu, Mar 28, 2019 at 12:10:06AM +1030, Alan Modra wrote:
> This patch makes a number of corrections to rs6000_register_move_cost,
> adds a new register union class, GEN_OR_VSX_REGS, and adjusts insn
> alternative to suit.

Cool beans.

[ snip various great explanations, thanks! ]

> TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS is needed for rs6000 in order
> to fix the 20% cactus_adm spec regression when using GEN_OR_VSX_REGS
> as an allocno class.  It is similar to the aarch64 version but without
> any selection by regno mode if the best class is a union class.

It would be nice if we could do without such hacks.  Alas.

> OK assuming Pat's latest spec test run doesn't contain any nasty
> surprises?

Not for stage 4, no.  Sorry.  But it should be great in stage 1 :-)

> diff --git a/gcc/config/rs6000/darwin.h b/gcc/config/rs6000/darwin.h
> index 9fb36e41e7d..20c59f89c8f 100644
> --- a/gcc/config/rs6000/darwin.h
> +++ b/gcc/config/rs6000/darwin.h
> @@ -346,7 +346,8 @@ extern int darwin_emit_branch_islands;
>        && reg_class_subset_p (BASE_REGS, (CLASS)))		\
>     ? BASE_REGS							\
>     : (GET_MODE_CLASS (GET_MODE (X)) == MODE_INT			\
> -      && (CLASS) == GEN_OR_FLOAT_REGS)				\
> +      && ((CLASS) == GEN_OR_FLOAT_REGS				\
> +	  || (CLASS) == GEN_OR_VSX_REGS))			\
>     ? GENERAL_REGS						\
>     : (CLASS))

Darwin doesn't do VSX at all...  But maybe there is something that can get
allocated to both FPRs and VRs, sure.  And GPRs.

This PREFERRED_RELOAD_CLASS, but it makes me worried if there are places
where we should care about this change for correctness.

> @@ -20236,7 +20239,8 @@ rs6000_preferred_reload_class (rtx x, enum reg_class rclass)
>        return NO_REGS;
>      }
>  
> -  if (GET_MODE_CLASS (mode) == MODE_INT && rclass == GEN_OR_FLOAT_REGS)
> +  if (GET_MODE_CLASS (mode) == MODE_INT
> +      && (rclass == GEN_OR_FLOAT_REGS || rclass == GEN_OR_VSX_REGS))
>      return GENERAL_REGS;

Maybe do this whenever rclass contains the GPRs?

> @@ -34966,22 +34970,56 @@ rs6000_register_move_cost (machine_mode mode,
>  			   reg_class_t from, reg_class_t to)
>  {
>    int ret;
> +  reg_class_t rclass;
>  
>    if (TARGET_DEBUG_COST)
>      dbg_cost_ctrl++;
>  
> +  /* If we have VSX, we can easily move between FPR or Altivec registers,
> +     otherwise we can only easily move within classes.
> +     Do this first so we give best-case answers for union classes
> +     containing both gprs and vsx regs.  */
> +  HARD_REG_SET to_vsx, from_vsx;
> +  COPY_HARD_REG_SET (to_vsx, reg_class_contents[to]);
> +  AND_HARD_REG_SET (to_vsx, reg_class_contents[VSX_REGS]);
> +  COPY_HARD_REG_SET (from_vsx, reg_class_contents[from]);
> +  AND_HARD_REG_SET (from_vsx, reg_class_contents[VSX_REGS]);

This is a bit expensive to run at every call of rs6000_register_move_cost.
Can it be precomputed easily?

> +	{
> +	  if (TARGET_DIRECT_MOVE
> +	      && (rs6000_tune == PROCESSOR_POWER8
> +		  || rs6000_tune == PROCESSOR_POWER9))

TARGET_DIRECT_MOVE is always on for these CPUs.  Should this also use the
m*vsr* cost with say -mcpu=power7 -mtune=power9?  If so you can simplify
this condition.  Or maybe it give problems elsewhere?  It's not really
worth spending to much time on, separate -mtune isn't used much at all.

> +static reg_class_t
> +rs6000_ira_change_pseudo_allocno_class (int regno ATTRIBUTE_UNUSED,
> +					reg_class_t allocno_class,
> +					reg_class_t best_class)
> +{
> +  switch (allocno_class)
> +    {
> +    default:
> +      break;

Please put default cases at the end.

> +      gcc_checking_assert (best_class == GEN_OR_VSX_REGS
> +			   || best_class == GEN_OR_FLOAT_REGS
> +			   || best_class == VSX_REGS
> +			   || best_class == ALTIVEC_REGS
> +			   || best_class == FLOAT_REGS
> +			   || best_class == GENERAL_REGS
> +			   || best_class == BASE_REGS);

This list makes me think...  Should there be a GEN_OR_ALTIVEC as well?


Segher


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]