This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [RS6000] PR89271, gcc.target/powerpc/vsx-simode2.c
- From: Segher Boessenkool <segher at kernel dot crashing dot org>
- To: Alan Modra <amodra at gmail dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Thu, 28 Mar 2019 13:08:55 -0500
- Subject: Re: [RS6000] PR89271, gcc.target/powerpc/vsx-simode2.c
- References: <20190327134006.GA3194@bubble.grove.modra.org>
Hi!
On Thu, Mar 28, 2019 at 12:10:06AM +1030, Alan Modra wrote:
> This patch makes a number of corrections to rs6000_register_move_cost,
> adds a new register union class, GEN_OR_VSX_REGS, and adjusts insn
> alternative to suit.
Cool beans.
[ snip various great explanations, thanks! ]
> TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS is needed for rs6000 in order
> to fix the 20% cactus_adm spec regression when using GEN_OR_VSX_REGS
> as an allocno class. It is similar to the aarch64 version but without
> any selection by regno mode if the best class is a union class.
It would be nice if we could do without such hacks. Alas.
> OK assuming Pat's latest spec test run doesn't contain any nasty
> surprises?
Not for stage 4, no. Sorry. But it should be great in stage 1 :-)
> diff --git a/gcc/config/rs6000/darwin.h b/gcc/config/rs6000/darwin.h
> index 9fb36e41e7d..20c59f89c8f 100644
> --- a/gcc/config/rs6000/darwin.h
> +++ b/gcc/config/rs6000/darwin.h
> @@ -346,7 +346,8 @@ extern int darwin_emit_branch_islands;
> && reg_class_subset_p (BASE_REGS, (CLASS))) \
> ? BASE_REGS \
> : (GET_MODE_CLASS (GET_MODE (X)) == MODE_INT \
> - && (CLASS) == GEN_OR_FLOAT_REGS) \
> + && ((CLASS) == GEN_OR_FLOAT_REGS \
> + || (CLASS) == GEN_OR_VSX_REGS)) \
> ? GENERAL_REGS \
> : (CLASS))
Darwin doesn't do VSX at all... But maybe there is something that can get
allocated to both FPRs and VRs, sure. And GPRs.
This PREFERRED_RELOAD_CLASS, but it makes me worried if there are places
where we should care about this change for correctness.
> @@ -20236,7 +20239,8 @@ rs6000_preferred_reload_class (rtx x, enum reg_class rclass)
> return NO_REGS;
> }
>
> - if (GET_MODE_CLASS (mode) == MODE_INT && rclass == GEN_OR_FLOAT_REGS)
> + if (GET_MODE_CLASS (mode) == MODE_INT
> + && (rclass == GEN_OR_FLOAT_REGS || rclass == GEN_OR_VSX_REGS))
> return GENERAL_REGS;
Maybe do this whenever rclass contains the GPRs?
> @@ -34966,22 +34970,56 @@ rs6000_register_move_cost (machine_mode mode,
> reg_class_t from, reg_class_t to)
> {
> int ret;
> + reg_class_t rclass;
>
> if (TARGET_DEBUG_COST)
> dbg_cost_ctrl++;
>
> + /* If we have VSX, we can easily move between FPR or Altivec registers,
> + otherwise we can only easily move within classes.
> + Do this first so we give best-case answers for union classes
> + containing both gprs and vsx regs. */
> + HARD_REG_SET to_vsx, from_vsx;
> + COPY_HARD_REG_SET (to_vsx, reg_class_contents[to]);
> + AND_HARD_REG_SET (to_vsx, reg_class_contents[VSX_REGS]);
> + COPY_HARD_REG_SET (from_vsx, reg_class_contents[from]);
> + AND_HARD_REG_SET (from_vsx, reg_class_contents[VSX_REGS]);
This is a bit expensive to run at every call of rs6000_register_move_cost.
Can it be precomputed easily?
> + {
> + if (TARGET_DIRECT_MOVE
> + && (rs6000_tune == PROCESSOR_POWER8
> + || rs6000_tune == PROCESSOR_POWER9))
TARGET_DIRECT_MOVE is always on for these CPUs. Should this also use the
m*vsr* cost with say -mcpu=power7 -mtune=power9? If so you can simplify
this condition. Or maybe it give problems elsewhere? It's not really
worth spending to much time on, separate -mtune isn't used much at all.
> +static reg_class_t
> +rs6000_ira_change_pseudo_allocno_class (int regno ATTRIBUTE_UNUSED,
> + reg_class_t allocno_class,
> + reg_class_t best_class)
> +{
> + switch (allocno_class)
> + {
> + default:
> + break;
Please put default cases at the end.
> + gcc_checking_assert (best_class == GEN_OR_VSX_REGS
> + || best_class == GEN_OR_FLOAT_REGS
> + || best_class == VSX_REGS
> + || best_class == ALTIVEC_REGS
> + || best_class == FLOAT_REGS
> + || best_class == GENERAL_REGS
> + || best_class == BASE_REGS);
This list makes me think... Should there be a GEN_OR_ALTIVEC as well?
Segher