This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

RE: [PATCH] MIPS: Fix mode mismatch error between Loongson builtin arguments and insn operands.


Toma Tabacu <Toma.Tabacu@imgtec.com> writes:
> > From: Matthew Fortune
> > > +    /* The third argument needs to be in SImode in order to succesfully
> > > match
> > > +       the operand from the insn definition.  */
> >
> > Please refer to operand here not argument as it is the second argument
> > to the builtin but third operand of the instruction.  Also double ss in
> > successfully.
> >
> 
> I have rewritten the comment to address these mistakes.
> 
> > > +    case CODE_FOR_loongson_pshufh:
> > > +    case CODE_FOR_loongson_psllh:
> > > +    case CODE_FOR_loongson_psllw:
> > > +    case CODE_FOR_loongson_psrah:
> > > +    case CODE_FOR_loongson_psraw:
> > > +    case CODE_FOR_loongson_psrlh:
> > > +    case CODE_FOR_loongson_psrlw:
> > > +      gcc_assert (has_target_p && nops == 3 && ops[2].mode == QImode);
> > > +      ops[2].value = lowpart_subreg (SImode, ops[2].value, QImode);
> > > +      ops[2].mode = SImode;
> > > +      break;
> > > +
> > >      case CODE_FOR_msa_addvi_b:
> > >      case CODE_FOR_msa_addvi_h:
> > >      case CODE_FOR_msa_addvi_w:
> >
> > For the record, given paradoxical subregs are a headache...
> > I am OK with this on the basis that the argument to psllh etc is actually
> > a uint8_t which means that bits 8 upwards are guaranteed to be zero so
> > the subreg can be eliminated without any explicit sign or zero extension
> > inserted.  This is the same kind of optimisation that combine would
> > perform when eliminating zero extension.
> >
> > Please can you check that a zero extension is inserted for the following
> > case with -O2 or above:
> >
> > int16x4_t testme(int16x4_t s, int amount)
> > {
> >   return psllh_s (s, amount);
> > }
> >
> > If my understanding is correct there should be an ANDI 0xff inserted
> > or similar.
> >
> 
> The ANDI 0xff is present for -O0, after the first time the value is loaded
> from memory, but it is not generated for -O1 and -O2.
> I'm not seeing any zero extension happening for -O1 and -O2.

That's not what I hoped but is what I was concerned about as I believe it
means we have a change of behaviour.  It boils down to simply ignoring the
argument type of unsigned char.  My guess is that a zero extension is
created but then immediately eliminated because of the paradoxical subreg.

I think you need to create a temporary and perform the zero extension to
ensure we honour the unsigned char operand:

  rtx new_dst = gen_reg_rtx (SImode);
  emit_insn (gen_zero_extendqisi2 (new_dst, ops[2].value));
  ops[2].value = foo;

This should mean that the testcase I sent always has a zero extension but if
you change the type of 'amount' to be unsigned char then there should not be
a zero extension as the argument will be assumed to be correctly zero extended
already and the explicitly introduced zero_extend will be eliminated.

Apologies for not proposing this before.

Thanks,
Matthew

> 
> The only change in the patch below is the fixed comment.
> 
> Regards,
> Toma
> 
> gcc/
> 
> 	* config/mips/mips.c (mips_expand_builtin_insn): Put the QImode
> 	argument of the pshufh, psllh, psllw, psrah, psraw, psrlh, psrlw
> 	builtins into an SImode paradoxical SUBREG.
> 
> diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
> index da7fa8f..e5b2d9a 100644
> --- a/gcc/config/mips/mips.c
> +++ b/gcc/config/mips/mips.c
> @@ -16574,6 +16574,20 @@ mips_expand_builtin_insn (enum insn_code icode, unsigned int
> nops,
> 
>    switch (icode)
>      {
> +    /* The third operand of these instructions is in SImode, so we need to
> +       bring the corresponding builtin argument from QImode into SImode.  */
> +    case CODE_FOR_loongson_pshufh:
> +    case CODE_FOR_loongson_psllh:
> +    case CODE_FOR_loongson_psllw:
> +    case CODE_FOR_loongson_psrah:
> +    case CODE_FOR_loongson_psraw:
> +    case CODE_FOR_loongson_psrlh:
> +    case CODE_FOR_loongson_psrlw:
> +      gcc_assert (has_target_p && nops == 3 && ops[2].mode == QImode);
> +      ops[2].value = lowpart_subreg (SImode, ops[2].value, QImode);
> +      ops[2].mode = SImode;
> +      break;
> +
>      case CODE_FOR_msa_addvi_b:
>      case CODE_FOR_msa_addvi_h:
>      case CODE_FOR_msa_addvi_w:


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]