[PATCH] MIPS: Fix mode mismatch error between Loongson builtin arguments and insn operands.

Matthew Fortune Matthew.Fortune@imgtec.com
Tue Jan 31 12:15:00 GMT 2017


Toma Tabacu <Toma.Tabacu@imgtec.com> writes:
> The builtins for the pshufh, psllh, psllw, psrah, psraw, psrlh, psrlw
> Loongson instructions have the third argument's type set to UQI while
> its corresponding insn operand is in SImode.
> 
> This results in the following error when matching insn operands:
> 
> ../gcc/gcc/include/loongson.h: In function 'test_psllw_s':
> ../gcc/gcc/include/loongson.h:483:10: error: invalid argument to built-
> in function
>    return __builtin_loongson_psllw_s (s, amount);
>           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> This causes the loongson-simd.c and loongson-shift-count-truncated-1.c
> tests to fail.
> 
> This patch fixes this by wrapping the QImode builtin argument inside a
> paradoxical SUBREG with SImode, which will successfully match against
> the insn operand.
> 
> Tested with mips-mti-elf.
> 
> Regards,
> Toma
> 
> gcc/
> 
> 	* config/mips/mips.c (mips_expand_builtin_insn): Put the QImode
> 	argument of the pshufh, psllh, psllw, psrah, psraw, psrlh, psrlw
> 	builtins into an SImode paradoxical SUBREG.
> 
> diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c index
> da7fa8f..f1ca6e2 100644
> --- a/gcc/config/mips/mips.c
> +++ b/gcc/config/mips/mips.c
> @@ -16574,6 +16574,20 @@ mips_expand_builtin_insn (enum insn_code icode,
> unsigned int nops,
> 
>    switch (icode)
>      {
> +    /* The third argument needs to be in SImode in order to succesfully
> match
> +       the operand from the insn definition.  */

Please refer to operand here not argument as it is the second argument
to the builtin but third operand of the instruction.  Also double ss in 
successfully.

> +    case CODE_FOR_loongson_pshufh:
> +    case CODE_FOR_loongson_psllh:
> +    case CODE_FOR_loongson_psllw:
> +    case CODE_FOR_loongson_psrah:
> +    case CODE_FOR_loongson_psraw:
> +    case CODE_FOR_loongson_psrlh:
> +    case CODE_FOR_loongson_psrlw:
> +      gcc_assert (has_target_p && nops == 3 && ops[2].mode == QImode);
> +      ops[2].value = lowpart_subreg (SImode, ops[2].value, QImode);
> +      ops[2].mode = SImode;
> +      break;
> +
>      case CODE_FOR_msa_addvi_b:
>      case CODE_FOR_msa_addvi_h:
>      case CODE_FOR_msa_addvi_w:

For the record, given paradoxical subregs are a headache...
I am OK with this on the basis that the argument to psllh etc is actually
a uint8_t which means that bits 8 upwards are guaranteed to be zero so
the subreg can be eliminated without any explicit sign or zero extension
inserted.  This is the same kind of optimisation that combine would
perform when eliminating zero extension.

Please can you check that a zero extension is inserted for the following
case with -O2 or above:

int16x4_t testme(int16x4_t s, int amount)
{
  return psllh_s (s, amount);
}

If my understanding is correct there should be an ANDI 0xff inserted
or similar.

OK with those changes and confirmation of the test above.

Thanks,
Matthew



More information about the Gcc-patches mailing list