[PATCH][RFC] __builtin_shuffle sometimes should produce zip1 rather than TBL (PR82199)

Richard Sandiford richard.sandiford@arm.com
Wed Jul 8 14:48:39 GMT 2020


Dmitrij Pochepko <dmitrij.pochepko@bell-sw.com> writes:
> Hi,
>
> thank you for looking into this.
>
> I prepared new patch with all your comments addressed.

Thanks, looks good, just a couple of minor things:

> @@ -20090,6 +20092,62 @@ aarch64_evpc_trn (struct expand_vec_perm_d *d)
>    return true;
>  }
>  
> +/* Try to re-encode the PERM constant so it use the bigger size up.

Maybe s/use the bigger size up/combines odd and even elements/

> +   This rewrites constants such as {0, 1, 4, 5}/V4SF to {0, 2}/V2DI.
> +   We retry with this new constant with the full suite of patterns.  */
> +static bool
> +aarch64_evpc_reencode (struct expand_vec_perm_d *d)
> +{
> +  expand_vec_perm_d newd;
> +  unsigned HOST_WIDE_INT nelt;
> +
> +  if (d->vec_flags != VEC_ADVSIMD)
> +    return false;
> +
> +  /* Get the new mode.  Always twice the size of the inner
> +     and half the elements.  */
> +  poly_uint64 vec_bits = GET_MODE_BITSIZE (d->vmode);
> +  unsigned int new_elt_bits = GET_MODE_UNIT_BITSIZE (d->vmode) * 2;
> +  auto new_elt_mode = int_mode_for_size (new_elt_bits, false).require ();
> +  machine_mode new_mode = aarch64_simd_container_mode (new_elt_mode, vec_bits);
> +
> +  if (new_mode == word_mode)
> +    return false;
> +
> +  /* to_constant is safe since this routine is specific to Advanced SIMD
> +     vectors.  */
> +  nelt = d->perm.length ().to_constant ();
> +
> +  vec_perm_builder newpermconst;
> +  newpermconst.new_vector (nelt / 2, nelt / 2, 1);
> +
> +  /* Convert the perm constant if we can.  Require even, odd as the pairs.  */
> +  for (unsigned int i = 0; i < nelt; i += 2)
> +    {
> +      poly_int64 elt_poly0 = d->perm[i];
> +      poly_int64 elt_poly1 = d->perm[i+1];
> +      if (!elt_poly0.is_constant () || !elt_poly1.is_constant ())
> +	return false;
> +      unsigned int elt0 = elt_poly0.to_constant ();
> +      unsigned int elt1 = elt_poly1.to_constant ();
> +      if ((elt0 & 1) != 0 || elt0 + 1 != elt1)
> +	return false;
> +      newpermconst.quick_push (elt0 / 2);

It should be possible to do this without the to_constant calls, e.g.:

  poly_int64 elt0 = d->perm[i];
  poly_int64 elt1 = d->perm[i + 1];
  poly_int64 newelt;
  if (!multiple_p (elt0, 2, &newelt) || maybe_ne (elt0 + 1, elt1))
    return false;

(The coding conventions require spaces around “+”, even though I agree
“[i+1]” looks better.)
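
For illustration only, here is a minimal standalone sketch of the same
pairing check on plain unsigned indices rather than poly_int64.  It is not
GCC code: multiple_of and reencode_perm are hypothetical names that stand
in for multiple_p and for the loop in aarch64_evpc_reencode, and
std::vector stands in for vec_perm_builder.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for GCC's multiple_p (value, factor, &quotient),
// restricted to plain integers: returns true and stores value / factor
// when value is an exact multiple of factor.
static bool
multiple_of (unsigned value, unsigned factor, unsigned *quotient)
{
  if (value % factor != 0)
    return false;
  *quotient = value / factor;
  return true;
}

// Hypothetical helper: re-encode a permutation over N elements as a
// permutation over N / 2 double-width elements.  Succeeds only if every
// adjacent pair of indices has the form {2k, 2k + 1}; the new index is k.
static bool
reencode_perm (const std::vector<unsigned> &perm,
	       std::vector<unsigned> &newperm)
{
  newperm.clear ();
  if (perm.size () % 2 != 0)
    return false;

  for (std::size_t i = 0; i < perm.size (); i += 2)
    {
      unsigned newelt;
      // Same shape as the suggested condition: the first index must be
      // even and the second must immediately follow it.
      if (!multiple_of (perm[i], 2, &newelt) || perm[i] + 1 != perm[i + 1])
	return false;
      newperm.push_back (newelt);
    }
  return true;
}
```

With {0, 1, 4, 5} as input this produces {0, 2}, matching the example in
the function comment.  {1, 2, 4, 5} is rejected because the first index of
a pair is odd, and {0, 2, 4, 5} because the first pair is not consecutive.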

Looks good otherwise.

Richard

