[PATCH][RFC] __builtin_shuffle sometimes should produce zip1 rather than TBL (PR82199)
Richard Sandiford
richard.sandiford@arm.com
Wed Jul 8 14:48:39 GMT 2020
Dmitrij Pochepko <dmitrij.pochepko@bell-sw.com> writes:
> Hi,
>
> thank you for looking into this.
>
> I prepared new patch with all your comments addressed.
Thanks, looks good, just a couple of minor things:
> @@ -20090,6 +20092,62 @@ aarch64_evpc_trn (struct expand_vec_perm_d *d)
> return true;
> }
>
> +/* Try to re-encode the PERM constant so it use the bigger size up.
maybe s/use the bigger size up/combines odd and even elements/
> + This rewrites constants such as {0, 1, 4, 5}/V4SF to {0, 2}/V2DI.
> + We retry with this new constant with the full suite of patterns. */
> +static bool
> +aarch64_evpc_reencode (struct expand_vec_perm_d *d)
> +{
> + expand_vec_perm_d newd;
> + unsigned HOST_WIDE_INT nelt;
> +
> + if (d->vec_flags != VEC_ADVSIMD)
> + return false;
> +
> + /* Get the new mode. Always twice the size of the inner
> + and half the elements. */
> + poly_uint64 vec_bits = GET_MODE_BITSIZE (d->vmode);
> + unsigned int new_elt_bits = GET_MODE_UNIT_BITSIZE (d->vmode) * 2;
> + auto new_elt_mode = int_mode_for_size (new_elt_bits, false).require ();
> + machine_mode new_mode = aarch64_simd_container_mode (new_elt_mode, vec_bits);
> +
> + if (new_mode == word_mode)
> + return false;
> +
> + /* to_constant is safe since this routine is specific to Advanced SIMD
> + vectors. */
> + nelt = d->perm.length ().to_constant ();
> +
> + vec_perm_builder newpermconst;
> + newpermconst.new_vector (nelt / 2, nelt / 2, 1);
> +
> + /* Convert the perm constant if we can. Require even, odd as the pairs. */
> + for (unsigned int i = 0; i < nelt; i += 2)
> + {
> + poly_int64 elt_poly0 = d->perm[i];
> + poly_int64 elt_poly1 = d->perm[i+1];
> + if (!elt_poly0.is_constant () || !elt_poly1.is_constant ())
> + return false;
> + unsigned int elt0 = elt_poly0.to_constant ();
> + unsigned int elt1 = elt_poly1.to_constant ();
> + if ((elt0 & 1) != 0 || elt0 + 1 != elt1)
> + return false;
> + newpermconst.quick_push (elt0 / 2);
It should be possible to do this without the to_constants, e.g.:
      poly_int64 elt0 = d->perm[i];
      poly_int64 elt1 = d->perm[i + 1];
      poly_int64 newelt;
      if (!multiple_p (elt0, 2, &newelt) || maybe_ne (elt0 + 1, elt1))
        return false;
(The coding conventions require spaces around “+”, even though I agree
“[i+1]” looks better.)
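For anyone following along outside the GCC tree, the pairing rule can be sketched as a standalone function. This is only an illustration under simplifying assumptions: reencode_pairs is a hypothetical name, and plain unsigned indices in a std::vector stand in for poly_int64 and vec_perm_builder, so the multiple_p and maybe_ne checks become ordinary integer tests.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the re-encoding step, not GCC code.
// Tries to rewrite PERM, a selector over N narrow elements, as a
// selector over N/2 elements of twice the width.  Returns false if
// some pair of indices cannot be merged into one wide element.
bool
reencode_pairs (const std::vector<unsigned> &perm,
		std::vector<unsigned> &wide)
{
  // Assumption of this sketch: the selector has an even length.
  assert (perm.size () % 2 == 0);

  wide.clear ();
  for (std::size_t i = 0; i < perm.size (); i += 2)
    {
      unsigned elt0 = perm[i];
      unsigned elt1 = perm[i + 1];
      // The pair must start on an even index and be consecutive,
      // i.e. cover exactly one element of the wider mode.
      if (elt0 % 2 != 0 || elt0 + 1 != elt1)
	return false;
      wide.push_back (elt0 / 2);
    }
  return true;
}
```

With this sketch, {0, 1, 4, 5} re-encodes to {0, 2}, matching the V4SF to V2DI example in the patch comment, while selectors such as {0, 2, 4, 6} or {1, 2, 4, 5} are rejected.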
Looks good otherwise.
Richard