[PATCH][GCC][PR target/98177] aarch64: SVE: ICE in expand_direct_optab_fn

Richard Sandiford <richard.sandiford@arm.com>
Wed Dec 16 12:49:32 GMT 2020


Przemyslaw Wirkus <Przemyslaw.Wirkus@arm.com> writes:
> > This is a bug in the vectoriser: the vectoriser shouldn't generate
> > IFN_REDUC_MAX calls that the target doesn't support.
> > 
> > I think the problem comes from using the wrong interface to get the index
> > type for a COND_REDUCTION.  vectorizable_reduction has:
> > 
> >       cr_index_vector_type = build_vector_type (cr_index_scalar_type,
> >                                                 nunits_out);
> > 
> > which means that for fixed-length SVE we get a V2SI (a 64-bit Advanced SIMD
> > vector) instead of a VNx2SI (an SVE vector that stores SI elements in DI
> > containers).  It should be using:
> > 
> >       cr_index_vector_type = get_same_sized_vectype (cr_index_scalar_type,
> >                                                      vectype_out);
> > 
> > instead.  Same idea for the build_vector_type call in
> > vect_create_epilog_for_reduction.

Note that for this last bit I meant:

      tree vectype_unsigned = build_vector_type
	(scalar_type_unsigned, TYPE_VECTOR_SUBPARTS (vectype));

which should become:

      tree vectype_unsigned = get_same_sized_vectype (scalar_type_unsigned,
						      vectype);

This is the “transform” code that partners the “analysis” code that
you're patching.  Changing one but not the other would cause problems
if (say) the Advanced SIMD REDUC_MAX patterns were disabled.  We'd then
correctly pick an SVE mode like VNx4SI when doing the analysis, but generate
an unsupported V4SI REDUC_MAX in vect_create_epilog_for_reduction.
That in turn would trip the kind of expand-time assert that was
reported in the PR, just for a different case.
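
To sketch what that would look like (illustrative only, not an actual
dump), the reduction epilogue would then contain something like:

      _1 = .REDUC_MAX (_ind_vec);  /* _ind_vec would have type V4SI.  */

with no supported optab behind it, so the same expand_direct_optab_fn
assertion would fire at expand time.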

It's better for the modes to match up anyway: we should use a VNx4SI
reduction when operating on SVE and a V4SI reduction when operating on
Advanced SIMD.  This is particularly true for big endian, where mixing
SVE and Advanced SIMD can involve a permute.
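
To put the original failure in concrete terms (purely illustrative,
reusing the types from the report above rather than anything the patch
needs to add): with fixed-length SVE and a data vector whose elements
live in DI containers, the two interfaces give:

      build_vector_type (cr_index_scalar_type, nunits_out);
      /* -> V2SI, a 64-bit Advanced SIMD vector.  */

      get_same_sized_vectype (cr_index_scalar_type, vectype_out);
      /* -> VNx2SI, an SVE vector with SI elements in DI containers.  */

so only the latter keeps the index vector in the same vector
architecture as the data vector.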

> diff --git a/gcc/testsuite/g++.target/aarch64/pr98177-1.C b/gcc/testsuite/g++.target/aarch64/pr98177-1.C
> new file mode 100644
> index 0000000000000000000000000000000000000000..a776b7352f966f6b1d870ed51a7c94647bc46d80
> --- /dev/null
> +++ b/gcc/testsuite/g++.target/aarch64/pr98177-1.C
> @@ -0,0 +1,10 @@
> +/* { dg-do compile } */
> +/* { dg-options "-Ofast -march=armv8.2-a+sve -msve-vector-bits=128" } */
> +
> +int a, b;
> +short c;
> +void d(long e) {
> +  for (int f = 0; f < b; f += 1)
> +    for (short g = 0; g < c; g += 5)
> +      a = (short)e;
> +}

It'd be better to put these in g++.target/aarch64/sve and drop the -march
option.  That way we'll test with the user's specified -march or -mcpu
if that -march/-mcpu already supports SVE.
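
Concretely (just to illustrate the suggestion -- exact name and path at
your discretion), the test above would become
g++.target/aarch64/sve/pr98177-1.C with:

      /* { dg-do compile } */
      /* { dg-options "-Ofast -msve-vector-bits=128" } */

keeping -msve-vector-bits but leaving it to the directory's harness (or
the user's own -march/-mcpu) to enable SVE.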

Same idea for the other tests (including the C ones).

OK for trunk with those changes, thanks.

Richard

