[RFC, vectorizer] Allow half_type for left shift in vect_operation_fits_smaller_type?
Richard Biener
rguenther@suse.de
Thu Sep 21 14:49:00 GMT 2017
On Thu, 21 Sep 2017, Jon Beniston wrote:
> Hi,
>
> The GCC vectorizer can't vectorize the following loop even though the target
> supports 2-lane SIMD left shift.
>
> short a[256], b[256];
> foo ()
> {
> int i;
> for (i=0; i<256; i++)
> { a[i] = b[i] << 4; }
> }
>
> The reason seems to be that GCC promotes the source from short to int,
> performs the left shift in int, and finally demotes the result to
> convert it back to short. Below is the related tree dump:
>
> _2 = (intD.1) _1;
> # RANGE [-524288, 524272] NONZERO 4294967280
> _3 = _2 << 4;
> # RANGE [-32768, 32767] NONZERO 65520
> _4 = (short intD.10) _3;
> # .MEM_8 = VDEF <.MEM_14>
> aD.1888[i_13] = _4;
>
> I checked tree-vect-patterns.c and found there is a pattern recognizer
> "vect_recog_over_widening_pattern" to recognize such sequences already.
>
> But, in vect_operation_fits_smaller_type, it only recognizes the sequences
> when the promoted type is 4 times wider than the original type. The reason
> seems to be the original proposal at:
>
> https://gcc.gnu.org/ml/gcc-patches/2011-07/msg01472.html
>
> is to handle the following sequences where three types are involved, with
> widths T_PROMOTED = 2 * T_INTER = 4 * T_ORIG.
>
> T_ORIG a;
> T_PROMOTED b, c;
> T_INTER d;
>
> b = (T_PROMOTED) a;
> c = b << 2;
> d = (T_INTER) c;
>
> We could also handle the following sequence, where only two types are
> involved and the widths satisfy T_PROMOTED = 2 * T_ORIG:
>
> T_ORIG a;
> T_PROMOTED b, c, d;
>
> b = (T_PROMOTED) a;
> c = b << 2;
> d = (T_ORIG) c;
>
> Performing the left shift on T_ORIG directly should be equal to performing
> it on T_PROMOTED then converting back to T_ORIG.
>
> x86-64/AArch64/PPC64 bootstrap OK (finished on gcc farms) and no regression
> on check-gcc/g++.
>
> gcc/
> 2017-09-21 Jon Beniston <jon@beniston.com>
>
> * tree-vect-patterns.c (vect_operation_fits_smaller_type): Allow
> half_type for LSHIFT_EXPR.
>
> diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> index cdad261..0abf37c 100644
> --- a/gcc/tree-vect-patterns.c
> +++ b/gcc/tree-vect-patterns.c
> @@ -1318,7 +1318,12 @@ vect_operation_fits_smaller_type (gimple *stmt, tree def, tree *new_type,
> break;
>
> case LSHIFT_EXPR:
> - /* Try intermediate type - HALF_TYPE is not enough for sure. */
> + /* Try half_type. */
> + if (TYPE_PRECISION (type) == TYPE_PRECISION (half_type) * 2
> + && vect_supportable_shift (code, half_type))
> + break;
> +
> + /* Try intermediate type. */
> if (TYPE_PRECISION (type) < (TYPE_PRECISION (half_type) * 4))
> return false;
I haven't dug deep into this "interesting" function, but this case is
only valid if type == final type and if the result is not shifted
back. vect_recog_over_widening_pattern works on a whole sequence
of stmts after all, thus
b = (T_PROMOTED) a;
c = b << 2;
d = c >> 2;
e = (T_ORIG) d;
would be miscompiled by your new case.
Richard.
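
The concern above can be checked with a small sketch, assuming T_ORIG is
16 bits and T_PROMOTED is 32 bits. It models a sequence that shifts left
by 2, shifts the result right by 2, and then truncates. The helper names
(narrow_shl, narrow_shr, seq_wide, seq_narrow) are hypothetical and not
part of GCC, and unsigned types are used only to keep the C well defined.
seq_wide computes in the promoted type; seq_narrow truncates to 16 bits
after every operation, as a half_type vector lane would.

```c
#include <stdint.h>

/* Model a 16-bit lane: truncate to 16 bits after every operation. */
uint16_t narrow_shl(uint16_t x, unsigned n)
{
    return (uint16_t)((uint32_t)x << n);
}

uint16_t narrow_shr(uint16_t x, unsigned n)
{
    return (uint16_t)(x >> n);
}

/* Sequence evaluated in the promoted (32-bit) type, truncated at the end. */
uint16_t seq_wide(uint16_t a)
{
    uint32_t b = (uint32_t)a;
    uint32_t c = b << 2;   /* the high bits survive in the wide type */
    uint32_t d = c >> 2;   /* ... and are shifted back into place */
    return (uint16_t)d;
}

/* Same sequence evaluated entirely in the narrow (16-bit) type. */
uint16_t seq_narrow(uint16_t a)
{
    uint16_t c = narrow_shl(a, 2);  /* the top two bits are lost here */
    uint16_t d = narrow_shr(c, 2);  /* zeros are shifted in instead */
    return d;
}
```

For a = 0xC001, seq_wide returns 0xC001 while seq_narrow returns 0x0001.
Inputs whose top two bits are clear, such as 0x1234, give the same result
either way, so the narrowing is only safe when the left shift result is
not shifted back afterwards.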