This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Vectorizing abs(char/short/int) on x86.
- From: Jakub Jelinek <jakub at redhat dot com>
- To: Cong Hou <congh at google dot com>
- Cc: "Joseph S. Myers" <joseph at codesourcery dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>, Richard Biener <rguenther at suse dot de>
- Date: Thu, 24 Oct 2013 08:18:57 +0200
- Subject: Re: [PATCH] Vectorizing abs(char/short/int) on x86.
- Authentication-results: sourceware.org; auth=none
- References: <CAK=A3=3J2J_ZTT3qxO6z32kSNpA7dVFobYg=SQNi0udMY8E5VA at mail dot gmail dot com> <Pine dot LNX dot 4 dot 64 dot 1310231548200 dot 28882 at digraph dot polyomino dot org dot uk> <CAK=A3=3-GYLuGa5Di7fD7Cr+Kj0P_nhWLBh2wetE7dB0w4X3uQ at mail dot gmail dot com>
- Reply-to: Jakub Jelinek <jakub at redhat dot com>
On Wed, Oct 23, 2013 at 09:40:21PM -0700, Cong Hou wrote:
> On Wed, Oct 23, 2013 at 8:52 AM, Joseph S. Myers
> <joseph@codesourcery.com> wrote:
> > On Tue, 22 Oct 2013, Cong Hou wrote:
> >
> >> For abs(char/short), type conversions are needed as the current abs()
> >> function/operation does not accept argument of char/short type.
> >> Therefore when we want to get the absolute value of a char_val using
> >> abs (char_val), it will be converted into abs ((int) char_val). It
> >> then can be vectorized, but the generated code is not efficient as
> >> lots of packings and unpackings are envolved. But if we convert
> >> (char) abs ((int) char_val) to abs (char_val), the vectorizer will be
> >> able to generate better code. Same for short.
> >
> > ABS_EXPR has undefined overflow behavior. Thus, abs ((int) -128) is
> > defined (and we also define the subsequent conversion of +128 to signed
> > char, which ISO C makes implementation-defined not undefined), and
> > converting to an ABS_EXPR on char would wrongly make it undefined. For
> > such a transformation to be valid (in the absence of VRP saying that -128
> > isn't a possible value) you'd need a GIMPLE representation for
> > ABS_EXPR<overflow:wrap>, as distinct from ABS_EXPR<overflow:undefined>.
> > You don't have the option there is for some arithmetic operations of
> > converting to a corresponding operation on unsigned types.
> >
>
> Yes, you are right. The method I use can guarantee wrapping on
> overflow (either shift-xor-sub or max(x, -x)). Can I just add the
> condition if (flag_wrapv) before the conversion I made to prevent the
> undefined behavior on overflow?
What HW insns you expand to is one thing, but if some GCC pass assumes that
ABS_EXPR always returns non-negative value (many do, look e.g. at
tree_unary_nonnegative_warnv_p, extract_range_from_unary_expr_1,
simplify_const_relational_operation, etc., you'd need to grep for all
ABS_EXPR/ABS occurrences) and optimizes code based on that fact, you get
wrong code because (char) abs((char) -128) is well defined.
If we change ABS_EXPR/ABS definition that it is well defined on the most
negative value of the typ (resp. mode), then we loose all those
optimizations, if we do that only for the char/short types, it would be
quite weird, though we could keep the benefits, but at the RTL level we'd
need to treat that way all the modes equal to short's mode and smaller (so,
for sizeof(short) == sizeof(int) target even int's mode).
The other possibility is not to create the ABS_EXPRs of char/short anywhere,
solve the vectorization issues either through tree-vect-patterns.c or
as part of the vectorization type demotion/promotions, see the recent
discussions for that, you'd represent the short/char abs for the vectorized
loop say using the shift-xor-sub or builtin etc. and if you want to do the
same thing for scalar code, you'd just have combiner try to match some
sequence.
Jakub