This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Vectorizing abs(char/short/int) on x86.


On Wed, Oct 23, 2013 at 09:40:21PM -0700, Cong Hou wrote:
> On Wed, Oct 23, 2013 at 8:52 AM, Joseph S. Myers
> <joseph@codesourcery.com> wrote:
> > On Tue, 22 Oct 2013, Cong Hou wrote:
> >
> >> For abs(char/short), type conversions are needed as the current abs()
> >> function/operation does not accept argument of char/short type.
> >> Therefore when we want to get the absolute value of a char_val using
> >> abs (char_val), it will be converted into abs ((int) char_val). It
> >> then can be vectorized, but the generated code is not efficient as
> >> lots of packings and unpackings are envolved. But if we convert
> >> (char) abs ((int) char_val) to abs (char_val), the vectorizer will be
> >> able to generate better code. Same for short.
> >
> > ABS_EXPR has undefined overflow behavior.  Thus, abs ((int) -128) is
> > defined (and we also define the subsequent conversion of +128 to signed
> > char, which ISO C makes implementation-defined not undefined), and
> > converting to an ABS_EXPR on char would wrongly make it undefined.  For
> > such a transformation to be valid (in the absence of VRP saying that -128
> > isn't a possible value) you'd need a GIMPLE representation for
> > ABS_EXPR<overflow:wrap>, as distinct from ABS_EXPR<overflow:undefined>.
> > You don't have the option there is for some arithmetic operations of
> > converting to a corresponding operation on unsigned types.
> >
> 
> Yes, you are right. The method I use can guarantee wrapping on
> overflow (either shift-xor-sub or max(x, -x)). Can I just add the
> condition if (flag_wrapv) before the conversion I made to prevent the
> undefined behavior on overflow?

What HW insns you expand to is one thing, but if some GCC pass assumes that
ABS_EXPR always returns non-negative value (many do, look e.g. at
tree_unary_nonnegative_warnv_p, extract_range_from_unary_expr_1,
simplify_const_relational_operation, etc., you'd need to grep for all
ABS_EXPR/ABS occurrences) and optimizes code based on that fact, you get
wrong code because (char) abs((char) -128) is well defined.
If we change ABS_EXPR/ABS definition that it is well defined on the most
negative value of the typ (resp. mode), then we loose all those
optimizations, if we do that only for the char/short types, it would be
quite weird, though we could keep the benefits, but at the RTL level we'd
need to treat that way all the modes equal to short's mode and smaller (so,
for sizeof(short) == sizeof(int) target even int's mode).

The other possibility is not to create the ABS_EXPRs of char/short anywhere,
solve the vectorization issues either through tree-vect-patterns.c or
as part of the vectorization type demotion/promotions, see the recent
discussions for that, you'd represent the short/char abs for the vectorized
loop say using the shift-xor-sub or builtin etc. and if you want to do the
same thing for scalar code, you'd just have combiner try to match some
sequence.

	Jakub


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]