Add support for conditional xorsign [PR96373]
author Richard Sandiford <richard.sandiford@arm.com>
Fri, 27 Jan 2023 17:03:51 +0000 (17:03 +0000)
committer Richard Sandiford <richard.sandiford@arm.com>
Fri, 27 Jan 2023 17:03:51 +0000 (17:03 +0000)
commit 7486fe153adaa868f36248b72f3e78d18b1b3ba1
tree a032a0741ea03f90921b41e02085a23d71dfde48
parent 553f8003ba5ecfdf0574a171692843ef838226b4

This patch is an optimisation, but it's also a prerequisite for
fixing PR96373 without regressing vect-xorsign_exec.c.

Currently the vectoriser vectorises:

  for (i = 0; i < N; i++)
    r[i] = a[i] * __builtin_copysignf (1.0f, b[i]);

as two unconditional operations (copysign and mult).
tree-ssa-math-opts.cc later combines them into an "xorsign" function.
This works for both Advanced SIMD and SVE.
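As an illustration of why the two operations can be combined, the scalar
sketch below compares the multiply-by-copysign form with an "xorsign"
form that XORs the sign bit of b into a.  It assumes IEEE 754 binary32
floats and ignores NaN sign subtleties; the function names are
hypothetical and are not taken from GCC.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Reference form: what the source loop computes for one element.  */
static float mul_copysign (float a, float b)
{
  return a * copysignf (1.0f, b);
}

/* xorsign form: flip the sign bit of A exactly when the sign bit of B
   is set.  Assumes float is IEEE 754 binary32, so the sign bit is
   bit 31 of the object representation.  */
static float xorsign (float a, float b)
{
  uint32_t ia, ib;
  memcpy (&ia, &a, sizeof ia);
  memcpy (&ib, &b, sizeof ib);
  ia ^= ib & UINT32_C (0x80000000);
  memcpy (&a, &ia, sizeof a);
  return a;
}
```

For ordinary (non-NaN) inputs the two functions return the same value,
because multiplying by +1.0 or -1.0 only ever changes the sign bit.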

However, with the fix for PR96373, the vectoriser will instead
generate a conditional multiplication (IFN_COND_MUL).  Something then
needs to fold copysign & IFN_COND_MUL to the equivalent of a conditional
xorsign.  Three obvious options were:

(1) Extend tree-ssa-math-opts.cc.
(2) Do the fold in match.pd.
(3) Leave it to rtl combine.

I'm against (3), because this isn't a target-specific optimisation.
(1) would be possible, but would involve open-coding a lot of what
match.pd does for us.  And, in contrast to doing the current
tree-ssa-math-opts.cc optimisation in match.pd, there should be
no danger of (2) happening too early.  If we have an IFN_COND_MUL
then we're already past the stage of simplifying the original
source code.

There was also a choice between adding a conditional xorsign ifn
and simply open-coding the xorsign.  The latter seems simpler,
and means less boiler-plate for target-specific code.
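The open-coded form can be pictured with the scalar model below: a
conditional XOR applied to the integer view of a and to the sign bit
of b.  This is only a sketch under stated assumptions, not the match.pd
pattern itself: it assumes IEEE 754 binary32 floats, models the
predicate as a byte array, and uses the previous contents of r as the
value for inactive lanes.  All names (cond_xor, cond_xorsign, pred,
fallback) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIGN_MASK UINT32_C (0x80000000)

/* Reinterpret a float as its 32-bit pattern and back.  */
static uint32_t float_bits (float x)
{
  uint32_t u;
  memcpy (&u, &x, sizeof u);
  return u;
}

static float bits_float (uint32_t u)
{
  float x;
  memcpy (&x, &u, sizeof x);
  return x;
}

/* Scalar model of a conditional XOR: active lanes yield X ^ Y,
   inactive lanes yield FALLBACK.  */
static uint32_t cond_xor (int active, uint32_t x, uint32_t y,
                          uint32_t fallback)
{
  return active ? (x ^ y) : fallback;
}

/* Conditional xorsign, open-coded as AND + conditional XOR.
   Inactive lanes keep the old value of r[i] (an assumption made
   for this sketch).  */
static void cond_xorsign (float *r, const float *a, const float *b,
                          const unsigned char *pred, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      uint32_t sign = float_bits (b[i]) & SIGN_MASK;
      r[i] = bits_float (cond_xor (pred[i], float_bits (a[i]), sign,
                                   float_bits (r[i])));
    }
}
```

In this model no dedicated "conditional xorsign" operation is needed:
only a bitwise AND with the sign mask and a conditional XOR appear.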

The signed_or_unsigned_type_for change is needed to make sure
that we stay in "SVE space" when doing the optimisation on 128-bit
fixed-length SVE.

gcc/
	PR tree-optimization/96373
	* tree.h (sign_mask_for): Declare.
	* tree.cc (sign_mask_for): New function.
	(signed_or_unsigned_type_for): For vector types, try to use the
	related_int_vector_mode.
	* genmatch.cc (commutative_op): Handle conditional internal functions.
	* match.pd: Fold an IFN_COND_MUL+copysign into an IFN_COND_XOR+and.

gcc/testsuite/
	PR tree-optimization/96373
	* gcc.target/aarch64/sve/cond_xorsign_1.c: New test.
	* gcc.target/aarch64/sve/cond_xorsign_2.c: Likewise.
gcc/genmatch.cc
gcc/match.pd
gcc/testsuite/gcc.target/aarch64/sve/cond_xorsign_1.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_xorsign_2.c [new file with mode: 0644]
gcc/tree.cc
gcc/tree.h