This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH, rs6000] Fold vector shifts in GIMPLE
> On Jun 6, 2017, at 11:37 AM, Will Schmidt <will_schmidt@vnet.ibm.com> wrote:
>
> On Thu, 2017-06-01 at 10:15 -0500, Bill Schmidt wrote:
>>> On Jun 1, 2017, at 2:48 AM, Richard Biener <richard.guenther@gmail.com> wrote:
>>>
>>> On Wed, May 31, 2017 at 10:01 PM, Will Schmidt
>>> <will_schmidt@vnet.ibm.com> wrote:
>>>> Hi,
>>>>
>>>> Add support for early expansion of vector shifts, including
>>>> vec_sl (shift left), vec_sr (shift right), vec_sra (shift
>>>> right algebraic), and vec_rl (rotate left).
>>>> Part of this includes adding the vector shift right instructions to
>>>> the list of those instructions having an unsigned second argument.
>>>>
>>>> The VSR (vector shift right) folding is a bit more complex than
>>>> the others, because arg0 must be converted to unsigned before the
>>>> gimple RSHIFT_EXPR assignment is built, so that a logical rather
>>>> than an algebraic shift is generated.
>>>
>>> Jakub, do we sanitize that undefinedness of left shifts of negative values
>>> and/or overflow of left shift of nonnegative values?
>
>
> On Thu, 2017-06-01 at 10:17 +0200, Jakub Jelinek wrote:
>> We don't yet, see PR77823 - all I've managed to do before stage1 was
>> over was instrumentation of signed arithmetic integer overflow on
>> vectors; division, shift, etc. are tasks maybe for this stage1.
>>
>> That said, shift instrumentation in particular is done early because every
>> FE has different rules, and so if it is coming from target builtins that are
>> folded into something, it wouldn't be instrumented anyway.
>
>
> On Thu, 2017-06-01 at 10:15 -0500, Bill Schmidt wrote:
>>>
>>> Will, how is that defined in the intrinsics operation? It might need similar
>>> treatment as the abs case.
>>
>> Answering for Will -- vec_sl is defined to simply shift bits off the end to the
>> left and fill with zeros from the right, regardless of whether the source type
>> is signed or unsigned. The result type is signed iff the source type is
>> signed. So a negative value can become positive as a result of the
>> operation.
>>
>> The same is true of vec_rl, which will naturally rotate bits regardless of
>> signedness.
>
>
>>>
>>> [I'd rather make the negative left shift case implementation-defined,
>>> given that the C and C++ standards do not agree 100% AFAIK.]
>
> With the above answers, how does this one stand?
>
> [ I have no issue adding the TYPE_OVERFLOW_WRAPS logic to treat some of
> the cases differently, I'm just unclear on whether none/some/all of the
> shifts will require that logic. :-) ]
I have to defer to Richard here; I don't know the subtleties well enough.
Bill
>
> thanks,
> -Will
>
>>>
>>> Richard.
>>>
>>>> [gcc]
>>>>
>>>> 2017-05-26 Will Schmidt <will_schmidt@vnet.ibm.com>
>>>>
>>>> * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add handling
>>>> for early expansion of vector shifts (sl,sr,sra,rl).
>>>> (builtin_function_type): Add vector shift right instructions
>>>> to the unsigned argument list.
>>>>
>>>> [gcc/testsuite]
>>>>
>>>> 2017-05-26 Will Schmidt <will_schmidt@vnet.ibm.com>
>>>>
>>>> 	* gcc.target/powerpc/fold-vec-shift-char.c: New.
>>>> 	* gcc.target/powerpc/fold-vec-shift-int.c: New.
>>>> 	* gcc.target/powerpc/fold-vec-shift-longlong.c: New.
>>>> 	* gcc.target/powerpc/fold-vec-shift-short.c: New.
>>>>
>>>> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
>>>> index 8adbc06..6ee0bfd 100644
>>>> --- a/gcc/config/rs6000/rs6000.c
>>>> +++ b/gcc/config/rs6000/rs6000.c
>>>> @@ -17408,6 +17408,76 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
>>>> gsi_replace (gsi, g, true);
>>>> return true;
>>>> }
>>>> +    /* Flavors of vec_rotate_left.  */
>>>> + case ALTIVEC_BUILTIN_VRLB:
>>>> + case ALTIVEC_BUILTIN_VRLH:
>>>> + case ALTIVEC_BUILTIN_VRLW:
>>>> + case P8V_BUILTIN_VRLD:
>>>> + {
>>>> + arg0 = gimple_call_arg (stmt, 0);
>>>> + arg1 = gimple_call_arg (stmt, 1);
>>>> + lhs = gimple_call_lhs (stmt);
>>>> + gimple *g = gimple_build_assign (lhs, LROTATE_EXPR, arg0, arg1);
>>>> + gimple_set_location (g, gimple_location (stmt));
>>>> + gsi_replace (gsi, g, true);
>>>> + return true;
>>>> + }
>>>> + /* Flavors of vector shift right algebraic. vec_sra{b,h,w} -> vsra{b,h,w}. */
>>>> + case ALTIVEC_BUILTIN_VSRAB:
>>>> + case ALTIVEC_BUILTIN_VSRAH:
>>>> + case ALTIVEC_BUILTIN_VSRAW:
>>>> + case P8V_BUILTIN_VSRAD:
>>>> + {
>>>> + arg0 = gimple_call_arg (stmt, 0);
>>>> + arg1 = gimple_call_arg (stmt, 1);
>>>> + lhs = gimple_call_lhs (stmt);
>>>> + gimple *g = gimple_build_assign (lhs, RSHIFT_EXPR, arg0, arg1);
>>>> + gimple_set_location (g, gimple_location (stmt));
>>>> + gsi_replace (gsi, g, true);
>>>> + return true;
>>>> + }
>>>> + /* Flavors of vector shift left. builtin_altivec_vsl{b,h,w} -> vsl{b,h,w}. */
>>>> + case ALTIVEC_BUILTIN_VSLB:
>>>> + case ALTIVEC_BUILTIN_VSLH:
>>>> + case ALTIVEC_BUILTIN_VSLW:
>>>> + case P8V_BUILTIN_VSLD:
>>>> + {
>>>> + arg0 = gimple_call_arg (stmt, 0);
>>>> + arg1 = gimple_call_arg (stmt, 1);
>>>> + lhs = gimple_call_lhs (stmt);
>>>> + gimple *g = gimple_build_assign (lhs, LSHIFT_EXPR, arg0, arg1);
>>>> + gimple_set_location (g, gimple_location (stmt));
>>>> + gsi_replace (gsi, g, true);
>>>> + return true;
>>>> + }
>>>> + /* Flavors of vector shift right. */
>>>> + case ALTIVEC_BUILTIN_VSRB:
>>>> + case ALTIVEC_BUILTIN_VSRH:
>>>> + case ALTIVEC_BUILTIN_VSRW:
>>>> + case P8V_BUILTIN_VSRD:
>>>> + {
>>>> + arg0 = gimple_call_arg (stmt, 0);
>>>> + arg1 = gimple_call_arg (stmt, 1);
>>>> + lhs = gimple_call_lhs (stmt);
>>>> + gimple *g;
>>>> +      /* Convert arg0 to unsigned.  */
>>>> +      arg0 = convert (unsigned_type_for (TREE_TYPE (arg0)), arg0);
>>>> +      tree arg0_uns = create_tmp_reg_or_ssa_name (unsigned_type_for (TREE_TYPE (arg0)));
>>>> +      g = gimple_build_assign (arg0_uns, arg0);
>>>> + gimple_set_location (g, gimple_location (stmt));
>>>> + gsi_insert_before (gsi, g, GSI_SAME_STMT);
>>>> +      /* Convert lhs to unsigned and do the shift.  */
>>>> +      tree lhs_uns = create_tmp_reg_or_ssa_name (unsigned_type_for (TREE_TYPE (lhs)));
>>>> + g = gimple_build_assign (lhs_uns, RSHIFT_EXPR, arg0_uns, arg1);
>>>> + gimple_set_location (g, gimple_location (stmt));
>>>> + gsi_insert_before (gsi, g, GSI_SAME_STMT);
>>>> +      /* Convert lhs back to a signed type for the return.  */
>>>> +      lhs_uns = convert (signed_type_for (TREE_TYPE (lhs)), lhs_uns);
>>>> +      g = gimple_build_assign (lhs, lhs_uns);
>>>> + gimple_set_location (g, gimple_location (stmt));
>>>> + gsi_replace (gsi, g, true);
>>>> + return true;
>>>> + }
>>>> default:
>>>> break;
>>>> }
>>>> @@ -19128,6 +19198,14 @@ builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0,
>>>> h.uns_p[2] = 1;
>>>> break;
>>>>
>>>> + /* unsigned second arguments (vector shift right). */
>>>> + case ALTIVEC_BUILTIN_VSRB:
>>>> + case ALTIVEC_BUILTIN_VSRH:
>>>> + case ALTIVEC_BUILTIN_VSRW:
>>>> + case P8V_BUILTIN_VSRD:
>>>> + h.uns_p[2] = 1;
>>>> + break;
>>>> +
>>>> default:
>>>> break;
>>>> }
>>>> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-char.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-char.c
>>>> new file mode 100644
>>>> index 0000000..ebe91e7
>>>> --- /dev/null
>>>> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-char.c
>>>> @@ -0,0 +1,66 @@
>>>> +/* Verify that overloaded built-ins for vec_sl with char
>>>> + inputs produce the right results. */
>>>> +
>>>> +/* { dg-do compile } */
>>>> +/* { dg-require-effective-target powerpc_altivec_ok } */
>>>> +/* { dg-options "-maltivec -O2" } */
>>>> +
>>>> +#include <altivec.h>
>>>> +
>>>> +//# vec_sl - shift left
>>>> +//# vec_sr - shift right
>>>> +//# vec_sra - shift right algebraic
>>>> +//# vec_rl - rotate left
>>>> +
>>>> +vector signed char
>>>> +testsl_signed (vector signed char x, vector unsigned char y)
>>>> +{
>>>> + return vec_sl (x, y);
>>>> +}
>>>> +
>>>> +vector unsigned char
>>>> +testsl_unsigned (vector unsigned char x, vector unsigned char y)
>>>> +{
>>>> + return vec_sl (x, y);
>>>> +}
>>>> +
>>>> +vector signed char
>>>> +testsr_signed (vector signed char x, vector unsigned char y)
>>>> +{
>>>> + return vec_sr (x, y);
>>>> +}
>>>> +
>>>> +vector unsigned char
>>>> +testsr_unsigned (vector unsigned char x, vector unsigned char y)
>>>> +{
>>>> + return vec_sr (x, y);
>>>> +}
>>>> +
>>>> +vector signed char
>>>> +testsra_signed (vector signed char x, vector unsigned char y)
>>>> +{
>>>> + return vec_sra (x, y);
>>>> +}
>>>> +
>>>> +vector unsigned char
>>>> +testsra_unsigned (vector unsigned char x, vector unsigned char y)
>>>> +{
>>>> + return vec_sra (x, y);
>>>> +}
>>>> +
>>>> +vector signed char
>>>> +testrl_signed (vector signed char x, vector unsigned char y)
>>>> +{
>>>> + return vec_rl (x, y);
>>>> +}
>>>> +
>>>> +vector unsigned char
>>>> +testrl_unsigned (vector unsigned char x, vector unsigned char y)
>>>> +{
>>>> + return vec_rl (x, y);
>>>> +}
>>>> +
>>>> +/* { dg-final { scan-assembler-times "vslb" 2 } } */
>>>> +/* { dg-final { scan-assembler-times "vsrb" 2 } } */
>>>> +/* { dg-final { scan-assembler-times "vsrab" 2 } } */
>>>> +/* { dg-final { scan-assembler-times "vrlb" 2 } } */
>>>> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-int.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-int.c
>>>> new file mode 100644
>>>> index 0000000..e9c5fe1
>>>> --- /dev/null
>>>> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-int.c
>>>> @@ -0,0 +1,61 @@
>>>> +/* Verify that overloaded built-ins for vec_sl with int
>>>> + inputs produce the right results. */
>>>> +
>>>> +/* { dg-do compile } */
>>>> +/* { dg-require-effective-target powerpc_altivec_ok } */
>>>> +/* { dg-options "-maltivec -O2" } */
>>>> +
>>>> +#include <altivec.h>
>>>> +
>>>> +vector signed int
>>>> +testsl_signed (vector signed int x, vector unsigned int y)
>>>> +{
>>>> + return vec_sl (x, y);
>>>> +}
>>>> +
>>>> +vector unsigned int
>>>> +testsl_unsigned (vector unsigned int x, vector unsigned int y)
>>>> +{
>>>> + return vec_sl (x, y);
>>>> +}
>>>> +
>>>> +vector signed int
>>>> +testsr_signed (vector signed int x, vector unsigned int y)
>>>> +{
>>>> + return vec_sr (x, y);
>>>> +}
>>>> +
>>>> +vector unsigned int
>>>> +testsr_unsigned (vector unsigned int x, vector unsigned int y)
>>>> +{
>>>> + return vec_sr (x, y);
>>>> +}
>>>> +
>>>> +vector signed int
>>>> +testsra_signed (vector signed int x, vector unsigned int y)
>>>> +{
>>>> + return vec_sra (x, y);
>>>> +}
>>>> +
>>>> +vector unsigned int
>>>> +testsra_unsigned (vector unsigned int x, vector unsigned int y)
>>>> +{
>>>> + return vec_sra (x, y);
>>>> +}
>>>> +
>>>> +vector signed int
>>>> +testrl_signed (vector signed int x, vector unsigned int y)
>>>> +{
>>>> + return vec_rl (x, y);
>>>> +}
>>>> +
>>>> +vector unsigned int
>>>> +testrl_unsigned (vector unsigned int x, vector unsigned int y)
>>>> +{
>>>> + return vec_rl (x, y);
>>>> +}
>>>> +
>>>> +/* { dg-final { scan-assembler-times "vslw" 2 } } */
>>>> +/* { dg-final { scan-assembler-times "vsrw" 2 } } */
>>>> +/* { dg-final { scan-assembler-times "vsraw" 2 } } */
>>>> +/* { dg-final { scan-assembler-times "vrlw" 2 } } */
>>>> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-longlong.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-longlong.c
>>>> new file mode 100644
>>>> index 0000000..97b82cf
>>>> --- /dev/null
>>>> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-longlong.c
>>>> @@ -0,0 +1,63 @@
>>>> +/* Verify that overloaded built-ins for vec_sl with long long
>>>> + inputs produce the right results. */
>>>> +
>>>> +/* { dg-do compile } */
>>>> +/* { dg-require-effective-target powerpc_p8vector_ok } */
>>>> +/* { dg-options "-mpower8-vector -O2" } */
>>>> +
>>>> +#include <altivec.h>
>>>> +
>>>> +vector signed long long
>>>> +testsl_signed (vector signed long long x, vector unsigned long long y)
>>>> +{
>>>> + return vec_sl (x, y);
>>>> +}
>>>> +
>>>> +vector unsigned long long
>>>> +testsl_unsigned (vector unsigned long long x, vector unsigned long long y)
>>>> +{
>>>> + return vec_sl (x, y);
>>>> +}
>>>> +
>>>> +vector signed long long
>>>> +testsr_signed (vector signed long long x, vector unsigned long long y)
>>>> +{
>>>> + return vec_sr (x, y);
>>>> +}
>>>> +
>>>> +vector unsigned long long
>>>> +testsr_unsigned (vector unsigned long long x, vector unsigned long long y)
>>>> +{
>>>> + return vec_sr (x, y);
>>>> +}
>>>> +
>>>> +vector signed long long
>>>> +testsra_signed (vector signed long long x, vector unsigned long long y)
>>>> +{
>>>> + return vec_sra (x, y);
>>>> +}
>>>> +
>>>> +/* watch for PR 79544 here (vsrd / vsrad issue) */
>>>> +vector unsigned long long
>>>> +testsra_unsigned (vector unsigned long long x, vector unsigned long long y)
>>>> +{
>>>> + return vec_sra (x, y);
>>>> +}
>>>> +
>>>> +vector signed long long
>>>> +testrl_signed (vector signed long long x, vector unsigned long long y)
>>>> +{
>>>> + return vec_rl (x, y);
>>>> +}
>>>> +
>>>> +vector unsigned long long
>>>> +testrl_unsigned (vector unsigned long long x, vector unsigned long long y)
>>>> +{
>>>> + return vec_rl (x, y);
>>>> +}
>>>> +
>>>> +/* { dg-final { scan-assembler-times "vsld" 2 } } */
>>>> +/* { dg-final { scan-assembler-times "vsrd" 2 } } */
>>>> +/* { dg-final { scan-assembler-times "vsrad" 2 } } */
>>>> +/* { dg-final { scan-assembler-times "vrld" 2 } } */
>>>> +
>>>> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-short.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-short.c
>>>> new file mode 100644
>>>> index 0000000..4ca7c18
>>>> --- /dev/null
>>>> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-short.c
>>>> @@ -0,0 +1,61 @@
>>>> +/* Verify that overloaded built-ins for vec_sl with short
>>>> + inputs produce the right results. */
>>>> +
>>>> +/* { dg-do compile } */
>>>> +/* { dg-require-effective-target powerpc_altivec_ok } */
>>>> +/* { dg-options "-maltivec -O2" } */
>>>> +
>>>> +#include <altivec.h>
>>>> +
>>>> +vector signed short
>>>> +testsl_signed (vector signed short x, vector unsigned short y)
>>>> +{
>>>> + return vec_sl (x, y);
>>>> +}
>>>> +
>>>> +vector unsigned short
>>>> +testsl_unsigned (vector unsigned short x, vector unsigned short y)
>>>> +{
>>>> + return vec_sl (x, y);
>>>> +}
>>>> +
>>>> +vector signed short
>>>> +testsr_signed (vector signed short x, vector unsigned short y)
>>>> +{
>>>> + return vec_sr (x, y);
>>>> +}
>>>> +
>>>> +vector unsigned short
>>>> +testsr_unsigned (vector unsigned short x, vector unsigned short y)
>>>> +{
>>>> + return vec_sr (x, y);
>>>> +}
>>>> +
>>>> +vector signed short
>>>> +testsra_signed (vector signed short x, vector unsigned short y)
>>>> +{
>>>> + return vec_sra (x, y);
>>>> +}
>>>> +
>>>> +vector unsigned short
>>>> +testsra_unsigned (vector unsigned short x, vector unsigned short y)
>>>> +{
>>>> + return vec_sra (x, y);
>>>> +}
>>>> +
>>>> +vector signed short
>>>> +testrl_signed (vector signed short x, vector unsigned short y)
>>>> +{
>>>> + return vec_rl (x, y);
>>>> +}
>>>> +
>>>> +vector unsigned short
>>>> +testrl_unsigned (vector unsigned short x, vector unsigned short y)
>>>> +{
>>>> + return vec_rl (x, y);
>>>> +}
>>>> +
>>>> +/* { dg-final { scan-assembler-times "vslh" 2 } } */
>>>> +/* { dg-final { scan-assembler-times "vsrh" 2 } } */
>>>> +/* { dg-final { scan-assembler-times "vsrah" 2 } } */
>>>> +/* { dg-final { scan-assembler-times "vrlh" 2 } } */