[PATCH] rs6000: Vector shift-right should honor modulo semantics

Bill Schmidt wschmidt@linux.ibm.com
Mon Feb 11 02:42:00 GMT 2019


On 2/10/19 4:05 PM, Segher Boessenkool wrote:
> Hi Bill,
>
> On Sun, Feb 10, 2019 at 10:10:02AM -0600, Bill Schmidt wrote:
>> I've added executable tests for both shift-right algebraic and shift-right logical.
>> Both fail prior to applying the patch, and work correctly afterwards.
> Please add a test for left shifts, as well?

Can do.  I verified that left shifts were not broken and figured a test case
had been added then, but have not checked.  Good to test this particular
scenario, though.

>
>> 2019-02-08  Bill Schmidt  <wschmidt@linux.ibm.com>
>>
>> 	* config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Shift-right
>> 	and shift-left vector built-ins need to include a TRUNC_MOD_EXPR
>> 	for correct semantics.  Also, don't expand a vector-splat if there
>> 	is a type mismatch; let the back end handle it.
> Does it always result in just the shift instruction now?  Does the modulo
> ever remain?  (Maybe at -O0?)  Modulo is hugely expensive; it is always
> modulo by a power of two, so a simple bitmask, so maybe write that directly
> instead?

We always get the shift.  For -mcpu=power8, we always load the mask from
memory rather than generating the vspltisb, which is not ideal code generation,
but is at least correct.

For -mcpu=power9, we get close, but have some bad register allocation and
an unnecessary extend:

        xxspltib 0,4   <- why not just xxspltib 32,4?
        xxlor 32,0,0   <- wasted copy
        vextsb2d 0,0   <- unnecessary due to vsrad semantics
        vsrad 2,2,0

Again, this is at least correct.  We have more work to do to produce the
most efficient code, but we have PR89213 open for that.

>
>> 2019-02-08  Bill Schmidt  <wschmidt@linux.ibm.com>
>>
>> 	* gcc.target/powerpc/srad-modulo.c: New.
>> 	* gcc.target/powerpc/srd-modulo.c: New.
> Please put "vec-" in the testcase name.  You may need to rename vec-shift.c
> and/or vec-shr.c, which are not as generic as their names suggest.

Ok.

>
>> @@ -16072,6 +16116,13 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *
>>  	arg0 = gimple_call_arg (stmt, 0);
>>  	lhs = gimple_call_lhs (stmt);
>>  
>> +	/* It's sometimes useful to use one of these to build a
>> +	   splat for V2DImode, since the upper bits will be ignored.
>> +	   Don't fold if we detect that situation, as we end up
>> +	   losing the splat instruction in this case.  */
>> +	if (size != TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (TREE_TYPE (lhs)))))
>> +	  return false;
> This isn't really detecting that situation...  Instead, it doesn't fold
> whenever the size of the splatted elements isn't the same as the size of
> the elts of the target vector.  That's probably perfectly safe, but please
> spell it out.  It's fine to mention the motivating case, of course.

Yep, will correct.  Actually, as I look back at my notes, I believe that this
change is not necessary after all (same code generated with and without it).
I'll verify.

>
>> Index: gcc/testsuite/gcc.target/powerpc/srad-modulo.c
>> ===================================================================
>> --- gcc/testsuite/gcc.target/powerpc/srad-modulo.c	(nonexistent)
>> +++ gcc/testsuite/gcc.target/powerpc/srad-modulo.c	(working copy)
>> @@ -0,0 +1,43 @@
>> +/* Test that using a character splat to set up a shift-right algebraic
>> +   for a doubleword vector works correctly after gimple folding.  */
>> +
>> +/* { dg-do run { target { powerpc64*-*-* && vsx_hw } } } */
>> +/* { dg-options { "-O3" } } */
> powerpc64*-*-* is almost always wrong.  I don't think you need to limit
> to 64-bit at all here, but if you do, test for lp64 instead.

Ok.

>
> Testing for vsx_hw but not enabling vsx is probably wrong, too.

Weird.  I just tried adding -mvsx and I get this peculiar error we've seen
before about AMD graphics card offloading:

spawn -ignore SIGHUP /home/wschmidt/gcc/build/gcc-mainline-test2/gcc/xgcc -B/home/wschmidt/gcc/build/gcc-mainline-test2/gcc/ /home/wschmidt/gcc/gcc-mainline-t\
est2/gcc/testsuite/gcc.target/powerpc/srad-modulo.c -fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers -fdiagnostics-color=never -O2 -mvsx -lm -o \
./srad-modulo.exe^M
^[[01m^[[Kcc1:^[[m^[[K ^[[01;31m^[[Kerror: ^[[m^[[Kargument to '^[[01m^[[K-O^[[m^[[K' should be a non-negative integer, '^[[01m^[[Kg^[[m^[[K', '^[[01m^[[Ks^[[\
m^[[K' or '^[[01m^[[Kfast^[[m^[[K'^M
compiler exited with status 1
Executing on host: /home/wschmidt/gcc/build/gcc-mainline-test2/gcc/xgcc -B/home/wschmidt/gcc/build/gcc-mainline-test2/gcc/ offload_gcn7262.c    -fno-diagnosti\
cs-show-caret -fno-diagnostics-show-line-numbers -fdiagnostics-color=never  -foffload=amdgcn-unknown-amdhsa -S -o offload_gcn7262.s    (timeout = 300)
spawn -ignore SIGHUP /home/wschmidt/gcc/build/gcc-mainline-test2/gcc/xgcc -B/home/wschmidt/gcc/build/gcc-mainline-test2/gcc/ offload_gcn7262.c -fno-diagnostic\
s-show-caret -fno-diagnostics-show-line-numbers -fdiagnostics-color=never -foffload=amdgcn-unknown-amdhsa -S -o offload_gcn7262.s^M
xgcc: fatal error: GCC is not configured to support amdgcn-unknown-amdhsa as offload target^M
compilation terminated.^M
compiler exited with status 1
FAIL: gcc.target/powerpc/srad-modulo.c (test for excess errors)
Excess errors:
^[[01m^[[Kcc1:^[[m^[[K ^[[01;31m^[[Kerror: ^[[m^[[Kargument to '^[[01m^[[K-O^[[m^[[K' should be a non-negative integer, '^[[01m^[[Kg^[[m^[[K', '^[[01m^[[Ks^[[\
m^[[K' or '^[[01m^[[Kfast^[[m^[[K'

Any ideas what's causing this?  I can't test with -mvsx until it's fixed.
Do we still have a bug open on this?

>
> Does it need -O3, does -O2 not work?

-O2 is fine, can change to that.

>
> Should this testcase check expected machine code as well?

I think we should defer that to the fix for PR89213.  We aren't at
ideal code quite yet.

>
>> --- gcc/testsuite/gcc.target/powerpc/srd-modulo.c	(nonexistent)
>> +++ gcc/testsuite/gcc.target/powerpc/srd-modulo.c	(working copy)
>> +vui64_t
>> +test_sradi_4 (vui64_t a)
>> +{
>> +  return vec_sradi (a, 4);
>> +}
> Pasto?  (srad vs. srd).

Ecch.  Will fix.  

>
>
> Segher
>



More information about the Gcc-patches mailing list