[PATCH] s390: Add expander for uaddc/usubc optabs

Andreas Krebbel krebbel@linux.ibm.com
Mon Nov 18 08:32:29 GMT 2024


Hi Stefan,


On 11/12/24 10:35, Stefan Schulze Frielinghaus wrote:
>>> +  rtx cond = gen_rtx_LTU (<MODE>mode, gen_rtx_REG (CCL1mode, CC_REGNUM), const0_rtx);
>>> +  if (operands[4] == const0_rtx)
>>> +    emit_insn (gen_add<mode>3_carry1_cc (operands[0], operands[2], operands[3]));
>>> +  else
>> If we would just generate the alc pattern with a carry in of 0, wouldn't the
>> other optimizers be able to figure out that the a normal add would do here?
>> This path does not seem to get exercised by your testcases.
> The zero carry in can occur due to
> https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=a7aec76a74dd38524be325343158d3049b6ab3ac
> which is why we have a special case just in order to emit an add with
> carry out only.

Ok. I was hoping that this downgrade from add with carry in/out to add 
with carry out would happen somewhat "automatically", if the the carry 
in happens to be a constant zero. But probably not.

Your testcases invokes the pattern also with a constant 0 as carry in, 
but since you prevent inlining the pattern is never expanded with a 
const0_rtx. The testcase in the commit above is x86 specific, so it 
might make sense to add an invocation which triggers the code path 
explicitly. Just to be sure.

>> create a CCU mode comparison result, which is currently consumed as CCL1 by
>> the ALC pattern. This seems to be inconsistent. s390_emit_compare returns
>> the condition, which I think needs to be fed into the alc_carry1_cc pattern
>> as input condition.
>>
>> Also is LTU really the correct code here? CCU + LTU would expect CC1, but
>> you want CC2 or CC3 for the carry in, so GTU should be the better choice.
>> s390_alc_comparison should make sure that only valid combinations are
>> accepted, CCU + GTU would be one of them.
> I was coming up with my own condition since conditions created by
> s390_emit_compare() are of void mode which is why the alc predicates
> s390_alc_comparison() failed since these require GPR mode.  I've fixed
> that by using PUT_MODE_RAW on those.  I think we could also remove the
> mode from the match_operand which might be more appropriate.  I've done
> it the latter way for sub<mode>3_slb_borrow1_cc.  Once we have settled
> for one or the other version I will align uaddc/usubc.

The PLUS/MINUS arithmetic operations require both operands to have a 
proper integer mode. So I think dropping the mode from match_operand 
would be wrong. On the other hand, an IF_THEN_ELSE always requires the 
comparison to have VOIDmode, that's what s390_emit_compare is supposed 
to be used for. So taking what s390_emit_compare generates and changing 
the mode is the right thing here. Since PUT_MODE is always a bit ugly, 
we might also want go with extending s390_emit_compare with a target 
mode operand defaulting to VOIDmode instead.

>>> +/* { dg-do run } */
>>> +/* { dg-options "-O2 -mzarch -save-temps -fdump-tree-optimized" }  */
>>> +/* { dg-final { scan-tree-dump-times "\\.UADDC \\(" 2 "optimized" { target lp64 } } } */
>>> +/* { dg-final { scan-tree-dump-times "\\.UADDC \\(" 1 "optimized" { target { ! lp64 } } } } */
>>> +/* { dg-final { scan-assembler-times "\\talcr\\t" 1 } } */
>>> +/* { dg-final { scan-assembler-times "\\talcgr\\t" 1 { target lp64 } } } */
>> Your checks seem to rely on the testcase being compiled with at least
>> -march=z13.

I think for the run test you would have to make sure that the test is 
not executed on machines older than z13 then.

/* { dg-do run { target { s390_useable_hw } } } */


Bye,

Andreas



More information about the Gcc-patches mailing list