[PATCH] s390: Add expander for uaddc/usubc optabs
Andreas Krebbel
krebbel@linux.ibm.com
Mon Nov 18 08:32:29 GMT 2024
Hi Stefan,
On 11/12/24 10:35, Stefan Schulze Frielinghaus wrote:
>>> + rtx cond = gen_rtx_LTU (<MODE>mode, gen_rtx_REG (CCL1mode, CC_REGNUM), const0_rtx);
>>> + if (operands[4] == const0_rtx)
>>> + emit_insn (gen_add<mode>3_carry1_cc (operands[0], operands[2], operands[3]));
>>> + else
>> If we would just generate the alc pattern with a carry in of 0, wouldn't the
>> other optimizers be able to figure out that the a normal add would do here?
>> This path does not seem to get exercised by your testcases.
> The zero carry in can occur due to
> https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=a7aec76a74dd38524be325343158d3049b6ab3ac
> which is why we have a special case just in order to emit an add with
> carry out only.
Ok. I was hoping that this downgrade from add with carry in/out to add
with carry out would happen somewhat "automatically", if the the carry
in happens to be a constant zero. But probably not.
Your testcases invokes the pattern also with a constant 0 as carry in,
but since you prevent inlining the pattern is never expanded with a
const0_rtx. The testcase in the commit above is x86 specific, so it
might make sense to add an invocation which triggers the code path
explicitly. Just to be sure.
>> create a CCU mode comparison result, which is currently consumed as CCL1 by
>> the ALC pattern. This seems to be inconsistent. s390_emit_compare returns
>> the condition, which I think needs to be fed into the alc_carry1_cc pattern
>> as input condition.
>>
>> Also is LTU really the correct code here? CCU + LTU would expect CC1, but
>> you want CC2 or CC3 for the carry in, so GTU should be the better choice.
>> s390_alc_comparison should make sure that only valid combinations are
>> accepted, CCU + GTU would be one of them.
> I was coming up with my own condition since conditions created by
> s390_emit_compare() are of void mode which is why the alc predicates
> s390_alc_comparison() failed since these require GPR mode. I've fixed
> that by using PUT_MODE_RAW on those. I think we could also remove the
> mode from the match_operand which might be more appropriate. I've done
> it the latter way for sub<mode>3_slb_borrow1_cc. Once we have settled
> for one or the other version I will align uaddc/usubc.
The PLUS/MINUS arithmetic operations require both operands to have a
proper integer mode. So I think dropping the mode from match_operand
would be wrong. On the other hand, an IF_THEN_ELSE always requires the
comparison to have VOIDmode, that's what s390_emit_compare is supposed
to be used for. So taking what s390_emit_compare generates and changing
the mode is the right thing here. Since PUT_MODE is always a bit ugly,
we might also want go with extending s390_emit_compare with a target
mode operand defaulting to VOIDmode instead.
>>> +/* { dg-do run } */
>>> +/* { dg-options "-O2 -mzarch -save-temps -fdump-tree-optimized" } */
>>> +/* { dg-final { scan-tree-dump-times "\\.UADDC \\(" 2 "optimized" { target lp64 } } } */
>>> +/* { dg-final { scan-tree-dump-times "\\.UADDC \\(" 1 "optimized" { target { ! lp64 } } } } */
>>> +/* { dg-final { scan-assembler-times "\\talcr\\t" 1 } } */
>>> +/* { dg-final { scan-assembler-times "\\talcgr\\t" 1 { target lp64 } } } */
>> Your checks seem to rely on the testcase being compiled with at least
>> -march=z13.
I think for the run test you would have to make sure that the test is
not executed on machines older than z13 then.
/* { dg-do run { target { s390_useable_hw } } } */
Bye,
Andreas
More information about the Gcc-patches
mailing list