[PATCH] RISC-V: Add implementation for builtin overflow

Jim Wilson jimw@sifive.com
Wed Feb 17 00:57:47 GMT 2021


On Thu, Jan 21, 2021 at 2:25 AM Levy <admin@levyhsu.com> wrote:

> Added implementation for builtin overflow detection, new patterns are
> listed below.
>

For rv32 SImode and rv64 DImode, the unsigned add/sub and signed/unsigned
mul patterns seem to give the same result as the default code generation.
That has me wondering if we really need the patterns.

For rv64 SImode, the signed add/sub patterns are generating worse code. Signed
add overflow without the pattern is
        add     a1,a0,a1
        sext.w  a0,a1
        bne     a1,a0,.L23
and with the pattern is
        mv      a5,a0
        addw    a0,a0,a1
        slt     a5,a0,a5
        slti    a1,a1,0
        bne     a1,a5,.L27
ignoring the register allocation issue we have one more instruction using the
pattern which is undesirable.  This needs a fix.  We could just use X instead
of GPR.  We could then optionally add rv64 SImode patterns.  If we don't
add them it looks like a bug, so I think we should add another pattern for
this.

For rv64 SImode, the unsigned add/sub patterns are generating better code.  We
have unnecessary zero extends without the patterns.  This suggests that the
unsigned add/sub patterns really are useful.  Only 1 of the 3 expansions is
useful, but if we only implement one of the 3 expansions that looks like a
bug so I think this is OK as is.

The signed/unsigned mul patterns only support rv32 SImode and rv64 DImode.
The rv64 SImode support is missing.  But even though there is no pattern
for it we can still get worse code for it.  An unpatched tree for signed
mul gives
        mul     a1,a0,a1
        sext.w  a0,a1
        bne     a1,a0,.L37
and a patched tree gives
        mul     a0,a0,a1
        sraiw   a4,a0,31
        srai    a5,a0,32
        bne     a4,a5,.L43
On the other hand, an unpatched tree for unsigned mul gives
        slli    a0,a0,32
        slli    a1,a1,32
        srli    a0,a0,32
        srli    a1,a1,32
        mul     a5,a0,a1
        slli    a4,a5,32
        srli    a4,a4,32
        bne     a5,a4,.L61
and a patched tree gives
        slli    a0,a0,32
        slli    a1,a1,32
        srli    a0,a0,32
        srli    a1,a1,32
        mul     a0,a0,a1
        srai    a5,a0,32
        bne     a5,zero,.L69
which is one instruction shorter.  In both cases, the difference is due to
the hook to set min precision to 32.  Presumably we need this hook for
add/sub, so we need to add the missing rv64 SImode signed mul pattern.
Since we have to have one of them, we may as well have all of them for
completeness.

There are a few minor style issues.

In the define_expands, the patterns aren't formatted properly.  An open
parenthesis should be aligned to an open paren at the same depth on the
line above.  If you are an emacs user, you can use M-x emacs-lisp-mode and
hit tab to align them.

In a define_expand the constraints aren't useful and shouldn't be
included.  In a define_expand that always emits its own RTL the pattern
isn't useful either.  Only the operand modes, numbers, and predicates are
useful.  This is why some define_expands only list the operands and don't
include the pattern.  But including the pattern is OK, it just isn't
required.  So the only real fix we need here is to drop the constraints.

I attached the testcase I used for testing purposes.  Every function should
be same size or smaller with the patch.

Jim
-------------- next part --------------
A non-text attachment was scrubbed...
Name: overflow-test.c
Type: text/x-csrc
Size: 1949 bytes
Desc: not available
URL: <https://gcc.gnu.org/pipermail/gcc-patches/attachments/20210216/b2076e46/attachment-0001.bin>


More information about the Gcc-patches mailing list