[PATCH] RISC-V: Add implementation for builtin overflow
Jim Wilson
jimw@sifive.com
Wed Feb 17 00:57:47 GMT 2021
On Thu, Jan 21, 2021 at 2:25 AM Levy <admin@levyhsu.com> wrote:
> Added implementation for builtin overflow detection, new patterns are
> listed below.
>
For rv32 SImode and rv64 DImode, the unsigned add/sub and signed/unsigned
mul patterns seem to give the same result as the default code generation.
That has me wondering if we really need the patterns.
For rv64 SImode, the signed add/sub patterns are generating worse code. Signed
add overflow without the pattern is
add a1,a0,a1
sext.w a0,a1
bne a1,a0,.L23
and with the pattern is
mv a5,a0
addw a0,a0,a1
slt a5,a0,a5
slti a1,a1,0
bne a1,a5,.L27
ignoring the register allocation issue we have one more instruction using the
pattern which is undesirable. This needs a fix. We could just use X instead
of GPR. We could then optionally add rv64 SImode patterns. If we don't
add them it looks like a bug, so I think we should add another pattern for
this.
For rv64 SImode, the unsigned add/sub patterns are generating better code. We
have unnecessary zero extends without the patterns. This suggests that the
unsigned add/sub patterns really are useful. Only 1 of the 3 expansions is
useful, but if we only implement one of the 3 expansions that looks like a
bug so I think this is OK as is.
The signed/unsigned mul patterns only support rv32 SImode and rv64 DImode.
The rv64 SImode support is missing. But even though there is no pattern
for it we can still get worse code for it. An unpatched tree for signed
mul gives
mul a1,a0,a1
sext.w a0,a1
bne a1,a0,.L37
and a patched tree gives
mul a0,a0,a1
sraiw a4,a0,31
srai a5,a0,32
bne a4,a5,.L43
On the other hand, an unpatched tree for unsigned mul gives
slli a0,a0,32
slli a1,a1,32
srli a0,a0,32
srli a1,a1,32
mul a5,a0,a1
slli a4,a5,32
srli a4,a4,32
bne a5,a4,.L61
and a patched tree gives
slli a0,a0,32
slli a1,a1,32
srli a0,a0,32
srli a1,a1,32
mul a0,a0,a1
srai a5,a0,32
bne a5,zero,.L69
which is one instruction shorter. In both cases, the difference is due to
the hook to set min precision to 32. Presumably we need this hook for
add/sub, so we need to add the missing rv64 SImode signed mul pattern.
Since we have to have one of them, we may as well have all of them for
completeness.
There are a few minor style issues.
In the define_expands, the patterns aren't formatted properly. An open
parenthesis should be aligned to an open paren at the same depth on the
line above. If you are an emacs user, you can use M-x emacs-lisp-mode and
hit tab to align them.
In a define_expand the constraints aren't useful and shouldn't be
included. In a define_expand that always emits its own RTL the pattern
isn't useful either. Only the operand modes, numbers, and predicates are
useful. This is why some define_expands only list the operands and don't
include the pattern. But including the pattern is OK, it just isn't
required. So the only real fix we need here is to drop the constraints.
I attached the testcase I used for testing purposes. Every function should
be same size or smaller with the patch.
Jim
-------------- next part --------------
A non-text attachment was scrubbed...
Name: overflow-test.c
Type: text/x-csrc
Size: 1949 bytes
Desc: not available
URL: <https://gcc.gnu.org/pipermail/gcc-patches/attachments/20210216/b2076e46/attachment-0001.bin>
More information about the Gcc-patches
mailing list