[ARC PATCH] Improve performance of SImode right shifts.
Roger Sayle
roger@nextmovesoftware.com
Thu Jul 11 18:45:20 GMT 2024
This patch improves the speed of ARC's ashrsi3 and lshrsi3, on CPUs
without a barrel shifter, when not optimizing for size. The current
implementations of right shifts by a constant are optimal for code
size, but at a significant performance cost. By emitting an extra
instruction or two, when not optimizing for size, we can improve
performance (sometimes dramatically).

  [al]shrsi3 #5    Before 4 insns@12 cycles, after 5 insns@5 cycles

Without -mswap
  [al]shrsi3 #29   Before 4 insns@60 cycles, after 5 insns@31 cycles

With -mswap
  lshrsi3 #29      Before 4 insns@60 cycles, after 6 insns@16 cycles

This patch has been minimally tested by building a cross-compiler
to arc-linux hosted on x86_64-pc-linux-gnu where there are no new
failures from "make -k check" in the compile-only tests.
Ok for mainline (after 3rd-party testing)?
2024-07-11 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/arc/arc.cc (arc_split_ashr): When not optimizing for
size: fully unroll ashr #5; on TARGET_SWAP, for shifts between
19 and 29, perform ashr #16 using two instructions, then
recursively perform the remaining shift; and for shifts by
odd amounts, perform a single shift, then the remainder
of the shift using a loop doing two bits per iteration.
(arc_split_lshr): Likewise.
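
For readers without the attachment to hand, the decomposition described in
the ChangeLog can be modelled in plain C. This is only a sketch of the
arithmetic, not the code in arc.cc: the function name lshr_model and the
have_swap parameter are hypothetical, the two-instruction shift by 16 is
modelled as an ordinary 16-bit shift, and the special-case full unrolling
of shift #5 is not distinguished from the general odd-amount path. Only the
logical shift is shown; the arithmetic shift would follow the same shape.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the logical right shift decomposition.
   Each "x >>= 1" stands in for one single-bit shift instruction.  */
uint32_t lshr_model(uint32_t x, unsigned n, int have_swap)
{
    assert(n < 32);

    /* With swap available, shifts between 19 and 29 first shift by 16
       (assumed here to cost two instructions), then handle the rest.  */
    if (have_swap && n >= 19 && n <= 29) {
        x >>= 16;
        n -= 16;
    }

    /* For an odd amount, peel off a single one-bit shift.  */
    if (n & 1) {
        x >>= 1;
        n -= 1;
    }

    /* The remainder is even: shift two bits per loop iteration.  */
    for (unsigned i = 0; i < n / 2; i++) {
        x >>= 1;
        x >>= 1;
    }
    return x;
}
```

Under this model, a shift by 29 with have_swap set becomes a shift by 16,
one single-bit shift, and six loop iterations of two bits each; without it,
one single-bit shift and fourteen iterations.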
Thanks in advance,
Roger
--
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: patchas.txt
URL: <https://gcc.gnu.org/pipermail/gcc-patches/attachments/20240711/e9d405df/attachment.txt>