gcc.gnu.org Git - gcc.git/commit
[xstormy16] Recognize/support swpn (swap nibbles) instruction.
author Roger Sayle <roger@nextmovesoftware.com>
Sat, 29 Apr 2023 19:15:34 +0000 (20:15 +0100)
committer Roger Sayle <roger@nextmovesoftware.com>
Sat, 29 Apr 2023 19:15:34 +0000 (20:15 +0100)
commit 58f3cbbd7e02c29f2abd6300c0fec053559e35b4
tree aa88a648d45075b32374fcadd5b15d5fc6015300
parent 83c78cb0d78fffa8791f64ba36146d17c00c1b23

This patch adds support for xstormy16's swap nibbles instruction (swpn).
For the test case:

short foo(short x) {
  return (x&0xff00) | ((x<<4)&0xf0) | ((x>>4)&0x0f);
}

GCC with -O2 currently generates the nine-instruction sequence
(not counting the final ret):
foo:    mov r7,r2
        asr r2,#4
        and r2,#15
        mov.w r6,#-256
        and r6,r7
        or r2,r6
        shl r7,#4
        and r7,#255
        or r2,r7
        ret

with this patch, we now generate:
foo:    swpn r2
        ret
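
As a sanity check on the equivalence claimed above, here is a minimal C
sketch that is not part of the patch.  The names swpn_model, foo_unsigned
and models_agree are hypothetical, and the test-case expression is
rewritten with unsigned types (an assumption made here so that no shift
operates on a negative value).  The model assumes swpn swaps the two
nibbles of the low byte and leaves the high byte untouched, as the
description below states.

```c
#include <stdint.h>

/* Hypothetical reference model of swpn on a 16-bit register:
   swap the two nibbles of the low byte, keep the high byte.  */
static uint16_t swpn_model(uint16_t x)
{
    uint16_t lo = (uint16_t)(x & 0x00ffu);
    uint16_t swapped = (uint16_t)(((lo << 4) | (lo >> 4)) & 0x00ffu);
    return (uint16_t)((x & 0xff00u) | swapped);
}

/* The test-case expression, rewritten with unsigned types
   (an assumption of this sketch, not the original source).  */
static uint16_t foo_unsigned(uint16_t x)
{
    return (uint16_t)((x & 0xff00u) | ((x << 4) & 0xf0u) | ((x >> 4) & 0x0fu));
}

/* Returns 1 if both functions agree on every 16-bit input, else 0.  */
static int models_agree(void)
{
    for (uint32_t i = 0; i <= 0xffffu; i++)
        if (swpn_model((uint16_t)i) != foo_unsigned((uint16_t)i))
            return 0;
    return 1;
}
```

The exhaustive loop over all 65536 inputs is cheap and avoids having to
argue the bit manipulation by hand.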

Achieving this using combine's four-instruction "combinations" requires
a little wizardry.  First, define_insn_and_split patterns are introduced
that treat a logical shift followed by a bitwise AND as a single macro
instruction, which is split again after reload.  This is sufficient to
recognize a QImode nibble swap, which can be implemented by swpn followed
by either a zero-extension or a sign-extension from QImode to HImode.
Finally, in the correct context, a QImode swap-nibbles pattern can be
combined so that it preserves the high byte of a HImode word, matching
xstormy16's swpn semantics.  The names of the new code iterators are
taken from i386.md.
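
The steps in the paragraph above can be modelled in plain C.  This is
only an illustrative sketch under stated assumptions, not the RTL
patterns the patch adds: every function name is hypothetical, and
uint8_t and uint16_t stand in for QImode and HImode.  It also shows why
recombining with the preserved high byte can use OR, XOR or addition
interchangeably: the two operands have no set bits in common.

```c
#include <stdint.h>

/* QImode nibble swap built from two "shift then AND" pieces.  */
static uint8_t swap_nibbles_qi(uint8_t b)
{
    return (uint8_t)(((b << 4) & 0xf0u) | ((b >> 4) & 0x0fu));
}

/* Nibble swap followed by zero-extension: high byte of result is 0.  */
static uint16_t swpn_zext(uint8_t b)
{
    return (uint16_t)swap_nibbles_qi(b);
}

/* Nibble swap followed by sign-extension: the high byte is filled
   with copies of bit 7 of the swapped byte.  */
static int16_t swpn_sext(uint8_t b)
{
    uint8_t s = swap_nibbles_qi(b);
    return (int16_t)(s >= 0x80u ? (int)s - 0x100 : (int)s);
}

/* HImode forms: keep the high byte of x and merge in the swapped low
   byte.  The operands share no set bits, so |, ^ and + all agree.  */
static uint16_t swpn_hi_ior(uint16_t x)
{
    return (uint16_t)((x & 0xff00u) | swpn_zext((uint8_t)(x & 0xffu)));
}

static uint16_t swpn_hi_xor(uint16_t x)
{
    return (uint16_t)((x & 0xff00u) ^ swpn_zext((uint8_t)(x & 0xffu)));
}

static uint16_t swpn_hi_plus(uint16_t x)
{
    return (uint16_t)((x & 0xff00u) + swpn_zext((uint8_t)(x & 0xffu)));
}
```

For example, swpn_hi_ior(0x1234) yields 0x1243, the same value as the
HImode test case computes for that input.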

2023-04-29  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/stormy16/stormy16.md (any_lshift): New code iterator.
	(any_or_plus): Likewise.
	(any_rotate): Likewise.
	(*<any_lshift>_and_internal): New define_insn_and_split to
	recognize a logical shift followed by an AND, and split it
	again after reload.
	(*swpn): New define_insn matching xstormy16's swpn.
	(*swpn_zext): New define_insn recognizing swpn followed by
	zero_extendqihi2, i.e. with the high byte set to zero.
	(*swpn_sext): Likewise, for swpn followed by cbw.
	(*swpn_sext_2): Likewise, for an alternate RTL form.
	(*swpn_zext_ior): A pre-reload splitter so that an swpn+zext+ior
	sequence is split in the correct place to recognize the *swpn_zext
	followed by any_or_plus (ior, xor or plus) instruction.

gcc/testsuite/ChangeLog
	* gcc.target/xstormy16/swpn-1.c: New QImode test case.
	* gcc.target/xstormy16/swpn-2.c: New zero_extend test case.
	* gcc.target/xstormy16/swpn-3.c: New sign_extend test case.
	* gcc.target/xstormy16/swpn-4.c: New HImode test case.
gcc/config/stormy16/stormy16.md
gcc/testsuite/gcc.target/xstormy16/swpn-1.c [new file with mode: 0644]
gcc/testsuite/gcc.target/xstormy16/swpn-2.c [new file with mode: 0644]
gcc/testsuite/gcc.target/xstormy16/swpn-3.c [new file with mode: 0644]
gcc/testsuite/gcc.target/xstormy16/swpn-4.c [new file with mode: 0644]