This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Recognize a missed usage of a sbfiz instruction


Hi Luis,


On 02/02/18 14:38, Luis Machado wrote:
A customer reported the following missed opportunities to combine a couple
instructions into a sbfiz.

int sbfiz32 (int x)
{
  return x << 29 >> 10;
}

long sbfiz64 (long x)
{
  return x << 58 >> 20;
}

This gets converted to the following pattern:

(set (reg:SI 98)
    (ashift:SI (sign_extend:SI (reg:HI 0 x0 [ xD.3334 ]))
        (const_int 6 [0x6])))

Currently, gcc generates the following:

sbfiz32:
        lsl x0, x0, 29
        asr x0, x0, 10
        ret

sbfiz64:
        lsl x0, x0, 58
        asr x0, x0, 20
        ret

It could generate this instead:

sbfiz32:
        sbfiz   w0, w0, 19, 3
        ret

sbfiz64::
        sbfiz   x0, x0, 38, 6
        ret

The unsigned versions already generate ubfiz for the same code, so the lack of
a sbfiz pattern may have been an oversight.

You're right. In fact, llvm will generate the SBFIZ form.


This particular sbfiz pattern shows up in both CPU2006 (~ 80 hits) and
CPU2017 (~ 280 hits). It's not a lot, but seems beneficial in any case. No
significant performance differences, probably due to the small number of
occurrences or cases outside hot areas.

Yeah, these little missed opportunities add up in the long run :)


Regression-tested and bootstrapped ok on aarch64-linux. Validated with
CPU2017 and CPU2006 runs.

I thought i'd put this up for review. I know we're still not in development
mode yet.

2018-02-02  Luis Machado  <luis.machado@linaro.org>

        gcc/
        * config/aarch64/aarch64.md (*ashift<mode>_extv_bfiz): New pattern.
        * testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c: New test.

I'm sure you know this already, the testsuite entry will go into a ChangeLog
of its own in gcc/testsuite/ with the testsuite/ prefix removed.

---
 gcc/config/aarch64/aarch64.md                    | 13 +++++++++++++
 gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c | 24 ++++++++++++++++++++++++
 2 files changed, 37 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 5a2a930..d336bf0 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4828,6 +4828,19 @@
   [(set_attr "type" "bfx")]
 )

+;; Match sbfiz pattern in a shift left + shift right operation.
+
+(define_insn "*ashift<mode>_extv_bfiz"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+       (ashift:GPI (sign_extract:GPI (match_operand:GPI 1 "register_operand" "r")
+                                     (match_operand 2 "const_int_operand" "n")
+                                     (match_operand 3 "const_int_operand" "n"))
+                    (match_operand 4 "const_int_operand" "n")))]
+  "UINTVAL (operands[2]) < <GPI:sizen>"
+  "sbfiz\\t%<w>0, %<w>1, %4, %2"
+  [(set_attr "type" "bfx")]
+)

I think you need some more validation on operands 3 and 4.
Look at other similar patterns, but you want something like the aarch64_simd_shift_imm_<mode> predicate
on them to make sure they fall within [0, 63] or [0, 31].
Also, for operands 2 you probably want the aarch64_simd_shift_imm_offset_di to make sure it doesn't
allow a size of zero. Don't know if the RTL optimisers will try to create such wonky RTL, but
we tend to be defensive in the pattern about these things.

+
 ;; When the bit position and width of the equivalent extraction add up to 32
 ;; we can use a W-reg LSL instruction taking advantage of the implicit
 ;; zero-extension of the X-reg.
diff --git a/gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c b/gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c
new file mode 100644
index 0000000..931f8f4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+/* Check that a LSL followed by an ASR can be combined into a single SBFIZ
+   instruction.  */
+
+/* Using W-reg */
+
+int
+sbfiz32 (int x)
+{
+  return x << 29 >> 10;
+}
+
+/* Using X-reg */
+
+long
+sbfiz64 (long x)
+{
+  return x << 58 >> 20;
+}

long will be 32 bits when testing with -mabi=ilp32 so you want this to be long long,
or some other guaranteed 64-bit type.

Thanks,
Kyrill

+
+/* { dg-final { scan-assembler "sbfiz\tw" } } */
+/* { dg-final { scan-assembler "sbfiz\tx" } } */
--
2.7.4



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]