gcc.gnu.org Git - gcc.git/commit
RISC-V: Support vfwmul.vv combine lowering
author Juzhe-Zhong <juzhe.zhong@rivai.ai>
Wed, 28 Jun 2023 04:15:12 +0000 (12:15 +0800)
committer Lehua Ding <lehua.ding@rivai.ai>
Mon, 3 Jul 2023 09:22:28 +0000 (17:22 +0800)
commit bc32918b063b9fa3dffc8815478a81df6ad999ca
tree 32d7a8b1f11de5be4ad9d080672c9f5c8ab5d249
parent 3755ad7514978e88809a7ad98c10592e4814a6ef
RISC-V: Support vfwmul.vv combine lowering

Consider the following complicated case:
  __attribute__ ((noipa)) void vwadd_##TYPE1_##TYPE2 (                         \
    TYPE1 *__restrict dst, TYPE1 *__restrict dst2, TYPE1 *__restrict dst3,     \
    TYPE1 *__restrict dst4, TYPE2 *__restrict a, TYPE2 *__restrict b,          \
    TYPE2 *__restrict a2, TYPE2 *__restrict b2, int n)                         \
  {                                                                            \
    for (int i = 0; i < n; i++)                                                \
      {                                                                        \
        dst[i] = (TYPE1) a[i] * (TYPE1) b[i];                                  \
        dst2[i] = (TYPE1) a2[i] * (TYPE1) b[i];                                \
        dst3[i] = (TYPE1) a2[i] * (TYPE1) a[i];                                \
        dst4[i] = (TYPE1) a[i] * (TYPE1) b2[i];                                \
      }                                                                        \
  }

TEST_TYPE (double, float)
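
To make the instantiation concrete, here is a hand-expanded sketch of what TEST_TYPE (double, float) would generate, assuming the macro body above belongs to a TEST_TYPE (TYPE1, TYPE2) definition whose #define line is not shown in this message. The function name widen_mul_double_float is illustrative only, and the GCC-specific noipa attribute is left out so that the sketch also builds with other compilers.

```c
/* Hand-expanded sketch of TEST_TYPE (double, float).
   The function name is hypothetical; it is not the name the macro
   in the testsuite would produce.  */
void
widen_mul_double_float (double *__restrict dst, double *__restrict dst2,
                        double *__restrict dst3, double *__restrict dst4,
                        float *__restrict a, float *__restrict b,
                        float *__restrict a2, float *__restrict b2, int n)
{
  for (int i = 0; i < n; i++)
    {
      /* Each source element is extended to double before the multiply,
         and every input feeds more than one product.  */
      dst[i] = (double) a[i] * (double) b[i];
      dst2[i] = (double) a2[i] * (double) b[i];
      dst3[i] = (double) a2[i] * (double) a[i];
      dst4[i] = (double) a[i] * (double) b2[i];
    }
}
```

Because a float significand has 24 bits, the product of two extended floats needs at most 48 bits and is exact in double, so results from this loop can be compared against a scalar reference with ==.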

In such a complicated situation, the combine pass cannot combine the extensions of both
operands at once. Instead, it first combines one of the extensions into the multiply and
then combines the other. The combine flow is as follows:

Original IR:
(set (reg 0) (float_extend: (reg 1)))
(set (reg 3) (float_extend: (reg 2)))
(set (reg 4) (mult: (reg 0) (reg 3)))

First step of combine:
(set (reg 3) (float_extend: (reg 2)))
(set (reg 4) (mult: (float_extend: (reg 1)) (reg 3)))

Second step of combine:
(set (reg 4) (mult: (float_extend: (reg 1)) (float_extend: (reg 2))))

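So, to enhance the combine optimization, we add a "pseudo vfwmul.wv" RTL pattern in autovec-opt.md,
which has the form (set (reg 0) (mult (float_extend (reg 1)) (reg 2))). It matches the intermediate
result of the first combine step, so that the second step can go on to form vfwmul.vv.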
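
The two-step flow can be mimicked with a toy expression tree. This is only a sketch of the substitution idea and is not GCC's combine implementation: the rtx struct, the make, substitute and render helpers, and combine_flow_demo are all hypothetical names invented for this illustration, and machine modes and pattern recognition are ignored.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-in for RTL expressions (hypothetical, not GCC's rtx).  */
enum rtx_kind { REG, FLOAT_EXTEND, MULT };

struct rtx
{
  enum rtx_kind kind;
  int regno;             /* used by REG only */
  const struct rtx *op0; /* used by FLOAT_EXTEND and MULT */
  const struct rtx *op1; /* used by MULT only */
};

/* A fixed-size node pool keeps the sketch free of malloc/free.  */
static struct rtx pool[64];
static int pool_used;

static const struct rtx *
make (enum rtx_kind kind, int regno, const struct rtx *op0,
      const struct rtx *op1)
{
  assert (pool_used < 64);
  struct rtx *x = &pool[pool_used++];
  x->kind = kind;
  x->regno = regno;
  x->op0 = op0;
  x->op1 = op1;
  return x;
}

/* Return a copy of X in which every use of register REGNO is replaced
   by SRC.  This models a single combine step.  */
static const struct rtx *
substitute (const struct rtx *x, int regno, const struct rtx *src)
{
  if (x == NULL)
    return NULL;
  if (x->kind == REG)
    return x->regno == regno ? src : x;
  return make (x->kind, 0, substitute (x->op0, regno, src),
               substitute (x->op1, regno, src));
}

/* Append a Lisp-style rendering of X to the string in BUF.  */
static void
render (const struct rtx *x, char *buf, size_t cap)
{
  size_t len = strlen (buf);
  switch (x->kind)
    {
    case REG:
      snprintf (buf + len, cap - len, "(reg %d)", x->regno);
      break;
    case FLOAT_EXTEND:
      strncat (buf, "(float_extend ", cap - len - 1);
      render (x->op0, buf, cap);
      strncat (buf, ")", cap - strlen (buf) - 1);
      break;
    case MULT:
      strncat (buf, "(mult ", cap - len - 1);
      render (x->op0, buf, cap);
      strncat (buf, " ", cap - strlen (buf) - 1);
      render (x->op1, buf, cap);
      strncat (buf, ")", cap - strlen (buf) - 1);
      break;
    }
}

/* Build (mult (reg 0) (reg 3)), then fold in the two extensions one at
   a time, writing the source of each intermediate set into STEP1 and
   STEP2.  */
void
combine_flow_demo (char *step1, char *step2, size_t cap)
{
  const struct rtx *ext1
    = make (FLOAT_EXTEND, 0, make (REG, 1, NULL, NULL), NULL);
  const struct rtx *ext2
    = make (FLOAT_EXTEND, 0, make (REG, 2, NULL, NULL), NULL);
  const struct rtx *mul = make (MULT, 0, make (REG, 0, NULL, NULL),
                                make (REG, 3, NULL, NULL));

  const struct rtx *after1 = substitute (mul, 0, ext1);    /* first step */
  const struct rtx *after2 = substitute (after1, 3, ext2); /* second step */

  step1[0] = '\0';
  render (after1, step1, cap);
  step2[0] = '\0';
  render (after2, step2, cap);
}
```

In this model the result of the first step has exactly the shape of the single-widen pattern described above. If no pattern accepted that shape, the first substitution would be rejected and the second could never be attempted, which is the motivation the message gives for adding the intermediate pattern.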

gcc/ChangeLog:

* config/riscv/autovec-opt.md (@pred_single_widen_mul<any_extend:su><mode>): Change "@"
into "*" in pattern name which simplifies build files.
(*pred_single_widen_mul<any_extend:su><mode>): Ditto.
(*pred_single_widen_mul<mode>): New pattern.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/widen/widen-3.c: Add floating-point.
* gcc.target/riscv/rvv/autovec/widen/widen-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/widen-complicate-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/widen_run-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/widen_run-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-3.c: New test.
* gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-7.c: New test.
gcc/config/riscv/autovec-opt.md
gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-3.c
gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-7.c
gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-3.c
gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-3.c
gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-7.c
gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-3.c [new file with mode: 0644]
gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-7.c [new file with mode: 0644]