RISC-V: Optimize vsetvli of LCM INSERTED edge for user vsetvli [PR 109743]

Rebase to trunk and send V3 patch for:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/617821.html

This patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109743.

This issue happens because we are currently very conservative when optimizing user vsetvli.

Consider the following case:

bb 1:
  vsetvli a5,a4... (demand AVL = a4).
bb 2:
  RVV insn use a5 (demand AVL = a5).

LCM will hoist the vsetvl of bb 2 into bb 1.
We don't do AVL propagation in this situation because it is complicated: we
would have to analyze the code sequence between the vsetvli in bb 1 and the
RVV insn in bb 2, and the two are not necessarily consecutive blocks.

This patch does the optimization after LCM: we check the vsetvli on each
LCM-inserted edge and eliminate it if it is redundant. This approach is much simpler and safe.
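
The redundancy check can be illustrated with a small standalone model. This is
only a sketch under stated assumptions, not the code in riscv-vsetvl.cc: the
struct insn type, its fields, and eliminate_redundant_vsetvli are hypothetical
names invented for illustration. It assumes a "vsetvli zero,rs" is redundant
when the nearest earlier vsetvli in the same block wrote rs as its VL result
with the same SEW and LMUL, and nothing in between overwrites rs or uses the
vector state. In that case the earlier vsetvli takes over the policy bits of
the later one, as the ta,mu to ta,ma change in the assembly below suggests.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instruction model; this is not GCC's RTL.  */
enum kind { VSETVLI, OTHER };

struct insn {
  enum kind kind;
  int rd;           /* register written; 0 means x0 (nothing written) */
  int rs;           /* AVL source register (VSETVLI only) */
  int sew, lmul;    /* element width and register grouping (VSETVLI only) */
  bool ta, ma;      /* tail/mask agnostic policy (VSETVLI only) */
  bool uses_vector; /* OTHER only: insn depends on the current VL/VTYPE */
  bool deleted;     /* set by the pass when the insn is eliminated */
};

/* Scan one basic block backward and mark redundant "vsetvli zero,rs"
   insns as deleted.  Returns the number of eliminated insns.  */
size_t eliminate_redundant_vsetvli(struct insn *bb, size_t n)
{
  size_t removed = 0;

  for (size_t i = n; i-- > 0;) {
    struct insn *cur = &bb[i];

    /* Only "vsetvli zero,rs" with a real AVL register is a candidate.  */
    if (cur->kind != VSETVLI || cur->deleted || cur->rd != 0 || cur->rs == 0)
      continue;

    /* Walk back to the nearest earlier vsetvli in the same block.  */
    for (size_t j = i; j-- > 0;) {
      struct insn *prev = &bb[j];

      if (prev->kind == VSETVLI) {
        /* prev already computed VL into cur's AVL register with the same
           SEW/LMUL, so re-applying it cannot change VL.  */
        if (prev->rd == cur->rs && prev->sew == cur->sew
            && prev->lmul == cur->lmul) {
          prev->ta = cur->ta; /* keep the policy the later insn asked for */
          prev->ma = cur->ma;
          cur->deleted = true;
          removed++;
        }
        break;
      }

      /* Give up if the AVL register is overwritten, or if an insn in
         between relies on the VL/VTYPE set by the earlier vsetvli.  */
      if (prev->uses_vector || prev->rd == cur->rs)
        break;
    }
  }
  return removed;
}
```

Modelling the .L10 block of the "Before this patch" assembly below (vsetvli
a5,a4 with ta,mu; addi a4,a4,-1; vsetvli zero,a5 with ta,ma; vle32.v) with
a4 = x14 and a5 = x15, the sketch marks the second vsetvli as deleted and
switches the first one to ta,ma. The intervening addi writes a4, not a5, so it
does not block the elimination.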

code:
#include <stddef.h>
#include <stdint.h>
#include <riscv_vector.h>

void
foo2 (int32_t *a, int32_t *b, int n)
{
  if (n <= 0)
    return;
  int i = n;
  size_t vl = __riscv_vsetvl_e32m1 (i);

  for (; i >= 0; i--)
  {
    vint32m1_t v = __riscv_vle32_v_i32m1 (a, vl);
    __riscv_vse32_v_i32m1 (b, v, vl);

    if (i >= vl)
      continue;

    if (i == 0)
      return;

    vl = __riscv_vsetvl_e32m1 (i);
  }
}

Before this patch:
foo2:
.LFB2:
.cfi_startproc
ble     a2,zero,.L1
mv      a4,a2
li      a3,-1
vsetvli a5,a2,e32,m1,ta,mu
vsetvli zero,a5,e32,m1,ta,ma  <- can be eliminated.
.L5:
vle32.v v1,0(a0)
vse32.v v1,0(a1)
bgeu    a4,a5,.L3
.L10:
beq     a2,zero,.L1
vsetvli a5,a4,e32,m1,ta,mu
addi    a4,a4,-1
vsetvli zero,a5,e32,m1,ta,ma  <- can be eliminated.
vle32.v v1,0(a0)
vse32.v v1,0(a1)
addiw   a2,a2,-1
bltu    a4,a5,.L10
.L3:
addiw   a2,a2,-1
addi    a4,a4,-1
bne     a2,a3,.L5
.L1:
ret

After this patch:
foo2:
ble     a2,zero,.L1
mv      a4,a2
li      a3,-1
vsetvli a5,a2,e32,m1,ta,ma
.L5:
vle32.v v1,0(a0)
vse32.v v1,0(a1)
bgeu    a4,a5,.L3
.L10:
beq     a2,zero,.L1
vsetvli a5,a4,e32,m1,ta,ma
addi    a4,a4,-1
vle32.v v1,0(a0)
vse32.v v1,0(a1)
addiw   a2,a2,-1
bltu    a4,a5,.L10
.L3:
addiw   a2,a2,-1
addi    a4,a4,-1
bne     a2,a3,.L5
.L1:
ret

PR target/109743

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pass_vsetvl::get_vsetvl_at_end): New.
(local_avl_compatible_p): New.
(pass_vsetvl::local_eliminate_vsetvl_insn): Enhance local optimizations
for LCM, rewrite as a backward algorithm.
(pass_vsetvl::cleanup_insns): Use new local_eliminate_vsetvl_insn
interface, handle a BB at once.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/pr109743-1.c: New test.
* gcc.target/riscv/rvv/vsetvl/pr109743-2.c: New test.
* gcc.target/riscv/rvv/vsetvl/pr109743-3.c: New test.
* gcc.target/riscv/rvv/vsetvl/pr109743-4.c: New test.

Co-authored-by: Juzhe-Zhong <juzhe.zhong@rivai.ai>

author	Kito Cheng <kito.cheng@sifive.com>
	Fri, 12 May 2023 02:26:06 +0000 (10:26 +0800)
committer	Kito Cheng <kito.cheng@sifive.com>
	Fri, 12 May 2023 13:31:11 +0000 (21:31 +0800)
commit	c919d059fcb67747d3c0bd539c7044e874b03fb7
tree	d6ede9bcf47623a094ea5faf17d882004ff36586
parent	cc0e22b3f25d4b2a326322bce711179c02377e6c
