[PATCH] VECT: Change flow of decrement IV
Richard Sandiford
richard.sandiford@arm.com
Tue May 30 11:41:26 GMT 2023
"juzhe.zhong@rivai.ai" <juzhe.zhong@rivai.ai> writes:
> Before this patch:
> foo:
> ble a2,zero,.L5
> csrr a3,vlenb
> srli a4,a3,2
> .L3:
> minu a5,a2,a4
> vsetvli zero,a5,e32,m1,ta,ma
> vle32.v v2,0(a1)
> vle32.v v1,0(a0)
> vsetvli t1,zero,e32,m1,ta,ma
> vadd.vv v1,v1,v2
> vsetvli zero,a5,e32,m1,ta,ma
> vse32.v v1,0(a0)
> add a1,a1,a3
> add a0,a0,a3
> sub a2,a2,a5
> bne a2,zero,.L3
> .L5:
> ret
>
> After this patch:
>
> foo:
> ble a2,zero,.L5
> csrr a3,vlenb
> srli a4,a3,2
> neg a7,a4 -->>>additional instruction
> .L3:
> minu a5,a2,a4
> vsetvli zero,a5,e32,m1,ta,ma
> vle32.v v2,0(a1)
> vle32.v v1,0(a0)
> vsetvli t1,zero,e32,m1,ta,ma
> mv a6,a2 -->>>additional instruction
> vadd.vv v1,v1,v2
> vsetvli zero,a5,e32,m1,ta,ma
> vse32.v v1,0(a0)
> add a1,a1,a3
> add a0,a0,a3
> add a2,a2,a7
> bgtu a6,a4,.L3
> .L5:
> ret
>
> There is 1 more instruction in the preheader and 1 more instruction in the loop.
> But I think it's OK for RVV, since we will definitely be using SELECT_VL, so this issue will be gone.
But what about cases where you won't be using SELECT_VL, such as SLP?
Richard
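
For reference, the two loop shapes in the quoted assembly can be modelled in scalar C roughly as follows. This is only a sketch under assumptions: the function names, the `vf` parameter (standing in for the `vlenb >> 2` value held in a4) and the returned iteration count are illustrative and are not taken from the patch or from the vectorizer sources. The inner `for` loop stands in for the length-controlled vector load/add/store.

```c
#include <stddef.h>
#include <stdint.h>

/* Flow before the patch: the IV is decremented by the number of
   elements actually processed (len), and the loop exits when the
   IV reaches zero.  Returns the number of loop iterations.  */
size_t add_old_flow(int32_t *a, const int32_t *b, size_t n, size_t vf)
{
    size_t iters = 0, base = 0, remain = n;
    if (n == 0)                                  /* ble a2,zero,.L5 */
        return 0;
    do {
        size_t len = remain < vf ? remain : vf;  /* minu a5,a2,a4 */
        for (size_t i = 0; i < len; i++)         /* vle32/vadd/vse32 */
            a[base + i] += b[base + i];
        base += vf;                              /* add a0,a0,a3 etc. */
        remain -= len;                           /* sub a2,a2,a5 */
        iters++;
    } while (remain != 0);                       /* bne a2,zero,.L3 */
    return iters;
}

/* Flow after the patch: the IV is decremented by the full step (vf)
   every iteration, and the exit test compares the value the IV had
   before the decrement against vf.  */
size_t add_new_flow(int32_t *a, const int32_t *b, size_t n, size_t vf)
{
    size_t iters = 0, base = 0, remain = n;
    if (n == 0)
        return 0;
    for (;;) {
        size_t len = remain < vf ? remain : vf;  /* minu a5,a2,a4 */
        for (size_t i = 0; i < len; i++)
            a[base + i] += b[base + i];
        base += vf;
        size_t prev = remain;                    /* mv a6,a2 */
        remain -= vf;       /* add a2,a2,a7 (a7 = -vf); may wrap on the
                               last iteration, which is well defined
                               for unsigned types */
        iters++;
        if (!(prev > vf))                        /* bgtu a6,a4,.L3 */
            break;
    }
    return iters;
}
```

Both forms run the same number of iterations and produce the same results; the second needs the extra copy of the IV (`prev`) and the negated step, which correspond to the two additional instructions marked in the quoted output.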