[Bug target/100627] missing optimization
pinskia at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Sun May 16 19:44:06 GMT 2021
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100627
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
This is a target issue with how uint64_t -> float/double conversions are
done.
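The source of the two functions is not quoted in this comment. As a minimal sketch only, assuming each function is a plain element-wise conversion loop (the bodies below are a guess reconstructed from the signatures in the assembly listings, not the reporter's actual test case), they could look like this:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical reconstruction: element-wise uint64_t -> double conversion.
// On aarch64 Linux, std::uint64_t is unsigned long, which matches the
// std::array<unsigned long, 16ul> seen in the listings.
void cvt_f64_std(std::array<double, 16>& dst,
                 const std::array<std::uint64_t, 16>& src) {
    for (std::size_t i = 0; i < src.size(); ++i)
        dst[i] = static_cast<double>(src[i]);
}

// Hypothetical reconstruction: element-wise uint64_t -> float conversion.
// Source elements are 64 bits wide and destination elements 32 bits wide,
// so this is a narrowing conversion, unlike the double case.
void cvt_f32_std(std::array<float, 16>& dst,
                 const std::array<std::uint64_t, 16>& src) {
    for (std::size_t i = 0; i < src.size(); ++i)
        dst[i] = static_cast<float>(src[i]);
}
```

Compiling a file like this for aarch64 with g++ -O3 -S should make it possible to compare against the listings in this comment, though the exact output depends on the GCC version.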
On aarch64 for cvt_f64_std we get good code at -O3:
cvt_f64_std(std::array<double, 16ul>&, std::array<unsigned long, 16ul> const&):
        ldp     q7, q6, [x1]
        ldp     q5, q4, [x1, 32]
        ldp     q3, q2, [x1, 64]
        ldp     q1, q0, [x1, 96]
        ucvtf   v7.2d, v7.2d
        ucvtf   v6.2d, v6.2d
        ucvtf   v5.2d, v5.2d
        ucvtf   v4.2d, v4.2d
        ucvtf   v3.2d, v3.2d
        ucvtf   v2.2d, v2.2d
        stp     q7, q6, [x0]
        ucvtf   v1.2d, v1.2d
        stp     q5, q4, [x0, 32]
        ucvtf   v0.2d, v0.2d
        stp     q3, q2, [x0, 64]
        stp     q1, q0, [x0, 96]
        ret
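The other function, cvt_f32_std, is not vectorized: each element goes through
a scalar ucvtf and is then inserted into a vector lane with ins: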
cvt_f32_std(std::array<float, 16ul>&, std::array<unsigned long, 16ul> const&):
        ldp     x3, x2, [x1]
        ucvtf   s7, x2
        ucvtf   s3, x3
        ldp     x3, x2, [x1, 32]
        ins     v3.s[1], v7.s[0]
        ucvtf   s6, x2
        ucvtf   s2, x3
        ldp     x3, x2, [x1, 64]
        ins     v2.s[1], v6.s[0]
        ucvtf   s5, x2
        ucvtf   s1, x3
        ldp     x3, x2, [x1, 96]
        ins     v1.s[1], v5.s[0]
        ucvtf   s4, x2
        ucvtf   s0, x3
        ldr     x2, [x1, 48]
        ldr     x3, [x1, 16]
        ucvtf   s17, x2
        ldr     x2, [x1, 112]
        ucvtf   s18, x3
        ldr     x3, [x1, 80]
        ins     v0.s[1], v4.s[0]
        ucvtf   s4, x2
        ucvtf   s16, x3
        ldr     x2, [x1, 24]
        ldr     x3, [x1, 56]
        ucvtf   s7, x2
        ldr     x2, [x1, 88]
        ucvtf   s6, x3
        ldr     x1, [x1, 120]
        ucvtf   s5, x2
        ins     v3.s[2], v18.s[0]
        ins     v2.s[2], v17.s[0]
        ins     v1.s[2], v16.s[0]
        ins     v0.s[2], v4.s[0]
        ucvtf   s4, x1
        ins     v3.s[3], v7.s[0]
        ins     v2.s[3], v6.s[0]
        ins     v1.s[3], v5.s[0]
        ins     v0.s[3], v4.s[0]
        stp     q3, q2, [x0]
        stp     q1, q0, [x0, 32]
        ret