[Bug target/85324] missing constant propagation on SSE/AVX conversion intrinsics
rguenth at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Wed Apr 11 08:19:00 GMT 2018
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85324
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Last reconfirmed| |2018-04-11
Ever confirmed|0 |1
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
_2 = __builtin_ia32_cvttps2dq ({ 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0 });
_3 = _2 + { 1, 1, 1, 1 };
..
_2 = __builtin_ia32_cvttps2qq128_mask ({ 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0 }, { 0, 0 }, 255);
_3 = _2 + { 1, 1 };
..
_2 = __builtin_ia32_cvttpd2dq ({ 1.0e+0, 1.0e+0 });
_3 = _2 + { 1, 1, 1, 1 };
..
_2 = __builtin_ia32_cvttpd2qq128_mask ({ 1.0e+0, 1.0e+0 }, { 0, 0 }, 255);
_3 = _2 + { 1, 1 };
..
_2 = __builtin_ia32_pmovdw128_mask ({ 1, 1, 1, 1 }, { 0, 0, 0, 0, 0, 0, 0, 0 }, 255);
_3 = _2 + { 1, 1, 1, 1, 1, 1, 1, 1 };
The middle-end has representations for all of these and can constant-fold them.
I suggest folding the builtins to middle-end codes in the target's
gimple_fold_builtin hook. For the masked cases with a mask that is not
all-ones the story may be different (exposing this to the middle-end requires
a two-vector "permutation" which might not combine back to the desired ops),
but maybe even then constant folding is beneficial in some cases (and then
good enough with the middle-end exposure?).