[Bug target/85324] missing constant propagation on SSE/AVX conversion intrinsics
rguenth at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Wed Apr 11 08:19:00 GMT 2018
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85324
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Last reconfirmed| |2018-04-11
Ever confirmed|0 |1
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
_2 = __builtin_ia32_cvttps2dq ({ 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0 });
_3 = _2 + { 1, 1, 1, 1 };
..
_2 = __builtin_ia32_cvttps2qq128_mask ({ 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0 }, { 0, 0 }, 255);
_3 = _2 + { 1, 1 };
..
_2 = __builtin_ia32_cvttpd2dq ({ 1.0e+0, 1.0e+0 });
_3 = _2 + { 1, 1, 1, 1 };
..
_2 = __builtin_ia32_cvttpd2qq128_mask ({ 1.0e+0, 1.0e+0 }, { 0, 0 }, 255);
_3 = _2 + { 1, 1 };
..
_2 = __builtin_ia32_pmovdw128_mask ({ 1, 1, 1, 1 }, { 0, 0, 0, 0, 0, 0, 0, 0 }, 255);
_3 = _2 + { 1, 1, 1, 1, 1, 1, 1, 1 };
The middle-end has representations for all of these and can constant-fold them.
I suggest folding the builtins to middle-end codes in the target's
gimple_fold_builtin hook. For the masked cases with a mask that is not
all-ones the story may be different (exposing this to the middle-end requires
a two-vector "permutation" which might not combine back to the desired ops),
but maybe even then constant folding is beneficial in some cases (and then
good enough with the middle-end exposure?).