This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug rtl-optimization/64286] Redundant extend removal ignores vector element type


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64286

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Igor Zamyatin from comment #1)
> Perhaps something like below to restrict ree for such cases?
> 
> diff --git a/gcc/ree.c b/gcc/ree.c
> index 3376901..92370ea 100644
> --- a/gcc/ree.c
> +++ b/gcc/ree.c
> @@ -1004,6 +1004,11 @@ add_removable_extension (const_rtx expr, rtx_insn *insn,
>        struct df_link *defs, *def;
>        ext_cand *cand;
>  
> +      if (!SCALAR_INT_MODE_P (GET_MODE (dest))
> +	  && (GET_MODE_UNIT_PRECISION (mode) !=
> +	      GET_MODE_UNIT_PRECISION (GET_MODE (XEXP (src, 0)))))
> +	return;
> +
>        /* First, make sure we can get all the reaching definitions.  */
>        defs = get_defs (insn, XEXP (src, 0), NULL);
>        if (!defs)

I think your patch is too restrictive.
Consider -O2 -mavx2:
typedef char __v16qi __attribute__((__vector_size__(16)));
typedef int __m128i __attribute__((__vector_size__(16)));
__m128i bar (__m128i);
typedef int __m256i __attribute__((__vector_size__(32)));
__m256i v;

void
foo (char *p)
{
  __m128i a = (__m128i)__builtin_ia32_loaddqu (p);
  __m128i ps1 = bar (a);
  v = (__m256i) __builtin_ia32_pmovzxbw256 ((__v16qi) a);
}

Here, there is:
(insn 19 9 11 2 (set (reg:V16QI 22 xmm1 [92])
        (mem/c:V16QI (plus:DI (reg/f:DI 6 bp)
                (const_int -32 [0xffffffffffffffe0])) [2 %sfp+-16 S16 A128])) pr64286.i:12 1185 {*movv16qi_internal}
     (nil))
(insn 11 19 13 2 (set (reg:V16HI 22 xmm1 [orig:93 D.2299 ] [93])
        (zero_extend:V16HI (reg:V16QI 22 xmm1 [92]))) pr64286.i:12 3826 {avx2_zero_extendv16qiv16hi2}
     (nil))
and there is no reason to restrict it.  I also don't understand the
GET_MODE_UNIT_PRECISION != GET_MODE_UNIT_PRECISION test: do you know of any
SIGN_EXTEND/ZERO_EXTEND where the unit precision is the same in both modes?
That wouldn't be an extension.
The important difference between vectors and scalars is that for scalars the
lowpart subreg of the zero/sign extended value is still the original value,
while for vectors that is not the case.  So a vector extension can be REE
optimized only if all the uses of the extended register are the same extension
(same zero vs. sign, and to the same mode).
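
To make the scalar vs. vector difference concrete, here is a small standalone
C sketch.  It is not GCC code: the function names are made up for illustration,
the "vector" is just 4 bytes widened to 4 16-bit elements, and a little-endian
element layout (as on x86) is assumed and spelled out byte by byte.

```c
#include <stdint.h>
#include <string.h>

/* Scalar case: zero-extend an 8-bit value to 16 bits, then take the
   low 8 bits back (the analogue of a lowpart subreg).  */
uint8_t
scalar_lowpart_after_zext (uint8_t x)
{
  uint16_t wide = (uint16_t) x;   /* like zero_extend:HI of a QI value */
  return (uint8_t) wide;          /* lowpart: always equal to x */
}

/* Vector case: zero-extend each of 4 bytes to 16 bits, then look at the
   low 4 bytes of the 8-byte result.  The result is laid out explicitly in
   little-endian order (assumption), so the outcome does not depend on
   the host.  */
void
vector_lowpart_after_zext (const uint8_t in[4], uint8_t out[4])
{
  uint8_t wide_bytes[8];

  for (int i = 0; i < 4; i++)
    {
      wide_bytes[2 * i] = in[i];   /* low byte of 16-bit element i */
      wide_bytes[2 * i + 1] = 0;   /* high byte, zero from the extension */
    }

  /* Low half of the widened vector: interleaved with zeros, so it is
     not the original 4-byte vector any more.  */
  memcpy (out, wide_bytes, 4);
}
```

With input bytes {1, 2, 3, 4} the scalar helper returns its argument unchanged,
while the vector helper yields {1, 0, 2, 0}.  Any other use that still expects
the narrow vector would read the wrong data if the register were extended in
place.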

Therefore, for non-scalar modes (i.e. vector ones; modes other than scalar int
and vector int hopefully don't have zero/sign_extend), I think what should be
done is to bail out if any of the defs has any use that is not the sign or
zero extension that has been found.
We have there the:

      /* Second, make sure the reaching definitions don't feed another and
         different extension.  FIXME: this obviously can be improved.  */
      for (def = defs; def; def = def->next)
        if ((idx = def_map[INSN_UID (DF_REF_INSN (def->ref))])
            && (cand = &(*insn_list)[idx - 1])
            && cand->code != code)
          {
            if (dump_file)
              {
                fprintf (dump_file, "Cannot eliminate extension:\n");
                print_rtl_single (dump_file, insn);
                fprintf (dump_file, " because of other extension\n");
              }
            return;
          }

loop; perhaps for the vector modes we could add
else if (!SCALAR_INT_MODE_P (...) && idx == 0)
and in that case use the DU chains (which are supposedly computed) to check
whether any use of the def other than the current insn is not a sign/zero
extension at all, is a different extension, or extends to a different mode than
the current instruction.  If so, record some magic value in def_map (e.g. -1U)
and later treat that magic def_map value as a sign that we should give up
(disregard that extension).
