This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH][RFC] Make the function vectorizer capable of doing type transformations


Richard Guenther <rguenther@suse.de> wrote on 01/02/2007 16:12:55:

> On Thu, 1 Feb 2007, Dorit Nuzman wrote:
>
> > I looked back at my "todo" notes and realized that there were a couple
> > additional bits that need fixing in vectorizable_call:
> >
> > (1) I think we're going to ICE (need to find the testcase) in
> > vect_get_vec_def_for_copy, because we are deleting the scalar call stmt
> > that we are vectorizing,  while we are going to need it, at least when
> > ncopies>1, when we look for the vector-defs. I think the only reason we
> > haven't bumped into this ICE yet is because the sqrt is currently
> > vectorized as a "pattern_stmt" - in which case - we don't visit the
scalar
> > sqrt to look for the vector def, but rather we visit the original stmt
> > (pow) - that is the stmt in the pattern that sqrt had replaced.  If
sqrt
> > was generated not as a result of a pattern-recognition in the
vectorizer -
> > we probably would have ICEd.  (I hope this is making any sense to
you?).
> > Anyhow - the point is that we should probably be deleting the scalar
calls
> > after we finished vectorizing all the stmts in the loop (just chain
them
> > somewhere until then).
>
> It seems to work for me, maybe you can clarify under which circumstances
> we still need the stmt after vectorization?
>

Try to write a testcase in which you directly use the intrinsic sqrt
(instead of using pow(x,0.5) which is translated into sqrt via the
pattern-recognition engine). So, you'd have:

(1)   x = __sqrt (i);
(2)   a[i] = x;

now, after vectorizing stmt (1) we'd have:

(v1)  vx = __vsqrt (vi);
(1)   x = __sqrt (i);
(2)   a[i] = x;

Normally stmt (1) would have a pointer to stmt (v1) in its
STMT_VINFO_VEC_STMT; this way, when we get to vectorize stmt (2) we can
obtain the respective vector def (vx) via the STMT_VINFO_VEC_STMT field of
stmt (1). However, stmt (1) is removed, so vectorization will ICE.

That was my initial thought, anyhow. But now I'm not sure that there would
be a problem after all, since in fact you don't actually remove the stmt,
you just change its rhs, so the stmt_vinfo stuff should remain unchanged.
Maybe we just need to add such a testcase.

> > (2) We should also check that the return value of the function call is
not
> > used after the loop (not "live") cause we simply don't handle that yet
> > (it's not difficult, but we simply don't do it yet). So all
vectorizable_*
> > functions should check that.
>
> They seem to do:
>
> t.i:11: note: init: stmt relevant? tmp_7 = lrintf (D.1967_6)
> t.i:11: note: vec_stmt_relevant_p: used out of loop.
>

We need to have this check in vectorizable_call:

  if (STMT_VINFO_LIVE_P (vinfo_for_stmt (stmt)))
    return false;

At least until we add support for vectorization of values that are used
after the loop.
I couldn't construct a testcase that would currently fail on this, but
consider this testcase:

================================
  for (i=0; i<256; ++i)
    {
      d = __builtin_pow (i, 0.5);
      dx[i] = d;
    }
  return d;
================================

looking at the tree dump:

================================
  # ivtmp.28_2 = PHI <ivtmp.28_1(4), 256(2)>
  # i_17 = PHI <i_8(4), 0(2)>
<L0>:;
(1)  D.1949_5 = (double) i_17;
(2)  d_6 = __builtin_pow (D.1949_5, 5.0e-1);
(3)  dx[i_17] = d_6;
(4)  i_8 = i_17 + 1;
(5)  ivtmp.28_1 = ivtmp.28_2 - 1;
(6)  if (ivtmp.28_1 != 0) goto <L6>; else goto <L2>;

<L6>:;
  goto <bb 3> (<L0>);

  # d_11 = PHI <d_6(3)>
<L2>:;
  return d_11;
================================

now assume we have vectorization of induction (I'll commit that by
tomorrow), and also assume that we have support for vectorization of
induction variables that are used after the loop (we had a trivial patch
for that in the past that seemed redundant with scev-ccp, but there may be
room for it after all - I'll resubmit it again in the near future). So,
stmt (1) will be marked both as 'live' (cause it feeds a value that is used
out of the loop), and also as 'relevant' (because it feeds a value that is
stored in the loop, and needs to be vectorized). Assuming that we support
vectorization of induction and properly handle inductions that are used
after the loop, we will succeed in vectorizing stmt (1), and move on. Now,
stmt (2) is also both 'live' and 'relevant', and we will vectorize it
(thanks to your function-call-vectorization), but we will totally ignore
the fact that it is used after the loop, and we'll get a wrong value for
d_11. (What happens with current mainline is that we refuse to vectorize
stmt (1), so this loop doesn't get vectorized yet, but this will change
soon).

hope it's clearer now?

Anyhow, it's a lot of text with very short bottom line - I think we need to
add a testcase for the first issue and add those 2 lines of code for the
second issue...

thanks,
dorit

> Richard.
>
> --
> Richard Guenther <rguenther@suse.de>
> Novell / SUSE Labs


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]