This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: Patch ping


On Fri, Nov 04, 2011 at 10:52:44AM +0100, Richard Guenther wrote:
> > - Gather vectorization patch + incremental patches
> >   http://gcc.gnu.org/ml/gcc-patches/2011-10/msg02411.html
> 
> I'm not sure I like using new builtins for gather representation
> on the tree level too much, given that we are now moving
> towards using tree codes for shuffle.  Thus, how complicated
> would it be to have a gather tree code and optab and to
> handle the mixed size index issue in the expander?
> 
> I realize this would be quite some reorg to the patchset ...
> so, why did you choose builtins over a more generic approach?

Because while permutations etc. are common to most targets,
gather is currently very specialized and specific to one target,
with lots of details about how it must look (e.g. the mask handling,
where we don't even have tree codes for conditional loads or
conditional stores yet).  Additionally, unlike VEC_PERM_EXPR etc.,
which are normal expressions, this one is a reference (and a
conditional one too), so I'm afraid I'd need to touch huge amounts of
code: most places that currently handle MEM_REF/TARGET_MEM_REF would
need to handle VEC_GATHER_MEM_REF too, as it is a memory read (well, a
set of conditional memory reads).  The i?86 backend already has all
the needed builtins anyway (except 4) and handles expanding them too;
the 4 extra ones are just to cope with the weirdo definition of some
of them (half sized vectors).  And when it is represented as a
builtin, the side-effects are handled by all optimization passes
automatically, similarly to how e.g. atomic builtins are builtins
right now and not expressions.

So I thought it is better to use builtins right now; when in 4.8+ we
hopefully do something about conditional loads/stores and their
vectorization, and if we introduce some new GIMPLE representation for
that, this could be redone on top of it.

> >   http://gcc.gnu.org/ml/gcc-patches/2011-10/msg02846.html
> 
> +  if (TREE_CODE (base) == MEM_REF)
>      {
> -      off = TREE_OPERAND (base, 1);
> +      if (!integer_zerop (TREE_OPERAND (base, 1)))
> +	{
> +	  if (off == NULL_TREE)
> +	    {
> +	      double_int moff = mem_ref_offset (base);
> +	      off = double_int_to_tree (sizetype, moff);
> +	    }
> +	  else
> +	    off = size_binop (PLUS_EXPR, off, TREE_OPERAND (base, 1));
> 
> that's not safe, TREE_OPERAND (base, 1) is of pointer type, so
> you unconditionally need to do the conversion to sizetype of
> TREE_OPERAND (base, 1).

Ok, will fix.
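
A sketch of the fixed hunk (my reading of the review comment, not the
committed code): the MEM_REF offset has to go through fold_convert to
sizetype before it is fed to size_binop:

```c
  if (TREE_CODE (base) == MEM_REF)
    {
      if (!integer_zerop (TREE_OPERAND (base, 1)))
	{
	  if (off == NULL_TREE)
	    {
	      double_int moff = mem_ref_offset (base);
	      off = double_int_to_tree (sizetype, moff);
	    }
	  else
	    /* TREE_OPERAND (base, 1) is of pointer type, so convert
	       it to sizetype before the addition.  */
	    off = size_binop (PLUS_EXPR, off,
			      fold_convert (sizetype,
					    TREE_OPERAND (base, 1)));
	}
    }
```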

> The routine lacks comments - it's got quite big and fails to

And add the comments.

> state any reason for its complexity.  I'm also not sure why
> DR would include any loop invariant parts of the SCEV - doesn't
> it instantiate just for the loop in question?

I'm not sure I understand your question.  With the incremental
patch I'm not using any DR info apart from DR_REF to determine
which part of the address is loop-invariant and which is not.
The reason for that is that split_constant_offset turns the GIMPLE
code into a sometimes big tree, which may actually contain a mixture
of loop invariants/constants and SSA_NAMEs defined in the loop,
all combined with casts, multiplications, additions and subtractions.
For gather I need to split it into a single loop-invariant
argument (which can be computed before the loop and thus can be an
arbitrary tree expression that is just gimplified there) and another
SSA_NAME defined in the loop which can be vectorized, possibly
sign-extended and multiplied by 1/2/4/8.

With the approach the incremental patch takes, I just walk what
split_constant_offset produced during the DR analysis and peel off
loop invariants until I am left with something that can be used as
the vectorized index.

	Jakub

