[PATCH] vectorizing conditional expressions (PR tree-optimization/65947)

Bill Schmidt wschmidt@linux.vnet.ibm.com
Mon Sep 14 14:20:00 GMT 2015


On Mon, 2015-09-14 at 10:47 +0100, Alan Lawrence wrote:
> On 11/09/15 14:19, Bill Schmidt wrote:
> >
> > A secondary concern for powerpc is that REDUC_MAX_EXPR produces a scalar
> > that has to be broadcast back to a vector, and the best way to implement
> > it for us already has the max value in all positions of a vector.  But
> > that is something we should be able to fix with simplify-rtx in the back
> > end.
> 
> Reading this thread again, this bit stands out as unaddressed. Yes PowerPC can 
> "fix" this with simplify-rtx, but the vector cost model will not take this into 
> account - it will think that the broadcast-back-to-a-vector requires an extra 
> operation after the reduction, whereas in fact it will not.
> 
> Does that suggest we should have a new entry in vect_cost_for_stmt for 
> vec_to_scalar-and-back-to-vector (that defaults to vec_to_scalar+scalar_to_vec, 
> but on some architectures e.g. PowerPC would be the same as vec_to_scalar)?

Ideally I think we need to do something for that, yeah.  The back ends
could try to patch up the cost when finishing costs for the loop body,
epilogue, etc., but that would be somewhat of a guess; it would be
better to just be up-front that we're doing a reduction to a vector.

As part of this, I dislike the term "vec_to_scalar", which is somewhat
vague about what's going on (it sounds like it could mean a vector
extract operation, which is more of an inverse of "scalar_to_vec" than a
reduction is).  GIMPLE calls it a reduction, and the optabs call it a
reduction, so we ought to call it a reduction in the vectorizer cost
model, too.

To cover our bases for PowerPC and AArch32, we probably need:

  plus_reduc_to_scalar
  plus_reduc_to_vector
  minmax_reduc_to_scalar
  minmax_reduc_to_vector

although I think plus_reduc_to_vector wouldn't be used yet, so could be
omitted.  If we go this route, then at that time we would change your
code to use minmax_reduc_to_vector and let the back ends determine
whether that requires a scalar reduction followed by a broadcast, or
whether it would be performed directly.
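
A minimal sketch of how such entries might be costed, assuming a
simplified cost function rather than the real vectorizer hooks.  The
enum, the function names, and the unit costs below are all hypothetical
and are not existing GCC code; only the idea of a default of
"reduce to scalar, then broadcast" comes from this thread.

```c
/* Hypothetical cost kinds for illustration only.  */
enum reduc_cost_kind {
  COST_VEC_TO_SCALAR,
  COST_SCALAR_TO_VEC,
  COST_MINMAX_REDUC_TO_SCALAR,
  COST_MINMAX_REDUC_TO_VECTOR
};

/* Generic default: a reduction to a vector is assumed to cost a scalar
   reduction plus a broadcast.  The unit costs are made up.  */
int default_cost (enum reduc_cost_kind kind)
{
  switch (kind)
    {
    case COST_VEC_TO_SCALAR:
      return 1;
    case COST_SCALAR_TO_VEC:
      return 1;
    case COST_MINMAX_REDUC_TO_SCALAR:
      return default_cost (COST_VEC_TO_SCALAR);
    case COST_MINMAX_REDUC_TO_VECTOR:
      return default_cost (COST_VEC_TO_SCALAR)
             + default_cost (COST_SCALAR_TO_VEC);
    }
  return 0;
}

/* A target whose min/max reduction already leaves the result in every
   lane can override just the to-vector entry, so that no separate
   broadcast is charged.  */
int direct_target_cost (enum reduc_cost_kind kind)
{
  if (kind == COST_MINMAX_REDUC_TO_VECTOR)
    return default_cost (COST_MINMAX_REDUC_TO_SCALAR);
  return default_cost (kind);
}
```

With something of this shape the vectorizer would ask for the to-vector
cost up front, and each back end would decide whether that includes a
broadcast.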

Using direct reduction to vector for MIN and MAX on PowerPC would be a
big cost savings over scalar reduction/broadcast.
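
To illustrate the difference, here is a sketch in plain C that models a
4-lane vector as an array.  It is an assumption that a butterfly-style
sequence is a fair model of a reduction that ends with the max in every
lane; the function names are hypothetical and this is not the actual
PowerPC instruction sequence.

```c
#define LANES 4

/* Scalar reduction followed by a broadcast: two conceptual steps.  */
void max_reduce_then_broadcast (const int in[LANES], int out[LANES])
{
  int m = in[0];
  for (int i = 1; i < LANES; i++)
    if (in[i] > m)
      m = in[i];
  for (int i = 0; i < LANES; i++)
    out[i] = m;
}

/* Butterfly reduction: each step takes the lane-wise max of the vector
   and a permuted copy of itself.  After log2(LANES) steps every lane
   holds the overall max, so no broadcast is needed.  */
void max_reduce_all_lanes (const int in[LANES], int out[LANES])
{
  int cur[LANES], tmp[LANES];
  for (int i = 0; i < LANES; i++)
    cur[i] = in[i];
  for (int shift = LANES / 2; shift >= 1; shift /= 2)
    {
      for (int i = 0; i < LANES; i++)
        {
          int other = cur[i ^ shift];
          tmp[i] = cur[i] > other ? cur[i] : other;
        }
      for (int i = 0; i < LANES; i++)
        cur[i] = tmp[i];
    }
  for (int i = 0; i < LANES; i++)
    out[i] = cur[i];
}
```

Both functions produce the same vector; the point is only that the
second form never passes through a scalar.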

Thanks,
Bill

> 
> (I agree that if that's the limit of how "different" conditional reductions may 
> be between architectures, then we should not have a vect_cost_for_stmt for a 
> whole conditional reduction.)
> 
> Cheers, Alan
> 



