[PATCH] Teach the vectorizer about multiple sizes

Wed Oct 6 20:54:00 GMT 2010

On Wed, 6 Oct 2010, Richard Henderson wrote:

> On 10/06/2010 05:45 AM, Richard Guenther wrote:
> > +   /* Query the preferred simd mode and if that does not match the
> > +      requested size get a supported mode based on its mode class.  */
> >     simd_mode = targetm.vectorize.preferred_simd_mode (inner_mode);
> > +   if (size != 0
> > +       && size != GET_MODE_SIZE (simd_mode))
> > +     {
> > +       simd_mode = mode_for_size (size *  BITS_PER_UNIT,
> > + 				 GET_MODE_CLASS (simd_mode), 0);
> > +       if (simd_mode == BLKmode
> > + 	  || !targetm.vector_mode_supported_p (simd_mode))
> > + 	return NULL_TREE;
> > +     }
> 
> I have trouble believing that this is either sufficient or 
> correct.  First, mode_for_size isn't being given anything
> but the total vector size; it's not being given the inner mode.
> So you'd need to iterate over the set of modes to find one
> that matches the inner mode you're looking for.  E.g. split
> out the loop from layout_type, case vector_type.

Hmm, but we have MODE_VECTOR_INT and MODE_VECTOR_FLOAT classes,
so I wonder what mode_for_size does here ... but indeed it looks
I need to revisit this.  Thanks for noticing.

> Second, when an explicit size is requested I don't see why
> you should bother with preferred_simd_mode at all.

For the mode class, also the preferred mode may be word_mode.

> > !   /* Autodetect first vector size we try.  */
> > !   current_vector_size = 0;
> > !   vector_sizes = targetm.vectorize.autovectorize_vector_sizes ();
> ...
> > !       vector_sizes &= ~current_vector_size;
> > !       if (vector_sizes == 0)
> > ! 	return NULL;
> > ! 
> > !       /* Try the next biggest vector size.  */
> > !       current_vector_size = 1 << floor_log2 (vector_sizes);
> 
> This seems slightly odd.  Consider
> 
>   c_v_s = 0;
>   v_s = 32 | 16;
> 
>   // first loop
>   c_v_s = 16; // via autodetection
> 
>   v_s &= ~c_v_s; -> 32;
> 
>   c_v_s = 1 << floor_log2(32) -> 32
> 
>   // second loop
>   did we really want to get here?
> 
> A reasonable assumption for this is that c_v_s should be 
> decreasing only.  Perhaps better to do
> 
>   v_s &= c_v_s - 1;
> 
> i.e. remove c_v_s and all larger sizes.

I think we also want to consider larger sizes if the target prefers
the smaller one but only the larger one can actually vectorize things.

But we can certainly start with your suggestion (I can check if we
miss anything in SPEC 2006 when restricting the 2nd loop to smaller
sizes).

> Nit: I prefer while (1) { } over do { } while (1).

Ok ;)

Thanks,
Richard.