[PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

Jan Hubicka hubicka@ucw.cz
Thu Jan 2 19:50:00 GMT 2014

> > Frankly speaking, I do not understand, what's wrong here.
> > IMHO, this change is pretty mechanical: we just extend maximal alignment
> > available. Because of 512-bit data types we now extend maximal alignment to
> > 512 bits.
> Nothing wrong per se, but...
> > I suspect that an issue is here:
> >   if (opt
> >       && AGGREGATE_TYPE_P (type)
> >       && TYPE_SIZE (type)
> >       && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST
> >       && (TREE_INT_CST_LOW (TYPE_SIZE (type)) >= (unsigned) max_align
> >           || TREE_INT_CST_HIGH (TYPE_SIZE (type)))
> >       && align < max_align)
> >     align = max_align;
> ...yes, bumping max_align has the unexpected side effect of changing the 
> behavior for sizes between the old value and the new value because of this 
> code.  I'm no x86 specialist, but I think that this should be fixed.
> > Maybe we can split it and handle 256-bit aggregates separately?
> Probably, and we should also add a warning just before the declaration of 
> max_align, as well as investigate whether this didn't already happen when 
> max_align was bumped from 128 to 256.
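
To make the side effect concrete, here is a minimal sketch in C that models
only the quoted condition.  It is not the GCC code: model_data_alignment and
its parameters are hypothetical names, the tree accessors are replaced by
plain integers, and the TREE_INT_CST_HIGH test is folded into a single wide
size value.  All sizes and alignments are in bits.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the quoted condition.  An aggregate whose
   size is at least MAX_ALIGN gets its alignment raised to MAX_ALIGN;
   anything smaller keeps the alignment it came in with.  */
static unsigned
model_data_alignment (bool opt, bool is_aggregate,
                      unsigned long long size_bits,
                      unsigned align_bits, unsigned max_align)
{
  assert (max_align != 0);

  if (opt
      && is_aggregate
      && size_bits >= max_align
      && align_bits < max_align)
    align_bits = max_align;

  return align_bits;
}
```

Under this model a 384-bit aggregate with 32-bit natural alignment is raised
to 256 bits when max_align is 256, but keeps its 32-bit alignment when
max_align is 512, because 384 is below the new threshold.  That is the
behavior change for sizes between the old and the new value.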

The x86-64 ABI has a clause about aligning static variables to a 128-bit
boundary once they reach a given size.  This was introduced to help the compiler
generate aligned vector loads/stores even if the object may bind to another
object file.  This is set in stone and cannot be changed for AVX/SSE.

For other objects that are fully under local control we can bump the alignment
up further.  As I remember, this code was originally supposed to bump up to
128 bits, since it was written long before AVX.  I suppose it would make sense
to bump further when AVX is enabled and we anticipate using it.
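
One possible shape for such a split, as a rough sketch in C rather than a
proposed patch.  Everything here is an assumption for illustration: the
function name, the binds_local and avx_enabled flags, the 128-bit size
threshold for the ABI rule, and the choice of 256 bits for the optional bump.
All sizes and alignments are in bits.

```c
#include <stdbool.h>

/* Assumed ABI-mandated alignment; must not move when wider vectors
   are added.  */
enum { ABI_ALIGN_BITS = 128 };

/* Assumed optional alignment when AVX is expected to be used.  */
enum { AVX_OPT_ALIGN_BITS = 256 };

static unsigned
sketch_split_alignment (bool opt, bool is_aggregate, bool binds_local,
                        bool avx_enabled, unsigned long long size_bits,
                        unsigned align_bits)
{
  /* ABI part: fixed threshold and fixed result, independent of any
     optimization setting, so other object files can rely on it.  */
  if (is_aggregate
      && size_bits >= ABI_ALIGN_BITS
      && align_bits < ABI_ALIGN_BITS)
    align_bits = ABI_ALIGN_BITS;

  /* Optimization part: only for objects fully under local control,
     and only when AVX is enabled.  */
  if (opt
      && is_aggregate
      && binds_local
      && avx_enabled
      && size_bits >= AVX_OPT_ALIGN_BITS
      && align_bits < AVX_OPT_ALIGN_BITS)
    align_bits = AVX_OPT_ALIGN_BITS;

  return align_bits;
}
```

With the two parts separated, raising the optional alignment later would not
change what the ABI part returns for objects that may bind to another object
file.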

I am not quite sure, however, how important this is, given that we have a pass
to increase alignment for vectorizable arrays.  The other case where we can
autogenerate SSE is memcpy/memset, but sadly only for the variably sized case,
and we don't do that by default yet (I hope to teach
move_by_pieces/store_by_pieces about SSE soonish, but not for 4.9).

This logic all comes from a time when vectorization was in its infancy.
