This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH] Avoid store forwarding issue in vectorizing strided SLP loads


On 09/28/2016 05:41 AM, Richard Biener wrote:

> Currently strided SLP vectorization creates vector constructors composed
> of vector elements.  This is a constructor form that is not handled
> specially by the expander but it gets expanded via piecewise stores
> to scratch memory and a load of that scratch memory.

Ugh.  Yup, obviously bad, even without store forwarding.
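
For readers following along, the expansion described above can be modelled roughly in plain C. This is only a sketch of the access pattern, not the RTL the expander emits: `pair2`, `quad4` and `build_via_scratch` are hypothetical names, and the structs merely stand in for V2SF and V4SF. An optimizing compiler may well fold the copies away in this form; the point is the shape of the memory traffic, two narrow stores followed by one wide load that spans both.

```c
#include <string.h>

/* Hypothetical stand-ins for the V2SF and V4SF vector modes. */
typedef struct { float f[2]; } pair2;
typedef struct { float f[4]; } quad4;

/* Model of expanding (V4SF){V2SF, V2SF} through scratch memory:
   each half is stored separately, then the whole is loaded back.  */
static quad4 build_via_scratch(pair2 lo, pair2 hi)
{
    unsigned char scratch[sizeof(quad4)];
    quad4 out;

    memcpy(scratch, &lo, sizeof lo);               /* narrow store #1 */
    memcpy(scratch + sizeof lo, &hi, sizeof hi);   /* narrow store #2 */
    memcpy(&out, scratch, sizeof out);             /* wide load over both stores */
    return out;
}
```

The wide load cannot be satisfied from a single earlier store, which is the situation where store-to-load forwarding typically fails.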


> does not work on any CPU I know of).  The following patch simply avoids
> the issue by making the vectorizer create integer loads, composing
> a vector of those integers and then punning that to the desired vector
> type.  Thus (V4SF){V2SF, V2SF} becomes (V4SF)(V2DI){DI, DI} and
> everybody is happy.  Especially x264 gets a 5-10% improvement
> (dependent on vector size and x86 sub-architecture).

Seems reasonable to me -- there's not a lot of difference (conceptually) to how we've used SImode constants to construct DFmode constants in the past.
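
As a source-level illustration of the (V4SF)(V2DI){DI, DI} form, here is a minimal sketch using GNU C vector extensions. It is not the code the vectorizer generates (that is GIMPLE inside vectorizable_load); the typedef names and `load_pairs_punned` are hypothetical, and the sketch assumes 4-byte floats and a compiler that supports `vector_size` and same-size vector casts (GCC or Clang).

```c
#include <stdint.h>
#include <string.h>

/* GNU C vector types standing in for the V4SF and V2DI modes. */
typedef float   v4sf __attribute__((vector_size(16)));
typedef int64_t v2di __attribute__((vector_size(16)));

/* Load two pairs of floats from two (possibly strided) addresses as
   64-bit integers, build an integer vector from them, and pun the
   result to a float vector.  No vector-of-vectors constructor and no
   scratch buffer is involved.  */
static v4sf load_pairs_punned(const float *lo, const float *hi)
{
    int64_t a, b;

    memcpy(&a, lo, sizeof a);   /* DI load of { lo[0], lo[1] } */
    memcpy(&b, hi, sizeof b);   /* DI load of { hi[0], hi[1] } */

    v2di ints = { a, b };       /* (V2DI){DI, DI} */
    return (v4sf) ints;         /* same-size cast reinterprets the bits */
}
```

Called as `load_pairs_punned(&data[0], &data[4])`, this gathers `data[0]`, `data[1]`, `data[4]`, `data[5]` into one vector; the integer elements are never interpreted as numbers, only carried as 64-bit payloads.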


> Handling the vector-vector constructors on the expander side would
> require either similar punning or making vec_init parametric on
> the element mode plus supporting vector elements in all targets
> (which in the end probably will simply pun them similarly).
>
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
>
> Any comments?
>
> Thanks,
> Richard.
>
> 2016-09-28  Richard Biener  <rguenther@suse.de>
>
> 	* tree-vect-stmts.c (vectorizable_load): Avoid emitting vector
> 	constructors with vector elements.

Seems quite reasonable.

jeff

