This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [RFC] Matrix Flattening optimization


On 3/2/06, Ayal Zaks <ZAKS@il.ibm.com> wrote:
> > How does this deal with the case where
> > 1) dim2 is a random odd number like 139
> > 2) malloc on the host returns vector-aligned allocations
> > 3) you're planning on doing vector operations on a[i] for at least the
> > first part of a[i].
> >
> > If you do this type of malloc combining, a'[i * dim2 + 0] may no longer
> > be aligned on a 16-byte boundary or whatever, so your later vectorized
> > code would no longer be operating on aligned data, and may crash. If a
> > user is using the auto-vectorizer, it's possible gcc could regroup the
> > operation to perform well, but that wouldn't work well with hand-written
> > vector code.
>
> Good point. We certainly don't want to obstruct vectorization ..
>
> Note that
> Razya>The analysis part is implemented in analyze_matrix_allocation_site()
> Razya>and analyze_matrix_accesses().
> The latter traverses all accesses, so it could detect such hand-written
> vector code and either disable the optimization for this matrix or pad its
> rows to properly align them. Padding for alignment might actually be worth
> while irrespective of hand/auto vectorization.

Even padding to either a power-of-two row size (for cheap multiplication) or
a cache-line-aligned row size, if cheap enough, may be worthwhile in general.
Though I wonder if you currently deal with row sizes that are not
compile-time constants?
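
To make the padding idea concrete, here is a rough sketch in plain C of what
a padded, flattened layout could look like. It is not taken from the patch:
the helper names (round_up, padded_stride, pow2_stride, alloc_flat,
row_is_aligned) are made up for illustration, and it assumes float elements,
a C11 aligned_alloc, and an alignment that is a multiple of the element size.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Round n up to the next multiple of `multiple` (multiple must be > 0). */
static size_t round_up(size_t n, size_t multiple)
{
    return (n + multiple - 1) / multiple * multiple;
}

/* Row stride in elements such that every row starts on an `align`-byte
   boundary, provided the base pointer is itself aligned.
   Assumption: align is a multiple of elem_size. */
static size_t padded_stride(size_t dim2, size_t elem_size, size_t align)
{
    return round_up(dim2 * elem_size, align) / elem_size;
}

/* Smallest power of two >= n, so that i * stride can become a shift. */
static size_t pow2_stride(size_t n)
{
    size_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/* Allocate dim1 rows of `stride` floats as one aligned block.
   aligned_alloc (C11) wants the size to be a multiple of the alignment. */
static float *alloc_flat(size_t dim1, size_t stride, size_t align)
{
    size_t bytes = round_up(dim1 * stride * sizeof(float), align);
    return aligned_alloc(align, bytes);
}

/* Nonzero if row i of a flattened matrix with the given stride starts on
   an `align`-byte boundary. Only the address is computed; nothing is read. */
static int row_is_aligned(const float *base, size_t i, size_t stride,
                          size_t align)
{
    return ((uintptr_t)(base + i * stride) % align) == 0;
}
```

With 4-byte floats and dim2 = 139, row 1 of the unpadded layout starts at
byte offset 556, which is not a multiple of 16. A stride of 140 elements
restores 16-byte alignment for every row, and a stride of 256 also turns the
row-offset multiplication into a shift, at the cost of more wasted space.
Since the stride here is an ordinary run-time value, the same scheme would
apply when the row size is not a compile-time constant.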

(and no, it won't help tramp3d ;))
Richard.

