This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: [RFC] Matrix Flattening optimization

From: "Richard Guenther" <richard dot guenther at gmail dot com>
To: "Ayal Zaks" <ZAKS at il dot ibm dot com>
Cc: "Shantonu Sen" <ssen at opendarwin dot org>, "Razya Ladelsky" <RAZYA at il dot ibm dot com>, gcc-patches at gcc dot gnu dot org, "Daniel Berlin" <dberlin at dberlin dot org>, jh at suse dot cz
Date: Fri, 3 Mar 2006 12:10:14 +0100
Subject: Re: [RFC] Matrix Flattening optimization
References: <OF6E18706D.C44D80FC-ONC2257125.007ABE9A-C2257125.007AC898@LocalDomain> <OFD86ED807.00B76AD8-ONC2257125.007AD22D-C2257125.007CC426@il.ibm.com>

On 3/2/06, Ayal Zaks <ZAKS@il.ibm.com> wrote:
> > How does this deal with the case where
> > 1) dim2 is a random odd number like 139
> > 2) malloc on the host returns vector-aligned allocations
> > 3) you're planning on doing vector operations on a[i] for at least the
> > first part of a[i].
> >
> > If you do this type of malloc combining, a'[i * dim2 + 0] may no longer
> > be aligned on a 16-byte boundary or whatever, so your later vectorized
> > code would no longer be operating on aligned data, and may crash. If a
> > user is using the auto-vectorizer, it's possible gcc could regroup the
> > operation to perform well, but that wouldn't work well with hand-written
> > vector code.
>
> Good point. We certainly don't want to obstruct vectorization ..
>
> Note that
> Razya>The analysis part is implemented in analyze_matrix_allocation_site()
> Razya>and analyze_matrix_accesses().
> The latter traverses all accesses, so it could detect such hand-written
> vector code and either disable the optimization for this matrix or pad its
> rows to properly align them. Padding for alignment might actually be
> worthwhile irrespective of hand/auto vectorization.
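
To make the alignment concern concrete, here is a minimal sketch in C. The
helper names (padded_row_len, row_byte_offset) are hypothetical and are not
part of the patch under discussion; the sketch assumes 4-byte elements and a
16-byte vector alignment, and only shows the arithmetic, not the compiler
transformation itself.

```c
#include <stddef.h>

/* Hypothetical helper: round a row length (in elements) up so that every
   row occupies a whole number of `align` bytes.  Assumes `align` is a
   non-zero multiple of `elem_size`.  */
static size_t
padded_row_len (size_t dim2, size_t elem_size, size_t align)
{
  size_t per_align = align / elem_size;   /* elements per aligned chunk */
  return (dim2 + per_align - 1) / per_align * per_align;
}

/* Byte offset of row i from the start of the single flattened block,
   i.e. the offset of a'[i * row_len + 0].  */
static size_t
row_byte_offset (size_t i, size_t row_len, size_t elem_size)
{
  return i * row_len * elem_size;
}
```

With dim2 = 139 and 4-byte elements, row 1 starts 556 bytes into the block,
which is 12 bytes past a 16-byte boundary. Padding each row to 140 elements
moves it to 560 bytes, which is 16-byte aligned, provided the base of the
block is itself aligned.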

Even padding to either a power-of-two row size (for cheap multiplication) or
a cache-line-aligned row size, if cheap enough, may be worthwhile in general.
Though I wonder if you currently deal with row sizes that are not
compile-time constants?
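
The two padding choices can be sketched as follows. This is an illustration
only: the function names are hypothetical, the cache-line size is passed in as
a parameter instead of being queried from the target, and the row length is
taken at run time to show that neither computation needs a compile-time
constant.

```c
#include <stddef.h>

/* Smallest k such that (1 << k) >= n.  Assumes 1 <= n and that n is small
   enough that the shift does not overflow size_t.  */
static unsigned
log2_ceil (size_t n)
{
  unsigned k = 0;
  while (((size_t) 1 << k) < n)
    k++;
  return k;
}

/* Index of element (i, j) in a flattened matrix whose rows are padded to
   1 << shift elements: the multiply by the row size becomes a shift.  */
static size_t
flat_index_pow2 (size_t i, size_t j, unsigned shift)
{
  return (i << shift) + j;
}

/* Row length in elements after padding each row to whole cache lines.
   Assumes `line_size` is a non-zero multiple of `elem_size`.  */
static size_t
cache_padded_row_len (size_t dim2, size_t elem_size, size_t line_size)
{
  size_t bytes = dim2 * elem_size;
  size_t padded = (bytes + line_size - 1) / line_size * line_size;
  return padded / elem_size;
}
```

For a row of 139 4-byte elements, the power-of-two variant pads to 256
elements (shift of 8), which nearly doubles the storage, while padding to an
assumed 64-byte cache line gives 144 elements. That difference is the
"if cheap enough" trade-off between index cost and memory footprint.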

(and no, it won't help tramp3d ;))
Richard.

Follow-Ups:
- Re: [RFC] Matrix Flattening optimization
  - From: Falk Hueffner
- Re: [RFC] Matrix Flattening optimization
  - From: Ayal Zaks

References:
- Re: [RFC] Matrix Flattening optimization
  - From: Ayal Zaks
