This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: RFC: [ARM] Disable peeling

From: Richard Earnshaw <rearnsha at arm dot com>
To: Richard Biener <richard dot guenther at gmail dot com>
Cc: Andi Kleen <andi at firstfloor dot org>, Jan Hubicka <hubicka at ucw dot cz>, Christophe Lyon <christophe dot lyon at linaro dot org>, "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
Date: Tue, 11 Dec 2012 09:48:35 +0000
Subject: Re: RFC: [ARM] Disable peeling
References: <CAKdteOZb66r_0t1LLUdToQkJFo8UnX8f671pduuc4i7vOcL6qQ@mail.gmail.com> <50C227C5.4010601@arm.com> <CAFiYyc2LNa=MRAn5S0CZV_=Ds0SAsvqH9w1MOi7of1GFhp=ABQ@mail.gmail.com> <20121210171057.GI671@atrey.karlin.mff.cuni.cz> <m2txrtehvu.fsf@firstfloor.org> <CAFiYyc1LX1mj0E1MhTYb8AmPrOGdkjeKY7C2KaNVEan=_+3YeA@mail.gmail.com>

On 11/12/12 09:45, Richard Biener wrote:

On Mon, Dec 10, 2012 at 10:07 PM, Andi Kleen <andi@firstfloor.org> wrote:

Jan Hubicka <hubicka@ucw.cz> writes:

Note that I think Core has similar characteristics - at least for string operations
it fares well with unaligned accesses.


Nehalem and later have very fast unaligned vector loads. There's still some
penalty when they cross cache lines, however.

iirc the rule of thumb is to do unaligned for 128-bit vectors,
but avoid it for 256-bit vectors because the cache line cross
penalty is larger on Sandy Bridge and more likely with the larger
vectors.
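
To put a rough number on "more likely with the larger vectors", here is a small sketch in C. It assumes a 64-byte cache line, which the thread does not state, and the helper names `crosses_line` and `crossing_offsets` are made up for illustration. It only counts how many start offsets within a line make an access spill into the next line; it says nothing about the actual cycle penalty.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines (not stated in the thread). */
#define CACHE_LINE 64u

/* Returns 1 if an access of `size` bytes starting at `addr`
   touches two cache lines, 0 otherwise. */
static int crosses_line(uintptr_t addr, size_t size)
{
    return (addr % CACHE_LINE) + size > CACHE_LINE;
}

/* Counts how many of the CACHE_LINE possible byte offsets within
   a line make an access of `size` bytes cross into the next line. */
static unsigned crossing_offsets(size_t size)
{
    unsigned n = 0;
    for (uintptr_t off = 0; off < CACHE_LINE; off++)
        n += crosses_line(off, size);
    return n;
}
```

Under that assumption, a 16-byte (128-bit) access crosses a line for 15 of the 64 possible byte offsets, while a 32-byte (256-bit) access crosses for 31 of 64, so a randomly placed 256-bit access is about twice as likely to split.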


Yes, I think the rule was that using the unaligned instruction variants carries
no penalty when the actual access is aligned but that aligned accesses are
still faster than unaligned accesses.  Thus peeling for alignment _is_ a win.
I also seem to remember that the story for unaligned stores vs. unaligned loads
is usually different.

Yes, it's generally the case that unaligned loads are slightly more expensive than unaligned stores, since the stores can often merge in a store buffer with little or no penalty.
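
For readers unfamiliar with the term, "peeling for alignment" can be sketched by hand roughly as follows. This is only an illustration of the idea, not what the GCC vectorizer actually emits; the function name `add_scalar_peeled`, the 16-byte vector width, and the assumption that `float` is 4 bytes are all mine.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 128-bit vectors, i.e. 16 bytes or four 4-byte floats. */
#define VEC_BYTES 16u

/* Adds k to each of the n floats in a[], with the loop peeled by hand. */
void add_scalar_peeled(float *a, size_t n, float k)
{
    size_t i = 0;

    /* Prologue (the peeled iterations): scalar steps until a + i
       sits on a VEC_BYTES boundary, or the array runs out. */
    while (i < n && ((uintptr_t)(a + i) % VEC_BYTES) != 0) {
        a[i] += k;
        i++;
    }

    /* Main loop: every group of four floats now starts on a 16-byte
       boundary, so a vectorizer could use aligned accesses here. */
    for (; i + 4 <= n; i += 4) {
        a[i]     += k;
        a[i + 1] += k;
        a[i + 2] += k;
        a[i + 3] += k;
    }

    /* Epilogue: leftover elements that do not fill a whole vector. */
    for (; i < n; i++)
        a[i] += k;
}
```

The cost of peeling is the extra prologue and epilogue code plus the run-time alignment check, which is the trade-off this thread is weighing against the cost of unaligned accesses.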

R.

Follow-Ups:
- Re: RFC: [ARM] Disable peeling
  - From: Richard Biener

References:
- RFC: [ARM] Disable peeling
  - From: Christophe Lyon
- Re: RFC: [ARM] Disable peeling
  - From: Richard Earnshaw
- Re: RFC: [ARM] Disable peeling
  - From: Richard Biener
- Re: RFC: [ARM] Disable peeling
  - From: Jan Hubicka
- Re: RFC: [ARM] Disable peeling
  - From: Andi Kleen
- Re: RFC: [ARM] Disable peeling
  - From: Richard Biener
