This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PING][PATCH][REVISED] Fix PR middle-end/PR28690, modify swap_commutative_operands_p


On Tue, 2007-06-26 at 20:03 -0700, H. J. Lu wrote:
> On Tue, Jun 26, 2007 at 09:09:47PM -0500, Peter Bergner wrote:
[snip]
> > but given his change only allows skipping up to 10 bytes now, there
> > might still be cases where we don't get the alignment we want.
> 
> That is not true:
> 
> http://gcc.gnu.org/ml/gcc-cvs/2007-06/msg00684.html
> 
> My patch adds ".p2align 3" after ".p2align 4,,X" for x86-64.

Yes, I know, but it still might not be the 16-byte alignment we really
wanted.  It's also why I added the snippet below:

> > I'm not sure about x86/x86_64 hardware, but on POWER, two identical
> > loops that have the same alignment in the lowest nibble of their
> > addresses can still have performance differences depending on where
> > they show up in the cache line/cache line sector.

So I'm just saying, even with your alignment patch, this could still
very well be alignment-related.  That's why I asked you to send me
the three binaries that showed degradation (I skipped art since that
seems to vary widely even from run to run using the same binary),
so we can track down exactly why it is slowing down.
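
To make the arithmetic concrete, here is a rough sketch in C of how a
".p2align 4,,X" followed by ".p2align 3" pair behaves. It is only a model
of the assembler directives, not code from the patch. The helper names
p2align() and align_loop() are made up for illustration, and the limit of
10 bytes is taken from the "up to 10 bytes" figure mentioned above; the
actual X used by the patch may differ.

```c
/* Model of GNU as ".p2align pow2,,max_skip": pad addr up to the next
   multiple of 2^pow2, unless that would need more than max_skip bytes of
   padding, in which case no padding is emitted at all.  A negative
   max_skip means "no limit".  (Hypothetical helper, for illustration.) */
static unsigned long p2align(unsigned long addr, unsigned pow2, long max_skip)
{
    unsigned long align = 1UL << pow2;
    unsigned long pad = (align - (addr & (align - 1))) & (align - 1);

    if (max_skip >= 0 && pad > (unsigned long)max_skip)
        return addr;            /* too far away: alignment is skipped */
    return addr + pad;
}

/* ".p2align 4,,10" followed by ".p2align 3".  The 10 is an assumption
   taken from the discussion, not from the patch itself. */
static unsigned long align_loop(unsigned long addr)
{
    addr = p2align(addr, 4, 10);    /* try for 16 bytes, skip at most 10 */
    return p2align(addr, 3, -1);    /* always settle for at least 8 */
}
```

With these assumptions, a loop that would start at an address ending in
0x1 through 0x5 needs 11 to 15 bytes of padding to reach the next 16-byte
boundary, so the first directive does nothing and the second one leaves
the loop at an address ending in 0x8.  Only start addresses ending in 0x6
through 0xf (or already 0x0) end up 16-byte aligned.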

Peter



