This is the mail archive of the gcc-help@gcc.gnu.org mailing list for the GCC project.


Re: _mm_store_pd/s translated to movntpd/s

From: Matthias Kretz <kretz at compeng dot uni-frankfurt dot de>
To: gcc-help at gcc dot gnu dot org
Date: Mon, 21 Mar 2011 15:49:36 +0100
Subject: Re: _mm_store_pd/s translated to movntpd/s
References: <201103211523.02770.kretz@compeng.uni-frankfurt.de>

Hi,

On Monday 21 March 2011 15:23:02 Matthias Kretz wrote:
> I tested the GCC 4.6.0 RC on Intel systems with good success so far. Now I
> tested on an AMD Magny-Cours using the -march=barcelona flag and gcc
> translated _mm_store_pd/s calls in the code to streaming stores in the
> resulting binary.
> 
> Where does this "optimization" come from and how can I disable it? This
> doesn't make much sense on a working set that fits into the cache...
> 
> Is this intended behavior or a bug?

Additional info: if I add -fno-prefetch-loop-arrays, I get normal stores, as
expected. That suggests the streaming stores are introduced by the loop
prefetching optimization. I don't consider this flag a solution, though.
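
For illustration, here is a minimal sketch of the kind of loop in question. It is not taken from the affected code: the function name `scale`, its parameters, and the alignment and length requirements are assumptions made up for this example.

```c
#include <stddef.h>
#include <emmintrin.h>  /* SSE2 intrinsics: _mm_load_pd, _mm_mul_pd, _mm_store_pd */

/* Hypothetical kernel (not from the original code): dst[i] = src[i] * k.
   Assumes dst and src are 16-byte aligned and n is even. */
void scale(double *dst, const double *src, size_t n, double k)
{
    const __m128d vk = _mm_set1_pd(k);
    size_t i;
    for (i = 0; i < n; i += 2) {
        __m128d v = _mm_load_pd(src + i);
        /* An ordinary aligned store is requested here (movapd),
           not a streaming store (movntpd). */
        _mm_store_pd(dst + i, _mm_mul_pd(v, vk));
    }
}
```

One way to check what the compiler emits is to compare the assembly with and without the flag, for example `gcc -O3 -march=barcelona -S scale.c` against the same command with `-fno-prefetch-loop-arrays` added, and look for `movntpd` in the output. The `-O3` level and the file name are assumptions; whether this particular loop triggers the conversion may depend on the GCC version and on what the compiler knows about the trip count.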

Regards,
	Matthias

-- 
Dipl.-Phys. Matthias Kretz
http://compeng.uni-frankfurt.de/?mkretz

Follow-Ups:
- Re: _mm_store_pd/s translated to movntpd/s
  - From: Brian Budge
- Re: _mm_store_pd/s translated to movntpd/s
  - From: Ian Lance Taylor

References:
- _mm_store_pd/s translated to movntpd/s
  - From: Matthias Kretz
