_mm_store_pd/s translated to movntpd/s
Matthias Kretz
kretz@compeng.uni-frankfurt.de
Mon Mar 21 17:56:00 GMT 2011
Hi,
On Monday 21 March 2011 18:00:38 Brian Budge wrote:
> On Mon, Mar 21, 2011 at 7:49 AM, Matthias Kretz wrote:
> > On Monday 21 March 2011 15:23:02 Matthias Kretz wrote:
> >> I tested the GCC 4.6.0 RC on Intel systems with good success so far. Now
> >> I tested on an AMD Magny-Cours using the -march=barcelona flag and gcc
> >> translated _mm_store_pd/s calls in the code to streaming stores in the
> >> resulting binary.
> >>
> >> Where does this "optimization" come from and how can I disable it? This
> >> doesn't make much sense on a working set that fits into the cache...
> >>
> >> Is this intended behavior or a bug?
> >
> > Additional info: If I add -fno-prefetch-loop-arrays I get normal stores
> > as expected. I don't consider this a solution, though.
>
> Do you mean _mm_stream_pd/s? I think store will still take your
> values to cache...
I mean that I wrote _mm_store_pd/s in my code, but the resulting binary
contains the streaming stores (movntpd/s) that _mm_stream_pd/s would produce.
Only if I compile with -fno-prefetch-loop-arrays do I actually get
non-streaming stores.
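For anyone who wants to check this on their own machine, here is a minimal
sketch of the kind of loop involved. The function name add_arrays and the
kernel itself are hypothetical (this is not the original code), and I have not
confirmed that this particular loop triggers the transformation. The idea is
to compile it to assembly, e.g. with gcc -O3 -march=barcelona -S, once with and
once without -fno-prefetch-loop-arrays, and look for movntpd versus movapd at
the store.

```c
#include <emmintrin.h>  /* SSE2 intrinsics: _mm_load_pd, _mm_add_pd, _mm_store_pd */
#include <stddef.h>     /* size_t */

/* Hypothetical test kernel: dst[i] = a[i] + b[i].
 * Assumptions: all three pointers are 16-byte aligned and n is even. */
void add_arrays(double *dst, const double *a, const double *b, size_t n)
{
    size_t i;
    for (i = 0; i < n; i += 2) {
        __m128d va = _mm_load_pd(a + i);
        __m128d vb = _mm_load_pd(b + i);
        /* Written as a regular aligned store; the question is whether the
         * generated assembly keeps it as movapd or turns it into movntpd. */
        _mm_store_pd(dst + i, _mm_add_pd(va, vb));
    }
}
```

The result of the computation is the same either way; only the instruction
chosen for the store differs, which matters when the working set fits into
the cache.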
Regards,
Matthias
--
Dipl.-Phys. Matthias Kretz
http://compeng.uni-frankfurt.de/?mkretz