This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


[Bug target/45391] CPU2006 482.sphinx3: gcc4.6 5% regression from prefetching of vectorized loop

From: "changpeng dot fang at amd dot com" <gcc-bugzilla at gcc dot gnu dot org>
To: gcc-bugs at gcc dot gnu dot org
Date: 24 Aug 2010 00:03:55 -0000
Subject: [Bug target/45391] CPU2006 482.sphinx3: gcc4.6 5% regression from prefetching of vectorized loop
References: <bug-45391-18740@http.gcc.gnu.org/bugzilla/>
Reply-to: gcc-bugzilla at gcc dot gnu dot org


------- Comment #2 from changpeng dot fang at amd dot com  2010-08-24 00:03 -------
float f (float *x, float *y, float *z, unsigned n)
{
  float ret = 0.0;
  unsigned i;
  for (i = 0; i < n; i++)
    {
      float diff = x[i] - y[i];
      ret -= diff * diff * z[i];
    }
  return ret;
}

No, this is related to PR 45022 in a certain sense, but the underlying
reason is not yet known.

For the above test case, if I compile with -O3 -march=amdfam10 -m64,
the loop is not vectorized because of the floating-point reduction. To my
surprise, no prefetch is generated. The cost model filtered out the
prefetches (we are trying to prefetch for each of the three memory
references):
Ahead 15, unroll factor 1, trip count -1
insn count 14, mem ref count 3, prefetch count 3
Not prefetching -- instruction to prefetch ratio (4) too small
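
To make the arithmetic behind that message explicit, here is a minimal
sketch of the check in C. It is not GCC's actual code: the function names
are hypothetical, and the threshold of 9 is an assumption (I believe it is
the default of the min-insn-to-prefetch-ratio parameter, but check your
GCC version). With the numbers from the dump, 1 * 14 / 3 truncates to 4,
which is below the assumed threshold.

```c
/* Assumed threshold; believed to match the default of GCC's
   min-insn-to-prefetch-ratio parameter, not taken from the dump.  */
#define MIN_INSN_TO_PREFETCH_RATIO 9

/* Hypothetical helper: integer ratio of loop instructions (after
   unrolling) to the number of prefetches that would be issued.
   The caller must pass prefetch_count > 0.  */
static unsigned
insn_to_prefetch_ratio (unsigned unroll_factor, unsigned insn_count,
                        unsigned prefetch_count)
{
  return (unroll_factor * insn_count) / prefetch_count;
}

/* Hypothetical helper: nonzero if the loop has enough instructions
   per prefetch for prefetching to be considered worthwhile.  */
static int
prefetch_is_profitable (unsigned unroll_factor, unsigned insn_count,
                        unsigned prefetch_count)
{
  if (prefetch_count == 0)
    return 0;  /* nothing to prefetch */
  return insn_to_prefetch_ratio (unroll_factor, insn_count, prefetch_count)
         >= MIN_INSN_TO_PREFETCH_RATIO;
}
```

Calling prefetch_is_profitable (1, 14, 3) returns 0, matching the dump.
With the same (purely illustrative) instruction count and a single
prefetch, the ratio would be 14 and the check would pass; the real
instruction count of the vectorized loop is not shown in this report.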

However, if we compile with -O3 -ffast-math -march=amdfam10 -m64,
the loop can be vectorized, and one of the array references is
aligned. As a result, and due to PR 45022, we try to prefetch
only for the aligned reference, and one prefetch is inserted (this
time, the insns-to-prefetch ratio is big enough).

The fix for PR 45022 will actually result in no prefetch being generated,
and will thus hide the problem.




-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45391

References:
- [Bug target/45391] New: CPU2006 482.sphinx3: gcc4.6 5% regression from prefetching of vectorized loop
  - From: changpeng dot fang at amd dot com
