This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH, i386 tuning] Generate 128-bit AVX by default for bdver1


On Thu, 10 Feb 2011, Fang, Changpeng wrote:

> Hi, 
> 
> Attached is the patch to force GCC to generate 128-bit AVX instructions for bdver1. We found that on
> current Bulldozer processors, AVX128 performs better than AVX256. For example, AVX128 is 3%
> faster than AVX256 on CFP2006, and 2-3% faster than AVX256 on Polyhedron.
> 
> As a result, we prefer that GCC 4.6 generate only 128-bit AVX instructions for bdver1.
> 
> The patch passed bootstrapping on x86_64-unknown-linux-gnu with "-O3 -g -march=bdver1" and
> the necessary correctness and performance tests.
> 
> Is it OK to commit to trunk?

I think there was no attempt to tune anything for AVX256; in particular,
the vectorizer cost model may be completely off.  HJ and Andi also
hinted at some alignment problems (at least Sandy Bridge seems to have a
large penalty when loads cross a cacheline boundary).  So, did you
investigate why 256-bit vectors are slower for you?  Are these
cases that the cost model could easily catch?

Thanks,
Richard.
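
[Editorial illustration, not part of the original message.] The
cacheline-crossing concern raised above can be made concrete with a little
arithmetic. The sketch below counts how many starting offsets within one
cache line make an unaligned 16-byte or 32-byte load span two lines. The
64-byte line size and the helper names are assumptions chosen for
illustration. The thread does not state a line size, and nothing here
measures an actual penalty.

```python
# Assumption: 64-byte cache lines. The thread does not state a line size.
CACHE_LINE_BYTES = 64


def crosses_cache_line(address, load_bytes, line_bytes=CACHE_LINE_BYTES):
    """Return True if a load of load_bytes starting at address spans two lines."""
    first_line = address // line_bytes
    last_line = (address + load_bytes - 1) // line_bytes
    return first_line != last_line


def count_crossing_offsets(load_bytes, line_bytes=CACHE_LINE_BYTES):
    """Count the start offsets within one line at which the load crosses into the next."""
    return sum(
        crosses_cache_line(offset, load_bytes, line_bytes)
        for offset in range(line_bytes)
    )


if __name__ == "__main__":
    # 16 bytes models a 128-bit vector load, 32 bytes a 256-bit one.
    for width in (16, 32):
        crossing = count_crossing_offsets(width)
        print(f"{width}-byte load: {crossing} of {CACHE_LINE_BYTES} offsets cross a line")
```

Under these assumptions, a 32-byte load crosses a line at roughly twice as
many byte offsets as a 16-byte load. That is one way unaligned 256-bit
accesses could be more exposed to such a penalty, but it does not by itself
explain the slowdown reported for bdver1.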

