This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Re: [PATCH, i386 tuning] Generate 128-bit AVX by default for bdver1

From: "H.J. Lu" <hjl dot tools at gmail dot com>
To: Richard Guenther <rguenther at suse dot de>
Cc: "Fang, Changpeng" <Changpeng dot Fang at amd dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, "hubicka at ucw dot cz" <hubicka at ucw dot cz>, "rth at redhat dot com" <rth at redhat dot com>
Date: Fri, 11 Feb 2011 06:47:50 -0800
Subject: Re: [PATCH, i386 tuning] Generate 128-bit AVX by default for bdver1
References: <D4C76825A6780047854A11E93CDE84D004D3054E8E@SAUSEXMBP01.amd.com> <alpine.LNX.2.00.1102111044210.17230@zhemvz.fhfr.qr>

On Fri, Feb 11, 2011 at 1:46 AM, Richard Guenther <rguenther@suse.de> wrote:
> On Thu, 10 Feb 2011, Fang, Changpeng wrote:
>
>> Hi,
>>
>> Attached is the patch to force gcc to generate 128-bit avx instructions for bdver1. We found that for
>> the current Bulldozer processors, AVX128 performs better than AVX256. For example, AVX128 is 3%
>> faster than AVX256 on CFP2006, and 2~3% faster than AVX256 on polyhedron.
>>
>> As a result, we prefer gcc 4.6 to generate 128-bit avx instructions only (for bdver1).
>>
>> The patch passed bootstrapping on x86_64-unknown-linux-gnu with "-O3 -g -march=bdver1" and
>> the necessary correctness and performance tests.
>>
>> Is it OK to commit to trunk?
>
> I think there was no attempt to tune anything for AVX256, in particular
> the vectorizer cost model may be completely off. HJ and Andi also
> hinted at some alignment problems (at least SB seems to have a large
> penalty when loads cross a cacheline boundary). So - did you do any
> investigation on why 256bit vectors are slower for you? Are these
> cases that the cost model could easily catch?
>

Here is a patch to split 32-byte unaligned loads/stores.  I don't have performance
numbers for it.
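
For illustration only (this is not taken from the attached patch): a minimal
sketch, written with intrinsics, of what "splitting" a 32-byte unaligned access
means at the source level.  The function names copy32_whole, copy32_split and
split_matches_whole are made up for this example, and it assumes GCC or a
compatible compiler that accepts __attribute__((target("avx"))) and
__builtin_cpu_supports.  It only checks that the two forms move the same data;
it says nothing about which one is faster.

```c
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define HAVE_X86 1
#else
#define HAVE_X86 0
#endif

#if HAVE_X86
/* Move 8 floats with one 32-byte unaligned load and one 32-byte
   unaligned store.  */
__attribute__((target("avx")))
static void copy32_whole(const float *src, float *dst)
{
    __m256 v = _mm256_loadu_ps(src);
    _mm256_storeu_ps(dst, v);
}

/* Move the same 8 floats as two 16-byte halves: two 128-bit loads
   combined into one 256-bit register, then the low and high halves
   stored separately.  */
__attribute__((target("avx")))
static void copy32_split(const float *src, float *dst)
{
    __m128 lo = _mm_loadu_ps(src);
    __m128 hi = _mm_loadu_ps(src + 4);
    __m256 v = _mm256_insertf128_ps(_mm256_castps128_ps256(lo), hi, 1);

    _mm_storeu_ps(dst, _mm256_castps256_ps128(v));
    _mm_storeu_ps(dst + 4, _mm256_extractf128_ps(v, 1));
}
#endif

/* Returns 1 if both forms copy identical data from a misaligned
   address, 0 if they differ, and -1 if AVX is not available.  */
int split_matches_whole(void)
{
#if HAVE_X86
    float buf[16] __attribute__((aligned(32)));
    float out_whole[8], out_split[8];
    int i;

    if (!__builtin_cpu_supports("avx"))
        return -1;

    for (i = 0; i < 16; i++)
        buf[i] = (float)i;

    /* buf is 32-byte aligned, so buf + 1 is 4 bytes past a 32-byte
       boundary and the accesses below are genuinely unaligned.  */
    copy32_whole(buf + 1, out_whole);
    copy32_split(buf + 1, out_split);

    return memcmp(out_whole, out_split, sizeof out_whole) == 0;
#else
    return -1;
#endif
}
```

Calling split_matches_whole() from a small driver is enough to confirm that
the split form is a drop-in replacement as far as the data is concerned.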

-- 
H.J.
----
gcc/

2011-02-11  H.J. Lu  <hongjiu.lu@intel.com>

	* config/i386/i386.c (flag_opts): Add -mavx256-split-unaligned-load
	and -mavx256-split-unaligned-store.
	(ix86_option_override_internal): Split 32-byte AVX unaligned
	load/store by default.
	(ix86_avx256_split_vector_move_misalign): New.
	(ix86_expand_vector_move_misalign): Use it.

	* config/i386/i386.opt: Add -mavx256-split-unaligned-load and
	-mavx256-split-unaligned-store.

	* config/i386/sse.md (*avx_mov<mode>_internal): Verify unaligned
	256bit load/store.  Generate unaligned store on misaligned memory
	operand.
	(*avx_movu<ssemodesuffix><avxmodesuffix>): Verify unaligned
	256bit load/store.
	(*avx_movdqu<avxmodesuffix>): Likewise.

	* doc/invoke.texi: Document -mavx256-split-unaligned-load and
	-mavx256-split-unaligned-store.

gcc/testsuite/

2011-02-11  H.J. Lu  <hongjiu.lu@intel.com>

	* gcc.target/i386/avx256-unaligned-load-1.c: New.
	* gcc.target/i386/avx256-unaligned-load-2.c: Likewise.
	* gcc.target/i386/avx256-unaligned-load-3.c: Likewise.
	* gcc.target/i386/avx256-unaligned-load-4.c: Likewise.
	* gcc.target/i386/avx256-unaligned-load-5.c: Likewise.
	* gcc.target/i386/avx256-unaligned-load-6.c: Likewise.
	* gcc.target/i386/avx256-unaligned-load-7.c: Likewise.
	* gcc.target/i386/avx256-unaligned-store-1.c: Likewise.
	* gcc.target/i386/avx256-unaligned-store-2.c: Likewise.
	* gcc.target/i386/avx256-unaligned-store-3.c: Likewise.
	* gcc.target/i386/avx256-unaligned-store-4.c: Likewise.
	* gcc.target/i386/avx256-unaligned-store-5.c: Likewise.
	* gcc.target/i386/avx256-unaligned-store-6.c: Likewise.
	* gcc.target/i386/avx256-unaligned-store-7.c: Likewise.
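
As a rough idea of the kind of loop such a test might exercise, here is a
hypothetical sketch in the usual gcc.target/i386 style.  It is not the content
of any of the files listed above; the array names, the option string and the
scan-assembler pattern are assumptions chosen for illustration.

```c
/* Hypothetical testcase sketch, not one of the files in the patch.  */
/* { dg-do compile } */
/* { dg-options "-O3 -mavx -mavx256-split-unaligned-load" } */

#define N 1024

float a[N], b[N + 3], c[N];

/* b is read at an offset of 3 floats (12 bytes), so its accesses are
   misaligned relative to a and c and cannot all be treated as
   32-byte aligned when the loop is vectorized.  */
void
avx_test (void)
{
  int i;

  for (i = 0; i < N; i++)
    c[i] = a[i] * b[i + 3];
}

/* Assumption: with the split enabled, the unaligned 256-bit load is
   expected to appear as a 128-bit load followed by vinsertf128.  */
/* { dg-final { scan-assembler "vinsertf128" } } */
```

The dg- comments are only interpreted by the DejaGnu harness; compiled on its
own, the file is ordinary C and avx_test can be called directly.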

Attachment: gcc-avx256-unaligned-1.patch
Description: Text document

Follow-Ups:
- RE: [PATCH, i386 tuning] Generate 128-bit AVX by default for bdver1
  - From: Fang, Changpeng

References:
- [PATCH, i386 tuning] Generate 128-bit AVX by default for bdver1
  - From: Fang, Changpeng
- Re: [PATCH, i386 tuning] Generate 128-bit AVX by default for bdver1
  - From: Richard Guenther
