This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: PATCH: Add vzeroupper optimization for AVX


On Mon, Oct 25, 2010 at 4:38 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Mon, Oct 25, 2010 at 01:57:24AM -0700, H.J. Lu wrote:
>> At RTL expansion time, the vzeroupper optimization generates a
>> vzeroupper_nop before function call and function return if 256bit AVX
>> instructions are used. The vzeroupper pass is run before final pass.
>
> Can't you run it at the end of machine_reorg instead?
>

Here is the updated patch without the new pass.  OK for trunk?

Thanks.


-- 
H.J.
gcc/

2010-10-25  H.J. Lu  <hongjiu.lu@intel.com>

	* config/i386/i386-protos.h (init_cumulative_args): Add an int.

	* config/i386/i386.c (block_info): New.
	(BLOCK_INFO): Likewise.
	(RTX_VZEROUPPER_CALLEE_RETURN_AVX256): Likewise.
	(RTX_VZEROUPPER_CALLEE_RETURN_PASS_AVX256): Likewise.
	(RTX_VZEROUPPER_CALLEE_PASS_AVX256): Likewise.
	(RTX_VZEROUPPER_NO_AVX256): Likewise.
	(check_avx256_stores): Likewise.
	(move_or_delete_vzeroupper_2): Likewise.
	(move_or_delete_vzeroupper_1): Likewise.
	(move_or_delete_vzeroupper): Likewise.
	(use_avx256_p): Likewise.
	(function_pass_avx256_p): Likewise.
	(flag_opts): Add -mvzeroupper.
	(ix86_option_override_internal): Turn on MASK_VZEROUPPER by
	default for TARGET_AVX.  Turn off MASK_VZEROUPPER if TARGET_AVX
	is disabled.
	(ix86_function_ok_for_sibcall): Disable sibcall if we need to
	generate vzeroupper.
	(init_cumulative_args): Add an int to indicate caller.  Set
	use_avx256_p, callee_return_avx256_p and caller_use_avx256_p
	based on return type.
	(ix86_function_arg): Set use_avx256_p, callee_pass_avx256_p and
	caller_pass_avx256_p based on argument type.
	(ix86_expand_epilogue): Emit vzeroupper if 256bit AVX register
	is used, but not returned by caller.
	(ix86_expand_call): Emit vzeroupper if 256bit AVX register is
	used.
	(ix86_local_alignment): Set use_avx256_p if 256bit AVX register
	is used.
	(ix86_minimum_alignment): Likewise.
	(ix86_reorg): Run the vzeroupper optimization if needed.

	* config/i386/i386.h (ix86_args): Add caller.
	(INIT_CUMULATIVE_ARGS): Updated.
	(machine_function): Add use_vzeroupper_p, use_avx256_p,
	caller_pass_avx256_p, caller_return_avx256_p,
	callee_pass_avx256_p and callee_return_avx256_p.

	* config/i386/i386.md (UNSPECV_VZEROUPPER_NOP): New.
	* config/i386/sse.md (avx_vzeroupper_nop): Likewise.

	* config/i386/i386.opt (-mvzeroupper): New.

	* doc/invoke.texi: Document -mvzeroupper.

gcc/testsuite/

2010-10-25  H.J. Lu  <hongjiu.lu@intel.com>

	* gcc.target/i386/avx-vzeroupper-1.c: Add -mtune=generic.
	* gcc.target/i386/avx-vzeroupper-2.c: Likewise.

	* gcc.target/i386/avx-vzeroupper-3.c: New.
	* gcc.target/i386/avx-vzeroupper-4.c: Likewise.
	* gcc.target/i386/avx-vzeroupper-5.c: Likewise.
	* gcc.target/i386/avx-vzeroupper-6.c: Likewise.
	* gcc.target/i386/avx-vzeroupper-7.c: Likewise.
	* gcc.target/i386/avx-vzeroupper-8.c: Likewise.
	* gcc.target/i386/avx-vzeroupper-9.c: Likewise.
	* gcc.target/i386/avx-vzeroupper-10.c: Likewise.
	* gcc.target/i386/avx-vzeroupper-11.c: Likewise.
	* gcc.target/i386/avx-vzeroupper-12.c: Likewise.
	* gcc.target/i386/avx-vzeroupper-13.c: Likewise.
	* gcc.target/i386/avx-vzeroupper-14.c: Likewise.
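
For readers without the attachment, the placeholder scheme described in
the quoted text (a vzeroupper_nop is emitted before calls and returns,
then resolved later) can be pictured with a toy model. This is only a
sketch and is not taken from the patch. The names toy_insn and
toy_resolve_vzeroupper are hypothetical, the model covers a single
straight-line block, and it assumes callees leave the upper halves
clean. The patch itself tracks more state, such as 256-bit arguments
and return values.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical instruction kinds for a toy, single-block model.
   None of these names come from the patch.  */
enum toy_insn {
  TOY_AVX256_STORE,   /* writes a 256-bit AVX register */
  TOY_VZEROUPPER_NOP, /* placeholder emitted at expansion time */
  TOY_CALL,
  TOY_RETURN
};

/* Scan INSNS in order.  keep[i] is set to true only for a placeholder
   that is reached while the upper halves may be dirty; such a
   placeholder would become a real vzeroupper.  Every other placeholder
   would be deleted.  Returns the number of placeholders kept.

   Assumption: a call does not dirty the upper halves.  */
size_t
toy_resolve_vzeroupper (const enum toy_insn *insns, size_t n, bool *keep)
{
  bool upper_dirty = false;
  size_t kept = 0;

  for (size_t i = 0; i < n; i++)
    {
      keep[i] = false;
      switch (insns[i])
        {
        case TOY_AVX256_STORE:
          upper_dirty = true;
          break;
        case TOY_VZEROUPPER_NOP:
          if (upper_dirty)
            {
              keep[i] = true;
              kept++;
              upper_dirty = false; /* vzeroupper clears the upper halves */
            }
          break;
        case TOY_CALL:
        case TOY_RETURN:
          break;
        }
    }
  return kept;
}
```

In this model a placeholder ahead of a call that precedes any 256-bit
store is dropped, while one that follows such a store is kept.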

Attachment: gcc-vzeroupper-2.patch
Description: Text document

