This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: PATCH: Split AVX 32byte unaligned load/store
- From: Uros Bizjak <ubizjak at gmail dot com>
- To: "H.J. Lu" <hjl dot tools at gmail dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Sun, 27 Mar 2011 19:53:13 +0200
- Subject: Re: PATCH: Split AVX 32byte unaligned load/store
- References: <AANLkTimnrqAtfukMkpZkjcLSEaJC2feHYnp9Bm0oh_Do@mail.gmail.com>
On Sun, Mar 27, 2011 at 3:44 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> Here is a patch to split AVX 32byte unaligned load/store:
>
> http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00743.html
>
> It speeds up some SPEC CPU 2006 benchmarks by up to 6%.
> OK for trunk?
> 2011-02-11 H.J. Lu <hongjiu.lu@intel.com>
>
> * config/i386/i386.c (flag_opts): Add -mavx256-split-unaligned-load
> and -mavx256-split-unaligned-store.
> (ix86_option_override_internal): Split 32-byte AVX unaligned
> load/store by default.
> (ix86_avx256_split_vector_move_misalign): New.
> (ix86_expand_vector_move_misalign): Use it.
>
> * config/i386/i386.opt: Add -mavx256-split-unaligned-load and
> -mavx256-split-unaligned-store.
>
> * config/i386/sse.md (*avx_mov<mode>_internal): Verify unaligned
> 256bit load/store. Generate unaligned store on misaligned memory
> operand.
> (*avx_movu<ssemodesuffix><avxmodesuffix>): Verify unaligned
> 256bit load/store.
> (*avx_movdqu<avxmodesuffix>): Likewise.
>
> * doc/invoke.texi: Document -mavx256-split-unaligned-load and
> -mavx256-split-unaligned-store.
>
> gcc/testsuite/
>
> 2011-02-11 H.J. Lu <hongjiu.lu@intel.com>
>
> * gcc.target/i386/avx256-unaligned-load-1.c: New.
> * gcc.target/i386/avx256-unaligned-load-2.c: Likewise.
> * gcc.target/i386/avx256-unaligned-load-3.c: Likewise.
> * gcc.target/i386/avx256-unaligned-load-4.c: Likewise.
> * gcc.target/i386/avx256-unaligned-load-5.c: Likewise.
> * gcc.target/i386/avx256-unaligned-load-6.c: Likewise.
> * gcc.target/i386/avx256-unaligned-load-7.c: Likewise.
> * gcc.target/i386/avx256-unaligned-store-1.c: Likewise.
> * gcc.target/i386/avx256-unaligned-store-2.c: Likewise.
> * gcc.target/i386/avx256-unaligned-store-3.c: Likewise.
> * gcc.target/i386/avx256-unaligned-store-4.c: Likewise.
> * gcc.target/i386/avx256-unaligned-store-5.c: Likewise.
> * gcc.target/i386/avx256-unaligned-store-6.c: Likewise.
> * gcc.target/i386/avx256-unaligned-store-7.c: Likewise.
>
> @@ -203,19 +203,37 @@
> return standard_sse_constant_opcode (insn, operands[1]);
> case 1:
> case 2:
> + if (GET_MODE_ALIGNMENT (<MODE>mode) == 256
> + && ((TARGET_AVX256_SPLIT_UNALIGNED_STORE
> + && MEM_P (operands[0])
> + && MEM_ALIGN (operands[0]) < 256)
> + || (TARGET_AVX256_SPLIT_UNALIGNED_LOAD
> + && MEM_P (operands[1])
> + && MEM_ALIGN (operands[1]) < 256)))
> + gcc_unreachable ();
Please use "misaligned_operand (operands[...], <MODE>mode)" instead of
MEM_P && MEM_ALIGN combo in a couple of places.
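For reference, the quoted hunk with the predicate substituted would read roughly as follows (a sketch against the i386 backend sources, not a tested diff; `misaligned_operand` is the existing i386 predicate that folds the MEM_P and MEM_ALIGN-versus-mode-alignment test into one call):

```c
/* Sketch of the condition rewritten per the review comment.  */
if (GET_MODE_ALIGNMENT (<MODE>mode) == 256
    && ((TARGET_AVX256_SPLIT_UNALIGNED_STORE
	 && misaligned_operand (operands[0], <MODE>mode))
	|| (TARGET_AVX256_SPLIT_UNALIGNED_LOAD
	    && misaligned_operand (operands[1], <MODE>mode))))
  gcc_unreachable ();
```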
OK with that change.
Thanks,
Uros.