This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



RE: Honnor ix86_accumulate_outgoing_args again


> we are going to have some AMD CPU with AVX2 support soon, the question is
> if it will prefer 256-bit vmovups/vmovupd/vmovdqu or split, but even
> if it will prefer split, the question is if like bdver{1,2,3} it will
> be X86_TUNE_AVX128_OPTIMAL, because if yes, then how 256-bit unaligned
> loads/stores are handled is much less important there.  Ganesh?

256-bit is friendly on bdver4.
But 256-bit unaligned stores are micro-coded, which we would like to avoid, so we require 128-bit MOVUPS.

-----Original Message-----
From: Jakub Jelinek [mailto:jakub@redhat.com] 
Sent: Tuesday, November 12, 2013 3:57 PM
To: Jan Hubicka
Cc: H.J. Lu; Vladimir Makarov; GCC Patches; Uros Bizjak; Richard Henderson; Gopalasubramanian, Ganesh
Subject: Re: Honnor ix86_accumulate_outgoing_args again

On Tue, Nov 12, 2013 at 11:05:45AM +0100, Jan Hubicka wrote:
> > @@ -16576,7 +16576,7 @@ ix86_avx256_split_vector_move_misalign (rtx op0, rtx op1)
> > 
> >    if (MEM_P (op1))
> >      {
> > -      if (TARGET_AVX256_SPLIT_UNALIGNED_LOAD)
> > +      if (!TARGET_AVX2 && TARGET_AVX256_SPLIT_UNALIGNED_LOAD)
> >      {
> >        rtx r = gen_reg_rtx (mode);
> >        m = adjust_address (op1, mode, 0);
> > @@ -16596,7 +16596,7 @@ ix86_avx256_split_vector_move_misalign (rtx op0, rtx op1)
> >      }
> >    else if (MEM_P (op0))
> >      {
> > -      if (TARGET_AVX256_SPLIT_UNALIGNED_STORE)
> > +      if (!TARGET_AVX2 && TARGET_AVX256_SPLIT_UNALIGNED_STORE)
> 
> I would add an explanatory comment on those two.

Looking at http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01235.html
we are going to have some AMD CPU with AVX2 support soon, the question is
if it will prefer 256-bit vmovups/vmovupd/vmovdqu or split, but even
if it will prefer split, the question is if like bdver{1,2,3} it will
be X86_TUNE_AVX128_OPTIMAL, because if yes, then how 256-bit unaligned
loads/stores are handled is much less important there.  Ganesh?
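
To make the concern concrete, here is a minimal sketch in plain C of how the
two conditions in the quoted patch differ.  It is only a model: the struct
and function names below are hypothetical stand-ins for GCC's TARGET_AVX2 and
TARGET_AVX256_SPLIT_UNALIGNED_{LOAD,STORE} macros, not actual GCC code, and
the tuning values passed in are assumptions, not the real settings of any CPU.

```c
#include <stdbool.h>

/* Hypothetical model of the target state consulted by the patched code.
   Each field stands in for one GCC macro; none of this is GCC source.  */
struct tune_model
{
  bool avx2;                   /* stands in for TARGET_AVX2 */
  bool split_unaligned_load;   /* TARGET_AVX256_SPLIT_UNALIGNED_LOAD */
  bool split_unaligned_store;  /* TARGET_AVX256_SPLIT_UNALIGNED_STORE */
};

/* Conditions before the patch: the tuning flag alone decides.  */
static bool
split_load_before (const struct tune_model *t)
{
  return t->split_unaligned_load;
}

static bool
split_store_before (const struct tune_model *t)
{
  return t->split_unaligned_store;
}

/* Conditions after the patch: any AVX2 target never splits,
   whatever its tuning flags say.  */
static bool
split_load_after (const struct tune_model *t)
{
  return !t->avx2 && t->split_unaligned_load;
}

static bool
split_store_after (const struct tune_model *t)
{
  return !t->avx2 && t->split_unaligned_store;
}
```

For an assumed AVX2 target whose tuning asks for split unaligned stores but
not split loads, split_store_before returns true and split_store_after
returns false, so the patched condition would silently override that tuning
preference.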

> Shall we also disable argument accumulation for cores? It seems we won't
> solve the IRA issues, right?

You mean LRA issues here, right?  If you are starting to use
no-accumulate-outgoing-args much more often than in the past, I think
the problem that LRA forces a frame pointer in that case is much more
important now (or has that been fixed in the meantime?).  Vlad?

	Jakub


