Re: [PATCH i386][google] With -mtune=core2, avoid generating the slow unaligned vector load/store (issue5488054)


On 12/12/2011 06:05 PM, Sriraman Tallam wrote:
> On core2, an unaligned vector load/store using movdqu is a very slow
> operation.  Experiments show it is six times slower than an aligned
> movdqa, irrespective of whether the data being accessed happens to be
> aligned or not.  On Core i7 there is no performance difference between
> the two, and on AMD processors movdqu is only about 10% slower.
> 
> This patch disables vectorization of loops that would need to generate
> the slow unaligned memory loads/stores on core2.

What happens if you temporarily disable

      /* ??? Similar to above, only less clear because of quote
         typeless stores unquote.  */
      if (TARGET_SSE2 && !TARGET_SSE_TYPELESS_STORES
          && GET_MODE_CLASS (mode) == MODE_VECTOR_INT)
        {
          op0 = gen_lowpart (V16QImode, op0);
          op1 = gen_lowpart (V16QImode, op1);
          emit_insn (gen_sse2_movdqu (op0, op1));
          return;
        }

so that the unaligned store happens via movlps + movhps?
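[For concreteness, a sketch of the two unaligned-store sequences being
compared.  These intrinsic wrappers are illustrative only, not code from
GCC, and the helper names are made up:]

  #include <emmintrin.h>   /* SSE2 intrinsics */

  /* Store 16 bytes to a possibly unaligned address as one movdqu.  */
  static void
  store16_movdqu (void *p, __m128i v)
  {
    _mm_storeu_si128 ((__m128i *) p, v);              /* movdqu */
  }

  /* Store the same 16 bytes as two 8-byte halves.  */
  static void
  store16_split (void *p, __m128i v)
  {
    __m128 f = _mm_castsi128_ps (v);
    _mm_storel_pi ((__m64 *) p, f);                   /* movlps */
    _mm_storeh_pi ((__m64 *) ((char *) p + 8), f);    /* movhps */
  }

[Whether the move expander actually falls back to the movlps/movhps
sequence once the movdqu branch is disabled is, of course, the question
being asked above.]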


r~

