[Bug tree-optimization/60888] New: x86 vector widen multiplication by constant is not replaced with shift and sub

Fri Apr 18 15:28:00 GMT 2014

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60888

            Bug ID: 60888
           Summary: x86 vector widen multiplication by constant is not
                    replaced with shift and sub
           Product: gcc
           Version: 4.10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: evstupac at gmail dot com

Created attachment 32631
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32631&action=edit
test case

For the following test case:

void
foo(char *out, char *in)
{
  int i;
  for(i = 0; i < 1024; i++)
    out[i] = (in[i] * 32767) >> 15;
}

compiled with:
-O3 -m32 -msse2 -S -fdump-tree-vect-details

the vectorizer generates the following code in the 114t.vect dump:

  vect_cst_.16_106 = { 32767, 32767, 32767, 32767, 32767, 32767, 32767, 32767 };
  ...
  vect_patt_24.15_107 = WIDEN_MULT_LO_EXPR <vect__25.14_104, vect_cst_.16_106>;
  vect_patt_24.15_108 = WIDEN_MULT_HI_EXPR <vect__25.14_104, vect_cst_.16_106>;
  vect_patt_24.15_109 = WIDEN_MULT_LO_EXPR <vect__25.14_105, vect_cst_.16_106>;
  vect_patt_24.15_110 = WIDEN_MULT_HI_EXPR <vect__25.14_105, vect_cst_.16_106>;

These four multiplications survive into the final assembly:
...
  punpcklbw  %xmm0, %xmm2
  punpckhbw  %xmm0, %xmm5
  pmullw     %xmm2, %xmm1 
  movdqa     %xmm1, %xmm0 
  pmulhw     %xmm3, %xmm2
...

However:

 out[i] = ((in[i] << 15) - in[i]) >> 15;

is faster and computes the same result, since 32767 = (1 << 15) - 1.
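
The equivalence can be checked on scalars with a small sketch like the one
below. It is not part of the attached test case; the function names are
made up for illustration. It assumes GCC's documented behavior for signed
shifts (left shift of a negative value is not treated as undefined, and
right shift is arithmetic), and it covers -128..255 so that both signed
and unsigned char are included.

```c
/* Original form from the report: multiply by 32767, then shift right. */
int scale_mul(int x)
{
    return (x * 32767) >> 15;
}

/* Proposed form: 32767 == (1 << 15) - 1, so x * 32767 == (x << 15) - x.
   Assumes GCC semantics for << and >> on negative signed values. */
int scale_shift_sub(int x)
{
    return ((x << 15) - x) >> 15;
}

/* Hypothetical helper: count inputs where the two forms disagree.
   The range covers every value a signed or unsigned char can hold. */
int count_mismatches(void)
{
    int x, bad = 0;

    for (x = -128; x <= 255; x++)
        if (scale_mul(x) != scale_shift_sub(x))
            bad++;
    return bad;
}
```

If count_mismatches() returns 0, the two expressions agree for every
possible in[i]. This only shows that the rewrite is valid; it says nothing
about which form is faster on a given target.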