[WIP PATCH]: Autovectorize V2SF mode

Fri May 8 17:53:19 GMT 2020

On Fri, May 8, 2020 at 7:22 PM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> Attached WIP patch enables auto-vectorization of basic V2SF operations
> (plus, minus, mult, min/max). The compiler takes care that everything
> is loaded from memory via the movq insn, so the top two elements of
> the XMM register always remain zero.

This example:

--cut here--
float r[2], a[2], b[2], c[2];

void foo (void)
{
  for (int i = 0; i < 2; i++)
    r[i] = 0.0f + a[i] - b[i] * c[i] + -1.0f;
}
--cut here--

compiles (-O3) to:

foo:
        movq    a(%rip), %xmm0
        xorps   %xmm1, %xmm1
        movq    c(%rip), %xmm2
        addps   %xmm1, %xmm0
        movq    b(%rip), %xmm1
        mulps   %xmm2, %xmm1
        subps   %xmm1, %xmm0
        movq    .LC0(%rip), %xmm1
        addps   %xmm1, %xmm0
        movlps  %xmm0, r(%rip)
        ret
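
For readers who want to see the two-element operation spelled out at the
source level, here is a minimal sketch that writes the same computation
with the GCC generic vector extension. It is an illustration only, not
part of the patch: the names v2sf and foo_v2sf are made up for this
example, and the sketch assumes a compiler that supports
__attribute__ ((vector_size)) (GCC or Clang). An 8-byte vector of floats
is what GCC represents internally as V2SFmode.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical typedef for illustration: two floats in an 8-byte vector.  */
typedef float v2sf __attribute__ ((vector_size (8)));

float r[2], a[2], b[2], c[2];

/* Hypothetical hand-vectorized counterpart of foo () above.  */
void foo_v2sf (void)
{
  const v2sf zero = { 0.0f, 0.0f };
  const v2sf minus_one = { -1.0f, -1.0f };
  v2sf va, vb, vc, vr;

  /* The vector type must cover exactly two floats.  */
  assert (sizeof (v2sf) == 2 * sizeof (float));

  /* Copy 8 bytes (two floats) from each input array.  */
  memcpy (&va, a, sizeof va);
  memcpy (&vb, b, sizeof vb);
  memcpy (&vc, c, sizeof vc);

  /* Same expression as in the scalar loop, applied to both elements.  */
  vr = zero + va - vb * vc + minus_one;

  /* Store only the two result elements.  */
  memcpy (r, &vr, sizeof vr);
}
```

Calling foo_v2sf () after filling a, b and c should leave the same values
in r as the scalar loop in foo (). Whether the compiler emits the same
movq/addps/mulps/subps/movlps sequence for it depends on the target and
on the options used.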

Uros.