This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Re: {PATCH, vect, i386]: Vectorize lrint() and generate cvtpd2dq insn

From: "Richard Guenther" <richard dot guenther at gmail dot com>
To: "Uros Bizjak" <ubizjak at gmail dot com>
Cc: gcc-patches <gcc-patches at gcc dot gnu dot org>
Date: Fri, 29 Jun 2007 11:00:32 +0200
Subject: Re: {PATCH, vect, i386]: Vectorize lrint() and generate cvtpd2dq insn
References: <5787cf470706290136n1b0b9d2cib9c1a31df86edf5f@mail.gmail.com>

On 6/29/07, Uros Bizjak <ubizjak@gmail.com> wrote:

Hello!

This patch brings the NARROW and WIDEN modifier approach, already
implemented in vectorizable_conversion(), to the vectorizable_call()
function. Using these modifiers, gcc can vectorize calls where
(nunits_in == nunits_out / 2) or (nunits_out == nunits_in / 2).
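
As a rough illustration of the modifier selection described above, the
following standalone sketch classifies a call by its input and output
vector element counts. The enum, its constants and classify_call() are
hypothetical names chosen for this example and are not taken from the
patch; the equal-count case is assumed to be the previously supported
one.

```c
/* Hypothetical sketch: the real logic lives inside GCC's
   vectorizable_call() and uses GCC-internal types.  */
enum vect_modifier { MOD_NONE, MOD_NARROW, MOD_WIDEN, MOD_UNSUPPORTED };

/* nunits_in/nunits_out are the element counts of the argument and result
   vector types; both are assumed to be powers of two.  */
static enum vect_modifier
classify_call (int nunits_in, int nunits_out)
{
  if (nunits_in == nunits_out)
    return MOD_NONE;        /* same number of elements, e.g. V2DF -> V2DF */
  if (nunits_in == nunits_out / 2)
    return MOD_NARROW;      /* two input vectors fill one output vector,
                               e.g. 2 x V2DF -> V4SI for lrint */
  if (nunits_out == nunits_in / 2)
    return MOD_WIDEN;       /* one input vector yields two output vectors */
  return MOD_UNSUPPORTED;   /* any other ratio is not handled */
}
```

For the lrint() case in this patch, the argument vector holds 2 doubles
and the result vector holds 4 ints, so classify_call (2, 4) falls into
the NARROW case.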

The attached patch uses this infrastructure to vectorize BUILT_IN_LRINT
using the cvtpd2dq SSE insn. Also, this patch re-defines all 2-arg i386
builtins as const builtins (all builtins were checked to confirm that
none of them clobbers global memory).

Following testcase:

--cut here--
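#include <math.h>

/* Declarations assumed; the original excerpt omitted them.  */
double a[256];
int b[256];

void foo(void)
{
  int i;

  for (i=0; i<256; ++i)
    b[i] = lrint (a[i]);
}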
--cut here--

generates (-O2 -msse3 -ffast-math -ftree-vectorize):

.L7:
        cvtpd2dq       a(%eax,%eax), %xmm0
        cvtpd2dq       a+16(%eax,%eax), %xmm1
        punpcklqdq     %xmm1, %xmm0
        movdqa  %xmm0, b(%eax)
        addl    $16, %eax
        cmpl    $1024, %eax
        jne     .L7
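
For readers who prefer intrinsics to assembly, the loop above
corresponds roughly to the following hand-written SSE2 sketch. It is
only an approximation of what the vectorizer emits, not code from the
patch: foo_sse2() is a hypothetical name, and the array types and
16-byte alignment are assumptions inferred from the generated code.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

#define N 256

/* Assumed declarations; the aligned stores (movdqa) and aligned memory
   operands require 16-byte alignment.  */
static double a[N] __attribute__ ((aligned (16)));
static int b[N] __attribute__ ((aligned (16)));

/* Hypothetical hand-written equivalent of the vectorized loop.  */
static void
foo_sse2 (void)
{
  for (int i = 0; i < N; i += 4)
    {
      /* cvtpd2dq: convert 2 doubles to 2 ints in the low 64 bits,
         rounding according to the current rounding mode.  */
      __m128i lo = _mm_cvtpd_epi32 (_mm_load_pd (&a[i]));
      __m128i hi = _mm_cvtpd_epi32 (_mm_load_pd (&a[i + 2]));

      /* punpcklqdq: combine the two low halves into one 4 x int vector,
         then store it with an aligned store (movdqa).  */
      _mm_store_si128 ((__m128i *) &b[i], _mm_unpacklo_epi64 (lo, hi));
    }
}
```

Each iteration consumes four doubles and produces four ints, which
matches the 16-byte step of the store offset in the generated loop.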

The patch was bootstrapped on i686-pc-linux-gnu and regression tested
for all default languages. It finally closes PR
tree-optimization/24659, as all conversions are now vectorized (on
SSEx targets).

OK for mainline? (The patch needs approval for the vectorizer part.)

This is ok.

Thanks,
Richard.

2007-06-29 Uros Bizjak <ubizjak@gmail.com>

        PR tree-optimization/24659
        * tree-vect-transform.c (vectorizable_call): Handle
        (nunits_in == nunits_out / 2) and (nunits_out == nunits_in / 2) cases.

        * config/i386/sse.md (vec_pack_sfix_v2df): New expander.
        * config/i386/i386.c (enum ix86_builtins) [IX86_BUILTIN_VEC_PACK_SFIX]:
        New constant.
        (struct bdesc_2arg) [__builtin_ia32_vec_pack_sfix]: New builtin
        description.
        (ix86_init_mmx_sse_builtins): Define all builtins with 2 arguments as
        const using def_builtin_const.
        (ix86_expand_binop_builtin): Remove bogus assert() that insn wants
        input operands in the same modes as the result.
        (ix86_builtin_vectorized_function): Handle BUILT_IN_LRINT.

testsuite/ChangeLog:

2007-06-29 Uros Bizjak <ubizjak@gmail.com>

        PR tree-optimization/24659
        * gcc.target/i386/vectorize2.c: New test.
        * gcc.target/i386/sse2-lrint-vec.c: New runtime test.
        * gcc.target/i386/sse2-lrintf-vec.c: Ditto.

Uros.
