This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH] Re: Vectorizer question: DIV to RSHIFT conversion
- From: Jakub Jelinek <jakub at redhat dot com>
- To: gcc-patches at gcc dot gnu dot org
- Cc: Kirill Yukhin <kirill dot yukhin at gmail dot com>, Richard Guenther <rguenther at suse dot de>, Ira Rosen <ira dot rosen at linaro dot org>
- Date: Wed, 14 Dec 2011 18:53:42 +0100
- Subject: Re: [PATCH] Re: Vectorizer question: DIV to RSHIFT conversion
- References: <CAGs3RfvOFgfVQ=PkYM+CtsgBL99k_gF_R-Yet3oFLHU7b-k5jQ@mail.gmail.com> <alpine.LNX.2.00.1112131406430.4527@zhemvz.fhfr.qr> <20111213132128.GZ1957@tyan-ft48-01.lab.bos.redhat.com> <CAGs3Rfucj9C7DqcjjJOzor0X=Yf_DT1av1T6cZdfZkmOsS+VNw@mail.gmail.com> <20111213134741.GA1957@tyan-ft48-01.lab.bos.redhat.com> <CAGs3RfvkZW+k-NmqHkNx7V42zOV4SCL-OXVLUr7M-iHs_HVAYA@mail.gmail.com> <20111214122513.GD1957@tyan-ft48-01.lab.bos.redhat.com>
- Reply-to: Jakub Jelinek <jakub at redhat dot com>
On Wed, Dec 14, 2011 at 01:25:13PM +0100, Jakub Jelinek wrote:
> On Tue, Dec 13, 2011 at 05:57:40PM +0400, Kirill Yukhin wrote:
> > > Let me hack up a quick pattern recognizer for this...
>
> Here it is, untested so far.
> On the testcase doing 2000000 f1+f2+f3+f4 calls in the loop with -O3 -mavx
> on Sandybridge (so, vectorized just with 16 byte vectors) gives:
> vanilla 0m34.571s
> the tree-vect* parts of this patch only 0m9.013s
> the whole patch 0m8.824s
> The i386 parts are just a small optimization, I guess it could be
> done in the vectorizer too (but then we'd have to check whether the
> arithmetic/logical right shifts are supported and check costs?), or
> perhaps in the generic vcond expander (again, we'd need to check some
> costs).
Now bootstrapped/regtested on x86_64-linux and i686-linux.
Ok for trunk (at least the pattern recognizer)?
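For readers following the thread, the scalar identity behind turning a signed division or modulo by a power of two into shifts can be sketched as below. This is a minimal illustration, not code from the patch: the helper names sdiv_pow2 and smod_pow2 are hypothetical, 0 < k < 31 is assumed, and >> on a negative int32_t is assumed to be an arithmetic shift (implementation-defined in ISO C, documented as sign-extending by GCC).

```c
#include <stdint.h>

/* Hypothetical helper: x / (1 << k) with C's truncation toward zero.
   A plain arithmetic shift rounds toward minus infinity, so negative
   inputs are first biased by (2^k - 1).  Assumes 0 < k < 31.  */
static int32_t sdiv_pow2(int32_t x, unsigned k)
{
    int32_t bias = (x < 0) ? (int32_t)((1u << k) - 1) : 0;
    return (x + bias) >> k;   /* assumed arithmetic right shift */
}

/* Hypothetical helper: x % (1 << k), derived from the quotient so the
   remainder keeps the sign of the dividend, as in C.  */
static int32_t smod_pow2(int32_t x, unsigned k)
{
    int32_t q = sdiv_pow2(x, k);
    return x - q * (int32_t)(1u << k);
}
```

Every operation here (compare, select, add, shift, multiply by a constant) has an element-wise vector form, which is presumably what makes a loop of such divisions a candidate for pattern recognition in the vectorizer.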
> 2011-12-14  Jakub Jelinek  <jakub@redhat.com>
>
> 	* tree-vectorizer.h (NUM_PATTERNS): Bump to 10.
> 	* tree-vect-patterns.c (vect_recog_sdivmod_pow2_pattern): New
> 	function.
> 	(vect_vect_recog_func_ptrs): Add it.
>
> 	* config/i386/sse.md (vcond<V_256:mode><VI_256:mode>,
> 	vcond<V_128:mode><VI124_128:mode>, vcond<VI8F_128:mode>v2di):
> 	Use general_operand instead of nonimmediate_operand for
> 	operand 5 and no predicate for operands 1 and 2.
> 	* config/i386/i386.c (ix86_expand_int_vcond): Optimize
> 	x < 0 ? -1 : 0 and x < 0 ? 1 : 0 into vector arithmetic
> 	resp. logical shift.
>
> 	* gcc.dg/vect/vect-sdivmod-1.c: New test.
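
The two conditionals named in the i386.c entry above correspond to simple per-element shift identities. A scalar sketch for 32-bit elements follows; the function names neg_mask and neg_flag are hypothetical, and the signed >> is assumed to be an arithmetic shift, as GCC implements it.

```c
#include <stdint.h>

/* x < 0 ? -1 : 0: an arithmetic shift by (width - 1) copies the sign
   bit into every bit position.  Assumes sign-extending >>.  */
static int32_t neg_mask(int32_t x)
{
    return x >> 31;
}

/* x < 0 ? 1 : 0: a logical shift by (width - 1) moves the sign bit
   into bit 0 and zero-fills the rest.  */
static int32_t neg_flag(int32_t x)
{
    return (int32_t)((uint32_t)x >> 31);
}
```

In vector form each identity replaces a compare plus a blend with a single shift, which fits the description of the i386 parts as a small optimization on top of the pattern recognizer.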
Jakub