This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH][i386] Add some obvious missing vectorizer patterns for AVX


On Wed, May 12, 2010 at 8:06 AM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Wed, May 12, 2010 at 4:37 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Wed, May 12, 2010 at 6:37 AM, Richard Guenther
>> <richard.guenther@gmail.com> wrote:
>>> On Wed, May 12, 2010 at 3:09 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>> On Wed, May 12, 2010 at 1:40 AM, Richard Guenther <rguenther@suse.de> wrote:
>>>>> On Tue, 11 May 2010, H.J. Lu wrote:
>>>>>
>>>>>> On Mon, May 10, 2010 at 6:02 AM, Richard Guenther <rguenther@suse.de> wrote:
>>>>>> >
>>>>>> > This adds patterns that do not require much thought.  I duplicated
>>>>>> > the existing (but odd to me) superfluous vec_concats for example
>>>>>> > in vec_unpacks_hi_v8sf (AVX would have vextract for a
>>>>>> > highpart vec_select - but there must be a reason to do it the
>>>>>> > odd way for SSE).
>>>>>> >
>>>>>> > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
>>>>>> >
>>>>>> > Ok for trunk?
>>>>>> >
>>>>>> > Thanks,
>>>>>> > Richard.
>>>>>> >
>>>>>> > 2010-05-10  Richard Guenther  <rguenther@suse.de>
>>>>>> >
>>>>>> >        * config/i386/sse.md (reduc_splus_v8sf): Add.
>>>>>> >        (reduc_splus_v4df): Likewise.
>>>>>> >        (vec_unpacks_hi_v8sf): Likewise.
>>>>>> >        (vec_unpacks_lo_v8sf): Likewise.
>>>>>> >        (*avx_cvtps2pd256_2): Likewise.
>>>>>> >        (vec_unpacks_float_hi_v8si): Likewise.
>>>>>> >        (vec_unpacks_float_lo_v8si): Likewise.
>>>>>> >        (vec_interleave_highv4df): Likewise.
>>>>>> >        (vec_interleave_lowv4df): Likewise.
>>>>>> >
>>>>>>
>>>>>> >
>>>>>> > + (define_insn "vec_interleave_highv4df"
>>>>>> > +   [(set (match_operand:V4DF 0 "register_operand" "=x")
>>>>>> > +       (vec_select:V4DF
>>>>>> > +         (vec_concat:V8DF
>>>>>> > +           (match_operand:V4DF 1 "register_operand" "x")
>>>>>> > +           (match_operand:V4DF 2 "nonimmediate_operand" "xm"))
>>>>>> > +         (parallel [(const_int 2) (const_int 6)
>>>>>> > +                    (const_int 3) (const_int 7)])))]
>>>>>> > +   "TARGET_AVX"
>>>>>> > +   "vunpckhpd\t{%2, %1, %0|%0, %1, %2}"
>>>>>> > +   [(set_attr "type" "sselog")
>>>>>> > +    (set_attr "prefix" "vex")
>>>>>> > +    (set_attr "mode" "V4DF")])
>>>>>> > +
>>>>>>
>>>>>> Those patterns are incorrect. For example, there is
>>>>>>
>>>>>> (define_insn "avx_unpckhpd256"
>>>>>>   [(set (match_operand:V4DF 0 "register_operand" "=x")
>>>>>>         (vec_select:V4DF
>>>>>>           (vec_concat:V8DF
>>>>>>             (match_operand:V4DF 1 "register_operand" "x")
>>>>>>             (match_operand:V4DF 2 "nonimmediate_operand" "xm"))
>>>>>>           (parallel [(const_int 1) (const_int 5)
>>>>>>                      (const_int 3) (const_int 7)])))]
>>>>>>   "TARGET_AVX"
>>>>>>   "vunpckhpd\t{%2, %1, %0|%0, %1, %2}"
>>>>>>   [(set_attr "type" "sselog")
>>>>>>    (set_attr "prefix" "vex")
>>>>>>    (set_attr "mode" "V4DF")])
>>>>>>
>>>>>> We can't have the same instructions with different elements.
>>>>>
>>>>> Hm, right.  So there's no suitable 256bit instructions for
>>>>> vec_interleave with v4df nor v8sf mode?
>>>>>
>>>>
>>>> That is correct. We have 2 choices:
>>>>
>>>> 1. Extend vectorizer to efficiently support 256bit AVX.
>>>> 2. Use define_expand. I have some patches for it. The code looks bad:
>>> 3. Do not use 256bit vectors in these cases
>>>
>>> I guess 1. and 3. are more useful, with the patterns available the
>>> vectorizer will make unconditional use of them (I can't assess how
>>> bad the code generation actually is though).
>>>
>>
>> How hard to choose a vector size based on available
>> patterns? In loop, there may be supported patterns
>> and unsupported patterns for 256bit vectors. Where
>> do we draw the line?
>
> We will have to come up with something (I have some ideas).
> I'll blame it on the AVX folks that they again designed something
> completely non-symmetrical ...
>

Even 128bit SSE vector instructions aren't a natural fit for the gcc
vectorizer infrastructure.

-- 
H.J.
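
The element-selection point made earlier in the thread can be checked with a small scalar model. The sketch below is only an illustration, not code from the patch or from GCC: all function names are hypothetical, and plain arrays stand in for V4DF registers. It models `(vec_select (vec_concat a b) (parallel [...]))`, compares the in-lane selection of 256-bit `vunpckhpd` (elements 1, 5, 3, 7, as in the existing `avx_unpckhpd256` pattern) with the 2, 6, 3, 7 selection in the proposed `vec_interleave_highv4df`, and shows one possible way, assumed here, to build the latter from two in-lane unpacks followed by a cross-lane step.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar model of (vec_select:V4DF (vec_concat:V8DF a b) (parallel idx)):
   indices 0..3 pick from a, indices 4..7 pick from b.
   out must not alias a or b. */
static void vec_select_concat_v4df(const double a[4], const double b[4],
                                   const int idx[4], double out[4])
{
    for (size_t i = 0; i < 4; i++) {
        assert(idx[i] >= 0 && idx[i] < 8);
        out[i] = idx[i] < 4 ? a[idx[i]] : b[idx[i] - 4];
    }
}

/* 256-bit vunpckhpd works within each 128-bit lane: { a1, b1, a3, b3 }. */
static void unpckhpd256_model(const double a[4], const double b[4],
                              double out[4])
{
    static const int idx[4] = {1, 5, 3, 7};
    vec_select_concat_v4df(a, b, idx, out);
}

/* 256-bit vunpcklpd, likewise in-lane: { a0, b0, a2, b2 }. */
static void unpcklpd256_model(const double a[4], const double b[4],
                              double out[4])
{
    static const int idx[4] = {0, 4, 2, 6};
    vec_select_concat_v4df(a, b, idx, out);
}

/* Selection used by the proposed vec_interleave_highv4df pattern:
   { a2, b2, a3, b3 }. */
static void interleave_high_v4df_model(const double a[4], const double b[4],
                                       double out[4])
{
    static const int idx[4] = {2, 6, 3, 7};
    vec_select_concat_v4df(a, b, idx, out);
}

/* Hypothetical multi-instruction expansion (an assumption, not the
   define_expand mentioned in the thread): do both in-lane unpacks, then
   combine the upper 128-bit lane of each result, which is the kind of
   cross-lane move a permute such as vperm2f128 provides. */
static void interleave_high_via_unpck(const double a[4], const double b[4],
                                      double out[4])
{
    double lo[4], hi[4];
    unpcklpd256_model(a, b, lo);   /* a0 b0 | a2 b2 */
    unpckhpd256_model(a, b, hi);   /* a1 b1 | a3 b3 */
    out[0] = lo[2];                /* a2 */
    out[1] = lo[3];                /* b2 */
    out[2] = hi[2];                /* a3 */
    out[3] = hi[3];                /* b3 */
}
```

With a = {0, 1, 2, 3} and b = {10, 11, 12, 13}, the `vunpckhpd` model gives {1, 11, 3, 13} while the interleave-high selection gives {2, 12, 3, 13}, so a single `vunpckhpd` cannot implement the proposed pattern.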

