This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: [PATCH] Support addsub/subadd as non-isomorphic operations for SLP vectorizer.

From: Cong Hou <congh at google dot com>
To: Uros Bizjak <ubizjak at gmail dot com>
Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, Richard Biener <rguenther at suse dot de>
Date: Mon, 18 Nov 2013 16:52:56 -0800
Subject: Re: [PATCH] Support addsub/subadd as non-isomorphic operations for SLP vectorizer.
Authentication-results: sourceware.org; auth=none
References: <CAFULd4a0MCQfZ_GNr6BAcDNF-SKkMB=Kc8+SZqt1-PXnbT6scw at mail dot gmail dot com> <CAK=A3=1czu4oujPNjj5DArX96KUZmwpCB9BsCtLrV9L=TRfXvg at mail dot gmail dot com> <CAFULd4aknT5co=Dqhi4USj=dSXzLGR5QBKB=HSm8kRTSnTghUw at mail dot gmail dot com>

On Mon, Nov 18, 2013 at 12:27 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Mon, Nov 18, 2013 at 9:15 PM, Cong Hou <congh@google.com> wrote:
>
>>>> This patch adds support for two non-isomorphic operations, addsub
>>>> and subadd, to the SLP vectorizer. More non-isomorphic operations can be
>>>> added later, but the limitation is that operations on even/odd
>>>> elements should still be isomorphic. Once such an operation is
>>>> detected, the code of the operation used in vectorized code is stored
>>>> and later will be used during statement transformation. Two new GIMPLE
>>>> operations, VEC_ADDSUB_EXPR and VEC_SUBADD_EXPR, are defined, along with
>>>> new optabs for them. They are also documented.
>>>>
>>>> Target support for SSE/SSE2/SSE3/AVX is added for those two new
>>>> operations on floating-point types. SSE3/AVX provide the ADDSUBPD and
>>>> ADDSUBPS instructions. For SSE/SSE2, those two operations are emulated
>>>> using two instructions (selectively negate, then add).
>>>
>>> +(define_expand "vec_subadd_v4sf3"
>>> +  [(set (match_operand:V4SF 0 "register_operand")
>>> + (unspec:V4SF
>>> +  [(match_operand:V4SF 1 "register_operand")
>>> +   (match_operand:V4SF 2 "nonimmediate_operand")] UNSPEC_SUBADD))]
>>> +  "TARGET_SSE"
>>> +{
>>> +  if (TARGET_SSE3)
>>> +    emit_insn (gen_sse3_addsubv4sf3 (operands[0], operands[1], operands[2]));
>>> +  else
>>> +    ix86_sse_expand_fp_addsub_operator (true, V4SFmode, operands);
>>> +  DONE;
>>> +})
>>>
>>> Make the expander pattern look like the corresponding sse3 insn and:
>>> ...
>>> {
>>>   if (!TARGET_SSE3)
>>>     {
>>>       ix86_sse_expand_fp_...();
>>>       DONE;
>>>     }
>>> }
>>>
>>
>> You mean I should write two expanders for SSE and SSE3 respectively?
>
> No, please use the same approach as you did for abs<mode>2 expander.
> For !TARGET_SSE3, call the helper function (ix86_sse_expand...),
> otherwise expand through pattern. Also, it looks to me that you should
> partially expand in the pattern before calling helper function, mainly
> to avoid a bunch of "if (...)" at the beginning of the helper
> function.
>
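
For context, the SSE/SSE2 fallback from the patch description
("selectively negate then add") can be modelled in scalar C. This is
only a sketch: the function names are hypothetical and do not come from
the patch, and it assumes the lane convention of the SSE3 ADDSUBPS
instruction (subtract in even lanes, add in odd lanes), which is what
the quoted vec_subadd_v4sf3 expander maps to.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reference semantics (assumed convention): subtract in even lanes,
   add in odd lanes, as ADDSUBPS does.  */
static void subadd_ref(const float a[4], const float b[4], float out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (i % 2 == 0) ? a[i] - b[i] : a[i] + b[i];
}

/* Flip the IEEE-754 sign bit, the scalar analogue of XORing a lane
   with a sign-bit mask.  */
static float flip_sign(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits ^= UINT32_C(0x80000000);
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Emulation in two steps: negate the even lanes of b, then do a plain
   lane-wise add.  */
static void subadd_emulated(const float a[4], const float b[4], float out[4])
{
    float t[4];
    for (int i = 0; i < 4; i++)
        t[i] = (i % 2 == 0) ? flip_sign(b[i]) : b[i];
    for (int i = 0; i < 4; i++)
        out[i] = a[i] + t[i];
}

/* Hypothetical self-check: both forms agree on a small sample.  */
static void check_subadd_emulation(void)
{
    const float a[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    const float b[4] = {10.0f, 20.0f, 30.0f, 40.0f};
    float ref[4], emu[4];
    subadd_ref(a, b, ref);
    subadd_emulated(a, b, emu);
    for (int i = 0; i < 4; i++)
        assert(ref[i] == emu[i]);
}
```

The two forms agree because a - b and a + (-b) give the same IEEE
result, so the negation can be folded into a single sign-mask XOR ahead
of an ordinary vector add.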


I see what you mean. In that case I have to change the pattern matched
by sse3_addsubv4sf3, so that it can handle ADDSUB_EXPR for SSE3.

Currently I am considering using Richard's method, which is based on
pattern matching and does not create new tree nodes or optabs. I will
handle SSE2 and SSE3 separately, with define_expand and define_insn. The
current problem is that the pattern may span more than four
instructions, which is more than the combine pass can process.
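
To make the pattern-matching idea concrete: one way to express the
operation without a new tree code is a full add, a full subtract, and a
per-lane selection between the two results, which the back end then has
to recognize as a single instruction. The scalar C model below is a
sketch under that assumption; the function names and the mask encoding
are illustrative and are not taken from GCC.

```c
#include <assert.h>

enum { LANES = 4 };

/* Assumed encoding: bit i of the mask set means lane i takes the sum,
   clear means it takes the difference.  0xA (binary 1010) gives
   subtract in even lanes and add in odd lanes.  */
static void subadd_by_select(const float a[LANES], const float b[LANES],
                             float out[LANES])
{
    const unsigned mask = 0xA;
    float sum[LANES], diff[LANES];

    /* Two isomorphic vector operations over all lanes.  */
    for (int i = 0; i < LANES; i++) {
        sum[i] = a[i] + b[i];
        diff[i] = a[i] - b[i];
    }

    /* One selection that merges them lane by lane.  */
    for (int i = 0; i < LANES; i++)
        out[i] = ((mask >> i) & 1u) ? sum[i] : diff[i];
}

/* Hypothetical self-check on a small sample.  */
static void check_subadd_by_select(void)
{
    const float a[LANES] = {1.0f, 2.0f, 3.0f, 4.0f};
    const float b[LANES] = {10.0f, 20.0f, 30.0f, 40.0f};
    float out[LANES];
    subadd_by_select(a, b, out);
    assert(out[0] == -9.0f && out[1] == 22.0f);
    assert(out[2] == -27.0f && out[3] == 44.0f);
}
```

In this model the add, the subtract, and the selection are already three
separate steps, so any extra instructions needed to set up the selection
push the sequence toward the limit mentioned above.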

I am considering how to reduce the number of instructions in the
pattern to four.

Thank you very much!


Cong



> Uros.
