This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH] Support addsub/subadd as non-isomorphic operations for SLP vectorizer.
- From: Richard Earnshaw <rearnsha at arm dot com>
- To: Cong Hou <congh at google dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, Richard Biener <rguenther at suse dot de>
- Date: Tue, 19 Nov 2013 09:51:39 +0000
- Subject: Re: [PATCH] Support addsub/subadd as non-isomorphic operations for SLP vectorizer.
- Authentication-results: sourceware.org; auth=none
- References: <CAK=A3=0jwnwHsyoOnPq1yDwSC=KJHfXwgpyMwvOaCLtqCTo_5g at mail dot gmail dot com> <5286655A dot 90102 at arm dot com> <CAK=A3=0MzQ3Rpa2h-6C-XM9=SUfiFXNvfgYLP_FoHXGoo3FZxg at mail dot gmail dot com>
On 18/11/13 20:19, Cong Hou wrote:
> On Fri, Nov 15, 2013 at 10:18 AM, Richard Earnshaw <rearnsha at arm dot com> wrote:
>> On 15/11/13 02:06, Cong Hou wrote:
>>> This patch adds support for two non-isomorphic operations, addsub
>>> and subadd, to the SLP vectorizer. More non-isomorphic operations can be
>>> added later, but the limitation is that operations on even/odd
>>> elements should still be isomorphic. Once such an operation is
>>> detected, the code of the operation used in vectorized code is stored
>>> and later will be used during statement transformation. Two new GIMPLE
>>> operations, VEC_ADDSUB_EXPR and VEC_SUBADD_EXPR, are defined, along with
>>> new optabs for them. They are also documented.
>> Notwithstanding what Richi has already said on this subject, you
>> certainly don't need both VEC_ADDSUB_EXPR and VEC_SUBADD_EXPR. The
>> latter can always be formed by vec-negating the second operand and
>> passing it to VEC_ADDSUB_EXPR.
> Right. But I also considered targets without support for addsub
> instructions. There we could still selectively negate odd/even elements
> using masks and then use PLUS_EXPR (at most 2 instructions). If I
> implement VEC_SUBADD_EXPR by negating the second operand and then using
> VEC_ADDSUB_EXPR, I end up with one more instruction.
No, you don't, since, as Richi has mentioned elsewhere, two RTL
operations in a single pattern don't imply two instructions.
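
The identity behind the point about needing only one tree code can be
sketched in scalar C. This is only an illustration, not code from the
patch: the function names, the fixed LANES width, and the lane convention
(even lanes add, odd lanes subtract for addsub) are all assumptions made
for the example. The identity itself does not depend on which convention
is chosen.

```c
#include <stddef.h>

#define LANES 4  /* hypothetical vector width, chosen for illustration */

/* Assumed convention: even lanes add, odd lanes subtract. */
static void vec_addsub(const float *a, const float *b, float *out)
{
    for (size_t i = 0; i < LANES; i++)
        out[i] = (i % 2 == 0) ? a[i] + b[i] : a[i] - b[i];
}

/* The mirror operation: even lanes subtract, odd lanes add. */
static void vec_subadd(const float *a, const float *b, float *out)
{
    for (size_t i = 0; i < LANES; i++)
        out[i] = (i % 2 == 0) ? a[i] - b[i] : a[i] + b[i];
}

/* subadd expressed as addsub applied to the negated second operand. */
static void vec_subadd_via_addsub(const float *a, const float *b, float *out)
{
    float neg_b[LANES];

    for (size_t i = 0; i < LANES; i++)
        neg_b[i] = -b[i];
    vec_addsub(a, neg_b, out);
}
```

Calling vec_subadd and vec_subadd_via_addsub on the same inputs gives the
same lanes, since a - b and a + (-b) are the same value. Whether the
negation costs a separate instruction on a given target is a matter for
that target's patterns, which this scalar sketch does not model.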