This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Introducing SAD (Sum of Absolute Differences) operation to GCC vectorizer.
- From: James Greenhalgh <james dot greenhalgh at arm dot com>
- To: Cong Hou <congh at google dot com>
- Cc: Uros Bizjak <ubizjak at gmail dot com>, "ramana dot gcc at googlemail dot com" <ramana dot gcc at googlemail dot com>, Richard Biener <rguenther at suse dot de>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Fri, 8 Nov 2013 10:55:21 +0000
- Subject: Re: [PATCH] Introducing SAD (Sum of Absolute Differences) operation to GCC vectorizer.
> On Tue, Nov 5, 2013 at 9:58 AM, Cong Hou <congh@google.com> wrote:
> > Thank you for your detailed explanation.
> >
> > Once GCC detects a reduction operation, it will automatically
> > accumulate all elements in the vector after the loop. In the loop the
> > reduction variable is always a vector whose elements are reductions of
> > corresponding values from other vectors. Therefore in your case the
> > only instruction you need to generate is:
> >
> > VABAL ops[3], ops[1], ops[2]
> >
> > It is OK to accumulate the elements into a single lane of the vector
> > inside the loop (if one instruction can do this), but you have to make
> > sure the other elements of the vector remain zero so that the final
> > result is correct.
> >
> > If you are confused about the documentation, check the one for
> > udot_prod (just above usad in md.texi), as its behavior is very
> > similar to that of usad. In fact, I copied the text from there and
> > made some changes. As those two instruction patterns are both for
> > vectorization, their behavior should not be difficult to explain.
> >
> > If you have more questions or think that the documentation is still
> > inadequate, please let me know.

Hi Cong,

Thanks for your reply.

I've looked at Dorit's original patch adding WIDEN_SUM_EXPR and
DOT_PROD_EXPR and I see that the same ambiguity exists for
DOT_PROD_EXPR. Can you please add a note in your tree.def
that SAD_EXPR, like DOT_PROD_EXPR, can be expanded as either:

  tmp = WIDEN_MINUS_EXPR (arg1, arg2)
  tmp2 = ABS_EXPR (tmp)
  arg3 = PLUS_EXPR (tmp2, arg3)

or:

  tmp = WIDEN_MINUS_EXPR (arg1, arg2)
  tmp2 = ABS_EXPR (tmp)
  arg3 = WIDEN_SUM_EXPR (tmp2, arg3)

Where WIDEN_MINUS_EXPR is a signed MINUS_EXPR, returning a
value of the same (widened) type as arg3.
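
To make the ambiguity concrete, here is a rough scalar model in C. It is
only a sketch, not how GCC or any target implements the pattern: the lane
counts (8 inputs, 4 accumulator lanes) and all function names are
hypothetical, and the two functions are just two plausible ways a target
might distribute the partial sums across the accumulator. What they share
is the property the vectorizer relies on, namely that summing all
accumulator lanes after the loop gives the same total.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sizes: 8 narrow input lanes, 4 widened accumulator lanes. */
enum { IN_LANES = 8, ACC_LANES = 4 };

/* Widening signed subtract followed by abs, for one pair of elements. */
uint32_t abs_diff(uint8_t x, uint8_t y)
{
    return (uint32_t)abs((int)x - (int)y);
}

/* Model 1 (assumption): every accumulator lane receives a share of the
   absolute differences, so each lane holds a partial sum. */
void sad_step_lanewise(uint32_t acc[ACC_LANES],
                       const uint8_t a[IN_LANES],
                       const uint8_t b[IN_LANES])
{
    for (int i = 0; i < IN_LANES; i++)
        acc[i % ACC_LANES] += abs_diff(a[i], b[i]);
}

/* Model 2 (assumption): all absolute differences are summed into lane 0;
   the other lanes are left untouched and so stay zero if they start
   at zero. */
void sad_step_single_lane(uint32_t acc[ACC_LANES],
                          const uint8_t a[IN_LANES],
                          const uint8_t b[IN_LANES])
{
    uint32_t sum = 0;
    for (int i = 0; i < IN_LANES; i++)
        sum += abs_diff(a[i], b[i]);
    acc[0] += sum;
}

/* The reduction epilogue: add up all accumulator lanes after the loop. */
uint32_t reduce_lanes(const uint32_t acc[ACC_LANES])
{
    uint32_t total = 0;
    for (int i = 0; i < ACC_LANES; i++)
        total += acc[i];
    return total;
}
```

Starting from a zeroed accumulator, the two step functions leave different
values in the individual lanes, but reduce_lanes() returns the same result
for both. That is why I think the note in tree.def should say that either
form is acceptable.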

Also, while looking for the history of DOT_PROD_EXPR I spotted this
patch:

  [autovect] [patch] detect mult-hi and sad patterns
  http://gcc.gnu.org/ml/gcc-patches/2005-10/msg01394.html

I wonder what the reason was for that patch to be dropped?

Thanks,
James