This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH PR69848/partial]Propagate comparison into VEC_COND_EXPR if target supports
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Bin Cheng <Bin dot Cheng at arm dot com>,"gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Cc: nd <nd at arm dot com>
- Date: Fri, 13 May 2016 18:53:59 +0200
- Subject: Re: [PATCH PR69848/partial]Propagate comparison into VEC_COND_EXPR if target supports
- References: <DB5PR08MB1144B63987076D3F48A67BDDE7740 at DB5PR08MB1144 dot eurprd08 dot prod dot outlook dot com>
On May 13, 2016 6:02:27 PM GMT+02:00, Bin Cheng <Bin.Cheng@arm.com> wrote:
>Hi,
>As PR69848 reports, the GCC vectorizer now generates the comparison
>outside of VEC_COND_EXPR for the COND_REDUCTION case, as below:
>
> _20 = vect__1.6_8 != { 0, 0, 0, 0 };
> vect_c_2.8_16 = VEC_COND_EXPR <_20, { 0, 0, 0, 0 }, vect_c_2.7_13>;
> _21 = VEC_COND_EXPR <_20, ivtmp_17, _19>;
>
>This results in inefficient expansion. With IR like:
>
> vect_c_2.8_16 = VEC_COND_EXPR <vect__1.6_8 != { 0, 0, 0, 0 }, { 0, 0, 0, 0 }, vect_c_2.7_13>;
> _21 = VEC_COND_EXPR <vect__1.6_8 != { 0, 0, 0, 0 }, ivtmp_17, _19>;
>
>We can do:
>1) Expand-time optimization, for example inverting the comparison
>operator by swapping the VEC_COND_EXPR operands. This is useful when
>the backend supports only some comparison operators.
>2) For backends that do not support vcond_mask patterns, save the one
>LT_EXPR instruction that is introduced by expand_vec_cond_expr.
>
>This patch fixes this by propagating the comparison into VEC_COND_EXPR
>even if it's used multiple times. Currently, GCC only does
>single_use_only propagation. Ideally, we would duplicate the comparison
>before each use statement just before expanding, so that TER can
>successfully backtrack it from each VEC_COND_EXPR. Unfortunately I
>didn't find a good pass to do this. tree-vect-generic.c looks like a
>good candidate, but it runs so early that a following CSE could undo
>the transform. Another possible fix is to generate the comparison
>inside VEC_COND_EXPR directly in function vectorizable_reduction.
I prefer this for now.
Richard.
>As for possible comparison CSE opportunities, I checked that it's
>simple enough to be handled by RTL CSE.
>
>Bootstrapped and tested on x86_64 and AArch64. Any comments?
>
>Thanks,
>bin
>
>2016-05-12 Bin Cheng <bin.cheng@arm.com>
>
> PR tree-optimization/69848
> * optabs-tree.c (expand_vcond_mask_p, expand_vcond_p): New.
> (expand_vec_cmp_expr_p): Call above functions.
> * optabs-tree.h (expand_vcond_mask_p, expand_vcond_p): New.
> * tree-ssa-forwprop.c (optabs-tree.h): Include header file.
> (forward_propagate_into_cond): Propagate multiple uses for
> VEC_COND_EXPR.