This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
On Fri, May 13, 2016 at 5:53 PM, Richard Biener <richard.guenther@gmail.com> wrote:
> On May 13, 2016 6:02:27 PM GMT+02:00, Bin Cheng <Bin.Cheng@arm.com> wrote:
>>Hi,
>>As PR69848 reported, GCC vectorizer now generates comparison outside of
>>VEC_COND_EXPR for COND_REDUCTION case, as below:
>>
>>  _20 = vect__1.6_8 != { 0, 0, 0, 0 };
>>  vect_c_2.8_16 = VEC_COND_EXPR <_20, { 0, 0, 0, 0 }, vect_c_2.7_13>;
>>  _21 = VEC_COND_EXPR <_20, ivtmp_17, _19>;
>>
>>This results in inefficient expanding.  With IR like:
>>
>>  vect_c_2.8_16 = VEC_COND_EXPR <vect__1.6_8 != { 0, 0, 0, 0 }, { 0, 0, 0, 0 }, vect_c_2.7_13>;
>>  _21 = VEC_COND_EXPR <vect__1.6_8 != { 0, 0, 0, 0 }, ivtmp_17, _19>;
>>
>>We can do:
>>1) Expanding time optimization, for example, reverting comparison
>>operator by switching VEC_COND_EXPR operands.  This is useful when
>>backend only supports some comparison operators.
>>2) For backend not supporting vcond_mask patterns, saving one LT_EXPR
>>instruction which introduced by expand_vec_cond_expr.
>>
>>This patch fixes this by propagating comparison into VEC_COND_EXPR even
>>if it's used multiple times.  For now, GCC does single_use_only
>>propagation.  Ideally, we may duplicate the comparison before each use
>>statement just before expanding, so that TER can successfully backtrack
>>it from each VEC_COND_EXPR.  Unfortunately I didn't find a good pass to
>>do this.  Tree-vect-generic.c looks like a good candidate, but it's so
>>early that following CSE could undo the transform.  Another possible
>>fix is to generate comparison inside VEC_COND_EXPR directly in function
>>vectorizable_reduction.
>
> I prefer this for now.

Hi Richard, you mean this patch, or the possible fix before your comment?
Here is an updated patch addressing comment issue pointed out by
Bernhard Reutner-Fischer.  Thanks.

Thanks,
bin

>
> Richard.
>
>>As for possible comparison CSE opportunities, I checked that it's
>>simple enough to be handled by RTL CSE.
>>
>>Bootstrap and test on x86_64 and AArch64.  Any comments?
>>
>>Thanks,
>>bin
>>
>>2016-05-12  Bin Cheng  <bin.cheng@arm.com>
>>
>>    PR tree-optimization/69848
>>    * optabs-tree.c (expand_vcond_mask_p, expand_vcond_p): New.
>>    (expand_vec_cmp_expr_p): Call above functions.
>>    * optabs-tree.h (expand_vcond_mask_p, expand_vcond_p): New.
>>    * tree-ssa-forwprop.c (optabs-tree.h): Include header file.
>>    (forward_propagate_into_cond): Propgate multiple uses for
>>    VEC_COND_EXPR.
Attachment:
pr69848-1-20160515.txt
Description: Text document
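
For readers without the PR at hand, here is a minimal sketch of the kind of scalar loop that could give rise to the COND_REDUCTION IR quoted above. It is an assumption shaped after the quoted IR (`c = a[i] != 0 ? 0 : c`), not the actual test case from PR69848; the function name `cond_reduce` and its parameters are hypothetical.

```c
#include <stddef.h>

/* Hypothetical illustration, NOT the PR69848 test case.
   A conditional reduction: c is overwritten with 0 whenever a[i] != 0,
   otherwise it keeps its previous value.  When a loop of this shape is
   vectorized, the test `a[i] != 0` corresponds to the vector comparison
   that feeds the VEC_COND_EXPR statements discussed in the message.  */
int cond_reduce(const int *a, size_t n, int init)
{
  int c = init;
  for (size_t i = 0; i < n; i++)
    if (a[i] != 0)
      c = 0;
  return c;
}
```

Whether a particular GCC version actually vectorizes this loop as a COND_REDUCTION depends on the target and options; dumping the vectorizer output (for example with `-O3 -fdump-tree-vect-details`) is one way to check.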