This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
[PATCH] Improve vect_create_epilog_for_reduction (PR tree-optimization/80631)
- From: Jakub Jelinek <jakub at redhat dot com>
- To: Richard Biener <rguenther at suse dot de>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Mon, 18 Dec 2017 22:50:50 +0100
- Subject: [PATCH] Improve vect_create_epilog_for_reduction (PR tree-optimization/80631)
- Reply-to: Jakub Jelinek <jakub at redhat dot com>
Hi!
When backporting the wrong-code bugfix parts of PR80631 to 7.3, I've noticed
that we perform the optimization to use the induc_val only when reduc_fn is
IFN_REDUC_{MAX,MIN}. That is true e.g. for AVX2, but not plain SSE2, so if
we have a loop like:
int foo (int *v)
{
  int found_index = -17;
  for (int k = 0; k < 64; k++)
    if (v[k] == 77)
      found_index = k;
  return found_index;
}
we start with { -17, -17, -17... } only for avx* and can optimize away
the scalar reduc_result == -17 ? -17 : reduc_result, but for sse*
we start instead with { -1, -1, -1... } and can't optimize
reduc_result == -1 ? -17 : reduc_result.
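To make the difference concrete, here is a rough lane-by-lane model in C of what the vectorized loop computes. It is only a sketch under stated assumptions, not what GCC actually emits: LANES, foo_scalar, foo_model and the start/initial parameters are names made up for illustration, and the max reduction is written as a plain scalar loop.

```c
#include <assert.h>

#define LANES 4  /* hypothetical vector width, e.g. 4 x int with SSE2 */

/* Scalar reference: the loop from the example, with the trip count
   passed in as n.  */
static int foo_scalar(const int *v, int n)
{
    int found_index = -17;
    for (int k = 0; k < n; k++)
        if (v[k] == 77)
            found_index = k;
    return found_index;
}

/* Model of the vectorized loop.  Every lane starts at `start`, which
   must be smaller than any index the loop can record.  The epilogue
   max-reduces the lanes and maps "no lane matched" back to `initial`.  */
static int foo_model(const int *v, int n, int start, int initial)
{
    int lane[LANES];

    assert(n % LANES == 0);  /* the model has no scalar tail loop */
    assert(start < 0);       /* start must sort below every index */

    for (int i = 0; i < LANES; i++)
        lane[i] = start;

    /* Vector body: each lane keeps the last matching index it saw.  */
    for (int k = 0; k < n; k += LANES)
        for (int i = 0; i < LANES; i++)
            if (v[k + i] == 77)
                lane[i] = k + i;

    /* Epilogue: max reduction across lanes.  */
    int r = lane[0];
    for (int i = 1; i < LANES; i++)
        if (lane[i] > r)
            r = lane[i];

    /* Final scalar select.  When start == initial both arms yield r,
       so the select can be folded away.  */
    return r == start ? initial : r;
}
```

Calling foo_model (v, 64, -17, -17) corresponds to the avx* case, where the final select is a no-op; foo_model (v, 64, -1, -17) corresponds to the sse* case before the patch, where the select has to stay. Both agree with foo_scalar because either start value is smaller than every index the loop can store.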
Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?
2017-12-18  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/80631
	* tree-vect-loop.c (vect_create_epilog_for_reduction): Compare
	induc_code against MAX_EXPR or MIN_EXPR instead of reduc_fn against
	IFN_REDUC_MAX or IFN_REDUC_MIN.

--- gcc/tree-vect-loop.c.jj 2017-12-12 09:54:28.000000000 +0100
+++ gcc/tree-vect-loop.c 2017-12-15 18:56:15.426591727 +0100
@@ -4432,9 +4432,9 @@ vect_create_epilog_for_reduction (vec<tr
       && (STMT_VINFO_VEC_REDUCTION_TYPE (stmt_info)
           == INTEGER_INDUC_COND_REDUCTION)
       && !integer_zerop (induc_val)
-      && ((reduc_fn == IFN_REDUC_MAX
+      && ((induc_code == MAX_EXPR
            && tree_int_cst_lt (initial_def, induc_val))
-          || (reduc_fn == IFN_REDUC_MIN
+          || (induc_code == MIN_EXPR
               && tree_int_cst_lt (induc_val, initial_def))))
     induc_val = initial_def;
   vect_is_simple_use (initial_def, loop_vinfo, &def_stmt, &initial_def_dt);

Jakub