Bug 98848 - [10 regression] vectorizer failed to reduce max pattern since r9-1590
Summary: [10 regression] vectorizer failed to reduce max pattern since r9-1590
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 11.0
: P2 normal
Target Milestone: 11.0
Assignee: Jakub Jelinek
URL:
Keywords: missed-optimization
Depends on:
Blocks: vectorizer
  Show dependency treegraph
 
Reported: 2021-01-27 08:29 UTC by Hongtao.liu
Modified: 2023-07-07 09:22 UTC (History)
4 users (show)

See Also:
Host: x86_64-pc-linux-gnu
Target:
Build:
Known to work: 11.0
Known to fail: 10.5.0
Last reconfirmed: 2021-01-27 00:00:00


Attachments
gcc11-pr98848.patch (749 bytes, patch)
2021-02-01 10:21 UTC, Jakub Jelinek
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Hongtao.liu 2021-01-27 08:29:14 UTC
cat test.c

extern short a[9000];

int foo()
{ 
    int b;
    int i;
    b = a[0];

    for(i = 1; i < 9000; i ++) {
        if(a[i] < b) {
          b = a[i];
        }
    }
    return b;
}

gcc8 successfully vectorized the loop with option: -Ofast -march=skylake-avx512, but gcc9/10/trunk failed.

test.c:9:16: missed: couldn't vectorize loop
test.c:3:5: missed: not vectorized: relevant phi not supported: b_14 = PHI <_9(5), b_8(2)>
test.c:3:5: note: vectorized 0 loops in function.
test.c:14:10: note: ***** Analysis failed with vector mode V16HI
test.c:14:10: note: ***** Skipping vector mode V32QI, which would repeat the analysis for V16HI

It seems vect_recog_widen_op_pattern failed to handle this???
Comment 1 Hongtao.liu 2021-01-27 08:31:51 UTC
With a bit adjustment of testcase, vectorized.
@@ -2,7 +2,7 @@ extern short a[9000];
 
 int foo()
 { 
-  int b;
+  short b;
   int i;
   b = a[0];
Comment 2 Richard Biener 2021-01-27 09:51:48 UTC
Probably because

t.c:9:16: note:   vect_recog_over_widening_pattern: detected: _9 = MIN_EXPR <_3, b_14>;
t.c:9:16: note:   demoting int to signed short
t.c:9:16: note:   created pattern stmt: patt_11 = MIN_EXPR <_2, patt_12>;
t.c:9:16: note:   over_widening pattern recognized: patt_6 = (int) patt_11;
t.c:9:16: note:   extra pattern stmt: patt_12 = (signed short) b_14;
t.c:9:16: note:   extra pattern stmt: patt_11 = MIN_EXPR <_2, patt_12>;

which makes the reduction unhandled (we only support sign changing conversions,
not truncations).

We can restrict the over-widen pattern to not apply for reductions or see
to use range-info (like pattern recog does) in the reduction handling somehow.
I don't see a obvious place to add a reduction def check to
vect_recog_over_widening_pattern, maybe Richard does.
Comment 3 Jakub Jelinek 2021-01-28 17:07:12 UTC
Regressed with r9-1590-g370c2ebe8fa20e0812cd2d533d4ed38ee2d37c85
Comment 4 Jakub Jelinek 2021-01-28 17:20:24 UTC
Alternatively, couldn't we support truncation in the reductions if SSA_NAME_RANGE_INFO suggests that the values are always in the narrower range?
Comment 5 rguenther@suse.de 2021-01-29 07:31:23 UTC
On Thu, 28 Jan 2021, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98848
> 
> --- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> Alternatively, couldn't we support truncation in the reductions if
> SSA_NAME_RANGE_INFO suggests that the values are always in the narrower range?

Yes, we probably could.  But note that changes in reduction support
are quite fragile and we're currently just set up for sign changes
(via emitting V_C_E) but not required promotions/demotions so
there would be a lot of changes needed.  It's also not clear
promoting/demoting the reduction IV all the time is doing any good
(unless you suggest that we'd magically undo the pattern by promoting
the non-reduction OP instead - but that would require even more changes).

So I guess the better approach might be to somehow allow late "undoing"
of pattern recog (but it's a bit complicated because of how that 
influences VF compute and also relevant/liveness compute).
Comment 6 Jakub Jelinek 2021-02-01 09:23:06 UTC
So what about punting if the lhs of the possible over_widen pattern is a PHI on loop header?

--- gcc/tree-vect-patterns.c.jj	2021-01-04 10:25:38.650235896 +0100
+++ gcc/tree-vect-patterns.c	2021-02-01 10:13:51.755008757 +0100
@@ -1579,6 +1579,20 @@ vect_recog_over_widening_pattern (vec_in
   tree type = TREE_TYPE (lhs);
   tree_code code = gimple_assign_rhs_code (last_stmt);
 
+  /* Punt if lhs might be used in a reduction.  */
+  if (loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo))
+    {
+      use_operand_p use_p;
+      imm_use_iterator iter;
+      FOR_EACH_IMM_USE_FAST (use_p, iter, lhs)
+	{
+	  gimple *use_stmt = USE_STMT (use_p);
+	  if (gimple_code (use_stmt) == GIMPLE_PHI
+	      && gimple_bb (use_stmt) == LOOP_VINFO_LOOP (loop_vinfo)->header)
+	    return NULL;
+	}
+    }
+
   /* Keep the first operand of a COND_EXPR as-is: only the other two
      operands are interesting.  */
   unsigned int first_op = (code == COND_EXPR ? 2 : 1);

doesn't regress any vect.exp=*over-widen* tests and let's this testcase be vectorized.
Comment 7 rguenther@suse.de 2021-02-01 09:35:34 UTC
On Mon, 1 Feb 2021, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98848
> 
> --- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> So what about punting if the lhs of the possible over_widen pattern is a PHI on
> loop header?

That would be STMT_VINFO_DEF_TYPE (stmt_info) == vect_reduction_def,
elsewhere we now use vect_reassociating_reduction_p, not sure if
that woudl apply here, too.

> --- gcc/tree-vect-patterns.c.jj 2021-01-04 10:25:38.650235896 +0100
> +++ gcc/tree-vect-patterns.c    2021-02-01 10:13:51.755008757 +0100
> @@ -1579,6 +1579,20 @@ vect_recog_over_widening_pattern (vec_in
>    tree type = TREE_TYPE (lhs);
>    tree_code code = gimple_assign_rhs_code (last_stmt);
> 
> +  /* Punt if lhs might be used in a reduction.  */
> +  if (loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo))
> +    {
> +      use_operand_p use_p;
> +      imm_use_iterator iter;
> +      FOR_EACH_IMM_USE_FAST (use_p, iter, lhs)
> +       {
> +         gimple *use_stmt = USE_STMT (use_p);
> +         if (gimple_code (use_stmt) == GIMPLE_PHI
> +             && gimple_bb (use_stmt) == LOOP_VINFO_LOOP (loop_vinfo)->header)
> +           return NULL;
> +       }
> +    }
> +
>    /* Keep the first operand of a COND_EXPR as-is: only the other two
>       operands are interesting.  */
>    unsigned int first_op = (code == COND_EXPR ? 2 : 1);
> 
> doesn't regress any vect.exp=*over-widen* tests and let's this testcase be
> vectorized.
> 
>
Comment 8 Jakub Jelinek 2021-02-01 10:21:46 UTC
Created attachment 50102 [details]
gcc11-pr98848.patch

This works too.  I don't see how we could use vect_reassociating_reduction_p, that for one seems to be used in the positive checks (only recognize if reduction) and more importantly, makes heavy assumptions on what the assignment must be (while for over-widen it could be e.g. a COND_EXPR).
Comment 9 GCC Commits 2021-02-02 09:33:08 UTC
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:1592b74350a0311e4c95a0192ea9c943847e7bc0

commit r11-7034-g1592b74350a0311e4c95a0192ea9c943847e7bc0
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Tue Feb 2 10:32:23 2021 +0100

    tree-vect-patterns: Don't create over widening patterns for stmts used in reductions [PR98848]
    
    As discussed in the PR, the reduction code isn't able to cope with type
    promotions/demotions in the reduction computation, so if we recognize an
    over-widening pattern that has vect_reduction_def type, we most likely make
    it non-vectorizable.
    
    2021-02-02  Jakub Jelinek  <jakub@redhat.com>
    
            PR tree-optimization/98848
            * tree-vect-patterns.c (vect_recog_over_widening_pattern): Punt if
            STMT_VINFO_DEF_TYPE (last_stmt_info) is vect_reduction_def.
    
            * gcc.dg/vect/pr98848.c: New test.
            * gcc.dg/vect/pr92205.c: Remove xfail.
Comment 10 Jakub Jelinek 2021-02-02 09:34:03 UTC
Fixed on the trunk.  Unsure if we want to backport this.
Comment 11 Richard Biener 2021-06-01 08:19:28 UTC
GCC 9.4 is being released, retargeting bugs to GCC 9.5.
Comment 12 Richard Biener 2022-05-27 09:44:21 UTC
GCC 9 branch is being closed
Comment 13 Jakub Jelinek 2022-06-28 10:43:19 UTC
GCC 10.4 is being released, retargeting bugs to GCC 10.5.
Comment 14 Richard Biener 2023-07-07 09:22:37 UTC
Fixed in GCC 11.