Created attachment 56982 [details]
-freport-bug output

Testcase:

char a, c;
short b;
void d() {
  c = a - 1;
  b = c;
  unsigned short *e = &b;
  while (++*e > 256)
    ;
}

Command:

> /scratch/tc-testing/tc-jan-1-trunk/build-rv64gcv/bin/riscv64-unknown-linux-gnu-gcc -march=rv64gcv -O2 red.c -freport-bug

during GIMPLE pass: vect
red.c: In function 'd':
red.c:3:6: internal compiler error: tree check: expected integer_cst, have cond_expr in get_len, at tree.h:6481
    3 | void d() {
      |      ^
0xab8132 tree_check_failed(tree_node const*, char const*, int, char const*, ...)
        ../../../gcc/gcc/tree.cc:8952
0xc97cb2 tree_check(tree_node const*, char const*, int, char const*, tree_code)
        ../../../gcc/gcc/tree.h:3900
0xc97cb2 wi::extended_tree<131072>::get_len() const
        ../../../gcc/gcc/tree.h:6481
0xc97cb2 wi::int_traits<generic_wide_int<wi::extended_tree<131072> > >::decompose(long*, unsigned int, generic_wide_int<wi::extended_tree<131072> > const&)
        ../../../gcc/gcc/wide-int.h:1050
0xc97cb2 wide_int_ref_storage<true, false>::wide_int_ref_storage<generic_wide_int<wi::extended_tree<131072> > >(generic_wide_int<wi::extended_tree<131072> > const&, unsigned int)
        ../../../gcc/gcc/wide-int.h:1099
0xc97cb2 generic_wide_int<wide_int_ref_storage<true, false> >::generic_wide_int<generic_wide_int<wi::extended_tree<131072> > >(generic_wide_int<wi::extended_tree<131072> > const&, unsigned int)
        ../../../gcc/gcc/wide-int.h:855
0xc97cb2 wi::binary_traits<generic_wide_int<wi::extended_tree<131072> >, int, wi::int_traits<generic_wide_int<wi::extended_tree<131072> > >::precision_type, wi::int_traits<int>::precision_type>::result_type wi::add<generic_wide_int<wi::extended_tree<131072> >, int>(generic_wide_int<wi::extended_tree<131072> > const&, int const&)
        ../../../gcc/gcc/wide-int.h:2871
0x156f0e7 wi::binary_traits<generic_wide_int<wi::extended_tree<131072> >, int, wi::int_traits<generic_wide_int<wi::extended_tree<131072> > >::precision_type, wi::int_traits<int>::precision_type>::operator_result operator+<generic_wide_int<wi::extended_tree<131072> >, int>(generic_wide_int<wi::extended_tree<131072> > const&, int const&)
        ../../../gcc/gcc/wide-int.h:3857
0x156f0e7 vect_analyze_loop_costing
        ../../../gcc/gcc/tree-vect-loop.cc:2270
0x157ff01 vect_analyze_loop_2
        ../../../gcc/gcc/tree-vect-loop.cc:3166
0x1581a20 vect_analyze_loop_1
        ../../../gcc/gcc/tree-vect-loop.cc:3461
0x15821b9 vect_analyze_loop(loop*, vec_info_shared*)
        ../../../gcc/gcc/tree-vect-loop.cc:3619
0x15c84d4 try_vectorize_loop_1
        ../../../gcc/gcc/tree-vectorizer.cc:1066
0x15c84d4 try_vectorize_loop
        ../../../gcc/gcc/tree-vectorizer.cc:1182
0x15c8dfc execute
        ../../../gcc/gcc/tree-vectorizer.cc:1298
Please submit a full bug report, with preprocessed source.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
Preprocessed source stored into /scratch/tmp/ccdgSv2l.out file, please attach this to your bugreport.

Godbolt: https://godbolt.org/z/x851M9q7n

The ICE goes away if you change the 256 to 255.

-freport-bug output attached.
This is not a RISC-V issue, it's a middle-end issue.
Please change the title and CC Richard.

https://godbolt.org/z/1bj9xaYTa

ARM SVE has the same issue.
On 1/2/24 15:38, juzhe.zhong at rivai dot ai wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113210
>
> --- Comment #1 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
> This is not RISC-V issues, it's middle-end issue.
> Plz change the title and CC Richard.
>
> https://godbolt.org/z/1bj9xaYTa
>
> ARM SVE has the same issue.

Got it - Bugzilla seems to be down for me.  Once it's back up I'll update the PR.

Thanks,
Patrick
Confirmed on aarch64-linux-gnu with `-O3 -march=armv9-a+sve`.
is executed at most
  (short unsigned int) (a.0_1 + 255) + 1 <= 256 ? 0 : ~(short unsigned int) (a.0_1 + 255)
(bounded by 65279) + 1 times in loop 1.

Though that is the same niter as in GCC 13 ...

The code which seems to be causing this was introduced at r14-2281-g0682a32c026f1e.
Better/slightly more reduced testcase:
```
unsigned char a, c;
unsigned short b;
void d() {
  c = a - 1;
  b = c;
  while (++b > 256)
    ;
}
```
LOOP_VINFO_NITERS is INTEGER_CST 1 here, but LOOP_VINFO_NITERSM1 is that complex expression and we assume that if LOOP_VINFO_NITERS_KNOWN_P then NITERSM1 is also known, which is not the case here.
Or maybe just a bug in the PLUS_EXPR folding?  The code sets NITERSM1 to

  (short unsigned int) (a.0_1 + 255) + 1 > 256 ? ~(short unsigned int) (a.0_1 + 255) : 0

and then fold_build2s PLUS_EXPR of that and 1, and somehow it folds to 1; that doesn't sound right to me.
Now, when folding the + 1 addition just with the second operand, i.e.

  ~(short unsigned int) (a.0_1 + 255)

it correctly folds into

  -(short unsigned int) (a.0_1 + 255)

and obviously the third operand to 1.  There is also the

  /* (X + 1) > Y ? -X : 1 simplifies to X >= Y ? -X : 1 when
     X is unsigned, as when X + 1 overflows, X is -1, so -X == 1.  */
  (simplify
   (cond (gt (plus @0 integer_onep) @1) (negate @0) integer_onep@2)
   (if (TYPE_UNSIGNED (type))
    (cond (ge @0 @1) (negate @0) @2)))

match.pd rule, but that I'd think should just fold the whole thing to:

  (short unsigned int) (a.0_1 + 255) >= 256 ? -(short unsigned int) (a.0_1 + 255) : 1

Though, a.0_1 is unsigned char, so (short unsigned int) (a.0_1 + 255) + 1 > 256 is actually never true.  So I guess the folding is correct.
Created attachment 56993 [details]
gcc14-pr113210.patch

So, I'd go with this patch (so far untested).
(In reply to Jakub Jelinek from comment #8)
> Created attachment 56993 [details]
> gcc14-pr113210.patch
>
> So, I'd go with this patch (so far untested).

I think we want to keep the invariant that both are INTEGER_CST when one is.

If we can fold the add to 1, why can't we fold the original to 0?

That is, another fix might be to adjust NITERSM1 to NITERS - 1 when NITERS went constant ... (btw, I want to get rid of _NITERS and only keep _NITERSM1, given _NITERS cannot be represented when the number of latch iterations is TYPE_MAX_VALUE)
(In reply to Richard Biener from comment #9)
> I think we want to keep the invariant that both are INTEGER_CST when one is.
>
> If we can fold the add to 1 why can't we fold the original to 0?

Because fold generally isn't recursive, so either whatever created the expression (niters analysis) would need to try to fold all subexpressions, which it clearly doesn't do, or we rely on luck that something changed and we fold something because of that, which is exactly what triggers it during the + 1 folding.
(In reply to Richard Biener from comment #9)
> That is, another fix might be to adjust NITERSM1 to NITERS - 1 when
> NITERS went constant ... (btw, I want to get rid of _NITERS and only

Or we could only use fold_build2 for the PLUS_EXPR 1 computation if NITERSM1 is INTEGER_CST, otherwise use build2...
(In reply to Jakub Jelinek from comment #11)
> (In reply to Richard Biener from comment #9)
> > That is, another fix might be to adjust NITERSM1 to NITERS - 1 when
> > NITERS went constant ... (btw, I want to get rid of _NITERS and only
>
> Or we could only use fold_build2 for the PLUS_EXPR 1 computation if NITERSM1
> is INTEGER_CST, otherwise use build2...

I think we should see where the original expression is built but not folded.
(In reply to Richard Biener from comment #12)
> (In reply to Jakub Jelinek from comment #11)
> > (In reply to Richard Biener from comment #9)
> > > That is, another fix might be to adjust NITERSM1 to NITERS - 1 when
> > > NITERS went constant ... (btw, I want to get rid of _NITERS and only
> >
> > Or we could only use fold_build2 for the PLUS_EXPR 1 computation if NITERSM1
> > is INTEGER_CST, otherwise use build2...
>
> I think we should see where the original expression is built but not folded.

Hmm, probably in estimate_numbers_of_iterations,

  if (TREE_CODE (niter_desc.may_be_zero) != INTEGER_CST)
    niter = build3 (COND_EXPR, type, niter_desc.may_be_zero,
                    build_int_cst (type, 0),
                    niter);

I vaguely remember code trying to pattern match the COND_EXPR created by this (though it should instead use number_of_iterations_exit).  It should be safe to replace the above with fold_build3.
(In reply to Richard Biener from comment #13)
> Hmm, probably in estimate_numbers_of_iterations,
>
>   if (TREE_CODE (niter_desc.may_be_zero) != INTEGER_CST)
>     niter = build3 (COND_EXPR, type, niter_desc.may_be_zero,
>                     build_int_cst (type, 0),
>                     niter);
>
> I vaguely remember code trying to pattern match the COND_EXPR created
> by this (though it should instead use number_of_iterations_exit).  It
> should be safe to replace the above with fold_build3.

No, it's not that.  The COND_EXPR is folded, but we see

  (short unsigned int) (a.0_1 + 255) + 1 > 256 ? ~(short unsigned int) (a.0_1 + 255) : 0

and that isn't caught.
#0  0x00000000014d3464 in fold_build3_loc (loc=0, code=COND_EXPR, type=<integer_type 0x7ffff704b540 short unsigned int>, op0=<gt_expr 0x7ffff7207f50>, op1=<bit_not_expr 0x7ffff6e13540>, op2=<integer_cst 0x7ffff7047378>)
    at /space/rguenther/src/gcc/gcc/fold-const.cc:14174
#1  0x00000000014d0513 in fold_ternary_loc (loc=0, code=COND_EXPR, type=<integer_type 0x7ffff704b540 short unsigned int>, op0=<le_expr 0x7ffff7207f00>, op1=<integer_cst 0x7ffff7047378>, op2=<bit_not_expr 0x7ffff6e13540>)
    at /space/rguenther/src/gcc/gcc/fold-const.cc:13238
#2  0x00000000014d3436 in fold_build3_loc (loc=0, code=COND_EXPR, type=<integer_type 0x7ffff704b540 short unsigned int>, op0=<le_expr 0x7ffff7207f00>, op1=<integer_cst 0x7ffff7047378>, op2=<bit_not_expr 0x7ffff6e13540>)
    at /space/rguenther/src/gcc/gcc/fold-const.cc:14172
#3  0x0000000001d4de50 in vect_get_loop_niters (loop=0x7ffff71e3600, main_exit=0x7ffff7203990, assumptions=0x7fffffffd200, number_of_iterations=0x7fffffffd1f0, number_of_iterationsm1=0x7fffffffd1f8)
    at /space/rguenther/src/gcc/gcc/tree-vect-loop.cc:919

So I suggest either fixing that or, when adding one folds to a constant, making sure M1 is constant as well.
Created attachment 57007 [details]
gcc14-pr113210.patch

So like this instead?
(In reply to Jakub Jelinek from comment #15)
> Created attachment 57007 [details]
> gcc14-pr113210.patch
>
> So like this instead?

Yes.
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:b1d4e5b51375efd285378173375d634ad6ba8462

commit r14-7030-gb1d4e5b51375efd285378173375d634ad6ba8462
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Tue Jan 9 10:31:51 2024 +0100

    vect: Ensure both NITERSM1 and NITERS are INTEGER_CSTs or neither of them [PR113210]

    On the following testcase e.g. on riscv64 or aarch64 (latter with
    -O3 -march=armv8-a+sve) we ICE, because while NITERS is INTEGER_CST,
    NITERSM1 is a complex expression like
    (short unsigned int) (a.0_1 + 255) + 1 > 256 ? ~(short unsigned int) (a.0_1 + 255) : 0
    where a.0_1 is unsigned char.  The condition is never true, so the
    above is equivalent to just 0, but only when trying to fold the above
    with PLUS_EXPR 1 we manage to simplify it (first
    ~(short unsigned int) (a.0_1 + 255)
    to
    -(short unsigned int) (a.0_1 + 255)
    and then
    (short unsigned int) (a.0_1 + 255) + 1 > 256 ? -(short unsigned int) (a.0_1 + 255) : 1
    to
    (short unsigned int) (a.0_1 + 255) >= 256 ? -(short unsigned int) (a.0_1 + 255) : 1
    and only at this point we fold the condition to be false.

    But the vectorizer seems to assume that if NITERS is known (i.e.
    suitable INTEGER_CST) then NITERSM1 also is, so the following hack
    ensures that if NITERS folds into INTEGER_CST NITERSM1 will be one as
    well.

    2024-01-09  Jakub Jelinek  <jakub@redhat.com>

            PR tree-optimization/113210
            * tree-vect-loop.cc (vect_get_loop_niters): If non-INTEGER_CST
            value in *number_of_iterationsm1 PLUS_EXPR 1 is folded into
            INTEGER_CST, recompute *number_of_iterationsm1 as the INTEGER_CST
            minus 1.

            * gcc.c-torture/compile/pr113210.c: New test.
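Since the actual patch lives in attachment 57007, here is a C-like pseudocode sketch of the committed approach as I read the commit message above; the variable names (`niters`, `nitersm1`, `one`) are illustrative, not the real locals in tree-vect-loop.cc:

```
/* In vect_get_loop_niters: NITERS is computed by folding NITERSM1 + 1.  */
niters = fold_build2 (PLUS_EXPR, type, nitersm1, one);

/* If the addition folded to a constant while NITERSM1 itself is still a
   complex expression (the COND_EXPR in this PR), recompute NITERSM1 as
   NITERS - 1 so that both are INTEGER_CSTs or neither is.  */
if (TREE_CODE (niters) == INTEGER_CST
    && TREE_CODE (nitersm1) != INTEGER_CST)
  nitersm1 = fold_build2 (MINUS_EXPR, type, niters, one);
```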
Fixed.