Bug 105142 - [12 Regression] Wrong code with -O2 since r12-2591
Summary: [12 Regression] Wrong code with -O2 since r12-2591
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 12.0
: P1 normal
Target Milestone: 12.0
Assignee: Richard Biener
URL:
Keywords: wrong-code
Depends on:
Blocks: yarpgen
  Show dependency treegraph
 
Reported: 2022-04-03 22:13 UTC by Vsevolod Livinskii
Modified: 2022-11-11 14:05 UTC (History)
5 users (show)

See Also:
Host:
Target:
Build:
Known to work: 11.0
Known to fail:
Last reconfirmed: 2022-04-03 00:00:00


Attachments
patch to rewrite undefined overflow stmts (1.28 KB, patch)
2022-04-04 09:32 UTC, Richard Biener
Details | Diff
avoid combining the conditions (1.34 KB, patch)
2022-04-04 10:11 UTC, Richard Biener
Details | Diff
patch to rewrite undefined overflow stmts (1.33 KB, patch)
2022-04-04 12:43 UTC, Richard Biener
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Vsevolod Livinskii 2022-04-03 22:13:11 UTC
Link to the Compiler Explorer: https://godbolt.org/z/51v98a6fd

Reproducer:
#include <stdio.h>

long long int arr_248 = 3623214276426624192LL;
unsigned short arr_255;

void test() __attribute__((noipa));

char a = 42;
const long long &min(const long long &c, const long long &d) {
  return c < d ? c : d;
}
void test() { arr_255 = min(a, min(a, arr_248) + 5713568809962283044LL); }

int main() {
    test();
    printf("%u\n", arr_255);
    if (arr_255 != 42)
      __builtin_abort();
}

Error:
>$ g++ -O1 small.cpp && ./a.out
42
>$ g++ -O2 small.cpp && ./a.out
3300
Aborted (core dumped)

gcc version 12.0.1 20220401 (git://gcc.gnu.org/git/gcc.git:master 15d683d4f0b390b27c54a7c92c6e4f33195bdc93)
Comment 1 Jakub Jelinek 2022-04-03 22:20:49 UTC
Started with r12-2591-g2e96b5f14e4025691b57d2301d71aa6092ed44bc
Comment 2 Jakub Jelinek 2022-04-03 22:33:30 UTC
long long int c = 3623214276426624192LL;
unsigned short b;
char a = 42;
const long long &min(const long long &x, const long long &y) { return x < y ? x : y; }
__attribute__((noipa)) void test() { b = min(a, min(a, c) + 5713568809962283044LL); }
int main() { test(); if (b != 42) __builtin_abort(); }

ifcombine "optimizes" the IMO correct:
  <bb 2> [local count: 1073741824]:
  a.1_1 = a;
  _2 = (long long int) a.1_1;
  _16 = c;
  if (_2 < _16)
    goto <bb 4>; [50.00%]
  else
    goto <bb 3>; [50.00%]

  <bb 3> [local count: 536870912]:
  _4 = _16 + 5713568809962283044;
  if (_4 > _2)
    goto <bb 4>; [20.00%]
  else
    goto <bb 5>; [80.00%]

  <bb 4> [local count: 536870912]:

  <bb 5> [local count: 1073741824]:
  # _5 = PHI <_2(4), _4(3)>
  _6 = (short unsigned int) _5;
  b = _6;
which for the testcase jumps from bb 2 to bb4 into:
  <bb 2> [local count: 1073741824]:
  a.1_1 = a;
  _2 = (long long int) a.1_1;
  _16 = c;
  _4 = _16 + 5713568809962283044;
  if (_2 < _4)
    goto <bb 3>; [60.00%]
  else
    goto <bb 4>; [40.00%]

  <bb 3> [local count: 536870912]:

  <bb 4> [local count: 1073741824]:
  # _5 = PHI <_2(3), _4(2)>
  _6 = (short unsigned int) _5;
  b = _6;
which invokes UB on the _16 + 5713568809962283044 addition (signed integer overflow).  So, either it needs to make sure the no longer conditional addition is performed in unsigned type, or it needs to give up.
Comment 3 Richard Biener 2022-04-04 07:27:06 UTC
Let me take a stab at this.  We can enhance

        if (tree_ssa_ifcombine_bb (bb))
          {
            /* Clear range info from all stmts in BB which is now executed
               conditional on a always true/false condition.  */
            reset_flow_sensitive_info_in_bb (bb);
            cfg_changed |= true;

or adjust bb_no_side_effects_p (which already checks gimple_uses_undefined_value_p).
Comment 4 Richard Biener 2022-04-04 09:07:45 UTC
Hmm, I think the error happens earlier already when we simplify

_2 < _16 && (_16 + 5713568809962283044 > _2) to _2 < _4

I do have a patch to do the rewrite into defined overflow which definitely fixes a latent issue but not this bug.
Comment 5 Jakub Jelinek 2022-04-04 09:25:17 UTC
If so, that must be still during ifcombine though.
Because what I see is that we have effectively MIN_EXPR <a, MIN_EXPR <a, c> + large_cst> until threadfull1, which decides to thread it based on the
a < c comparison, if a < c, then the result is just a and the addition is optimized away as unused, otherwise effectively MIN_EXPR <c + large_cst, a>
is done.  I believe that is a correct transformation, because for undefined overflow arithmetics (with no wrap around) a < c && a < c + large_cst is equivalent.
Comment 6 Jakub Jelinek 2022-04-04 09:26:19 UTC
I mean ... is equivalent to a < c.
Comment 7 Richard Biener 2022-04-04 09:28:13 UTC
(In reply to Richard Biener from comment #4)
> Hmm, I think the error happens earlier already when we simplify
> 
> _2 < _16 && (_4 > _2) to _2 < _4
> 
> I do have a patch to do the rewrite into defined overflow which definitely
> fixes a latent issue but not this bug.

Which happens through

Applying pattern match.pd:6775, gimple-match.cc:54944
Applying pattern match.pd:3153, gimple-match.cc:177685

_2 < _16 & _2 < _4

match.pd:6775 -> _2 < min (_16, _4)

and

min (_16, _16 + 5713568809962283044)

match.pd:3153 -> _16 + 5713568809962283044 (aka _4)

which is all "nice" and validates that we can replace the inner condition
with itself but it ignores that it only subsumes the outer condition if
the outer condition is true.

That's a real tough nut to crack.  It basically means that we cannot
really combine two conditions if the definition chain of the inner
condition is under the outer condition (and involves undefined overflow),
unless we somehow can decide whether the outer condition tests for overflow
of any of the guarded stmts.  In practice this means not simplifying
interesting cases I guess and likely fold-const.cc code is susceptible
to the very same issue.

A "simple" workaround would pass in context BBs to maybe_fold_and_comparisons
and adjust maybe_fold_comparisons_from_match_pd with a custom
follow_all_ssa_edges variant, disallowing (signed) operation expansion from
the inner BB.  I will experiment with that to see what the fallout would be.
Note combining will have similar issues with using flow-sensitive range info
from inner BB defined stmts constrained by the outer check.
Comment 8 Richard Biener 2022-04-04 09:32:38 UTC
Created attachment 52744 [details]
patch to rewrite undefined overflow stmts

Here's the initial patch, not solving the testcase.
Comment 9 Richard Biener 2022-04-04 10:11:21 UTC
Created attachment 52745 [details]
avoid combining the conditions

Like this prototype which fixes the testcase and avoids combining from stmts
defined in the middle BB.  It could be enhanced by only considering not
combining from stmts with undefined overflow behavior (or flow-sensitive info).

This patch is also incomplete, maybe_fold_and_comparisons and
maybe_fold_or_comparisons is used by if-conversion, ifcombine and reassoc
and more importantly through recursion with itself and related simplifications
in gimple-fold.cc.  It's not clear where to extract the context BB from and
I didn't follow all the flow through the various routines.  Esp. if-conversion
looks susceptible, reassoc might restrict itself to conditions from the same
BB, not sure.
Comment 10 Richard Biener 2022-04-04 11:20:18 UTC
(In reply to Richard Biener from comment #9)
> Created attachment 52745 [details]
> avoid combining the conditions
> 
> Like this prototype which fixes the testcase and avoids combining from stmts
> defined in the middle BB.  It could be enhanced by only considering not
> combining from stmts with undefined overflow behavior (or flow-sensitive
> info).
> 
> This patch is also incomplete, maybe_fold_and_comparisons and
> maybe_fold_or_comparisons is used by if-conversion, ifcombine and reassoc
> and more importantly through recursion with itself and related
> simplifications
> in gimple-fold.cc.  It's not clear where to extract the context BB from and
> I didn't follow all the flow through the various routines.  Esp.
> if-conversion
> looks susceptible, reassoc might restrict itself to conditions from the same
> BB, not sure.

surprisingly(?) the patch has no negative effect on the testsuite (OTOH it is incomplete and likely test coverage for vectorizer if-conversion is broader).

I'm unsure whether we want to rush things for GCC 12 and to what extent
we want to allow combination from the middle-block(s) (exposing, besides
undefined overflow, other flow-sensitive info that depends on the outer
condition being short-circuiting).
Comment 11 Richard Biener 2022-04-04 12:43:30 UTC
Created attachment 52746 [details]
patch to rewrite undefined overflow stmts

Fixed patch #1 (doesn't fix this testcase).  Bootstrapped/tested OK on x86_64, I'm queuing this for stage1 for now.
Comment 12 GCC Commits 2022-04-06 06:15:06 UTC
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:fc8d9e4497032dd295aac9414042163f92250b77

commit r12-8012-gfc8d9e4497032dd295aac9414042163f92250b77
Author: Richard Biener <rguenther@suse.de>
Date:   Mon Apr 4 12:23:28 2022 +0200

    tree-optimization/105142 - wrong code with maybe_fold_{and,or}_comparisons
    
    The following avoids expanding definitions in regions conditionally
    executed under the condition A when simplifying A && B or A || B.
    This is done by passing down the basic-block of the outer condition
    to maybe_fold_{and,or}_comparisons, through the various helpers
    in gimple-fold.cc that might call back to maybe_fold_{and,or}_comparisons
    and ultimatively to maybe_fold_comparisons_from_match_pd where the
    fix is to provide a custom valueization hook to
    gimple_match_op::resimplify that avoids looking at definitions
    that do not dominate the outer block.
    
    For the testcase this avoids combining a stmt that invokes undefined
    integer overflow when the outer condition is false but it also
    aovids combining stmts with range information that is derived from
    the outer condition.
    
    The new parameter to maybe_fold_{and,or}_comparisons is defaulted
    to nullptr and I only adjusted the if-combine to pass down the
    outer block.  I think other callers like tree-if-conv have the
    same issue but it's not straight-forward as to what to do there.
    
    2022-04-05  Richard Biener  <rguenther@suse.de>
    
            PR tree-optimization/105142
            * gimple-fold.h (maybe_fold_and_comparisons): Add defaulted
            basic-block parameter.
            (maybe_fold_or_comparisons): Likewise.
            * gimple-fold.cc (follow_outer_ssa_edges): New.
            (maybe_fold_comparisons_from_match_pd): Use follow_outer_ssa_edges
            when an outer condition basic-block is specified.
            (and_comparisons_1, and_var_with_comparison,
            and_var_with_comparison_1, or_comparisons_1,
            or_var_with_comparison, or_var_with_comparison_1): Receive and pass
            down the outer condition basic-block.
            * tree-ssa-ifcombine.cc (ifcombine_ifandif): Pass down the
            basic-block of the outer condition.
    
            * g++.dg/torture/pr105142.C: New testcase.
Comment 13 Richard Biener 2022-04-06 06:17:53 UTC
The specific instance of wrong-code has been fixed, there are still latent issues though.  I've queued the undefined overflow stmt rewrite for stage1.
Comment 14 GCC Commits 2022-11-11 14:05:32 UTC
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:4b3874d803e7961f38b22fa798517a63171bb985

commit r13-3904-g4b3874d803e7961f38b22fa798517a63171bb985
Author: Richard Biener <rguenther@suse.de>
Date:   Tue Jul 26 11:52:49 2022 +0200

    tree-optimization/105142 - improve maybe_fold_comparisons_from_match_pd fix
    
    The following improves on the fix for PR105142 which restricted the
    expression lookup used for maybe_fold_comparisons_from_match_pd to
    avoid picking up flow-sensitive info for use in places where guarding
    conditions do not hold.  Instead of not allowing to expand SSA
    definitions there the following temporarily clears flow-sensitive
    info on the SSA names and restores it when finished matching.
    
            PR tree-optimization/105142
            * gimple-fold.cc (fosa_unwind): New global.
            (follow_outer_ssa_edges): When the SSA definition to follow
            is does not dominate fosa_bb, temporarily clear flow-sensitive
            info.  Make sure to not expand stmts with not defined overflow.
            (maybe_fold_comparisons_from_match_pd): Set up unwind stack
            for follow_outer_ssa_edges and unwind flow-sensitive info
            clearing after matching.