Bug 80536 - [6/7/8 Regression] UBSAN: compile time segfault
Summary: [6/7/8 Regression] UBSAN: compile time segfault
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: sanitizer (show other bugs)
Version: 7.0
: P2 normal
Target Milestone: 6.4
Assignee: Marek Polacek
URL:
Keywords:
Depends on:
Blocks: yarpgen
  Show dependency treegraph
 
Reported: 2017-04-26 20:16 UTC by Dmitry Babokin
Modified: 2021-11-01 23:07 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2017-04-26 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Dmitry Babokin 2017-04-26 20:16:13 UTC
> cat f.cpp
extern unsigned char var_67;
void foo() {
  2 * (var_67 * (unsigned long long)(-0 - -2565537134464303311UL)) % 1;
}

> g++ -fsanitize=undefined -O0 -c f.cpp
g++: internal compiler error: Segmentation fault (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <https://gcc.gnu.org/bugs/> for instructions.

gcc is top of the trunk (r247282), x86_64.
Comment 1 Marek Polacek 2017-04-26 20:20:08 UTC
I'll have a look.  Looks like an infinite recursion in fold_binary_loc.
Comment 2 Marek Polacek 2017-04-26 20:26:08 UTC
Started with

commit c61a1e061d7d208571d26dbb4e0873b0b4e36ec4
Author: jason <jason@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Tue Nov 17 21:44:08 2015 +0000

            Don't fold -(constant) or -0.
    
            * parser.c (cp_parser_unary_expression): Fold -constant here.
            * typeck.c (cp_build_unary_op): Not here.
    
    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@230506 138bc75d-0d04-0410-961f-82ee72b054a4
Comment 3 Marek Polacek 2017-05-09 12:50:18 UTC
Better testcase:

int
foo (int i)
{
  return ((i * (unsigned long long) (-0 + 1UL)) * 2) % 1;
}
Comment 4 Jakub Jelinek 2017-05-09 16:20:35 UTC
Would dropping the fold call from save_expr fix this?
Comment 5 Jakub Jelinek 2017-05-09 16:22:20 UTC
To expand on that, I think we want to drop that call from there and instead be able to simplify somehow a SAVE_EXPR if after c_fully_fold or cp_fold it becomes simple enough not to require any saving.
Comment 6 Marek Polacek 2017-05-09 16:32:55 UTC
Yeah, it helps with this particular testcase (and I agree we want to get rid of that fold() call in save_expr -- I'll take care of it), but I wonder if this issue is something separate: starting with r230506 we can generate expressions such as
i * ((unsigned long) -0 + 1) * 2
and given how fold() works we don't get to simplify the sub-expression "(unsigned long) -0 + 1)" so this expression isn't folded to "i * 2".  So I wonder if we want something like

--- a/gcc/convert.c
+++ b/gcc/convert.c
@@ -524,7 +524,13 @@ convert_to_integer_1 (tree type, tree expr, bool dofold)
    return expr;
       return build2_loc (EXPR_LOCATION (expr), COMPOUND_EXPR, TREE_TYPE (t),
             TREE_OPERAND (expr, 0), t);
-    }    
+    }
+
+  /* -0 is 0, so get rid of the NEGATE_EXPR.  */
+  if (0 && ex_form == NEGATE_EXPR
+      && TREE_CODE (TREE_OPERAND (expr, 0)) == INTEGER_CST
+      && integer_zerop (TREE_OPERAND (expr, 0)))
+    return convert_to_integer_maybe_fold (type, TREE_OPERAND (expr, 0), dofold);
 
   /* Convert e.g. (long)round(d) -> lround(d).  */
   /* If we're converting to char, we may encounter differing behavior
Comment 7 Marek Polacek 2017-05-09 16:34:46 UTC
I.e. I'm worried we could trigger the endless recursion also if we happen to call fold() on that expression via a different path than from save_expr.
Comment 8 Jakub Jelinek 2017-05-09 16:41:46 UTC
But we ideally shouldn't be folding anything until we actually c_fully_fold or cp_fold recursively, starting with the leafs.  Most of the folders heavily rely on that.
Comment 9 Marek Polacek 2017-05-09 17:01:16 UTC
Well, I hope we're not.  Very much related: PR80386.
Comment 10 Marek Polacek 2017-05-09 17:28:19 UTC
Removing the fold() call doesn't regress anything, btw.
Comment 11 Marek Polacek 2017-05-10 12:25:39 UTC
(In reply to Jakub Jelinek from comment #5)
> To expand on that, I think we want to drop that call from there and instead
> be able to simplify somehow a SAVE_EXPR if after c_fully_fold or cp_fold it
> becomes simple enough not to require any saving.

Hmm, I'm not sure what you mean.  save_expr has

3351   if (tree_invariant_p_1 (inner))
3352     return expr;

so we shouldn't create unnecessary SAVE_EXPRs.  Or do you mean something else?
Comment 12 Jakub Jelinek 2017-05-10 12:42:04 UTC
(In reply to Marek Polacek from comment #11)
> (In reply to Jakub Jelinek from comment #5)
> > To expand on that, I think we want to drop that call from there and instead
> > be able to simplify somehow a SAVE_EXPR if after c_fully_fold or cp_fold it
> > becomes simple enough not to require any saving.
> 
> Hmm, I'm not sure what you mean.  save_expr has
> 
> 3351   if (tree_invariant_p_1 (inner))
> 3352     return expr;

Sure, it has and also has skip_simple_arithmetic.  But without the fold there is a chance (though small, as fold isn't recursive) that it previously would turn something non-invariant/simple arithmetics into invariant/simple arith and we wouldn't create the SAVE_EXPR, but now do.  Besides increased memory footprint that wouldn't be bad, the problem is that I don't see any of the recursive folders being able to undo that, so we end up with them until gimplification.

Thus, it would be nice if e.g. cp_fold, or fold, or c_fully_fold_internal was able to fold a SAVE_EXPR where:
  inner = skip_simple_arithmetic (TREE_OPERAND (save_expr, 0));
  if (TREE_CODE (inner) == ERROR_MARK)
    return inner;

  if (tree_invariant_p_1 (inner))
    return TREE_OPERAND (save_expr, 0);

The problem on the C FE side (that would be nice to fix) is that it has its own c_save_expr that wants the operand to be c_fully_folded already when creating the SAVE_EXPR, it would be better if we could post-pone that and perhaps use some flag on the SAVE_EXPR to indicate whether we've c_fully_folded the operand already or not and only fully fold it once (C++ FE does that through its hash maps) the first time something calls c_fully_fold on the SAVE_EXPR.
So maybe you should start just with the C++ FE for now, or do it in fold too.
Comment 13 Marek Polacek 2017-05-10 16:35:20 UTC
(In reply to Jakub Jelinek from comment #12)
> (In reply to Marek Polacek from comment #11)
> > (In reply to Jakub Jelinek from comment #5)
> > > To expand on that, I think we want to drop that call from there and instead
> > > be able to simplify somehow a SAVE_EXPR if after c_fully_fold or cp_fold it
> > > becomes simple enough not to require any saving.
> > 
> > Hmm, I'm not sure what you mean.  save_expr has
> > 
> > 3351   if (tree_invariant_p_1 (inner))
> > 3352     return expr;
> 
> Sure, it has and also has skip_simple_arithmetic.  But without the fold
> there is a chance (though small, as fold isn't recursive) that it previously
> would turn something non-invariant/simple arithmetics into invariant/simple
> arith and we wouldn't create the SAVE_EXPR, but now do.  Besides increased
> memory footprint that wouldn't be bad, the problem is that I don't see any
> of the recursive folders being able to undo that, so we end up with them
> until gimplification.

This is true, but it happens very rarely.  It can happen e.g. when the fold() call in save_expr() folds away the first operand of a COMPOUND_EXPR, and the second operand is e.g.
(long unsigned int) ((sizetype) SAVE_EXPR <n> * 4)
then skip_simple_arithmetic can pull out "SAVE_EXPR <n>" out of it, which is  tree_invariant_p_1.  

> Thus, it would be nice if e.g. cp_fold, or fold, or c_fully_fold_internal
> was able to fold a SAVE_EXPR where:
>   inner = skip_simple_arithmetic (TREE_OPERAND (save_expr, 0));
>   if (TREE_CODE (inner) == ERROR_MARK)
>     return inner;
> 
>   if (tree_invariant_p_1 (inner))
>     return TREE_OPERAND (save_expr, 0);

But even if I add this to fold or c_fully_fold, we don't have any guarantees that any of these will be called before gimplification, right?  So most likely we'd end up with the new SAVE_EXPR in the gimplifier, which, as you point out, is not that bad.

> The problem on the C FE side (that would be nice to fix) is that it has its
> own c_save_expr that wants the operand to be c_fully_folded already when
> creating the SAVE_EXPR, it would be better if we could post-pone that and
> perhaps use some flag on the SAVE_EXPR to indicate whether we've
> c_fully_folded the operand already or not and only fully fold it once (C++
> FE does that through its hash maps) the first time something calls
> c_fully_fold on the SAVE_EXPR.
> So maybe you should start just with the C++ FE for now, or do it in fold too.

But c_fully_fold nor cp_fold step into SAVE_EXPRs, they just return them unmodified.  What can happen though is that c_save_expr gets something that c_fully_fold folds into an invariant/simple arith, in which case we don't wrap it in SAVE_EXPR, so the same expression might be folded multiple times, right?  And that could be solved by adding a hash map to c_fully_fold.

So shouldn't we first apply just this?

Comments very appreciated.

--- gcc/tree.c
+++ gcc/tree.c
@@ -3337,7 +3337,6 @@ tree_invariant_p (tree t)
 tree
 save_expr (tree expr)
 {
-  tree t = fold (expr);
   tree inner;
 
   /* If the tree evaluates to a constant, then we don't want to hide that
@@ -3345,33 +3344,33 @@ save_expr (tree expr)
      However, a read-only object that has side effects cannot be bypassed.
      Since it is no problem to reevaluate literals, we just return the
      literal node.  */
-  inner = skip_simple_arithmetic (t);
+  inner = skip_simple_arithmetic (expr);
   if (TREE_CODE (inner) == ERROR_MARK)
     return inner;
 
   if (tree_invariant_p_1 (inner))
-    return t;
+    return expr;
 
   /* If INNER contains a PLACEHOLDER_EXPR, we must evaluate it each time, since
      it means that the size or offset of some field of an object depends on
      the value within another field.
 
-     Note that it must not be the case that T contains both a PLACEHOLDER_EXPR
+     Note that it must not be the case that EXPR contains both a PLACEHOLDER_EXPR
      and some variable since it would then need to be both evaluated once and
      evaluated more than once.  Front-ends must assure this case cannot
      happen by surrounding any such subexpressions in their own SAVE_EXPR
      and forcing evaluation at the proper time.  */
   if (contains_placeholder_p (inner))
-    return t;
+    return expr;
 
-  t = build1 (SAVE_EXPR, TREE_TYPE (expr), t);
-  SET_EXPR_LOCATION (t, EXPR_LOCATION (expr));
+  expr = build1 (SAVE_EXPR, TREE_TYPE (expr), expr);
+  SET_EXPR_LOCATION (expr, EXPR_LOCATION (expr));
 
   /* This expression might be placed ahead of a jump to ensure that the
      value was computed on both sides of the jump.  So make sure it isn't
      eliminated as dead.  */
-  TREE_SIDE_EFFECTS (t) = 1;
-  return t;
+  TREE_SIDE_EFFECTS (expr) = 1;
+  return expr;
 }
 
 /* Look inside EXPR into any simple arithmetic operations.  Return the
Comment 14 Jakub Jelinek 2017-05-10 16:57:31 UTC
(In reply to Marek Polacek from comment #13)
> This is true, but it happens very rarely.  It can happen e.g. when the
> fold() call in save_expr() folds away the first operand of a COMPOUND_EXPR,
> and the second operand is e.g.

Can't it happen say if you have save_expr called with (0 * i) + (0 * j) + (0 * k) or whatever similar initially complex, but after folding very simple and obviously invariant?

> But even if I add this to fold or c_fully_fold, we don't have any guarantees
> that any of these will be called before gimplification, right?  So most
> likely we'd end up with the new SAVE_EXPR in the gimplifier, which, as you
> point out, is not that bad.

I think cp_fold should handle SAVE_EXPR (by cp_folding the operand, and if it is invariant or invariant after skipping simple arith, returning that folded operand, otherwise making sure to add the SAVE_EXPR into the fold_cache giving 
the SAVE_EXPR itself.  Right now cp_fold ignores SAVE_EXPR, but cp_fold_r handles it, but that one doesn't do much good, because it cp_folds the operands only after folding the containing trees.
Comment 15 Marek Polacek 2017-05-11 10:24:04 UTC
(In reply to Jakub Jelinek from comment #14)
> (In reply to Marek Polacek from comment #13)
> > This is true, but it happens very rarely.  It can happen e.g. when the
> > fold() call in save_expr() folds away the first operand of a COMPOUND_EXPR,
> > and the second operand is e.g.
> 
> Can't it happen say if you have save_expr called with (0 * i) + (0 * j) + (0
> * k) or whatever similar initially complex, but after folding very simple
> and obviously invariant?

In C I don't think so, because we mostly call c_save_expr and c_fully_fold therein would fold that expression to 0.  And when we call save_expr, it's when in_late_binary_op so the operands have already been folded.  There's one case, though, where we call save_expr without previous folding, and that's when constructing a VLA whose size is a sizeof of another VLA in grokdeclarator:

 6097                     /* Arrange for the SAVE_EXPR on the inside of the
 6098                        MINUS_EXPR, which allows the -1 to get folded
 6099                        with the +1 that happens when building TYPE_SIZE.  */
 6100                     if (size_varies)
 6101                       size = save_expr (size);

void
f (int i)
{
  int (*a)[i];
  int x[sizeof (*a)];
}

I wouldn't worry much about that.

> > But even if I add this to fold or c_fully_fold, we don't have any guarantees
> > that any of these will be called before gimplification, right?  So most
> > likely we'd end up with the new SAVE_EXPR in the gimplifier, which, as you
> > point out, is not that bad.
> 
> I think cp_fold should handle SAVE_EXPR (by cp_folding the operand, and if
> it is invariant or invariant after skipping simple arith, returning that
> folded operand, otherwise making sure to add the SAVE_EXPR into the
> fold_cache giving 
> the SAVE_EXPR itself.  Right now cp_fold ignores SAVE_EXPR, but cp_fold_r
> handles it, but that one doesn't do much good, because it cp_folds the
> operands only after folding the containing trees.

I'm testing this.  Judging by running the C++ testsuite, it basically never happens that we're able to cp_fold the content of a SAVE_EXPR to an invariant, although it happens e.g. with this test:

int
foo (int i)
{
  return ((0 * i * (unsigned long long) (-0 + 1UL)) * 2) % 1;
}

so it probably makes sense to add the cp_fold bits.

Thanks.
Comment 16 Jakub Jelinek 2017-05-11 10:33:35 UTC
(In reply to Marek Polacek from comment #15)
> In C I don't think so, because we mostly call c_save_expr and c_fully_fold
> therein would fold that expression to 0.  And when we call save_expr, it's

Yeah, I know that, and I think it is a serious bug.  With the c_fully_fold in c_save_expr the C FE is not doing anything close to delayed folding, it folds immediately whenever we might need a save_expr, the old trees gone.
So I think it would be nice to kill c_save_expr, just use save_expr, and let c_fully_fold fold SAVE_EXPR operand (just once, not many times).
Comment 17 Marek Polacek 2017-05-11 10:39:27 UTC
(In reply to Jakub Jelinek from comment #16)
> (In reply to Marek Polacek from comment #15)
> > In C I don't think so, because we mostly call c_save_expr and c_fully_fold
> > therein would fold that expression to 0.  And when we call save_expr, it's
> 
> Yeah, I know that, and I think it is a serious bug.  With the c_fully_fold
> in c_save_expr the C FE is not doing anything close to delayed folding, it
> folds immediately whenever we might need a save_expr, the old trees gone.
> So I think it would be nice to kill c_save_expr, just use save_expr, and let
> c_fully_fold fold SAVE_EXPR operand (just once, not many times).

I agree.  I'll try (guess we'll need the fold cache).  That can be a separate project, though.
Comment 18 Jakub Jelinek 2017-05-11 10:44:55 UTC
(In reply to Marek Polacek from comment #17)
> (In reply to Jakub Jelinek from comment #16)
> > (In reply to Marek Polacek from comment #15)
> > > In C I don't think so, because we mostly call c_save_expr and c_fully_fold
> > > therein would fold that expression to 0.  And when we call save_expr, it's
> > 
> > Yeah, I know that, and I think it is a serious bug.  With the c_fully_fold
> > in c_save_expr the C FE is not doing anything close to delayed folding, it
> > folds immediately whenever we might need a save_expr, the old trees gone.
> > So I think it would be nice to kill c_save_expr, just use save_expr, and let
> > c_fully_fold fold SAVE_EXPR operand (just once, not many times).
> 
> I agree.  I'll try (guess we'll need the fold cache).  That can be a
> separate project, though.

See above, if it is just about SAVE_EXPR, the C FE could just grab one of the many spare bits on SAVE_EXPR for a flag whether the operand has been c_fully_folded already.
Comment 19 Marek Polacek 2017-05-16 19:25:36 UTC
Author: mpolacek
Date: Tue May 16 19:25:04 2017
New Revision: 248124

URL: https://gcc.gnu.org/viewcvs?rev=248124&root=gcc&view=rev
Log:
	PR sanitizer/80536
	PR sanitizer/80386
	* cp-gimplify.c (cp_fold): Handle SAVE_EXPR.

	* tree.c (save_expr): Don't fold the expression.

	* c-c++-common/ubsan/pr80536.c: New test.
	* g++.dg/ubsan/pr80386.C: New test.

Added:
    trunk/gcc/testsuite/c-c++-common/ubsan/pr80536.c
    trunk/gcc/testsuite/g++.dg/ubsan/pr80386.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cp/ChangeLog
    trunk/gcc/cp/cp-gimplify.c
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree.c
Comment 20 Marek Polacek 2017-05-16 19:27:53 UTC
Fixed.