> cat f.cpp extern unsigned char var_67; void foo() { 2 * (var_67 * (unsigned long long)(-0 - -2565537134464303311UL)) % 1; } > g++ -fsanitize=undefined -O0 -c f.cpp g++: internal compiler error: Segmentation fault (program cc1plus) Please submit a full bug report, with preprocessed source if appropriate. See <https://gcc.gnu.org/bugs/> for instructions. gcc is top of the trunk (r247282), x86_64.
I'll have a look. Looks like an infinite recursion in fold_binary_loc.
Started with commit c61a1e061d7d208571d26dbb4e0873b0b4e36ec4 Author: jason <jason@138bc75d-0d04-0410-961f-82ee72b054a4> Date: Tue Nov 17 21:44:08 2015 +0000 Don't fold -(constant) or -0. * parser.c (cp_parser_unary_expression): Fold -constant here. * typeck.c (cp_build_unary_op): Not here. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@230506 138bc75d-0d04-0410-961f-82ee72b054a4
Better testcase: int foo (int i) { return ((i * (unsigned long long) (-0 + 1UL)) * 2) % 1; }
Would dropping the fold call from save_expr fix this?
To expand on that, I think we want to drop that call from there and instead be able to simplify somehow a SAVE_EXPR if after c_fully_fold or cp_fold it becomes simple enough not to require any saving.
Yeah, it helps with this particular testcase (and I agree we want to get rid of that fold() call in save_expr -- I'll take care of it), but I wonder if this issue is something separate: starting with r230506 we can generate expressions such as i * ((unsigned long) -0 + 1) * 2 and given how fold() works we don't get to simplify the sub-expression "(unsigned long) -0 + 1)" so this expression isn't folded to "i * 2". So I wonder if we want something like --- a/gcc/convert.c +++ b/gcc/convert.c @@ -524,7 +524,13 @@ convert_to_integer_1 (tree type, tree expr, bool dofold) return expr; return build2_loc (EXPR_LOCATION (expr), COMPOUND_EXPR, TREE_TYPE (t), TREE_OPERAND (expr, 0), t); - } + } + + /* -0 is 0, so get rid of the NEGATE_EXPR. */ + if (0 && ex_form == NEGATE_EXPR + && TREE_CODE (TREE_OPERAND (expr, 0)) == INTEGER_CST + && integer_zerop (TREE_OPERAND (expr, 0))) + return convert_to_integer_maybe_fold (type, TREE_OPERAND (expr, 0), dofold); /* Convert e.g. (long)round(d) -> lround(d). */ /* If we're converting to char, we may encounter differing behavior
I.e. I'm worried we could trigger the endless recursion also if we happen to call fold() on that expression via a different path than from save_expr.
But we ideally shouldn't be folding anything until we actually c_fully_fold or cp_fold recursively, starting with the leafs. Most of the folders heavily rely on that.
Well, I hope we're not. Very much related: PR80386.
Removing the fold() call doesn't regress anything, btw.
(In reply to Jakub Jelinek from comment #5) > To expand on that, I think we want to drop that call from there and instead > be able to simplify somehow a SAVE_EXPR if after c_fully_fold or cp_fold it > becomes simple enough not to require any saving. Hmm, I'm not sure what you mean. save_expr has 3351 if (tree_invariant_p_1 (inner)) 3352 return expr; so we shouldn't create unnecessary SAVE_EXPRs. Or do you mean something else?
(In reply to Marek Polacek from comment #11) > (In reply to Jakub Jelinek from comment #5) > > To expand on that, I think we want to drop that call from there and instead > > be able to simplify somehow a SAVE_EXPR if after c_fully_fold or cp_fold it > > becomes simple enough not to require any saving. > > Hmm, I'm not sure what you mean. save_expr has > > 3351 if (tree_invariant_p_1 (inner)) > 3352 return expr; Sure, it has and also has skip_simple_arithmetic. But without the fold there is a chance (though small, as fold isn't recursive) that it previously would turn something non-invariant/simple arithmetics into invariant/simple arith and we wouldn't create the SAVE_EXPR, but now do. Besides increased memory footprint that wouldn't be bad, the problem is that I don't see any of the recursive folders being able to undo that, so we end up with them until gimplification. Thus, it would be nice if e.g. cp_fold, or fold, or c_fully_fold_internal was able to fold a SAVE_EXPR where: inner = skip_simple_arithmetic (TREE_OPERAND (save_expr, 0)); if (TREE_CODE (inner) == ERROR_MARK) return inner; if (tree_invariant_p_1 (inner)) return TREE_OPERAND (save_expr, 0); The problem on the C FE side (that would be nice to fix) is that it has its own c_save_expr that wants the operand to be c_fully_folded already when creating the SAVE_EXPR, it would be better if we could post-pone that and perhaps use some flag on the SAVE_EXPR to indicate whether we've c_fully_folded the operand already or not and only fully fold it once (C++ FE does that through its hash maps) the first time something calls c_fully_fold on the SAVE_EXPR. So maybe you should start just with the C++ FE for now, or do it in fold too.
(In reply to Jakub Jelinek from comment #12) > (In reply to Marek Polacek from comment #11) > > (In reply to Jakub Jelinek from comment #5) > > > To expand on that, I think we want to drop that call from there and instead > > > be able to simplify somehow a SAVE_EXPR if after c_fully_fold or cp_fold it > > > becomes simple enough not to require any saving. > > > > Hmm, I'm not sure what you mean. save_expr has > > > > 3351 if (tree_invariant_p_1 (inner)) > > 3352 return expr; > > Sure, it has and also has skip_simple_arithmetic. But without the fold > there is a chance (though small, as fold isn't recursive) that it previously > would turn something non-invariant/simple arithmetics into invariant/simple > arith and we wouldn't create the SAVE_EXPR, but now do. Besides increased > memory footprint that wouldn't be bad, the problem is that I don't see any > of the recursive folders being able to undo that, so we end up with them > until gimplification. This is true, but it happens very rarely. It can happen e.g. when the fold() call in save_expr() folds away the first operand of a COMPOUND_EXPR, and the second operand is e.g. (long unsigned int) ((sizetype) SAVE_EXPR <n> * 4) then skip_simple_arithmetic can pull out "SAVE_EXPR <n>" out of it, which is tree_invariant_p_1. > Thus, it would be nice if e.g. cp_fold, or fold, or c_fully_fold_internal > was able to fold a SAVE_EXPR where: > inner = skip_simple_arithmetic (TREE_OPERAND (save_expr, 0)); > if (TREE_CODE (inner) == ERROR_MARK) > return inner; > > if (tree_invariant_p_1 (inner)) > return TREE_OPERAND (save_expr, 0); But even if I add this to fold or c_fully_fold, we don't have any guarantees that any of these will be called before gimplification, right? So most likely we'd end up with the new SAVE_EXPR in the gimplifier, which, as you point out, is not that bad. > The problem on the C FE side (that would be nice to fix) is that it has its > own c_save_expr that wants the operand to be c_fully_folded already when > creating the SAVE_EXPR, it would be better if we could post-pone that and > perhaps use some flag on the SAVE_EXPR to indicate whether we've > c_fully_folded the operand already or not and only fully fold it once (C++ > FE does that through its hash maps) the first time something calls > c_fully_fold on the SAVE_EXPR. > So maybe you should start just with the C++ FE for now, or do it in fold too. But c_fully_fold nor cp_fold step into SAVE_EXPRs, they just return them unmodified. What can happen though is that c_save_expr gets something that c_fully_fold folds into an invariant/simple arith, in which case we don't wrap it in SAVE_EXPR, so the same expression might be folded multiple times, right? And that could be solved by adding a hash map to c_fully_fold. So shouldn't we first apply just this? Comments very appreciated. --- gcc/tree.c +++ gcc/tree.c @@ -3337,7 +3337,6 @@ tree_invariant_p (tree t) tree save_expr (tree expr) { - tree t = fold (expr); tree inner; /* If the tree evaluates to a constant, then we don't want to hide that @@ -3345,33 +3344,33 @@ save_expr (tree expr) However, a read-only object that has side effects cannot be bypassed. Since it is no problem to reevaluate literals, we just return the literal node. */ - inner = skip_simple_arithmetic (t); + inner = skip_simple_arithmetic (expr); if (TREE_CODE (inner) == ERROR_MARK) return inner; if (tree_invariant_p_1 (inner)) - return t; + return expr; /* If INNER contains a PLACEHOLDER_EXPR, we must evaluate it each time, since it means that the size or offset of some field of an object depends on the value within another field. - Note that it must not be the case that T contains both a PLACEHOLDER_EXPR + Note that it must not be the case that EXPR contains both a PLACEHOLDER_EXPR and some variable since it would then need to be both evaluated once and evaluated more than once. Front-ends must assure this case cannot happen by surrounding any such subexpressions in their own SAVE_EXPR and forcing evaluation at the proper time. */ if (contains_placeholder_p (inner)) - return t; + return expr; - t = build1 (SAVE_EXPR, TREE_TYPE (expr), t); - SET_EXPR_LOCATION (t, EXPR_LOCATION (expr)); + expr = build1 (SAVE_EXPR, TREE_TYPE (expr), expr); + SET_EXPR_LOCATION (expr, EXPR_LOCATION (expr)); /* This expression might be placed ahead of a jump to ensure that the value was computed on both sides of the jump. So make sure it isn't eliminated as dead. */ - TREE_SIDE_EFFECTS (t) = 1; - return t; + TREE_SIDE_EFFECTS (expr) = 1; + return expr; } /* Look inside EXPR into any simple arithmetic operations. Return the
(In reply to Marek Polacek from comment #13) > This is true, but it happens very rarely. It can happen e.g. when the > fold() call in save_expr() folds away the first operand of a COMPOUND_EXPR, > and the second operand is e.g. Can't it happen say if you have save_expr called with (0 * i) + (0 * j) + (0 * k) or whatever similar initially complex, but after folding very simple and obviously invariant? > But even if I add this to fold or c_fully_fold, we don't have any guarantees > that any of these will be called before gimplification, right? So most > likely we'd end up with the new SAVE_EXPR in the gimplifier, which, as you > point out, is not that bad. I think cp_fold should handle SAVE_EXPR (by cp_folding the operand, and if it is invariant or invariant after skipping simple arith, returning that folded operand, otherwise making sure to add the SAVE_EXPR into the fold_cache giving the SAVE_EXPR itself. Right now cp_fold ignores SAVE_EXPR, but cp_fold_r handles it, but that one doesn't do much good, because it cp_folds the operands only after folding the containing trees.
(In reply to Jakub Jelinek from comment #14) > (In reply to Marek Polacek from comment #13) > > This is true, but it happens very rarely. It can happen e.g. when the > > fold() call in save_expr() folds away the first operand of a COMPOUND_EXPR, > > and the second operand is e.g. > > Can't it happen say if you have save_expr called with (0 * i) + (0 * j) + (0 > * k) or whatever similar initially complex, but after folding very simple > and obviously invariant? In C I don't think so, because we mostly call c_save_expr and c_fully_fold therein would fold that expression to 0. And when we call save_expr, it's when in_late_binary_op so the operands have already been folded. There's one case, though, where we call save_expr without previous folding, and that's when constructing a VLA whose size is a sizeof of another VLA in grokdeclarator: 6097 /* Arrange for the SAVE_EXPR on the inside of the 6098 MINUS_EXPR, which allows the -1 to get folded 6099 with the +1 that happens when building TYPE_SIZE. */ 6100 if (size_varies) 6101 size = save_expr (size); void f (int i) { int (*a)[i]; int x[sizeof (*a)]; } I wouldn't worry much about that. > > But even if I add this to fold or c_fully_fold, we don't have any guarantees > > that any of these will be called before gimplification, right? So most > > likely we'd end up with the new SAVE_EXPR in the gimplifier, which, as you > > point out, is not that bad. > > I think cp_fold should handle SAVE_EXPR (by cp_folding the operand, and if > it is invariant or invariant after skipping simple arith, returning that > folded operand, otherwise making sure to add the SAVE_EXPR into the > fold_cache giving > the SAVE_EXPR itself. Right now cp_fold ignores SAVE_EXPR, but cp_fold_r > handles it, but that one doesn't do much good, because it cp_folds the > operands only after folding the containing trees. I'm testing this. Judging by running the C++ testsuite, it basically never happens that we're able to cp_fold the content of a SAVE_EXPR to an invariant, although it happens e.g. with this test: int foo (int i) { return ((0 * i * (unsigned long long) (-0 + 1UL)) * 2) % 1; } so it probably makes sense to add the cp_fold bits. Thanks.
(In reply to Marek Polacek from comment #15) > In C I don't think so, because we mostly call c_save_expr and c_fully_fold > therein would fold that expression to 0. And when we call save_expr, it's Yeah, I know that, and I think it is a serious bug. With the c_fully_fold in c_save_expr the C FE is not doing anything close to delayed folding, it folds immediately whenever we might need a save_expr, the old trees gone. So I think it would be nice to kill c_save_expr, just use save_expr, and let c_fully_fold fold SAVE_EXPR operand (just once, not many times).
(In reply to Jakub Jelinek from comment #16) > (In reply to Marek Polacek from comment #15) > > In C I don't think so, because we mostly call c_save_expr and c_fully_fold > > therein would fold that expression to 0. And when we call save_expr, it's > > Yeah, I know that, and I think it is a serious bug. With the c_fully_fold > in c_save_expr the C FE is not doing anything close to delayed folding, it > folds immediately whenever we might need a save_expr, the old trees gone. > So I think it would be nice to kill c_save_expr, just use save_expr, and let > c_fully_fold fold SAVE_EXPR operand (just once, not many times). I agree. I'll try (guess we'll need the fold cache). That can be a separate project, though.
(In reply to Marek Polacek from comment #17) > (In reply to Jakub Jelinek from comment #16) > > (In reply to Marek Polacek from comment #15) > > > In C I don't think so, because we mostly call c_save_expr and c_fully_fold > > > therein would fold that expression to 0. And when we call save_expr, it's > > > > Yeah, I know that, and I think it is a serious bug. With the c_fully_fold > > in c_save_expr the C FE is not doing anything close to delayed folding, it > > folds immediately whenever we might need a save_expr, the old trees gone. > > So I think it would be nice to kill c_save_expr, just use save_expr, and let > > c_fully_fold fold SAVE_EXPR operand (just once, not many times). > > I agree. I'll try (guess we'll need the fold cache). That can be a > separate project, though. See above, if it is just about SAVE_EXPR, the C FE could just grab one of the many spare bits on SAVE_EXPR for a flag whether the operand has been c_fully_folded already.
Author: mpolacek Date: Tue May 16 19:25:04 2017 New Revision: 248124 URL: https://gcc.gnu.org/viewcvs?rev=248124&root=gcc&view=rev Log: PR sanitizer/80536 PR sanitizer/80386 * cp-gimplify.c (cp_fold): Handle SAVE_EXPR. * tree.c (save_expr): Don't fold the expression. * c-c++-common/ubsan/pr80536.c: New test. * g++.dg/ubsan/pr80386.C: New test. Added: trunk/gcc/testsuite/c-c++-common/ubsan/pr80536.c trunk/gcc/testsuite/g++.dg/ubsan/pr80386.C Modified: trunk/gcc/ChangeLog trunk/gcc/cp/ChangeLog trunk/gcc/cp/cp-gimplify.c trunk/gcc/testsuite/ChangeLog trunk/gcc/tree.c
Fixed.