User account creation filtered due to spam.
Compiling the following test case with g++ -Os -fpic on s390x-ibm-linux results in abort() being called unexpectedly: extern "C" void abort (void) __attribute__ ((__noreturn__)); class Transaction { public: bool Aborted; Transaction () : Aborted (false) { } ~Transaction () { if (!Aborted) abort (); } }; void test (Transaction &Trans) __attribute__ ((noinline)); void test (Transaction &Trans) { Trans.Aborted = true; } int main (void) { Transaction T; test (T); } What happens is that the destructor for Transaction is inlined into main (twice, once in the regular exit and once in the exception path). The load of the Aborted member is then moved by the code hoisting pass (pass_rtl_hoist) in gcse.c to a single location in basic block 2. However, that basic block ends with the call to the "test" routine, and for some reason, code hoisting thinks it therefore needs to move the hoisted instruction to *before* that call (see the CALL_P case of insert_insn_end_basic_block). This causes wrong code generation since that call actually modifies the memory that is being loaded. There seems to be other code that appears intended to recognize that case and avoid hoisting then (see prune_expressions), but that apparently only looks for abnormal edges, which we don't have here. Not sure how this is supposed to work correctly ...
Doesn't seem to be a recent regression, at least already r187763 is doing that.
But in the end it actually is a regression that started r166555 (previously it has been latent). Slightly adjusted testcase: struct T { bool a; T () : a (false) {} ~T () { if (!a) __builtin_abort (); } }; __attribute__((noinline)) void test (T &x) { x.a = true; } int main () { T T; test (T); }
The problem is that the hoisting code seems to assume the hoisting would be performed at the end of a bb, but that is not what insert_insn_end_basic_block will then actually do. But by the time insert_insn_end_basic_block is called, it is too late to punt or find another bb to hoist it to. compute_transp only analyzes transparency across the whole bb (to my surprise, it considers all calls as potentially clobbering MEMs, not just non-const/pure functions). So I think we want to have some predicate for what insert_insn_end_basic_block does to emit before the end of bb while searching for which bb to hoist to, and if that predicate is true, if it is a CALL (non-const/pure?) then not consider that bb for expressions containing MEMs (dunno if jumps could clobber MEMs too). Or just check for this and occrs_to_hoist.release (); if it happens in the if (hoistable > 1 && dbg_cnt (hoist_insn)) block or so?
Can't we insert on the edge and commit edge insertions late? Or simply split the block for this case?
(In reply to Richard Biener from comment #4) > Can't we insert on the edge and commit edge insertions late? Or simply > split the block for this case? Well, for the case of calls there typically are 2 edges, not just one (the abnormal one and normal fallthru), except for noreturn throwing calls. So inserting on the edges would usually mean undoing the hoisting. Note if the expression doesn't contain MEM (and isn't clobbered by the call, but that should just affect hard registers), hoisting it before the call is just fine.
On Fri, 16 Dec 2016, jakub at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78812 > > --- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> --- > (In reply to Richard Biener from comment #4) > > Can't we insert on the edge and commit edge insertions late? Or simply > > split the block for this case? > > Well, for the case of calls there typically are 2 edges, not just one (the > abnormal one and normal fallthru), except for noreturn throwing calls. > So inserting on the edges would usually mean undoing the hoisting. > Note if the expression doesn't contain MEM (and isn't clobbered by the call, > but that should just affect hard registers), hoisting it before the call is > just fine. But then sth is wrong with the dataflow problem ... maybe it somehow assumes that a signle outgoing edge is always fallthru (and not EH)? That is, inserting on EH edges is "tricky" at best...
(In reply to rguenther@suse.de from comment #6) > On Fri, 16 Dec 2016, jakub at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78812 > > > > --- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> --- > > (In reply to Richard Biener from comment #4) > > > Can't we insert on the edge and commit edge insertions late? Or simply > > > split the block for this case? > > > > Well, for the case of calls there typically are 2 edges, not just one (the > > abnormal one and normal fallthru), except for noreturn throwing calls. > > So inserting on the edges would usually mean undoing the hoisting. > > Note if the expression doesn't contain MEM (and isn't clobbered by the call, > > but that should just affect hard registers), hoisting it before the call is > > just fine. > > But then sth is wrong with the dataflow problem ... maybe it somehow > assumes that a signle outgoing edge is always fallthru (and not EH)? > That is, inserting on EH edges is "tricky" at best... Dunno, haven't looked into it in detail. The hoisting has been from 2 spots, one after the call's fallthrough and another from the EH landing pad of the call (which is why emitting the expression after the call's fallthrough and duplicating after the EH landing pad would effectively undo the optimization).
On Fri, 16 Dec 2016, jakub at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78812 > > --- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> --- > (In reply to rguenther@suse.de from comment #6) > > On Fri, 16 Dec 2016, jakub at gcc dot gnu.org wrote: > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78812 > > > > > > --- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> --- > > > (In reply to Richard Biener from comment #4) > > > > Can't we insert on the edge and commit edge insertions late? Or simply > > > > split the block for this case? > > > > > > Well, for the case of calls there typically are 2 edges, not just one (the > > > abnormal one and normal fallthru), except for noreturn throwing calls. > > > So inserting on the edges would usually mean undoing the hoisting. > > > Note if the expression doesn't contain MEM (and isn't clobbered by the call, > > > but that should just affect hard registers), hoisting it before the call is > > > just fine. > > > > But then sth is wrong with the dataflow problem ... maybe it somehow > > assumes that a signle outgoing edge is always fallthru (and not EH)? > > That is, inserting on EH edges is "tricky" at best... > > Dunno, haven't looked into it in detail. The hoisting has been from 2 spots, > one after the call's fallthrough and another from the EH landing pad of the > call (which is why emitting the expression after the call's fallthrough and > duplicating after the EH landing pad would effectively undo the optimization). Ok, but we obviously can't hoist to this place so presumambly have to reflect that in the dataflow somehow (or prune its solution). We have if (!pre_p && MEM_P (expr->expr)) /* Note memory references that can be clobbered by a call. We do not split abnormal edges in hoisting, so would a memory reference get hoisted along an abnormal edge, it would be placed /before/ the call. Therefore, only constant memory references can be hoisted along abnormal edges. */ { but exactly the same reasoning applies to EH edges. Thus Index: gcse.c =================================================================== --- gcse.c (revision 243738) +++ gcse.c (working copy) @@ -1757,7 +1757,7 @@ prune_expressions (bool pre_p) and, hence, there's no need to conservatively prune expressions on "intermediate" set-and-jump instructions. */ FOR_EACH_EDGE (e, ei, bb->preds) - if ((e->flags & EDGE_ABNORMAL) + if ((e->flags & (EDGE_ABNORMAL|EDGE_EH)) && (pre_p || CALL_P (BB_END (e->src)))) { bitmap_and_compl (antloc[bb->index], doing that if only pre_p is probably not necessary (but it'll be only !CALL_P for non-call-exceptions).
So it's been a long time... And IIRC, rth was the one that fixed this wart in gcse. A BB that ends with a call that can throw or any other abnormal edge is supposed to suppress hoisting for precisely the reasons outlined in this BZ. Essentially we need to insert on the edge, but we can't (easily) do that for the EH/abnormal edge and we can't safely insert before the call. Theother possibility was that we kill the expressions in the EH landing pads. I thought we looked at that as well.
So EDGE_EH implies EDGE_ABNORMAL, so the patch in c#8 isn't going to have any impact. The real bug is more subtle. prune_expressions walks the expression table looking for expressions that occur in blocks with abnormal edges. We then remove those expressions from antic/transp. The bug is it uses MEM_P. Which is (of course) false if the given RTX has an embedded MEM. A good example would be: (zero_extend:DI (mem/c:QI (plus:DI (reg/f:DI 34 %fp) (const_int -1 [0xffffffffffffffff])) [2 T.a+0 S1 A8])) Since that's not a MEM_P, the expression isn't removed from antic/transp which makes it subject to hoisting across the abnormal edge. This could be easily fixed by walking into the ZERO/SIGN_EXTEND rtx, but there may be other places where an embedded MEM might appear (hell, on a cisc machine it would be just about anywhere). So I want to do a bit of auditing of gcse.c.
(In reply to Jeffrey A. Law from comment #10) > Since that's not a MEM_P, the expression isn't removed from antic/transp > which makes it subject to hoisting across the abnormal edge. > > This could be easily fixed by walking into the ZERO/SIGN_EXTEND rtx, but > there may be other places where an embedded MEM might appear (hell, on a > cisc machine it would be just about anywhere). So I want to do a bit of > auditing of gcse.c. Can't you just use contains_mem_rtx_p (move it from ifcvt.c to rtlanal.c and export)?
I'm mostly concerned about other places where we assume that a memory reference is supposed to show up at the toplevel of a source/dest. For example, it looks like we don't properly handle the case where we have a REG_EQUAL note of the form (any_extend (mem (...)). THankfully, most of gcse.c will DTRT with such an rtx and the cases for REG_EQUAL notes don't appear to be correctness issues. The only correctness issue so far is prune_expressions. I need to look at postreload-gcse.c as well. All those cases will need to use a helper of some sort to look inside the rtx. The mechanics of that (such as using contains_mem_rtx_p) aren't terribly concerning/interesting at this stage.
So I see a case in postreload-gcse.c where we might mis-handle when the destination is a ZERO_EXTRACT or STRICT_LOW_PART. Neither happen often which is probably why we've never noticed.
Author: law Date: Thu Jan 5 07:38:48 2017 New Revision: 244093 URL: https://gcc.gnu.org/viewcvs?rev=244093&root=gcc&view=rev Log: PR tree-optimizatin/78812 * rtl.h (contains_mem_rtx_p): Prototype. * ifcvt.c (containts_mem_rtx_p): Move from here to... * rtlanal.c (contains_mem_rtx_p): Here and remvoe static linkage. * gcse.c (prune_expressions): Use contains_mem_rtx_p to discover and prune MEMs that are not at the toplevel of a SET_SRC rtx. Look through ZERO_EXTEND and SIGN_EXTEND when trying to avoid pruning MEMs. PR tree-optimization/78812 * g++.dg/torture/pr78812.C: New test. Added: trunk/gcc/testsuite/g++.dg/torture/pr78812.C Modified: trunk/gcc/ChangeLog trunk/gcc/gcse.c trunk/gcc/ifcvt.c trunk/gcc/rtl.h trunk/gcc/rtlanal.c trunk/gcc/testsuite/ChangeLog
On Wed, 4 Jan 2017, law at redhat dot com wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78812 > > Jeffrey A. Law <law at redhat dot com> changed: > > What |Removed |Added > ---------------------------------------------------------------------------- > Assignee|unassigned at gcc dot gnu.org |law at redhat dot com > > --- Comment #10 from Jeffrey A. Law <law at redhat dot com> --- > So EDGE_EH implies EDGE_ABNORMAL, so the patch in c#8 isn't going to have any > impact. The real bug is more subtle. EDGE_EH does not imply EDGE_ABNORMAL.
ABNORMAL_CALL and EH imply ABNORMAL, at least for RTL CFG according to their documentation. I haven't gone through the trouble to see where that is enforced, but I did very that the edges had EH and ABNORMAL set in the dumps.
> ABNORMAL_CALL and EH imply ABNORMAL, at least for RTL CFG according to their > documentation. I haven't gone through the trouble to see where that is > enforced, but I did very that the edges had EH and ABNORMAL set in the dumps. This was historically the case in GIMPLE too but Jan changed it at some point.