Running /aux/hubicka/9-install/bin/g++ Unified_cpp_dom_events0-8.ii -c -flto -flifetime-dse=1 -fPIC -fstack-protector-strong -Wall -Wempty-body -Wignored-qualifiers -Woverloaded-virtual -Wpointer-arith -Wsign-compare -Wtype-limits -Wunreachable-code -Wwrite-strings -Wno-invalid-offsetof -Wc++1z-compat -Wduplicated-cond -Wimplicit-fallthrough -Wno-error=maybe-uninitialized -Wno-error=deprecated-declarations -Wno-error=array-bounds -Wno-error=coverage-mismatch -Wno-error=free-nonheap-object -Wno-error=multistatement-macros -Wno-error=class-memaccess -Wformat -Wformat-overflow=2 -fno-sized-deallocation -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -fstack-protector-strong -fno-exceptions -fno-strict-aliasing -fno-rtti -fno-exceptions -fno-math-errno -pthread -pipe -g -freorder-blocks -O2 -fno-omit-frame-pointer -funwind-tables -Wno-error=shadow eventually runs out of memory on my machine, while GCC 8 finishes rather quickly. Most of time is spent in gimple_copy. Without debug info the file builds for me.
I see -flto in there, so making this block the mozillametabug
http://www.ucw.cz/~hubicka/Unified_cpp_dom_events0-8.ii.xz
Note that it fails without -flto as well. Without -g it builds&works for me. We end up with huge BB containing # DEBUG aFirst => NULL # DEBUG aArgs#0 => NULL # DEBUG this => NULL # DEBUG this => NULL # DEBUG aFirst => NULL # DEBUG this => NULL # DEBUG this => NULL # DEBUG aFirst => NULL # DEBUG this => NULL # DEBUG aFirst => NULL # DEBUG aArgs#0 => NULL # DEBUG this => NULL # DEBUG aFirst => NULL # DEBUG aArgs#0 => NULL # DEBUG aArgs#1 => NULL # DEBUG this => NULL # DEBUG this => NULL # DEBUG aFirst => NULL # DEBUG this => NULL # DEBUG this => NULL # DEBUG aFirst => NULL # DEBUG this => NULL # DEBUG aFirst => NULL # DEBUG aArgs#0 => NULL # DEBUG this => NULL # DEBUG this => NULL # DEBUG aFirst => NULL # DEBUG this => NULL # DEBUG this => NULL # DEBUG aFirst => NULL # DEBUG this => NULL # DEBUG aFirst => NULL Seems that these statements are being early inlined and duplicated indefinitely. Something needs to cut them to sensible limit.
The code is: inline bool IsNodeInternal() const { return false; } template <typename First, typename... Args> inline bool IsNodeInternal(First aFirst, Args... aArgs) const { return mNodeInfo->Equals(aFirst) || IsNodeInternal(aArgs...); } public: inline bool IsHTMLElement() const { return IsElement() && IsInNamespace(3); } inline bool IsHTMLElement(const nsAtom* aTag) const { return IsElement() && mNodeInfo->Equals(aTag, 3); } template <typename First, typename... Args> inline bool IsAnyOfHTMLElements(First aFirst, Args... aArgs) const { return IsHTMLElement() && IsNodeInternal(aFirst, aArgs...); } $ grep ";; Function nsINode::IsNodeInternal" *cfg2 | less ;; Function nsINode::IsNodeInternal (_ZNK7nsINode14IsNodeInternalEv, funcdef_no=13668, decl_uid=274623, cgraph_uid=7143, symbol_order=7869) ;; Function nsINode::IsNodeInternal<nsStaticAtom*> (_ZNK7nsINode14IsNodeInternalIP12nsStaticAtomJEEEbT_DpT0_, funcdef_no=58886, decl_uid=1359969, cgraph_uid=49491, symbol_order=50680) ;; Function nsINode::IsNodeInternal<nsStaticAtom*, nsStaticAtom*> (_ZNK7nsINode14IsNodeInternalIP12nsStaticAtomJS2_EEEbT_DpT0_, funcdef_no=58870, decl_uid=1359893, cgraph_uid=49475, symbol_order=50664) ;; Function nsINode::IsNodeInternal<nsStaticAtom*, nsStaticAtom*, nsStaticAtom*> (_ZNK7nsINode14IsNodeInternalIP12nsStaticAtomJS2_S2_EEEbT_DpT0_, funcdef_no=58853, decl_uid=1359814, cgraph_uid=49458, symbol_order=50647) ;; Function nsINode::IsNodeInternal<nsStaticAtom*, nsStaticAtom*, nsStaticAtom*, nsStaticAtom*> (_ZNK7nsINode14IsNodeInternalIP12nsStaticAtomJS2_S2_S2_EEEbT_DpT0_, funcdef_no=58835, decl_uid=1359727, cgraph_uid=49440, symbol_order=50629) ;; Function nsINode::IsNodeInternal<nsStaticAtom*, nsStaticAtom*, nsStaticAtom*, nsStaticAtom*, nsStaticAtom*> (_ZNK7nsINode14IsNodeInternalIP12nsStaticAtomJS2_S2_S2_S2_EEEbT_DpT0_, funcdef_no=58816, decl_uid=1359637, cgraph_uid=49421, symbol_order=50610) ;; Function nsINode::IsNodeInternal<nsStaticAtom*, nsStaticAtom*, nsStaticAtom*, nsStaticAtom*, nsStaticAtom*, nsStaticAtom*> (_ZNK7nsINode14IsNodeInternalIP12nsStaticAtomJS2_S2_S2_S2_S2_EEEbT_DpT0_, funcdef_no=58796, decl_uid=1359532, cgraph_uid=49401, symbol_order=50590) ;; Function nsINode::IsNodeInternal<nsStaticAtom*, nsStaticAtom*, nsStaticAtom*, nsStaticAtom*, nsStaticAtom*, nsStaticAtom*, nsStaticAtom*> (_ZNK7nsINode14IsNodeInternalIP12nsStaticAtomJS2_S2_S2_S2_S2_S2_EEEbT_DpT0_, funcdef_no=58774, decl_uid=1359424, cgraph_uid=49379, symbol_order=50568) ;; Function nsINode::IsNodeInternal<nsStaticAtom*, nsStaticAtom*, nsStaticAtom*, nsStaticAtom*, nsStaticAtom*, nsStaticAtom*, nsStaticAtom*, nsStaticAtom*> (_ZNK7nsINode14IsNodeInternalIP12nsStaticAtomJS2_S2_S2_S2_S2_S2_S2_EEEbT_DpT0_, funcdef_no=58751, decl_uid=1359292, cgraph_uid=49356, symbol_order=50545) ;; Function nsINode::IsNodeInternal<nsStaticAtom*, nsStaticAtom*, nsStaticAtom*, nsStaticAtom*, nsStaticAtom*, nsStaticAtom*, nsStaticAtom*, nsStaticAtom*, nsStaticAtom*> (_ZNK7nsINode14IsNodeInternalIP12nsStaticAtomJS2_S2_S2_S2_S2_S2_S2_S2_EEEbT_DpT0_, funcdef_no=58719, decl_uid=1359003, cgraph_uid=49324, symbol_order=50513) GCC ends up producing empty BBS with 1753608 debug statements in nsINode::IsNodeInternal<nsStaticAtom*, nsStaticAtom*, nsStaticAtom*, nsStaticAtom*, nsStaticAtom*, nsStaticAtom*, nsStaticAtom*, nsStaticAtom*, nsStaticAtom*> (const struct nsINode * const this, struct nsStaticAtom * aFirst, struct nsStaticAtom * aArgs#0, struct nsStaticAtom * aArgs#1, struct nsStaticAtom * aArgs#2, struct nsStaticAtom * aArgs#3, struct nsStaticAtom * aArgs#4, struct nsStaticAtom * aArgs#5, struct nsStaticAtom * aArgs#6, struct nsStaticAtom * aArgs#7 I did not wait for the version with 9 parameters :)
It sounds like possibly "caused" by the PR90131 or PR89892 fixes. We've been here before btw (with the complaint of "useless" redundant debug-bind stmts). I see no "testcase" here?
Created attachment 46256 [details] debug stmt DCE Does the attached help? On the testcase from PR89892 it does @@ -204,7 +202,6 @@ goto <bb 5>; [100.00%] <bb 4> [local count: 21570272]: - # DEBUG d => NULL # DEBUG d => 0 # DEBUG BEGIN_STMT # DEBUG BEGIN_STMT @@ -213,8 +210,6 @@ a ={v} _3; # DEBUG BEGIN_STMT # DEBUG d => 1 - # DEBUG d => NULL - # DEBUG d => 1 # DEBUG BEGIN_STMT # DEBUG BEGIN_STMT _5 = c.2_6 + 1; and later @@ -268,7 +264,6 @@ # DEBUG BEGIN_STMT d_26 = d_32 + 1; # DEBUG d => d_26 - # DEBUG d => d_26 # DEBUG BEGIN_STMT if (d_26 <= 5) goto <bb 7>; [89.00%] for example. I need to ask Alex whether we can ignore location differences (and what differences) on the debug stmts.
Alex - in a series of # DEBUG x => ... # DEBUG x => ... # DEBUG x => ... what are the rules of which ones we can remove? Can we always just keep the last? What about location differences? What about possibly interleaving DEBUG_BEGIN stmts? For the quoted example # DEBUG aFirst => NULL # DEBUG aArgs#0 => NULL # DEBUG this => NULL # DEBUG this => NULL # DEBUG aFirst => NULL # DEBUG this => NULL # DEBUG this => NULL # DEBUG aFirst => NULL # DEBUG this => NULL # DEBUG aFirst => NULL # DEBUG aArgs#0 => NULL # DEBUG this => NULL # DEBUG aFirst => NULL # DEBUG aArgs#0 => NULL # DEBUG aArgs#1 => NULL # DEBUG this => NULL I fear the simple way of matching only adjacent stmts isn't catching all possibilities. Using a bitmap might be possible but can be also expensive.
I guess it is important to know if those are the same decls or different, just with the same names. -fdump-tree-all-uid will tell...
Created attachment 46257 [details] more aggressive variant More aggressive variant using a bitmap, simply keeping the last, does for example @@ -243,7 +237,6 @@ <bb 5> [local count: 43794188]: # DEBUG BEGIN_STMT i_28 = i_35 + 1; - # DEBUG i => i_28 # DEBUG d => 1 # DEBUG i => i_28 note if there are debug temps those are still not DCEd even if not used. In theory they should always have debug temp defs and uses in the same BB so the backward walk could gather local uses. But that needs more experiments / verification.
Hi, the file was too large for bugzilla so I uploaded it to http://www.ucw.cz/~hubicka/Unified_cpp_dom_events0-8.ii.xz and posted link in comment #2 :) The agressive variant helps (I did not try to other one) but we still get huge BBs containing: # DEBUG thisD.1400042 => NULL # DEBUG thisD.1400332 => NULL # DEBUG aFirstD.1400331 => NULL # DEBUG thisD.1400330 => NULL # DEBUG aFirstD.1400329 => NULL # DEBUG aArgs#0D.1400328 => NULL # DEBUG thisD.1400327 => NULL # DEBUG aFirstD.1400326 => NULL # DEBUG aArgs#0D.1400325 => NULL # DEBUG aArgs#1D.1400324 => NULL # DEBUG thisD.1400323 => NULL # DEBUG aFirstD.1400322 => NULL # DEBUG aArgs#0D.1400321 => NULL # DEBUG aArgs#1D.1400320 => NULL # DEBUG aArgs#2D.1400319 => NULL # DEBUG thisD.1400318 => NULL # DEBUG aFirstD.1400317 => NULL # DEBUG aArgs#0D.1400316 => NULL # DEBUG aArgs#1D.1400315 => NULL # DEBUG aArgs#2D.1400314 => NULL # DEBUG aArgs#3D.1400313 => NULL # DEBUG thisD.1400312 => NULL # DEBUG aFirstD.1400311 => NULL # DEBUG aArgs#0D.1400310 => NULL # DEBUG aArgs#1D.1400309 => NULL # DEBUG aArgs#2D.1400308 => NULL # DEBUG aArgs#3D.1400307 => NULL # DEBUG aArgs#4D.1400306 => NULL # DEBUG thisD.1400305 => NULL # DEBUG aFirstD.1400304 => NULL # DEBUG aArgs#0D.1400303 => NULL # DEBUG aArgs#1D.1400302 => NULL # DEBUG aArgs#2D.1400301 => NULL # DEBUG aArgs#3D.1400300 => NULL # DEBUG aArgs#4D.1400299 => NULL # DEBUG aArgs#5D.1400298 => NULL # DEBUG thisD.1400297 => NULL # DEBUG aFirstD.1400296 => NULL # DEBUG aArgs#0D.1400295 => NULL # DEBUG aArgs#1D.1400294 => NULL .... I will be back after breakfast :) Especialy for LTO we realy want to avoid too many of those in early opts. Honza
I also see # DEBUG D#1 => {CLOBBER} # DEBUG grayWordD.71258 => D#1 that looks pointless. Results from removing grayWord_57(D) ={v} {CLOBBER}; during into-SSA rewrite. That should instead be # DEBUG grayWord => NULL saving one debug-temp. Testing Index: gcc/tree-ssa.c =================================================================== --- gcc/tree-ssa.c (revision 270643) +++ gcc/tree-ssa.c (working copy) @@ -358,6 +358,11 @@ insert_debug_temp_for_var_def (gimple_st else if (value == error_mark_node) value = NULL; } + else if (gimple_clobber_p (def_stmt)) + /* We can end up here when rewriting a decl into SSA and coming + along a clobber for the original decl. Turn that into + # DEBUG decl => NULL */ + value = NULL; else if (is_gimple_assign (def_stmt)) { bool no_value = false;
Is # DEBUG INLINE_ENTRY NULL useful at all? And since I see # DEBUG D#1157 => &aOther_2(D)->mOffsetD.377222 # DEBUG thisD.1370683 => D#1158 # DEBUG aOtherD.1370684 => D#1157 # DEBUG INLINE_ENTRY __ct D.183324 MEM[(struct MaybeD.183104 *)this_5(D) + 16B].mIsSomeD.183308 = 0; _7 = MEM[(const struct Maybe &)aOther_2(D) + 16].mIsSomeD.183308; if (_7 != 0) goto <bb 3>; [50.00%] else goto <bb 4>; [50.00%] <bb 3> [local count: 536870913]: # DEBUG thisD.1370685 => D#1157 # DEBUG INLINE_ENTRY operator*D.183425 # DEBUG thisD.1370677 => D#1157 # DEBUG INLINE_ENTRY NULL # DEBUG thisD.1370678 => D#1157 # DEBUG INLINE_ENTRY NULL my guess that debug temps are always in the same BB as uses is wrong.
(In reply to Richard Biener from comment #9) > Created attachment 46257 [details] > more aggressive variant > > More aggressive variant using a bitmap, simply keeping the last, does for > example > > @@ -243,7 +237,6 @@ > <bb 5> [local count: 43794188]: > # DEBUG BEGIN_STMT > i_28 = i_35 + 1; > - # DEBUG i => i_28 > # DEBUG d => 1 > # DEBUG i => i_28 > > note if there are debug temps those are still not DCEd even if not used. > In theory they should always have debug temp defs and uses in the same > BB so the backward walk could gather local uses. But that needs more > experiments / verification. As noted this one likely isn't safe considering # DEBUG D#1 => 123; # DEBUG a => D#1; # DEBUG D#1 => 234; # DEBUG b => D#1; where we'd remove the # DEBUG D#1 => 123; stmt. It has been said the situation can only occur with DEBUG_EXPR_DECL thus excempting those would work. That is, replacing D#1 with say 'c' won't ever happen (non-DEBUG_EXPR_DECL decls on the RHS, there just appear SSA names, invariants or DEBUG_EXPR_DECLs). With not DCEing DEBUG_EXPR_DECL binds on the testcase we go from > wc -l Unified_cpp_dom_events0-8.ii.046t.release_ssa 163546 Unified_cpp_dom_events0-8.ii.046t.release_ssa to > wc -l Unified_cpp_dom_events0-8.ii.046t.release_ssa 163626 Unified_cpp_dom_events0-8.ii.046t.release_ssa
Created attachment 46258 [details] updated patch Updated patch. Previous aggressive variant survived bootstrap/test OK.
And it's indeed "caused" by the CFG-cleanup fixes. The DCE patch gets us back to (nearly) the same number of debug stmts as before.
Created attachment 46260 [details] removed-unused-local patch There do appear to be variables that are just appearing in # DEBUG var => NULL stmts, not in debug binds and not referenced elsewhere. But not many. Mostly unused this variables it seems. 362 debug binds are dead this way for the testcase (after the DCE patch).
Created attachment 46261 [details] Doing the removal in find_obviously_necessary_stmts Just for the record: I'd written this over the weekend for completely unrelated reasons (was looking at debug info, and wanted to replicate the DCE that is already done in RTL). It doesn't include the DEBUG_EXPR_DECL stuff, so obviously can't go in as-is, but it means that we never even add the redundant instructions to the worklist.
(In reply to rsandifo@gcc.gnu.org from comment #17) > Created attachment 46261 [details] > Doing the removal in find_obviously_necessary_stmts > > Just for the record: I'd written this over the weekend for completely > unrelated reasons (was looking at debug info, and wanted to replicate the > DCE that is already done in RTL). It doesn't include the DEBUG_EXPR_DECL > stuff, so obviously can't go in as-is, but it means that we never even add > the redundant instructions to the worklist. Yeah, that looks nearly equivalent (same issue with DEBUG_EXPR_DECLs though). I think doing it in eliminate_unnecessary_stmts is slightly better for the case where we come from # DEBUG i => NULL; i_3 = 1; # DEBUG i => i_3; and are about to DCE i_3 (I think that's not handled by my patch either without further tweaks). No debug-stmt support in the GIMPLE FE right now so not so easy to test atm. For int main() { int i; i = 1, i = 2; return i+1; } with -O -g -fdisable-tree-ccp1 -fdisable-tree-fre1 -fdisable-tree-evrp we go from <bb 2> : # DEBUG BEGIN_STMT # DEBUG BEGIN_STMT i_1 = 1; # DEBUG i => i_1 i_2 = 2; # DEBUG i => i_2 # DEBUG BEGIN_STMT _3 = 3; return 3; after DSE1 to <bb 2> : # DEBUG BEGIN_STMT # DEBUG BEGIN_STMT # DEBUG i => 1 # DEBUG i => 2 # DEBUG BEGIN_STMT return 3; after CDDCE1 which could be improved to just # DEBUG i => 2 (note the missing DEBUG BEGIN_STMT for compound stmts). The following would do that trick (doing a bit more debug DCE on the testcase in this PR) Index: gcc/tree-ssa-dce.c =================================================================== --- gcc/tree-ssa-dce.c (revision 270645) +++ gcc/tree-ssa-dce.c (working copy) @@ -1282,11 +1283,15 @@ eliminate_unnecessary_stmts (void) if (!is_gimple_debug (stmt)) something_changed = true; remove_dead_stmt (&gsi, bb, to_remove_edges); + continue; } else if (is_gimple_call (stmt)) {
(In reply to Richard Biener from comment #18) > (In reply to rsandifo@gcc.gnu.org from comment #17) > > Created attachment 46261 [details] > > Doing the removal in find_obviously_necessary_stmts > > > > Just for the record: I'd written this over the weekend for completely > > unrelated reasons (was looking at debug info, and wanted to replicate the > > DCE that is already done in RTL). It doesn't include the DEBUG_EXPR_DECL > > stuff, so obviously can't go in as-is, but it means that we never even add > > the redundant instructions to the worklist. > > Yeah, that looks nearly equivalent (same issue with DEBUG_EXPR_DECLs though). > > I think doing it in eliminate_unnecessary_stmts is slightly better for the > case where we come from > > # DEBUG i => NULL; > i_3 = 1; > # DEBUG i => i_3; > > and are about to DCE i_3 Ah, yeah.
Another report: https://gcc.gnu.org/ml/gcc/2019-04/msg00270.html
I just built Firefox with GCC 9 branch and the updated patch. Debug enabled build is: real 45m56.187s user 493m8.688s sys 22m4.512s compared to debug disabled build with GCC 9: real 35m5.141s user 386m2.364s sys 17m21.892s GCC 8 debug disabled build is: real 36m55.979s user 389m54.256s sys 17m51.964s and llvm7 debug disabled is: real 36m28.096s user 384m24.056s sys 8m53.768s All with LTO. I will build gcc8 and llvm7 with debug info enabled.
gcc 8 with debug info real 45m49.664s user 500m1.776s sys 22m39.816s llvm 7 with debug info real 43m43.798s user 447m36.028s sys 10m13.512s
> what are the rules of which ones we can remove? Can we always just keep the last? What about location differences? What about possibly interleaving DEBUG_BEGIN stmts? code insns and markers are potential inspection points, so binds reaching them should ideally not be messed with. this means one can safely remove binds that are overridden before the next inspection point, so yes, keeping only the last one that provides a binding for any given variable before each inspection point should be safe. this should (in theory) be reasonably rare, though, because there should be sequence points between assignments to the same variable. this doesn't imply inspection points, e.g., we don't insert markers at comma operators. I suppose these arise from binds inserted by CFG rearrangements rather than when entering SSA. I don't have much of a sense of these yet, but it's clear their meaning is looser, in a way. They don't refer to a specific code location or bind present in sources, but rather they indicate our inability to keep track of bindings after certain transformations. My intuition is that, as things are, we could drop them in the same circumstances we'd drop overriding binds, but maybe we could go looser about them, especially when they're resets. When you ask about location differences, it's not clear whether you're asking about binding values (used in location lists) or source line locations. It's not always viable to tell whether binding expressions are equivalent, but the last one prevails except at control flow confluences, and var-tracking deals with that much later. As for source line locations, I don't think they're relevant in debug bind statements. > # DEBUG INLINE_ENTRY NULL That is a marker, but without naming the inlined function, it would appear to be useless indeed. I wonder, however, if the logical block that named the inlined function was really lost, or if it just references some anonymous function. This would guide the decision on whether the marker became useless for losing track of the function named by it (which might require an investigation of how this came about, instead of silently dropping further debug info), or whether we just don't have a name for the function any more (but we might still have debug info generated for it) > my guess that debug temps are always in the same BB as uses is wrong. they can be bound in one BB and used in another, indeed. they may even be bound in multiple BBs and used in multiple other BBs, and then var-tracking will attempt to identify shared values in incoming exprs at confluence points. we don't do that very much, but we could. even without that, there are cases in which we issue temp binds in one block and reference them in another, even when we're not sure one reaches the other: var-tracking sorts that out eventually.
(In reply to Alexandre Oliva from comment #23) > > what are the rules of which ones we can remove? Can we always just keep the > last? What about location differences? What about possibly interleaving > DEBUG_BEGIN stmts? I think Richard's patch only considers debug bind stmts next to each other, any other stmt like DEBUG BEGIN_STMT, etc. reset the table (i.e. stop the elimination). > When you ask about location differences, it's not clear whether you're > asking about binding values (used in location lists) or source line > locations. It's not always viable to tell whether binding expressions are > equivalent, but the last one prevails except at control flow confluences, > and var-tracking deals with that much later. As for source line locations, > I don't think they're relevant in debug bind statements. My understanding is that we completely ignore gimple_location from debug bind stmts. While debug stmts and later DEBUG_INSNs do have location, the -fcompare-debug rules should make sure that other stmts/insns location is not affected by the locations of the debug stmts/insns. And, var-tracking replaces the DEBUG_INSNs with NOTE_INSN_VAR_LOCATION notes that do not have a locus. > > my guess that debug temps are always in the same BB as uses is wrong. > > they can be bound in one BB and used in another, indeed. they may even be > bound in multiple BBs and used in multiple other BBs, and then var-tracking > will attempt to identify shared values in incoming exprs at confluence > points. we don't do that very much, but we could. even without that, there > are cases in which we issue temp binds in one block and reference them in > another, even when we're not sure one reaches the other: var-tracking sorts > that out eventually. Besides Richard's eliminate_unnecessary_stmts patch, which you've acked and I'm going to install soon, there are other patches (GCC 10 only presumably): #c11 and #c16, what do you think about those? The former canonicalizes {CLOBBER} values of debug stmts to NULL, the latter tries to remove debug stmts for variables which are not referenced in the function at all and have only NULL debug binds for them; I believe in such cases the debug insns don't provide anything over the optimized away vars in the body. Another thing is if we could also remove DEBUG_EXPR_DECLs by doing analysis similar to #c16, look through all debug bind stmts, see what DEBUG_EXPR_DECLs they are refering to, and if there are DEBUG_EXPR_DECLs which are not referenced in any of them and they aren't referenced in decl_debug_args_lookup either, just remove debug bind stmts for those DEBUG_EXPR_DECLs. Similarly, would it be useful to reset debug bind stmts if they refer to DEBUG_EXPR_DECLs that are never set (or do it only for some simple cases where we couldn't figure out any location from the expr)?
Author: jakub Date: Tue Apr 30 07:40:06 2019 New Revision: 270674 URL: https://gcc.gnu.org/viewcvs?rev=270674&root=gcc&view=rev Log: PR tree-optimization/90273 * tree-ssa-dce.c (eliminate_unnecessary_stmts): Eliminate useless debug stmts. Modified: branches/gcc-9-branch/gcc/ChangeLog branches/gcc-9-branch/gcc/tree-ssa-dce.c
I saw the #c11 patch in gcc-patches, and it seemed to have been posted FTR and installed. It looked good, so I didn't comment on it. I agree about the effects of #c16, though I begin to get a feeling that it's working too hard for too little benefit. Ditto trying to optimize debug temps: you will get some savings, sure, but how much benefit for such global analyses? Perhaps we'd get a much bigger bang for the buck introducing vector resets, in which a single gimple bind stmt would reset several decls at once. If that's become as common as it is being made out to be, this could save a significant amount of memory. Though from Jan's comments on compile times, it doesn't look like we've got much slower, which makes me wonder what the new problem really is... Could it be that debug binds have always been there, plentiful but under the radar, and that the real recent regression (assuming there really is one) lies elsewhere? (sorry, I haven't really dug into it myself)
On April 30, 2019 4:27:25 PM GMT+02:00, "aoliva at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org> wrote: >https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90273 > >--- Comment #26 from Alexandre Oliva <aoliva at gcc dot gnu.org> --- >I saw the #c11 patch in gcc-patches, and it seemed to have been posted >FTR and >installed. It looked good, so I didn't comment on it. > >I agree about the effects of #c16, though I begin to get a feeling that >it's >working too hard for too little benefit. Ditto trying to optimize >debug temps: >you will get some savings, sure, but how much benefit for such global >analyses? > >Perhaps we'd get a much bigger bang for the buck introducing vector >resets, in >which a single gimple bind stmt would reset several decls at once. If >that's >become as common as it is being made out to be, this could save a >significant >amount of memory. > >Though from Jan's comments on compile times, it doesn't look like we've >got >much slower, which makes me wonder what the new problem really is... >Could it >be that debug binds have always been there, plentiful but under the >radar, and >that the real recent regression (assuming there really is one) lies >elsewhere? >(sorry, I haven't really dug into it myself) The recent regression is we no longer throw them away plentiful during CFG cleanup and now they pile up during inlining. I agree full DCE with liveness will be expensive for usually little gain. Not sure if vector resets will improve things much.
> The recent regression is we no longer throw them away plentiful during CFG > cleanup and now they pile up during inlining. > > I agree full DCE with liveness will be expensive for usually little gain. Not > sure if vector resets will improve things much. One thing to keep in mind that after early opts and in late opts after the initial cleanups post ipa-inline-transforms we likely have a lot of new debug statements brought in by inliner. It would make sense to do something more expensive twice in queue to get rid of them. Especially in early opts we do not want to make too many useless debug statements to hit the LTO stream or get duplicated by subsequent inlining. Honza
Fixed on gcc-9-branch, trunk commits left for Richard.
Fixed.
Author: rguenth Date: Thu May 2 11:17:00 2019 New Revision: 270791 URL: https://gcc.gnu.org/viewcvs?rev=270791&root=gcc&view=rev Log: 2019-05-02 Richard Biener <rguenther@suse.de> PR tree-optimization/90273 * tree-ssa-dce.c (eliminate_unnecessary_stmts): Eliminate useless debug stmts. Modified: trunk/gcc/ChangeLog trunk/gcc/tree-ssa-dce.c