This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: debugging info considered harmful to lto.


Hi,
to add some extra data
> At the summit, I discovered two things about the internal representation
> of debugging information:
> 
> 1) According to Honza, the instances of the BLOCK tree type take 30% of
> the space in a compilation.

This large share shows up on C++ testcases that do a lot of inlining
(tramp3d, boost, Gerald's testcase).  The reason is obvious - we have
several function calls for every instruction output.  Since early
inlining is good at killing instructions early, we end up with many
blocks (several blocks are constructed for every function inlined)
and they are never pruned.

I have a patch that removes empty blocks and uses the now unused
ignore_block debug hook (this usage probably went away with my rewrite
of block handling in cfglayout many years ago and no one noticed a
problem - so either ignore_block is currently overly conservative, or
we are missing some later sanity check that all needed blocks are
output).  It also knows how to be more aggressive without -g.  Without
-g most of the blocks go away; with -g it is about 30-50% if I remember
correctly.  (I made it a few months back when I first noticed the
problem.)
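
To illustrate the pruning rule on a toy model, here is a minimal sketch.
It is not the attached patch: struct scope_block, prune_scope_block ()
and count_blocks () are hypothetical names, the struct is not GCC's real
BLOCK node, and abstract origins and the -g distinction are left out.
A block is treated as dead when no statement uses it, it declares no
live variables, and at most one live subblock remains under it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for GCC's BLOCK tree node; not the real layout.  */
struct scope_block
{
  bool used;                     /* some statement refers to this block */
  int live_vars;                 /* live variables declared in this block */
  struct scope_block *subblocks; /* first child */
  struct scope_block *chain;     /* next sibling */
};

/* Return true if SCOPE can be dropped.  Dead children are unlinked; a dead
   child that still has one live subblock is replaced by that subblock.  */
static bool
prune_scope_block (struct scope_block *scope, bool is_outermost)
{
  struct scope_block **t = &scope->subblocks;
  int live_subblocks = 0;

  while (*t)
    {
      if (prune_scope_block (*t, false))
        {
          struct scope_block *next = (*t)->chain;
          struct scope_block *child = (*t)->subblocks;

          if (child)
            {
              /* A dead block keeps at most one surviving subblock.  */
              assert (child->chain == NULL);
              child->chain = next;
              *t = child;
              t = &child->chain;
              live_subblocks++;
            }
          else
            *t = next;
        }
      else
        {
          t = &(*t)->chain;
          live_subblocks++;
        }
    }

  /* The outermost block of a function is always kept.  */
  if (is_outermost || scope->used || scope->live_vars > 0)
    return false;
  return live_subblocks <= 1;
}

/* Count SCOPE, its siblings and everything below them.  */
static int
count_blocks (const struct scope_block *scope)
{
  int n = 0;

  for (; scope; scope = scope->chain)
    n += 1 + count_blocks (scope->subblocks);
  return n;
}
```

Calling prune_scope_block (outer, true) on a function's outermost block
and comparing count_blocks () before and after gives a rough idea of how
many pure wrapper blocks a chain of inlined calls leaves behind.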

I am attaching my work in progress patch (it ICEs in libjava compilation
where it didn't last month so I need to fix it before sending for
review) just in case someone is interested.

> 2) The BLOCKS structure is linked in a way so that the blocks for one
> function link to the blocks of other functions. 
> 
> These two facts conspire to create a big problem for GCC/LTO, especially
> when we progress to trying to compile very large programs.  Unlike many
> other essential parts of gcc, the current representation of debugging
> information is not one that can be divided into moderate sized pieces
> that can be processed independently.
> 
> Honza's last lto patch "solves" the problem in the very short term by
> simply creating a single BLOCK for each function.  This provides enough
> information to allow our testing to continue, but means that no useful
> debugging information will be generated.   This is not acceptable in
> even the medium term, but it allows the problem to be deferred while
> Mark and I get the basics of reading and writing gimple working. 
> 
> I find it somewhat surprising that we need so many blocks.  My
> experience is that in real programs few blocks actually have any local
> declarations and it appears that we do not bother to get rid of the
> blocks that have no local decls.   However the biggest problem for lto
> is that when a procedure is inlined, the set of blocks for the inlined
> function is copied and the new copies have a cross link to the original
> version. It would help a lot if that pointer could be replaced with

The particular problem here is that the abstract origin pointers point
to the blocks within the functions they were constructed from.  These
are used by dwarf2out to output an abstract copy of the function, which
then serves as the destination of origin pointers from every copy of
the function.  (I am not sure what GDB needs the block origins for; an
explanation would be welcome.)

The functions pointed to might otherwise be invisible to the middle end
(either never finalized by the frontend - such as in the case of
templates, where the ABSTRACT_ORIGINs point to the C++ representation of
the uninstantiated template - or just declared dead by cgraph and
removed before lowering) and thus also never saved to LTO files in the
way we intend them to work.

In debug output every function can also appear twice - once as an
abstract function that is output early via a debug hook in its original
form, with all the blocks the abstract origins point to (this seems
partly broken, I will try to send a fix).  The second time as the real
function, such as when produced offline, which can have an already
modified block structure (by inlining or by my little block removal
pass).  Matching the origin pointers is then a somewhat sloppy process
(the ABSTRACT_ORIGIN pointers point to the now modified BLOCK structure,
but the dwarf2out data is already out in unmodified form).

For LTO we probably need to find a way to pickle such abstract
functions in addition to the full representation of functions that were
partly processed by optimizations.

> something like a pointer to a function decl and the dfs number of the
> block in the original function.  I do not know the semantics of what is
> needed by the debuggers, but some representation where each function can
> be managed as a separate unit is going to be required to process large
> programs. 
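
One way to read the suggestion quoted above, as a minimal sketch only:
number the blocks of each function in depth-first order and refer to a
block by (function, number) instead of by pointer.  All names here
(struct blk, struct block_ref, number_blocks, find_block) are
hypothetical and nothing like this exists in GCC as far as this mail
says; a preorder numbering is assumed, since the quoted text does not
say which DFS order is meant.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down scope block.  */
struct blk
{
  struct blk *subblocks; /* first child */
  struct blk *chain;     /* next sibling */
  int dfs_num;           /* preorder number within its function */
};

/* Hypothetical cross-function reference that holds no pointer into the
   other function's block tree.  */
struct block_ref
{
  const char *fn_name; /* stands in for a pointer to the function decl */
  int dfs_num;         /* which block inside that function */
};

/* Assign preorder numbers starting at NEXT to B, its siblings and their
   subblocks; return the first number that was not used.  */
static int
number_blocks (struct blk *b, int next)
{
  for (; b; b = b->chain)
    {
      b->dfs_num = next++;
      next = number_blocks (b->subblocks, next);
    }
  return next;
}

/* Find the block numbered DFS_NUM below B, or NULL if there is none.  */
static struct blk *
find_block (struct blk *b, int dfs_num)
{
  for (; b; b = b->chain)
    {
      struct blk *hit;

      if (b->dfs_num == dfs_num)
        return b;
      hit = find_block (b->subblocks, dfs_num);
      if (hit)
        return hit;
    }
  return NULL;
}
```

With such a reference each function's blocks could be written out and
read back on their own; the price is that the numbering has to stay
stable, or be remapped, whenever blocks of the original function are
removed.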

I would like to ask for some help here too.  It seems to me that this
is just the tip of the iceberg - most of the other debug info goes out
on the side directly from the frontend, and we need a way to read it
back into the LTO frontend somehow and get it through with correct
modifications when optimization happens...

I am trying to make sense of the current way debug info is handled but
it is a bit challenging.  We are inconsistent in many ways - for example
we output debug info on optimized out locals in some cases but not in
others, we take some care to output info on optimized out static
variables, but not static locals.  We take care to output debug info on
optimized out inline functions, but not for functions that are not
inline, etc.  I believe a lot of this inconsistency was actually brought
in by my cgraph work, so I would like to handle this somehow :(

It seems to me that cgraph/varpool is generally not a good place to deal
with debug info for these reasons.  We probably need a real symbol table
that replaces the current wrapup_global_declarations process...

Does someone have an idea of the overall plan for how debug info should
work and what should be there?

Honza
> 
> Suggestions are welcome, but volunteers willing to attack this problem
> are truly needed.  I do not think that anyone would take lto seriously
> if we cannot support debugging; only toy compilers do not have debugging. 
> 
> Kenny
>  
> 

Index: gimple-low.c
===================================================================
*** gimple-low.c	(revision 124614)
--- gimple-low.c	(working copy)
*************** lower_stmt (tree_stmt_iterator *tsi, str
*** 210,216 ****
  {
    tree stmt = tsi_stmt (*tsi);
  
!   if (EXPR_HAS_LOCATION (stmt) && data)
      TREE_BLOCK (stmt) = data->block;
  
    switch (TREE_CODE (stmt))
--- 210,218 ----
  {
    tree stmt = tsi_stmt (*tsi);
  
!   if (EXPR_HAS_LOCATION (stmt) && data
!       && (debug_info_level == DINFO_LEVEL_NORMAL
!           || debug_info_level == DINFO_LEVEL_VERBOSE))
      TREE_BLOCK (stmt) = data->block;
  
    switch (TREE_CODE (stmt))
Index: tree-ssa-live.c
===================================================================
*** tree-ssa-live.c	(revision 124614)
--- tree-ssa-live.c	(working copy)
*************** Boston, MA 02110-1301, USA.  */
*** 30,35 ****
--- 30,37 ----
  #include "tree-dump.h"
  #include "tree-ssa-live.h"
  #include "toplev.h"
+ #include "debug.h"
+ #include "flags.h"
  
  #ifdef ENABLE_CHECKING
  static void  verify_live_on_entry (tree_live_info_p);
*************** mark_all_vars_used_1 (tree *tp, int *wal
*** 405,413 ****
--- 407,421 ----
  		      void *data ATTRIBUTE_UNUSED)
  {
    tree t = *tp;
+   char const c = TREE_CODE_CLASS (TREE_CODE (t));
+   tree b;
  
    if (TREE_CODE (t) == SSA_NAME)
      t = SSA_NAME_VAR (t);
+   if ((IS_EXPR_CODE_CLASS (c)
+        || IS_GIMPLE_STMT_CODE_CLASS (c))
+       && (b = TREE_BLOCK (t)) != NULL)
+     TREE_USED (b) = true;
  
    /* Ignore TREE_ORIGINAL for TARGET_MEM_REFS, as well as other
       fields that do not contain vars.  */
*************** mark_all_vars_used_1 (tree *tp, int *wal
*** 431,436 ****
--- 439,545 ----
    return NULL;
  }
  
+ /* Mark the scope block SCOPE and its subblocks unused when they can be
+    possibly eliminated if dead.  */
+ 
+ static void
+ mark_scope_block_unused (tree scope)
+ {
+   tree t;
+   TREE_USED (scope) = false;
+   if (!(*debug_hooks->ignore_block) (scope))
+     TREE_USED (scope) = true;
+   for (t = BLOCK_SUBBLOCKS (scope); t ; t = BLOCK_CHAIN (t))
+     mark_scope_block_unused (t);
+ }
+ 
+ /* Look if the block is dead (by possibly eliminating its dead subblocks)
+    and return true if so.
+    Block is declared dead if:
+      1) No statements are associated with it.
+      2) It declares no live variables.
+      3) All subblocks are dead
+ 	or there is precisely one subblock and the block
+ 	has the same abstract origin as the outer block and declares
+ 	no variables, so it is a pure wrapper.
+    When we are not outputting full debug info, we also eliminate dead variables
+    out of scope blocks to let them be recycled by GGC and to save copying work
+    done by the inliner.
+ */
+ 
+ static bool
+ remove_unused_scope_block_p (tree scope)
+ {
+   tree *t, *next;
+   bool unused = !TREE_USED (scope);
+   var_ann_t ann;
+   int nsubblocks = 0;
+ 
+   for (t = &BLOCK_VARS (scope); *t; t = next)
+     {
+       next = &TREE_CHAIN (*t);
+ 
+       /* Debug info of nested function refers to the block of the
+ 	 function.  */
+       if (TREE_CODE (*t) == FUNCTION_DECL)
+ 	unused = false;
+ 
+       /* When we are outputting debug info, we usually want to output
+ 	 info about optimized-out variables in the scope blocks.
+ 	 Exception are the scope blocks not containing any instructions
+ 	 at all so user can't get into the scopes at first place.  */
+       else if ((ann = var_ann (*t)) != NULL
+ 		&& ann->used)
+ 	unused = false;
+ 
+       /* When we are not doing full debug info, we however can keep around
+ 	 only the used variables for cfgexpand's memory packing saving quite
+ 	 a lot of memory.  */
+       else if (debug_info_level != DINFO_LEVEL_NORMAL
+ 	       && debug_info_level != DINFO_LEVEL_VERBOSE)
+ 	{
+ 	  *t = TREE_CHAIN (*t);
+ 	  next = t;
+ 	}
+     }
+ 
+   for (t = &BLOCK_SUBBLOCKS (scope); *t ;)
+     if (remove_unused_scope_block_p (*t))
+       {
+ 	if (BLOCK_SUBBLOCKS (*t))
+ 	  {
+ 	    tree next = BLOCK_CHAIN (*t);
+ 	    *t = BLOCK_SUBBLOCKS (*t);
+ 	    BLOCK_CHAIN (*t) = next;
+ 	    t = &BLOCK_CHAIN (*t);
+ 	  }
+ 	else
+           *t = BLOCK_CHAIN (*t);
+       }
+     else
+       {
+         t = &BLOCK_CHAIN (*t);
+ 	nsubblocks ++;
+       }
+    /* Outer scope is always used.  */
+    if (!BLOCK_SUPERCONTEXT (scope)
+        || TREE_CODE (BLOCK_SUPERCONTEXT (scope)) == FUNCTION_DECL)
+      unused = false;
+    /* If there are more than one live subblocks, it is used.  */
+    else if (nsubblocks > 1)
+      unused = false;
+    /* When there is only one subblock, see if it is just wrapper we can
+       ignore.  Wrappers are not declaring any variables and not changing
+       abstract origin.  */
+    else if (nsubblocks == 1
+ 	    && (BLOCK_VARS (scope)
+ 		|| ((debug_info_level == DINFO_LEVEL_NORMAL
+ 		     || debug_info_level == DINFO_LEVEL_VERBOSE)
+ 		    && ((BLOCK_ABSTRACT_ORIGIN (scope)
+ 			!= BLOCK_ABSTRACT_ORIGIN (BLOCK_SUPERCONTEXT (scope)))))))
+      unused = false;
+    return unused;
+ }
  
  /* Mark all VAR_DECLS under *EXPR_P as used, so that they won't be 
     eliminated during the tree->rtl conversion process.  */
*************** remove_unused_locals (void)
*** 452,457 ****
--- 561,567 ----
    referenced_var_iterator rvi;
    var_ann_t ann;
  
+   mark_scope_block_unused (DECL_INITIAL (current_function_decl));
    /* Assume all locals are unused.  */
    FOR_EACH_REFERENCED_VAR (t, rvi)
      var_ann (t)->used = false;
*************** remove_unused_locals (void)
*** 498,504 ****
  	  *cell = TREE_CHAIN (*cell);
  	  continue;
  	}
- 
        cell = &TREE_CHAIN (*cell);
      }
  
--- 608,613 ----
*************** remove_unused_locals (void)
*** 516,521 ****
--- 625,631 ----
  	&& !ann->symbol_mem_tag
  	&& !TREE_ADDRESSABLE (t))
        remove_referenced_var (t);
+   remove_unused_scope_block_p (DECL_INITIAL (current_function_decl));
  }
  
  
Index: tree-inline.c
===================================================================
*** tree-inline.c	(revision 124614)
--- tree-inline.c	(working copy)
*************** expand_call_inline (basic_block bb, tree
*** 2498,2507 ****
       actual inline expansion of the body, and a label for the return
       statements within the function to jump to.  The type of the
       statement expression is the return type of the function call.  */
!   id->block = make_node (BLOCK);
!   BLOCK_ABSTRACT_ORIGIN (id->block) = fn;
!   BLOCK_SOURCE_LOCATION (id->block) = input_location;
!   add_lexical_block (TREE_BLOCK (stmt), id->block);
  
    /* Local declarations will be replaced by their equivalents in this
       map.  */
--- 2498,2513 ----
       actual inline expansion of the body, and a label for the return
       statements within the function to jump to.  The type of the
       statement expression is the return type of the function call.  */
!   if (debug_info_level == DINFO_LEVEL_NORMAL
!       || debug_info_level == DINFO_LEVEL_VERBOSE)
!     {
!       id->block = make_node (BLOCK);
!       BLOCK_ABSTRACT_ORIGIN (id->block) = fn;
!       BLOCK_SOURCE_LOCATION (id->block) = input_location;
!       add_lexical_block (TREE_BLOCK (stmt), id->block);
!     }
!   else
!     id->block = DECL_INITIAL (current_function_decl);
  
    /* Local declarations will be replaced by their equivalents in this
       map.  */
Index: tree-cfg.c
===================================================================
*** tree-cfg.c	(revision 124614)
--- tree-cfg.c	(working copy)
*************** move_sese_region_to_fn (struct function 
*** 4925,4930 ****
--- 4925,4948 ----
    return bb;
  }
  
+ /* Dump scope blocks.  */
+ 
+ static void
+ dump_scope_block (FILE *file, int indent, tree scope, int flags)
+ {
+   tree var, t;
+ 
+   fprintf (file, "\n%*sScope block #%i\n",indent, "" , BLOCK_NUMBER (scope));
+   for (var = BLOCK_VARS (scope); var; var = TREE_CHAIN (var))
+     {
+       fprintf (file, "%*s",indent, "");
+       print_generic_decl (file, var, flags);
+       fprintf (file, "\n");
+     }
+   for (t = BLOCK_SUBBLOCKS (scope); t ; t = BLOCK_CHAIN (t))
+     dump_scope_block (file, indent + 2, t, flags);
+ }
+ 
  
  /* Dump FUNCTION_DECL FN to file FILE using FLAGS (see TDF_* in tree.h)  */
  
*************** dump_function_to_file (tree fn, FILE *fi
*** 4980,4985 ****
--- 4998,5005 ----
  
  	  any_var = true;
  	}
+ 
+       dump_scope_block (file, 0, DECL_INITIAL (fn), flags);
      }
  
    if (cfun && cfun->decl == fn && cfun->cfg && basic_block_info)

