[rfc] Whole program optimization

Jan Hubicka jh@suse.cz
Wed May 25 23:31:00 GMT 2005


Hi,
this patch adds the -fwhole-program command line option, which effectively makes
all public functions static except for main and those marked with the "used"
attribute.  With --combine it is useful for making IPA considerably more active
(especially code-size wise at the moment, but hopefully performance wise soon
too).

There are two issues I would like to discuss before applying the patch.
I remember that we discussed this with Richard a while back, and Richard
argued that ELF visibility flags should be used instead.  I don't have
an archive of the discussion and I am not quite convinced that this is
possible (this is quite orthogonal to the issue of visibility outside
the linked object), so if I am missing something here, please enlighten me.

The other issue is the way of explicitly marking entry points.  At the moment I
reuse the "used" attribute, as any function marked by this beast cannot be
properly optimized anyway, but it might be useful to invent a new attribute for
this, such as "externally_visible" or so.

Honza

Bootstrapped/regtested i686-pc-linux-gnu.

2005-05-26  Jan Hubicka  <jh@suse.cz>
	* cgraph.c (dump_cgraph_node): Print new flags.
	(dump_cgraph_varpool_node): Likewise.
	(decide_is_variable_needed): Initialize externally_visible flag.
	* cgraph.h (cgraph_local_info): Add externally_visible flag.
	(cgraph_varpool_node): Likewise.
	(cgraph_function_flags_ready): Declare.
	* cgraphunit.c (cgraph_mark_local_functions): Rename to ...
	(cgraph_function_and_variable_visibility) ... this one; handle
	externally_visible flags.
	(decide_is_function_needed): Set externally_visible flag.
	(cgraph_finalize_function): Deal properly with early cleanups.
	(cgraph_optimize): Update call of
	cgraph_function_and_variable_visibility.
	* common.opt (fwhole-program): New.
	* doc/invoke.texi (-fwhole-program): Document.

Index: cgraph.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/cgraph.c,v
retrieving revision 1.73
diff -c -3 -p -r1.73 cgraph.c
*** cgraph.c	25 May 2005 12:33:30 -0000	1.73
--- cgraph.c	25 May 2005 15:07:18 -0000
*************** dump_cgraph_node (FILE *f, struct cgraph
*** 581,590 ****
--- 581,596 ----
      fprintf (f, " output");
    if (node->local.local)
      fprintf (f, " local");
+   if (node->local.externally_visible)
+     fprintf (f, " externally_visible");
+   if (node->local.finalized)
+     fprintf (f, " finalized");
    if (node->local.disregard_inline_limits)
      fprintf (f, " always_inline");
    else if (node->local.inlinable)
      fprintf (f, " inlinable");
+   if (node->local.redefined_extern_inline)
+     fprintf (f, " redefined_extern_inline");
    if (TREE_ASM_WRITTEN (node->decl))
      fprintf (f, " asm_written");
  
*************** dump_cgraph_varpool_node (FILE *f, struc
*** 638,643 ****
--- 644,651 ----
      fprintf (f, " finalized");
    if (node->output)
      fprintf (f, " output");
+   if (node->externally_visible)
+     fprintf (f, " externally_visible");
    fprintf (f, "\n");
  }
  
*************** decide_is_variable_needed (struct cgraph
*** 771,784 ****
  {
    /* If the user told us it is used, then it must be so.  */
    if (lookup_attribute ("used", DECL_ATTRIBUTES (decl)))
!     return true;
  
    /* ??? If the assembler name is set by hand, it is possible to assemble
       the name later after finalizing the function and the fact is noticed
       in assemble_name then.  This is arguably a bug.  */
    if (DECL_ASSEMBLER_NAME_SET_P (decl)
        && TREE_SYMBOL_REFERENCED (DECL_ASSEMBLER_NAME (decl)))
!     return true;
  
    /* If we decided it was needed before, but at the time we didn't have
       the definition available, then it's still needed.  */
--- 779,800 ----
  {
    /* If the user told us it is used, then it must be so.  */
    if (lookup_attribute ("used", DECL_ATTRIBUTES (decl)))
!     {
!       if (TREE_PUBLIC (decl))
!         node->externally_visible = true;
!       return true;
!     }
  
    /* ??? If the assembler name is set by hand, it is possible to assemble
       the name later after finalizing the function and the fact is noticed
       in assemble_name then.  This is arguably a bug.  */
    if (DECL_ASSEMBLER_NAME_SET_P (decl)
        && TREE_SYMBOL_REFERENCED (DECL_ASSEMBLER_NAME (decl)))
!     {
!       if (TREE_PUBLIC (decl))
!         node->externally_visible = true;
!       return true;
!     }
  
    /* If we decided it was needed before, but at the time we didn't have
       the definition available, then it's still needed.  */
Index: cgraph.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/cgraph.h,v
retrieving revision 1.54
diff -c -3 -p -r1.54 cgraph.h
*** cgraph.h	25 May 2005 12:33:31 -0000	1.54
--- cgraph.h	25 May 2005 15:07:18 -0000
*************** struct cgraph_local_info GTY(())
*** 36,41 ****
--- 36,44 ----
       and its address is never taken.  */
    bool local;
  
+   /* Set when function is visible by other units.  */
+   bool externally_visible;
+ 
    /* Set once it has been finalized so we consider it to be output.  */
    bool finalized;
  
*************** struct cgraph_varpool_node GTY(())
*** 177,182 ****
--- 180,187 ----
    bool finalized;
    /* Set when function is scheduled to be assembled.  */
    bool output;
+   /* Set when variable is visible by other units.  */
+   bool externally_visible;
    /* Set for aliases once they got through assemble_alias.  */
    bool alias;
  };
*************** extern GTY(()) struct cgraph_node *cgrap
*** 185,190 ****
--- 190,196 ----
  extern GTY(()) int cgraph_n_nodes;
  extern GTY(()) int cgraph_max_uid;
  extern bool cgraph_global_info_ready;
+ extern bool cgraph_function_flags_ready;
  extern GTY(()) struct cgraph_node *cgraph_nodes_queue;
  
  extern GTY(()) struct cgraph_varpool_node *cgraph_varpool_first_unanalyzed_node;
Index: cgraphunit.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/cgraphunit.c,v
retrieving revision 1.107
diff -c -3 -p -r1.107 cgraphunit.c
*** cgraphunit.c	20 May 2005 08:05:07 -0000	1.107
--- cgraphunit.c	25 May 2005 15:07:18 -0000
*************** static void cgraph_expand_all_functions 
*** 170,176 ****
  static void cgraph_mark_functions_to_output (void);
  static void cgraph_expand_function (struct cgraph_node *);
  static tree record_call_1 (tree *, int *, void *);
- static void cgraph_mark_local_functions (void);
  static void cgraph_analyze_function (struct cgraph_node *node);
  static void cgraph_create_edges (struct cgraph_node *node, tree body);
  
--- 170,175 ----
*************** static bool
*** 191,196 ****
--- 190,216 ----
  decide_is_function_needed (struct cgraph_node *node, tree decl)
  {
    tree origin;
+   if (MAIN_NAME_P (DECL_NAME (decl))
+       && TREE_PUBLIC (decl))
+     {
+       node->local.externally_visible = true;
+       return true;
+     }
+ 
+   /* If the user told us it is used, then it must be so.  */
+   if (lookup_attribute ("used", DECL_ATTRIBUTES (decl)))
+     {
+       if (TREE_PUBLIC (decl))
+         node->local.externally_visible = true;
+       return true;
+     }
+ 
+   /* ??? If the assembler name is set by hand, it is possible to assemble
+      the name later after finalizing the function and the fact is noticed
+      in assemble_name then.  This is arguably a bug.  */
+   if (DECL_ASSEMBLER_NAME_SET_P (decl)
+       && TREE_SYMBOL_REFERENCED (DECL_ASSEMBLER_NAME (decl)))
+     return true;
  
    /* If we decided it was needed before, but at the time we didn't have
       the body of the function available, then it's still needed.  We have
*************** decide_is_function_needed (struct cgraph
*** 200,206 ****
  
    /* Externally visible functions must be output.  The exception is
       COMDAT functions that must be output only when they are needed.  */
!   if (TREE_PUBLIC (decl) && !DECL_COMDAT (decl) && !DECL_EXTERNAL (decl))
      return true;
  
    /* Constructors and destructors are reachable from the runtime by
--- 224,231 ----
  
    /* Externally visible functions must be output.  The exception is
       COMDAT functions that must be output only when they are needed.  */
!   if ((TREE_PUBLIC (decl) && !flag_whole_program)
!       && !DECL_COMDAT (decl) && !DECL_EXTERNAL (decl))
      return true;
  
    /* Constructors and destructors are reachable from the runtime by
*************** decide_is_function_needed (struct cgraph
*** 208,224 ****
    if (DECL_STATIC_CONSTRUCTOR (decl) || DECL_STATIC_DESTRUCTOR (decl))
      return true;
  
-   /* If the user told us it is used, then it must be so.  */
-   if (lookup_attribute ("used", DECL_ATTRIBUTES (decl)))
-     return true;
- 
-   /* ??? If the assembler name is set by hand, it is possible to assemble
-      the name later after finalizing the function and the fact is noticed
-      in assemble_name then.  This is arguably a bug.  */
-   if (DECL_ASSEMBLER_NAME_SET_P (decl)
-       && TREE_SYMBOL_REFERENCED (DECL_ASSEMBLER_NAME (decl)))
-     return true;
- 
    if (flag_unit_at_a_time)
      return false;
  
--- 233,238 ----
*************** cgraph_finalize_function (tree decl, boo
*** 418,423 ****
--- 432,444 ----
    if (decide_is_function_needed (node, decl))
      cgraph_mark_needed_node (node);
  
+   /* Since we reclaim unreachable nodes at the end of every language
+      level unit, we need to be conservative about possible entry points
+      there.  */
+   if (flag_whole_program
+       && (TREE_PUBLIC (decl) && !DECL_COMDAT (decl) && !DECL_EXTERNAL (decl)))
+     cgraph_mark_reachable_node (node);
+ 
    /* If not unit at a time, go ahead and emit everything we've found
       to be reachable at this time.  */
    if (!nested)
*************** cgraph_expand_all_functions (void)
*** 1041,1066 ****
    free (order);
  }
  
! /* Mark all local functions.
     
     A local function is one whose calls can occur only in the current
     compilation unit and all its calls are explicit, so we can change
     its calling convention.  We simply mark all static functions whose
!    address is not taken as local.  */
  
  static void
! cgraph_mark_local_functions (void)
  {
    struct cgraph_node *node;
  
-   /* Figure out functions we want to assemble.  */
    for (node = cgraph_nodes; node; node = node->next)
      {
        node->local.local = (!node->needed
! 		           && DECL_SAVED_TREE (node->decl)
! 		           && !TREE_PUBLIC (node->decl));
      }
  
    if (cgraph_dump_file)
      {
        fprintf (cgraph_dump_file, "\nMarking local functions:");
--- 1062,1123 ----
    free (order);
  }
  
! /* Mark visibility of all functions.
     
     A local function is one whose calls can occur only in the current
     compilation unit and all its calls are explicit, so we can change
     its calling convention.  We simply mark all static functions whose
!    address is not taken as local.
! 
!    We also clear the TREE_PUBLIC flag of all declarations that are public
!    from the language point of view when -fwhole-program overrides this
!    default from the backend point of view.  */
  
  static void
! cgraph_function_and_variable_visibility (void)
  {
    struct cgraph_node *node;
+   struct cgraph_varpool_node *vnode;
  
    for (node = cgraph_nodes; node; node = node->next)
      {
+       if (node->reachable
+ 	  && (DECL_COMDAT (node->decl)
+ 	      || (TREE_PUBLIC (node->decl) && !DECL_EXTERNAL (node->decl)
+ 		  && !flag_whole_program)))
+ 	node->local.externally_visible = 1;
+       if (!node->local.externally_visible && node->analyzed
+ 	  && !DECL_EXTERNAL (node->decl))
+ 	{
+ 	  gcc_assert (flag_whole_program || !TREE_PUBLIC (node->decl));
+ 	  TREE_PUBLIC (node->decl) = 0;
+ 	}
        node->local.local = (!node->needed
! 			   && node->analyzed
! 			   && !TREE_PUBLIC (node->decl));
!     }
!   for (vnode = cgraph_varpool_nodes_queue; vnode; vnode = vnode->next_needed)
!     {
!       if (vnode->needed
! 	  && (DECL_COMDAT (vnode->decl)
! 	      || (TREE_PUBLIC (vnode->decl) && !flag_whole_program)))
! 	vnode->externally_visible = 1;
!       if (!vnode->externally_visible)
! 	{
! 	  gcc_assert (flag_whole_program || !TREE_PUBLIC (vnode->decl));
! 	  TREE_PUBLIC (vnode->decl) = 0;
! 	}
+       gcc_assert (TREE_STATIC (vnode->decl));
      }
  
+   /* Because we have to be conservative on the boundaries of source
+      level units, it is possible that we marked some functions as
+      reachable just because they might be used later via external
+      linkage, but after making them local they are really unreachable
+      now.  */
+   if (flag_whole_program)
+     cgraph_remove_unreachable_nodes (true, cgraph_dump_file);
+ 
    if (cgraph_dump_file)
      {
        fprintf (cgraph_dump_file, "\nMarking local functions:");
*************** cgraph_mark_local_functions (void)
*** 1068,1074 ****
--- 1125,1137 ----
  	if (node->local.local)
  	  fprintf (cgraph_dump_file, " %s", cgraph_node_name (node));
        fprintf (cgraph_dump_file, "\n\n");
+       fprintf (cgraph_dump_file, "\nMarking externally visible functions:");
+       for (node = cgraph_nodes; node; node = node->next)
+ 	if (node->local.externally_visible)
+ 	  fprintf (cgraph_dump_file, " %s", cgraph_node_name (node));
+       fprintf (cgraph_dump_file, "\n\n");
      }
+   cgraph_function_flags_ready = true;
  }
  
  /* Return true when function body of DECL still needs to be kept around
*************** cgraph_optimize (void)
*** 1113,1119 ****
    if (!quiet_flag)
      fprintf (stderr, "Performing intraprocedural optimizations\n");
  
!   cgraph_mark_local_functions ();
    if (cgraph_dump_file)
      {
        fprintf (cgraph_dump_file, "Marked ");
--- 1176,1182 ----
    if (!quiet_flag)
      fprintf (stderr, "Performing intraprocedural optimizations\n");
  
!   cgraph_function_and_variable_visibility ();
    if (cgraph_dump_file)
      {
        fprintf (cgraph_dump_file, "Marked ");
Index: common.opt
===================================================================
RCS file: /cvs/gcc/gcc/gcc/common.opt,v
retrieving revision 1.71
diff -c -3 -p -r1.71 common.opt
*** common.opt	25 May 2005 04:16:38 -0000	1.71
--- common.opt	25 May 2005 15:07:18 -0000
*************** fweb
*** 983,988 ****
--- 983,992 ----
  Common Report Var(flag_web) Init(0)
  Construct webs and split unrelated uses of single variable
  
+ fwhole-program
+ Common Report Var(flag_whole_program) Init(0)
+ Perform whole program optimizations
+ 
  fwrapv
  Common Report Var(flag_wrapv)
  Assume signed arithmetic overflow wraps around
Index: doc/invoke.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/invoke.texi,v
retrieving revision 1.625
diff -c -3 -p -r1.625 invoke.texi
*** doc/invoke.texi	25 May 2005 04:17:46 -0000	1.625
--- doc/invoke.texi	25 May 2005 15:07:20 -0000
*************** Objective-C and Objective-C++ Dialects}.
*** 334,340 ****
  -ftree-dominator-opts -ftree-dse -ftree-copyrename -ftree-sink @gol
  -ftree-ch -ftree-sra -ftree-ter -ftree-lrs -ftree-fre -ftree-vectorize @gol
  -ftree-salias -fweb @gol
! -ftree-copy-prop -ftree-store-ccp -ftree-store-copy-prop @gol
  --param @var{name}=@var{value}
  -O  -O0  -O1  -O2  -O3  -Os}
  
--- 334,340 ----
  -ftree-dominator-opts -ftree-dse -ftree-copyrename -ftree-sink @gol
  -ftree-ch -ftree-sra -ftree-ter -ftree-lrs -ftree-fre -ftree-vectorize @gol
  -ftree-salias -fweb @gol
! -ftree-copy-prop -ftree-store-ccp -ftree-store-copy-prop -fwhole-program @gol
  --param @var{name}=@var{value}
  -O  -O0  -O1  -O2  -O3  -Os}
  
*************** Enabled at levels @option{-O2}, @option{
*** 5233,5238 ****
--- 5233,5251 ----
  on targets where the default format for debugging information supports
  variable tracking.
  
+ @item -fwhole-program
+ @opindex fwhole-program
+ Assume that the current compilation unit represents the whole program being
+ compiled.  All public functions and variables, with the exception of @code{main}
+ and those marked by the attribute @code{used}, become static and in
+ effect get optimized more aggressively by interprocedural optimizers.  While
+ this option is equivalent to proper use of the @code{static} keyword for programs
+ consisting of a single file, in combination with option @option{--combine} this
+ flag can be used to compile most smaller-scale C programs, since the
+ functions and variables become local to the whole combined compilation unit,
+ not to the single source file itself.
+ 
+ 
  @item -fno-cprop-registers
  @opindex fno-cprop-registers
  After register allocation and post-register allocation instruction splitting,


