This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[RFC] Enable autoinlining for size at -O1+


Hi,
the attached patch enables limited function auto-inlining at -O1.  This is
another change from the policy that we auto-inline only at -O3 (we have already
inlined functions called once since GCC 3.4), so I would like to ask for opinions.

Inlining is done only when the function is small enough that the call overhead
is expected to be larger than the function body; this is supposed to help C++
code especially.  On SPEC I've measured a reduction in size of 80KB; the GCC
benchmark increases in size by 10KB and perlbmk speeds up (1384->1506).
Otherwise the patch seems neutral.

The feature is controlled by -finline-small-functions, and I've also made the
inliner ignore DECL_INLINE completely.  I would like to remove this bit
entirely and get the frontends out of the business of deciding what should be
inlined.  There are a few problems on this road.  -Wreturn-type does more
checking on inline functions than on others, on the assumption that they might
be removed.  Since static functions can be removed too, this does not really
solve the problem well, but the testsuite relies on it.  Also, the C++ frontend
uses DECL_INLINE to decide on instantiation of templates (templates that will
be instantiated in another unit are instantiated only when they are inline, to
assist inlining).  So this would need a bit more cleanup, which I am just
looking into now.

The patch also has to fix one uninitialized-variable warning in libcpp and work
around a bogus warning caused by PR29478.  I would like to deal with this
incrementally.  (I don't know yet how to handle PR29478, which is a pretty ugly
frontend trees/gimple trees issue, but blocking the change because of it seems
bad: enabling the inlining early will get the inliner better tested for the
release.)

The patch was bootstrapped and regtested on x86_64-linux.

Honza

	* invoke.texi (-finline-small-functions): Document.
	* ipa-inline.c (cgraph_default_inline_p): Do not use DECL_INLINE
	when deciding what is inlinable.
	(cgraph_decide_recursive_inlining): Handle flag_inline_functions.
	(cgraph_decide_inlining_of_small_function): Handle new flags.
	(cgraph_decide_inlining_incrementally): Likewise.
	* opts.c (decode_options): Enable flag_inline_small_functions at -O1.
	* common.opt (finline-small-functions): New.
	* Makefile.in (build/gengtype.o-warn): Work around PR29478.

	* traditional.c (_cpp_scan_out_logical_line): Silence uninitialized
	warning.

Index: gcc/doc/invoke.texi
===================================================================
*** gcc/doc/invoke.texi	(revision 124460)
--- gcc/doc/invoke.texi	(working copy)
*************** Objective-C and Objective-C++ Dialects}.
*** 323,329 ****
  -fgcse  -fgcse-lm  -fgcse-sm  -fgcse-las  -fgcse-after-reload @gol
  -fcrossjumping  -fif-conversion  -fif-conversion2 @gol
  -finline-functions  -finline-functions-called-once @gol
! -finline-limit=@var{n}  -fkeep-inline-functions @gol
  -fkeep-static-consts  -fmerge-constants  -fmerge-all-constants @gol
  -fmodulo-sched -fno-branch-count-reg @gol
  -fno-default-inline  -fno-defer-pop -fmove-loop-invariants @gol
--- 323,329 ----
  -fgcse  -fgcse-lm  -fgcse-sm  -fgcse-las  -fgcse-after-reload @gol
  -fcrossjumping  -fif-conversion  -fif-conversion2 @gol
  -finline-functions  -finline-functions-called-once @gol
! -finline-small-functions -finline-limit=@var{n}  -fkeep-inline-functions @gol
  -fkeep-static-consts  -fmerge-constants  -fmerge-all-constants @gol
  -fmodulo-sched -fno-branch-count-reg @gol
  -fno-default-inline  -fno-defer-pop -fmove-loop-invariants @gol
*************** compilation time.
*** 4909,4914 ****
--- 4909,4915 ----
  -ftree-fre @gol
  -ftree-ch @gol
  -funit-at-a-time @gol
+ -finline-small-functions @gol
  -fmerge-constants}
  
  @option{-O} also turns on @option{-fomit-frame-pointer} on machines
*************** Don't pay attention to the @code{inline}
*** 5047,5052 ****
--- 5048,5062 ----
  is used to keep the compiler from expanding any functions inline.
  Note that if you are not optimizing, no functions can be expanded inline.
  
+ @item -finline-small-functions
+ @opindex finline-small-functions
+ Integrate functions into their callers when their body is smaller than expected
+ function call code (so overall size of program gets smaller).  The compiler
+ heuristically decides which functions are simple enough to be worth integrating
+ in this way.
+ 
+ Enabled at level @option{-O1}.
+ 
  @item -finline-functions
  @opindex finline-functions
  Integrate all simple functions into their callers.  The compiler
Index: gcc/ipa-inline.c
===================================================================
*** gcc/ipa-inline.c	(revision 124460)
--- gcc/ipa-inline.c	(working copy)
*************** cgraph_default_inline_p (struct cgraph_n
*** 406,415 ****
  
    if (n->inline_decl)
      decl = n->inline_decl;
!   if (!DECL_INLINE (decl))
      {
        if (reason)
! 	*reason = N_("function not inlinable");
        return false;
      }
  
--- 406,415 ----
  
    if (n->inline_decl)
      decl = n->inline_decl;
!   if (!flag_inline_small_functions && !DECL_DECLARED_INLINE_P (decl))
      {
        if (reason)
! 	*reason = N_("function not inline candidate");
        return false;
      }
  
*************** cgraph_decide_recursive_inlining (struct
*** 668,674 ****
    int depth = 0;
    int n = 0;
  
!   if (optimize_size)
      return false;
  
    if (DECL_DECLARED_INLINE_P (node->decl))
--- 668,675 ----
    int depth = 0;
    int n = 0;
  
!   if (optimize_size
!       || (!flag_inline_functions && !DECL_DECLARED_INLINE_P (node->decl)))
      return false;
  
    if (DECL_DECLARED_INLINE_P (node->decl))
*************** cgraph_decide_inlining_of_small_function
*** 916,922 ****
  	    }
  	}
  
!       if ((!cgraph_maybe_hot_edge_p (edge) || optimize_size) && growth > 0)
  	{
            if (!cgraph_recursive_inlining_p (edge->caller, edge->callee,
  				            &edge->inline_failed))
--- 917,927 ----
  	    }
  	}
  
!       if ((!cgraph_maybe_hot_edge_p (edge)
! 	  || (!flag_inline_functions
! 	      || DECL_DECLARED_INLINE_P (edge->callee->decl))
! 	  || optimize_size)
! 	  && growth > 0)
  	{
            if (!cgraph_recursive_inlining_p (edge->caller, edge->callee,
  				            &edge->inline_failed))
*************** cgraph_decide_inlining_incrementally (st
*** 1359,1365 ****
  	/* When the function body would grow and inlining the function won't
  	   eliminate the need for offline copy of the function, don't inline.
  	 */
! 	if (mode == INLINE_SIZE
  	    && (cgraph_estimate_size_after_inlining (1, e->caller, e->callee)
  		> e->caller->global.insns)
  	    && cgraph_estimate_growth (e->callee) > 0)
--- 1364,1372 ----
  	/* When the function body would grow and inlining the function won't
  	   eliminate the need for offline copy of the function, don't inline.
  	 */
! 	if ((mode == INLINE_SIZE
! 	     || (!flag_inline_functions
! 		 && !DECL_DECLARED_INLINE_P (e->callee->decl)))
  	    && (cgraph_estimate_size_after_inlining (1, e->caller, e->callee)
  		> e->caller->global.insns)
  	    && cgraph_estimate_growth (e->callee) > 0)
Index: gcc/opts.c
===================================================================
*** gcc/opts.c	(revision 124460)
--- gcc/opts.c	(working copy)
*************** decode_options (unsigned int argc, const
*** 697,702 ****
--- 697,703 ----
  #ifdef CAN_DEBUG_WITHOUT_FP
        flag_omit_frame_pointer = 1;
  #endif
+       flag_inline_small_functions = 1;
        flag_guess_branch_prob = 1;
        flag_cprop_registers = 1;
        flag_if_conversion = 1;
Index: gcc/common.opt
===================================================================
*** gcc/common.opt	(revision 124460)
--- gcc/common.opt	(working copy)
*************** finline
*** 545,550 ****
--- 545,554 ----
  Common Report Var(flag_no_inline,0) Init(2)
  Pay attention to the \"inline\" keyword
  
+ finline-small-functions
+ Common Report Var(flag_inline_small_functions) Optimization
+ Integrate simple functions into their callers when code size is known to not grow
+ 
  finline-functions
  Common Report Var(flag_inline_functions) Optimization
  Integrate simple functions into their callers
Index: gcc/Makefile.in
===================================================================
*** gcc/Makefile.in	(revision 124460)
--- gcc/Makefile.in	(working copy)
*************** SYSCALLS.c.X-warn = -Wno-strict-prototyp
*** 198,203 ****
--- 198,205 ----
  # recognizing that the loop will always be executed at least once.  We need
  # a new loop optimizer.
  reload1.o-warn = -Wno-error
+ # Work around warning caused by PR29478
+ build/gengtype.o-warn = -Wno-error
  
  # All warnings have to be shut off in stage1 if the compiler used then
  # isn't gcc; configure determines that.  WARN_CFLAGS will be either
Index: libcpp/traditional.c
===================================================================
*** libcpp/traditional.c	(revision 124460)
--- libcpp/traditional.c	(working copy)
*************** _cpp_scan_out_logical_line (cpp_reader *
*** 346,352 ****
    cpp_context *context;
    const uchar *cur;
    uchar *out;
!   struct fun_macro fmacro;
    unsigned int c, paren_depth = 0, quote;
    enum ls lex_state = ls_none;
    bool header_ok;
--- 346,352 ----
    cpp_context *context;
    const uchar *cur;
    uchar *out;
!   struct fun_macro fmacro = {NULL, NULL, NULL, 0, 0, 0};
    unsigned int c, paren_depth = 0, quote;
    enum ls lex_state = ls_none;
    bool header_ok;

