This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
-fobey-inline (was Re: gcc and inlining)
- From: Richard Guenther <rguenth at tat dot physik dot uni-tuebingen dot de>
- To: Stuart Hastings <stuart at apple dot com>
- Cc: Matt Austern <austern at apple dot com>, Ron Price <ronp at apple dot com>, Mark Mitchell <mark at codesourcery dot com>, <gcc at gcc dot gnu dot org>
- Date: Wed, 12 Mar 2003 22:07:46 +0100 (CET)
- Subject: -fobey-inline (was Re: gcc and inlining)
Hi!
I finally got the patch to work for C++ (see attached patch - maybe
completely bogus, though...). And I have some numbers for you:
Without -fobey-inline I get (the time per iteration of the benchmark is shown):
Benchmark size 32768:
ET: 4.07126e-08
Stencil: 3.08246e-08
ScalarCode (int): 6.23149e-08
ScalarCode (Loc): 2.04469e-07
Benchmark size 327680:
ET: 4.3396e-08
Stencil: 4.61121e-08
ScalarCode (int): 6.61898e-08
ScalarCode (Loc): 2.02973e-07
Benchmark size 3276800:
ET: 4.49557e-08
Stencil: 4.72165e-08
ScalarCode (int): 6.81046e-08
ScalarCode (Loc): 2.11086e-07
Note that for ScalarCode (Loc), which uses iterator-style indexing, we are
roughly three to seven times slower than the rest.
With -fobey-inline the numbers change to:
Benchmark size 32768:
ET: 4.19604e-08
Stencil: 3.06718e-08
ScalarCode (int): 3.93084e-08
ScalarCode (Loc): 7.59901e-08
Benchmark size 327680:
ET: 4.29627e-08
Stencil: 4.6457e-08
ScalarCode (int): 4.75829e-08
ScalarCode (Loc): 8.20892e-08
Benchmark size 3276800:
ET: 4.60333e-08
Stencil: 4.79431e-08
ScalarCode (int): 4.73465e-08
ScalarCode (Loc): 8.29285e-08
This is now much more reasonable - the optimizers can do useful work on
completely inlined code. (The remaining options were -O2 -march=athlon
-fomit-frame-pointer -funroll-loops.)
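To put the ScalarCode change in numbers, here is a minimal sketch in C that computes the speedup factor from the per-iteration times listed above. The helper name speedup and the variable names are made up for illustration and are not part of the benchmark.

```c
#include <assert.h>

/* Hypothetical helper: ratio of the per-iteration time without
   -fobey-inline to the time with it.  Values above 1.0 mean the
   flag made the benchmark faster.  */
static double
speedup (double time_without, double time_with)
{
  assert (time_with > 0.0);
  return time_without / time_with;
}

/* Timings for benchmark size 32768, copied from the lists above.  */
static const double loc_without = 2.04469e-07;  /* ScalarCode (Loc) */
static const double loc_with    = 7.59901e-08;
static const double int_without = 6.23149e-08;  /* ScalarCode (int) */
static const double int_with    = 3.93084e-08;
```

With these inputs speedup (loc_without, loc_with) comes out at roughly 2.7 and speedup (int_without, int_with) at roughly 1.6, while ET and Stencil stay within a few percent of their old values.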
The patch I used is appended below.
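In rough terms, the C front end part of the change boils down to the predicate sketched here. This is a toy model only: struct fn_model and disregard_limits_model are invented names, and I am assuming that a nonzero result means the usual size limits are ignored for the function, which is how the flag is documented in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the few properties of a function declaration
   that the predicate looks at.  GCC itself uses tree nodes and the
   macros shown in the patch; this struct is purely illustrative.  */
struct fn_model
{
  bool has_always_inline;  /* carries the always_inline attribute */
  bool declared_inline;    /* models DECL_DECLARED_INLINE_P */
  bool is_external;        /* models DECL_EXTERNAL */
};

/* Models the patched return statement in gcc/c-objc-common.c.
   Assumption: nonzero means size limits are disregarded.  */
static int
disregard_limits_model (const struct fn_model *fn, int flag_obey_inline)
{
  if (fn->has_always_inline)
    return 1;
  return fn->declared_inline && (fn->is_external || flag_obey_inline);
}
```

A function declared inline but not extern gives 0 without the flag and 1 with it; a function that is not declared inline gives 0 either way. Note that the default hook in gcc/langhooks.c simply returns flag_obey_inline without looking at the declaration, so the sketch covers only the C variant.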
I really would like to have something like this in 3.3!
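For individual functions, the always_inline attribute, which the predicates in the patch already check, gives a similar effect. A minimal sketch, with function names made up for illustration:

```c
#include <assert.h>

/* GCC-specific: always_inline asks the compiler to inline this
   function independent of the usual inlining limits.  The function
   names here are hypothetical.  */
static inline __attribute__ ((always_inline)) int
index_2d (int row, int col, int stride)
{
  return row * stride + col;
}

/* Sums one row of a row-major array through the helper above.  */
static int
sum_row (const int *data, int row, int ncols)
{
  int total = 0;
  for (int col = 0; col < ncols; col++)
    total += data[index_2d (row, col, ncols)];
  return total;
}
```

The attribute has to be repeated on every function, which is what a global switch would avoid for code that already uses inline throughout.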
Thanks, Richard.
Index: gcc/c-objc-common.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/c-objc-common.c,v
retrieving revision 1.18
diff -u -u -r1.18 c-objc-common.c
--- gcc/c-objc-common.c 25 Oct 2002 17:26:51 -0000 1.18
+++ gcc/c-objc-common.c 12 Mar 2003 21:04:52 -0000
@@ -64,7 +64,7 @@
if (lookup_attribute ("always_inline", DECL_ATTRIBUTES (fn)) != NULL)
return 1;
- return DECL_DECLARED_INLINE_P (fn) && DECL_EXTERNAL (fn);
+ return DECL_DECLARED_INLINE_P (fn) && (DECL_EXTERNAL (fn) || flag_obey_inline);
}
static tree
Index: gcc/flags.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/flags.h,v
retrieving revision 1.93
diff -u -u -r1.93 flags.h
--- gcc/flags.h 20 Oct 2002 19:18:29 -0000 1.93
+++ gcc/flags.h 12 Mar 2003 21:04:52 -0000
@@ -384,6 +384,11 @@
extern int flag_rerun_loop_opt;
+/* Nonzero for -fobey-inline. If true, the 'inline' keyword must be obeyed,
+ regardless of codesize. */
+
+extern int flag_obey_inline;
+
/* Nonzero means make functions that look like good inline candidates
go inline. */
Index: gcc/langhooks.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/langhooks.c,v
retrieving revision 1.34.4.1
diff -u -u -r1.34.4.1 langhooks.c
--- gcc/langhooks.c 19 Feb 2003 05:39:28 -0000 1.34.4.1
+++ gcc/langhooks.c 12 Mar 2003 21:04:52 -0000
@@ -300,7 +300,7 @@
if (lookup_attribute ("always_inline", DECL_ATTRIBUTES (fn)) != NULL)
return 1;
- return 0;
+ return flag_obey_inline;
}
/* lang_hooks.tree_inlining.add_pending_fn_decls is called before
Index: gcc/toplev.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/toplev.c,v
retrieving revision 1.690.2.12
diff -u -u -r1.690.2.12 toplev.c
--- gcc/toplev.c 10 Mar 2003 12:27:13 -0000 1.690.2.12
+++ gcc/toplev.c 12 Mar 2003 21:04:53 -0000
@@ -663,6 +663,11 @@
int flag_rerun_loop_opt;
+/* Nonzero for -fobey-inline. If true, the 'inline' keyword must be obeyed,
+ regardless of codesize. */
+
+int flag_obey_inline;
+
/* Nonzero for -finline-functions: ok to inline functions that look like
good inline candidates. */
@@ -1031,6 +1036,8 @@
N_("Generate code for funcs even if they are fully inlined") },
{"inline", &flag_no_inline, 0,
N_("Pay attention to the 'inline' keyword") },
+ {"obey-inline", &flag_obey_inline, 1,
+ N_("Obey 'inline' keyword and always inline, regardless of size") },
{"keep-static-consts", &flag_keep_static_consts, 1,
N_("Emit static const variables even if they are not used") },
{"syntax-only", &flag_syntax_only, 1,
Index: gcc/cp/tree.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/cp/tree.c,v
retrieving revision 1.307.2.2
diff -u -u -r1.307.2.2 tree.c
--- gcc/cp/tree.c 7 Mar 2003 21:45:29 -0000 1.307.2.2
+++ gcc/cp/tree.c 12 Mar 2003 21:04:55 -0000
@@ -2239,7 +2239,8 @@
tree fn = *fnp;
if (flag_really_no_inline
- && lookup_attribute ("always_inline", DECL_ATTRIBUTES (fn)) == NULL)
+ && (lookup_attribute ("always_inline", DECL_ATTRIBUTES (fn)) == NULL
+ && !(flag_obey_inline && DECL_DECLARED_INLINE_P(fn))))
return 1;
/* We can inline a template instantiation only if it's fully
Index: gcc/doc/invoke.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/invoke.texi,v
retrieving revision 1.209.2.17
diff -u -u -r1.209.2.17 invoke.texi
--- gcc/doc/invoke.texi 6 Mar 2003 23:35:07 -0000 1.209.2.17
+++ gcc/doc/invoke.texi 12 Mar 2003 21:04:55 -0000
@@ -272,6 +272,7 @@
-fgcse -fgcse-lm -fgcse-sm -floop-optimize -fcrossjumping @gol
-fif-conversion -fif-conversion2 @gol
-finline-functions -finline-limit=@var{n} -fkeep-inline-functions @gol
+-fobey-inline @gol
-fkeep-static-consts -fmerge-constants -fmerge-all-constants @gol
-fmove-all-movables -fnew-ra -fno-branch-count-reg @gol
-fno-default-inline -fno-defer-pop @gol
@@ -3611,6 +3612,12 @@
is declared @code{static}, nevertheless output a separate run-time
callable version of the function. This switch does not affect
@code{extern inline} functions.
+
+@item -fobey-inline
+@opindex fobey-inline
+Make the @code{inline} keyword imperative; inline every function marked
+with the @code{inline} keyword, regardless of size. Often leads to
+code bloat.
@item -fkeep-static-consts
@opindex fkeep-static-consts