This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Removing space waste for g++ static inlined objects




Greetings boys and girls,

Recently, I was asked by some folks I know to work on a problem in the
g++ compiler relating to the space allocated to static-storage objects
declared within extern-linkage inlined functions.  Here's a trivial
example of what I'm talking about:

	inline int foobar (int arg)
	{
	    static int big[100000];
	    register int temp;

	    temp = big[0];
	    big[0] = arg;
	    return temp;
	}

This is kind of an interesting case.  According to what I have been told,
the C++ standard requires that even if the function shown above is placed
in an include file, and even if it is included into (say) 100 different
compilation units, there should still end up being one, and only one,
instance of the `big' object in the final executable program.  (I'm sure
that someone will correct me if that is not the case.)
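
To make the expected behavior concrete, here is a minimal single-file
sketch.  It assumes the "one instance" reading above is right.  The two
`call_from_unit_*' helpers and `check_shared_state' are hypothetical
stand-ins for callers that would normally sit in separate compilation
units, and the array is shrunk and `register' dropped so that the sketch
builds with current compilers.

```cpp
#include <cassert>

// Same shape as the foobar example, with a smaller array.  In a real
// project this definition would live in a header included by many units.
inline int foobar(int arg)
{
    static int big[1000];   // intended: one object for the whole program

    int temp = big[0];      // value left behind by the previous call
    big[0] = arg;
    return temp;
}

// Hypothetical stand-ins for callers located in different .cc files.
inline int call_from_unit_a(int v) { return foobar(v); }
inline int call_from_unit_b(int v) { return foobar(v); }

// If there is only one `big', a value stored through one caller must be
// the value handed back to the next caller, whoever that is.
inline void check_shared_state()
{
    call_from_unit_a(41);
    assert(call_from_unit_b(42) == 41);
}
```

If each compilation unit ended up with its own private copy of `big',
a check like this, split across real object files, would fail.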

Anyway, what happens in this case... at least for g++ 2.95.2 (but I suspect
also for the latest pre-3.0 snapshot), and at least on Linux... is that the
_code_ for the `foobar' function gets placed into a special magical section
called:

	.gnu.linkonce.t.foobar__Fi

Each compiled instance of the function's code gets placed into a section
having this exact name.  At link time, the GNU linker sees all these sections,
sees that their names all start with the magical .gnu.linkonce prefix, and
then, because of that, it throws away all but one from this group of
identically-named .gnu.linkonce.t.foobar__Fi sections.
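
The discarding step can be modelled roughly as follows.  This is only a
toy sketch of the behavior described above, not how the GNU linker is
actually implemented; the `Section' struct and the function names are
made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of an input section: just a name and a size.
struct Section {
    std::string name;
    std::size_t size;
};

// True if `name' starts with the .gnu.linkonce prefix.
bool is_linkonce(const std::string& name)
{
    static const std::string prefix = ".gnu.linkonce.";
    return name.compare(0, prefix.size(), prefix) == 0;
}

// Keep every ordinary section, but only the first of each group of
// identically-named .gnu.linkonce sections.
std::vector<Section> link_sections(const std::vector<Section>& inputs)
{
    std::set<std::string> seen;
    std::vector<Section> kept;
    for (const Section& s : inputs) {
        if (is_linkonce(s.name) && !seen.insert(s.name).second)
            continue;           // duplicate linkonce section: discard it
        kept.push_back(s);
    }
    return kept;
}

// Two object files that each carry a copy of foobar's code.
inline void check_linkonce_model()
{
    std::vector<Section> in = {
        {".gnu.linkonce.t.foobar__Fi", 64},
        {".gnu.linkonce.t.foobar__Fi", 64},
    };
    assert(link_sections(in).size() == 1);
}
```

Sections whose names lack the prefix pass through untouched in this model,
however many of them share a name.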

That's swell.  The compiler is clearly using the .gnu.linkonce magic in the
GNU linker to effectively rid us of the redundant duplicate code for all
these different generated copies of the `foobar' function.  But sadly,
none of this does a darn thing to eliminate the redundant/duplicative space
allocation for all those different generated instances of the `big' object...
a static storage object nested within the extern-linkage inline function
`foobar'.

As you can easily see, in a case like this where the object has a substantial
size (400KB in this example) and where it is being replicated many many times
(due to the location of the containing inline function in some .h file), the
space wasted by failing to ``commonize'' the nested static-storage object
can be REALLY substantial.
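
For a rough sense of scale, the arithmetic can be sketched as below.  The
helper name is hypothetical, and the 400KB figure assumes a 4-byte int;
the 100 compilation units are the same illustrative count used earlier.

```cpp
#include <cassert>
#include <cstddef>

// Bytes taken up by redundant copies when an object of `object_bytes' is
// emitted once per compilation unit and never merged at link time.
constexpr std::size_t wasted_bytes(std::size_t object_bytes,
                                   std::size_t units)
{
    return units == 0 ? 0 : object_bytes * (units - 1);
}

// Size of the `big' array from the foobar example on this platform.
constexpr std::size_t kBigBytes = 100000 * sizeof(int);

// Assumption: with a 4-byte int this is the 400KB quoted in the text.
static_assert(sizeof(int) != 4 || kBigBytes == 400000,
              "100000 four-byte ints occupy 400000 bytes");

// 100 copies of a 400000-byte object leave 99 redundant copies.
inline void check_example_figures()
{
    assert(wasted_bytes(400000, 100) == 39600000);
}
```

So under those assumptions, roughly 39.6 million bytes of the final
executable's static storage would be dead weight.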

I hacked around on the compiler for a while and came up with the patches shown
below to fix this problem.  I'm quite sure that these are sub-optimal, but
I'm hoping that they may be useful as a starting point for further discussion.

Part of the problem, of course, is that I'm a blind man feeling my way down
a dark corridor towards a solution.  The code in the relevant g++ compiler
source files is _not_ terribly well documented.

I have two questions, in particular, for anyone who might care to enlighten
me.

First, what is the reason for the statement:

	TREE_PUBLIC (decl) = 1;

within the `make_decl_one_only' function within the varasm.c file?  I see
no good reason for this, and I have found it to be counter-productive.  (For
what I am doing, I want to ``commonize'' certain objects across compilation
unit boundaries, but I *do not* want to give the objects themselves global
scope!)

Second, what is the intended function of the following code segment of the
cp_finish_decl function within cp/decl.c?   The comments seem to say that
whoever last hacked on this could not get it to do the Right Thing.  But
perhaps now _I_ have done so!  (Well, I can hope, anyway.)

----------------------------------------------------------------------------
...
      /* Static data in a function with comdat linkage also has comdat
         linkage.  */
      if (TREE_CODE (decl) == VAR_DECL
          && TREE_STATIC (decl)
          /* Don't mess with __FUNCTION__.  */
          && ! TREE_ASM_WRITTEN (decl)
          && current_function_decl
          && DECL_CONTEXT (decl) == current_function_decl
          && (DECL_THIS_INLINE (current_function_decl)
              || DECL_TEMPLATE_INSTANTIATION (current_function_decl))
          && TREE_PUBLIC (current_function_decl))
        {
          /* Rather than try to get this right with inlining, we suppress
             inlining of such functions.  */
          current_function_cannot_inline
            = "function with static variable cannot be inline";

          /* If flag_weak, we don't need to mess with this, as we can just
             make the function weak, and let it refer to its unique local
             copy.  This works because we don't allow the function to be
             inlined.  */
          if (! flag_weak)
            {
              if (DECL_INTERFACE_KNOWN (current_function_decl))
                {
                  TREE_PUBLIC (decl) = 1;
                  DECL_EXTERNAL (decl) = DECL_EXTERNAL (current_function_decl);
                }
              else if (DECL_INITIAL (decl) == NULL_TREE
                       || DECL_INITIAL (decl) == error_mark_node)
                {
                  TREE_PUBLIC (decl) = 1;
                  DECL_COMMON (decl) = 1;
                }
              /* else we lose. We can only do this if we can use common,
                 which we can't if it has been initialized.  */

              if (TREE_PUBLIC (decl))
                DECL_ASSEMBLER_NAME (decl)
                  = build_static_name (current_function_decl, DECL_NAME (decl));
              else if (! DECL_ARTIFICIAL (decl))
                {
                  cp_warning_at ("sorry: semantics of inline function static data `%#D' are wrong (you'll wind up with multiple copies)", decl);
                  cp_warning_at ("  you can work around this by removing the initializer", decl);
                }
            }
        }
...
----------------------------------------------------------------------------

(Note that my patches below use `#if 0' to disable all the code within the
body of the outermost `if' statement shown above, and put a single call to
comdat_linkage in its place.)


Here are the patches I came up with.  These are relative to the gcc-2.95.2
sources, but from what I can see they should also apply to the latest
pre-3.0 snapshot with no problem.

These patches basically cause each static-storage object declared within
an extern-linkage inline function to be dropped into its own private
.gnu.linkonce.xxx section.  (The exact section name in each case is
based upon both the name of the object itself, and the ``name'' of the
containing context.  This is necessary to minimize potential name-related
conflicts/problems, since these section names, in effect, end up having
global scope/significance.)  The GNU linker then throws away all but one
copy of each group of identically-named .gnu.linkonce sections at link-time,
thus yielding the desired space savings.
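
The reason for folding the containing context into the name can be sketched
like so.  Everything here is hypothetical: the key format (context, a dot,
then the object name) is made up for the sketch and is not the real
encoding produced by build_static_name, and `barfoo__Fi' is just an
imagined second inline function that also declares a static `big'.

```cpp
#include <cassert>
#include <string>

// Hypothetical key built from the object name alone.
std::string key_from_object(const std::string& object)
{
    return object;
}

// Hypothetical key built from the containing context plus the object
// name.  The "." separator is invented for this sketch.
std::string key_from_context_and_object(const std::string& context,
                                        const std::string& object)
{
    return context + "." + object;
}

// Two unrelated inline functions, each with its own static `big'.
inline void check_key_collisions()
{
    // Object name alone: the two unrelated objects collide, so the linker
    // would wrongly treat them as copies of the same thing.
    assert(key_from_object("big") == key_from_object("big"));

    // Context plus object name: the unrelated objects stay distinct...
    assert(key_from_context_and_object("foobar__Fi", "big")
           != key_from_context_and_object("barfoo__Fi", "big"));

    // ...while copies of the same object still share one key.
    assert(key_from_context_and_object("foobar__Fi", "big")
           == key_from_context_and_object("foobar__Fi", "big"));
}
```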


Comments appreciated, but PLEASE don't just say these patches are no good.
If you think there is a better way, then please SHOW ME what you have in
mind with some patches of your own.  Thanks.


Regards,
rfg




diff -rc2 2.95.2/gcc/cp/decl.c 2.95.2/gcc/cp/decl.c
*** 2.95.2/gcc/cp/decl.c	Sun Aug  8 17:28:33 1999
--- 2.95.2/gcc/cp/decl.c	Tue Feb  6 11:16:55 2001
***************
*** 8104,8107 ****
--- 8104,8108 ----
  	  && TREE_PUBLIC (current_function_decl))
  	{
+ #if 0
  	  /* Rather than try to get this right with inlining, we suppress
  	     inlining of such functions.  */
***************
*** 8138,8141 ****
--- 8139,8145 ----
  		}
  	    }
+ #else
+ 	  comdat_linkage (decl);
+ #endif
  	}
  
***************
*** 9069,9072 ****
--- 9073,9082 ----
        if (declarator && context && current_lang_name != lang_name_c)
  	DECL_ASSEMBLER_NAME (decl) = build_static_name (context, declarator);
+       else if (declarator
+ 	       && current_function_decl
+ 	       && ! RIDBIT_SETP (RID_EXTERN, specbits)
+ 	       && current_lang_name != lang_name_c)
+ 	DECL_ASSEMBLER_NAME (decl) =
+ 	  build_static_name (current_function_decl, declarator);
      }
  
diff -rc2 2.95.2/gcc/varasm.c 2.95.2/gcc/varasm.c
*** 2.95.2/gcc/varasm.c	Wed Jun  9 05:13:49 1999
--- 2.95.2/gcc/varasm.c	Wed Jan 31 15:37:41 2001
***************
*** 4471,4475 ****
--- 4471,4477 ----
      abort ();
  
+ #if 0
    TREE_PUBLIC (decl) = 1;
+ #endif
  
    if (TREE_CODE (decl) == VAR_DECL
