This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [LTO] patch: new CALL_EXPR abstractions in builtins.c


Hi Paolo,

I'm beginning to suspect that Sandra and I have a slightly clearer
idea of what's going on in this patch.

Previously, for CALL_EXPR fold_build3_stat did:

   tem = fold_ternary (code, type, op0, op1, op2);
   if (!tem)
     tem = build3_stat (code, type, op0, op1, op2 PASS_MEM_STAT);

now it does:

       tree exp = build3_stat (code, type, op0, op1, op2 PASS_MEM_STAT);
       tem = fold_call_expr (exp, false);
       if (!tem)
         tem = exp;

Notice that we now always allocate in build3_stat, but discard the
result "exp" whenever fold_call_expr manages to simplify the call.
This is the source of the wasted/leaked memory with this change.
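
To make the difference concrete, here is a toy model of the two
schemes.  It is a sketch only and not GCC code: struct node,
build_node, fold_operands, fold_node and count_allocations are all
hypothetical stand-ins, and the "folds when op0 is zero" rule is
invented purely so that some requests simplify and others don't.

```c
#include <stdlib.h>

/* Toy stand-in for a tree node; NOT GCC's real tree representation. */
struct node { int code; int op0; };

int nodes_allocated;                    /* nodes created by build_node */
struct node constant_zero = { 0, 0 };   /* shared result of a successful fold */

/* Stand-in for build3_stat: always allocates.  The toy never frees. */
struct node *build_node(int code, int op0)
{
    struct node *n = malloc(sizeof *n);
    if (!n)
        abort();
    n->code = code;
    n->op0 = op0;
    nodes_allocated++;
    return n;
}

/* Stand-in for fold_ternary: works on the operands, allocates nothing.
   Invented rule: the expression folds only when op0 is zero. */
struct node *fold_operands(int code, int op0)
{
    (void) code;
    return op0 == 0 ? &constant_zero : NULL;
}

/* Stand-in for fold_call_expr: needs an already-built node. */
struct node *fold_node(struct node *exp)
{
    return fold_operands(exp->code, exp->op0);
}

/* Old scheme: fold first, build only if folding failed. */
struct node *fold_build_old(int code, int op0)
{
    struct node *tem = fold_operands(code, op0);
    if (!tem)
        tem = build_node(code, op0);
    return tem;
}

/* New scheme: build first, then fold; exp is abandoned on success. */
struct node *fold_build_new(int code, int op0)
{
    struct node *exp = build_node(code, op0);
    struct node *tem = fold_node(exp);
    if (!tem)
        tem = exp;
    return tem;
}

/* Count how many nodes `calls` identical requests allocate. */
int count_allocations(struct node *(*fold_build)(int, int), int op0, int calls)
{
    nodes_allocated = 0;
    for (int i = 0; i < calls; i++)
        fold_build(1, op0);
    return nodes_allocated;
}
```

In this model count_allocations (fold_build_old, 0, 10) allocates
nothing, while count_allocations (fold_build_new, 0, 10) allocates ten
nodes that nobody uses.  When the expression doesn't fold, both schemes
allocate exactly one node per request.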


Previously, in builtins.c alone there were 62 calls to the function
build_function_call_expr which is implemented as:

   call_expr = build1 (ADDR_EXPR, build_pointer_type (TREE_TYPE (fn)), fn);
   return fold_build3 (CALL_EXPR, TREE_TYPE (TREE_TYPE (fn)),
                       call_expr, arglist, NULL_TREE);

Note how, before the change above, the use of fold_build3 avoided
constructing a CALL_EXPR whenever the call could be folded.  All of
these calls were replaced in the original patch with calls to the new
build_call_expr, which now eventually calls the inefficient fold_build3.


Re your comment:
>> Note that the overhead only applies to fold_build3. Calls to
>> fold_ternary (CALL_EXPR, ...) luckily are just absent from GCC

Going back to our "sin" example from the earlier e-mail, it turns
out that fold_ternary with a CALL_EXPR was called three times more
often than fold_build3!


I'll agree with you that if the CALL_EXPR is already built, neither
the new scheme nor the old scheme wastes any memory if it can't be
simplified.  Unfortunately, this is the "rare" case; most calls to
fold are speculative, made to see whether a change will improve things.
This explains why fold_ternary is called more often than fold_buildN.

We even have examples where we'd leak memory recursively on a single
call to fold, for example, simplifying strcmp -> strncmp -> memcmp, and
similar paths where at each stage we don't physically need a CALL_EXPR.
It's much better to invoke the appropriate fold_builtin_strncmp directly
than to duplicate the memcmp optimizations in strcmp, which is what we'd
be forced to do to avoid the inefficiency if every invocation potentially
leaked memory.
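
The chained case can be sketched the same way.  Again this is a toy
and not the builtins.c code: the *_toy folders, the
build_between_stages switch and nodes_for_chain are hypothetical, and
the folders simply compute the comparison for two known strings
instead of applying GCC's real conditions.

```c
#include <string.h>

int call_nodes_built;      /* intermediate call nodes created so far */
int build_between_stages;  /* nonzero: model the build-then-fold route */

/* Stand-in for materialising a CALL_EXPR that the next stage replaces. */
void maybe_build_call_node(void)
{
    if (build_between_stages)
        call_nodes_built++;
}

/* Last stage: both strings are known, so just compute the answer. */
int fold_memcmp_toy(const char *a, const char *b, size_t n)
{
    return memcmp(a, b, n);
}

/* Middle stage: hands over to the memcmp folder. */
int fold_strncmp_toy(const char *a, const char *b, size_t n)
{
    maybe_build_call_node();   /* would be the memcmp call node */
    return fold_memcmp_toy(a, b, n);
}

/* First stage: bound the comparison, then hand over to strncmp. */
int fold_strcmp_toy(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    size_t n = (la < lb ? la : lb) + 1;   /* include the shorter string's NUL */
    maybe_build_call_node();   /* would be the strncmp call node */
    return fold_strncmp_toy(a, b, n);
}

/* Count the intermediate nodes one top-level fold creates. */
int nodes_for_chain(int build, const char *a, const char *b)
{
    build_between_stages = build;
    call_nodes_built = 0;
    (void) fold_strcmp_toy(a, b);
    return call_nodes_built;
}
```

Calling the next folder directly, nodes_for_chain (0, "abc", "abd")
creates no intermediate nodes; routing each stage through a built call,
nodes_for_chain (1, "abc", "abd") creates two, both discarded once the
final stage produces its result.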


I should hope that I understand this code pretty well as a middle-end
maintainer:  svn blame of builtins.c attributes 2232 lines to sayle, 266
lines to kazu and 82 to bonzini.  I can't work out if you're trying
to say that I don't understand what's going on, or that this patch
doesn't leak/waste more memory than mainline?  Fortunately, Sandra
and I have reached agreement on a simple restructuring that would be
unconditionally better than what we currently have.  But I can't help
but get the vibe that you don't think I understand the issues at least
as well as you do.  Didn't I make a good catch in the review/suggestion?
Don't you want to thank me (and Sandra!) for making the GCC compiler
better? Wouldn't you like to be able to catch similar problems yourself
someday?  :-)



On Wed, 28 Jun 2006, Paolo Bonzini wrote:
> Surely I do too, and Kazu's implementation of fold_buildN and
> fold_{unary,binary,ternary} was very welcome to me too.

Very many thanks again to Kazu for implementing this!
http://gcc.gnu.org/ml/gcc/2004-01/msg00560.html


Now if only I could find a volunteer to work on my ideas on
reconstructing trees from object files to allow LTO optimizations
without affecting object/library file formats, and even work with
non-GCC compiled or legacy libraries. :->

Roger
--

