This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Re: Inlining and estimate_num_insns

From: Jan Hubicka <jh at suse dot cz>
To: Steven Bosscher <stevenb at suse dot de>
Cc: Richard Guenther <richard dot guenther at gmail dot com>, Giovanni Bajo <giovannibajo at libero dot it>, Mark Mitchell <mark at codesourcery dot com>, gcc at gcc dot gnu dot org, Jan Hubicka <hubicka at ucw dot cz>
Date: Tue, 1 Mar 2005 01:33:07 +0100
Subject: Re: Inlining and estimate_num_insns
References: <Pine.LNX.4.44.0502241350470.2297-100000@alwazn.tat.physik.uni-tuebingen.de> <200502280219.00001.stevenb@suse.de> <84fc9c0005022801255fc61758@mail.gmail.com> <200502282156.44751.stevenb@suse.de>

> On Monday 28 February 2005 10:25, Richard Guenther wrote:
> > > I can only wonder why we are having this discussion just after GCC 4.0
> > > was branched, while it was obvious already two years ago that inlining
> > > heuristics were going to be a difficult item with tree-ssa.
> >
> > There were of course complaints and discussions about this, and I even
> > tried to tweak inlining parameters once.  See the audit trails of PR7863
> > and PR8704.  There were people telling me "well in branch XYZ we do so much
> > better", as always, so I was not encouraged to pursue this further.
> >
> > Anyway, I think we should try the patch on mainline and I'll plan to
> > re-submit it together with a 10% lowering of the inlining parameters
> > compared to 3.4 (this is conservative for the mean size change for C code,
> > for C++ we're still too high).  I personally cannot afford to do so much
> > testing to please everyone.
> 
> I tested your -fobey-inline patch a bit using the test case from PR8361.
> The run was still going after 3 minutes (without the flag it takes 20s)
> so I terminated it and took the following oprofile:
> 
> CPU: Hammer, speed 1394.98 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 4000
> Counted DATA_CACHE_MISSES events (Data cache misses) with a unit mask of 0x00 (No unit mask) count 1000
> samples  %        samples  %        image name               symbol name
> 4607300  78.7190  98784    79.4179  cc1plus                  cgraph_remove_edge
> 861258   14.7152  15308    12.3070  cc1plus                  cgraph_remove_node
> 60871     1.0400  999       0.8032  cc1plus                  ggc_set_mark
> 56907     0.9723  2054      1.6513  cc1plus                  cgraph_optimize
> 36513     0.6239  1132      0.9101  cc1plus                  cgraph_clone_inlined_nodes
> 29570     0.5052  843       0.6777  cc1plus                  cgraph_postorder
> 16187     0.2766  367       0.2951  cc1plus                  ggc_alloc_stat
> 7787      0.1330  97        0.0780  cc1plus                  gt_ggc_mx_cgraph_node
> 6851      0.1171  138       0.1109  cc1plus                  cgraph_edge
> 6671      0.1140  305       0.2452  cc1plus                  comptypes
> 5776      0.0987  95        0.0764  cc1plus                  gt_ggc_mx_cgraph_edge
> 5243      0.0896  93        0.0748  cc1plus                  gt_ggc_mx_lang_tree_node
> 
> Honza, it seems the cgraph code needs whipping here.

I think I can shoot down the cgraph_remove_node laziness with simple
reference counting, but for the removal of edges the only alternative I
see is going for vectors or doubly linked lists.  I would still expect
this time to be dominated by the later inlining/compilation explosion,
so I would not take it too seriously (unless proved otherwise by
cgraph_remove_edge being at the top of the overall profile ;)
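
To illustrate the doubly linked list idea, here is a minimal sketch in C.
The struct layouts and the names add_edge and remove_edge are hypothetical
simplifications for illustration, not the actual cgraph definitions.  The
point is only that once each edge carries back pointers in both the
caller's and the callee's list, unlinking it no longer requires walking
either list:

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for call graph nodes and edges;
   these are NOT the real cgraph structures.  */
struct node;

struct edge
{
  struct node *caller;
  struct node *callee;
  struct edge *prev_callee, *next_callee;  /* links in caller->callees */
  struct edge *prev_caller, *next_caller;  /* links in callee->callers */
};

struct node
{
  struct edge *callees;  /* head of the list of outgoing edges */
  struct edge *callers;  /* head of the list of incoming edges */
};

/* Link E at the head of CALLER's callee list and CALLEE's caller list.  */
static void
add_edge (struct edge *e, struct node *caller, struct node *callee)
{
  e->caller = caller;
  e->callee = callee;

  e->prev_callee = NULL;
  e->next_callee = caller->callees;
  if (caller->callees)
    caller->callees->prev_callee = e;
  caller->callees = e;

  e->prev_caller = NULL;
  e->next_caller = callee->callers;
  if (callee->callers)
    callee->callers->prev_caller = e;
  callee->callers = e;
}

/* Unlink E from both lists in constant time, without scanning them.  */
static void
remove_edge (struct edge *e)
{
  if (e->prev_callee)
    e->prev_callee->next_callee = e->next_callee;
  else
    e->caller->callees = e->next_callee;
  if (e->next_callee)
    e->next_callee->prev_callee = e->prev_callee;

  if (e->prev_caller)
    e->prev_caller->next_caller = e->next_caller;
  else
    e->callee->callers = e->next_caller;
  if (e->next_caller)
    e->next_caller->prev_caller = e->prev_caller;
}
```

The cost is two extra pointers per edge; whether that memory is worth it
depends on how much edge removal really shows up in the overall profile.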

Honza
> 
> Gr.
> Steven

Follow-Ups:
- Re: Inlining and estimate_num_insns
  - From: Steven Bosscher
