This is the mail archive of the mailing list for the GCC project.

Re: [tree-ssa] Memory usage in compute_immediate_uses

On Tue, Sep 16, 2003 at 03:38:57PM -0400, Diego Novillo wrote:
> On Tue, 2003-09-16 at 14:33, Daniel Jacobowitz wrote:
> > And just a question.  Sometimes all the immediate_uses varrays are already
> > allocated.  Nothing ever removes from the immediate_uses list.  So...
> > doesn't any call to compute_immediate_uses after they've already been
> > computed put duplicates on the list?  It's not a problem now that CCP is the
> > only thing using it, but while SSA-PRE was using it that must have hurt
> > performance elsewhere.
> >
> Thanks for the analysis, Daniel.  Since you have it instrumented now,
> would you mind doing a run over an entire bootstrap + target library
> build?
> Another thing we could use is ggc_collect() after each pass through
> optimize_function_tree().

Some amusing statistics for you.  All numbers are script-generated and
the script is a little flaky, so add salt to taste.  All numbers are
only for compute_immediate_uses, but probably hold true for
reached_uses and reaching_defs, which are more interesting.  Or will be
when something uses them.

In the course of a bootstrap, 1829 files get built.  In 429 of them we
generate more than 100KiB of unused data in compute_immediate_uses.  69
of them, more than 1MiB.  Our peak is libjava/interpret.c, at 78MiB. 
That's compared to the ideal: a varray whose initial size was accurate,
so no calls to ggc_realloc.
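
To make the comparison with the ideal concrete, here is a rough model of
how much storage a growing array leaves behind.  It is only a sketch
under assumptions that are mine, not measured: the array doubles its
capacity each time it fills, and every growth step abandons the old
block rather than extending it in place.  The name abandoned_elems is
made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Model only: count the element slots left behind as garbage when an
   array that starts at INITIAL_CAP slots is doubled until it can hold
   N_ELEMS.  Assumes each growth copies into a fresh block and the old
   block is never reused.  */
static size_t abandoned_elems(size_t n_elems, size_t initial_cap)
{
  size_t cap = initial_cap;
  size_t garbage = 0;

  assert(initial_cap > 0);
  while (cap < n_elems) {
    garbage += cap;   /* the old block becomes garbage */
    cap *= 2;         /* assumed growth policy: doubling */
  }
  return garbage;
}
```

Under this model a statement with 4230 uses and an initial capacity of
ten abandons 5110 slots on the way to its final block; an array whose
initial size was accurate abandons none.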

Does this information ever live across a collection anyway?  Especially
if so, another data structure might be more efficient: something which
only supports adding and iterating.  You could use a chunked array
which allocates new chunks instead of discarding and copying.  That's
about the simplest, fastest solution.  ggc_realloc is really quite
inefficient from a garbage standpoint.
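
A minimal sketch of the kind of chunked, append-only array described
above.  Everything in it is hypothetical: the names (use_list,
use_chunk, use_iter), the chunk size of 16, and the use of plain
malloc as a stand-in for whatever allocator the compiler would
actually use.  It is meant to show the shape of the idea, not a patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_ELEMS 16   /* hypothetical chunk capacity */

/* One fixed-size block of element pointers.  */
struct use_chunk {
  struct use_chunk *next;
  size_t used;
  void *elems[CHUNK_ELEMS];
};

/* Append-only list of chunks; supports adding and iterating only.  */
struct use_list {
  struct use_chunk *head;
  struct use_chunk *tail;
  size_t length;
};

struct use_iter {
  const struct use_chunk *chunk;
  size_t idx;
};

static void use_list_init(struct use_list *l)
{
  l->head = l->tail = NULL;
  l->length = 0;
}

/* Append ELEM (must be non-NULL).  Existing chunks are never copied or
   discarded; a new chunk is linked on when the last one is full.
   Returns 0 on success, -1 if allocation fails.  */
static int use_list_add(struct use_list *l, void *elem)
{
  assert(elem != NULL);
  if (l->tail == NULL || l->tail->used == CHUNK_ELEMS) {
    struct use_chunk *c = malloc(sizeof *c);
    if (c == NULL)
      return -1;
    c->next = NULL;
    c->used = 0;
    if (l->tail != NULL)
      l->tail->next = c;
    else
      l->head = c;
    l->tail = c;
  }
  l->tail->elems[l->tail->used++] = elem;
  l->length++;
  return 0;
}

static struct use_iter use_list_begin(const struct use_list *l)
{
  struct use_iter it = { l->head, 0 };
  return it;
}

/* Return the next element in insertion order, or NULL at the end.  */
static void *use_iter_next(struct use_iter *it)
{
  while (it->chunk != NULL && it->idx == it->chunk->used) {
    it->chunk = it->chunk->next;
    it->idx = 0;
  }
  if (it->chunk == NULL)
    return NULL;
  return it->chunk->elems[it->idx++];
}

/* Number of chunks currently linked into the list.  */
static size_t use_list_chunks(const struct use_list *l)
{
  size_t n = 0;
  const struct use_chunk *c;
  for (c = l->head; c != NULL; c = c->next)
    n++;
  return n;
}

static void use_list_free(struct use_list *l)
{
  struct use_chunk *c = l->head;
  while (c != NULL) {
    struct use_chunk *next = c->next;
    free(c);
    c = next;
  }
  use_list_init(l);
}
```

The trade-off is that random access is no longer constant time, which
does not matter for a consumer that only adds and iterates.  A
statement with few uses pays for one partly empty chunk; a statement
with thousands never copies what it has already stored.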

As for size: this isn't the ideal quantity to measure, but it was the
quickest hack.  The distribution of
  max (number of uses attached to any one statement in a function)
with one sample per function:
  3223 had no statements with uses at all.
  27557 had a max of 1-4.
  14334 had a max of 5-10.
  7421 had a max of 11-100.
  236 were between 101-678.
  2 had 1274 (probably the same file in two stages).
  One greater than 2K: 4230, in

This suggests that a varray of size ten is not a good choice, but that
a varray of another size would not be a much better choice.

Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer
