This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: needless deep recursion in gt-c-decl.h


I now have some numbers!

I looked at expr.o, which needs (on the branch) a recursion depth of
24024, the third-highest (and not typical, most files are < 10000);
that corresponds to perhaps 1M of stack.

I tried both algorithms on the tree marker routine for C.  They both
had about the same effect, reducing the recursion depth to about
16000, with no significant change in CPU time.  (This is with
gcac checking, so any effect should be greatly magnified.)

The 16000 turns out to be mostly RTL marking.  So, I applied the
one-loop method to the RTL marker routine, and it reduced the maximum
stack depth to about 13000.  Better, but not really acceptable, since
this still corresponds to perhaps 600k of stack, more than the limit
on Darwin.

I then generated a coredump when the stack depth was 13000, and looked
at it in gdb.  I was very pleased to find that the gdb folks have
improved the performance (since gdb 4.x) when looking at deep stacks,
it now handles this level of stack depth with no problems at all.

I'd manually applied the one-loop method to the NEXT_INSN field of
insns, and otherwise to fld[0] if it was a RTX.  The stack consisted,
by my unscientific survey, almost entirely of RTX marking, and most of
it (more than half but not 90%) was recursing on PREV_INSN fields.

So, I looked at a two-loop method, looking at fld[0] and fld[2].  This
worked better yet, reducing the maximum stack depth to under 10000,
but that's still ~400k of stack.

Nearly all of the remainder is still recursion on PREV_INSN fields.
This happens because the DECL_RTL of a LABEL_REF will refer to a
CODE_LABEL in the middle of the insn chain, and it happens that this
is the first place the insn chain is encountered when marking.  The
label involved is "plus_expr:" in expand_expr, line 7632.
expand_expr starts at line 6183, so there's a long way to go back.

My current best idea is to rearrange the order that
globals are marked so that the insn chain gets seen before the trees,
if this is possible, but I'm not at all sure that this will work
reliably.

Plan B is to implement a stack using malloc, only on stack-constrained
systems because I expect this will be slower on Linux (it was when I
tried it).

Any other ideas?

On the bright side, my "typical" file, flow.o, is reduced from ~6000
maximum depth to ~500 maximum depth.  If I didn't have to worry about
making stack usage small for every case, the two-loop method would be
quite good enough.

-- 
- Geoffrey Keating <geoffk@geoffk.org> <geoffk@redhat.com>


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]