This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: Speed up _cpp_clean_line and _cpp_skip_block_comment
Neil Booth <neil@daikokuya.co.uk> writes:
> I see. It looked like 3 sets of 3 columns; Mike made clear that the
> spaces were spurious.
Ah. The spaces are kcachegrind's convention for thousands separators;
it uses thinner spaces, though, so it's clearer there. (Can't do that
with fixed-width fonts.)
I did some more tests, cooking up an input file that has lots of phase
4 tokens but no macro expansion. In this case _cpp_lex_direct itself
is now the top of the flat profile, with 25% of runtime. _cpp_clean_line
is still responsible for most of the L2 cache misses, but then it would
be; I suppose we could play with __builtin_prefetch. However,
_cpp_clean_line only takes 10% or so of runtime, so the gain there is
limited. It is also interesting to observe that in
the call chain cpp_get_token -> _cpp_lex_token -> _cpp_lex_direct, the
higher layers have something like 14% overhead.
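
For illustration, here is a minimal sketch of what playing with
__builtin_prefetch in a line-scanning loop might look like. It is not
the libcpp code: scan_for_special is a hypothetical helper, and the
cache-line size and prefetch distance are assumed values that would
have to be tuned by measurement.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not the real _cpp_clean_line: return a pointer
   to the first '\n', '\r', '\\' or '?' in [s, end), or END if none.  */
const unsigned char *
scan_for_special (const unsigned char *s, const unsigned char *end)
{
  /* Assumed values; both need tuning on the target machine.  */
  enum { CACHE_LINE = 64, PREFETCH_AHEAD = 4 * CACHE_LINE };

  assert (s <= end);

  while (s < end)
    {
      size_t left = (size_t) (end - s);
      size_t n = left < CACHE_LINE ? left : CACHE_LINE;

      /* Hint that data a few cache lines ahead will be read soon.
         Second argument 0 = read access, third 3 = keep in all
         cache levels.  Only issued while the address is in range.  */
      if (left > PREFETCH_AHEAD)
        __builtin_prefetch (s + PREFETCH_AHEAD, 0, 3);

      /* Scan the current chunk one byte at a time.  */
      for (size_t i = 0; i < n; i++)
        {
          unsigned char c = s[i];
          if (c == '\n' || c == '\r' || c == '\\' || c == '?')
            return s + i;
        }
      s += n;
    }
  return end;
}
```

Whether this helps at all is a matter for the profiler: hardware
prefetchers already handle simple sequential scans well, so the hint
may make no difference or even cost a little.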
I have some tweaks in mind for _cpp_clean_line still, but no ideas for
_cpp_lex_direct and its callers. Maybe you can think of something?
Or maybe that's as tuned as it gets.
zw