This is the mail archive of the
mailing list for the GCC project.
Re: Some (small) c++ compilation profiling data (oprofile)
Zack Weinberg wrote:-
> Ohh. I know what's going on. The C++ parser has this concept of
> "tentative diagnostics", which result from one candidate parsing of an
> expression. If it turns out that that wasn't the right parse, those
> diagnostics get thrown away. Since a mistaken parse is likely to
> result in error messages, there may be a lot of these.
> And of course work on diagnostic generation has assumed that it isn't
> performance critical, so that code is slow.
> I can think of a couple ways to address the problem, but Mark probably
> wants to make the thing work correctly first.
While we're on the subject, the current method we have for interfacing
cpplib and the C++ front end is a dog. I see two problems:
1) Tokens from macro expansion don't have the right line / col info.
This can be solved 2 ways: a) we copy tokens during macro expansion
in cpplib, and when it's finished go back and correct the line / col
info of the remaining tokens to the position of the invoking identifier.
b) The front end does this in its current copying code. I don't think
the front end should be worrying about this; it's a library interface
issue.
2) The front end has a lot of infrastructure to go back and forth
in the token stream. To do this it has copied a lot of tokens, I
think. It would be nice if it just had to maintain a couple of
pointers or offsets into an array.
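
To make the two points concrete, here is a rough sketch in C. It is not
cpplib's actual interface: struct tok, retarget_expansion and the cursor_*
helpers are all hypothetical names made up for illustration. It shows
expansion tokens being stamped with the position of the invoking
identifier (option a), and a front end that backtracks by saving and
restoring an index rather than copying tokens.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token; not cpplib's real cpp_token layout. */
struct tok {
    int type;            /* token kind, opaque for this sketch */
    unsigned line, col;  /* position reported in diagnostics */
};

/* Option a): after the expansion has been copied into TOKS, go back and
   give each of the N copied tokens the position of the identifier that
   invoked the macro. */
static void retarget_expansion(struct tok *toks, size_t n,
                               const struct tok *invoker)
{
    for (size_t i = 0; i < n; i++) {
        toks[i].line = invoker->line;
        toks[i].col = invoker->col;
    }
}

/* Front-end view of the token stream: an index into a flat array. */
struct cursor {
    const struct tok *toks;
    size_t count;
    size_t pos;
};

/* Return the next token and advance, or NULL at the end of the array. */
static const struct tok *cursor_next(struct cursor *c)
{
    return c->pos < c->count ? &c->toks[c->pos++] : NULL;
}

/* Remember the current position before a tentative parse. */
static size_t cursor_mark(const struct cursor *c)
{
    return c->pos;
}

/* Abandon a tentative parse: go back to MARK. No tokens are copied. */
static void cursor_rewind(struct cursor *c, size_t mark)
{
    assert(mark <= c->pos);
    c->pos = mark;
}
```

With something like this, a tentative parse costs one saved size_t, and
a failed parse costs one assignment.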
I think the best way to solve both of these is for cpplib to do
a) in 1), which with a change in the macro expander would simultaneously
allow us to keep the tokens in a line for 2). There would need to be
two additions to the interface:
1) A way for the front end to tell cpplib when it's finished with the
tokens up to a point so cpplib can safely re-use the memory.
2) Since cpplib will probably need to tack on more logical lines
if the front end needs more tokens, it needs a way of expanding
this buffer. If the front end uses an index into an array it
doesn't care, but if it uses token pointers it would need to be
told, and the pointers updated.
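
A minimal sketch of what those two additions might look like, again in C
and again with made-up names (struct tokbuf, tokbuf_push, tokbuf_at,
tokbuf_release are assumptions, not anything in cpplib today). Tokens are
addressed by an absolute index, so the buffer can be grown with realloc or
have its front recycled without the front end noticing; raw token pointers
would not survive either operation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical token; not cpplib's real cpp_token layout. */
struct tok { int type; unsigned line, col; };

/* Hypothetical buffer owned by the preprocessor. */
struct tokbuf {
    struct tok *data;  /* storage for tokens [base, base + len) */
    size_t base;       /* absolute index of data[0] */
    size_t len;        /* number of live tokens */
    size_t cap;        /* allocated slots */
};

/* Append a token. Returns its absolute index, or (size_t)-1 if the
   buffer could not be grown. */
static size_t tokbuf_push(struct tokbuf *b, struct tok t)
{
    if (b->len == b->cap) {
        size_t ncap = b->cap ? b->cap * 2 : 8;
        struct tok *p = realloc(b->data, ncap * sizeof *p);
        if (!p)
            return (size_t)-1;
        b->data = p;   /* any old token pointers are now stale */
        b->cap = ncap;
    }
    b->data[b->len] = t;
    return b->base + b->len++;
}

/* Look up a live token by absolute index. */
static struct tok *tokbuf_at(struct tokbuf *b, size_t idx)
{
    assert(idx >= b->base && idx < b->base + b->len);
    return &b->data[idx - b->base];
}

/* The front end is finished with every token before UPTO, so that
   memory can be re-used. Indices of the remaining tokens are unchanged. */
static void tokbuf_release(struct tokbuf *b, size_t upto)
{
    assert(upto >= b->base && upto <= b->base + b->len);
    size_t drop = upto - b->base;
    if (drop == 0)
        return;
    memmove(b->data, b->data + drop, (b->len - drop) * sizeof *b->data);
    b->base = upto;
    b->len -= drop;
}
```

The memmove on release is the simplest thing that works; a ring buffer
would avoid it but makes the indexing less obvious.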
I think this is better than what we do now, but would involve a small
cpplib performance penalty because of token copying [but the front end
needs to do that at present anyway]. I hope that could be made up by
simplifying some of the kludges in cpplib (which also make the code a
little hard to follow in places). The front end not needing to copy
tokens would be a large win there.
I also would like to introduce a CPP_WHITESPACE token when doing
CPP output. That would allow the removal of the CPP_PADDING
nastiness, and handle the CPP_COMMENT stuff for -C and -CC more
elegantly. It would also mean ISO CPP can retain the form of
whitespace in the input file (as has been often requested, and
is a major reason people still want tradcpp).
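
As a toy illustration of the whitespace idea (not cpplib code; TOK_WHITESPACE
merely stands in for the proposed CPP_WHITESPACE, and struct span,
split_spans and emit_spans are invented for this sketch): if runs of
whitespace are carried as tokens with their own spelling, writing the
tokens back out reproduces the input spacing exactly.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* TOK_WHITESPACE stands in for the proposed CPP_WHITESPACE. */
enum kind { TOK_OTHER, TOK_WHITESPACE };

/* A token here is just a kind plus its spelling in the input line. */
struct span {
    enum kind kind;
    const char *start;
    size_t len;
};

/* Split LINE into alternating whitespace / non-whitespace spans.
   Returns the number of spans written, at most MAX. */
static size_t split_spans(const char *line, struct span *out, size_t max)
{
    size_t n = 0;
    const char *p = line;

    while (*p && n < max) {
        int ws = isspace((unsigned char)*p) != 0;
        const char *q = p;
        while (*q && (isspace((unsigned char)*q) != 0) == ws)
            q++;
        out[n].kind = ws ? TOK_WHITESPACE : TOK_OTHER;
        out[n].start = p;
        out[n].len = (size_t)(q - p);
        n++;
        p = q;
    }
    return n;
}

/* Write the spans back out verbatim. DST must have room for the total
   length of all spans plus a terminating NUL. */
static void emit_spans(const struct span *s, size_t n, char *dst)
{
    assert(dst != NULL);
    for (size_t i = 0; i < n; i++) {
        memcpy(dst, s[i].start, s[i].len);
        dst += s[i].len;
    }
    *dst = '\0';
}
```

A real lexer would of course split the non-whitespace runs into proper
tokens; the point is only that nothing about the spacing has to be
guessed at output time.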
FWIW I think cpplib would benefit from lexing tokens a logical line
at a time.