New macro expander preamble

Neil Booth NeilB@earthling.net
Fri Oct 27 10:03:00 GMT 2000


Zack Weinberg wrote:-

> Mmh, but that's not a critical issue... how often do we get escapes in
> a file name?

Every time we boot MS-DOG? :-)

> sorry, I don't follow this explanation.  Seems to me the thing to do
> is reintroduce some indicator - on the token - of being first on a
> line.

Well, let's look at the fundamental problem.  cpp_get_line does the
right job in all cases of returning the line and column of the token
just returned from cpp_get_token.  I can only see two cases when this
line number is different to the line we want the token to appear on in
the output:-

1) tokens on the second (or a later) line of a backslash-escaped
   newline sequence
2) tokens in a multi-line invocation of a funlike macro

Case 2 is actually handled properly already.  The tokens are sucked
into an internal buffer (syntax errors during argument parsing are
reported at the correct position), and then cpp_get_line returns the
position of the macro name for all tokens in the expansion.

Case 1 is what I was trying to fix with an output_line variable, but
it didn't work :-(.  Maybe we just output over multiple lines anyway -
does anyone care?  (OK, I bet glibc does, somewhere :-))
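
For concreteness, here is a small sketch of what the two cases look
like in source.  The macro ADD and the variable names are made up for
illustration; the comments only restate the line-number behaviour
described above.

```c
#define ADD(a, b) ((a) + (b))

/* Case 1: a backslash-escaped newline.  The token `2' sits on the
   second physical line, but it belongs to the logical line that
   starts with `static', so the line of the token and the line we
   want in the output can disagree.  */
static int case1 = 1 + \
                   2;

/* Case 2: a funlike macro invoked over several lines.  The arguments
   are collected first, and every token of the expansion is then
   attributed to the position of the macro name ADD.  */
static int case2 = ADD (1,
                        2);
```

Both declarations are ordinary C; only the physical line of some of
their tokens differs from the line they logically belong to.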

I don't like the flag idea because keeping track of it is painful.  We
need to clear it from the tokens in case 2, pass it around inside
multi-level macro expansions if it was on the macro name, and anyway
we lose it when we drop CPP_PLACEMARKER tokens (e.g. the expansion of
an empty macro), meaning in that case we'd have to keep it around and
apply it to the next token that came onstream (yet another cpp_reader
variable).  It's this kind of ugliness that caused the line problems
in the existing code, and I'd like to see if there's a better way
before returning to it.
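
To show the kind of bookkeeping I mean, here is a minimal sketch of the
placemarker part only.  Everything in it is hypothetical (the token
struct, TOK_PLACEMARKER, FLAG_BOL and drop_placemarkers are invented
names, not cpplib's), and it works on a whole array at once, whereas in
cpplib tokens come one at a time, so the pending state would have to
live in cpp_reader.

```c
#include <stddef.h>

/* Hypothetical token kinds and flag; names are illustrative only.  */
enum kind { TOK_NAME, TOK_PLACEMARKER };
#define FLAG_BOL 1u   /* Token was first on its source line.  */

struct token { enum kind kind; unsigned flags; };

/* Copy N tokens from IN to OUT, dropping placemarkers.  A FLAG_BOL on
   a dropped placemarker is remembered in PENDING and applied to the
   next token that survives.  Returns the number of tokens written.  */
static size_t
drop_placemarkers (const struct token *in, size_t n, struct token *out)
{
  unsigned pending = 0;
  size_t count = 0;

  for (size_t i = 0; i < n; i++)
    {
      if (in[i].kind == TOK_PLACEMARKER)
        {
          /* The token vanishes, but its flag must not.  */
          pending |= in[i].flags & FLAG_BOL;
          continue;
        }
      out[count] = in[i];
      out[count].flags |= pending;
      pending = 0;
      count++;
    }
  return count;
}
```

Even in this toy form the flag needs state that outlives the token it
was set on, and that is before clearing it for case 2 or threading it
through nested expansions.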

> When I do that with -Wtrigraphs, all I get is "unterminated comment".

Odd, I wonder when that crept in.  I see CVS does this too.

> What, "it is implementation defined whether macro expansion occurs
> within a #pragma directive" ?  We define it never to occur.  Would be
> the end of the story, except that #pragma pack is apparently subject
> to macro expansion in MSVC and we really ought to match that - needs a
> way for the callbacks to request expansion to occur.

I thought we expanded: I'd forgotten that we use the EXPAND flag to
control expansion, and had only noticed that do_pragma calls the
interface that performs expansion.  What I was referring to was not
expanding after STDC.  My patch expands, so that needs to be
corrected.  Maybe that's the glibc problem.

MSVC++ works by expanding everything after the #pragma.  So if you aim
to be compatible with them for #pragma pack, you really need to catch
the case that the "pack" is from a macro expansion too.  I can see
advantages in defaulting to expanding in pragmas, e.g. it allows
things like

#if GCC
#define MY_PRAGMA gcc style
#elif MSC_VER
#define MY_PRAGMA MS style
#endif

#pragma MY_PRAGMA   /* Expands to the correct form for the compiler.  */

What do you think about this?  I don't really care either way, but can
see that allowing macros adds flexibility.

> Hmm, so what guarantees that if I call cpp_token_as_string twice in
> quick succession, that the second string doesn't overwrite the first?

Well spotted :-) The first gets overwritten.  I knew about this, and
didn't (and still don't) consider it worth fixing; maybe it just needs
documenting alongside the function interface.

o cpp_token_as_text is only intended for use in diagnostics.
o You'll only hit the overwrite in diagnostics that output the spelling
  of two tokens (at the moment that means just the paste diagnostic).
o The total length of the two tokens would need to be over 4K bytes
  in the most unlucky case with the buffer sizes I start out with.
o It won't segfault, as the overwriting string is still null-terminated;
  you'll just get a badly spelt first token <g>.

I consider the cpp_spell_token and cpp_output_token interfaces the ones
to use for guaranteed correctness in spelling.
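
As an illustration of the hazard in general, here is a simplified
sketch.  It is not cpplib's code: token_as_text, struct token and
format_paste_error are hypothetical stand-ins, and unlike the real
function this one overwrites its buffer on every call rather than only
in the unlucky case.  It also shows one way a caller could stay safe,
by copying the first spelling before asking for the second.

```c
#include <stdio.h>

/* Hypothetical stand-in for a token; not cpplib's cpp_token.  */
struct token { const char *spelling; };

/* Return the token's spelling in a buffer shared by all calls, so
   each call overwrites the result of the previous one.  */
static const char *
token_as_text (const struct token *tok)
{
  static char buf[64];

  snprintf (buf, sizeof buf, "%s", tok->spelling);
  return buf;
}

/* Safe pattern: copy the first spelling into local storage before
   requesting the second, then format the message from both.  */
static void
format_paste_error (char *out, size_t outlen,
                    const struct token *lhs, const struct token *rhs)
{
  char first[64];

  snprintf (first, sizeof first, "%s", token_as_text (lhs));
  snprintf (out, outlen, "pasting \"%s\" and \"%s\"",
            first, token_as_text (rhs));
}
```

Calling token_as_text twice and keeping both pointers leaves them
pointing at the same buffer, which by then holds only the second
spelling.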

Neil.

