New macro expander preamble

Neil Booth NeilB@earthling.net
Thu Oct 26 23:31:00 GMT 2000


Zack,

Thanks for the feedback.

Zack Weinberg wrote:-

> I've fixed up the integrated preprocessor and will send a patch
> tomorrow (bootstrap's still running).  It exposed some bugs.  Most of
> them could be addressed by a follow-up patch.

Great.  There are other issues with integrated CPP that I don't think
you've come across yet, which is why I was leaving it for a bit.  For
example, the treatment of strings in linemarkers does not escape
characters the same way as a separate cpp / cc1 combo.

> There's a serious bug in skip_rest_of_line - it doesn't dump
> lookahead.  This shows up with the integrated preprocessor,
> pragma-[12].c:
> 
> #pragma unknown
>   /* ... */
> 
> The 'unknown' is in the lookahead queue because of do_pragma.
> cb_def_pragma in c-lex.c doesn't consume any tokens because
> -Wunknown-pragmas is off.  skip_rest_of_line ignores the lookahead
> queue, so the next time c_lex calls cpp_get_token, 'unknown' pops up
> as if it had been part of the running text.  I've got this hacked
> around in my tree by setting pfile->skipping and then calling
> _cpp_get_token instead of _cpp_lex_token, in skip_rest_of_line, but
> that may not be the right fix.

Good, I'll look into this.  I think we can just call cpp_get_token
instead of _cpp_lex_token, and put a prevent_expansion wrapper around
it.

> With the integrated preprocessor, c-torture/execute/920730-1t.c breaks
> - -fpreprocessed is ignoring # 123 markers.  The simplest fix is to
> add IN_I to the dtable entry for #line.

OK, I'll look at this too.

> gcc.dg/cpp/20000720-1.S breaks with or without integrated cpp:
> $ ./xgcc -B./ -E 20000720-1.S 
> # 1 "20000720-1.S"
> 
> 
> 
>          nop call b
> 
> 'nop' and 'call b' need to be on separate lines.  The bug appears to
> be that lookahead doesn't remember that we've advanced to a new line.

Yes, I had this last night.  I still don't understand why it didn't
fail for me originally.  The correct fix is to make output_line part
of struct cpp_lexer_pos.  That makes it a bit less light-weight than
I'd hoped, but we need to get it right.  As a kludge, I made the first
call to cpp_get_output_line in cppmain.c a call to cpp_get_line
instead.

> There's still some debugging printfs (#if'd out) in the memory pool
> code.  Either strip them or change them to #ifdef POOL_DEBUG or
> something like that.

Yes, they can go I think.

> This sort of thing should get a trigraph warning:
> 
> /* blah *??/
> /

It doesn't with -Wtrigraphs?  Or do you mean regardless of
-Wtrigraphs?

> I'm not sure this is enough to get a pointer "suitably aligned so that
> it may be assigned to a pointer to any type of object and then used
> to access such an object or an array of such objects in the space
> allocated."  Might this be what's wrong with the sparc?

Maybe, but I think it's OK.  I think it was caused by a pool
allocation in new_chunk.  I'll send Andreas an update and see if he
has time to do a re-run.

> In the directive table, you took out all the frequency counts and
> information about which directives are extensions.  That may not be
> important anymore, but it was still valuable information.  I'd like
> them put back.

OK.

> Really old K+R compilers don't know #elif.  GCC had to be adjusted not
> to use it, a couple months back -
> 
> Thu Aug  3 10:05:53 2000  Akiko Matsushita <matusita@sra.co.jp>
> 
>         * gengenrtl.c, rtl.c: Avoid #elif.
> 
> Do we want to cause -Wtraditional to warn about it?  Do we want to
> continue to support it in -traditional mode?

Maybe, but let's leave this to a tidy-up patch.

> >  We currently do not support the _Pragma operator.  Support for that
> >  has to be coordinated with the front end.  Proposed implementation:
> >  both #pragma blah blah and _Pragma("blah blah") become
> >  __builtin_pragma(blah blah) and we teach the parser about that.
> 
> Better idea: when we see _Pragma("blah blah"), we run the string
> through a destringizer, then call run_directive(T_PRAGMA) on it.

Yes, that would be fine, prepending #pragma.  I want to fix pragmas
first, though - they're still a bit nasty, and don't support the exact
semantics of the standard with respect to macro expansion.
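
To make the destringizing step concrete, here is a rough sketch of what
it has to do.  The function name and interface are hypothetical, not
anything in cpplib; the rules are the ones C99 gives for _Pragma: drop
an L prefix, drop the enclosing quotes, turn \" into " and \\ into \.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not a cpplib function.  LIT is the spelling of
   a string literal as it appears inside _Pragma ( ... ), including
   its quotes.  Returns a malloc'd copy of the destringized text, or
   NULL if LIT is not a quoted string or memory runs out.  */
char *
destringize (const char *lit)
{
  size_t len;
  char *out, *p;
  const char *s, *end;

  assert (lit != NULL);

  if (*lit == 'L')              /* Wide-string prefix is dropped.  */
    lit++;
  len = strlen (lit);
  if (len < 2 || lit[0] != '"' || lit[len - 1] != '"')
    return NULL;

  out = malloc (len - 1);       /* At most len - 2 chars, plus NUL.  */
  if (out == NULL)
    return NULL;

  p = out;
  end = lit + len - 1;          /* Points at the closing quote.  */
  for (s = lit + 1; s < end; s++)
    {
      /* Only \" and \\ are unescaped; anything else is copied.  */
      if (s[0] == '\\' && s + 1 < end && (s[1] == '"' || s[1] == '\\'))
        s++;
      *p++ = *s;
    }
  *p = '\0';
  return out;
}
```

The result would then have "#pragma " prepended before being handed to
the directive code; how that hand-off looks is not shown here.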

> Is the temp string pool ever flushed except at end of execution?

It's a circular list of buffers, so they just get re-used, provided
they're not locked.  If a buffer is locked, a new buffer is just
allocated and inserted in the list.  They're only freed on exit.

The pools are similar to obstacks, apart from this optional
circularity (circularity == temporary allocations; the identifier pool
is not circular) and the locking mechanism (only necessary for
temporary allocation pools).  Also, the method of temporary allocation
before commit is a little more suited to what I needed.  The whole
thing could probably be cleaner, though.
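
As an illustration only, the circular re-use and locking described
above could look something like the sketch below.  All of the names
(struct chunk, struct temp_pool, pool_next_chunk and so on) are made up
for this example and the real pool code differs in detail; in
particular this leaves out allocation within a buffer and the
commit step.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical types; not the cpplib definitions.  */
struct chunk
{
  struct chunk *next;           /* Circular link.  */
  int locked;                   /* Nonzero while contents are in use.  */
  size_t used;                  /* Bytes handed out so far.  */
  size_t size;
  unsigned char *base;
};

struct temp_pool
{
  struct chunk *cur;            /* Buffer currently being filled.  */
  size_t chunk_size;
};

static struct chunk *
make_chunk (size_t size)
{
  struct chunk *c = malloc (sizeof *c);

  if (c == NULL)
    return NULL;
  c->base = malloc (size);
  if (c->base == NULL)
    {
      free (c);
      return NULL;
    }
  c->next = c;                  /* A ring of one.  */
  c->locked = 0;
  c->used = 0;
  c->size = size;
  return c;
}

/* Returns nonzero on success.  */
int
pool_init (struct temp_pool *pool, size_t chunk_size)
{
  pool->chunk_size = chunk_size;
  pool->cur = make_chunk (chunk_size);
  return pool->cur != NULL;
}

/* Move on to the next buffer in the ring.  An unlocked buffer is
   simply re-used; if it is locked, a fresh buffer is spliced in
   after the current one instead.  */
struct chunk *
pool_next_chunk (struct temp_pool *pool)
{
  struct chunk *next;

  assert (pool->cur != NULL);
  next = pool->cur->next;
  if (next->locked)
    {
      struct chunk *fresh = make_chunk (pool->chunk_size);

      if (fresh == NULL)
        return NULL;
      fresh->next = next;
      pool->cur->next = fresh;
      next = fresh;
    }
  next->used = 0;               /* Old temporary contents are dead.  */
  pool->cur = next;
  return next;
}

/* Buffers are only released here, at exit.  */
void
pool_free (struct temp_pool *pool)
{
  struct chunk *c = pool->cur->next;

  pool->cur->next = NULL;       /* Break the ring so the walk ends.  */
  while (c != NULL)
    {
      struct chunk *n = c->next;

      free (c->base);
      free (c);
      c = n;
    }
  pool->cur = NULL;
}
```

A non-circular pool such as the identifier pool would never wrap
around to an old buffer, so it needs neither the lock flag nor the
reset of `used`.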

I'll try to come up with a patch this evening that includes fixes for
the issues you mention above, and send it to you then.

Neil.

