[Patch] libcpp: Fix _Pragma in #__VA_ARGS__ [PR103165]

Jakub Jelinek jakub@redhat.com
Thu Nov 18 11:24:13 GMT 2021

On Wed, Nov 10, 2021 at 09:30:29PM +0000, Joseph Myers wrote:
> On Wed, 10 Nov 2021, Tobias Burnus wrote:
> > Disclaimer: While this patch does a step into the right direction,
> > it probably does not help with any of the other _Pragma issues. Neither
> > with 'gcc -E' when the pragma wasn't registered (still expanded too
> > early) nor with the 'GCC diagnostic' issues in general as there the
> > input_location is used to decide when to pop - and depending on the
> > column numbers, this may or may not work.
> And fully correct stringization of _Pragma should respect the spelling of 
> the preprocessing tokens (of the string-literal preprocessing token, that 
> is; spelling variations for the other preprocessing tokens aren't possible 
> here) and the presence or absence of whitespace between them.
> _Pragma("foo")
> _Pragma ("foo")
> _Pragma("foo" )
> _Pragma(L"foo")
> _Pragma ( "foo" )
> (for example) should all have their spelling preserved by stringization 
> (but any nonempty white space sequence becomes a single space).

Yeah.  And not just that: I think all the exact whitespace inside the
string literal has to be preserved too (this time with no replacement of
nonempty white space with a single space).

Consider e.g. in pragma-3.c:
#define inner(...) #__VA_ARGS__ ; _Pragma   (	"   omp		error severity   (warning)	message (\"Test\") at(compilation)" )
should yield:
  const char *str = "\"1,2\" ; _Pragma ( \"   omp		error severity   (warning)	message (\\\"Test\\\") at(compilation)\" )";
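The whitespace rules themselves can be checked with an ordinary identifier,
which keeps any _Pragma handling out of the picture.  A minimal sketch,
assuming nothing beyond standard stringization; STR, my_pragma and same are
names made up for this example and are not part of the testsuite:

```c
#include <string.h>

/* Stringize the macro arguments as spelled, without expanding them.  */
#define STR(...) #__VA_ARGS__

/* my_pragma is a made-up identifier standing in for _Pragma.  */
static const char *const tight = STR(my_pragma("foo"));
static const char *const spaced = STR(my_pragma   (   "foo"   ));
static const char *const wide = STR(my_pragma(L"foo"));
static const char *const inner_ws = STR(  my_pragma (   "  a   b \"c\"" )  );

/* Return nonzero if the two strings are equal.  */
static int
same (const char *a, const char *b)
{
  return strcmp (a, b) == 0;
}
```

Here spaced comes out as my_pragma ( "foo" ): each whitespace run between
tokens collapses to one space, and leading and trailing whitespace in the
argument is dropped.  In inner_ws the runs of spaces inside the string
literal survive unchanged, with only the quotes and backslashes escaped.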

I guess we could encode the PREV_WHITE flags from the ( and ) tokens as 2 separate
bits somewhere (e.g. in some bits of the pragma id), but we need to encode the whole
string literal somewhere too.
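A rough sketch of the two-bit idea, just to make it concrete.  Everything
named sketch_* or SKETCH_* is made up for illustration and is not existing
libcpp API; keeping the flags in the two low bits is only one possible
layout:

```c
/* Hypothetical layout: the two low bits carry the whitespace flags,
   the remaining bits hold the pragma id proper.  */
#define SKETCH_LPAREN_WHITE 1u  /* Whitespace preceded the '(' token.  */
#define SKETCH_RPAREN_WHITE 2u  /* Whitespace preceded the ')' token.  */
#define SKETCH_FLAG_BITS 2

/* Combine a pragma id with the two whitespace flags.  */
static unsigned int
sketch_pack (unsigned int id, int lparen_white, int rparen_white)
{
  return (id << SKETCH_FLAG_BITS)
         | (lparen_white ? SKETCH_LPAREN_WHITE : 0u)
         | (rparen_white ? SKETCH_RPAREN_WHITE : 0u);
}

/* Recover the pragma id from a packed value.  */
static unsigned int
sketch_id (unsigned int packed)
{
  return packed >> SKETCH_FLAG_BITS;
}

/* Nonzero if whitespace preceded the '(' token.  */
static int
sketch_lparen_white (unsigned int packed)
{
  return (packed & SKETCH_LPAREN_WHITE) != 0;
}

/* Nonzero if whitespace preceded the ')' token.  */
static int
sketch_rparen_white (unsigned int packed)
{
  return (packed & SKETCH_RPAREN_WHITE) != 0;
}
```

This costs two bits of the id range and, as said, still leaves the spelling
of the string literal unrecorded.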
Now, in cpp_token we have:
  union cpp_token_u {

    /* Caller-supplied identifier for a CPP_PRAGMA.  */
    unsigned int GTY ((tag ("CPP_TOKEN_FLD_PRAGMA"))) pragma;
where several other members of the union are structs, either with a pair
of unsigned and pointer or two pointers.  So, could we make
the pragma union member also a struct with the pragma id and
pointer to the _Pragma string literal cpp_token?
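Something along these lines, as a sketch only.  The sketch_* types are
simplified stand-ins made up for this mail, not the real cpp_token, and the
GTY markers a real change would need are left out:

```c
#include <stddef.h>

/* Simplified stand-in for cpp_token; the real one has more fields.  */
struct sketch_token
{
  const char *spelling;  /* Spelling of the token as written.  */
  unsigned int flags;    /* Room for flags such as PREV_WHITE.  */
};

/* Hypothetical replacement for the plain 'unsigned int pragma' member.  */
struct sketch_pragma
{
  unsigned int id;                 /* Caller-supplied pragma identifier.  */
  const struct sketch_token *str;  /* _Pragma string literal token, or
                                      NULL when none was recorded.  */
};

/* Return the recorded string literal spelling, or NULL if there is none.  */
static const char *
sketch_pragma_spelling (const struct sketch_pragma *p)
{
  return p->str ? p->str->spelling : NULL;
}

/* Sample data: a pragma that remembers its original string literal.  */
static const struct sketch_token sample_lit = { "\"   omp error\"", 0 };
static const struct sketch_pragma sample = { 1, &sample_lit };
static const struct sketch_pragma sample_plain = { 2, NULL };
```

With the token pointer available, stringization could copy the literal's
spelling verbatim instead of reconstructing it from the pragma id.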

Though that doesn't solve the case in destringize_and_run where
pfile->directive_result.type != CPP_PRAGMA.

Are we handling the pragma at a wrong phase of preprocessing?

