This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Fix for backslash interpretation in #line and #-markers (1/2)
- From: Neil Booth <neil at daikokuya dot demon dot co dot uk>
- To: Zack Weinberg <zack at codesourcery dot com>
- Cc: gcc-patches at gcc dot gnu dot org, Mike Stump <mrs at windriver dot com>
- Date: Fri, 15 Feb 2002 07:56:48 +0000
- Subject: Re: Fix for backslash interpretation in #line and #-markers (1/2)
- References: <20020215035800.GB2203@codesourcery.com>
Zack Weinberg wrote:-
> This is the first half of a fix for backslash interpretation in #line
> directives and #-markers. We were processing these with inconsistent
> and buggy semantics, which made it impossible for downstream code to
> process textual preprocessor output under some conditions (rare on
> Unix, but normal on Windows).
But these inconsistent semantics are the ones the standard mandates.
> The new rules are: #line treats its optional filename argument the
> same way that #include does, i.e. backslash is a normal text
> character.
But the standard states that it is a string literal (without escape
interpretation, since that happens later), whereas the RHS of #include
is not.
> In #-markers, however, the string constant is a real
> string constant; backslash introduces an escape sequence.
OK, this makes sense.
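For illustration only, here is a minimal sketch of what reading a #-marker
filename as a real string constant could look like. The name
dequote_filename is hypothetical and is not taken from the patch; the
sketch assumes the text between the quotes has already been isolated and
NUL-terminated, and it handles only simple and octal escapes (no hex
escapes).

```c
#include <stddef.h>

/* Hypothetical helper, not the code from the patch.  Decode SRC, the
   NUL-terminated text found between the quotes of a #-marker, into DST.
   Simple escapes (\n, \t, ...) and octal escapes of up to three digits
   are interpreted; a backslash before any other character yields that
   character.  Hex escapes are deliberately left out of this sketch.
   DST must have room for strlen (SRC) + 1 bytes.  Returns the number
   of bytes stored, not counting the terminating NUL.  */
size_t
dequote_filename (char *dst, const char *src)
{
  char *p = dst;

  while (*src)
    {
      char c = *src++;

      /* Ordinary character, or a lone backslash at the very end.  */
      if (c != '\\' || *src == '\0')
        {
          *p++ = c;
          continue;
        }

      c = *src++;
      if (c >= '0' && c <= '7')
        {
          /* Octal escape: at most three digits.  */
          unsigned int val = (unsigned int) (c - '0');
          int digits = 1;

          while (digits < 3 && *src >= '0' && *src <= '7')
            {
              val = val * 8 + (unsigned int) (*src++ - '0');
              digits++;
            }
          *p++ = (char) (val & 0xff);
        }
      else
        switch (c)
          {
          case 'a': *p++ = '\a'; break;
          case 'b': *p++ = '\b'; break;
          case 'f': *p++ = '\f'; break;
          case 'n': *p++ = '\n'; break;
          case 'r': *p++ = '\r'; break;
          case 't': *p++ = '\t'; break;
          case 'v': *p++ = '\v'; break;
          default:  *p++ = c;    break;  /* Covers \\ and \".  */
          }
    }

  *p = '\0';
  return (size_t) (p - dst);
}
```

Under these rules the marker text c:\\newcode\\file.c decodes to the
intended Windows path, while the unescaped text c:\newcode\file.c decodes
to a name containing a newline and a form feed.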
> The standalone preprocessor is changed to escape all dangerous characters
> in its output. So, given this input
>
> int x(void) {
> #line 3 "^A^B^C^D^E^F" /* where ^A etc are hard control characters */
> int i = 0;
> #line 5 "c:\newcode\file.c"
> i++;
> # 7 "\""
> return i;
> }
>
> you get this preprocessor output
>
> # 1 "test.c"
> # 1 "<built-in>"
> # 1 "<command line>"
> # 1 "test.c"
> int x(void) {
> # 3 "\001\002\003\004\005\006"
> int i = 0;
> # 5 "c:\\newcode\\file.c"
> i++;
> # 7 "\""
> return i;
> }
OK; apart from the quoted output, how is this different? Or is that the
only difference?
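To make the quoting in that output concrete, here is a rough sketch of an
escaping routine that would produce the marker text shown above. The name
quote_filename and its interface are assumptions for illustration, not the
patch's actual code. It always emits three octal digits so that a following
digit in the filename cannot be absorbed into the escape.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not the function from the patch.  Write the LEN
   bytes at SRC into DST as the body of a C string literal.  Backslash
   and double quote get a backslash prefix; any byte outside printable
   ASCII becomes a three-digit octal escape.  DST must have room for
   4 * LEN + 1 bytes.  Returns the number of bytes written, not
   counting the terminating NUL.  */
size_t
quote_filename (char *dst, const unsigned char *src, size_t len)
{
  char *p = dst;
  size_t i;

  for (i = 0; i < len; i++)
    {
      unsigned char c = src[i];

      if (c == '\\' || c == '"')
        {
          *p++ = '\\';
          *p++ = (char) c;
        }
      else if (c >= 0x20 && c < 0x7f)
        *p++ = (char) c;
      else
        {
          /* Always four characters: a backslash and three digits.  */
          sprintf (p, "\\%03o", (unsigned int) c);
          p += 4;
        }
    }

  *p = '\0';
  return (size_t) (p - dst);
}
```

With this sketch, the 17-byte name c:\newcode\file.c comes out as
c:\\newcode\\file.c, and the six control characters come out as
\001\002\003\004\005\006.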
> and that can be fed back into the preprocessor safely. Formerly, bad
> things would appear in the output, such as
>
> # 7 """
So why wasn't Mike's original patch OK (did it not dequote # markers
on input)? It quoted the line marker output IIRC.
> Assembly output is still incorrect. I get, for instance,
>
> .file 3 "c:\newcode\file.c"
>
> which the assembler interprets as a filename containing a newline and
> a form feed. That will be fixed by the second half of the patch.
Yup, everything outputting a file name needs to quote it consistently
with whatever reading it in expects.
Neil.