Fix handling of incomplete header names (committed)
Joseph S. Myers
joseph@codesourcery.com
Sun Feb 22 01:09:00 GMT 2009
Five years ago I posted essentially this patch
<http://article.gmane.org/gmane.comp.gcc.patches/54558> to fix a
regression in a corner case of lexing; see that message for more
details.
I still think this is a bug in GCC, not in the standard, and when the
committee later had the opportunity to address the lexing rules
with DR#324 it did not change the greedy algorithm. As C and C++
front-end maintainers are now also preprocessor maintainers, and no
current preprocessor maintainer objected in the previous discussion,
I have now committed this updated patch. Bootstrapped with no
regressions on i686-pc-linux-gnu.
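For readers who want the idea without reading lex.c, here is a minimal standalone sketch of the fallback. It is not the libcpp code: `toy_type`, `TOY_HEADER_NAME`, `TOY_LESS` and `toy_lex_angled` are hypothetical names invented for illustration, and the sketch ignores comments, trigraphs and line splicing. It only shows that when the line ends before a '>' is found, the lexer yields a plain '<' token so that the rest of the line can be relexed as ordinary tokens and macro expanded.

```c
#include <stddef.h>

/* Hypothetical token kinds for this sketch; libcpp's real enum differs.  */
enum toy_type { TOY_HEADER_NAME, TOY_LESS };

/* Try to lex an angle-bracketed header name starting at S, which must
   point at '<'.  On success, store the spelling length (including both
   brackets) in *LEN.  If the line ends before a '>' is seen, report a
   plain '<' of length 1 so the caller can relex what follows as
   ordinary tokens.  */
static enum toy_type
toy_lex_angled (const char *s, size_t *len)
{
  const char *cur = s + 1;

  while (*cur != '\0' && *cur != '\n')
    {
      if (*cur == '>')
        {
          *len = (size_t) (cur - s) + 1;
          return TOY_HEADER_NAME;
        }
      cur++;
    }

  /* Unterminated: fall back to a single '<' token.  */
  *len = 1;
  return TOY_LESS;
}
```

With this sketch, "<stddef.h>" lexes as one header-name token, while "<stddef.h" followed by a newline lexes as '<' alone, leaving "stddef", ".", and "h" to be lexed separately.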
libcpp:
2009-02-21  Joseph Myers  <joseph@codesourcery.com>

	* lex.c (lex_string): Return a CPP_LESS token for missing '>' in a
	header name.
	(_cpp_lex_direct): Handle this.

gcc/testsuite:
2009-02-21  Joseph Myers  <joseph@codesourcery.com>

	* gcc.dg/cpp/include4.c: New test.
Index: gcc/testsuite/gcc.dg/cpp/include4.c
===================================================================
--- gcc/testsuite/gcc.dg/cpp/include4.c (revision 0)
+++ gcc/testsuite/gcc.dg/cpp/include4.c (revision 0)
@@ -0,0 +1,14 @@
+/* Preprocessing tokens are always formed according to a greedy algorithm,
+ so "#include <stddef.h" must be interpreted as a sequence of tokens,
+ of which the "h" then gets macro expanded. Likewise the other
+ examples. */
+
+#define h h>
+#include <stddef.h
+#undef h
+
+#define foo stddef.h>
+#include <foo
+
+#include <foo /*
+> */
Index: libcpp/lex.c
===================================================================
--- libcpp/lex.c (revision 144344)
+++ libcpp/lex.c (working copy)
@@ -613,7 +613,9 @@ create_literal (cpp_reader *pfile, cpp_t
/* Lexes a string, character constant, or angle-bracketed header file
name. The stored string contains the spelling, including opening
quote and leading any leading 'L', 'u' or 'U'. It returns the type
- of the literal, or CPP_OTHER if it was not properly terminated.
+ of the literal, or CPP_OTHER if it was not properly terminated, or
+ CPP_LESS for an unterminated header name which must be relexed as
+ normal tokens.
The spelling is NUL-terminated, but it is not guaranteed that this
is the first NUL since embedded NULs are preserved. */
@@ -652,6 +654,14 @@ lex_string (cpp_reader *pfile, cpp_token
else if (c == '\n')
{
cur--;
+ /* Unmatched quotes always yield undefined behavior, but
+ greedy lexing means that what appears to be an unterminated
+ header name may actually be a legitimate sequence of tokens. */
+ if (terminator == '>')
+ {
+ token->type = CPP_LESS;
+ return;
+ }
type = CPP_OTHER;
break;
}
@@ -1181,7 +1191,8 @@ _cpp_lex_direct (cpp_reader *pfile)
if (pfile->state.angled_headers)
{
lex_string (pfile, result, buffer->cur - 1);
- break;
+ if (result->type != CPP_LESS)
+ break;
}
result->type = CPP_LESS;
--
Joseph S. Myers
joseph@codesourcery.com