This is the mail archive of the
mailing list for the GCC project.
Re: libcpp how-to question: Tokenizing and spaces & tabs - or special Fortran needs
- From: "N.M. Maclaren" <nmm1 at cam dot ac dot uk>
- To: Dodji Seketeli <dodji at redhat dot com>
- Cc: Tobias Burnus <burnus at net-b dot de>, Tom Tromey <tom at tromey dot com>, "Joseph S. Myers" <joseph at codesourcery dot com>, Manuel López-Ibáñez <lopezibanez at gmail dot com>, gfortran <fortran at gcc dot gnu dot org>, gcc <gcc at gcc dot gnu dot org>
- Date: 01 Dec 2014 11:12:39 +0000
- References: <5479E1EB dot 3080204 at net-b dot de> <874mtfu2h0 dot fsf at redhat dot com>
On Dec 1 2014, Dodji Seketeli wrote:
> Just for the record -- as I am trimming the original post for legibility
> -- the initial message I am replying to can be read at
>
> Tobias Burnus <email@example.com> writes:
>
> > Do you have a suggestion how to best implement this white-space
> > preserving with libcpp? It can (and presumably should) be a special
> > flag/function for Fortran.
>
> I would propose that libcpp gets extended to gain a new kind of token
> which type would be something like 'CPP_WHITESPACE', which would contain
> the exact spelling of the continuous non-vertical spaces that are
>
> There would then be a new libcpp option that would actually make
> cpp_get_token() yield that kind of token. The rest of the behaviour of
> cpp_get_token() that is today associated with white spaces would remain

I would strongly recommend at least extending that to vertical white
space as well. Non-default vertical white space has caused portability
trouble for many decades in many languages (including Fortran and C).
Inter alia, C's rules for what white space is acceptable and what it
does in translation phases 1-3 are poorly defined (for unavoidable
reasons, unfortunately), usually not well understood, and differ
considerably between compilers. Fortran is easier - it's processor-
dependent, and that's it - experienced people never use anything
non-default. There is also the 'interesting' case of whether CR-LF
is acceptable on Unix-like systems.
Other than that, what you say makes a lot of sense, but I am not an
expert on the internals of gcc/gfortran.
Aside: if anyone, ever again, designs a language where overprinting is
part of the syntax, please sign me up for the cluebat team to educate
them! I.e. bare CRs should either be Macintosh line separators or
diagnosed as errors, in ALL languages.