Bug 88424 - Inserts newlines when preserving comments for a file using Windows newlines
Status: UNCONFIRMED
Alias: None
Product: gcc
Classification: Unclassified
Component: preprocessor
Version: unknown
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-12-09 18:16 UTC by Christoph Reiter
Modified: 2024-05-25 18:20 UTC

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Description Christoph Reiter 2018-12-09 18:16:15 UTC
gcc --version
gcc (Debian 8.2.0-11) 8.2.0

When preprocessing a file that uses Windows (CRLF) newlines while preserving comments (the -C option), gcc inserts an extra newline after each line inside comments:

* Write the following to test.c and make sure it uses Windows newlines

/**
 * Foo-
 * bar-
 * quux
 */
int main() {
    return 0;
}

* Execute "gcc -C -E test.c"

* Expected (this is what clang produces):

[...]
/**
 * Foo-
 * bar-
 * quux
 */
int main() {
    return 0;
}

* Actual:

/**

 * Foo-

 * bar-

 * quux

 */
# 6 "test.c"
int main() {
    return 0;
}
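
To rule out editor or git settings when reproducing this, the test file can be generated with explicit CRLF terminators. The following is a minimal sketch; the helper name write_crlf_lines is hypothetical and not part of the original report.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes each line followed by an explicit CR LF pair.
   The stream should be opened in binary mode ("wb") so that the C library
   does not translate the line endings itself.
   Returns 0 on success, -1 on a write error. */
int write_crlf_lines(FILE *out, const char *const *lines, size_t count)
{
    assert(out != NULL);
    for (size_t i = 0; i < count; i++) {
        if (fputs(lines[i], out) == EOF || fputs("\r\n", out) == EOF)
            return -1;
    }
    return fflush(out) == EOF ? -1 : 0;
}
```

Calling it with the eight lines of test.c on a stream obtained from fopen("test.c", "wb") should give a suitable input for "gcc -C -E test.c".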
Comment 1 Andrew Pinski 2018-12-09 18:22:38 UTC
Most likely \r is being treated as a newline, just like \n. Remember that \r by itself is the line ending on the classic Mac OS.
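
The suspected effect can be illustrated with a toy normalizer that treats \r and \n as independent line terminators. This is only a sketch of the hypothesis, not code from the GCC preprocessor; the function name naive_normalize is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model: every CR and every LF is rewritten to LF on its own.
   A CRLF pair therefore becomes two LFs, i.e. an extra empty line.
   dst must have room for len bytes; returns the number of bytes written. */
size_t naive_normalize(const char *src, size_t len, char *dst)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < len; i++)
        dst[i] = (src[i] == '\r') ? '\n' : src[i];
    return len;
}
```

Under this model a comment line ending in CRLF would be followed by a blank line, which matches the "Actual" output in the description.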
Comment 2 Christoph Reiter 2018-12-09 18:27:39 UTC
For context, we use gcc in gobject-introspection, where we parse metadata from the C comments. This bug breaks that parsing when users on Windows have git set up to auto-convert line endings: https://gitlab.gnome.org/GNOME/gobject-introspection/issues/243
Comment 3 Peter Damianov 2024-05-25 16:49:47 UTC
Looking at the output in a hex editor, gcc is changing each CRLF to LFLF; there is no CR in the output.

  /* If the file is using old-school Mac line endings (\r only),
     terminate with another \r, not an \n, so that we do not mistake
     the \r\n sequence for a single DOS line ending and erroneously
     issue the "No newline at end of file" diagnostic.  */
  if (to.len && to.text[to.len - 1] == '\r')
    to.text[to.len] = '\r';
  else
    to.text[to.len] = '\n';

I noticed this code, but commenting it out makes the compiler fail selftests.

I'll keep looking to see if I can find the code responsible for doing that.
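
The hex-editor check can also be done programmatically by counting each line-ending form in the preprocessed output. A minimal sketch follows; the struct and function names are assumptions chosen for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Counts of each line-ending form found in a buffer. */
struct eol_counts {
    size_t crlf;     /* CR immediately followed by LF */
    size_t lone_cr;  /* CR not followed by LF */
    size_t lone_lf;  /* LF not preceded by CR */
};

/* Hypothetical helper: classifies every line ending in buf[0..len). */
struct eol_counts count_eols(const char *buf, size_t len)
{
    struct eol_counts c = {0, 0, 0};
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\r') {
            if (i + 1 < len && buf[i + 1] == '\n') {
                c.crlf++;
                i++;  /* skip the LF of the pair */
            } else {
                c.lone_cr++;
            }
        } else if (buf[i] == '\n') {
            c.lone_lf++;
        }
    }
    return c;
}
```

For the output described in this comment, crlf and lone_cr would both be zero, while the input test.c would contain only CRLF endings.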
Comment 4 Peter Damianov 2024-05-25 16:55:37 UTC
Outside of comments, all line endings are converted to LF.

So it must be something specifically to do with comment processing.
Comment 5 Peter Damianov 2024-05-25 18:20:00 UTC
I checked clang's behavior: it converts CRLF to LF outside comments, but leaves CRLF intact inside comments. I'm not sure whether this behavior is worth emulating. I think it would be better to remove the CR from the CRLF in comments too.
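
The conversion proposed here (drop the CR of each CRLF, inside comments as well) could look roughly like the sketch below. It is a standalone illustration, not a patch against the GCC preprocessor, and the name crlf_to_lf is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Copies src to dst, dropping each CR that is immediately followed by LF.
   Lone CRs (classic Mac OS endings) are copied unchanged.
   dst must have room for len bytes; returns the number of bytes written. */
size_t crlf_to_lf(const char *src, size_t len, char *dst)
{
    size_t out = 0;
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < len; i++) {
        if (src[i] == '\r' && i + 1 < len && src[i + 1] == '\n')
            continue;  /* the LF is copied on the next iteration */
        dst[out++] = src[i];
    }
    return out;
}
```

Leaving lone CRs untouched keeps this sketch from changing how classic Mac OS files are handled, which is the case the quoted code in comment 3 is concerned with.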