Bug 60723 - Line directives with incorrect system header flag
Summary: Line directives with incorrect system header flag
Status: ASSIGNED
Alias: None
Product: gcc
Classification: Unclassified
Component: preprocessor (show other bugs)
Version: 4.9.0
: P3 normal
Target Milestone: ---
Assignee: Dodji Seketeli
URL:
Keywords:
: 61412 61435 (view as bug list)
Depends on: 61817
Blocks:
  Show dependency treegraph
 
Reported: 2014-04-01 01:36 UTC by Nicholas
Modified: 2018-11-20 13:40 UTC (History)
13 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2014-06-17 00:00:00


Attachments
Proposed Patch (476 bytes, patch)
2014-04-01 01:36 UTC, Nicholas
Details | Diff
Repro header (57 bytes, text/plain)
2014-04-01 01:38 UTC, Nicholas
Details
Repro source (72 bytes, text/x-csrc)
2014-04-01 01:39 UTC, Nicholas
Details
A patch candidate that I am currently testing (2.80 KB, patch)
2014-06-26 10:34 UTC, Dodji Seketeli
Details | Diff
preprocessed input (951 bytes, text/plain)
2014-07-01 14:15 UTC, christophe.lyon
Details
preprocessed input (11.09 KB, text/plain)
2014-07-01 14:15 UTC, christophe.lyon
Details
Updated patch that avoid emitting new lines when it's prohibited (3.65 KB, patch)
2014-07-02 13:43 UTC, Dodji Seketeli
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Nicholas 2014-04-01 01:36:56 UTC
Created attachment 32503 [details]
Proposed Patch
Comment 1 Nicholas 2014-04-01 01:37:50 UTC
The preprocessor emits line directives with the system-header flag when using complicated macros. This causes certain warnings, like -Wswitch and -Wreorder, to be suppressed.

This report includes a patch as well as a detailed description.


== Reproduction Steps ==

Preprocessing must be done in a separate phase.

There must be a function-like macro which:

  (1) Is called with a newline inside of its arguments list
  (2) Is defined in a system header
  (3) Expands to contain a builtin macro followed by a non-builtin token


A minimal reproduction is attached. Compiling 'src.cpp' produces a div-by-zero warning when compiled by "g++ -isystem. src.cpp". No warning is emitted when preprocessing is split, such as with "g++ -isystem. -no-integrated-cpp src.cpp".

== Cause ==

The preprocessed output of 'src.cpp' shows that a line directive is being inserted after the __FILE__ macro, and that that directive is flagging 'src.cpp' as a system file (using the flag '3').

In a function-like macro expansion, regular tokens are line-marked as being expanded at the opening parenthesis, while builtins are line-marked as being expanded at the closing parenthesis (both of these choices are reasonable, though the inconsistency is probably a design flaw). When a non-builtin follows a builtin in a multiline macro call, the line numbers of the tokens are inconsistent, forcing a line directive to be inserted.


== Details ==

This bug was occurring in our code base. We use ccache (which preprocesses independently of main compilation) and gflags (which has sufficiently complicated macros).

Testing was done on 4.9.

Git bisect blames 611f1003dbf4ebb341c2eda0fcc0e058c421eb6b (4.8.0 20120430), authored by dodji on Mon Apr 30 11:43:43 2012 +0000. However, that diff does not seem to be the root cause.

gcc -v

Using built-in specs.
COLLECT_GCC=./igpp
COLLECT_LTO_WRAPPER=/data/users/njormrod/src/gcc/_builds/39706bc97269366073d2eb4cf3ecf7872513627d/libexec/gcc/x86_64-unknown-linux-gnu/4.9.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../../src/configure --prefix=/data/users/njormrod/src/gcc/_builds/39706bc97269366073d2eb4cf3ecf7872513627d --with-gmp=[local gmp-5.0.1] --with-mpfr=[local mpfr-3.0.0] --with-mpc=[local mpc-0.8.2] --disable-multilib --enable-languages=c,c++ --disable-libgcj --disable-bootstrap --disable-static
Thread model: posix
gcc version 4.9.0 20140331 (experimental) (GCC)
Comment 2 Nicholas 2014-04-01 01:38:53 UTC
Created attachment 32504 [details]
Repro header
Comment 3 Nicholas 2014-04-01 01:39:21 UTC
Created attachment 32505 [details]
Repro source
Comment 4 Manuel López-Ibáñez 2014-04-01 09:51:05 UTC
Patches go to the gcc-patches mailing list for review. You will also need a testcase formatted for inclusion on the testsuite and test your patch against the GCC testsuite. See also: http://gcc.gnu.org/contribute.html

> In a function-like macro expansion, regular tokens are line-marked as being
> expanded at the opening parenthesis, while builtins are line-marked as being
> expanded at the closing parenthesis (both of these choices are reasonable,
> though the inconsistency is probably a design flaw). When a non-builtin
> follows a builtin in a multiline macro call, the line numbers of the tokens
> are inconsistent, forcing a line directive to be inserted.

I don't see how your patch fixes this inconsistency.
Comment 5 Nicholas 2014-04-01 15:30:48 UTC
Thank you for the link, Manuel. This is my first contribution to gcc. I'll see if I can follow your instructions.

As for the content of the patch, it fixes the second and more egregious error. In and of itself, token-line inconsistency is not the end of the world; the real problem is that the code that inserts the bad line directive has a bug. Specifically, it uses the macro's source file to determine the callsite file's properties. That is what the patch fixes.

I do not have a patch for the line inconsistency, which is one of the reasons that I posted this to the bug forum.
Comment 6 Matt Godbolt 2014-06-04 17:28:21 UTC
*** Bug 61412 has been marked as a duplicate of this bug. ***
Comment 7 Nicholas 2014-06-10 06:21:25 UTC
What a pleasant surprise to see your name, Matt.

This dropped off my radar for a bit, but I'm pretty close now to getting a properly organized patch ready. Details to follow within 5 days.
Comment 8 Matt Godbolt 2014-06-10 13:18:31 UTC
Thanks Nicholas :)
Comment 9 Nicholas 2014-06-15 22:31:43 UTC
Manuel, I have filed a patch to gcc-patches. https://gcc.gnu.org/ml/gcc-patches/2014-06/msg00862.html
Comment 10 Manuel López-Ibáñez 2014-06-17 09:53:29 UTC
(In reply to Nicholas from comment #9)
> Manuel, I have filed a patch to gcc-patches.
> https://gcc.gnu.org/ml/gcc-patches/2014-06/msg00862.html

Great!  unfortunately I don't have any power to approve patches. You'll have to wait until one of the CPP/C/C++ maintainers approves it. Perhaps you can ping the patch once per week (see the mailing list archives for examples of people pinging patches).

Does it fix the problem reported by Matt in PR6142 also?
Comment 11 Dodji Seketeli 2014-06-25 15:32:02 UTC
I am taking this one.
Comment 12 Dodji Seketeli 2014-06-26 10:34:06 UTC
Created attachment 33010 [details]
A patch candidate that I am currently testing

This the patch I am running through bootstrap at the moment.

The patch basically makes the pre-processor detects changes in system-ness of tokens and upon each of these change, emits a line marking directive so that the information about the system-ness change is not lost.

I haven't commented the code yet.

Nicholas, maybe you could test the patch and give me feedback until my (slow) machine completes the bootstrap?

Thanks.
Comment 13 Dodji Seketeli 2014-07-01 09:17:47 UTC
Author: dodji
Date: Tue Jul  1 09:17:14 2014
New Revision: 212194

URL: https://gcc.gnu.org/viewcvs?rev=212194&root=gcc&view=rev
Log:
PR preprocessor/60723 - missing system-ness marks for macro tokens

When a system macro is expanded in a non-system file during
out-of-line preprocessing, it can happen that the preprocessor forgets
to emit line markers to express the system-ness status of tokens that
come after the expansion of the macro.

That can lead to situations where the entire non-system file can be
considered as being a system file and thus have its warnings be
discarded during the compilation of the resulting preprocessed file.

My understanding is that this is due to the preprocessor not
systematically detecting (and reporting) the change in system-ness of
tokens.

And this is what this patch does.  Each time the system-ness of a
given token is different from the previous token that was emitted by
the preprocessor, it emits a line marker for the sole purpose of
marking the new system-ness of the subsequent tokens to come.

Bootstrapped and tested on x86_64-unknown-linux-gnu against trunk.

gcc/c-family/ChangeLog:
	* c-ppoutput.c (struct print::prev_was_system_token): New data
	member.
	(init_pp_output): Initialize it.
	(maybe_print_line_1, maybe_print_line, print_line_1, print_line)
	(do_line_change): Return a flag saying if a line marker was
	emitted or not.
	(scan_translation_unit): Detect if the system-ness of the token we
	are about to emit is different from the one of the previously
	emitted token.  If so, emit a line marker.  Avoid emitting
	useless adjacent line markers.
	(scan_translation_unit_directives_only): Adjust.

gcc/testsuite/ChangeLog:
	* gcc.dg/cpp/syshdr{4,5}.{c,h}: New test files.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Added:
    trunk/gcc/testsuite/gcc.dg/cpp/syshdr4.c
    trunk/gcc/testsuite/gcc.dg/cpp/syshdr4.h
    trunk/gcc/testsuite/gcc.dg/cpp/syshdr5.c
    trunk/gcc/testsuite/gcc.dg/cpp/syshdr5.h
Modified:
    trunk/gcc/c-family/ChangeLog
    trunk/gcc/c-family/c-ppoutput.c
    trunk/gcc/testsuite/ChangeLog
Comment 14 christophe.lyon 2014-07-01 14:12:41 UTC
This patch breaks the build of glibc on ARM and AArch64 targets.

With GCC configured for target arm-none-linux-gnueabi

The command line used:
(echo '#define SYSCALL_NAME kill';  echo '#define SYSCALL_NARGS 2';  echo '#define SYSCALL_SYMBOL __kill';  echo '#include <syscall-template.S>';  echo 'weak_alias (__kill, kill)';  echo 'libc_hidden_weak (kill)'; ) | /work1/lyon/Work/ARM/Linaro/builds/gcc-fsf-trunk/tools/bin/arm-none-linux-gnueabi-gcc -c -include ../include/libc-symbols.h       -DASSEMBLER -x assembler-with-cpp - -I../include -I/work1/lyon/Work/ARM/Linaro/builds/gcc-fsf-trunk/obj-arm-none-linux-gnueabi/glibc-1/signal  -I/work1/lyon/Work/ARM/Linaro/builds/gcc-fsf-trunk/obj-arm-none-linux-gnueabi/glibc-1  -I../sysdeps/unix/sysv/linux/arm  -I../nptl/sysdeps/unix/sysv/linux  -I../nptl/sysdeps/pthread  -I../sysdeps/pthread  -I../sysdeps/unix/sysv/linux  -I../sysdeps/gnu  -I../sysdeps/unix/inet  -I../nptl/sysdeps/unix/sysv  -I../sysdeps/unix/sysv  -I../sysdeps/unix/arm  -I../nptl/sysdeps/unix  -I../sysdeps/unix  -I../sysdeps/posix -I../sysdeps/arm/armv7/multiarch  -I../sysdeps/arm/armv7  -I../sysdeps/arm/armv6t2  -I../sysdeps/arm/armv6  -I../sysdeps/arm/nptl  -I../sysdeps/arm/include -I../sysdeps/arm  -I../sysdeps/arm/soft-fp  -I../sysdeps/wordsize-32  -I../sysdeps/ieee754/flt-32  -I../sysdeps/ieee754/dbl-64  -I../sysdeps/ieee754  -I../sysdeps/generic  -I../nptl  -I.. -I../libio -I.  
../sysdeps/unix/syscall-template.S: Assembler messages:
../sysdeps/unix/syscall-template.S:81: Error: missing ')'
../sysdeps/unix/syscall-template.S:81: Error: missing expression -- `ldr r7,=('
../sysdeps/unix/syscall-template.S:81: Error: junk at end of line, first unrecognized character is `('
../sysdeps/unix/syscall-template.S:81: Error: junk at end of line, first unrecognized character is `)'

Attached the output of the above command with -E and -E -dD.
Comment 15 christophe.lyon 2014-07-01 14:15:10 UTC
Created attachment 33039 [details]
preprocessed input

This preprocessed input (-E) shows how commit 212194 is broken.
Comment 16 christophe.lyon 2014-07-01 14:15:52 UTC
Created attachment 33040 [details]
preprocessed input

This preprocessed input (-E -dD) shows how commit 212194 is broken.
Comment 17 Dodji Seketeli 2014-07-02 13:43:18 UTC
Created attachment 33045 [details]
Updated patch that avoid emitting new lines when it's prohibited
Comment 18 Dodji Seketeli 2014-07-02 13:44:29 UTC
Thanks Christophe,

The attached patch above should hopefully fix the issue it was causing.  Is there a chance that you test that it doesn't break your arm build before I try to commit it again?

Thanks in advance.
Comment 19 christophe.lyon 2014-07-03 13:48:02 UTC
As discussed on IRC, I could test your patch, and the compiler now builds successfully.
Comment 20 Dodji Seketeli 2014-07-16 08:58:28 UTC
> In a function-like macro expansion, regular tokens are line-marked as being
> expanded at the opening parenthesis, while builtins are line-marked as being
> expanded at the closing parenthesis (both of these choices are reasonable,
> though the inconsistency is probably a design flaw).

As noted in discussions we had on the mailing list at https://gcc.gnu.org/ml/gcc-patches/2014-07/msg01080.html, I think the fact that tokens resulting from the expansion of a built-in macro have their location set to the closing parenthesis of the invocation of the enclosing function-like macro is a separate issue that needs to be dealt with.

I have thus filed PR61817 to track this.

Thanks.
Comment 21 Dodji Seketeli 2014-07-16 10:34:08 UTC
Author: dodji
Date: Wed Jul 16 10:33:36 2014
New Revision: 212638

URL: https://gcc.gnu.org/viewcvs?rev=212638&root=gcc&view=rev
Log:
PR preprocessor/60723 - missing system-ness marks for macro tokens

When a system macro is expanded in a non-system file during
out-of-line preprocessing, it can happen that the preprocessor forgets
to emit line markers to express the system-ness status of tokens that
come after the expansion of the macro.

That can lead to situations where the entire non-system file can be
considered as being a system file and thus have its warnings be
discarded during the compilation of the resulting preprocessed file.

My understanding is that this is due to the preprocessor not
systematically detecting (and reporting) the change in system-ness of
tokens.

And this is what this patch does.  Each time the system-ness of a
given token is different from the previous token that was emitted by
the preprocessor, it emits a line marker for the sole purpose of
marking the new system-ness of the subsequent tokens to come.

Bootstrapped and tested on x86_64-unknown-linux-gnu against trunk.

gcc/c-family/ChangeLog:
	* c-ppoutput.c (struct print::prev_was_system_token): New data
	member.
	(init_pp_output): Initialize it.
	(maybe_print_line_1, maybe_print_line, print_line_1, print_line)
	(do_line_change): Return a flag saying if a line marker was
	emitted or not.
	(scan_translation_unit): Detect if the system-ness of the token we
	are about to emit is different from the one of the previously
	emitted token.  If so, emit a line marker.  Avoid emitting useless
	adjacent line markers.  Avoid emitting line markers for tokens
	originating from the expansion of built-in macros.
	(scan_translation_unit_directives_only): Adjust.

gcc/testsuite/ChangeLog:
	* gcc.dg/cpp/syshdr{4,5}.{c,h}: New test files.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Added:
    trunk/gcc/testsuite/gcc.dg/cpp/syshdr4.c
    trunk/gcc/testsuite/gcc.dg/cpp/syshdr4.h
    trunk/gcc/testsuite/gcc.dg/cpp/syshdr5.c
    trunk/gcc/testsuite/gcc.dg/cpp/syshdr5.h
Modified:
    trunk/gcc/c-family/ChangeLog
    trunk/gcc/c-family/c-ppoutput.c
    trunk/gcc/testsuite/ChangeLog
Comment 22 Dodji Seketeli 2014-07-16 13:37:59 UTC
So finally the two patches that have been proposed at https://gcc.gnu.org/ml/gcc-patches/2014-07/msg01063.html and https://gcc.gnu.org/ml/gcc-patches/2014-07/msg01064.html have been accepted and committed to trunk.

I'll wait before closing this bug that the stakeholders following this test the tree with the new patches.

In the mean time, I'll look at the additional issue PR61817.

Thanks.
Comment 23 Nicholas 2014-07-17 03:36:05 UTC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61817

The more I talk, the more work you seem to have. :p

Thank you for taking this.


Cheers,
Nicholas
Comment 24 ktkachov 2014-07-22 09:55:37 UTC
(In reply to Dodji Seketeli from comment #22)
> So finally the two patches that have been proposed at
> https://gcc.gnu.org/ml/gcc-patches/2014-07/msg01063.html and
> https://gcc.gnu.org/ml/gcc-patches/2014-07/msg01064.html have been accepted
> and committed to trunk.
> 
> I'll wait before closing this bug that the stakeholders following this test
> the tree with the new patches.

This breaks building ncurses.

I've filed PR 61832 for this, but now trying to reduce a testcase.

> 
> In the mean time, I'll look at the additional issue PR61817.
> 
> Thanks.
Comment 25 ktkachov 2014-07-24 08:22:31 UTC
(In reply to ktkachov from comment #24)
> (In reply to Dodji Seketeli from comment #22)
> > So finally the two patches that have been proposed at
> > https://gcc.gnu.org/ml/gcc-patches/2014-07/msg01063.html and
> > https://gcc.gnu.org/ml/gcc-patches/2014-07/msg01064.html have been accepted
> > and committed to trunk.
> > 
> > I'll wait before closing this bug that the stakeholders following this test
> > the tree with the new patches.
> 
> This breaks building ncurses.
> 
> I've filed PR 61832 for this, but now trying to reduce a testcase.

Turns out that was a bug in ncurses, so disregard this
Comment 26 Manuel López-Ibáñez 2014-10-05 23:46:50 UTC
*** Bug 61435 has been marked as a duplicate of this bug. ***
Comment 27 Jakub Jelinek 2015-02-05 10:07:46 UTC
There are various regressions caused by this in various packages, the question is whether we just tell people to preprocess with -P if they do weird thing on the preprocessed output, or whether this fix was worth the trouble.
Affected fedora packages include at least:
cclive
cvc4
dans-gdal-scripts
emacs
ember
flamerobin
ghc-gtk
ghc-webkit
gnote
ksh
libason
libcmis
libgpg
libixion
liborcus
ncurses
openldap
perl
xorg-x11-server
xorg-x11-server-utils
zsh
Comment 28 Jakub Jelinek 2015-02-05 10:08:46 UTC
If this is about ccache, I wonder if ccache couldn't preprocess with -fdirectives-only instead, that way hopefully no info is lost.
Comment 29 Steve Ellcey 2016-12-07 19:23:33 UTC
I posted this question on gcc@gcc.gnu.org and got not response so I thought I would ask here on the bug report.

I am trying to understand the status of this bug and the patch
that fixes it.  It looks like a patch was submitted and checked
in for 5.0 to fix the problem reported and I see the new 
behavior caused by the patch in GCC 5.X compilers.  This behavior
caused a number of issues with configures and scripts that examined
preprocessed output as is mentioned in this bug report.
There was a later bug, 64864, complaining about the behavior and
that was closed as invalid.

But when I look at GCC 6.X or ToT (7.X) compilers I do not see the same
behavior as 5.X.  Was this patch reverted or was a new patch submitted
that undid some of this patches behavior?  I couldn't find any revert or
new patch to replace the original one so I am not sure when or why
the code changed back after the 5.X releases.

Here is a test case that I am preprocessing with g++ -E:

#include <stdbool.h>
class foo {
	void operator= ( bool bit);
	operator bool() const;
};

GCC 5.4 breaks up the operator delcarations with line markers and GCC 6.2
does not.
Comment 30 Eric Gallager 2018-08-20 23:38:00 UTC
(In reply to Steve Ellcey from comment #29)
> I posted this question on gcc@gcc.gnu.org and got not response so I thought
> I would ask here on the bug report.
> 
> I am trying to understand the status of this bug and the patch
> that fixes it.  It looks like a patch was submitted and checked
> in for 5.0 to fix the problem reported and I see the new 
> behavior caused by the patch in GCC 5.X compilers.  This behavior
> caused a number of issues with configures and scripts that examined
> preprocessed output as is mentioned in this bug report.
> There was a later bug, 64864, complaining about the behavior and
> that was closed as invalid.
> 
> But when I look at GCC 6.X or ToT (7.X) compilers I do not see the same
> behavior as 5.X.  Was this patch reverted or was a new patch submitted
> that undid some of this patches behavior?  I couldn't find any revert or
> new patch to replace the original one so I am not sure when or why
> the code changed back after the 5.X releases.
> 
> Here is a test case that I am preprocessing with g++ -E:
> 
> #include <stdbool.h>
> class foo {
> 	void operator= ( bool bit);
> 	operator bool() const;
> };
> 
> GCC 5.4 breaks up the operator delcarations with line markers and GCC 6.2
> does not.

This is the right link for where it is in the archives, right? 
https://gcc.gnu.org/ml/gcc/2016-11/msg00116.html
Probably got lost in the change-of-the-month rollover...