Created attachment 32503 [details] Proposed Patch
The preprocessor emits line directives with the system-header flag when using complicated macros. This causes certain warnings, like -Wswitch and -Wreorder, to be suppressed. This report includes a patch as well as a detailed description. == Reproduction Steps == Preprocessing must be done in a separate phase. There must be a function-like macro which: (1) Is called with a newline inside of its arguments list (2) Is defined in a system header (3) Expands to contain a builtin macro followed by a non-builtin token A minimal reproduction is attached. Compiling 'src.cpp' produces a div-by-zero warning when compiled by "g++ -isystem. src.cpp". No warning is emitted when preprocessing is split, such as with "g++ -isystem. -no-integrated-cpp src.cpp". == Cause == The preprocessed output of 'src.cpp' shows that a line directive is being inserted after the __FILE__ macro, and that that directive is flagging 'src.cpp' as a system file (using the flag '3'). In a function-like macro expansion, regular tokens are line-marked as being expanded at the opening parenthesis, while builtins are line-marked as being expanded at the closing parenthesis (both of these choices are reasonable, though the inconsistency is probably a design flaw). When a non-builtin follows a builtin in a multiline macro call, the line numbers of the tokens are inconsistent, forcing a line directive to be inserted. == Details == This bug was occurring in our code base. We use ccache (which preprocesses independently of main compilation) and gflags (which has sufficiently complicated macros). Testing was done on 4.9. Git bisect blames 611f1003dbf4ebb341c2eda0fcc0e058c421eb6b (4.8.0 20120430), authored by dodji on Mon Apr 30 11:43:43 2012 +0000. However, that diff does not seem to be the root cause. gcc -v Using built-in specs. COLLECT_GCC=./igpp COLLECT_LTO_WRAPPER=/data/users/njormrod/src/gcc/_builds/39706bc97269366073d2eb4cf3ecf7872513627d/libexec/gcc/x86_64-unknown-linux-gnu/4.9.0/lto-wrapper Target: x86_64-unknown-linux-gnu Configured with: ../../src/configure --prefix=/data/users/njormrod/src/gcc/_builds/39706bc97269366073d2eb4cf3ecf7872513627d --with-gmp=[local gmp-5.0.1] --with-mpfr=[local mpfr-3.0.0] --with-mpc=[local mpc-0.8.2] --disable-multilib --enable-languages=c,c++ --disable-libgcj --disable-bootstrap --disable-static Thread model: posix gcc version 4.9.0 20140331 (experimental) (GCC)
Created attachment 32504 [details] Repro header
Created attachment 32505 [details] Repro source
Patches go to the gcc-patches mailing list for review. You will also need a testcase formatted for inclusion on the testsuite and test your patch against the GCC testsuite. See also: http://gcc.gnu.org/contribute.html > In a function-like macro expansion, regular tokens are line-marked as being > expanded at the opening parenthesis, while builtins are line-marked as being > expanded at the closing parenthesis (both of these choices are reasonable, > though the inconsistency is probably a design flaw). When a non-builtin > follows a builtin in a multiline macro call, the line numbers of the tokens > are inconsistent, forcing a line directive to be inserted. I don't see how your patch fixes this inconsistency.
Thank you for the link, Manuel. This is my first contribution to gcc. I'll see if I can follow your instructions. As for the content of the patch, it fixes the second and more egregious error. In and of itself, token-line inconsistency is not the end of the world; the real problem is that the code that inserts the bad line directive has a bug. Specifically, it uses the macro's source file to determine the callsite file's properties. That is what the patch fixes. I do not have a patch for the line inconsistency, which is one of the reasons that I posted this to the bug forum.
*** Bug 61412 has been marked as a duplicate of this bug. ***
What a pleasant surprise to see your name, Matt. This dropped off my radar for a bit, but I'm pretty close now to getting a properly organized patch ready. Details to follow within 5 days.
Thanks Nicholas :)
Manuel, I have filed a patch to gcc-patches. https://gcc.gnu.org/ml/gcc-patches/2014-06/msg00862.html
(In reply to Nicholas from comment #9) > Manuel, I have filed a patch to gcc-patches. > https://gcc.gnu.org/ml/gcc-patches/2014-06/msg00862.html Great! unfortunately I don't have any power to approve patches. You'll have to wait until one of the CPP/C/C++ maintainers approves it. Perhaps you can ping the patch once per week (see the mailing list archives for examples of people pinging patches). Does it fix the problem reported by Matt in PR6142 also?
I am taking this one.
Created attachment 33010 [details] A patch candidate that I am currently testing This the patch I am running through bootstrap at the moment. The patch basically makes the pre-processor detects changes in system-ness of tokens and upon each of these change, emits a line marking directive so that the information about the system-ness change is not lost. I haven't commented the code yet. Nicholas, maybe you could test the patch and give me feedback until my (slow) machine completes the bootstrap? Thanks.
Author: dodji Date: Tue Jul 1 09:17:14 2014 New Revision: 212194 URL: https://gcc.gnu.org/viewcvs?rev=212194&root=gcc&view=rev Log: PR preprocessor/60723 - missing system-ness marks for macro tokens When a system macro is expanded in a non-system file during out-of-line preprocessing, it can happen that the preprocessor forgets to emit line markers to express the system-ness status of tokens that come after the expansion of the macro. That can lead to situations where the entire non-system file can be considered as being a system file and thus have its warnings be discarded during the compilation of the resulting preprocessed file. My understanding is that this is due to the preprocessor not systematically detecting (and reporting) the change in system-ness of tokens. And this is what this patch does. Each time the system-ness of a given token is different from the previous token that was emitted by the preprocessor, it emits a line marker for the sole purpose of marking the new system-ness of the subsequent tokens to come. Bootstrapped and tested on x86_64-unknown-linux-gnu against trunk. gcc/c-family/ChangeLog: * c-ppoutput.c (struct print::prev_was_system_token): New data member. (init_pp_output): Initialize it. (maybe_print_line_1, maybe_print_line, print_line_1, print_line) (do_line_change): Return a flag saying if a line marker was emitted or not. (scan_translation_unit): Detect if the system-ness of the token we are about to emit is different from the one of the previously emitted token. If so, emit a line marker. Avoid emitting useless adjacent line markers. (scan_translation_unit_directives_only): Adjust. gcc/testsuite/ChangeLog: * gcc.dg/cpp/syshdr{4,5}.{c,h}: New test files. Signed-off-by: Dodji Seketeli <dodji@redhat.com> Added: trunk/gcc/testsuite/gcc.dg/cpp/syshdr4.c trunk/gcc/testsuite/gcc.dg/cpp/syshdr4.h trunk/gcc/testsuite/gcc.dg/cpp/syshdr5.c trunk/gcc/testsuite/gcc.dg/cpp/syshdr5.h Modified: trunk/gcc/c-family/ChangeLog trunk/gcc/c-family/c-ppoutput.c trunk/gcc/testsuite/ChangeLog
This patch breaks the build of glibc on ARM and AArch64 targets. With GCC configured for target arm-none-linux-gnueabi The command line used: (echo '#define SYSCALL_NAME kill'; echo '#define SYSCALL_NARGS 2'; echo '#define SYSCALL_SYMBOL __kill'; echo '#include <syscall-template.S>'; echo 'weak_alias (__kill, kill)'; echo 'libc_hidden_weak (kill)'; ) | /work1/lyon/Work/ARM/Linaro/builds/gcc-fsf-trunk/tools/bin/arm-none-linux-gnueabi-gcc -c -include ../include/libc-symbols.h -DASSEMBLER -x assembler-with-cpp - -I../include -I/work1/lyon/Work/ARM/Linaro/builds/gcc-fsf-trunk/obj-arm-none-linux-gnueabi/glibc-1/signal -I/work1/lyon/Work/ARM/Linaro/builds/gcc-fsf-trunk/obj-arm-none-linux-gnueabi/glibc-1 -I../sysdeps/unix/sysv/linux/arm -I../nptl/sysdeps/unix/sysv/linux -I../nptl/sysdeps/pthread -I../sysdeps/pthread -I../sysdeps/unix/sysv/linux -I../sysdeps/gnu -I../sysdeps/unix/inet -I../nptl/sysdeps/unix/sysv -I../sysdeps/unix/sysv -I../sysdeps/unix/arm -I../nptl/sysdeps/unix -I../sysdeps/unix -I../sysdeps/posix -I../sysdeps/arm/armv7/multiarch -I../sysdeps/arm/armv7 -I../sysdeps/arm/armv6t2 -I../sysdeps/arm/armv6 -I../sysdeps/arm/nptl -I../sysdeps/arm/include -I../sysdeps/arm -I../sysdeps/arm/soft-fp -I../sysdeps/wordsize-32 -I../sysdeps/ieee754/flt-32 -I../sysdeps/ieee754/dbl-64 -I../sysdeps/ieee754 -I../sysdeps/generic -I../nptl -I.. -I../libio -I. ../sysdeps/unix/syscall-template.S: Assembler messages: ../sysdeps/unix/syscall-template.S:81: Error: missing ')' ../sysdeps/unix/syscall-template.S:81: Error: missing expression -- `ldr r7,=(' ../sysdeps/unix/syscall-template.S:81: Error: junk at end of line, first unrecognized character is `(' ../sysdeps/unix/syscall-template.S:81: Error: junk at end of line, first unrecognized character is `)' Attached the output of the above command with -E and -E -dD.
Created attachment 33039 [details] preprocessed input This preprocessed input (-E) shows how commit 212194 is broken.
Created attachment 33040 [details] preprocessed input This preprocessed input (-E -dD) shows how commit 212194 is broken.
Created attachment 33045 [details] Updated patch that avoid emitting new lines when it's prohibited
Thanks Christophe, The attached patch above should hopefully fix the issue it was causing. Is there a chance that you test that it doesn't break your arm build before I try to commit it again? Thanks in advance.
As discussed on IRC, I could test your patch, and the compiler now builds successfully.
> In a function-like macro expansion, regular tokens are line-marked as being > expanded at the opening parenthesis, while builtins are line-marked as being > expanded at the closing parenthesis (both of these choices are reasonable, > though the inconsistency is probably a design flaw). As noted in discussions we had on the mailing list at https://gcc.gnu.org/ml/gcc-patches/2014-07/msg01080.html, I think the fact that tokens resulting from the expansion of a built-in macro have their location set to the closing parenthesis of the invocation of the enclosing function-like macro is a separate issue that needs to be dealt with. I have thus filed PR61817 to track this. Thanks.
Author: dodji Date: Wed Jul 16 10:33:36 2014 New Revision: 212638 URL: https://gcc.gnu.org/viewcvs?rev=212638&root=gcc&view=rev Log: PR preprocessor/60723 - missing system-ness marks for macro tokens When a system macro is expanded in a non-system file during out-of-line preprocessing, it can happen that the preprocessor forgets to emit line markers to express the system-ness status of tokens that come after the expansion of the macro. That can lead to situations where the entire non-system file can be considered as being a system file and thus have its warnings be discarded during the compilation of the resulting preprocessed file. My understanding is that this is due to the preprocessor not systematically detecting (and reporting) the change in system-ness of tokens. And this is what this patch does. Each time the system-ness of a given token is different from the previous token that was emitted by the preprocessor, it emits a line marker for the sole purpose of marking the new system-ness of the subsequent tokens to come. Bootstrapped and tested on x86_64-unknown-linux-gnu against trunk. gcc/c-family/ChangeLog: * c-ppoutput.c (struct print::prev_was_system_token): New data member. (init_pp_output): Initialize it. (maybe_print_line_1, maybe_print_line, print_line_1, print_line) (do_line_change): Return a flag saying if a line marker was emitted or not. (scan_translation_unit): Detect if the system-ness of the token we are about to emit is different from the one of the previously emitted token. If so, emit a line marker. Avoid emitting useless adjacent line markers. Avoid emitting line markers for tokens originating from the expansion of built-in macros. (scan_translation_unit_directives_only): Adjust. gcc/testsuite/ChangeLog: * gcc.dg/cpp/syshdr{4,5}.{c,h}: New test files. Signed-off-by: Dodji Seketeli <dodji@redhat.com> Signed-off-by: Dodji Seketeli <dodji@redhat.com> Added: trunk/gcc/testsuite/gcc.dg/cpp/syshdr4.c trunk/gcc/testsuite/gcc.dg/cpp/syshdr4.h trunk/gcc/testsuite/gcc.dg/cpp/syshdr5.c trunk/gcc/testsuite/gcc.dg/cpp/syshdr5.h Modified: trunk/gcc/c-family/ChangeLog trunk/gcc/c-family/c-ppoutput.c trunk/gcc/testsuite/ChangeLog
So finally the two patches that have been proposed at https://gcc.gnu.org/ml/gcc-patches/2014-07/msg01063.html and https://gcc.gnu.org/ml/gcc-patches/2014-07/msg01064.html have been accepted and committed to trunk. I'll wait before closing this bug that the stakeholders following this test the tree with the new patches. In the mean time, I'll look at the additional issue PR61817. Thanks.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61817 The more I talk, the more work you seem to have. :p Thank you for taking this. Cheers, Nicholas
(In reply to Dodji Seketeli from comment #22) > So finally the two patches that have been proposed at > https://gcc.gnu.org/ml/gcc-patches/2014-07/msg01063.html and > https://gcc.gnu.org/ml/gcc-patches/2014-07/msg01064.html have been accepted > and committed to trunk. > > I'll wait before closing this bug that the stakeholders following this test > the tree with the new patches. This breaks building ncurses. I've filed PR 61832 for this, but now trying to reduce a testcase. > > In the mean time, I'll look at the additional issue PR61817. > > Thanks.
(In reply to ktkachov from comment #24) > (In reply to Dodji Seketeli from comment #22) > > So finally the two patches that have been proposed at > > https://gcc.gnu.org/ml/gcc-patches/2014-07/msg01063.html and > > https://gcc.gnu.org/ml/gcc-patches/2014-07/msg01064.html have been accepted > > and committed to trunk. > > > > I'll wait before closing this bug that the stakeholders following this test > > the tree with the new patches. > > This breaks building ncurses. > > I've filed PR 61832 for this, but now trying to reduce a testcase. Turns out that was a bug in ncurses, so disregard this
*** Bug 61435 has been marked as a duplicate of this bug. ***
There are various regressions caused by this in various packages, the question is whether we just tell people to preprocess with -P if they do weird thing on the preprocessed output, or whether this fix was worth the trouble. Affected fedora packages include at least: cclive cvc4 dans-gdal-scripts emacs ember flamerobin ghc-gtk ghc-webkit gnote ksh libason libcmis libgpg libixion liborcus ncurses openldap perl xorg-x11-server xorg-x11-server-utils zsh
If this is about ccache, I wonder if ccache couldn't preprocess with -fdirectives-only instead, that way hopefully no info is lost.
I posted this question on gcc@gcc.gnu.org and got not response so I thought I would ask here on the bug report. I am trying to understand the status of this bug and the patch that fixes it. It looks like a patch was submitted and checked in for 5.0 to fix the problem reported and I see the new behavior caused by the patch in GCC 5.X compilers. This behavior caused a number of issues with configures and scripts that examined preprocessed output as is mentioned in this bug report. There was a later bug, 64864, complaining about the behavior and that was closed as invalid. But when I look at GCC 6.X or ToT (7.X) compilers I do not see the same behavior as 5.X. Was this patch reverted or was a new patch submitted that undid some of this patches behavior? I couldn't find any revert or new patch to replace the original one so I am not sure when or why the code changed back after the 5.X releases. Here is a test case that I am preprocessing with g++ -E: #include <stdbool.h> class foo { void operator= ( bool bit); operator bool() const; }; GCC 5.4 breaks up the operator delcarations with line markers and GCC 6.2 does not.
(In reply to Steve Ellcey from comment #29) > I posted this question on gcc@gcc.gnu.org and got not response so I thought > I would ask here on the bug report. > > I am trying to understand the status of this bug and the patch > that fixes it. It looks like a patch was submitted and checked > in for 5.0 to fix the problem reported and I see the new > behavior caused by the patch in GCC 5.X compilers. This behavior > caused a number of issues with configures and scripts that examined > preprocessed output as is mentioned in this bug report. > There was a later bug, 64864, complaining about the behavior and > that was closed as invalid. > > But when I look at GCC 6.X or ToT (7.X) compilers I do not see the same > behavior as 5.X. Was this patch reverted or was a new patch submitted > that undid some of this patches behavior? I couldn't find any revert or > new patch to replace the original one so I am not sure when or why > the code changed back after the 5.X releases. > > Here is a test case that I am preprocessing with g++ -E: > > #include <stdbool.h> > class foo { > void operator= ( bool bit); > operator bool() const; > }; > > GCC 5.4 breaks up the operator delcarations with line markers and GCC 6.2 > does not. This is the right link for where it is in the archives, right? https://gcc.gnu.org/ml/gcc/2016-11/msg00116.html Probably got lost in the change-of-the-month rollover...
Redoing some CCs that got silently removed without showing up in the bug history (presumably due to the server migration)
The master branch has been updated by Lewis Hyatt <lhyatt@gcc.gnu.org>: https://gcc.gnu.org/g:ddb7f0a0cac48762ba6408d69538f8115c4a2739 commit r13-3264-gddb7f0a0cac48762ba6408d69538f8115c4a2739 Author: Lewis Hyatt <lhyatt@gmail.com> Date: Thu Oct 6 18:05:02 2022 -0400 preprocessor: Fix tracking of system header state [PR60014,PR60723] The token_streamer class (which implements gcc mode -E and -save-temps/-no-integrated-cpp) needs to keep track whether the last tokens output were in a system header, so that it can generate line marker annotations as necessary for a downstream consumer to reconstruct the state. The logic for tracking it, which was added by r5-1863 to resolve PR60723, has some edge case issues as revealed by the three new test cases. The first, coming from the original PR60014, was incidentally fixed by r9-1926 for unrelated reasons. The other two were still failing on master prior to this commit. Such code paths were not realizable prior to r13-1544, which made it possible for the token streamer to see CPP_PRAGMA tokens in more contexts. The two main issues being corrected here are: 1) print.prev_was_system_token needs to indicate whether the previous token output was in a system location. However, it was not being set on every token, only on those that triggered the main code path; specifically it was not triggered on a CPP_PRAGMA token. Testcase 2 covers this case. 2) The token_streamer uses a variable "line_marker_emitted" to remember whether a line marker has been emitted while processing a given token, so that it wouldn't be done more than once in case multiple conditions requiring a line marker are true. There was no reason for this to be a member variable that retains its value from token to token, since it is just needed for tracking the state locally while processing a single given token. The fact that it could retain its value for a subsequent token is rather difficult to observe, but testcase 3 demonstrates incorrect behavior resulting from that. Moving this to a local variable also simplifies understanding the control flow going forward. gcc/c-family/ChangeLog: PR preprocessor/60014 PR preprocessor/60723 * c-ppoutput.cc (class token_streamer): Remove member line_marker_emitted to... (token_streamer::stream): ...a local variable here. Set print.prev_was_system_token on all code paths. gcc/testsuite/ChangeLog: PR preprocessor/60014 PR preprocessor/60723 * gcc.dg/cpp/pr60014-1.c: New test. * gcc.dg/cpp/pr60014-1.h: New test. * gcc.dg/cpp/pr60014-2.c: New test. * gcc.dg/cpp/pr60014-2.h: New test. * gcc.dg/cpp/pr60014-3.c: New test. * gcc.dg/cpp/pr60014-3.h: New test.