Created attachment 33219
patch to revert 210055 and 211656

The powerpc-eabi (and powerpc-eabispe -mno-spe; I haven't tried other powerpc-*) preprocessor misbehaves when a line ends with "vector". I happened to hit this in assembler comments, but that does not appear to be a requirement.

Examples of lines which cause the failure:
  vector
  # vector
  # x vector
  x # vector
  ; vector
  # .vector
  # +vector

Examples of lines which do not cause the failure:
  # vectors
  # vector x
  # vector.
  # _vector
  // vector (unless -C is given)
  vector;

The symptom depends on whether that is the last non-whitespace line in the file. If it is, the result is an ICE. If it is not, "vector" appears by itself on the next line.

This did not happen in GCC 4.9.0, but does happen in GCC 4.9.1 and the current trunk. Bisecting between 4.9.0 and 4.9.1 points to SVN revision 210055 as introducing this behavior. Reverting r210055 (and also r211656, which seemed to be dependent upon r210055) appears to fix the issue, and a patch (against 4.9.1) doing that is attached.

----
$ prefix/bin/powerpc-eabi-gcc -v
Using built-in specs.
COLLECT_GCC=prefix/bin/powerpc-eabi-gcc
COLLECT_LTO_WRAPPER=/home/drivshin/gcc-powerpc-eabi/prefix/bin/../libexec/gcc/powerpc-eabi/4.9.1/lto-wrapper
Target: powerpc-eabi
Configured with: ../gcc-4.9.1/configure -v --prefix=/home/drivshin/gcc-powerpc-eabi/build-gcc/../prefix --target=powerpc-eabi --enable-languages=c
Thread model: single
gcc version 4.9.1 (GCC)
----
$ echo -e "# comment ending in vector\n# another comment" | prefix/bin/powerpc-eabi-cpp -x assembler-with-cpp
# 1 "<stdin>"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "<stdin>"
# comment ending in
vector
# 2 "<stdin>"
# another comment
----
$ echo -e "# comment ending in vector" | prefix/bin/powerpc-eabi-cpp -x assembler-with-cpp
# 1 "<stdin>"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "<stdin>"
<stdin>:1:0: internal compiler error: Segmentation fault
0x82f7bf crash_signal
	../../gcc-4.9.1/gcc/toplev.c:337
0xc37b75 _cpp_lex_direct
	../../gcc-4.9.1/libcpp/lex.c:2171
0xc389fb _cpp_lex_token
	../../gcc-4.9.1/libcpp/lex.c:2055
0xc3d117 cpp_get_token_1
	../../gcc-4.9.1/libcpp/macro.c:2359
0x559877 scan_translation_unit
	../../gcc-4.9.1/gcc/c-family/c-ppoutput.c:176
0x559877 preprocess_file(cpp_reader*)
	../../gcc-4.9.1/gcc/c-family/c-ppoutput.c:101
0x558360 c_common_init()
	../../gcc-4.9.1/gcc/c-family/c-opts.c:1047
0x4ff08d c_objc_common_init()
	../../gcc-4.9.1/gcc/c/c-objc-common.c:65
0x8312c6 lang_dependent_init
	../../gcc-4.9.1/gcc/toplev.c:1712
0x8312c6 do_compile
	../../gcc-4.9.1/gcc/toplev.c:1900
This is still happening in the latest trunk and latest 4.9 branch code. Simplified steps to reproduce:

  ../gcc.svn/configure --prefix=${PWD}/../local --enable-languages=c --with-gnu-as --with-gnu-ld --disable-libstdcxx-pch --target=powerpc-eabi --disable-shared --with-newlib
  make all-gcc
  make install-gcc
  echo -e "# comment ending in vector" | ../local/bin/powerpc-eabi-cpp -x assembler-with-cpp

I'm fairly certain this has the same root cause as bug 51654, and changeset r210055 just exposed some non-altivec powerpc targets to it. In addition to the workarounds mentioned there (bug 51654, comment 3), removing the call to init_vector_keywords() in rs6000_cpu_cpp_builtins() also works.

Since those vector keywords only have an effect if TARGET_ALTIVEC is set (see rs6000_macro_to_expand()), making their definition conditional upon TARGET_ALTIVEC resolves the 4.9.1 regression (as best I can tell). That obviously does not resolve the underlying issue, though, which has existed since at least 4.6 (according to bug 51654).
All powerpc64*-*-*-* targets appear to be affected.
See also PR 65638 for a similar problem.
*** Bug 65638 has been marked as a duplicate of this bug. ***
The problem is that a cpp_peek_token call which returns CPP_EOF is fatal to preprocessing:

  do
    {
      peektok = _cpp_lex_token (pfile);
      if (peektok->type == CPP_EOF)
        return peektok;
    }
  while (index--);

but the macro_to_expand stuff (for which cpp_peek_token was written, BTW) really assumes that it can non-destructively peek tokens when needed.
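To illustrate why the early return matters, here is a minimal toy model, not libcpp code. The names toy_reader, toy_lex, toy_peek_buggy and toy_peek_fixed are hypothetical, and the reader is reduced to an array index. It assumes only what the loop above shows: lexing is destructive, and the EOF path returns without undoing it. The second variant follows the approach described in the r221882 log entry in this report (back up all tokens peeked by the current call, even when EOF is hit).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token kinds; TOY_EOF stands in for CPP_EOF. */
enum toy_type { TOY_WORD, TOY_EOF };

/* Hypothetical reader: a token array plus a read position. */
struct toy_reader {
    const enum toy_type *tokens; /* stream, terminated by TOY_EOF */
    size_t count;                /* number of entries in tokens */
    size_t next;                 /* index of the next token to lex */
};

/* Lex one token destructively: the read position always advances. */
enum toy_type toy_lex(struct toy_reader *r)
{
    assert(r->next < r->count); /* reading past EOF is the crash case */
    return r->tokens[r->next++];
}

/* Shaped like the quoted loop: on EOF it returns immediately, so the
   tokens lexed during the peek are never pushed back. */
enum toy_type toy_peek_buggy(struct toy_reader *r, int index)
{
    size_t start = r->next;
    enum toy_type t;
    do {
        t = toy_lex(r);
        if (t == TOY_EOF)
            return t;           /* read position is left advanced */
    } while (index--);
    r->next = start;
    return t;
}

/* Non-destructive variant: every exit path restores the position. */
enum toy_type toy_peek_fixed(struct toy_reader *r, int index)
{
    size_t start = r->next;
    enum toy_type t;
    do {
        t = toy_lex(r);
        if (t == TOY_EOF)
            break;              /* fall through to the backup below */
    } while (index--);
    r->next = start;            /* back up all tokens peeked here */
    return t;
}
```

In this model, peeking at EOF with toy_peek_buggy leaves the reader positioned past the end of the stream, so the next toy_lex call trips the bounds assertion. That is only an analogy for the segmentation fault in _cpp_lex_direct reported above, not a trace of the real failure.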
Obviously a regression from the time when vector wasn't a conditional macro.
Created attachment 35194
gcc5-pr61977.patch

Untested fix.
I briefly tested the patch, and it does fix the ICE in the case where the conditional macro is the last token. However, it does not fix the situation where there are more (non-blank) lines after the conditional macro:

$ /bin/echo -e "; comment ending in vector\n; another comment" | ../local/bin/powerpc-eabi-cpp -x assembler-with-cpp
# 1 "<stdin>"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "<stdin>"
; comment ending in
vector
# 2 "<stdin>"
; another comment

A newline gets inserted in the stream before the conditional macro token. In many cases that is harmless, but in some (like the single-line assembler comment above) it turns valid code into invalid code.
I think the extra newline is the result of maybe_print_line() being invoked when trying to peek past a newline in the input.

#0  maybe_print_line_1 (src_loc=134, stream=0x361e3b8800 <_IO_2_1_stdout_>) at c-ppoutput.c:352
#1  maybe_print_line (src_loc=134) at c-ppoutput.c:385
#2  do_line_change (pfile=0x1ef2910, token=0x1f1dd68, src_loc=134, parsing_args=0) at c-ppoutput.c:463
#3  cb_line_change (pfile=0x1ef2910, token=0x1f1dd68, parsing_args=0) at c-ppoutput.c:490
#4  _cpp_lex_token (pfile=0x1ef2910) at lex.c:2192
#5  cpp_peek_token (pfile=0x1ef2910, index=0) at lex.c:2085
#6  cpp_get_token_1 (pfile=0x1ef2910, location=0x7fffffffd8fc) at macro.c:2501

If I'm understanding the logic correctly, when _cpp_lex_direct() sees the newline, the processing of the line is considered complete, and therefore that line of output is considered complete too. But because of the conditional macro that's not entirely true, and at that point the output only has the line up to (but not including) the conditional macro token itself.
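The effect can be sketched with another toy model, again not libcpp code. The names nl_reader, nl_lex, nl_count, nl_peek_naive and nl_peek_quiet are hypothetical, and tokens are single characters. It assumes only what the backtrace suggests: the lexer fires a line-change callback when it crosses a newline, even when called from a peek. The quiet variant suppresses the callback for the duration of the peek, which is the idea described in the r221839 log entry in this report ("Temporarily clear pfile->cb.line_change").

```c
#include <assert.h>
#include <stddef.h>

struct nl_reader;
typedef void (*nl_callback)(struct nl_reader *);

/* Hypothetical reader: one character per token, plus a callback
   invoked whenever the lexer crosses a newline. */
struct nl_reader {
    const char *text;        /* NUL-terminated input */
    size_t pos;              /* index of the next character to lex */
    nl_callback line_change; /* may be NULL */
    int line_changes;        /* how often the callback has fired */
};

/* Example callback: stands in for printing a newline to the output. */
void nl_count(struct nl_reader *r)
{
    r->line_changes++;
}

/* Return the next non-newline character ('\0' at end of input),
   firing line_change for each newline that is crossed. */
char nl_lex(struct nl_reader *r)
{
    while (r->text[r->pos] == '\n') {
        r->pos++;
        if (r->line_change)
            r->line_change(r);
    }
    char c = r->text[r->pos];
    if (c != '\0')
        r->pos++;
    return c;
}

/* Restores the position, but the callback side effect has already
   happened: output is emitted for a token that was only peeked. */
char nl_peek_naive(struct nl_reader *r)
{
    size_t saved_pos = r->pos;
    char c = nl_lex(r);
    r->pos = saved_pos;
    return c;
}

/* Clears the callback while peeking, then restores it, so the line
   change is reported only when the token is really consumed. */
char nl_peek_quiet(struct nl_reader *r)
{
    size_t saved_pos = r->pos;
    nl_callback saved_cb = r->line_change;
    r->line_change = NULL;
    char c = nl_lex(r);
    r->line_change = saved_cb;
    r->pos = saved_pos;
    return c;
}
```

With the input "a\nb" and the reader positioned just after 'a', nl_peek_naive fires the callback once even though nothing was consumed, while nl_peek_quiet leaves line_changes untouched until a later nl_lex call actually crosses the newline.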
Please see:
http://gcc.gnu.org/ml/gcc-patches/2015-04/msg00004.html
http://gcc.gnu.org/ml/gcc-patches/2015-04/msg00005.html
http://gcc.gnu.org/ml/gcc-patches/2015-04/msg00013.html
Author: jakub
Date: Thu Apr 2 11:54:58 2015
New Revision: 221838

URL: https://gcc.gnu.org/viewcvs?rev=221838&root=gcc&view=rev
Log:
	PR preprocessor/61977
	* config/rs6000/rs6000-c.c (rs6000_cpu_cpp_builtins): Don't
	predefine __vector/__bool/__pixel macros nor context sensitive
	macros for CLK_ASM.
	* config/spu/spu-c.c (spu_cpu_cpp_builtins): Similarly.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/rs6000/rs6000-c.c
    trunk/gcc/config/spu/spu-c.c
Author: jakub
Date: Thu Apr 2 11:57:02 2015
New Revision: 221839

URL: https://gcc.gnu.org/viewcvs?rev=221839&root=gcc&view=rev
Log:
	PR preprocessor/61977
	* lex.c (cpp_peek_token): Temporarily clear pfile->cb.line_change.

	* gcc.target/powerpc/pr61977-1.c: New test.
	* gcc.target/powerpc/pr61977-2.c: New test.

Added:
    trunk/gcc/testsuite/gcc.target/powerpc/pr61977-1.c
    trunk/gcc/testsuite/gcc.target/powerpc/pr61977-2.c
Modified:
    trunk/gcc/testsuite/ChangeLog
    trunk/libcpp/ChangeLog
    trunk/libcpp/lex.c
Two of the three patches have been applied to trunk; still waiting for review of the first patch.
Author: jakub
Date: Mon Apr 6 17:01:50 2015
New Revision: 221882

URL: https://gcc.gnu.org/viewcvs?rev=221882&root=gcc&view=rev
Log:
	PR preprocessor/61977
	* lex.c (cpp_peek_token): If peektok is CPP_EOF, back it up
	with all tokens peeked by the current function.

	* gcc.dg/cpp/pr61977.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/cpp/pr61977.c
Modified:
    trunk/gcc/testsuite/ChangeLog
    trunk/libcpp/ChangeLog
    trunk/libcpp/lex.c
Fixed.