Bug 60436 - [4.8/4.9/5 Regression] C preprocessor segfaults on assembly file
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: preprocessor
Version: 4.8.2
Importance: P2 normal
Target Milestone: 4.8.4
Assignee: Jakub Jelinek
Reported: 2014-03-06 02:05 UTC by Dan Doel
Modified: 2014-11-28 17:11 UTC
CC List: 4 users

Known to work: 4.7.3
Known to fail: 4.8.3, 4.9.0
Last reconfirmed: 2014-03-06 00:00:00


Attachments
gcc5-pr60436.patch (562 bytes, patch)
2014-11-24 12:29 UTC, Jakub Jelinek

Description Dan Doel 2014-03-06 02:05:03 UTC
Greetings,

While testing the latest GHC, I came across a segfault in GCC. I've tracked it down to the preprocessor segfaulting on an intermediate assembly file produced during compilation. I've also reproduced the segfault using the stock 4.8.2 release, although I first encountered it on the 2014-02-06 snapshot which is currently distributed on Arch.

The assembly file is very large, but I have no smaller test case. I cannot attach it here, because xz -9e only compresses it to ~2MiB. However, you can find it at the Arch bug report I'll link below. Various bits about the reproduction of the segfault are also quite odd:

  The assembly file contains no macros or includes.

  A macro must be defined to produce the segfault. The one GHC uses is
    -DTABLES_NEXT_TO_CODE. One can also get the segfault by putting
    '#define TABLES_NEXT_TO_CODE' at the top of the file.

  The name TABLES_NEXT_TO_CODE itself is irrelevant, though. Any macro name
  of 13 or more letters triggers the segfault, while 12 or fewer seems to
  work fine.

  Adding a blank line anywhere before line 395,848 stops the segfault, but
  adding blank lines after that line has no effect.

  Truncating the file before line 4,400,769 stops the segfault, but truncating
  it after that line has no effect.

Here's the crash output of running `cpp -v -DABCDEFGHIJKLM` on the file:

----

Using built-in specs.
COLLECT_GCC=cpp
Target: x86_64-unknown-linux-gnu
Configured with: ./configure
Thread model: posix
gcc version 4.8.2 (GCC) 
COLLECT_GCC_OPTIONS='-E' '-v' '-D' 'ABCDEFGHIJKLM' '-o' 'ghc3240_8.pps' '-mtune=generic' '-march=x86-64'
 /usr/local/libexec/gcc/x86_64-unknown-linux-gnu/4.8.2/cc1 -E -lang-asm -quiet -v -D ABCDEFGHIJKLM ghc3240_8.s -o ghc3240_8.pps -mtune=generic -march=x86-64 -fno-directives-only
ignoring nonexistent directory "/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.8.2/../../../../x86_64-unknown-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.8.2/include
 /usr/local/include
 /usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.8.2/include-fixed
 /usr/include
End of search list.
ghc3240_8.s:1:0: internal compiler error: Segmentation fault
 .section .rodata
 ^
0x80cdcf crash_signal
	../.././gcc/toplev.c:332
0xc4916e get_data_from_adhoc_loc(line_maps*, unsigned int)
	../.././libcpp/line-map.c:156
0xc371db expand_location_1
	../.././gcc/input.c:57
0xc3740d expand_location(unsigned int)
	../.././gcc/input.c:150
0x578c1d scan_translation_unit
	../.././gcc/c-family/c-ppoutput.c:214
0x578c1d preprocess_file(cpp_reader*)
	../.././gcc/c-family/c-ppoutput.c:101
0x5774d0 c_common_init()
	../.././gcc/c-family/c-opts.c:1026
0x52e59d c_objc_common_init()
	../.././gcc/c/c-objc-common.c:63
0x80e7c6 lang_dependent_init
	../.././gcc/toplev.c:1688
0x80e7c6 do_compile
	../.././gcc/toplev.c:1850


----

Here is a link to the Arch bug I've filed. You can find the offending file (compressed) there:

    https://bugs.archlinux.org/task/39180

If any further information is desired, let me know. Thanks.
Comment 1 Markus Trippelsdorf 2014-03-06 08:30:35 UTC
Confirmed. Both trunk and 4.8.3 segfault. 4.7.3 is fine.

/usr/libexec/gcc/x86_64-pc-linux-gnu/4.8.3/cc1 -o /dev/null -E -lang-asm -quiet -v -D ABCDEFGHIJKLM ghc3240_8.s

#0  0x0000000000c7136e in get_data_from_adhoc_loc(line_maps*, unsigned int) ()
#1  0x0000000000c67fe8 in expand_location(unsigned int) ()
#2  0x00000000004cc7b3 in preprocess_file(cpp_reader*) ()
#3  0x0000000000c806df in c_common_init() ()
#4  0x0000000000c7c90c in c_objc_common_init() ()
#5  0x0000000000cc12fb in toplev_main(int, char**) ()
#6  0x00007ffff7756fb0 in __libc_start_main () from /lib/libc.so.6
#7  0x0000000000c7bb1a in _start ()

/usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.0/cc1 -o /dev/null -E -lang-asm -quiet -v -D ABCDEFGHIJKLM ghc3240_8.s

#0  0x0000000000aeeaac in expand_location_1(unsigned int, bool) [clone .lto_priv.2583] ()
#1  0x0000000000b60b10 in preprocess_file(cpp_reader*) ()
#2  0x0000000000b53316 in c_common_init() ()
#3  0x0000000000b1190b in c_objc_common_init() ()
#4  0x0000000000aeb141 in toplev_main(int, char**) ()
#5  0x00007ffff7756fb0 in __libc_start_main () from /lib/libc.so.6
#6  0x0000000000ae56e9 in _start ()

Valgrind shows:

==28570== Invalid read of size 4
==28570==    at 0xAEEAAC: expand_location_1(unsigned int, bool) [clone .lto_priv.2583] (in /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.0/cc1)
==28570==    by 0xB60B0F: preprocess_file(cpp_reader*) (in /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.0/cc1)
==28570==    by 0xB53315: c_common_init() (in /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.0/cc1)
==28570==    by 0xB1190A: c_objc_common_init() (in /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.0/cc1)
==28570==    by 0xAEB140: toplev_main(int, char**) (in /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.0/cc1)
==28570==    by 0x4D70FAF: (below main) (in /lib64/libc-2.19.so)
==28570==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==28570== 
==28570== Invalid read of size 4
==28570==    at 0xAEEAAC: expand_location_1(unsigned int, bool) [clone .lto_priv.2583] (in /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.0/cc1)
==28570==    by 0xB6DB50: location_get_source_line(expanded_location, int*) (in /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.0/cc1)
==28570==    by 0xAEF365: diagnostic_show_locus(diagnostic_context*, diagnostic_info const*) (in /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.0/cc1)
==28570==    by 0xAEC294: diagnostic_report_diagnostic(diagnostic_context*, diagnostic_info*) (in /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.0/cc1)
==28570==    by 0x4DC59B: internal_error(char const*, ...) (in /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.0/cc1)
==28570==    by 0xA377DB: crash_signal(int) [clone .lto_priv.1176] (in /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.0/cc1)
==28570==    by 0x4D8508F: ??? (in /lib64/libc-2.19.so)
==28570==    by 0xAEEAAB: expand_location_1(unsigned int, bool) [clone .lto_priv.2583] (in /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.0/cc1)
==28570==    by 0xB60B0F: preprocess_file(cpp_reader*) (in /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.0/cc1)
==28570==    by 0xB53315: c_common_init() (in /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.0/cc1)
==28570==    by 0xB1190A: c_objc_common_init() (in /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.0/cc1)
==28570==    by 0xAEB140: toplev_main(int, char**) (in /usr/libexec/gcc/x86_64-pc-linux-gnu/4.9.0/cc1)
==28570==  Address 0x610 is not stack'd, malloc'd or (recently) free'd

Could be related to PR58893.
Comment 2 Markus Trippelsdorf 2014-03-06 09:20:40 UTC
ASAN:SIGSEGV
=================================================================
==20669==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000018 (pc 0x0000023044cf sp 0x7fff99677740 bp 0x000080000001 T0)
    #0 0x23044ce in get_data_from_adhoc_loc(line_maps*, unsigned int) ../../gcc/libcpp/line-map.c:156
    #1 0x22c9240 in expand_location_1 ../../gcc/gcc/input.c:142
    #2 0x22cb68d in expand_location(unsigned int) ../../gcc/gcc/input.c:724
    #3 0x7a3cd9 in scan_translation_unit ../../gcc/gcc/c-family/c-ppoutput.c:214
    #4 0x7a3cd9 in preprocess_file(cpp_reader*) ../../gcc/gcc/c-family/c-ppoutput.c:101
    #5 0x7a05df in c_common_init() ../../gcc/gcc/c-family/c-opts.c:1040
    #6 0x68bccd in c_objc_common_init() ../../gcc/gcc/c/c-objc-common.c:65
    #7 0x1251227 in lang_dependent_init ../../gcc/gcc/toplev.c:1712
    #8 0x1251227 in do_compile ../../gcc/gcc/toplev.c:1900
    #9 0x1251227 in toplev_main(int, char**) ../../gcc/gcc/toplev.c:1990
    #10 0x7f051b6cafaf in __libc_start_main (/lib/libc.so.6+0x1ffaf)
    #11 0x5beb60 (/var/tmp/gcc_sani/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/4.9.0/cc1+0x5beb60)

AddressSanitizer can not provide additional info.
Comment 3 Jakub Jelinek 2014-03-06 09:24:00 UTC
Strange, I can't reproduce this myself using the same options: not with a bootstrapped cc1, not with a non-bootstrapped -O0 build, and not with a bootstrapped cc1 under valgrind, on either 4.8 (older and latest) or trunk.
I'd have thought this would be r200376, but that is fixed on the trunk and now (since today) on the 4.8 branch too.
Comment 4 Markus Trippelsdorf 2014-03-06 10:29:18 UTC
Probably started with r191494.
Comment 5 Markus Trippelsdorf 2014-03-06 10:40:13 UTC
markus@x4 tmp % gdb --args /var/tmp/gcc_test/usr/local/bin/g++ -DTABLES_NEXT_TO_CODE -x assembler-with-cpp -c ghc3240_8.s
Reading symbols from /var/tmp/gcc_test/usr/local/bin/g++...done.
(gdb) run
Starting program: /var/tmp/gcc_test/usr/local/bin/g++ -DTABLES_NEXT_TO_CODE -x assembler-with-cpp -c ghc3240_8.s
[New process 7553]
process 7553 is executing new program: /var/tmp/gcc_test/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/4.9.0/cc1

Program received signal SIGABRT, Aborted.
[Switching to process 7553]
0x00007ffff7604ff4 in raise () from /lib/libc.so.6
(gdb) bt
#0  0x00007ffff7604ff4 in raise () from /lib/libc.so.6
#1  0x00007ffff76063e7 in abort () from /lib/libc.so.6
#2  0x0000000000f25198 in linemap_location_from_macro_expansion_p (set=<optimized out>, location=<optimized out>, location@entry=2147483542)
    at ../../gcc/libcpp/line-map.c:948
#3  0x0000000000f252ff in linemap_lookup (set=set@entry=0x7ffff7ff8000, line=line@entry=2147483542) at ../../gcc/libcpp/line-map.c:642
#4  0x0000000000f253bc in linemap_macro_loc_to_exp_point (set=0x7ffff7ff8000, location=2147483542, original_map=original_map@entry=0x7fffffffdfb8)
    at ../../gcc/libcpp/line-map.c:1181
#5  0x0000000000f25611 in linemap_resolve_location (set=<optimized out>, loc=<optimized out>, loc@entry=2147483542, lrk=<optimized out>, map=map@entry=0x7fffffffdfb8)
    at ../../gcc/libcpp/line-map.c:1262
#6  0x0000000000f0e3ae in expand_location_1 (loc=loc@entry=2147483542, expansion_point_p=expansion_point_p@entry=true) at ../../gcc/gcc/input.c:164
#7  0x0000000000f0f08e in expand_location (loc=loc@entry=2147483542) at ../../gcc/gcc/input.c:724
#8  0x00000000005ec236 in maybe_print_line_1 (stream=0x15e2860, src_loc=2147483542) at ../../gcc/gcc/c-family/c-ppoutput.c:314
#9  maybe_print_line (src_loc=src_loc@entry=2147483542) at ../../gcc/gcc/c-family/c-ppoutput.c:351
#10 0x00000000005ec7fb in do_line_change (pfile=0x15d8cd0, token=0x15d9210, src_loc=2147483542, parsing_args=0) at ../../gcc/gcc/c-family/c-ppoutput.c:420
#11 0x0000000000f240b4 in _cpp_lex_token (pfile=0x15d8cd0) at ../../gcc/libcpp/lex.c:2078
#12 0x0000000000f28d10 in cpp_get_token_1 (pfile=0x15d8cd0, location=0x1d81, location@entry=0x7fffffffe134) at ../../gcc/libcpp/macro.c:2359
#13 0x0000000000f28f75 in cpp_get_token_with_location (pfile=pfile@entry=0x15d8cd0, loc=loc@entry=0x7fffffffe134) at ../../gcc/libcpp/macro.c:2541
#14 0x00000000005ec9d8 in scan_translation_unit (pfile=0x15d8cd0) at ../../gcc/gcc/c-family/c-ppoutput.c:176
#15 preprocess_file (pfile=0x15d8cd0) at ../../gcc/gcc/c-family/c-ppoutput.c:101
#16 0x00000000005eb3e9 in c_common_init () at ../../gcc/gcc/c-family/c-opts.c:1040
#17 0x000000000057dd7e in c_objc_common_init () at ../../gcc/gcc/c/c-objc-common.c:65
#18 0x000000000099f477 in lang_dependent_init (name=0x7fffffffe73b "ghc3240_8.s") at ../../gcc/gcc/toplev.c:1712
#19 do_compile () at ../../gcc/gcc/toplev.c:1900
#20 toplev_main (argc=14, argv=0x7fffffffe2c8) at ../../gcc/gcc/toplev.c:1990
#21 0x00007ffff75f0fb0 in __libc_start_main () from /lib/libc.so.6
#22 0x00000000005306a1 in _start ()
(gdb)

The value location@entry=2147483542 (0x7FFFFF96) is very close to the limit "#define MAX_SOURCE_LOCATION 0x7FFFFFFF".
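
To make the arithmetic above concrete, here is a minimal C sketch that computes how far a location value is from the limit. Only the MAX_SOURCE_LOCATION value and the location 2147483542 come from this report; the helper name `headroom` is hypothetical and is not a libcpp function.

```c
#include <stdint.h>

/* Upper bound quoted above; the value is copied from the comment.  */
#define MAX_SOURCE_LOCATION 0x7FFFFFFFu

/* Hypothetical helper (not part of libcpp): number of location values
   left between LOC and MAX_SOURCE_LOCATION.  */
uint32_t
headroom (uint32_t loc)
{
  return MAX_SOURCE_LOCATION - loc;
}
```

Calling `headroom (2147483542u)` gives 105, so the location seen in the backtrace is only 105 values short of the limit.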
Comment 6 Markus Trippelsdorf 2014-03-06 14:09:41 UTC
(In reply to Markus Trippelsdorf from comment #4)
> Probably started with r191494.

No, sorry.
This issue started with r192715.
Comment 7 Markus Trippelsdorf 2014-03-06 14:31:32 UTC
Adding "-nostdinc" is a simple workaround.
Comment 8 joseph@codesourcery.com 2014-03-06 18:09:41 UTC
Implicit preincludes should probably be disabled when preprocessing .S 
files (though I don't know if that would help with this issue).
Comment 9 Jakub Jelinek 2014-11-24 12:29:23 UTC
Created attachment 34089
gcc5-pr60436.patch

Untested fix.  When add_map is true we already have code to handle this.  But if add_map is false too many times, we can still overflow.  This patch forces add_map to true if highest is too high.
Comment 10 Jakub Jelinek 2014-11-25 11:16:59 UTC
Author: jakub
Date: Tue Nov 25 11:16:27 2014
New Revision: 218042

URL: https://gcc.gnu.org/viewcvs?rev=218042&root=gcc&view=rev
Log:
	PR preprocessor/60436
	* line-map.c (linemap_line_start): If highest is above 0x60000000
	and we are still tracking columns or highest is above 0x70000000,
	force add_map.

Modified:
    trunk/libcpp/ChangeLog
    trunk/libcpp/line-map.c
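
For readers who do not want to open the patch, the condition described in the ChangeLog entry can be sketched roughly as the C predicate below. This is an illustration built only from the ChangeLog wording, not the committed code: the function name `must_force_new_map` and the `tracking_columns` parameter are hypothetical stand-ins, and the real check sits inside linemap_line_start next to the other add_map conditions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the extra condition described in the
   ChangeLog: HIGHEST is the highest location handed out so far, and
   TRACKING_COLUMNS says whether column numbers are still recorded.  */
bool
must_force_new_map (uint32_t highest, bool tracking_columns)
{
  /* Above 0x60000000 a new map is forced while columns are still
     tracked; above 0x70000000 it is forced unconditionally.  */
  return highest > 0x60000000u
         && (tracking_columns || highest > 0x70000000u);
}
```

Because any value above 0x70000000 is also above 0x60000000, the ChangeLog sentence gives the same result whichever way its "and ... or" is grouped. With the location from comment 5, `must_force_new_map (2147483542u, false)` is true under this sketch.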
Comment 11 Jakub Jelinek 2014-11-28 13:37:44 UTC
Author: jakub
Date: Fri Nov 28 13:37:13 2014
New Revision: 218154

URL: https://gcc.gnu.org/viewcvs?rev=218154&root=gcc&view=rev
Log:
	Backported from mainline
	2014-11-25  Jakub Jelinek  <jakub@redhat.com>

	PR preprocessor/60436
	* line-map.c (linemap_line_start): If highest is above 0x60000000
	and we are still tracking columns or highest is above 0x70000000,
	force add_map.

Modified:
    branches/gcc-4_9-branch/libcpp/ChangeLog
    branches/gcc-4_9-branch/libcpp/line-map.c
Comment 12 Jakub Jelinek 2014-11-28 17:06:05 UTC
Author: jakub
Date: Fri Nov 28 17:05:34 2014
New Revision: 218168

URL: https://gcc.gnu.org/viewcvs?rev=218168&root=gcc&view=rev
Log:
	Backported from mainline
	2014-11-25  Jakub Jelinek  <jakub@redhat.com>

	PR preprocessor/60436
	* line-map.c (linemap_line_start): If highest is above 0x60000000
	and we are still tracking columns or highest is above 0x70000000,
	force add_map.

Modified:
    branches/gcc-4_8-branch/libcpp/ChangeLog
    branches/gcc-4_8-branch/libcpp/line-map.c
Comment 13 Jakub Jelinek 2014-11-28 17:11:52 UTC
Fixed.