===== gcc41bug.c ===== #include <stddef.h> typedef struct chunk_s chunk_t; struct chunk_s { unsigned int size; int offset_next; }; #define GET_ADDRESS(baseptr, offset) \ ((chunk_t *)((char *)(baseptr) + (offset))) void gcc_bug_test(chunk_t *first) { chunk_t *cur; while (1) { cur = GET_ADDRESS(first, first->offset_next); if (GET_ADDRESS(first, cur->offset_next) == first) first->offset_next = 0; if (cur->size == 0) break; } } ===== end of gcc41bug.c ===== command line: $ i686-pc-linux-gnu-gcc-4.1.0 -c -O1 -o gcc41bug.o gcc41bug.c i686-pc-linux-gnu-gcc-4.1.0: Internal error: Segmentation fault (program cc1) 1. Works fine without -O1. 2. Code might look meaningless, but it's extracted part from a real application code $ gcc -v Using built-in specs. Target: i686-pc-linux-gnu Configured with: /var/tmp/portage/gcc-4.1.0/work/gcc-4.1.0/configure --prefix=/usr --bindir=/usr/i686-pc-linux-gnu/gcc-bin/4.1.0 --includedir=/usr/lib/gcc/i686-pc-linux-gnu/4.1.0/include --datadir=/usr/share/gcc-data/i686-pc-linux-gnu/4.1.0 --mandir=/usr/share/gcc-data/i686-pc-linux-gnu/4.1.0/man --infodir=/usr/share/gcc-data/i686-pc-linux-gnu/4.1.0/info --with-gxx-include-dir=/usr/lib/gcc/i686-pc-linux-gnu/4.1.0/include/g++-v4 --host=i686-pc-linux-gnu --build=i686-pc-linux-gnu --disable-altivec --enable-nls --without-included-gettext --with-system-zlib --disable-checking --disable-werror --disable-libunwind-exceptions --disable-multilib --disable-libmudflap --disable-libssp --disable-libgcj --enable-languages=c,c++,fortran --enable-shared --enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu Thread model: posix gcc version 4.1.0 (Gentoo 4.1.0)
#2 0x081a4404 in lookup_as_function (x=<value optimized out>, code=PLUS) at /home/peshtigo/pinskia/src/gnu/gcc/src/gcc/cse.c:1453 #3 0x081a58d0 in fold_rtx (x=<value optimized out>, insn=0x0) at /home/peshtigo/pinskia/src/gnu/gcc/src/gcc/cse.c:4213 #4 0x081a5a9e in fold_rtx (x=<value optimized out>, insn=0x0) at /home/peshtigo/pinskia/src/gnu/gcc/src/gcc/cse.c:3452 #5 0x081a58e4 in fold_rtx (x=<value optimized out>, insn=0x0) at /home/peshtigo/pinskia/src/gnu/gcc/src/gcc/cse.c:4218 #6 0x081a5a9e in fold_rtx (x=<value optimized out>, insn=0x0) at /home/peshtigo/pinskia/src/gnu/gcc/src/gcc/cse.c:3452 #7 0x081a58e4 in fold_rtx (x=<value optimized out>, insn=0x0) at /home/peshtigo/pinskia/src/gnu/gcc/src/gcc/cse.c:4218 #
Confirmed.
Reduced testcase (without any preprocessed macros or includes): struct chunk_s { unsigned int size; int offset_next; }; typedef struct chunk_s chunk_t; void gcc_bug_test(chunk_t *first) { chunk_t * cur; char * first0; do { first0 = (char *) first; cur = (chunk_t *) (first->offset_next + first0); if ((chunk_t *) (first0 + cur->offset_next) != first) return ; first->offset_next = 0; } while (cur->size != 0); }
Infinite recursion is going on in CSE. Here is what happens. At cse.c:4278 in fold_rtx, we have (gdb) call debug_rtx(x) (plus:SI (reg/v/f:SI 60 [ first ]) (const_int 4 [0x4])) (gdb) call debug_rtx(y) (plus:SI (reg/v/f:SI 60 [ first ]) (mem/s/j:SI (plus:SI (reg/v/f:SI 59 [ cur ]) (const_int 4 [0x4])) [0 <variable>.offset_next+0 S4 A32])) (gdb) Here we call fold_rtx (XEXP (y, 1), 0), where XEXP (y, 1) is (mem/s/j:SI (plus:SI (reg/v/f:SI 59 [ cur ]) (const_int 4 [0x4])) [0 <variable>.offset_next+0 S4 A32]) fold_rtx delegates all processing of MEMs to fold_rtx_mem, so fold_rtx_mem gets (mem/s/j:SI (plus:SI (reg/v/f:SI 59 [ cur ]) (const_int 4 [0x4])) [0 <variable>.offset_next+0 S4 A32]) fold_rtx_mem in turn calls fold_rtx with the address inside the MEM, which is: (plus:SI (reg/v/f:SI 59 [ cur ]) (const_int 4 [0x4])) [0 <variable>.offset_next+0 S4 A32]) fold_rtx later calls lookup_as_function on (reg/v/f:SI 59 [ cur ]), which returns (plus:SI (reg/v/f:SI 60 [ first ]) (mem/s/j:SI (plus:SI (reg/v/f:SI 60 [ first ]) (const_int 4 [0x4])) [0 <variable>.offset_next+0 S4 A32])) which is the same as what we started with.
Will not be fixed in 4.1.1; adjust target milestone to 4.1.2.
Would be fixed with fwprop due to not recursively calling fold_rtx.
Janis, Could you do a regression hunt on this bug for me? Thanks, Andrew Pinski
A regression hunt on powerpc-linux using an i686-linux cross compiler identified this patch: http://gcc.gnu.org/viewcvs?view=rev&rev=102570 r102570 | hubicka | 2005-07-29 22:22:41 +0000 (Fri, 29 Jul 2005) The compiler didn't segfault on the test system, but it did run for a very long time so the test script used by the regression hunt timed out after 5 seconds.
The patch identified in comment #8 can't have caused the CSE problem, but it probably exposed a latent bug.
Investigating.
Ugh. This is an oscillation with period 6 (not 3 as indicated in comment #3) between fold_rtx and fold_rtx_mem: #0 fold_rtx (x=0x55759dd4, insn=0x0) at /home/eric/svn/gcc/gcc/cse.c:3621 #1 0x081a640a in fold_rtx_mem (x=0x55759de0, insn=0x0) at /home/eric/svn/gcc/gcc/cse.c:3453 #2 0x081a8203 in fold_rtx (x=0x55759de0, insn=0x0) at /home/eric/svn/gcc/gcc/cse.c:3678 #3 0x081ad4d0 in fold_rtx (x=0x55759d8c, insn=0x0) at /home/eric/svn/gcc/gcc/cse.c:4283 #4 0x081a640a in fold_rtx_mem (x=0x55759d98, insn=0x0) at /home/eric/svn/gcc/gcc/cse.c:3453 #5 0x081a8203 in fold_rtx (x=0x55759d98, insn=0x0) at /home/eric/svn/gcc/gcc/cse.c:3678 #6 0x081ad4d0 in fold_rtx (x=0x55759dd4, insn=0x0) at /home/eric/svn/gcc/gcc/cse.c:4283 The fundamental problem is that fold_rtx operates recursively on its operands after equivalencing them, unlike the tree folder. More specifically: /* If we have (<op> <reg> <const_int>) for an associative OP and REG is known to be of similar form, we may be able to replace the operation with a combined operation. This may eliminate the intermediate operation if every use is simplified in this way. Note that the similar optimization done by combine.c only works if the intermediate operation's result has only one reference. */ This is a recipe for infinite loops and the code already contains a couple of kludges against that: /* If we have compiled a statement like "if (x == (x & mask1))", and now are looking at "x & mask2", we will have a case where the first operand of Y is the same as our first operand. Unless we detect this case, an infinite loop will result. */ if (XEXP (y, 0) == folded_arg0) break; [...] /* If Y contains our first operand (the most common way this can happen is if Y is a MEM), we would do into an infinite loop if we tried to fold it. So don't in that case. */ if (! reg_mentioned_p (folded_arg0, y)) y = fold_rtx (y, insn); I have another one for the -O1 case but it falls apart at -O2 because PRE hoists an equivalencing insn out of the extended BB under consideration. The problematic pattern is set (reg1) (plus (reg) (mem (plus (reg2) (const_int)))) set (reg2) (plus (reg) (mem (plus (reg1) (const_int)))) so in particular it defeats all the first order kludges like the above two.
*** Bug 28010 has been marked as a duplicate of this bug. ***
*** Bug 28569 has been marked as a duplicate of this bug. ***
Subject: Bug 27616 Author: ebotcazou Date: Mon Sep 4 19:33:24 2006 New Revision: 116683 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=116683 Log: PR rtl-optimization/27616 * cse.c (table_size): New static variable. (new_basic_block): Initialize it to 0. (remove_from_table): Decrement it. (insert): Increment it. (fold_rtx_mem_1): New function, renamed from fold_rtx_mem. (fold_rtx_mem): Enforce a cap on the recursion depth. Call fold_rtx_mem_1 if under the cap. (fold_rtx) <RTX_COMM_ARITH>: In the associative case, delay a little the lookup of the equivalent expression and test for equality of the first operand of the equivalent expression before in turn looking up an equivalent constant for the second operand. Added: trunk/gcc/testsuite/gcc.c-torture/compile/20060904-1.c Modified: trunk/gcc/ChangeLog trunk/gcc/cse.c trunk/gcc/testsuite/ChangeLog
Subject: Bug 27616 Author: ebotcazou Date: Mon Sep 4 19:35:09 2006 New Revision: 116684 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=116684 Log: PR rtl-optimization/27616 * cse.c (table_size): New static variable. (new_basic_block): Initialize it to 0. (remove_from_table): Decrement it. (insert): Increment it. (fold_rtx_mem_1): New function, renamed from fold_rtx_mem. (fold_rtx_mem): Enforce a cap on the recursion depth. Call fold_rtx_mem_1 if under the cap. (fold_rtx) <RTX_COMM_ARITH>: In the associative case, delay a little the lookup of the equivalent expression and test for equality of the first operand of the equivalent expression before in turn looking up an equivalent constant for the second operand. Added: branches/gcc-4_1-branch/gcc/testsuite/gcc.c-torture/compile/20060904-1.c Modified: branches/gcc-4_1-branch/gcc/ChangeLog branches/gcc-4_1-branch/gcc/cse.c branches/gcc-4_1-branch/gcc/testsuite/ChangeLog
Fixed everywhere.