The following program ---------- #include <iostream> #include <boost/numeric/ublas/config.hpp> #include <boost/numeric/ublas/vector.hpp> #include <boost/numeric/ublas/matrix.hpp> #include <boost/numeric/ublas/banded.hpp> #include <boost/numeric/ublas/io.hpp> namespace ublas = boost::numeric::ublas; template<class M> void initialize_matrix (M &m) { int size1 = m.size1 (); int size2 = m.size2 (); for (int i = 0; i < size1; ++ i) for (int j = std::max (i - 1, 0); j < std::min (i + 2, size2); ++ j) m (i, j) = i * size1 + j + 1.f; } int main () { ublas::banded_matrix<double> m1 (3, 3, 1, 1), m2 (3, 3, 1, 1); ublas::banded_adaptor<ublas::banded_matrix<double> > bam1 (m1, 1, 1), bam2 (m2, 1, 1); initialize_matrix (bam1); initialize_matrix (bam2); bam1.swap (bam2); std::cout << "bam1.swap (bam2) = " << bam1 << " " << bam2 << std::endl; } ---------- works as intended when compiled with -O2 and core dumps in swap() when compiled with -O2 -frename-registers like ---------- g++ -v -save-temps -I/usr/local/lib/boost_dev/boost -fabi-version=0 -DNDEBUG -O2 -frename-registers bug.cpp -o bug Reading specs from /usr/local/lib/gcc-lib/i686-pc-linux-gnu/3.3/specs Configured with: ./configure Thread model: posix gcc version 3.3 /usr/local/lib/gcc-lib/i686-pc-linux-gnu/3.3/cc1plus -E -D__GNUG__=3 -quiet -v -I/usr/local/lib/boost_dev/boost -D__GNUC__=3 -D__GNUC_MINOR__=3 -D__GNUC_PATCHLEVEL__=0 -D_GNU_SOURCE -DNDEBUG bug.cpp -fabi-version=0 -frename-registers -O2 bug.ii ignoring nonexistent directory "NONE/include" ignoring nonexistent directory "/usr/local/i686-pc-linux-gnu/include" #include "..." search starts here: #include <...> search starts here: /usr/local/lib/boost_dev/boost /usr/local/include/c++/3.3 /usr/local/include/c++/3.3/i686-pc-linux-gnu /usr/local/include/c++/3.3/backward /usr/local/include /usr/local/lib/gcc-lib/i686-pc-linux-gnu/3.3/include /usr/include End of search list. /usr/local/lib/gcc-lib/i686-pc-linux-gnu/3.3/cc1plus -fpreprocessed bug.ii -quiet -dumpbase bug.cpp -auxbase bug -O2 -version -fabi-version=0 -frename-registers -o bug.s GNU C++ version 3.3 (i686-pc-linux-gnu) compiled by GNU C version 2.95.3 20010315 (SuSE). GGC heuristics: --param ggc-min-expand=47 --param ggc-min-heapsize=31945 as -V -Qy -o bug.o bug.s GNU assembler version 2.11.90.0.29 (i486-suse-linux) using BFD version 2.11.90.0.29 /usr/local/lib/gcc-lib/i686-pc-linux-gnu/3.3/collect2 -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 -o bug /usr/lib/crt1.o /usr/lib/crti.o /usr/local/lib/gcc-lib/i686-pc-linux-gnu/3.3/crtbegin.o -L/usr/local/lib/gcc-lib/i686-pc-linux-gnu/3.3 -L/usr/local/lib/gcc-lib/i686-pc-linux-gnu/3.3/../../.. bug.o -lstdc++ -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc /usr/local/lib/gcc-lib/i686-pc-linux-gnu/3.3/crtend.o /usr/lib/crtn.o ---------- The program work correct when compiled with -O2 -frename-registers under GCC 3.2.1.
Created attachment 4226 [details] Preprocessed source
This is very hard to isolate is there any way you could make a smaller example?
Created attachment 4244 [details] Preprocessed source (reduced) Thanks, Joerg
Subject: Re: -O2 -frename-registers generates wrong code Hi Andrew, I've just tried to reduce the test case for that bug once again (eliminating some std:: dependencies), but failed. It would be great, if someone (means you ;-) could give me some advice on how to advance. Thanks, Joerg ----- Original Message ----- From: "pinskia at physics dot uc dot edu" <gcc-bugzilla@gcc.gnu.org> To: <jhr.walter@t-online.de> Sent: Wednesday, June 18, 2003 7:17 PM Subject: [Bug optimization/11198] -O2 -frename-registers generates wrong code > PLEASE REPLY TO gcc-bugzilla@gcc.gnu.org ONLY, *NOT* gcc-bugs@gcc.gnu.org. > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11198 > > > pinskia at physics dot uc dot edu changed: > > What |Removed |Added > -------------------------------------------------------------------------- -- > Status|UNCONFIRMED |WAITING > > > ------- Additional Comments From pinskia at physics dot uc dot edu 2003-06-18 17:17 ------- > This is very hard to isolate is there any way you could make a smaller example? > > > > ------- You are receiving this mail because: ------- > You reported the bug, or are watching the reporter.
No one has figuared out how to reduce this but it still is a bug.
Created attachment 4333 [details] Preprocessed source (further reduced) still 2000 lines left :-(
Created attachment 4341 [details] Preprocessed source (~400 lines)
Reducing this stuff from ~34500 line to roughly 400 was quite a nightmare. Alas, I'm stuck now: If I try to further simplify the code (e.g. by removing unused template parameters) the bug vanishes. I'm afraid somebody has to debug this stuff to find the bug in gcc. The good news is, the bug doesn't seem to show up in mainline or releases before gcc 3.3. Regards, Volker
It looks as if the problem appeared with Eric's patch http://gcc.gnu.org/ml/gcc-cvs/2003-03/msg00627.html
Created attachment 4342 [details] Now at 314 lines Now at 314 lines. I start to think that this bug is essentially unrelated to -frename-registers, since random changes either make the segfault go away when compiled with -frename-registers, or let a segfault appear when compiled _without_! Will try further. W.
Created attachment 4343 [details] Now at 203 lines Still very fragile testcase :-( It's now at 203 lines, and we get this behavior: tmp/gg> ~/bin/gcc-3.3/bin/c++ x.cc -O2 && ./a.out tmp/gg> ~/bin/gcc-3.3/bin/c++ x.cc -O2 -frename-registers && ./a.out Segmentation fault
If it was Eric's patch, then this is a binutils bug because the asm looks the same between the two versions (except for use of different registers), note this is still present in the latest cvs version of binutils: GNU assembler version 2.14.90 (i686-pc-linux-gnu) using BFD version 2.14.90 20030703.
Created attachment 4344 [details] Now 122 lines OK, it's now 122 lines and I gotta go. Should now be a little simpler to actually figure out a) what the program is supposed to do, and b) why it's crashing. W.
Good asm: pushl %ebp movl %esp, %ebp subl $56, %esp movl $0, -16(%ebp) movl 8(%ebp), %eax movl 12(%ebp), %edx movl $0, -52(%ebp) movl (%eax), %ecx movl %eax, -24(%ebp) movl -32(%ebp), %eax movl %ecx, -20(%ebp) movl $0, -32(%ebp) movl %eax, -12(%ebp) movl (%edx), %eax movl %ecx, -40(%ebp) movl %edx, -40(%ebp) movl %eax, -36(%ebp) movl %eax, -56(%ebp) movl -48(%ebp), %eax movl %eax, -28(%ebp) xorl %eax, %eax cmpl (%ecx), %eax; dies here in bad, ecx = -40(%ebp) Bad: pushl %ebp movl %esp, %ebp subl $56, %esp movl $0, -16(%ebp) movl 8(%ebp), %eax movl 12(%ebp), %edx movl $0, -52(%ebp) movl (%eax), %ecx movl %eax, -24(%ebp) movl -32(%ebp), %eax movl %edx, -40(%ebp) movl %ecx, -20(%ebp) movl %eax, -12(%ebp) movl (%edx), %eax movl -48(%ebp), %edx movl $0, -32(%ebp) movl %eax, -36(%ebp) movl %edx, -28(%ebp) xorl %edx, %edx cmpl (%ecx), %edx ; dies here, ecx = -20(%ebp), not the -40(%ebp) where the good one is.
Not quite. Don't forget that the AT&T style uses 'mov src,dest'. According to GDB, the problem is rather here: 0x08048458 29 if (j < size2_ && i-j < 2) 804844f: dd 00 fldl (%eax) 8048451: 8b 4d d8 mov 0xffffffd8(%ebp),%ecx 8048454: 8b 11 mov (%ecx),%edx 8048456: 31 c9 xor %ecx,%ecx 8048458: 3b 0a cmp (%edx),%ecx
Here's the code at stake (from the buggy version, but functionally identical to the correct version): .L17: fldl (%eax) movl -40(%ebp), %ecx movl (%ecx), %edx xorl %ecx, %ecx cmpl (%edx), %ecx The problem is the contents of -40(%ebp). In the correct version, -40(%ebp) is written twice in a row movl (%edx), %eax movl %ecx, -40(%ebp) movl %edx, -40(%ebp) movl %eax, -36(%ebp) In the buggy version, -40(%ebp) is written twice too, but in the reverse order movl %edx, -40(%ebp) movl %ecx, -20(%ebp) movl %eax, -12(%ebp) movl (%edx), %eax movl -48(%ebp), %edx movl $0, -32(%ebp) movl %eax, -36(%ebp) movl %edx, -28(%ebp) xorl %edx, %edx cmpl (%ecx), %edx movl %ecx, -40(%ebp) The 2nd scheduling pass swaps the two sets, because the memory locations are not marked as aliasing: (insn:HI 27 21 40 0 0x4020fc08 (set (mem/s:SI (plus:SI (reg/f:SI 6 ebp) (const_int -40 [0xffffffd8])) [7 <variable>.dead1+0 S4 A32]) (reg/v/f:SI 2 ecx [64])) 38 {*movsi_1} (insn_list 21 (nil)) (nil)) (insn:HI 83 73 87 0 0x4020fd10 (set (mem/s:SI (plus:SI (reg/f:SI 6 ebp) (const_int -40 [0xffffffd8])) [5 <variable>.a+0 S4 A32]) (reg/v/u/f:SI 1 edx [59])) 38 {*movsi_1} (nil) (expr_list:REG_DEAD (reg/v/u/f:SI 1 edx [59]) (nil))) If I understand correctly the code, the variable 'dead1' comes from the first iterator it1, while the variable 'a' comes from the second iterator it2. For some reason, they are wrongly given the same stack slot.
Could this be the same as bug 11320?
If bug 11320 is indeed a reload bug, no. The problem pertains to aliasing here: the slot assigned to the temporary matrix<>::iterator1(bam1.m) is reused for adaptor<>::a_iterator1 it2 because the types conflict according to objects_must_conflict_p. The reasoning is probably that, since the types conflicts, operations on objects of these types cannot be reordered by the scheduling passes later. The problem is that at least one member of the first type (dead1) doesn't conflict with at least one member of the second type (a), so the scheduling pass swaps insn 27 and insn 83.
Subject: Bug 11198 CVSROOT: /cvs/gcc Module name: gcc Changes by: ebotcazou@gcc.gnu.org 2003-07-07 07:25:36 Modified files: gcc : ChangeLog alias.c gcc/testsuite : ChangeLog Added files: gcc/testsuite/g++.dg/opt: stack1.C Log message: PR optimization/11198 * alias.c (objects_must_conflict_p): Return 1 if the types have the same alias set, not if the alias sets only conflict. Patches: http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.372&r2=2.373 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/alias.c.diff?cvsroot=gcc&r1=1.195&r2=1.196 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/testsuite/ChangeLog.diff?cvsroot=gcc&r1=1.2845&r2=1.2846 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/testsuite/g++.dg/opt/stack1.C.diff?cvsroot=gcc&r1=NONE&r2=1.1
Subject: Bug 11198 CVSROOT: /cvs/gcc Module name: gcc Branch: gcc-3_3-branch Changes by: ebotcazou@gcc.gnu.org 2003-07-07 07:31:48 Modified files: gcc : ChangeLog alias.c gcc/testsuite : ChangeLog Added files: gcc/testsuite/g++.dg/opt: stack1.C Log message: PR optimization/11198 * alias.c (objects_must_conflict_p): Return 1 if the types have the same alias set, not if the alias sets only conflict. Patches: http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-3_3-branch&r1=1.16114.2.636&r2=1.16114.2.637 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/alias.c.diff?cvsroot=gcc&only_with_tag=gcc-3_3-branch&r1=1.181.2.1&r2=1.181.2.2 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/testsuite/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-3_3-branch&r1=1.2261.2.220&r2=1.2261.2.221 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/testsuite/g++.dg/opt/stack1.C.diff?cvsroot=gcc&only_with_tag=gcc-3_3-branch&r1=NONE&r2=1.1.2.1
See http://gcc.gnu.org/ml/gcc-patches/2003-07/msg00564.html
see http://x-factor-2013.blogspot.co.uk/2012/04/son-i-am-disappoint-is-x-factors-simon.html