Bug 11198 - [3.3 regression] -O2 -frename-registers generates wrong code
Summary: [3.3 regression] -O2 -frename-registers generates wrong code
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization (show other bugs)
Version: 3.3
: P1 critical
Target Milestone: 3.3.1
Assignee: Eric Botcazou
Keywords: wrong-code
Depends on:
Reported: 2003-06-15 17:49 UTC by Joerg Walter
Modified: 2012-05-11 18:41 UTC (History)
3 users (show)

See Also:
Host: i686-pc-linux-gnu
Target: i686-pc-linux-gnu
Build: i686-pc-linux-gnu
Known to work:
Known to fail:
Last reconfirmed: 2003-06-28 19:44:44

Preprocessed source (137.27 KB, application/x-bzip2)
2003-06-15 17:57 UTC, Joerg Walter
Preprocessed source (reduced) (90.71 KB, application/x-bzip2)
2003-06-18 20:39 UTC, Joerg Walter
Preprocessed source (further reduced) (7.47 KB, application/octet-stream)
2003-07-03 01:42 UTC, Volker Reichelt
Preprocessed source (~400 lines) (2.15 KB, application/octet-stream)
2003-07-03 20:37 UTC, Volker Reichelt
Now at 314 lines (1.77 KB, text/plain)
2003-07-03 23:07 UTC, Wolfgang Bangerth
Now at 203 lines (1.19 KB, text/plain)
2003-07-04 00:31 UTC, Wolfgang Bangerth
Now 122 lines (818 bytes, text/plain)
2003-07-04 01:56 UTC, Wolfgang Bangerth

Note You need to log in before you can comment on or make changes to this bug.
Description Joerg Walter 2003-06-15 17:49:57 UTC
The following program

#include <iostream>

#include <boost/numeric/ublas/config.hpp>
#include <boost/numeric/ublas/vector.hpp>
#include <boost/numeric/ublas/matrix.hpp>
#include <boost/numeric/ublas/banded.hpp>
#include <boost/numeric/ublas/io.hpp>

namespace ublas = boost::numeric::ublas;

template<class M>
void initialize_matrix (M &m) {
    int size1 = m.size1 ();
    int size2 = m.size2 ();
    for (int i = 0; i < size1; ++ i)
        for (int j = std::max (i - 1, 0); j < std::min (i + 2, size2); ++ j)
            m (i, j) = i * size1 + j + 1.f;

int main () {
    ublas::banded_matrix<double> m1 (3, 3, 1, 1), m2 (3, 3, 1, 1);
    ublas::banded_adaptor<ublas::banded_matrix<double> > bam1 (m1, 1, 1), bam2 
(m2, 1, 1);
    initialize_matrix (bam1);
    initialize_matrix (bam2);
    bam1.swap (bam2);
    std::cout << "bam1.swap (bam2) = " << bam1 << " " << bam2 << std::endl;


works as intended when compiled with -O2 and core dumps in swap() when 
compiled with -O2 -frename-registers like

g++ -v -save-temps -I/usr/local/lib/boost_dev/boost -fabi-version=0 -DNDEBUG 
-O2 -frename-registers    bug.cpp   -o bug
Reading specs from /usr/local/lib/gcc-lib/i686-pc-linux-gnu/3.3/specs
Configured with: ./configure
Thread model: posix
gcc version 3.3
 /usr/local/lib/gcc-lib/i686-pc-linux-gnu/3.3/cc1plus -E -D__GNUG__=3 -quiet 
-v -I/usr/local/lib/boost_dev/boost -D__GNUC__=3 -D__GNUC_MINOR__=3 
-D__GNUC_PATCHLEVEL__=0 -D_GNU_SOURCE -DNDEBUG bug.cpp -fabi-version=0 
-frename-registers -O2 bug.ii
ignoring nonexistent directory "NONE/include"
ignoring nonexistent directory "/usr/local/i686-pc-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
End of search list.
 /usr/local/lib/gcc-lib/i686-pc-linux-gnu/3.3/cc1plus -fpreprocessed bug.ii 
-quiet -dumpbase bug.cpp -auxbase bug -O2 -version -fabi-version=0 
-frename-registers -o bug.s
GNU C++ version 3.3 (i686-pc-linux-gnu)
	compiled by GNU C version 2.95.3 20010315 (SuSE).
GGC heuristics: --param ggc-min-expand=47 --param ggc-min-heapsize=31945
 as -V -Qy -o bug.o bug.s
GNU assembler version (i486-suse-linux) using BFD version
 /usr/local/lib/gcc-lib/i686-pc-linux-gnu/3.3/collect2 -m elf_i386 
-dynamic-linker /lib/ld-linux.so.2 -o bug /usr/lib/crt1.o /usr/lib/crti.o 
-L/usr/local/lib/gcc-lib/i686-pc-linux-gnu/3.3/../../.. bug.o -lstdc++ -lm 
-lgcc_s -lgcc -lc -lgcc_s -lgcc 
/usr/local/lib/gcc-lib/i686-pc-linux-gnu/3.3/crtend.o /usr/lib/crtn.o


The program work correct when compiled with -O2 -frename-registers under GCC 
Comment 1 Joerg Walter 2003-06-15 17:57:31 UTC
Created attachment 4226 [details]
Preprocessed source
Comment 2 Andrew Pinski 2003-06-18 17:17:21 UTC
This is very hard to isolate is there any way you could make a smaller example?
Comment 3 Joerg Walter 2003-06-18 20:39:07 UTC
Created attachment 4244 [details]
Preprocessed source (reduced)

Comment 4 Joerg Walter 2003-06-23 21:39:38 UTC
Subject: Re:  -O2 -frename-registers generates wrong code

Hi Andrew,

I've just tried to reduce the test case for that bug once again (eliminating
some std:: dependencies), but failed. It would be great, if someone (means
you ;-) could give me some advice on how to advance.


----- Original Message -----
From: "pinskia at physics dot uc dot edu" <gcc-bugzilla@gcc.gnu.org>
To: <jhr.walter@t-online.de>
Sent: Wednesday, June 18, 2003 7:17 PM
Subject: [Bug optimization/11198] -O2 -frename-registers generates wrong

> PLEASE REPLY TO gcc-bugzilla@gcc.gnu.org ONLY, *NOT* gcc-bugs@gcc.gnu.org.
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11198
> pinskia at physics dot uc dot edu changed:
>            What    |Removed                     |Added
> --------------------------------------------------------------------------
>              Status|UNCONFIRMED                 |WAITING
> ------- Additional Comments From pinskia at physics dot uc dot edu
2003-06-18 17:17 -------
> This is very hard to isolate is there any way you could make a smaller
> ------- You are receiving this mail because: -------
> You reported the bug, or are watching the reporter.

Comment 5 Andrew Pinski 2003-06-28 19:44:44 UTC
No one has figuared out how to reduce this but it still is a bug.
Comment 6 Volker Reichelt 2003-07-03 01:42:44 UTC
Created attachment 4333 [details]
Preprocessed source (further reduced)

still 2000 lines left :-(
Comment 7 Volker Reichelt 2003-07-03 20:37:09 UTC
Created attachment 4341 [details]
Preprocessed source (~400 lines)
Comment 8 Volker Reichelt 2003-07-03 20:46:21 UTC
Reducing this stuff from ~34500 line to roughly 400 was quite a nightmare.
Alas, I'm stuck now: If I try to further simplify the code (e.g. by removing
unused template parameters) the bug vanishes.

I'm afraid somebody has to debug this stuff to find the bug in gcc.

The good news is, the bug doesn't seem to show up in mainline or releases
before gcc 3.3.

Comment 9 Volker Reichelt 2003-07-03 23:04:04 UTC
It looks as if the problem appeared with Eric's patch

Comment 10 Wolfgang Bangerth 2003-07-03 23:07:47 UTC
Created attachment 4342 [details]
Now at 314 lines

Now at 314 lines. I start to think that this bug is essentially unrelated to
-frename-registers, since random changes either make the segfault go
away when compiled with -frename-registers, or let a segfault appear when

Will try further.

Comment 11 Wolfgang Bangerth 2003-07-04 00:31:03 UTC
Created attachment 4343 [details]
Now at 203 lines

Still very fragile testcase :-( It's now at 203 lines, and we get this

tmp/gg> ~/bin/gcc-3.3/bin/c++ x.cc -O2 && ./a.out
tmp/gg> ~/bin/gcc-3.3/bin/c++ x.cc -O2 -frename-registers && ./a.out
Segmentation fault
Comment 12 Andrew Pinski 2003-07-04 01:18:39 UTC
If it was Eric's patch, then this is a binutils bug because the asm looks the same between the two 
versions (except for use of different registers), note this is still present in the latest cvs version of 
GNU assembler version 2.14.90 (i686-pc-linux-gnu) using BFD version 2.14.90 20030703.
Comment 13 Wolfgang Bangerth 2003-07-04 01:56:45 UTC
Created attachment 4344 [details]
Now 122 lines

OK, it's now 122 lines and I gotta go. Should now be a little simpler to
figure out a) what the program is supposed to do, and b) why it's crashing.

Comment 14 Andrew Pinski 2003-07-04 02:23:47 UTC
Good asm:
	pushl	%ebp
	movl	%esp, %ebp
	subl	$56, %esp
	movl	$0, -16(%ebp)
	movl	8(%ebp), %eax
	movl	12(%ebp), %edx
	movl	$0, -52(%ebp)
	movl	(%eax), %ecx
	movl	%eax, -24(%ebp)
	movl	-32(%ebp), %eax
	movl	%ecx, -20(%ebp)
	movl	$0, -32(%ebp)
	movl	%eax, -12(%ebp)
	movl	(%edx), %eax
	movl	%ecx, -40(%ebp)
	movl	%edx, -40(%ebp)
	movl	%eax, -36(%ebp)
	movl	%eax, -56(%ebp)
	movl	-48(%ebp), %eax
	movl	%eax, -28(%ebp)
	xorl	%eax, %eax
	cmpl	(%ecx), %eax; dies here in bad, ecx = -40(%ebp)


	pushl	%ebp
	movl	%esp, %ebp
	subl	$56, %esp
	movl	$0, -16(%ebp)
	movl	8(%ebp), %eax
	movl	12(%ebp), %edx
	movl	$0, -52(%ebp)
	movl	(%eax), %ecx
	movl	%eax, -24(%ebp)
	movl	-32(%ebp), %eax
	movl	%edx, -40(%ebp)
	movl	%ecx, -20(%ebp)
	movl	%eax, -12(%ebp)
	movl	(%edx), %eax
	movl	-48(%ebp), %edx
	movl	$0, -32(%ebp)
	movl	%eax, -36(%ebp)
	movl	%edx, -28(%ebp)
	xorl	%edx, %edx
	cmpl	(%ecx), %edx ; dies here, ecx =  -20(%ebp), not the -40(%ebp) where the good 
one is.
Comment 15 Eric Botcazou 2003-07-04 08:02:50 UTC
Not quite. Don't forget that the AT&T style uses 'mov src,dest'.

According to GDB, the problem is rather here:
0x08048458      29            if (j < size2_ && i-j < 2)

 804844f:       dd 00                   fldl   (%eax)
 8048451:       8b 4d d8                mov    0xffffffd8(%ebp),%ecx
 8048454:       8b 11                   mov    (%ecx),%edx
 8048456:       31 c9                   xor    %ecx,%ecx
 8048458:       3b 0a                   cmp    (%edx),%ecx
Comment 16 Eric Botcazou 2003-07-04 09:20:19 UTC
Here's the code at stake (from the buggy version, but functionally identical to
the correct version):

	fldl	(%eax)
	movl	-40(%ebp), %ecx
	movl	(%ecx), %edx
	xorl	%ecx, %ecx
	cmpl	(%edx), %ecx

The problem is the contents of -40(%ebp).

In the correct version, -40(%ebp) is written twice in a row

	movl	(%edx), %eax
	movl	%ecx, -40(%ebp)
	movl	%edx, -40(%ebp)
	movl	%eax, -36(%ebp)

In the buggy version, -40(%ebp) is written twice too, but in the reverse order

	movl	%edx, -40(%ebp)
	movl	%ecx, -20(%ebp)
	movl	%eax, -12(%ebp)
	movl	(%edx), %eax
	movl	-48(%ebp), %edx
	movl	$0, -32(%ebp)
	movl	%eax, -36(%ebp)
	movl	%edx, -28(%ebp)
	xorl	%edx, %edx
	cmpl	(%ecx), %edx
	movl	%ecx, -40(%ebp)

The 2nd scheduling pass swaps the two sets, because the memory locations are not
marked as aliasing:

(insn:HI 27 21 40 0 0x4020fc08 (set (mem/s:SI (plus:SI (reg/f:SI 6 ebp)
                (const_int -40 [0xffffffd8])) [7 <variable>.dead1+0 S4 A32])
        (reg/v/f:SI 2 ecx [64])) 38 {*movsi_1} (insn_list 21 (nil))

(insn:HI 83 73 87 0 0x4020fd10 (set (mem/s:SI (plus:SI (reg/f:SI 6 ebp)
                (const_int -40 [0xffffffd8])) [5 <variable>.a+0 S4 A32])
        (reg/v/u/f:SI 1 edx [59])) 38 {*movsi_1} (nil)
    (expr_list:REG_DEAD (reg/v/u/f:SI 1 edx [59])

If I understand correctly the code, the variable 'dead1' comes from the first
iterator it1, while the variable 'a' comes from the second iterator it2. For
some reason, they are wrongly given the same stack slot.

Comment 17 Andreas Schwab 2003-07-04 09:38:28 UTC
Could this be the same as bug 11320? 
Comment 18 Eric Botcazou 2003-07-04 13:12:04 UTC
If bug 11320 is indeed a reload bug, no.

The problem pertains to aliasing here: the slot assigned to the temporary


is reused for

   adaptor<>::a_iterator1 it2

because the types conflict according to objects_must_conflict_p. The reasoning
is probably that, since the types conflicts, operations on objects of these
types cannot be reordered by the scheduling passes later.

The problem is that at least one member of the first type (dead1) doesn't
conflict with at least one member of the second type (a), so the scheduling pass
swaps insn 27 and insn 83.
Comment 19 CVS Commits 2003-07-07 07:25:40 UTC
Subject: Bug 11198

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	ebotcazou@gcc.gnu.org	2003-07-07 07:25:36

Modified files:
	gcc            : ChangeLog alias.c 
	gcc/testsuite  : ChangeLog 
Added files:
	gcc/testsuite/g++.dg/opt: stack1.C 

Log message:
	PR optimization/11198
	* alias.c (objects_must_conflict_p): Return 1 if the types have
	the same alias set, not if the alias sets only conflict.


Comment 21 Eric Botcazou 2003-07-07 07:37:27 UTC
See http://gcc.gnu.org/ml/gcc-patches/2003-07/msg00564.html