This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
Re: grokfield miscompiled with -O9
On Jan 8, 2000, "Martin v. Loewis" <martin@loewis.home.cs.tu-berlin.de> wrote:
>> No. Someone should sit down and figure out what the actual problem
>> is.
> Ok. Here is as far as I got.
We've been duplicating work :-(
I was ready to start writing this message when I received yours.
First of all, I derived a minimal testcase from a crash within
stage2 ix86_can_use_return_insn_p(), while compiling stage2 crtend.c.
reglimit would not be properly initialized. The function had been
incorrectly compiled by stage1, so I simplified it as follows. It is
miscompiled on x86 with the current gcc CVS tree, with -O3:
void foo () {}
int main () {
int i;
asm ("" : "=r" (i) : "0" (-1));
asm ("" : "=r" (i) : "0" (i ? 1 : 2));
if (i != 1)
abort();
}
The fact that foo() was needed to trigger the problem indicated that
there was some information propagating from one function to another,
and this shouldn't happen.
Anyway, I spent a lot of time to try to figure out what was wrong. I
found that jump would convert the RTL for (i ? 1 : 2) to something
as follows (this is actually from the flow dump, but it is simpler
than the result of jump, and contains enough information to trigger
the bug):
(insn 39 37 40 (set (reg:CCNO 17 flags)
(compare:CCNO (reg:SI 27)
(const_int 0 [0x0]))) -1 (nil)
(nil))
(insn 40 39 41 (set (subreg:QI (reg:SI 28) 0)
(eq:QI (reg:CCNO 17 flags)
(const_int 0 [0x0]))) -1 (nil)
(nil))
(insn 41 40 42 (parallel[
(set (reg:SI 28)
(zero_extend:SI (subreg:QI (reg:SI 28) 0)))
(clobber (reg:CC 17 flags))
] ) -1 (nil)
(nil))
(insn 42 41 44 (parallel[
(set (reg:SI 28)
(plus:SI (reg:SI 28)
(const_int 1 [0x1])))
(clobber (reg:CC 17 flags))
] ) -1 (nil)
(nil))
It turned out that expand_compound_operation would disregard the
zero_extend because reg_nonzero_bits[28] said only its two
least-significant bits could be zero, so it would combine insns 41 and
42, resulting:
(insn 42 41 23 (parallel[
(set (reg:SI 28)
(plus:SI (reg:SI 28)
(const_int 1 [0x1])))
(clobber (reg:CC 17 flags))
] ) 174 {*addsi_1} (insn_list 40 (nil))
(expr_list:REG_UNUSED (reg:CC 17 flags)
(nil)))
So the higher-order bytes never get zeroed out. In fact, the assembly
code generated for main contains:
orl $-1, %eax
testl %eax, %eax
sete %al
incl %eax
cmpl $1, %eax
The real problem was that insn 40 shouldn't have been emitted like
that. I believe subregs should never be used to initialize a
register like that. So I tried to find out why it was being generated
like that, and I found, in i386.c:
int
ix86_expand_setcc (code, unordered, dest)
[snip]
if (!cse_not_expected)
tmp = gen_reg_rtx (QImode);
else
tmp = gen_lowpart (QImode, dest);
But when the jump pass (indirectly) called this function,
cse_not_expected had not been cleared after the delayed generation of
foo(). So, instead of generating a separate QImode virtual reg, it
re-used dest.
So I installed the following patch, to initialize cse_not_expected in
the beginning of rest_of_compilation. It allowed me to bootstrap with
BOOT_CFLAGS=-O3 without having to revert the patch for
jump_optimize_1(). In fact, reverting the patch helped hide the bug
in this case, because jump_optimize_1() would create itself another
register.
Ok to install this one?
Index: gcc/ChangeLog
from Alexandre Oliva <oliva@lsd.ic.unicamp.br>
* toplev.c (rest_of_compilation): Initialize cse_not_expected as
in prepare_function_start().
Index: gcc/toplev.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/toplev.c,v
retrieving revision 1.276
diff -u -r1.276 toplev.c
--- gcc/toplev.c 2000/01/07 00:16:51 1.276
+++ gcc/toplev.c 2000/01/08 16:58:36
@@ -2777,6 +2777,10 @@
int failure = 0;
int rebuild_label_notes_after_reload;
+ /* When processing delayed functions, prepare_function_start() won't
+ have been run to re-initialize it. */
+ cse_not_expected = ! optimize;
+
/* First, remove any notes we don't need. That will make iterating
over the instruction sequence faster, and allow the garbage
collector to reclaim the memory used by the notes. */
--
Alexandre Oliva http://www.ic.unicamp.br/~oliva IC-Unicamp, Bra[sz]il
oliva@{lsd.ic.unicamp.br,guarana.{org,com}} aoliva@{acm,computer}.org
oliva@{gnu.org,kaffe.org,{egcs,sourceware}.cygnus.com,samba.org}
** I may forward mail about projects to mailing lists; please use them