Bug 20973 - [4.0 Regression] kdelibs (khtml) miscompiled by reload
Summary: [4.0 Regression] kdelibs (khtml) miscompiled by reload
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 4.0.0
: P2 normal
Target Milestone: 4.0.1
Assignee: Not yet assigned to anyone
Keywords: wrong-code
Depends on:
Blocks: 20949
  Show dependency treegraph
Reported: 2005-04-12 17:28 UTC by Michael Matz
Modified: 2005-04-23 18:52 UTC (History)
8 users (show)

See Also:
Host: i386-linux
Known to work:
Known to fail:
Last reconfirmed: 2005-04-12 18:01:07

the preprocessed source showing the problem (201.04 KB, application/octet-stream)
2005-04-12 17:30 UTC, Michael Matz
Patch for this problem (913 bytes, patch)
2005-04-12 19:47 UTC, Michael Matz
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Matz 2005-04-12 17:28:56 UTC
gcc 4 (about RC1) miscompiles khtml, in fact something in CSS, which basically 
leads to all websites being misrendered.  I can't easily reduce the testcase, 
but have applied the whole preprocessed source of css/cssstyleselector.ii. 
It is to be compiled with g++ -O2 -fPIC -march=i586 -mtune=i686 
-fno-exceptions.  A more detailed analysis will follow, as we've found out 
some things already.
Comment 1 Michael Matz 2005-04-12 17:30:29 UTC
Created attachment 8610 [details]
the preprocessed source showing the problem
Comment 2 Andrew Pinski 2005-04-12 17:33:00 UTC
Reload, wow, not that unexcepted really.
Comment 3 Michael Matz 2005-04-12 17:34:51 UTC
Here some mails we exchanged:  
Adding something complex seems to change how  
things are allocated or REG_DEAD notes distributed or something like this.  
If we add 'kdDebug( 6080 ) << "applying property " << id << endl;' at the  
very start, it seems to work.  If we comment it out, and add a  
+       printf ("apply %d id %d\n", apply, id);  
        if(!apply) return;  
            case CSS_PROP_MAX_WIDTH:  
at line 2834, one can see where it goes wrong.  The printf will show id  
being 0 every time.  In the debugger printing 'id' (which refers to the  
stack position) shows the correct value.  At that point the compiler has  
placed id into %esi, which indeed contains 0 here.  
This $esi is loaded at the beginning of the function with the argument,  
and the first time it changes (when setting a watchpoint on $esi in gdb)  
is at line 2823:  
                l = Length(primitiveValue->computeLength(style,  
paintDeviceMetrics), Fixed,  
This corresponds to this insn:  
0x4028a449 : and    $0xf0000000,%esi  
After that, until the printf $esi is not correctly loaded back from the  
parameter stack slot, and remains 0.  The above insn corresponds to this  
(insn 12371 55063 12368  
/usr/src/packages/BUILD/kdelibs-3.4.0/khtml/misc/khtmllayout.h:49 (parallel [  
            (set (reg:SI 4 si [orig:3843 D.83927 ] [3843])  
                (and:SI (reg:SI 4 si [orig:3843 D.83927 ] [3843])  
                    (const_int -268435456 [0xf0000000])))  
            (clobber (reg:CC 17 flags))  
        ]) 200 {*andsi_1} (nil)  
(I think, there are multiple insns masking the high 4 bit out of %esi, and  
I've not yet traced through all paths).  So somehow I think GCC put  
something into %esi while it still was live.  Wasn't there some error in  
placing REG_DEAD notes just a few days ago?  
Comment 4 Michael Matz 2005-04-12 17:37:08 UTC
Another mail: 
I think the dust settles a bit.  We have this situation in 
.t69.final_cleanup (excerpt from applyRule): 
  struct Length D.83927; 
  D.83927.D.21947.l.value = (int) primitiveValue->m_value.num; 
  D.83927.D.21947.l.type = 1; 
  D.83927.D.21947.l.quirk = 0; 
  l = D.83927; 
  apply = 1; 
  printf (&"apply %d id %d\n"[0], (int) apply, id); 
  if (apply == 0) goto <L9089>; else goto <L2910>; 
The above is the only use of D.83927, so it first is initialized 
memberwise, and then copied as a whole.  This corresponds to these insns 
from .22.lreg (not the full above sequence, but only the interesting 
part).  Reg 4101 is 'l' and is constructed piecewise.  Reg 4197 is 'id'. 
;; Start of basic block 1449, registers live: 3 [bx] 6 [bp] 7 [sp] 16 
   [argp] 18 [fpsr] 20 [frame] 3669 3843 4196 4197 
  (insn:HI 12371 12370 12372 1449 
  /usr/src/packages/BUILD/kdelibs-3.4.0/khtml/misc/khtmllayout.h:49 (parallel 
            (set (reg/v:SI 4101 [ l ]) 
                (and:SI (reg:SI 3843 [ D.83927 ]) 
                    (const_int -268435456 [0xf0000000]))) 
            (clobber (reg:CC 17 flags)) 
        ]) 200 {*andsi_1} (nil) 
    (expr_list:REG_UNUSED (reg:CC 17 flags) 
        (expr_list:REG_DEAD (reg:SI 3843 [ D.83927 ]) 
            (expr_list:REG_UNUSED (reg:CC 17 flags) 
Note how reg 3843 dies here.  Additionally it should be mentioned that 
it's never initialized (it's live up through the first insn), as it 
probably represents the yet-uninitialized part of a bitfield in 
construction.  After the above follows some more bitmanipulation on reg 
4101, setting finally all fields, then the block ends: 
;; End of basic block 1449, registers live: 
 3 [bx] 6 [bp] 7 [sp] 16 [argp] 20 [frame] 3668 4101 4196 4197 
;; Start of basic block 1450, registers live: 3 [bx] 6 [bp] 7 [sp] 16 
   [argp] 20 [frame] 3668 4101 4196 4197 
(insn:HI 12382 12381 12383 1450 
 (set (mem:SI (plus:SI (reg/f:SI 7 sp) 
                (const_int 8 [0x8])) [0 S4 A32]) 
        (reg/v:SI 4197 [ id ])) 35 {*movsi_1} (nil) 
So, and this is the setup of the arguments to the printf.  So, we know 
that 4197 is live throughout block 1449, and correctly conflicts with reg 
4101 (see below), which is the only register set here, by the bitmasking 
insn 12371. 
Now we look at .23.greg dump, first the interesting conflicts: 
;; 3843 conflicts: 
;; 4101 conflicts: ... 4197 ... 
;; 4197 conflicts: ... 4101 ... 
So, all is well.  But then reload goes and breaks it: 
Spilling for insn 12371. 
Spilling for insn 12371. 
Reloads for insn # 12371 
Reload 0: reload_in (SI) = (reg:SI 4 si [orig:3843 D.83927 ] [3843]) 
        reload_out (SI) = (reg/v:SI 4101 [ l ]) 
        GENERAL_REGS, RELOAD_OTHER (opnum = 0) 
        reload_in_reg: (reg:SI 4 si [orig:3843 D.83927 ] [3843]) 
        reload_out_reg: (reg/v:SI 4101 [ l ]) 
        reload_reg_rtx: (reg:SI 4 si [orig:3843 D.83927 ] [3843]) 
Yes, right, reload needs to fix up the insn to make both registers the 
same and it wants to use $esi as reg for reg3843.  Problem is, it wants 
also to alloc reg4197 to $esi: 
3826 in 0  3843 in 4  3844 in 4  3846 in 4  3847 in 1  3912 in 0 
4173 in 5  4180 in 4  4197 in 4  4198 in 5  4199 in 0  4202 in 0 
(4101 is not allocated by the way because reload is able to optimize it 
out, and instead uses 3843 in it's place).  So this then results in this 
(insn:HI 12371 12370 43006 
1448 /usr/src/packages/BUILD/kdelibs-3.4.0/khtml/mis 
            (set (reg:SI 4 si [orig:3843 D.83927 ] [3843]) 
                (and:SI (reg:SI 4 si [orig:3843 D.83927 ] [3843]) 
                    (const_int -268435456 [0xf0000000]))) 
            (clobber (reg:CC 17 flags)) 
        ]) 200 {*andsi_1} (nil) 
at which point %esi is lost.  I think the problem is, that global-alloc 
uses %esi for 4197 and 3843, which is okay, as 3843 is uninitialized and 
hence doesn't conflict with 4197.  But reload then goes on and actually 
uses this uninitialized register for _setting_.  Basically it transforms 
this insn: 
  rA <- op (rB) 
where rA and rB need to match into 
  rB <- op (rB) 
   [rA <- rB] 
where the latter move is not emitted because rA could be optimized away 
(rA is our 4101). 
This introduces a new setting of rB, which is a problem if rB was 
uninitialized.  Normally reload should have generated 
  rA <- rB 
  rA <- op (rA) 
Comment 5 Jakub Jelinek 2005-04-12 18:01:06 UTC
For those who want a quick workaround, changing khtmllayout.h to:
-        Length(LengthType t) { l.value = 0; l.type = t; l.quirk = 0; }
+        Length(LengthType t) { _length = 0; l.type = t; }
         Length(int v, LengthType t, bool q=false)
-        { l.value = v; l.type = t; l.quirk = q; }
+        { _length = 0; l.value = v; l.type = t; l.quirk = q; }

makes the bug go away.  The first changed line actually might result in
better generated code with the current GCC's deficiencies in bitfield handling.

I guess we need a bitfield optimization pass near the end of tree passes that
will figure out bitfield accesses to the same underlying field and change that
into arithmetics on a union or something.  Don't even look at assembly for
that RTL optimizers can't figure out, but a tree optimizer could turn that
into 2 instructions:
struct { int a:2,b:2,c:2,d:2,e:2,f:2,g:2,h:2,i:2,j:2,k:2,l:2,m:2,n:2;} s;
void foo (void)
  s.a |= 1; s.b &= 1; s.c |= 1; s.d &= 1; s.e |= 1; s.f &= 1; s.g |= 1;
  s.h &= 1; s.i |= 1; s.j &= 1; s.k |= 1; s.l &= 1; s.m |= 1; s.n &= 1;
Comment 6 Michael Matz 2005-04-12 19:05:06 UTC
The problem is in reload.c:find_dummy_reload.  It tries to use the input reg 
as reload register for an in-out reload and has certain conditions when it 
can't do so: 
  /* Consider using IN if OUT was not acceptable 
     or if OUT dies in this insn (like the quotient in a divmod insn). 
     We can't use IN unless it is dies in this insn, 
     which means we must know accurately which hard regs are live. 
     Also, the result can't go in IN if IN is used within OUT, 
     or if OUT is an earlyclobber and IN appears elsewhere in the insn.  */ 
  if (hard_regs_live_known 
      && REG_P (in) 
      && (value == 0 
          || find_reg_note (this_insn, REG_UNUSED, real_out)) 
      && find_reg_note (this_insn, REG_DEAD, real_in) 
      && !fixed_regs[REGNO (in)] 
      && HARD_REGNO_MODE_OK (REGNO (in), 
But this doesn't check if IN is used uninitialized.  In that case it also 
can't be used.  It's not immediately clear to me how to check for this, 
as nowhere is it noted that this or that pseudo is actually uninitialized 
and only therefore got a register by global.c.  One could look at the  
global_live_at_start of the first block, if it mentions the original pseudo 
number of the IN operand. 
The problem of course being that at this point the hardregs already are 
substituted into the REG expressions.  So one would have to trust the 
original regnos noted there.  And it's not clear that this is the only 
place in reload which is confused by hardregs corresponding to uninitialized 
Comment 7 Michael Matz 2005-04-12 19:47:18 UTC
I have a patch for reload, which fixes the bug, when looking at the dumps. 
At least now find_reg is used for the insn in question, which also evicts 
pseudos using the same reg as the chosen final reg_rtx.  I have only tested 
this by copying around the generated .s file, not by building a new compiler. 
(Hence it's also obviously not regression or bootstrap tested, but I'll do 
this now.) 
Comment 8 Michael Matz 2005-04-12 19:47:54 UTC
Created attachment 8614 [details]
Patch for this problem
Comment 9 Dirk Mueller 2005-04-13 01:54:37 UTC
will test the patch with a KDE recompilation. 
Comment 10 Michael Matz 2005-04-18 14:22:23 UTC
This patch fixes the regressions in khtml for us. 
Comment 11 Mark Mitchell 2005-04-18 14:42:07 UTC
Michael, have you run the GCC testsuite with your patch?  If not, please run on
several platforms and confirm that you get no regressions.
Comment 12 Michael Matz 2005-04-18 14:51:09 UTC
where I submitted the patch: 
  the problem in khtml.  I've bootstrapped it with gcc 4.0 on  
  i686,x86_64,ppc,ppc64,ia64,s390 (s390x was breaking for different  
  reasons), all languages (with Ada ;) ).  There were no regressions (in  
  fact some fixed Ada testcases, but I'm not sure if they were real). 
Note the last sentence. 
Comment 13 Jim Wilson 2005-04-19 06:22:15 UTC
The reload patch seems reasonable, though not very satisfying.  It is only
disabling some optimizations within reload, and hence should be somewhat safe. 
Though it is still a reload patch, and all reload patches are dangerous.

There is a hunk of code added twice, but only one of the two instances is
commented.  The other one should have a comment also.

Another way to look at this problem is to point out that we have an
uninitialized register, and claim that this is wrong.

This is actually dangerous on Itanium.  Suppose we call a hand-written glibc
routine which leaves a few NaT bits lying around.  Suppose we then assign the
uninitialized pseudo to one of these registers with a NaT bit set.  When we read
the register, we get an unhandled NaT consumption fault, and the program dies. 
This isn't hypothetical.  This actually happened once in the early days of the
Itanium port.  It is rather hard to reproduce, as gcc doesn't have speculation
support yet, so it is rare for a register to have the NaT bit set.  There are
only a few hand written library routines that can set NaT bits.  This will be a
more serious problem once we do have speculation support.

The old code for structures was careful to always initialize the structure
before doing any piecemeal initialization, if the structure was in a register. 
This way, we never had any uses of an unitialized register.  This is done in
store_constructor for instance.

Unfortunately, this isn't true now that we have tree-ssa, because structure
references are decomposed, and there is no longer any attempt to initialize the
whole structure if it is allocated to a register.

This will have to be fixed for Itanium.  All we need is an instruction to set
the uninitialized register to zero immediately before it is used.  It will then
be optimized away by combine, which will convert
  (set reg 0)
  (set reg (ior reg constant))
  (set reg constant)

Since we must eliminate the uninitialized register uses anyways, we perhaps
don't have to fix the reload issue.  It is probably harmless to make both fixes
though.  The reload patch should never trigger once the uninitialized register
problem is fixed.
Comment 14 Michael Matz 2005-04-19 16:51:08 UTC
I agree with most of what Jim said.  Except for the part that we maybe don't 
have to fix the reload issue, when we fix usage of the uninitialized register 
for piecewise struct initialization.  The latter will fix this particular 
instance of the problem, but reload still would be confused by uninitialized 
registers.  I think they can happen also for other reasons, like the user 
having some uninitialized variables, which perhaps never are used at runtime, 
but still could result in reload miscompiling the program the same way as  
seen here.  As reload bugs are particular hard to track down, I really think 
we should fix this one for good, that it doesn't come back biting us in the 
If agreed I will cleanup the patch to comment both locations (I don't think 
it would deserve an own subfunction). 
Comment 15 GCC Commits 2005-04-22 17:30:55 UTC
Subject: Bug 20973

CVSROOT:	/cvs/gcc
Module name:	gcc
Branch: 	gcc-4_0-branch
Changes by:	matz@gcc.gnu.org	2005-04-22 17:30:21

Modified files:
	gcc            : ChangeLog reload.c 

Log message:
	PR middle-end/20973
	* reload.c (push_reload, find_dummy_reload): Check for uninitialized


Comment 16 Andrew Pinski 2005-04-22 17:58:51 UTC
Comment 17 Paul Schlie 2005-04-22 18:01:17 UTC
(In reply to comment #15)
> Subject: Bug 20973
> Branch:         gcc-4_0-branch
> Changes by:     matz@gcc.gnu.org        2005-04-22 17:30:21
> Modified files:
>         gcc            : ChangeLog reload.c 

Is 4.1 intended to have the same behavior as well?
Comment 18 Andrew Pinski 2005-04-22 18:03:11 UTC
(In reply to comment #17)
> Is 4.1 intended to have the same behavior as well?
the mainline was fixed already except it did not show up here: