gcc 4 (about RC1) miscompiles khtml, in fact something in CSS, which basically leads to all websites being misrendered. I can't easily reduce the testcase, but have applied the whole preprocessed source of css/cssstyleselector.ii. It is to be compiled with g++ -O2 -fPIC -march=i586 -mtune=i686 -fno-exceptions. A more detailed analysis will follow, as we've found out some things already.
Created attachment 8610 [details] the preprocessed source showing the problem
Reload, wow, not that unexcepted really.
Here some mails we exchanged: Adding something complex seems to change how things are allocated or REG_DEAD notes distributed or something like this. If we add 'kdDebug( 6080 ) << "applying property " << id << endl;' at the very start, it seems to work. If we comment it out, and add a + printf ("apply %d id %d\n", apply, id); if(!apply) return; switch(id) { case CSS_PROP_MAX_WIDTH: at line 2834, one can see where it goes wrong. The printf will show id being 0 every time. In the debugger printing 'id' (which refers to the stack position) shows the correct value. At that point the compiler has placed id into %esi, which indeed contains 0 here. This $esi is loaded at the beginning of the function with the argument, and the first time it changes (when setting a watchpoint on $esi in gdb) is at line 2823: l = Length(primitiveValue->computeLength(style, paintDeviceMetrics), Fixed, primitiveValue->isQuirkValue()); This corresponds to this insn: 0x4028a449 : and $0xf0000000,%esi After that, until the printf $esi is not correctly loaded back from the parameter stack slot, and remains 0. The above insn corresponds to this RTL: (insn 12371 55063 12368 /usr/src/packages/BUILD/kdelibs-3.4.0/khtml/misc/khtmllayout.h:49 (parallel [ (set (reg:SI 4 si [orig:3843 D.83927 ] [3843]) (and:SI (reg:SI 4 si [orig:3843 D.83927 ] [3843]) (const_int -268435456 [0xf0000000]))) (clobber (reg:CC 17 flags)) ]) 200 {*andsi_1} (nil) (I think, there are multiple insns masking the high 4 bit out of %esi, and I've not yet traced through all paths). So somehow I think GCC put something into %esi while it still was live. Wasn't there some error in placing REG_DEAD notes just a few days ago?
Another mail: I think the dust settles a bit. We have this situation in .t69.final_cleanup (excerpt from applyRule): struct Length D.83927; <L2904>:; D.83927.D.21947.l.value = (int) primitiveValue->m_value.num; D.83927.D.21947.l.type = 1; D.83927.D.21947.l.quirk = 0; l = D.83927; apply = 1; <L2908>:; printf (&"apply %d id %d\n"[0], (int) apply, id); if (apply == 0) goto <L9089>; else goto <L2910>; The above is the only use of D.83927, so it first is initialized memberwise, and then copied as a whole. This corresponds to these insns from .22.lreg (not the full above sequence, but only the interesting part). Reg 4101 is 'l' and is constructed piecewise. Reg 4197 is 'id'. ;; Start of basic block 1449, registers live: 3 [bx] 6 [bp] 7 [sp] 16 [argp] 18 [fpsr] 20 [frame] 3669 3843 4196 4197 .... (insn:HI 12371 12370 12372 1449 /usr/src/packages/BUILD/kdelibs-3.4.0/khtml/misc/khtmllayout.h:49 (parallel [ (set (reg/v:SI 4101 [ l ]) (and:SI (reg:SI 3843 [ D.83927 ]) (const_int -268435456 [0xf0000000]))) (clobber (reg:CC 17 flags)) ]) 200 {*andsi_1} (nil) (expr_list:REG_UNUSED (reg:CC 17 flags) (expr_list:REG_DEAD (reg:SI 3843 [ D.83927 ]) (expr_list:REG_UNUSED (reg:CC 17 flags) (nil))))) Note how reg 3843 dies here. Additionally it should be mentioned that it's never initialized (it's live up through the first insn), as it probably represents the yet-uninitialized part of a bitfield in construction. After the above follows some more bitmanipulation on reg 4101, setting finally all fields, then the block ends: ;; End of basic block 1449, registers live: 3 [bx] 6 [bp] 7 [sp] 16 [argp] 20 [frame] 3668 4101 4196 4197 ;; Start of basic block 1450, registers live: 3 [bx] 6 [bp] 7 [sp] 16 [argp] 20 [frame] 3668 4101 4196 4197 ... (insn:HI 12382 12381 12383 1450 /usr/src/packages/BUILD/kdelibs-3.4.0/khtml/css/cssstyleselector.cpp:2834 (set (mem:SI (plus:SI (reg/f:SI 7 sp) (const_int 8 [0x8])) [0 S4 A32]) (reg/v:SI 4197 [ id ])) 35 {*movsi_1} (nil) (nil)) So, and this is the setup of the arguments to the printf. So, we know that 4197 is live throughout block 1449, and correctly conflicts with reg 4101 (see below), which is the only register set here, by the bitmasking insn 12371. Now we look at .23.greg dump, first the interesting conflicts: ;; 3843 conflicts: ;; 4101 conflicts: ... 4197 ... ;; 4197 conflicts: ... 4101 ... So, all is well. But then reload goes and breaks it: ... Spilling for insn 12371. Spilling for insn 12371. Reloads for insn # 12371 Reload 0: reload_in (SI) = (reg:SI 4 si [orig:3843 D.83927 ] [3843]) reload_out (SI) = (reg/v:SI 4101 [ l ]) GENERAL_REGS, RELOAD_OTHER (opnum = 0) reload_in_reg: (reg:SI 4 si [orig:3843 D.83927 ] [3843]) reload_out_reg: (reg/v:SI 4101 [ l ]) reload_reg_rtx: (reg:SI 4 si [orig:3843 D.83927 ] [3843]) ... Yes, right, reload needs to fix up the insn to make both registers the same and it wants to use $esi as reg for reg3843. Problem is, it wants also to alloc reg4197 to $esi: ... 3826 in 0 3843 in 4 3844 in 4 3846 in 4 3847 in 1 3912 in 0 4173 in 5 4180 in 4 4197 in 4 4198 in 5 4199 in 0 4202 in 0 ... (4101 is not allocated by the way because reload is able to optimize it out, and instead uses 3843 in it's place). So this then results in this insn: (insn:HI 12371 12370 43006 1448 /usr/src/packages/BUILD/kdelibs-3.4.0/khtml/mis (set (reg:SI 4 si [orig:3843 D.83927 ] [3843]) (and:SI (reg:SI 4 si [orig:3843 D.83927 ] [3843]) (const_int -268435456 [0xf0000000]))) (clobber (reg:CC 17 flags)) ]) 200 {*andsi_1} (nil) at which point %esi is lost. I think the problem is, that global-alloc uses %esi for 4197 and 3843, which is okay, as 3843 is uninitialized and hence doesn't conflict with 4197. But reload then goes on and actually uses this uninitialized register for _setting_. Basically it transforms this insn: rA <- op (rB) where rA and rB need to match into rB <- op (rB) [rA <- rB] where the latter move is not emitted because rA could be optimized away (rA is our 4101). This introduces a new setting of rB, which is a problem if rB was uninitialized. Normally reload should have generated rA <- rB rA <- op (rA)
For those who want a quick workaround, changing khtmllayout.h to: - Length(LengthType t) { l.value = 0; l.type = t; l.quirk = 0; } + Length(LengthType t) { _length = 0; l.type = t; } Length(int v, LengthType t, bool q=false) - { l.value = v; l.type = t; l.quirk = q; } + { _length = 0; l.value = v; l.type = t; l.quirk = q; } makes the bug go away. The first changed line actually might result in better generated code with the current GCC's deficiencies in bitfield handling. I guess we need a bitfield optimization pass near the end of tree passes that will figure out bitfield accesses to the same underlying field and change that into arithmetics on a union or something. Don't even look at assembly for that RTL optimizers can't figure out, but a tree optimizer could turn that into 2 instructions: struct { int a:2,b:2,c:2,d:2,e:2,f:2,g:2,h:2,i:2,j:2,k:2,l:2,m:2,n:2;} s; void foo (void) { s.a |= 1; s.b &= 1; s.c |= 1; s.d &= 1; s.e |= 1; s.f &= 1; s.g |= 1; s.h &= 1; s.i |= 1; s.j &= 1; s.k |= 1; s.l &= 1; s.m |= 1; s.n &= 1; }
The problem is in reload.c:find_dummy_reload. It tries to use the input reg as reload register for an in-out reload and has certain conditions when it can't do so: /* Consider using IN if OUT was not acceptable or if OUT dies in this insn (like the quotient in a divmod insn). We can't use IN unless it is dies in this insn, which means we must know accurately which hard regs are live. Also, the result can't go in IN if IN is used within OUT, or if OUT is an earlyclobber and IN appears elsewhere in the insn. */ if (hard_regs_live_known && REG_P (in) && REGNO (in) < FIRST_PSEUDO_REGISTER && (value == 0 || find_reg_note (this_insn, REG_UNUSED, real_out)) && find_reg_note (this_insn, REG_DEAD, real_in) && !fixed_regs[REGNO (in)] && HARD_REGNO_MODE_OK (REGNO (in), But this doesn't check if IN is used uninitialized. In that case it also can't be used. It's not immediately clear to me how to check for this, as nowhere is it noted that this or that pseudo is actually uninitialized and only therefore got a register by global.c. One could look at the global_live_at_start of the first block, if it mentions the original pseudo number of the IN operand. The problem of course being that at this point the hardregs already are substituted into the REG expressions. So one would have to trust the original regnos noted there. And it's not clear that this is the only place in reload which is confused by hardregs corresponding to uninitialized pseudos.
I have a patch for reload, which fixes the bug, when looking at the dumps. At least now find_reg is used for the insn in question, which also evicts pseudos using the same reg as the chosen final reg_rtx. I have only tested this by copying around the generated .s file, not by building a new compiler. (Hence it's also obviously not regression or bootstrap tested, but I'll do this now.)
Created attachment 8614 [details] Patch for this problem
will test the patch with a KDE recompilation.
This patch fixes the regressions in khtml for us.
Michael, have you run the GCC testsuite with your patch? If not, please run on several platforms and confirm that you get no regressions.
From http://gcc.gnu.org/ml/gcc-patches/2005-04/msg01508.html where I submitted the patch: the problem in khtml. I've bootstrapped it with gcc 4.0 on i686,x86_64,ppc,ppc64,ia64,s390 (s390x was breaking for different reasons), all languages (with Ada ;) ). There were no regressions (in fact some fixed Ada testcases, but I'm not sure if they were real). Note the last sentence.
The reload patch seems reasonable, though not very satisfying. It is only disabling some optimizations within reload, and hence should be somewhat safe. Though it is still a reload patch, and all reload patches are dangerous. There is a hunk of code added twice, but only one of the two instances is commented. The other one should have a comment also. Another way to look at this problem is to point out that we have an uninitialized register, and claim that this is wrong. This is actually dangerous on Itanium. Suppose we call a hand-written glibc routine which leaves a few NaT bits lying around. Suppose we then assign the uninitialized pseudo to one of these registers with a NaT bit set. When we read the register, we get an unhandled NaT consumption fault, and the program dies. This isn't hypothetical. This actually happened once in the early days of the Itanium port. It is rather hard to reproduce, as gcc doesn't have speculation support yet, so it is rare for a register to have the NaT bit set. There are only a few hand written library routines that can set NaT bits. This will be a more serious problem once we do have speculation support. The old code for structures was careful to always initialize the structure before doing any piecemeal initialization, if the structure was in a register. This way, we never had any uses of an unitialized register. This is done in store_constructor for instance. Unfortunately, this isn't true now that we have tree-ssa, because structure references are decomposed, and there is no longer any attempt to initialize the whole structure if it is allocated to a register. This will have to be fixed for Itanium. All we need is an instruction to set the uninitialized register to zero immediately before it is used. It will then be optimized away by combine, which will convert (set reg 0) (set reg (ior reg constant)) into (set reg constant) Since we must eliminate the uninitialized register uses anyways, we perhaps don't have to fix the reload issue. It is probably harmless to make both fixes though. The reload patch should never trigger once the uninitialized register problem is fixed.
I agree with most of what Jim said. Except for the part that we maybe don't have to fix the reload issue, when we fix usage of the uninitialized register for piecewise struct initialization. The latter will fix this particular instance of the problem, but reload still would be confused by uninitialized registers. I think they can happen also for other reasons, like the user having some uninitialized variables, which perhaps never are used at runtime, but still could result in reload miscompiling the program the same way as seen here. As reload bugs are particular hard to track down, I really think we should fix this one for good, that it doesn't come back biting us in the future. If agreed I will cleanup the patch to comment both locations (I don't think it would deserve an own subfunction).
Subject: Bug 20973 CVSROOT: /cvs/gcc Module name: gcc Branch: gcc-4_0-branch Changes by: matz@gcc.gnu.org 2005-04-22 17:30:21 Modified files: gcc : ChangeLog reload.c Log message: PR middle-end/20973 * reload.c (push_reload, find_dummy_reload): Check for uninitialized pseudos. Patches: http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-4_0-branch&r1=2.7592.2.177&r2=2.7592.2.178 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/reload.c.diff?cvsroot=gcc&only_with_tag=gcc-4_0-branch&r1=1.268&r2=1.268.2.1
Fixed.
(In reply to comment #15) > Subject: Bug 20973 > Branch: gcc-4_0-branch > Changes by: matz@gcc.gnu.org 2005-04-22 17:30:21 > Modified files: > gcc : ChangeLog reload.c Is 4.1 intended to have the same behavior as well?
(In reply to comment #17) > Is 4.1 intended to have the same behavior as well? the mainline was fixed already except it did not show up here: http://gcc.gnu.org/ml/gcc-cvs/2005-04/msg01085.html