| Summary: | [4.0 Regression] kdelibs (khtml) miscompiled by reload | ||
|---|---|---|---|
| Product: | gcc | Reporter: | Michael Matz <matz> |
| Component: | middle-end | Assignee: | Not yet assigned to anyone <unassigned> |
| Status: | RESOLVED FIXED | ||
| Severity: | normal | CC: | bernds, gcc-bugs, ismail, jakub, mlists, mueller, pawel_sikora, than |
| Priority: | P2 | Keywords: | wrong-code |
| Version: | 4.0.0 | ||
| Target Milestone: | 4.0.1 | ||
| Host: | i386-linux | Target: | |
| Build: | Known to work: | ||
| Known to fail: | Last reconfirmed: | 2005-04-12 18:01:07 | |
| Bug Depends on: | |||
| Bug Blocks: | 20949 | ||
| Attachments: |
the preprocessed source showing the problem
Patch for this problem |
||
|
Description
Michael Matz
2005-04-12 17:28:56 UTC
Created attachment 8610 [details]
the preprocessed source showing the problem
Reload, wow, not that unexcepted really. Here some mails we exchanged:
Adding something complex seems to change how
things are allocated or REG_DEAD notes distributed or something like this.
If we add 'kdDebug( 6080 ) << "applying property " << id << endl;' at the
very start, it seems to work. If we comment it out, and add a
+ printf ("apply %d id %d\n", apply, id);
if(!apply) return;
switch(id)
{
case CSS_PROP_MAX_WIDTH:
at line 2834, one can see where it goes wrong. The printf will show id
being 0 every time. In the debugger printing 'id' (which refers to the
stack position) shows the correct value. At that point the compiler has
placed id into %esi, which indeed contains 0 here.
This $esi is loaded at the beginning of the function with the argument,
and the first time it changes (when setting a watchpoint on $esi in gdb)
is at line 2823:
l = Length(primitiveValue->computeLength(style,
paintDeviceMetrics), Fixed,
primitiveValue->isQuirkValue());
This corresponds to this insn:
0x4028a449 : and $0xf0000000,%esi
After that, until the printf $esi is not correctly loaded back from the
parameter stack slot, and remains 0. The above insn corresponds to this
RTL:
(insn 12371 55063 12368
/usr/src/packages/BUILD/kdelibs-3.4.0/khtml/misc/khtmllayout.h:49 (parallel [
(set (reg:SI 4 si [orig:3843 D.83927 ] [3843])
(and:SI (reg:SI 4 si [orig:3843 D.83927 ] [3843])
(const_int -268435456 [0xf0000000])))
(clobber (reg:CC 17 flags))
]) 200 {*andsi_1} (nil)
(I think, there are multiple insns masking the high 4 bit out of %esi, and
I've not yet traced through all paths). So somehow I think GCC put
something into %esi while it still was live. Wasn't there some error in
placing REG_DEAD notes just a few days ago?
Another mail:
I think the dust settles a bit. We have this situation in
.t69.final_cleanup (excerpt from applyRule):
struct Length D.83927;
<L2904>:;
D.83927.D.21947.l.value = (int) primitiveValue->m_value.num;
D.83927.D.21947.l.type = 1;
D.83927.D.21947.l.quirk = 0;
l = D.83927;
apply = 1;
<L2908>:;
printf (&"apply %d id %d\n"[0], (int) apply, id);
if (apply == 0) goto <L9089>; else goto <L2910>;
The above is the only use of D.83927, so it first is initialized
memberwise, and then copied as a whole. This corresponds to these insns
from .22.lreg (not the full above sequence, but only the interesting
part). Reg 4101 is 'l' and is constructed piecewise. Reg 4197 is 'id'.
;; Start of basic block 1449, registers live: 3 [bx] 6 [bp] 7 [sp] 16
[argp] 18 [fpsr] 20 [frame] 3669 3843 4196 4197
....
(insn:HI 12371 12370 12372 1449
/usr/src/packages/BUILD/kdelibs-3.4.0/khtml/misc/khtmllayout.h:49 (parallel
[
(set (reg/v:SI 4101 [ l ])
(and:SI (reg:SI 3843 [ D.83927 ])
(const_int -268435456 [0xf0000000])))
(clobber (reg:CC 17 flags))
]) 200 {*andsi_1} (nil)
(expr_list:REG_UNUSED (reg:CC 17 flags)
(expr_list:REG_DEAD (reg:SI 3843 [ D.83927 ])
(expr_list:REG_UNUSED (reg:CC 17 flags)
(nil)))))
Note how reg 3843 dies here. Additionally it should be mentioned that
it's never initialized (it's live up through the first insn), as it
probably represents the yet-uninitialized part of a bitfield in
construction. After the above follows some more bitmanipulation on reg
4101, setting finally all fields, then the block ends:
;; End of basic block 1449, registers live:
3 [bx] 6 [bp] 7 [sp] 16 [argp] 20 [frame] 3668 4101 4196 4197
;; Start of basic block 1450, registers live: 3 [bx] 6 [bp] 7 [sp] 16
[argp] 20 [frame] 3668 4101 4196 4197
...
(insn:HI 12382 12381 12383 1450
/usr/src/packages/BUILD/kdelibs-3.4.0/khtml/css/cssstyleselector.cpp:2834
(set (mem:SI (plus:SI (reg/f:SI 7 sp)
(const_int 8 [0x8])) [0 S4 A32])
(reg/v:SI 4197 [ id ])) 35 {*movsi_1} (nil)
(nil))
So, and this is the setup of the arguments to the printf. So, we know
that 4197 is live throughout block 1449, and correctly conflicts with reg
4101 (see below), which is the only register set here, by the bitmasking
insn 12371.
Now we look at .23.greg dump, first the interesting conflicts:
;; 3843 conflicts:
;; 4101 conflicts: ... 4197 ...
;; 4197 conflicts: ... 4101 ...
So, all is well. But then reload goes and breaks it:
...
Spilling for insn 12371.
Spilling for insn 12371.
Reloads for insn # 12371
Reload 0: reload_in (SI) = (reg:SI 4 si [orig:3843 D.83927 ] [3843])
reload_out (SI) = (reg/v:SI 4101 [ l ])
GENERAL_REGS, RELOAD_OTHER (opnum = 0)
reload_in_reg: (reg:SI 4 si [orig:3843 D.83927 ] [3843])
reload_out_reg: (reg/v:SI 4101 [ l ])
reload_reg_rtx: (reg:SI 4 si [orig:3843 D.83927 ] [3843])
...
Yes, right, reload needs to fix up the insn to make both registers the
same and it wants to use $esi as reg for reg3843. Problem is, it wants
also to alloc reg4197 to $esi:
...
3826 in 0 3843 in 4 3844 in 4 3846 in 4 3847 in 1 3912 in 0
4173 in 5 4180 in 4 4197 in 4 4198 in 5 4199 in 0 4202 in 0
...
(4101 is not allocated by the way because reload is able to optimize it
out, and instead uses 3843 in it's place). So this then results in this
insn:
(insn:HI 12371 12370 43006
1448 /usr/src/packages/BUILD/kdelibs-3.4.0/khtml/mis
(set (reg:SI 4 si [orig:3843 D.83927 ] [3843])
(and:SI (reg:SI 4 si [orig:3843 D.83927 ] [3843])
(const_int -268435456 [0xf0000000])))
(clobber (reg:CC 17 flags))
]) 200 {*andsi_1} (nil)
at which point %esi is lost. I think the problem is, that global-alloc
uses %esi for 4197 and 3843, which is okay, as 3843 is uninitialized and
hence doesn't conflict with 4197. But reload then goes on and actually
uses this uninitialized register for _setting_. Basically it transforms
this insn:
rA <- op (rB)
where rA and rB need to match into
rB <- op (rB)
[rA <- rB]
where the latter move is not emitted because rA could be optimized away
(rA is our 4101).
This introduces a new setting of rB, which is a problem if rB was
uninitialized. Normally reload should have generated
rA <- rB
rA <- op (rA)
For those who want a quick workaround, changing khtmllayout.h to:
- Length(LengthType t) { l.value = 0; l.type = t; l.quirk = 0; }
+ Length(LengthType t) { _length = 0; l.type = t; }
Length(int v, LengthType t, bool q=false)
- { l.value = v; l.type = t; l.quirk = q; }
+ { _length = 0; l.value = v; l.type = t; l.quirk = q; }
makes the bug go away. The first changed line actually might result in
better generated code with the current GCC's deficiencies in bitfield handling.
I guess we need a bitfield optimization pass near the end of tree passes that
will figure out bitfield accesses to the same underlying field and change that
into arithmetics on a union or something. Don't even look at assembly for
that RTL optimizers can't figure out, but a tree optimizer could turn that
into 2 instructions:
struct { int a:2,b:2,c:2,d:2,e:2,f:2,g:2,h:2,i:2,j:2,k:2,l:2,m:2,n:2;} s;
void foo (void)
{
s.a |= 1; s.b &= 1; s.c |= 1; s.d &= 1; s.e |= 1; s.f &= 1; s.g |= 1;
s.h &= 1; s.i |= 1; s.j &= 1; s.k |= 1; s.l &= 1; s.m |= 1; s.n &= 1;
}
The problem is in reload.c:find_dummy_reload. It tries to use the input reg
as reload register for an in-out reload and has certain conditions when it
can't do so:
/* Consider using IN if OUT was not acceptable
or if OUT dies in this insn (like the quotient in a divmod insn).
We can't use IN unless it is dies in this insn,
which means we must know accurately which hard regs are live.
Also, the result can't go in IN if IN is used within OUT,
or if OUT is an earlyclobber and IN appears elsewhere in the insn. */
if (hard_regs_live_known
&& REG_P (in)
&& REGNO (in) < FIRST_PSEUDO_REGISTER
&& (value == 0
|| find_reg_note (this_insn, REG_UNUSED, real_out))
&& find_reg_note (this_insn, REG_DEAD, real_in)
&& !fixed_regs[REGNO (in)]
&& HARD_REGNO_MODE_OK (REGNO (in),
But this doesn't check if IN is used uninitialized. In that case it also
can't be used. It's not immediately clear to me how to check for this,
as nowhere is it noted that this or that pseudo is actually uninitialized
and only therefore got a register by global.c. One could look at the
global_live_at_start of the first block, if it mentions the original pseudo
number of the IN operand.
The problem of course being that at this point the hardregs already are
substituted into the REG expressions. So one would have to trust the
original regnos noted there. And it's not clear that this is the only
place in reload which is confused by hardregs corresponding to uninitialized
pseudos.
I have a patch for reload, which fixes the bug, when looking at the dumps. At least now find_reg is used for the insn in question, which also evicts pseudos using the same reg as the chosen final reg_rtx. I have only tested this by copying around the generated .s file, not by building a new compiler. (Hence it's also obviously not regression or bootstrap tested, but I'll do this now.) Created attachment 8614 [details]
Patch for this problem
will test the patch with a KDE recompilation. This patch fixes the regressions in khtml for us. Michael, have you run the GCC testsuite with your patch? If not, please run on several platforms and confirm that you get no regressions. From http://gcc.gnu.org/ml/gcc-patches/2005-04/msg01508.html where I submitted the patch: the problem in khtml. I've bootstrapped it with gcc 4.0 on i686,x86_64,ppc,ppc64,ia64,s390 (s390x was breaking for different reasons), all languages (with Ada ;) ). There were no regressions (in fact some fixed Ada testcases, but I'm not sure if they were real). Note the last sentence. The reload patch seems reasonable, though not very satisfying. It is only disabling some optimizations within reload, and hence should be somewhat safe. Though it is still a reload patch, and all reload patches are dangerous. There is a hunk of code added twice, but only one of the two instances is commented. The other one should have a comment also. Another way to look at this problem is to point out that we have an uninitialized register, and claim that this is wrong. This is actually dangerous on Itanium. Suppose we call a hand-written glibc routine which leaves a few NaT bits lying around. Suppose we then assign the uninitialized pseudo to one of these registers with a NaT bit set. When we read the register, we get an unhandled NaT consumption fault, and the program dies. This isn't hypothetical. This actually happened once in the early days of the Itanium port. It is rather hard to reproduce, as gcc doesn't have speculation support yet, so it is rare for a register to have the NaT bit set. There are only a few hand written library routines that can set NaT bits. This will be a more serious problem once we do have speculation support. The old code for structures was careful to always initialize the structure before doing any piecemeal initialization, if the structure was in a register. This way, we never had any uses of an unitialized register. This is done in store_constructor for instance. Unfortunately, this isn't true now that we have tree-ssa, because structure references are decomposed, and there is no longer any attempt to initialize the whole structure if it is allocated to a register. This will have to be fixed for Itanium. All we need is an instruction to set the uninitialized register to zero immediately before it is used. It will then be optimized away by combine, which will convert (set reg 0) (set reg (ior reg constant)) into (set reg constant) Since we must eliminate the uninitialized register uses anyways, we perhaps don't have to fix the reload issue. It is probably harmless to make both fixes though. The reload patch should never trigger once the uninitialized register problem is fixed. I agree with most of what Jim said. Except for the part that we maybe don't have to fix the reload issue, when we fix usage of the uninitialized register for piecewise struct initialization. The latter will fix this particular instance of the problem, but reload still would be confused by uninitialized registers. I think they can happen also for other reasons, like the user having some uninitialized variables, which perhaps never are used at runtime, but still could result in reload miscompiling the program the same way as seen here. As reload bugs are particular hard to track down, I really think we should fix this one for good, that it doesn't come back biting us in the future. If agreed I will cleanup the patch to comment both locations (I don't think it would deserve an own subfunction). Subject: Bug 20973 CVSROOT: /cvs/gcc Module name: gcc Branch: gcc-4_0-branch Changes by: matz@gcc.gnu.org 2005-04-22 17:30:21 Modified files: gcc : ChangeLog reload.c Log message: PR middle-end/20973 * reload.c (push_reload, find_dummy_reload): Check for uninitialized pseudos. Patches: http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-4_0-branch&r1=2.7592.2.177&r2=2.7592.2.178 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/reload.c.diff?cvsroot=gcc&only_with_tag=gcc-4_0-branch&r1=1.268&r2=1.268.2.1 Fixed. (In reply to comment #15) > Subject: Bug 20973 > Branch: gcc-4_0-branch > Changes by: matz@gcc.gnu.org 2005-04-22 17:30:21 > Modified files: > gcc : ChangeLog reload.c Is 4.1 intended to have the same behavior as well? (In reply to comment #17) > Is 4.1 intended to have the same behavior as well? the mainline was fixed already except it did not show up here: http://gcc.gnu.org/ml/gcc-cvs/2005-04/msg01085.html |