This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
PR 11319 analyzed (Re: Patch fixing 3.3 bug PR 9745 and PR 10021)
On Friday 11 July 2003 08:55, Jim Wilson wrote:
> On Thu, 2003-07-10 at 09:18, Mark Mitchell wrote:
> > I concur -- Jim, please apply the patch and close those PRs which are
> > indeed fixed.
>
> I've added the patch to mainline and the 3.3 branch.
>
> This fixes 9745, but not 10021 nor 11319. I will keep plugging away on
> this as time permits, trying to find patches for the other two that are
> acceptable to everyone.
I looked at what goes wrong with PR 11319. In 11319 we have the following
short code fragment miscompiled:
for (i = 0; i < 8; i += 8)
{
cr = 0x13;
for (j = 0; j < 8; j++)
tr[j] = cr;
*x = tr[0];
}
in RTLed C this looks essentially like this:
for (i = 0; i < 8; i += 8)
{
cr = 0x13;
for (j = 0; j < 8; j++)
{
reg123 = cr;
tr[j] = reg123; // insn 73
}
reg138 = tr[0]; //insn 86
*x = reg138;
}
The INSN for tr[j] = reg123 looks like this:
(insn 73 71 75 7 0x3027e300 (set (mem/s:SI (plus:SI (reg:SI 136)
(const_int 32 [0x20])) [4 tr S4 A32])
(reg/v:SI 123)) 300 {*movsi_internal1} (nil)
(expr_list:REG_EQUAL (const_int 19 [0x13])
(nil)))
The INSN for reg138 = tr[0] looks like this:
(insn 86 126 87 8 0x3027e300 (set (reg:SI 138)
(mem/s:SI (plus:SI (reg/f:SI 31 r31)
(const_int 40 [0x28])) [4 tr+0 S4 A128])) 300
{*movsi_internal1} (nil)
(nil))
Now what happens is that load_mems() puts insn 86 on the movables list cause
true_dependence() says that this insn doesn't conflict with insn 73. Lateron
move_movables() then creates this code:
for (i = 0; i < 8; i += 8)
{
cr = 0x13;
reg138 = tr[0]; //insn 86
for (j = 0; j < 8; j++)
{
reg123 = cr;
tr[j] = reg123; // insn 73
}
*x = reg138;
}
Certainly this doesn't set *x to the expected value.
This testcase quite nicely shows why Dale's "hackish" (why?) patch cured it.
Dale's patch changes true_dependence() to look at the MEM_EXPRs of the MEMs
above (what shows up as [4 tr S4 A32] and [4 tr+0 S4 A128] in the above RTL)
and decides that they describe the same "object" and so the accesses
conflict.
This raised a few questions for me:
- why is Dale's patch "hackish", except for not curing all possible cases?
- why does move_movables() fiddle with single insns at all? Shouldn't it leave
that to the scheduler? (note that the insn isn't hoisted out of the loop)
- why does it move the insn to the beginning of the loop, even across a inner
loop?
- why does load_mems() use true_depedence() even if -fno-strict-aliasing?
Currently I'm leaning to apply Dale's patch for 3.3.1, and the testcase to
both branch and mainline.
What do you think?
Franz.