This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Patch fixing 3.3 bug PR 9745 and PR 10021

Sorry about the delays.  Too many distractions over the holiday weekend
here in the US, including some unfortunate networking/mail problems. 
And this problem has gotten a lot more complicated than I initially

I have checked in the non-controversial part of the patch, which was the
emit-rtl.c set_mem_attributes_minus_bitpos part.  After I checked it in,
I noticed that you asked me only to check it in only if it was known to
fix any existing 3.3.1 PRs.  Actually, I don't know that for sure, I
haven't checked that.  I know that the patch does something useful, and
I don't mind taking some heat if I did something wrong, and I want to
make forward progress here, so I am leaving it in for now.  It should
stay on mainline, but it could be removed from the 3.3.1 branch if you
don't want it there.

This thread has pointed out that my patch causes a performance
regression on SPEC.  I don't have a copy of SPEC, so I can't investigate
this.  Also, I don't have access to hardware for the affected targets
(powerpc, m68k), and running SPEC on a simulator is impractical.  I
don't recall seeing anyone post specific info about the performance
regressions, just that they exist.

There has been some discussion of whether we should fix the bug or not. 
I think that we should be fixing the bug.  Because of the nature of the
problem, an aliasing problem with loops, it will be hard for users to
understand when they will be affected by it.  Exposing the performance
regression will be inconvenient for some people, but it at least gives
those people incentive to fix the problem.  As long as we work around
the problem, there is no incentive to fix the problem correctly.

There is clearly a problem with alias.c, since one part assumes that a
base value can not be a REG, and another part explicitly sets a base
value to a REG.  I don't see how this code in alias.c can work, and this
bug has apparently been here since the beginning, since I can see the
same code in gcc-2.95.3.  I choose to fix find_base_value, an
alternative would be to change the assumptions in base_alias_check. 
This might be more desirable if there is less of an effect on
performance, but I unfortunately can't check this.

I haven't mentioned this earlier, and I haven't seen anyone else mention
it, but for PR 10021, and alias failure is occurring while loop is
running.  loop has a function loop_regs_update which tries to update
alias info for regs created/modified by loop.  This code is rather
simplistic, it just passes the SET_SRC of new instructions to alias.c. 
However, in most cases, these are registers created for reduced
bivs/givs, in which case we can do much better by passing through the
base address of the biv that the register is derived from.  This should
give us better alias info, and may help fix the performance problem, but
I can't check that.

There is also the issue that if we happen to call loop_regs_update for a
register for which we already have base value info, then we will end up
clobbering it.  I don't know for sure that this can happen, but this
code does look suspect to me.  Internal to alias.c we use record_set
which correctly deals with multiple sets to a register.  However, when
called from loop, we use record_base_value which has no support for
handling multiple sets.  This we may be losing alias info here by
overwriting useful info with less useful info, or we may be accidentally
putting in bad info.  Even if there is no previous info to overwrite, we
may still create problems if we have an insn sequence which writes to
the same register more than once.  This is certainly a possibility if
the sequence contains a multiply that was expanded as a shift/add

There is a related issue here, it may be that the only time when a REG
base value is a problem is when that REG came from loop_regs_update.  In
which case fixes to this function may eliminate the need for my
find_base_value patch.  I haven't investigated this yet, but I think
this is definitely worth looking into.

Looking at PR 9745, I see a comment from Janis Johnson pointing out that
the problem first appeared when David Miller eliminated SEQUENCE rtl. 
Looking at loop_regs_update in loop.c, I see an obvious problem.  We are
inserting the sequence first, and then scanning it.  This worked fine
when the sequence was encapsulated inside a SEQUENCE.  However, without
the SEQUENCE, a scan of the sequence insn chain after it has been
inserted won't end until we reach the end of the function.  Stepping
through the function in gdb shows that this is indeed happening. Since
loop_regs_update doesn't handle multiple sets, this will really wreck
havoc with the alias info.  Obviously, we need to scan the sequence
first before we insert it.  I have attached a patch for that.  I don't
know if this will have any effect on the performance problem, it is
possible, but I can't check that.  If I try this patch with mainline gcc
sources against the PR 9745 examples, then I do get a big effect on the
output.  Too big of an effect to easily tell if the bugs are fixed by
this patch, but it looks hopeful.

Dale's patch doesn't fix the problem, and shouldn't be necessary if the
alias info is correct.  I suspect my latest patch removes the need for
Dale's patch here.

I am concerned about my ability to work on this problem, since I don't
have access to the hardware involved, and I am finding it very time
consuming.  Time I spend on this problem is time I can't spend on
reviewing patches sent to gcc-patches.

I think the next step here is testing of my latest patch.  We need
performance testing which I can't do.  We also need some execution
testing to see if it fixes the bugs 9745 and 10021.  I could do 9745 by
building an uberbaum cross compiler, but that will take a lot of time. 
It would be faster if someone else could do this for me.  If this patch
works, then I think this is safe for the 3.3.1 branch and then we can
close the two PRs.  We still have long term issues with
loop_regs_update/record_base_value which will effect performance and
possibly also correctness.


Attachment: tmp.file.500
Description: Text document

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]