This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH,RFC] Disallow reordering of x87 insns while scheduling
- From: Roger Sayle <roger at eyesopen dot com>
- To: "Vladimir N. Makarov" <vmakarov at redhat dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Mon, 18 Apr 2005 09:04:07 -0600 (MDT)
On Mon, 18 Apr 2005, Vladimir N. Makarov wrote:
> I think the most important platform is Pentium-M right now. So I
> ran SPECFP95 on my 1.6Ghz Pentium-M notebook (sorry SPECFp2000 is too
> much for the notebook).
Cool! Many thanks for SPEC testing.
> I have mixed feeling with this patch. On one hand, it generates more
> compact code (see data below). On the other hand, the compiler runs
> slower with this patch.
It certainly isn't clear cut, and I'm shocked by the huge performance
swings, both positive with fpppp and negative with wave5. For Pentium-M
my thinking was closer to RTH's: that there shouldn't be a large
performance impact, as FXCH should be almost free. Clearly there's
some important effect here that GCC's not taking into account.
I'm also a little surprised by the compile-time impact. My changes
should have reduced the number of dependencies seen by the scheduler;
no longer do FP insns have multiple dependencies on their source FP
insns, but instead just one REG_DEP_TRUE on the immediately preceding
x87 insn. It might be worth comparing profiles, because it's almost
certainly not the four or five lines added by the patch that are taking
the extra cycles.
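
To make the dependency-count argument above concrete, here is a toy
model in C. It is only a sketch and not code from the patch: struct
toy_insn, count_operand_deps and count_chained_deps are hypothetical
names, and the model assumes each x87 insn has at most two FP source
operands, each produced by an earlier insn in the same sequence.

```c
#include <stddef.h>

/* Hypothetical toy instruction, NOT GCC's rtx representation.
   src[k] is the index of the earlier insn that produces the k-th
   FP source operand, or -1 if that operand slot is unused.  */
struct toy_insn {
    int src[2];
};

/* Edges when every insn depends on each distinct producer of its
   FP source operands (operand-based dependencies).  */
size_t count_operand_deps(const struct toy_insn *insns, size_t n)
{
    size_t edges = 0;
    for (size_t i = 0; i < n; i++) {
        if (insns[i].src[0] >= 0)
            edges++;
        /* Count the second producer only if it differs from the first.  */
        if (insns[i].src[1] >= 0 && insns[i].src[1] != insns[i].src[0])
            edges++;
    }
    return edges;
}

/* Edges when every insn after the first carries a single true
   dependence on the immediately preceding x87 insn (a chain).  */
size_t count_chained_deps(size_t n)
{
    return n > 0 ? n - 1 : 0;
}
```

For a four-insn sequence of two loads, an add of both loads, and a
multiply of the sum by the first load, the operand-based count is 4
while the chained count is 3. The chain also fixes a total order, so
in this model the scheduler has nothing left to reorder among the
x87 insns. Note the chain is not smaller for every sequence: a run of
independent loads has no operand-based edges at all.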
> I think you are working in a right direction trying to solve problem
> of separate passes in register allocation, insn scheduling and
> reg-stack. But this simple solution is probably not right one.
I won't disagree too much. But as GCC's scheduler guru, have you a
proposal for how this should be tackled? I'm not sure I yet have
a full enough understanding of the problem to rationalize your
results. Have you any thoughts about my proposal to move reg-stack
earlier and have it perform some scheduling which is preserved by
DFA?
Would you be opposed to adding this code with an optimize_size
check, if we can't find a better solution to the whole problem?
Thanks again for benchmarking. I'm sure this has now piqued your
curiosity, as it has mine. I'm wondering how other scheduled
CPUs, such as the Athlon, are affected by this. I'll investigate
the mysterious wave5 code growth phenomenon.
Roger