This is the mail archive of the
mailing list for the GCC project.
Re: patch to enable LRA for ppc
- From: Vladimir Makarov <vmakarov at redhat dot com>
- To: Michael Meissner <meissner at linux dot vnet dot ibm dot com>, David Edelsohn <dje dot gcc at gmail dot com>, gcc-patches <gcc-patches at gcc dot gnu dot org>, "Bergner, Peter" <bergner at vnet dot ibm dot com>
- Date: Mon, 21 Oct 2013 22:42:07 -0400
- Subject: Re: patch to enable LRA for ppc
- References: <524DDB71 dot 6040703 at redhat dot com> <CAGWvny=AmXxjALad=L0=Lgzp+4k9-SKXsOZNAcWcumcsOiViJw at mail dot gmail dot com> <526495E8 dot 70701 at redhat dot com> <20131021155144 dot GA19634 at ibm-tiger dot the-meissners dot org>
On 13-10-21 11:51 AM, Michael Meissner wrote:
Sure, if you know of some LRA problems, it should not be on by default.
Moreover, if we still have the problems when releasing gcc4.9, I think
we should exclude any possibility for a user to use LRA for ppc. I
don't want to have GCC-4.9 users blaming LRA.
On Sun, Oct 20, 2013 at 10:48:08PM -0400, Vladimir Makarov wrote:
On 13-10-18 11:26 AM, David Edelsohn wrote:
On Thu, Oct 3, 2013 at 5:02 PM, Vladimir Makarov <email@example.com> wrote:
The following patch permits today's trunk to use LRA for ppc by default.
To switch it off, -mno-lra can be used.
The patch was bootstrapped on ppc64. The GCC testsuite does not show
regressions either (in comparison with reload). The change in rs6000.md
fixes an LRA failure on a recently added ppc test.
I have not forgotten this patch. We are trying to figure out the right
timeframe to make this change. The patch does affect performance --
both positively and negatively; most are in the noise but not all. And
there still are some SPEC benchmarks that fail to build with the
patch, at least in Mike's tests. And Mike is implementing some patches
to utilize reload to improve use of VSX registers, which would need to
be mirrored in LRA for the equivalent functionality.
Thanks for informing me, David.
I am ready to work on any LRA ppc issues once it is in the
trunk. It would be easier for me to work on LRA ppc if the patch is
committed to the trunk and of course LRA is used as non-default
I don't know what Mike is doing on reload to use VSX registers. I
guess it is usage of VSX regs as spilled locations for GENERAL regs
instead of memory. If so, it is about 2 days of work to add this
functionality to LRA (as it already has analogous functionality for
Intel processors, which gave a nice SPECFP2000 improvement for
them), and probably more work on resolving issues, especially as I
have no power8.
I would say let's add -mlra, but make the default OFF for the time being. We
can always switch the default later.
But adding LRA to PPC on the trunk (switched OFF by default) earlier
could help me a lot to work on the issues.
Vladimir, I thought I included you in the list when I gave status. The big
thing is several of the Spec 2006 benchmarks don't work in 32-bit mode, and I
get a lot of Fortran errors, again in 32-bit. I also saw some decimal floating
No, I did not see the message (or maybe I missed it). I need to check.
I completely understand. You are quite busy at this time, as I am, rushing
some stuff into gcc-4.9.
What I'm doing is adding secondary reload support so that up until reload time,
we can represent VSX addresses as reg+offset, and in secondary reload, create
the addition instructions to put the offset in a base register. I haven't made
any changes to the machine independent portions of the compiler. As long as
IRA uses the secondary reload interface, it should be ok. However, right now,
I need to focus most of my attention on getting the secondary reload support to
Ok. I guess LRA can be adapted to some new secondary_reload hook
returning two scratch registers.
One thing that I've asked for before, but to remind you, is I really, really
wish secondary reload could allocate two scratch registers if it is given an
insn that takes 4 arguments. Right now, I'm allocating a TFmode scratch, since
that gives 2 registers, but future changes will want TFmode to go into a single
vector register, and I will need to create another type, like V4DI that does
take 2 registers. The case that this is needed for is moving an item from GPRs
to VSX registers that takes 2 GPR registers, such as moving 128-bit items in
64-bit mode, or 64-bit items in 32-bit mode. I need two registers to do the
move into, and then I will do the combine operation.
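
[Editor's illustration, not part of the original message.] The two-scratch
move described above can be modeled in plain C as a rough sketch. Nothing
here is GCC code or a GCC hook: the type vreg_t and the functions
move_gpr_to_vreg, combine, and move_pair_to_vreg are hypothetical names
invented for this example. The sketch assumes a 128-bit value held in two
64-bit GPRs, where each half is first moved into its own scratch vector
register and the two scratches are then combined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 128-bit vector register as two 64-bit lanes. */
typedef struct { uint64_t lane[2]; } vreg_t;

/* Models a direct move of one GPR into lane 0 of a vector register. */
vreg_t move_gpr_to_vreg(uint64_t gpr) {
    vreg_t v = { { gpr, 0 } };
    return v;
}

/* Models the combine step: take lane 0 of each scratch register and
   pack them into a single 128-bit result. */
vreg_t combine(vreg_t hi, vreg_t lo) {
    vreg_t v = { { hi.lane[0], lo.lane[0] } };
    return v;
}

/* Move a 128-bit item held in two GPRs into one vector register.
   Two scratch registers are live at the same time, which is why a
   single scratch is not enough for this sequence. */
vreg_t move_pair_to_vreg(uint64_t gpr_hi, uint64_t gpr_lo) {
    vreg_t scratch0 = move_gpr_to_vreg(gpr_hi);
    vreg_t scratch1 = move_gpr_to_vreg(gpr_lo);
    return combine(scratch0, scratch1);
}
```

Both scratch0 and scratch1 must hold their values until the combine step,
which matches the stated need for two registers "to do the move into". The
same shape would apply to 64-bit items in 32-bit mode with 32-bit halves.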