This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH 10/11][RS6000] Migrate reduction optabs to reduc_..._scal
- From: Segher Boessenkool <segher at kernel dot crashing dot org>
- To: Michael Meissner <meissner at linux dot vnet dot ibm dot com>, Alan Lawrence <alan dot lawrence at arm dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, David Edelsohn <dje dot gcc at gmail dot com>
- Date: Wed, 12 Nov 2014 03:26:35 -0600
- Subject: Re: [PATCH 10/11][RS6000] Migrate reduction optabs to reduc_..._scal
- References: <544A3E0B dot 2000803 at arm dot com> <544A40D1 dot 1040605 at arm dot com> <20141110223624 dot GA19330 at ibm-tiger dot the-meissners dot org> <20141111071001 dot GA15842 at gate dot crashing dot org> <20141112012722 dot GA5485 at ibm-tiger dot the-meissners dot org>
On Tue, Nov 11, 2014 at 08:27:22PM -0500, Michael Meissner wrote:
> > Before the patch, the final reduction used *vsx_reduc_splus_v2df; after
> > the patch, it is *vsx_reduc_plus_v2df_scalar. The former does a vector
> > add, the latter a float add. And it uses the same pseudoregister for the
> > accumulator throughout. IRA decides a register is more expensive than
> > memory for this, I suppose because it wants both V2DF and DF? It doesn't
> > seem to like the subreg very much.
>
> I haven't looked into it in detail (I've been a little busy with the upper regs
> patch), but I suspect the problem is that 128-bit and 64-bit types cannot
> overlap (i.e. rs6000_cannot_change_mode_class returns true). This is due to
> the fact that scalars in VSX registers occupy the upper 64 bits, which would
> not match the compiler's notion that they should be in the bottom 64 bits.
You suspect correctly. Hacking around that in cannot_change_mode_class
doesn't help; subreg_get_info disallows it next.
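To illustrate the mismatch described above, here is a toy C model, not GCC
code. The struct and function names are hypothetical; the only thing taken
from this thread is that a VSX scalar sits in the upper 64 bits while the
compiler expects the low part of the wider value to be in the bottom 64 bits.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one 128-bit VSX register as two 64-bit halves. */
struct vsx_reg
{
  uint64_t upper;
  uint64_t lower;
};

/* Store a DF scalar the way the thread describes: in the upper 64 bits.
   The lower half is zeroed here purely for the sake of the example.  */
static struct vsx_reg
vsx_scalar_write (double d)
{
  struct vsx_reg r = { 0, 0 };
  memcpy (&r.upper, &d, sizeof d);
  return r;
}

/* Read the scalar from where it actually lives (upper half).  */
static double
vsx_scalar_read (const struct vsx_reg *r)
{
  double d;
  memcpy (&d, &r->upper, sizeof d);
  return d;
}

/* Read the half that a "bottom 64 bits" view would select instead.  */
static double
lowpart_read (const struct vsx_reg *r)
{
  double d;
  memcpy (&d, &r->lower, sizeof d);
  return d;
}
```

In this model, writing a scalar and then reading it through lowpart_read
does not return the value that was written, which is roughly why the mode
change is refused.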
Changing the pattern so it does two extracts instead of an extract and
a subreg works (you get an fmr for the high part though; the register
allocator doesn't know that dest=src is free here).
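At the source level, the two-extract form corresponds roughly to the sketch
below. It assumes GCC's generic vector extensions (vector_size); the typedef
and function name are made up for illustration, and this is not the actual
rs6000 pattern.

```c
/* Hypothetical typedef mirroring the V2DF mode: two doubles in 16 bytes.
   Relies on the GCC vector_size extension.  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Reduce a V2DF to a DF by extracting both elements explicitly and
   doing a scalar add, rather than reinterpreting part of the vector
   register as a scalar.  */
static double
reduc_plus_v2df_scalar (v2df v)
{
  double e0 = v[0];	/* first extract */
  double e1 = v[1];	/* second extract */
  return e0 + e1;	/* scalar float add */
}
```

Calling reduc_plus_v2df_scalar on a vector holding the two partial sums
yields the final scalar sum.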
_Should_ the subreg thing work? Or should the patterns be fixed?
Segher