This is the mail archive of the mailing list for the GCC project.

Re: [PATCH] fix arm neon ICE by widening tree_type's precision field

On Mon, Jun 08, 2009 at 10:18:17PM +0200, Jakub Jelinek wrote:
> On Mon, Jun 08, 2009 at 03:52:11PM -0400, Daniel Jacobowitz wrote:
> > In order to get good code out of these, I think we'd need to represent
> > early on that the single gimple operation set three different vectors
> > (SSA?  What SSA?)  Also we'd need to somehow do sensible register
> > allocation for these constraints.
> > 
> > Instead, we do not support these in the vectorizer; use unions for
> > the intrinsics (which do not get scalarized, so perhaps the new SRA
> > will help here), and fake it with these huge partial modes during RTL
> > expansion.  See the XImode patterns in for examples.
> > 
> > Any ideas? :-)
> Why do you need the big modes?  If the insn loads from memory into
> a couple of registers (or stores from a couple of registers into memory),
> why can't you just use a parallel with all those stores in it?
> See e.g. for load_multiple and store_multiple...

Two reasons.  One, for the intrinsics, which use __builtin_neon_xi and
so forth.  The other, for the register allocation constraints.  I
don't see any way we could tell the register allocator "these three
operands must have consecutive register numbers" (or, even worse,
consecutive with stride 2).  S/390 handles this by generating
load_multiple directly for hard registers, only if reload_completed.

Daniel Jacobowitz
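
To make the register allocation constraint in the reply concrete, here is a minimal sketch in C of the property the allocator would have to guarantee: the operands' hard register numbers form a run with a fixed stride (1 for consecutive registers, 2 for the "stride 2" case). The helper regs_form_run is hypothetical and is not a GCC function; it only illustrates the check, not how GCC implements or enforces it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of GCC.  Returns true when the hard
   register numbers regs[0..n-1] form the run r, r + stride,
   r + 2 * stride, ...  Stride 1 means consecutive registers; stride 2
   is the "consecutive with stride 2" case mentioned in the thread.  */
bool
regs_form_run (const unsigned *regs, size_t n, unsigned stride)
{
  assert (stride > 0);

  /* Compare every register against its expected position in the run
     that starts at regs[0].  A list of zero or one registers trivially
     satisfies the constraint.  */
  for (size_t i = 1; i < n; i++)
    if (regs[i] != regs[0] + (unsigned) i * stride)
      return false;

  return true;
}
```

For example, the register lists {0, 1, 2} (stride 1) and {0, 2, 4} (stride 2) satisfy the check, while {0, 1, 3} does not for either stride. A single wide mode covering all the registers sidesteps the problem because the allocator then only has to place one multi-register value.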
