This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH] fix arm neon ICE by widening tree_type's precision field
On Mon, Jun 08, 2009 at 10:18:17PM +0200, Jakub Jelinek wrote:
> On Mon, Jun 08, 2009 at 03:52:11PM -0400, Daniel Jacobowitz wrote:
> > In order to get good code out of these, I think we'd need to represent
> > early on that the single gimple operation set three different vectors
> > (SSA? What SSA?). Also we'd need to somehow do sensible register
> > allocation for these constraints.
> > Instead, we do not support these in the vectorizer; use unions for
> > the intrinsics (which do not get scalarized, so perhaps the new SRA
> > will help here), and fake it with these huge partial modes during RTL
> > expansion. See the XImode patterns in neon.md for examples.
> > Any ideas? :-)
> Why do you need the big modes? If the insn does load from memory into
> a couple of registers (or stores from a couple of registers into memory),
> why can't you just use a parallel with all those stores in it?
> See e.g. s390.md for load_multiple and store_multiple...
Two reasons. One, for the intrinsics, which use __builtin_neon_xi and
so forth. The other, for the register allocation constraints. I
don't see any way we could tell the register allocator "these three
operands must have consecutive register numbers" (or, even worse,
consecutive with stride 2). S/390 handles this by generating
load_multiple directly for hard registers, only if reload_completed.
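
To make the allocation constraint concrete, here is a minimal sketch in C of
the property the register allocator would have to guarantee for such an
operand group: the hard register numbers must be evenly spaced, with stride 1
for the plain consecutive case and stride 2 for the "even worse" case. The
helper name regs_are_strided and its signature are hypothetical, chosen only
for illustration; nothing like it is claimed to exist in GCC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not a GCC function: returns true when the n
   register numbers in regno[] are evenly spaced by `stride`, i.e.
   regno[i] == regno[0] + i * stride for every i.  */
static bool
regs_are_strided (const unsigned *regno, size_t n, unsigned stride)
{
  assert (stride > 0);
  for (size_t i = 1; i < n; i++)
    if (regno[i] != regno[0] + i * stride)
      return false;
  return true;
}
```

With this check, the register numbers {4, 5, 6} satisfy stride 1 and
{4, 6, 8} satisfy stride 2, while {4, 5, 7} satisfies neither. A check like
this is easy to apply once hard registers are known; the difficulty described
above is getting the allocator to choose such a group for three separate
operands in the first place.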