This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Status of SSE builtins
> > You can compile any of the simd-*.c testcases from testsuite.
>
> If I happen to have a freshly built x86 compiler sitting around. Often
> I do, but not right now.
>
> > I've sent a few emails about this previously.
> > We generate instructions dealing with elements of the vector using
> > subregs, like (subreg:HI (reg:V4HI) 2) is expected to access and modify
> > only the second field of the vector. However, subregs generally
> > clobber the whole word in GCC and are not allowed in such general forms.
>
> Reading a sub-word subreg is well-defined and doesn't clobber the register,
Reading is not a problem, writing is.
For instance (set (subreg:HI (reg:SI) 0) (const_int 0)) is interpreted
as a clear of the whole register on i386.
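
To make the difference concrete, here is a minimal C sketch that models a 32-bit register as a uint32_t. The function names are hypothetical and this is not GCC code. One function preserves the upper half, as a strict_low_part store would; the other shows one outcome a plain subreg store is allowed to produce, where the rest of the word is simply cleared.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a store that keeps the bits outside the low
   16 bits, i.e. what strict_low_part around the subreg asks for.  */
uint32_t store_low16_preserving(uint32_t reg, uint16_t value)
{
    return (reg & 0xFFFF0000u) | value;
}

/* Hypothetical model of a plain subreg store: the other bits of the
   word are undefined, and clearing them is one permitted outcome.  */
uint32_t store_low16_clobbering(uint32_t reg, uint16_t value)
{
    (void) reg;               /* old contents are not kept */
    return value;
}

/* Small self-check of the two models.  */
void demo_subreg_store(void)
{
    uint32_t reg = 0xDEADBEEFu;
    assert(store_low16_preserving(reg, 0) == 0xDEAD0000u);
    assert(store_low16_clobbering(reg, 0) == 0x00000000u);
}
```

Only the first model is safe when the other half of the register still holds a live vector element.
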
> however, reading a subword subreg that is not the lowpart of a word is
> currently not implemented.
> Clobbering the whole register when you are going to write all the other
> parts subsequently is OK.
>
> I've fixed expand_vector_unop / expand_vector_binop to use extract_bit_field
> for non-constant input operands, and store_bit_field unless we write the first
> part of a word.
>
> rtl.texi says:
>
> Storing in a non-paradoxical @code{subreg} has undefined results for
> bits belonging to the same word as the @code{subreg}. This laxity makes
> it easier to generate efficient code for such instructions. To
> represent an instruction that preserves all the bits outside of those in
> the @code{subreg}, use @code{strict_low_part} around the @code{subreg}.
>
> Accordingly, the test to see if we can store directly into a subreg of the
> target uses UNITS_PER_WORD:
>
>   if (GET_CODE (target) == REG
>       && (BYTES_BIG_ENDIAN
>           ? subsize < UNITS_PER_WORD
>           : ((i * subsize) % UNITS_PER_WORD) != 0))
>     t = NULL_RTX;
>   else
>     t = simplify_gen_subreg (submode, target, mode, i * subsize);
>
>
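
The quoted condition can be tried out in isolation. The sketch below restates it as a standalone C predicate; the function and parameter names are illustrative assumptions, and the GCC macros are replaced by plain arguments so it compiles without the compiler sources. It returns false exactly where the quoted code sets t to NULL_RTX.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical restatement of the quoted test.  Returns true when
   element i (subsize bytes wide) may be written through a direct
   subreg of the target, false when the quoted code gives up on the
   subreg (t = NULL_RTX).  */
bool can_store_via_subreg(bool target_is_reg, bool bytes_big_endian,
                          unsigned units_per_word, unsigned subsize,
                          unsigned i)
{
    if (target_is_reg
        && (bytes_big_endian
            ? subsize < units_per_word
            : (i * subsize) % units_per_word != 0))
        return false;
    return true;
}

/* Self-check: four 2-byte elements with 4-byte words, little-endian.
   Only elements that start a word qualify for a direct subreg.  */
void demo_subreg_test(void)
{
    assert(can_store_via_subreg(true, false, 4, 2, 0));
    assert(!can_store_via_subreg(true, false, 4, 2, 1));
    assert(can_store_via_subreg(true, false, 4, 2, 2));
    assert(!can_store_via_subreg(true, false, 4, 2, 3));
}
```

With 2-byte elements and 4-byte words on a little-endian target, elements 0 and 2 start a word and pass, while elements 1 and 3 fall to the other path.
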
> > The SIMD support works for PPC/SPARC as such subregs always simplify to
> > a specific register, but for SSE they do not.
>
> I suppose the problem is that the SSE registers are larger than UNITS_PER_WORD,
> and you can't address individual words in the register?
Yes.
> Have you tried defining SECONDARY_*RELOAD_CLASS so that you go through general
> purpose registers in this case?
Not yet; however, I think this is just part of the problem, as reload
will offload the register to memory, read it back and clobber the upper
part. I will check.
Honza
>
> --
> --------------------------
> SuperH (UK) Ltd.
> 2410 Aztec West / Almondsbury / BRISTOL / BS32 4QX
> T:+44 1454 465658