This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: Status of SSE builtins
- From: Joern Rennecke <joern dot rennecke at superh dot com>
- To: Jan Hubicka <jh at suse dot cz>
- Cc: gcc at gcc dot gnu dot org, rth at cygnus dot com, bernds at redhat dot com, aj at suse dot de
- Date: Wed, 30 Oct 2002 22:21:14 +0000
- Subject: Re: Status of SSE builtins
- Organization: SuperH UK Ltd.
- References: <3DC04D9A.2D379818@superh.com> <20021030214140.GA22590@kam.mff.cuni.cz>
> You can compile any of the simd-*.c testcases from testsuite.
If I happen to have a freshly built x86 compiler sitting around. Often
I do, but not right now.
> I've sent a few emails about this previously.
> We generate instructions dealing with elements of the vector using
> subregs, like (subreg:HI (reg:V4HI) 2) is expected to access and modify
> only the second field of the vector. However the subregs generally
> clobber the whole word in GCC and are not allowed in such general forms.
Reading a sub-word subreg is well-defined and doesn't clobber the register;
however, reading a sub-word subreg that is not the lowpart of a word is
currently not implemented.
Clobbering the whole register when you are going to write all the other
parts subsequently is OK.
I've fixed expand_vector_unop / expand_vector_binop to use extract_bit_field
for non-constant input operands, and store_bit_field unless we write the first
part of a word.
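
As a rough illustration of what a bit-field style access to one vector
element amounts to, here is a sketch in plain C. It is not GCC code and does
not use the extract_bit_field / store_bit_field interfaces; it only models a
V4HI value as a 64-bit integer, assuming element 0 sits in the least
significant 16 bits. The names model_extract_elt and model_store_elt are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Model only: a V4HI vector packed into a uint64_t, with element 0
   assumed to occupy the least significant 16 bits.  */

/* Read one 16-bit element without touching the rest of the vector.  */
static uint16_t
model_extract_elt (uint64_t vec, unsigned idx)
{
  assert (idx < 4);
  return (uint16_t) (vec >> (idx * 16));
}

/* Write one 16-bit element and preserve all other bits, which is the
   guarantee a plain subreg store within a word does not give.  */
static uint64_t
model_store_elt (uint64_t vec, unsigned idx, uint16_t val)
{
  uint64_t mask;

  assert (idx < 4);
  mask = (uint64_t) 0xFFFF << (idx * 16);
  return (vec & ~mask) | ((uint64_t) val << (idx * 16));
}
```

The store is a mask-and-merge, so the neighbouring elements survive even
when the element is not at the start of a word.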
rtl.texi says:
Storing in a non-paradoxical @code{subreg} has undefined results for
bits belonging to the same word as the @code{subreg}. This laxity makes
it easier to generate efficient code for such instructions. To
represent an instruction that preserves all the bits outside of those in
the @code{subreg}, use @code{strict_low_part} around the @code{subreg}.
Accordingly, the test to see if we can store directly into a subreg of the
target uses UNITS_PER_WORD:
          if (GET_CODE (target) == REG
              && (BYTES_BIG_ENDIAN
                  ? subsize < UNITS_PER_WORD
                  : ((i * subsize) % UNITS_PER_WORD) != 0))
            t = NULL_RTX;
          else
            t = simplify_gen_subreg (submode, target, mode, i * subsize);
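
To see which elements this test lets through, the condition can be modelled
as a standalone C function. This is a sketch, not the GCC source: the macros
BYTES_BIG_ENDIAN and UNITS_PER_WORD and the GET_CODE check are replaced by
ordinary parameters, and the name model_direct_subreg_ok is hypothetical.

```c
#include <assert.h>

/* Model of the test above.  Returns 1 when the code would store
   directly through a subreg of the target, and 0 when it would set
   t = NULL_RTX and use another path.
   target_is_reg  stands in for GET_CODE (target) == REG,
   big_endian     stands in for BYTES_BIG_ENDIAN,
   units_per_word stands in for UNITS_PER_WORD.  */
static int
model_direct_subreg_ok (int target_is_reg, int big_endian,
                        int i, int subsize, int units_per_word)
{
  assert (i >= 0 && subsize > 0 && units_per_word > 0);

  if (target_is_reg
      && (big_endian
          ? subsize < units_per_word
          : ((i * subsize) % units_per_word) != 0))
    return 0;
  return 1;
}
```

Under this model, for V4HI (subsize 2) with a 4-byte word on a little-endian
target, elements 0 and 2 start a word and are stored directly, while
elements 1 and 3 are not. On a big-endian target every sub-word element of a
register target is rejected.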
> The SIMD support works for PPC/SPARC as such subregs always simplify to
> a specific register, but for SSE they do not.
I suppose the problem is that the SSE registers are larger than UNITS_PER_WORD,
and you can't address individual words in the register?
Have you tried defining SECONDARY_*RELOAD_CLASS so that you go through general
purpose registers in this case?
--
--------------------------
SuperH (UK) Ltd.
2410 Aztec West / Almondsbury / BRISTOL / BS32 4QX
T:+44 1454 465658