[MIPS][LS2][2/5] Vector intrinsics
Maxim Kuvyrkov
maxim@codesourcery.com
Fri Jun 6 12:31:00 GMT 2008
Richard Sandiford wrote:
...
> Maxim Kuvyrkov <maxim@codesourcery.com> writes:
...
>> Loongson behaves as generic MIPS III in the respect to moves to and from
>> FP registers. So to handle new modes I added them to MOVE64 and SPLITF
>> mode_iterators and adjusted mips_split_doubleword_move() accordingly.
>
> This part looks good as far as it goes, thanks, but your vector move
> pattern still has an "f<-f" (aka "fmove") alternative. Like I say,
> that causes GCC to use a MOV.D instruction for something that is not a
> double-precision floating-point value. My main concern was that using
> MOV.D in this way is usually incorrect; see:
>
> - section 5.7 in volume 1 of the MIPS32/64 ISA spec
> - the documentation of MOV.FMT in volume 2 of the MIPS32/64 ISA spec
>
> for more details.
>
> Thus the standard ISA spec has two separate 64-bit move instructions:
> MOV.D and MOV.PS. Both instructions move one 64-bit FPR to another,
> but they are used for two different kinds of 64-bit data.
>
> There is no MOV.L instruction, so 64-bit integer moves must be done
> through a GPR. (This is true for both for 32-bit and 64-bit targets.)
> Using MOV.D or MOV.PS for 64-bit integers is incorrect.
>
> So when it comes to the new modes, I think there are two cases:
>
> (1) Loongson specifically exempts itself from this restriction.
> You can use MOV.D for any kind of data, regardless of how
> the source FPR has been used, or how the destination FPR
> will be used.
>
> (2) We need to move through GPRs for the new modes too.
>
> The current implementation falls between two stools. It provides
> an "fmove" alternative in the move pattern (suggesting (1)),
> but mips_mode_ok_for_mov_fmt_p still returns false (suggesting (2)).
It is the (2). I didn't spot "f" -> "f" alternative in
mov<mode>_internal when fixed the patch; this alternative should be
removed. I checked with Loongson designers and they confirmed that
MOV.D should not be used in this case.
...
> Couldn't you replace the unsigned_p argument with the test:
>
> if (TREE_CODE (type) == INTEGER_TYPE && TYPE_UNSIGNED (type))
>
> ? That'd avoid accidentally mixing the unsigned_p and type arguments.
Fixed.
>
>> +Also provided are helper functions for loading and storing values of the
>> +above 64-bit vector types to and from memory:
>> +
>> +@smallexample
>> +uint32x2_t vec_load_uw (uint32x2_t *src);
>> +uint16x4_t vec_load_uh (uint16x4_t *src);
>> +uint8x8_t vec_load_ub (uint8x8_t *src);
>> +int32x2_t vec_load_sw (int32x2_t *src);
>> +int16x4_t vec_load_sh (int16x4_t *src);
>> +int8x8_t vec_load_sb (int8x8_t *src);
>> +void vec_store_uw (uint32x2_t v, uint32x2_t *dest);
>> +void vec_store_uh (uint16x4_t v, uint16x4_t *dest);
>> +void vec_store_ub (uint8x8_t v, uint8x8_t *dest);
>> +void vec_store_sw (int32x2_t v, int32x2_t *dest);
>> +void vec_store_sh (int16x4_t v, int16x4_t *dest);
>> +void vec_store_sb (int8x8_t v, int8x8_t *dest);
>> +@end smallexample
>
> I assume this is an existing cross-compiler API you're implementing.
This API was developed at CodeSourcery and was signed off by ST
Microelectronics, the producers of Loongson CPUs. I speculate that the
design of the API is similar to arm_neon.h.
> Is it worth saying that plain C pointer dereferences would also work?
> Or do we explicitly want to steer the user away from that, because
> doing so doesn't conform to the API? I think we should say something
> either way.
>
> (For the record, I'm OK with keeping these functions, if you're
> implementing an existing API.)
How about "While it is possible to use plain C pointer dereferences, the
following helper functions provide stable interface for loading and
storing values of the above 64-bit vector types to and from memory" ?
>
> I believe:
>
> foo_t vec_load_bar (const volatile foo_t *src);
>
> would be more general; as it stands, I think using the functions
> on constant or volatile data would result in a warning. Likewise:
>
> void vec_store_bar (foo_t v, volatile foo_t *dest);
>
> But again, if you need to be precisely compatible with an API,
> I'm OK keeping it as-is.
"volatile" will probably kill CSE optimizations. "const" seems like a
good addition.
>
>> +;; Handle legitimized moves between values of vector modes.
>> +(define_insn "mov<mode>_internal"
>> + [(set (match_operand:VWHB 0 "nonimmediate_operand" "=m,f,f,d,f,d,m, f")
>> + (match_operand:VWHB 1 "move_operand" "f,m,f,f,d,d,YG,YG"))]
>> + "HAVE_LOONGSON_VECTOR_MODES"
>> + { return mips_output_move (operands[0], operands[1]); }
>> + [(set_attr "type" "fpstore,fpload,fmove,mfc,mtc,move,fpstore,mtc")
> ^^^^^^^
> Just "store"; this is a GPR store. The pattern is missing "m<-d",
> "d<-m" and "d<-YG", which you also need to handle. I think:
>
> (define_insn "mov<mode>_internal"
> [(set (match_operand:VWHB 0 "nonimmediate_operand" "=m,f,d,f,d,m,d")
> (match_operand:VWHB 1 "move_operand" "f,m,f,dYG,dYG,dYG,m"))]
> "HAVE_LOONGSON_VECTOR_MODES"
> { return mips_output_move (operands[0], operands[1]); }
> [(set_attr "type" "fpstore,fpload,mfc,mtc,move,store,load")
>
> would be correct for (2) above. Add the fmove alternative back for (1).
Fixed, no fmove alternative now. Thanks for pointing at this.
>
>> --- config/mips/mips.h (/local/gcc-trunk/gcc) (revision 373)
>> +++ config/mips/mips.h (/local/gcc-2/gcc) (revision 373)
>> @@ -266,6 +266,11 @@ enum mips_code_readable_setting {
>> || mips_tune == PROCESSOR_74KF3_2)
>> #define TUNE_20KC (mips_tune == PROCESSOR_20KC)
>>
>> +/* Whether vector modes and intrinsics for ST Microelectronics
>> + Loongson-2E/2F processors should be enabled. In o32 pairs of
>> + floating-point registers provide 64-bit values. */
>> +#define HAVE_LOONGSON_VECTOR_MODES TARGET_LOONGSON_2EF
>> +
>> /* True if the pre-reload scheduler should try to create chains of
>> multiply-add or multiply-subtract instructions. For example,
>> suppose we have:
>
> I think this needs to depend on TARGET_HARD_FLOAT as well.
Fixed.
--
Maxim
More information about the Gcc-patches
mailing list