Vector registers on MIPS arch

David Guillen Fandos david@davidgf.net
Tue Apr 5 22:51:00 GMT 2016



On 05/04/16 09:13, Ilya Enkovich wrote:
> 2016-04-05 1:59 GMT+03:00 David Guillen Fandos <david@davidgf.net>:
>>
>>
>> On 04/04/16 10:55, Ilya Enkovich wrote:
>>> 2016-04-02 3:32 GMT+03:00 David Guillen Fandos <david@davidgf.net>:
>>>> Hello there!
>>>>
>>>> I'm trying to add some vector registers to a MIPS arch (32 bit). This
>>>> arch has 32 x 128 bit registers that can essentially be seen as V4SF.
>>>> So far I'm using this test:
>>>>
>>>> volatile float foo __attribute__ ((vector_size (16)));
>>>> volatile float bar __attribute__ ((vector_size (16)));
>>>>
>>>> int main() {
>>>>         foo = foo + bar;
>>>> }
>>>>
>>>> Which produces the right SSE/AVX instructions for x86 but fails on my
>>>> mips cross compiler with my modifications.
>>>> The modifications I did so far are:
>>>>
>>>>  - Add 32 new registers, adding a register class, updating/adding bit
>>>> fields, updating also other macros that deal with reg allocation (like
>>>> caller saved and stuff). Also incremented the first pseudo reg value.
>>>>
>>>>  - Add 3 define_insn that load, store and add vectors.
>>>>
>>>>  - Tweak some things here and there to let the compiler know about the
>>>> V4SF type being available.
>>>>
>>>> So far the compiler goes back to scalar code, not working properly at
>>>> the veclower pass. My test.c.123t.veclower21 looks like:
>>>>
>>>>   <bb 2>:
>>>>   foo.0_2 ={v} foo;
>>>>   bar.1_3 ={v} bar;
>>>>   _6 = BIT_FIELD_REF <foo.0_2, 32, 0>;
>>>>   _7 = BIT_FIELD_REF <bar.1_3, 32, 0>;
>>>>   _8 = _6 + _7;
>>>>   _9 = BIT_FIELD_REF <foo.0_2, 32, 32>;
>>>>   _10 = BIT_FIELD_REF <bar.1_3, 32, 32>;
>>>>   _11 = _9 + _10;
>>>>   _12 = BIT_FIELD_REF <foo.0_2, 32, 64>;
>>>>   _13 = BIT_FIELD_REF <bar.1_3, 32, 64>;
>>>>   _14 = _12 + _13;
>>>>   _15 = BIT_FIELD_REF <foo.0_2, 32, 96>;
>>>>   _16 = BIT_FIELD_REF <bar.1_3, 32, 96>;
>>>>   _17 = _15 + _16;
>>>>   foo.2_4 = {_8, _11, _14, _17};
>>>>   foo ={v} foo.2_4;
>>>>   return 0;
>>>>
>>>>
>>>> Any ideas on what I'm missing and/or how to further debug this? I don't
>>>> really want autovectorization, just to be able to use vec registers
>>>> "manually".
>>>
>>> Hi.
>>>
>>> Can't say for sure since you didn't attach your patch.  But vector
>>> lowering happens for a vector statement which doesn't have corresponding
>>> entry in optab.  You must ensure your templates have proper names to
>>> get them added to optabs.
>>>
>>> Thanks,
>>> Ilya
>>>
>>>>
>>>> Thanks!
>>>> David
>>>>
>>
>> Hey Ilya, thanks for the response.
>>
>> My patterns look like this:
>>
>>
>> ;; Vector load.
>> (define_insn "load_v4sf"
>>   [(set (match_operand:V4SF 0 "register_operand" "=kv")
>>         (match_operand:V4SF 1 "memory_operand" "m") )]
>>   ""
>>   "lv.q\t%0,%1"
>> )
>> ;; Vector store.
>> (define_insn "store_v4sf"
>>   [(set (match_operand:V4SF 0 "memory_operand" "=m")
>>         (match_operand:V4SF 1 "register_operand" "kv") )]
>>   ""
>>   "sv.q\t%0,%1"
>> )
>>
>> ;; Add vector.
>> (define_insn "vadd4sf"
>>   [(set (match_operand:V4SF 0 "register_operand" "=kv")
>>         (plus:V4SF (match_operand:V4SF 1 "register_operand" "kv")
>>                    (match_operand:V4SF 2 "register_operand" "kv")))]
>>   ""
>>   "vadd.q\t%0,%1,%2"
>>   [(set_attr "type" "fadd")])
>>
>>
>> kv represents a constraint that maps to the pool of vector registers.
>> Does it make sense to you?
> 
> Your pattern names don't match standard pattern names and therefore are not
> recognized by optabs.  It means these vector patterns can't be used by the
> vectorizer and corresponding vector statements will be lowered into scalar
> ones.  Look into [1] for more details.  E.g. for 'add' pattern you should use
> name 'addv4sf3'.
> 
> BR
> Ilya
> 
> [1] https://gcc.gnu.org/onlinedocs/gccint/Standard-Names.html
> 
>>
>> Many thanks!
>> David

Thanks again Ilya,

That seems to have solved that problem. Now I'm facing another issue.
It seems the tree-vect-generic pass is promoting my vector operations to
BLKmode and therefore the VECTOR_MODE_P macro evaluates to false,
falling back to scalar mode.
I thought I got it working for a moment when I forgot to fix the
HARD_REGNO_MODE_OK macro, which evaluated to false for vector registers.
In that case I managed to dodge this issue, but I then saw another issue
regarding register allocation (obviously! :P)

So the bottom line would be, how do I make sure that my "compute_type"
is V4SF instead of BLKmode? Where does this promotion happen?
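
For checking what the lowering pass does with the addition, a slightly more
self-contained variant of the test at the top of this thread may help. This
is only a sketch: the names v4sf, add_v4sf and check_add_v4sf are made up
for illustration, and it assumes the GCC vector extension (vector_size plus
lane subscripting) is available in the compiler being tested.

```c
/* Four packed floats, the same shape as foo/bar in the original test. */
typedef float v4sf __attribute__ ((vector_size (16)));

/* Kept out of line so the addition survives as a separate vector
   statement instead of being folded into the caller.  */
__attribute__ ((noinline)) v4sf
add_v4sf (v4sf a, v4sf b)
{
  return a + b;
}

/* Returns 1 if every lane of the vector sum equals the scalar sum,
   0 otherwise.  */
int
check_add_v4sf (void)
{
  v4sf a = { 1.0f, 2.0f, 3.0f, 4.0f };
  v4sf b = { 10.0f, 20.0f, 30.0f, 40.0f };
  v4sf c = add_v4sf (a, b);

  for (int i = 0; i < 4; i++)
    if (c[i] != a[i] + b[i])
      return 0;
  return 1;
}
```

Building this with -fdump-tree-all and looking at the veclower dump for
add_v4sf should show either a single vector addition or the four
BIT_FIELD_REF additions, depending on whether the lowering still kicks in.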

Thanks a lot!
David


