This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: Vector registers on MIPS arch
- From: David Guillen Fandos <david at davidgf dot net>
- To: Ilya Enkovich <enkovich dot gnu at gmail dot com>
- Cc: Gcc Mailing List <gcc at gcc dot gnu dot org>
- Date: Tue, 5 Apr 2016 23:50:46 +0100
- Subject: Re: Vector registers on MIPS arch
- Authentication-results: sourceware.org; auth=none
- References: <56FF1331 dot 9080103 at davidgf dot net> <CAMbmDYYGvvWWABBwXJ+yQdxvvrZbt4Gd6zsTaGRa1nGd8ndZPg at mail dot gmail dot com> <5702F1CA dot 5040605 at davidgf dot net> <CAMbmDYb7tMPbmf1OdxF+RmTGxWc1377wE0rqyeu1cDtdfmKEGg at mail dot gmail dot com>
On 05/04/16 09:13, Ilya Enkovich wrote:
> 2016-04-05 1:59 GMT+03:00 David Guillen Fandos <david@davidgf.net>:
>>
>>
>> On 04/04/16 10:55, Ilya Enkovich wrote:
>>> 2016-04-02 3:32 GMT+03:00 David Guillen Fandos <david@davidgf.net>:
>>>> Hello there!
>>>>
>>>> I'm trying to add some vector registers to a MIPS arch (32 bit). This
>>>> arch has 32 x 128 bit registers that can essentially be seen as V4SF.
>>>> So far I'm using this test:
>>>>
>>>> volatile float foo __attribute__ ((vector_size (16)));
>>>> volatile float bar __attribute__ ((vector_size (16)));
>>>>
>>>> int main() {
>>>>   foo = foo + bar;
>>>> }
>>>>
>>>> Which produces the right SSE/AVX instructions for x86 but fails on my
>>>> mips cross compiler with my modifications.
>>>> The modifications I did so far are:
>>>>
>>>> - Add 32 new registers, adding a register class, updating/adding bit
>>>> fields, updating also other macros that deal with reg allocation (like
>>>> caller saved and stuff). Also incremented the first pseudo reg value.
>>>>
>>>> - Add 3 define_insn that load, store and add vectors.
>>>>
>>>> - Tweak some things here and there to let the compiler know about the
>>>> V4SF type being available.
>>>>
>>>> So far the compiler goes back to scalar code, not working properly at
>>>> the veclower pass. My test.c.123t.veclower21 looks like:
>>>>
>>>> <bb 2>:
>>>> foo.0_2 ={v} foo;
>>>> bar.1_3 ={v} bar;
>>>> _6 = BIT_FIELD_REF <foo.0_2, 32, 0>;
>>>> _7 = BIT_FIELD_REF <bar.1_3, 32, 0>;
>>>> _8 = _6 + _7;
>>>> _9 = BIT_FIELD_REF <foo.0_2, 32, 32>;
>>>> _10 = BIT_FIELD_REF <bar.1_3, 32, 32>;
>>>> _11 = _9 + _10;
>>>> _12 = BIT_FIELD_REF <foo.0_2, 32, 64>;
>>>> _13 = BIT_FIELD_REF <bar.1_3, 32, 64>;
>>>> _14 = _12 + _13;
>>>> _15 = BIT_FIELD_REF <foo.0_2, 32, 96>;
>>>> _16 = BIT_FIELD_REF <bar.1_3, 32, 96>;
>>>> _17 = _15 + _16;
>>>> foo.2_4 = {_8, _11, _14, _17};
>>>> foo ={v} foo.2_4;
>>>> return 0;
>>>>
>>>>
>>>> Any ideas on what I'm missing and/or how to further debug this? I don't
>>>> really want autovectorization, just to be able to use vec registers
>>>> "manually".
>>>
>>> Hi.
>>>
>>> Can't say for sure since you didn't attach your patch. But vector
>>> lowering happens for a vector statement which doesn't have corresponding
>>> entry in optab. You must ensure your templates have proper names to
>>> get them added to optabs.
>>>
>>> Thanks,
>>> Ilya
>>>
>>>>
>>>> Thanks!
>>>> David
>>>>
>>
>> Hey Ilya, thanks for the response.
>>
>> My patterns look like this:
>>
>>
>> ;; Vector load.
>> (define_insn "load_v4sf"
>>   [(set (match_operand:V4SF 0 "register_operand" "=kv")
>>         (match_operand:V4SF 1 "memory_operand" "m"))]
>>   ""
>>   "lv.q\t%0,%1"
>> )
>>
>> ;; Vector store.
>> (define_insn "store_v4sf"
>>   [(set (match_operand:V4SF 0 "memory_operand" "=m")
>>         (match_operand:V4SF 1 "register_operand" "kv"))]
>>   ""
>>   "sv.q\t%0,%1"
>> )
>>
>> ;; Add vector.
>> (define_insn "vadd4sf"
>>   [(set (match_operand:V4SF 0 "register_operand" "=kv")
>>         (plus:V4SF (match_operand:V4SF 1 "register_operand" "kv")
>>                    (match_operand:V4SF 2 "register_operand" "kv")))]
>>   ""
>>   "vadd.q\t%0,%1,%2"
>>   [(set_attr "type" "fadd")])
>>
>>
>> kv is a constraint that maps to the vector register class.
>> Does it make sense to you?
>
> Your pattern names don't match standard pattern names and therefore are not
> recognized by optabs. It means these vector patterns can't be used by
> vectorizer and corresponding vector statements will be lowered into scalar
> ones. Look into [1] for more details. E.g. for 'add' pattern you should use
> name 'addv4sf3'.
>
> BR
> Ilya
>
> [1] https://gcc.gnu.org/onlinedocs/gccint/Standard-Names.html
>
>>
>> Many thanks!
>> David
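
A minimal sketch of the rename suggested above, assuming the rest of the add pattern stays exactly as posted (the `kv` constraint, the `vadd.q` mnemonic and the `fadd` type attribute all come from the quoted patterns; only the name changes):

```
;; Vector add under the standard name "addv4sf3" (add<mode>3 for V4SF),
;; so the pattern is registered in the add optab for V4SF.
(define_insn "addv4sf3"
  [(set (match_operand:V4SF 0 "register_operand" "=kv")
        (plus:V4SF (match_operand:V4SF 1 "register_operand" "kv")
                   (match_operand:V4SF 2 "register_operand" "kv")))]
  ""
  "vadd.q\t%0,%1,%2"
  [(set_attr "type" "fadd")])
```

The load and store patterns would presumably need the same treatment under the standard move name `movv4sf` (the `mov<m>` convention in [1]); how best to structure that pattern for this target is not covered in this thread.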
Thanks again Ilya,

That seems to help solve the problem. Now I'm facing another issue.
It seems the tree-vect-generic pass is promoting my vector operations to
BLKmode, so the VECTOR_MODE_P macro evaluates to false and the operation
falls back to scalar code.

I thought I had it working for a moment, when I had forgotten to fix the
HARD_REGNO_MODE_OK macro, which evaluated to false for vector registers.
In that case I managed to dodge this issue, but I ran into another one
regarding register allocation (obviously! :P)

So the bottom line is: how do I make sure that my "compute_type"
is V4SF instead of BLKmode? Where does this promotion happen?

Thanks a lot!
David
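
For reference, the veclower dump quoted earlier in the thread computes the same result as the single vector statement, just one 32-bit lane at a time. The sketch below shows both forms side by side using GCC's generic vector extension. The names `v4sf`, `add_vec` and `add_lanes` are made up for illustration and do not appear in the original test; it also assumes a compiler that supports subscripting vector types.

```c
/* Four packed floats in 16 bytes: the source-level type that GCC maps
   to V4SF when the target supports that mode.  */
typedef float v4sf __attribute__ ((vector_size (16)));

/* A single vector statement, as in the original test (minus volatile).
   With a matching add optab entry this can stay one V4SF add.  */
v4sf
add_vec (v4sf a, v4sf b)
{
  return a + b;
}

/* Hand-written equivalent of the lowered form in the dump:
   extract each 32-bit lane, add the scalars, rebuild the vector.  */
v4sf
add_lanes (v4sf a, v4sf b)
{
  v4sf r;
  for (int i = 0; i < 4; i++)
    r[i] = a[i] + b[i];
  return r;
}
```

Both functions return the same four lanes for the same inputs; the difference only shows up in the generated code, for example in the `.veclower` dump or in the final assembly.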