This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: [RFC][mid-end] Support vectorization of complex numbers using machine instructions.
Tamar Christina <Tamar.Christina@arm.com> writes:
>> > so I'd need 5 parameters and then I'm guessing the other expressions
>> > would be removed by DCE at some point?
>>
>> Are you planning to make the FCMLA behaviour directly available as an
>> internal function or provide a higher-level one that does a full complex
>> multiply, with the target lowering that into individual instructions where
>> necessary?
>
> I was planning on doing it as one internal function and leave it up to
> the target to expand it however it needs to.
OK, sounds good.
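For anyone following the thread, here is a rough scalar model of how a full complex multiply-accumulate can be split into two FCMLA-style steps (rotations 0 and 90), which is the kind of lowering a target could choose when expanding a single higher-level internal function. The type `cpair` and the helpers `fcmla_rot0`, `fcmla_rot90` and `complex_mla` are made-up names for illustration only; they are not GCC internal functions or intrinsics, and this is only a sketch of the arithmetic, not of the vectoriser code.

```c
#include <stddef.h>

/* One complex number stored as an interleaved (real, imaginary) pair.
   Hypothetical type, used only for this illustration.  */
typedef struct { float re, im; } cpair;

/* Scalar model of an FCMLA-style step with rotation 0 (assumed name):
   accumulate a.re * b into acc.  */
cpair fcmla_rot0(cpair acc, cpair a, cpair b)
{
    acc.re += a.re * b.re;
    acc.im += a.re * b.im;
    return acc;
}

/* Scalar model of an FCMLA-style step with rotation 90 (assumed name):
   accumulate i * a.im * b into acc.  */
cpair fcmla_rot90(cpair acc, cpair a, cpair b)
{
    acc.re -= a.im * b.im;
    acc.im += a.im * b.re;
    return acc;
}

/* A full complex multiply-accumulate, acc + a * b, expressed as the two
   steps above.  A target without such instructions could instead expand
   the same operation into plain multiplies, adds and subtracts.  */
cpair complex_mla(cpair acc, cpair a, cpair b)
{
    return fcmla_rot90(fcmla_rot0(acc, a, b), a, b);
}
```

With a zero accumulator, `complex_mla` gives the plain product a * b, so one high-level operation is enough to describe the multiply and the choice of instructions is left to the target.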
>> What to do with the intermediate results you don't need is an interesting
>> question :-). Like you say, I was hoping DCE would get rid of them later.
>> Does that not work?
>
> I haven't tried it yet 😊 But I assume it'll work too. I have complex
> add almost working, it generates the right code for the vectorized
> loop. The loads are also corrected and the permute is gone and I
> update all the data references for the two statements I replaced.
Not sure what you mean by the last bit. Why do you need to replace
data references rather than just use the existing ones?
> However for the scalar tail loop I have a problem since I only have
> vector versions of the instructions, and the scalar loop is created
> from the same SLP tree. So I end up with the builtins in the tail
> loop with nothing to expand them to and with no way to differentiate
> between the two calls to the internal fn.
>
> I would need to somehow undo this for the scalar part.
The epilogue loop should just be a copy of the basic block before
vectorisation is applied. The new calls shouldn't be in that,
just in the SLP tree. (This is how pattern statements work too:
they're never added to the basic block, they're just temporary
statements attached to internal vectoriser structures.)
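As a source-level picture of the point above, the sketch below shows a main loop that uses a vector-style operation and an epilogue that keeps the original scalar statements untouched. It assumes the complex add in question is an add with a 90 degree rotation (a + i*b) on interleaved data and a vector holding two complex values; both are assumptions made for this example, and `vec_cadd_rot90` and `cadd_rot90` are hypothetical names, not anything in GCC. The real transformation happens on GIMPLE, not on C source.

```c
#include <stddef.h>

/* Hypothetical model of a 2-lane vector "complex add with rotation 90":
   each lane computes a + i*b on interleaved (re, im) floats.  */
void vec_cadd_rot90(float *out, const float *a, const float *b)
{
    for (int lane = 0; lane < 2; lane++) {
        out[2 * lane]     = a[2 * lane]     - b[2 * lane + 1];
        out[2 * lane + 1] = a[2 * lane + 1] + b[2 * lane];
    }
}

/* n is the number of complex elements; arrays hold 2 * n floats.  */
void cadd_rot90(float *out, const float *a, const float *b, size_t n)
{
    size_t i = 0;

    /* Main loop: two complex elements per iteration, using the
       vector-style operation that stands in for the new call.  */
    for (; i + 2 <= n; i += 2)
        vec_cadd_rot90(out + 2 * i, a + 2 * i, b + 2 * i);

    /* Epilogue: the statements as they were before vectorisation.
       No vector-only operation appears here.  */
    for (; i < n; i++) {
        out[2 * i]     = a[2 * i]     - b[2 * i + 1];
        out[2 * i + 1] = a[2 * i + 1] + b[2 * i];
    }
}
```

The epilogue never refers to the vector helper, which mirrors the idea that the replacement calls live only in the vectoriser's own structures and not in the original basic block.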
Thanks,
Richard