This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: (R5900) Implementing Vector Support
- From: Woon yung Liu <ysai187 at yahoo dot com>
- To: Richard Henderson <rth at redhat dot com>, Gcc Mailing List <gcc at gcc dot gnu dot org>
- Date: Sat, 7 May 2016 07:28:00 +0000 (UTC)
- Subject: Re: (R5900) Implementing Vector Support
- Authentication-results: sourceware.org; auth=none
- References: <e7a9b3fd-4008-c321-3c6b-8532a4a5a8ec at redhat dot com>
- Reply-to: Woon yung Liu <ysai187 at yahoo dot com>
Hi,
>On Tuesday, May 3, 2016 1:41 AM, Richard Henderson <rth@redhat.com> wrote:
>On 04/29/2016 07:54 AM, Liu Woon Yung wrote:
>> I've done something like that, but GCC still doesn't select the pattern to use:
>> (define_insn "vec_cmp<MMI_VCMP_OP:code><MMI_VWHB:mode>"
>
>Because you've used the wrong name. The patterns are:
The version of GCC that I am working on (v5.3.0) does not have the vec_cmp patterns under i386/sse.md.
I guess I've spent so long working on these patches that GCC has moved on again.
I'm now trying to implement vcond instead, since it can already be tested with the version of GCC that I am working on.
At a later time, I'll port my patches to the newest GCC version for submission.
Regarding multiplication of vectors: is there a way to make use of a multiply instruction whose result is spread across three registers like this, without re-ordering any elements?
RD: A6xB6, A4xB4, A2xB2, A0xB0
LO: A7xB7, A6xB6, A3xB3, A2xB2
HI: A5xB5, A4xB4, A1xB1, A0xB0
A0-A7 and B0-B7 are the 8 elements of two V8HI vectors, which are multiplied together to produce a widened multiplication result.
It looks like the vector hi/lo widening multiplication patterns could work with the values in HI and LO, but the elements don't seem to be in the order that GCC expects.
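
To make the mismatch concrete, the layout above can be modelled in plain C. This is only a sketch under assumptions: each row above is read as listed from the most-significant element down, the lo pattern is taken to mean the products of elements 0-3 and the hi pattern those of elements 4-7 (as on a little-endian target), and the names pmulth_result, pmulth_model, widen_lo and widen_hi are made up for illustration and are not part of GCC or of my patches.

```c
#include <stdint.h>

/* Model of the three result registers; index 0 is the
   least-significant 32-bit element (an assumption about how the
   rows in the message are ordered). */
typedef struct {
    int32_t rd[4];
    int32_t lo[4];
    int32_t hi[4];
} pmulth_result;

/* Scalar model of the multiply as laid out in the message. */
static pmulth_result pmulth_model(const int16_t a[8], const int16_t b[8])
{
    int32_t p[8];
    pmulth_result r;
    int i;

    for (i = 0; i < 8; i++)
        p[i] = (int32_t)a[i] * (int32_t)b[i];   /* widened products */

    /* RD: A6xB6, A4xB4, A2xB2, A0xB0 */
    r.rd[0] = p[0]; r.rd[1] = p[2]; r.rd[2] = p[4]; r.rd[3] = p[6];
    /* LO: A7xB7, A6xB6, A3xB3, A2xB2 */
    r.lo[0] = p[2]; r.lo[1] = p[3]; r.lo[2] = p[6]; r.lo[3] = p[7];
    /* HI: A5xB5, A4xB4, A1xB1, A0xB0 */
    r.hi[0] = p[0]; r.hi[1] = p[1]; r.hi[2] = p[4]; r.hi[3] = p[5];
    return r;
}

/* Products of elements 0..3: the low halves of HI and LO. */
static void widen_lo(const pmulth_result *r, int32_t out[4])
{
    out[0] = r->hi[0]; out[1] = r->hi[1];
    out[2] = r->lo[0]; out[3] = r->lo[1];
}

/* Products of elements 4..7: the high halves of HI and LO. */
static void widen_hi(const pmulth_result *r, int32_t out[4])
{
    out[0] = r->hi[2]; out[1] = r->hi[3];
    out[2] = r->lo[2]; out[3] = r->lo[3];
}
```

Under those assumptions, neither HI nor LO on its own matches what a hi or lo pattern would have to return; each wanted vector takes one 64-bit half from HI and one from LO, so some interleaving of the two registers seems unavoidable.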
Assuming that it is possible to put this instruction to use, does GCC allow the vec_widen_smult_hi and vec_widen_smult_lo patterns to be combined into one, as is done for the divmod (division + modulus) patterns?
The instruction described above (PMULTH) calculates both the hi and lo parts of the result in a single instruction, so combining the two patterns would be more efficient.
Once again, thank you for your time and for the detailed responses.
Thanks and regards,
-W Y