This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Support addsub/subadd as non-isomorphic operations for SLP vectorizer.
- From: Cong Hou <congh at google dot com>
- To: GCC Patches <gcc-patches at gcc dot gnu dot org>, Richard Biener <rguenther at suse dot de>
- Cc: Xinliang David Li <davidxl at google dot com>
- Date: Wed, 9 Jul 2014 21:50:40 -0700
- Subject: Re: [PATCH] Support addsub/subadd as non-isomorphic operations for SLP vectorizer.
- Authentication-results: sourceware.org; auth=none
- References: <CAK=A3=0jwnwHsyoOnPq1yDwSC=KJHfXwgpyMwvOaCLtqCTo_5g at mail dot gmail dot com> <alpine dot LNX dot 2 dot 00 dot 1311150945290 dot 4261 at zhemvz dot fhfr dot qr> <CAK=A3=3TAJNrivikE-NFAOYsyn49PxqkhYMfs0TzFtDqci2f1Q at mail dot gmail dot com> <alpine dot LNX dot 2 dot 00 dot 1311191034550 dot 4261 at zhemvz dot fhfr dot qr> <CAK=A3=1d3mOmC-cGNiGDjfyZOQ5NWzVQmzeSOwYSF7JoBHJWew at mail dot gmail dot com> <alpine dot LNX dot 2 dot 00 dot 1311200951200 dot 4261 at zhemvz dot fhfr dot qr> <CAK=A3=1Puwf0tkqMuuvRXxHk66a9+kkfL9x+-aqPvP=PdQOicw at mail dot gmail dot com> <CAK=A3=3_9_i_xPNPDxOHVc-+r4cnCu4tBSFPou0YYD=QNoBULQ at mail dot gmail dot com> <alpine dot DEB dot 2 dot 10 dot 1311220135180 dot 3637 at stedding dot saclay dot inria dot fr> <CAK=A3=2s+mVo7Cm3G-5pR_V4mzNTCf8_KFDP+kBQuA6R6sUBxA at mail dot gmail dot com> <alpine dot DEB dot 2 dot 10 dot 1311221239580 dot 5418 at laptop-mg dot saclay dot inria dot fr> <CAK=A3=0Jn-D4ao2NiFwsKL9xpouhJHkgXoYeXLU9uh=oLZOMAg at mail dot gmail dot com> <CAK=A3=0s75ne6wd=ATgkpN1czzF2pX_joD4HHEb5NB05jO+p-A at mail dot gmail dot com> <CAK=A3=3VgC4ySQiRBEZH-=mJMGWDkyddzas-n+4SK-SxVvpcLQ at mail dot gmail dot com> <CAAkRFZLVb9hYqfEvD3Lg0EKuWc6Sz5GCEBdSt1n8ZF2CiPOeKg at mail dot gmail dot com>
Ping?
thanks,
Cong
On Tue, Jul 8, 2014 at 8:23 PM, Xinliang David Li <davidxl@google.com> wrote:
> Cong, can you ping this patch again? There does not seem to be
> pending comments left.
>
> David
>
> On Tue, Dec 17, 2013 at 10:05 AM, Cong Hou <congh@google.com> wrote:
>> Ping?
>>
>>
>> thanks,
>> Cong
>>
>>
>> On Mon, Dec 2, 2013 at 5:02 PM, Cong Hou <congh@google.com> wrote:
>>> Any comment on this patch?
>>>
>>>
>>> thanks,
>>> Cong
>>>
>>>
>>> On Fri, Nov 22, 2013 at 11:40 AM, Cong Hou <congh@google.com> wrote:
>>>> On Fri, Nov 22, 2013 at 3:57 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
>>>>> On Thu, 21 Nov 2013, Cong Hou wrote:
>>>>>
>>>>>> On Thu, Nov 21, 2013 at 4:39 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>>>>>>>
>>>>>>> On Thu, 21 Nov 2013, Cong Hou wrote:
>>>>>>>
>>>>>>>> While I added the new define_insn_and_split for vec_merge, a bug is
>>>>>>>> exposed: in config/i386/sse.md, [ define_expand "xop_vmfrcz<mode>2" ]
>>>>>>>> only takes one input, but the corresponding builtin functions have two
>>>>>>>> inputs, which are shown in i386.c:
>>>>>>>>
>>>>>>>> { OPTION_MASK_ISA_XOP, CODE_FOR_xop_vmfrczv4sf2,
>>>>>>>> "__builtin_ia32_vfrczss", IX86_BUILTIN_VFRCZSS, UNKNOWN,
>>>>>>>> (int)MULTI_ARG_2_SF },
>>>>>>>> { OPTION_MASK_ISA_XOP, CODE_FOR_xop_vmfrczv2df2,
>>>>>>>> "__builtin_ia32_vfrczsd", IX86_BUILTIN_VFRCZSD, UNKNOWN,
>>>>>>>> (int)MULTI_ARG_2_DF },
>>>>>>>>
>>>>>>>> In consequence, the ix86_expand_multi_arg_builtin() function tries to
>>>>>>>> check two args but based on the define_expand of xop_vmfrcz<mode>2,
>>>>>>>> the content of insn_data[CODE_FOR_xop_vmfrczv4sf2].operand[2] may be
>>>>>>>> incorrect (because it only needs one input).
>>>>>>>>
>>>>>>>> The patch below fixed this issue.
>>>>>>>>
>>>>>>>> Bootstrapped and tested on ax x86-64 machine. Note that this patch
>>>>>>>> should be applied before the one I sent earlier (sorry for sending
>>>>>>>> them in wrong order).
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> This is PR 56788. Your patch seems strange to me and I don't think it
>>>>>>> fixes the real issue, but I'll let more knowledgeable people answer.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thank you for pointing out the bug report. This patch is not intended
>>>>>> to fix PR56788.
>>>>>
>>>>>
>>>>> IMHO, if PR56788 was fixed, you wouldn't have this issue, and if PR56788
>>>>> doesn't get fixed, I'll post a patch to remove _mm_frcz_sd and the
>>>>> associated builtin, which would solve your issue as well.
>>>>
>>>>
>>>> I agree. Then I will wait until your patch is merged to the trunk,
>>>> otherwise my patch could not pass the test.
>>>>
>>>>
>>>>>
>>>>>
>>>>>> For your function:
>>>>>>
>>>>>> #include <x86intrin.h>
>>>>>> __m128d f(__m128d x, __m128d y){
>>>>>> return _mm_frcz_sd(x,y);
>>>>>> }
>>>>>>
>>>>>> Note that the second parameter is ignored intentionally, but the
>>>>>> prototype of this function contains two parameters. My fix is
>>>>>> explicitly telling GCC that the optab xop_vmfrczv4sf3 should have
>>>>>> three operands instead of two, to let it have the correct information
>>>>>> in insn_data[CODE_FOR_xop_vmfrczv4sf3].operand[2] which is used to
>>>>>> match the type of the second parameter in the builtin function in
>>>>>> ix86_expand_multi_arg_builtin().
>>>>>
>>>>>
>>>>> I disagree that this is intentional, it is a bug. AFAIK there is no AMD
>>>>> documentation that could be used as a reference for what _mm_frcz_sd is
>>>>> supposed to do. The only existing documentations are by Microsoft (which
>>>>> does *not* ignore the second argument) and by LLVM (which has a single
>>>>> argument). Whatever we chose for _mm_frcz_sd, the builtin should take a
>>>>> single argument, and if necessary we'll use 2 builtins to implement
>>>>> _mm_frcz_sd.
>>>>>
>>>>
>>>>
>>>> I also only found the one by Microsoft.. If the second argument is
>>>> ignored, we could just remove it, as long as there is no "standard"
>>>> that requires two arguments. Hopefully it won't break current projects
>>>> using _mm_frcz_sd.
>>>>
>>>> Thank you for your comments!
>>>>
>>>>
>>>> Cong
>>>>
>>>>
>>>>> --
>>>>> Marc Glisse