This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix PR 59390


On Wed, Dec 11, 2013 at 2:42 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Wed, Dec 11, 2013 at 10:15 AM, Bernhard Reutner-Fischer
> <rep.dot.nop@gmail.com> wrote:
>
>>>> Patch updated with two more tests to check if the vfmadd insn is being
>>>> produced when possible.
>>>>
>>>> Thanks
>>>> Sri
>>>>
>>>> On Fri, Dec 6, 2013 at 11:12 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>>>>> Hi,
>>>>>
>>>>>     I have attached a patch to fix
>>>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59390. Please review.
>>>>>
>>>>> Here is the problem. GCC adds target-specific builtins on demand. The
>>>>> FMA target-specific builtin __builtin_ia32_vfmaddpd gets added via
>>>>> this declaration:
>>>>>
>>>>> void fun() __attribute__((target("fma")));
>>>>>
>>>>> Specifically, the builtin __builtin_ia32_vfmaddpd gets added when
>>>>> ix86_add_new_builtins is called from ix86_valid_target_attribute_tree
>>>>> when processing this target attribute.
>>>>>
>>>>> Now, when the vectorizer is processing the builtin "__builtin_fma" in
>>>>> function other_fn(), it checks to see if this function is vectorizable
>>>>> and calls ix86_builtin_vectorized_function in i386.c. That returns the
>>>>> builtin stored here:
>>>>>
>>>>>
>>>>> case BUILT_IN_FMA:
>>>>> if (out_mode == DFmode && in_mode == DFmode)
>>>>> {
>>>>>  if (out_n == 2 && in_n == 2)
>>>>>    return ix86_builtins[IX86_BUILTIN_VFMADDPD];
>>>>>           ....
>>>>>
>>>>> ix86_builtins[IX86_BUILTIN_VFMADDPD] would have contained NULL_TREE
>>>>> had the builtin not been added by the previous target attribute. That
>>>>> is why the code works if we remove the previous declaration.
>>>>>
>>>>> The fix is to not just return the builtin but to also check if the
>>>>> current function's isa allows the use of the builtin. For instance,
>>>>> this patch would solve the problem:
>>>>>
>>>>> @@ -33977,7 +33977,13 @@ ix86_builtin_vectorized_function (tree fndecl, tre
>>>>>        if (out_mode == DFmode && in_mode == DFmode)
>>>>>   {
>>>>>    if (out_n == 2 && in_n == 2)
>>>>> -    return ix86_builtins[IX86_BUILTIN_VFMADDPD];
>>>>> +    {
>>>>> +      if (ix86_builtins_isa[IX86_BUILTIN_VFMADDPD].isa
>>>>> +  & global_options.x_ix86_isa_flags)
>>>>> +        return ix86_builtins[IX86_BUILTIN_VFMADDPD];
>>>>> +      else
>>>>> + return NULL_TREE;
>>>>> +    }
>>>>>
>>>>>
>>>>> but there are many instances of this usage in
>>>>> ix86_builtin_vectorized_function. This patch covers all the cases.
>>>
>>>>     PR target/59390
>>>>     * gcc.target/i386/pr59390.c: New test.
>>>>     * gcc.target/i386/pr59390_1.c: New test.
>>>>     * gcc.target/i386/pr59390_2.c: New test.
>>>>     * config/i386/i386.c (get_builtin): New function.
>>>>     (ix86_builtin_vectorized_function): Replace all instances of
>>>>     ix86_builtins[...] with get_builtin(...).
>>>>     (ix86_builtin_reciprocal): Ditto.
>>>
>>> OK, with a couple of nits:
>>>
>>> +static tree get_builtin (enum ix86_builtins code)
>>>
>>> Please name this function ix86_get_builtin.
>>>
>>> +  if (current_function_decl)
>>> +    target_tree = DECL_FUNCTION_SPECIFIC_TARGET (current_function_decl);
>>> +  if (target_tree)
>>> +    opts = TREE_TARGET_OPTION (target_tree);
>>> +  else
>>> +    opts = TREE_TARGET_OPTION (target_option_default_node);
>>> +
>>>
>>> opts = TREE_TARGET_OPTION (target_tree ? target_tree :
>>> target_option_default_node);
>>
>> Just curious:
>>> +static tree get_builtin (enum ix86_builtins code)
>>> +{
>> []
>>> +>
>> []
>>> @@ -33677,9 +33701,9 @@ ix86_builtin_vectorized_function (tree fndecl, tre
>>>        if (out_mode == DFmode && in_mode == DFmode)
>>> {
>>>   if (out_n == 2 && in_n == 2)
>>> -    return ix86_builtins[IX86_BUILTIN_SQRTPD];
>>> +    get_builtin (IX86_BUILTIN_SQRTPD);
>>>   else if (out_n == 4 && in_n == 4)
>>> -    return ix86_builtins[IX86_BUILTIN_SQRTPD256];
>>> +    get_builtin (IX86_BUILTIN_SQRTPD256);
>>> }
>>>        break;
>>
>> I must be missing something?
>> Don't you have to return ix86_get_builtin(...) ?

Yes, I noticed it last evening when the vect tests broke. I fixed it
with adding the return (which was a typo when I did a bulk edit) and
running tests. I will include Uros's changes and commit the patch.

Thanks
Sri

>
> Huh, I didn't even notice this mistake.
>
> Uros.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]