[PATCH] Fix PR 59390

Uros Bizjak ubizjak@gmail.com
Wed Dec 11 10:42:00 GMT 2013


On Wed, Dec 11, 2013 at 10:15 AM, Bernhard Reutner-Fischer
<rep.dot.nop@gmail.com> wrote:

>>> Patch updated with two more tests to check if the vfmadd insn is being
>>> produced when possible.
>>>
>>> Thanks
>>> Sri
>>>
>>> On Fri, Dec 6, 2013 at 11:12 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>>>> Hi,
>>>>
>>>>     I have attached a patch to fix
>>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59390. Please review.
>>>>
>>>> Here is the problem. GCC adds target-specific builtins on demand. The
>>>> FMA target-specific builtin __builtin_ia32_vfmaddpd gets added via
>>>> this declaration:
>>>>
>>>> void fun() __attribute__((target("fma")));
>>>>
>>>> Specifically, the builtin __builtin_ia32_vfmaddpd gets added when
>>>> ix86_add_new_builtins is called from ix86_valid_target_attribute_tree
>>>> when processing this target attribute.
>>>>
>>>> Now, when the vectorizer is processing the builtin "__builtin_fma" in
>>>> function other_fn(), it checks to see if this function is vectorizable
>>>> and calls ix86_builtin_vectorized_function in i386.c. That returns the
>>>> builtin stored here:
>>>>
>>>>
>>>> case BUILT_IN_FMA:
>>>> if (out_mode == DFmode && in_mode == DFmode)
>>>> {
>>>>  if (out_n == 2 && in_n == 2)
>>>>    return ix86_builtins[IX86_BUILTIN_VFMADDPD];
>>>>           ....
>>>>
>>>> ix86_builtins[IX86_BUILTIN_VFMADDPD] would have contained NULL_TREE
>>>> had the builtin not been added by the previous target attribute. That
>>>> is why the code works if we remove the previous declaration.
>>>>
>>>> The fix is to not just return the builtin but to also check if the
>>>> current function's isa allows the use of the builtin. For instance,
>>>> this patch would solve the problem:
>>>>
>>>> @@ -33977,7 +33977,13 @@ ix86_builtin_vectorized_function (tree fndecl, tre
>>>>        if (out_mode == DFmode && in_mode == DFmode)
>>>>   {
>>>>    if (out_n == 2 && in_n == 2)
>>>> -    return ix86_builtins[IX86_BUILTIN_VFMADDPD];
>>>> +    {
>>>> +      if (ix86_builtins_isa[IX86_BUILTIN_VFMADDPD].isa
>>>> +  & global_options.x_ix86_isa_flags)
>>>> +        return ix86_builtins[IX86_BUILTIN_VFMADDPD];
>>>> +      else
>>>> + return NULL_TREE;
>>>> +    }
>>>>
>>>>
>>>> but there are many instances of this usage in
>>>> ix86_builtin_vectorized_function. This patch covers all the cases.
>>
>>>     PR target/59390
>>>     * gcc.target/i386/pr59390.c: New test.
>>>     * gcc.target/i386/pr59390_1.c: New test.
>>>     * gcc.target/i386/pr59390_2.c: New test.
>>>     * config/i386/i386.c (get_builtin): New function.
>>>     (ix86_builtin_vectorized_function): Replace all instances of
>>>     ix86_builtins[...] with get_builtin(...).
>>>     (ix86_builtin_reciprocal): Ditto.
>>
>> OK, with a couple of nits:
>>
>> +static tree get_builtin (enum ix86_builtins code)
>>
>> Please name this function ix86_get_builtin.
>>
>> +  if (current_function_decl)
>> +    target_tree = DECL_FUNCTION_SPECIFIC_TARGET (current_function_decl);
>> +  if (target_tree)
>> +    opts = TREE_TARGET_OPTION (target_tree);
>> +  else
>> +    opts = TREE_TARGET_OPTION (target_option_default_node);
>> +
>>
>> opts = TREE_TARGET_OPTION (target_tree ? target_tree :
>> target_option_default_node);
>
> Just curious:
>> +static tree get_builtin (enum ix86_builtins code)
>> +{
> []
>> +>
> []
>> @@ -33677,9 +33701,9 @@ ix86_builtin_vectorized_function (tree fndecl, tre
>>        if (out_mode == DFmode && in_mode == DFmode)
>> {
>>   if (out_n == 2 && in_n == 2)
>> -    return ix86_builtins[IX86_BUILTIN_SQRTPD];
>> +    get_builtin (IX86_BUILTIN_SQRTPD);
>>   else if (out_n == 4 && in_n == 4)
>> -    return ix86_builtins[IX86_BUILTIN_SQRTPD256];
>> +    get_builtin (IX86_BUILTIN_SQRTPD256);
>> }
>>        break;
>
> I must be missing something?
> Don't you have to return ix86_get_builtin(...) ?

Huh, I didn't even notice this mistake.

Uros.



More information about the Gcc-patches mailing list