This is the mail archive of the gcc-help@gcc.gnu.org mailing list for the GCC project.



Re: Function multiversioning question


Hi,

I'm coming back to this after some experiments. If one compiles the
attached example with

gcc -c archtest1.c

one gets the output

archtest1.c:4:2: warning: #warning outer file: AVX512F not defined [-Wcpp]
 #warning outer file: AVX512F not defined
  ^~~~~~~
In file included from archtest1.c:8:
archtest2.c:2:2: warning: #warning inner file: AVX512F defined [-Wcpp]
 #warning inner file: AVX512F defined

which seems to contradict Jonathan's point that macros are not
influenced by the #pragmas.

However, if I compile the same code with clang, I get

martin@debian:~/tmp$ clang-7 -c archtest1.c
archtest1.c:4:2: warning: outer file: AVX512F not defined [-W#warnings]
#warning outer file: AVX512F not defined
 ^
In file included from archtest1.c:8:
./archtest2.c:4:2: warning: inner file: AVX512F not defined [-W#warnings]
#warning inner file: AVX512F not defined
 ^
2 warnings generated.

So the compilers behave differently, even though clang tries to emulate
the GCC pragma.

My question is now: is the fact that gcc defines the __AVX512F__ macro
in the included file a bug, or is this working as intended?

Thanks,
  Martin


On 10/25/18 2:50 PM, Marc Glisse wrote:
> On Thu, 25 Oct 2018, Martin Reinecke wrote:
> 
>> Hi Jonathan,
>>
>> thanks for the quick reply!
>>
>>> Macros are defined during preprocessing, and the preprocessor doesn't
>>> know anything about the target_clones attribute. When the compiler
>>> sees the attribute it can't go back in time and alter the result of
>>> earlier preprocessing.
>>
>> I feared as much.
>> This creates a nasty asymmetry in the sense that gcc's own optimizations
>> will be able to use all target features (because the compiler knows that
>> it is OK to use specific features like AVX instructions) whereas the
>> user has no way to hand-optimize where this becomes necessary. At least
>> not using this nice mechanism.
>>
>>>> Is there a way to achieve what I have in mind?
>>>
>>> If you want three different implementations of the function I think
>>> you need three different clones. Or do runtime checks for the CPU
>>> features inside the function, but that seems suboptimal.
>>
>> I guess I'll just put all functions in question in a separate file and
>> compile this with different flags and name prefixes.
> 
> target_clones does nothing magic, you can also look at target and ifunc.
> https://gcc.gnu.org/wiki/FunctionMultiVersioning
> 
Attachment: archtest1.c

#ifdef __AVX512F__
#warning outer file: AVX512F defined
#else
#warning outer file: AVX512F not defined
#endif

#pragma GCC target("avx512f")
#include "archtest2.c"

Attachment: archtest2.c

#ifdef __AVX512F__
#warning inner file: AVX512F defined
#else
#warning inner file: AVX512F not defined
#endif
