target_clones constexpr
Schulz, Roland
roland.schulz@intel.com
Thu Jan 11 19:38:00 GMT 2018
From: Victor Rodriguez [mailto:vm.rod25@gmail.com]
> On Thu, Jan 11, 2018 at 3:10 AM, Mason <slash.tmp@free.fr> wrote:
> > On 11/01/2018 03:44, Roland Schulz wrote:
> >
> >> Is it possible to have a "if constexpr" inside a __target_clones__
> >> multiversioned function to have certain parts of the function depend
> >> on the target? Outside of multiversioned function one normally would
> >> use preprocessor defines to query the SIMD support (e.g. __AVX__).
> >> But this doesn't work inside target_clones given that preprocessor
> >> variables don't depend on the target. __builtin_cpu_supports doesn't
> >> work because it isn't constexpr (and is meant for runtime detection).
> >> One could extract the target specific part into its own function
> >> (using target rather than target_clones attribute) but that doesn't
> >> work with different types. Is there some other way to query the
> >> target which does work inside a multiversioned function?
> Are you looking for something like this?
>
>
> #include <stdio.h>
> #include <immintrin.h>
>
> #define MAX 1000000
> int a[256], b[256], c[256];
>
> __attribute__((target_clones("avx2","arch=atom","default")))
> void foo(){
>     int i,x;
>
>     for (x=0; x<MAX; x++){
>         for (i=0; i<256; i++){
>             a[i] = b[i] + c[i];
>         }
>     }
> }
I would like to do something like:
__attribute__((target_clones("avx","default")))
void foo(){
    constexpr int width = []() {
        if (__builtin_cpu_supports ("avx"))
            return 8;
        else
            return 4;
    }();
    // vector_size takes a size in bytes, so scale the lane count
    typedef int vec __attribute__ ((vector_size (width * sizeof(int))));
    //use vec for something
}
int main() {
    foo();
}
I would like to use width to do manual strip-mining or instantiate SIMD C++ templates. The problem is that this doesn't compile because __builtin_cpu_supports isn't constexpr. For every clone of foo, the compiler knows the SIMD support at compile time. I'm looking for a builtin which lets me query that in a constexpr way.
Inside foo with __attribute__((target_clones("avx","default"))), where default=sse2, __builtin_cpu_supports("avx512f") cannot be constexpr, because whether the CPU supports AVX512 is only known at runtime. But with the two clones for AVX and SSE, it is known at compile time that the supported SIMD width is 4 for the sse2 clone and >=8 for the AVX clone. My question is whether there is a constexpr way to query that information.
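For comparison, here is a minimal sketch of the manual alternative to target_clones: a kernel templated on the lane count, one wrapper per target using the target attribute, and a runtime dispatch with __builtin_cpu_supports. The names add_kernel, add_avx2, add_default and add_arrays are made up for illustration, and "avx2" is chosen here only because the example adds integers. This is not a constexpr query of the clone's target, so it does not answer the question; it only shows the boilerplate such a query would avoid. It also assumes the compiler inlines the kernel into each wrapper; otherwise the kernel is compiled for the default target.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical kernel, templated on the SIMD width in int lanes.
// Uses the GNU vector extension; vector_size is given in bytes.
template <int Lanes>
inline void add_kernel(const int* b, const int* c, int* a, std::size_t n) {
    typedef int vec __attribute__((vector_size(Lanes * sizeof(int))));
    std::size_t i = 0;
    // Manual strip-mining: process Lanes elements per iteration.
    for (; i + Lanes <= n; i += Lanes) {
        vec vb, vc;
        std::memcpy(&vb, b + i, sizeof(vec));  // unaligned-safe load
        std::memcpy(&vc, c + i, sizeof(vec));
        vec va = vb + vc;
        std::memcpy(a + i, &va, sizeof(vec));
    }
    // Scalar remainder loop.
    for (; i < n; ++i)
        a[i] = b[i] + c[i];
}

#if defined(__x86_64__) || defined(__i386__)
// Wrapper compiled for AVX2; the width is fixed by hand to 8 lanes.
__attribute__((target("avx2")))
void add_avx2(const int* b, const int* c, int* a, std::size_t n) {
    add_kernel<8>(b, c, a, n);
}
#endif

// Wrapper compiled for the default target; 4 lanes (SSE2 on x86-64).
void add_default(const int* b, const int* c, int* a, std::size_t n) {
    add_kernel<4>(b, c, a, n);
}

// Runtime dispatch, done by hand instead of by target_clones.
void add_arrays(const int* b, const int* c, int* a, std::size_t n) {
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("avx2")) {
        add_avx2(b, c, a, n);
        return;
    }
#endif
    add_default(b, c, a, n);
}
```

Calling add_arrays(b, c, a, n) stores b[i] + c[i] into a[i] for all i < n, whichever wrapper is selected. The width is a compile-time constant inside each instantiation, but it has to be repeated by hand for every target, which is what a constexpr query inside target_clones would make unnecessary.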
Roland