Summary: | _Generic Feature Expansion | ||
---|---|---|---|
Product: | gcc | Reporter: | Srinath Parvathaneni <srinath.parvathaneni> |
Component: | c | Assignee: | Not yet assigned to anyone <unassigned> |
Status: | RESOLVED INVALID | ||
Severity: | normal | CC: | jakub |
Priority: | P3 | ||
Version: | 9.0 | ||
Target Milestone: | --- | ||
Host: | Target: | ||
Build: | Known to work: | ||
Known to fail: | Last reconfirmed: | ||
Attachments: | Preprocessor output |
It is unclear what you are complaining about. For the preprocessor, _Generic is a token like any other; it is up to the user to use the preprocessor in a sane way to avoid creating overly large output. Just note that p0 is used 7 times in the reg macro and x twice in the _TYPE macro, so you can easily do the math.

I'm using _Generic to create polymorphic implementations of MVE intrinsics. MVE has more than 50 data types (combinations) and intrinsics with up to 5 arguments. Nesting a call to just two intrinsics expands to more than 4000 lines. Nesting calls increases the preprocessor output exponentially and slows down the compiler. I was expecting that "_Generic" could expand only to the exactly matching combination, which would reduce the size of the preprocessor output file.

Then perhaps you want to use C++ instead of C? The way the preprocessor works and the way _Generic works are defined in the C standard; we can't handle it differently from what the standard says, and the type decisions for _Generic can't be made during preprocessing, because that requires syntactic analysis of the source. E.g. the glibc tgmath.h implementation using _Generic suffered similarly, which led to the introduction of a compiler builtin that handles what tgmath.h needs.
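The "do the math" remark can be made concrete with a small sketch. It only counts how many times the argument text is copied, using the reg, _TYPE and cast macros from the report's test.c; the helper names `copies_per_level` and `copies_at_depth` are hypothetical and exist only for this illustration.

```c
/* Illustrative arithmetic only: counts textual copies of a macro argument,
   based on the macro definitions in the report's test.c. */

/* reg(p0) names p0 seven times: once in _TYPE(p0), six times in cast(p0, T). */
enum { TYPE_CALLS_IN_REG = 1, CAST_CALLS_IN_REG = 6 };

/* _TYPE(x) names x twice; cast(param, type) names param twice. */
enum { X_USES_IN_TYPE = 2, PARAM_USES_IN_CAST = 2 };

/* Copies of the argument text in one fully expanded reg(...) call. */
static unsigned long long copies_per_level(void)
{
    return (unsigned long long)TYPE_CALLS_IN_REG * X_USES_IN_TYPE
         + (unsigned long long)CAST_CALLS_IN_REG * PARAM_USES_IN_CAST;
}

/* Copies of the innermost argument after `depth` nested reg(...) calls
   (hypothetical helper: depth 1 is reg(a), depth 2 is reg(reg(a))). */
static unsigned long long copies_at_depth(unsigned depth)
{
    unsigned long long total = 1;
    while (depth-- > 0)
        total *= copies_per_level();
    return total;
}
```

Under these counts, reg(a) carries 14 copies of `a` and reg(reg(a)) carries 196, before counting the surrounding _Generic text that is repeated along with each copy.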
Created attachment 46983
Preprocessor output

Compile the following test.c with gcc and inspect the preprocessed file. In C, a macro that uses "_Generic" expands every association in full, not just the matching one. With nested macros, this expansion enormously increases the size of the preprocessed file and slows down the compiler.

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 5.4.0-6ubuntu1~16.04.11' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.11)

$ cat test.c
enum { type_int_t = 1, type_char_t, type_float_t, type_int_p_t, type_float_p_t, type_char_p_t, type_int_16, type_int_32 };
float bb;
char cc;

#define _TYPE(x) _Generic(x, \
  int *: type_int_p_t, \
  float *: type_float_p_t, \
  char *: type_char_p_t, \
  default: _Generic(x, \
    short: type_int_16, \
    long: type_int_32,\
    int: type_int_t, \
    char: type_char_t, \
    float: type_float_t))

extern void *__undef;

#define cast(param, type) \
  _Generic(param, type: param, default: *(type *)__undef)

#define reg(p0) _Generic( \
  (int (*)[_TYPE(p0)])0, \
  int (*)[type_int_t]: reg_s8(cast(p0, int)), \
  int (*)[type_char_t]: reg_f8(cast(p0, char)), \
  int (*)[type_float_t]: reg_c8(cast(p0, float)), \
  int (*)[type_int_p_t]: reg_sp(cast(p0, int *)), \
  int (*)[type_float_p_t]: reg_fp(cast(p0, float *)), \
  int (*)[type_char_p_t]: reg_cp(cast(p0, char *)))

float reg_s8(int a) { return bb; }
void reg_f8(float a){}
void reg_c8(char a){}
void reg_sp(int *a){}
void reg_fp(float *a){}
void reg_cp(char *a){}

int main()
{
  int a;
  float b;
  char c;
  reg(a);
  reg(reg(a)); // more levels of nesting, more complex preprocessor output.
  return 0;
}

$ gcc test.c --save-temps
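One way to use the preprocessor more sparingly here, offered as a sketch and not as the resolution of this report, is to have _Generic select a function designator and then call it, so that the argument text appears only twice per level and no cast helper is needed. The `handle_*` functions and the `dispatch` macro below are hypothetical stand-ins, not the MVE intrinsics or the reg_* functions from test.c. The output still grows with nesting depth (a factor of 2 per level), only far more slowly than in the test case.

```c
/* Hypothetical handlers; each returns a tag so the selection is observable. */
static int handle_int(int v)       { (void)v; return 1; }
static int handle_char(char v)     { (void)v; return 2; }
static int handle_float(float v)   { (void)v; return 3; }
static int handle_int_ptr(int *v)  { (void)v; return 4; }

/* _Generic picks a function by the type of p0; the call happens once,
   outside the generic selection. p0 is named twice in total. */
#define dispatch(p0) _Generic((p0),   \
    int:   handle_int,                \
    char:  handle_char,               \
    float: handle_float,              \
    int *: handle_int_ptr)(p0)
```

With this shape, `dispatch(dispatch(b))` for a `float b` selects `handle_int` for the outer call, because the inner call yields an `int`.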