[Bug middle-end/82329] New: #pragma GCC target/optimize incurs high compilation time cost
amonakov at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Tue Sep 26 18:09:00 GMT 2017
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82329
Bug ID: 82329
Summary: #pragma GCC target/optimize incurs high compilation time cost
Product: gcc
Version: 7.0
Status: UNCONFIRMED
Keywords: compile-time-hog
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: amonakov at gcc dot gnu.org
Target Milestone: ---
Translation units that include "umbrella" x86 intrinsic headers, i.e. x86intrin.h
or immintrin.h, are noticeably slow to compile:
$ time echo '#include <x86intrin.h>' | gcc -xc - -S -o /dev/null -Os
real 0m0.162s
user 0m0.150s
sys 0m0.010s
This is because directives like '#pragma GCC target("sse3")' in included files
cause ~8600 intrinsic declarations to parse very slowly. The pragma causes a
'target' attribute to be added to each declaration at the beginning of
attribs.c:decl_attributes, and then the loop over attributes goes into
lookup_scoped_attribute_spec and later on into handle_target_attribute and
ix86_valid_target_attribute_p, all of which seem fairly inefficient.
It probably would have been better to appropriately memoize and reuse tree
nodes instead of looking up the same two items in the hash over and over again.
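To illustrate the kind of memoization meant here, below is a minimal standalone sketch. It is not GCC code. The names target_node, build_target_node, target_node_for and declare_many are hypothetical, and the "slow path" merely stands in for the option parsing and hashing work seen in the profile. It assumes that consecutive declarations under one pragma carry the same target string, so a one-entry cache is enough to skip the repeated work.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the node built from a target string. */
struct target_node {
    char spec[32];
};

/* Counts how often the slow path runs (for illustration only). */
static unsigned long slow_builds;

/* Hypothetical slow path: stands in for parsing and hashing the options. */
static struct target_node *build_target_node(const char *spec)
{
    struct target_node *n = malloc(sizeof *n);
    if (!n)
        abort();
    strncpy(n->spec, spec, sizeof n->spec - 1);
    n->spec[sizeof n->spec - 1] = '\0';
    slow_builds++;
    return n;
}

/* One-entry memo: the node built for the most recent target string. */
static struct target_node *memo;

/* Reuse the previous node when the target string is unchanged.
   Older nodes are not freed, since earlier declarations may still
   refer to them. A spec longer than the buffer is stored truncated,
   so it never matches and simply takes the slow path each time. */
static struct target_node *target_node_for(const char *spec)
{
    if (memo && strcmp(memo->spec, spec) == 0)
        return memo;
    memo = build_target_node(spec);
    return memo;
}

/* Simulate attaching the same target to `count` declarations and
   return how many times the slow path ran while doing so. */
static unsigned long declare_many(const char *spec, unsigned long count)
{
    unsigned long before = slow_builds;
    for (unsigned long i = 0; i < count; i++)
        (void)target_node_for(spec);
    return slow_builds - before;
}
```

With this sketch, declare_many("sse3", 10000) runs the slow path once instead of 10000 times. Whether a one-entry cache fits the actual code paths in decl_attributes and the target hooks is a separate question.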
On the testcase below, which isolates just this issue, perf shows:
10.52% cc1 cc1 [.] cl_option_hasher::hash
9.52% cc1 cc1 [.] cl_optimization_save
5.17% cc1 libc-2.24.so [.] __strcmp_sse2_unaligned
3.80% cc1 cc1 [.] iterative_hash_host_wide_int
3.43% cc1 libc-2.24.so [.] _int_malloc
2.47% cc1 libc-2.24.so [.] _int_free
2.31% cc1 libc-2.24.so [.] malloc
2.28% cc1 libc-2.24.so [.] malloc_consolidate
2.09% cc1 cc1 [.] ggc_internal_alloc
1.91% cc1 cc1 [.] ix86_valid_target_attribute_tree
#define x10(x, a) \
x(a##0) x(a##1) x(a##2) x(a##3) x(a##4) x(a##5) x(a##6) x(a##7) x(a##8) x(a##9)
#define x100(x, a) \
x10(x, a##0) x10(x, a##1) x10(x, a##2) x10(x, a##3) x10(x, a##4) \
x10(x, a##5) x10(x, a##6) x10(x, a##7) x10(x, a##8) x10(x, a##9)
#define x1000(x, a) \
x100(x, a##0) x100(x, a##1) x100(x, a##2) x100(x, a##3) x100(x, a##4) \
x100(x, a##5) x100(x, a##6) x100(x, a##7) x100(x, a##8) x100(x, a##9)
#define x10000(x, a) \
x1000(x, a##0) x1000(x, a##1) x1000(x, a##2) x1000(x, a##3) x1000(x, a##4) \
x1000(x, a##5) x1000(x, a##6) x1000(x, a##7) x1000(x, a##8) x1000(x, a##9)
#define x(a) void a(void);
#pragma GCC target("sse3")
x10000(x, a)