[Bug middle-end/82329] New: #pragma GCC target/optimize incurs high compilation time cost

amonakov at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Tue Sep 26 18:09:00 GMT 2017


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82329

            Bug ID: 82329
           Summary: #pragma GCC target/optimize incurs high compilation
                    time cost
           Product: gcc
           Version: 7.0
            Status: UNCONFIRMED
          Keywords: compile-time-hog
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: amonakov at gcc dot gnu.org
  Target Milestone: ---

Translation units that include "umbrella" x86 intrinsic headers, i.e. x86intrin.h
or immintrin.h, are noticeably slow to compile:

$ time echo '#include <x86intrin.h>' | gcc -xc - -S -o /dev/null -Os

real    0m0.162s
user    0m0.150s
sys     0m0.010s

This is because directives like '#pragma GCC target("sse3")' in the included
files cause the ~8600 intrinsic declarations to parse very slowly. The pragma
causes a 'target' attribute to be added to each declaration at the beginning of
attribs.c:decl_attributes, and then the loop over attributes goes into
lookup_scoped_attribute_spec and later on into handle_target_attribute and
ix86_valid_target_attribute_p, all of which seem fairly inefficient.

It probably would have been better to appropriately memoize and reuse tree
nodes instead of looking up the same two items in the hash table over and over
again.
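
To illustrate the kind of memoization meant here, below is a minimal
standalone sketch in plain C. It is not GCC code: build_target_node,
get_target_node and declare_many are hypothetical names, and the "node" is just
an integer standing in for whatever the real per-declaration work produces.
The assumption is only that consecutive declarations under one pragma carry
the same target string, so a one-entry cache is enough to skip the repeated
work.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counts how often the expensive path runs.  */
static int expensive_calls;

/* Hypothetical stand-in for the costly per-declaration work (option
   parsing, hashing, allocation).  Not a GCC function.  */
static unsigned build_target_node(const char *target)
{
    unsigned h = 0;
    expensive_calls++;
    for (const char *p = target; *p; p++)
        h = h * 31u + (unsigned char)*p;
    return h;
}

/* One-entry cache keyed by the target string.  */
static char cached_key[64];
static unsigned cached_value;
static int cache_valid;

/* Return the node for TARGET, reusing the previous result when the
   string is the same as on the last call.  */
static unsigned get_target_node(const char *target)
{
    assert(target != NULL);

    /* Strings too long for the key buffer bypass the cache.  */
    if (strlen(target) >= sizeof cached_key)
        return build_target_node(target);

    if (cache_valid && strcmp(cached_key, target) == 0)
        return cached_value;

    cached_value = build_target_node(target);
    strcpy(cached_key, target);
    cache_valid = 1;
    return cached_value;
}

/* Simulate N declarations under the same target string and report how
   many times the expensive path was actually taken.  */
static int declare_many(const char *target, int n)
{
    int before = expensive_calls;
    for (int i = 0; i < n; i++)
        (void)get_target_node(target);
    return expensive_calls - before;
}
```

With this shape, calling declare_many("sse3", 10000) on a fresh cache takes the
expensive path once rather than 10000 times. Whether a one-entry cache or a
proper table is the better fit inside the compiler is a separate question.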

On the testcase below, which isolates just this issue, perf shows:

    10.52%  cc1      cc1                      [.] cl_option_hasher::hash
     9.52%  cc1      cc1                      [.] cl_optimization_save
     5.17%  cc1      libc-2.24.so             [.] __strcmp_sse2_unaligned
     3.80%  cc1      cc1                      [.] iterative_hash_host_wide_int
     3.43%  cc1      libc-2.24.so             [.] _int_malloc
     2.47%  cc1      libc-2.24.so             [.] _int_free
     2.31%  cc1      libc-2.24.so             [.] malloc
     2.28%  cc1      libc-2.24.so             [.] malloc_consolidate
     2.09%  cc1      cc1                      [.] ggc_internal_alloc
     1.91%  cc1      cc1                      [.] ix86_valid_target_attribute_tree


#define x10(x, a) \
x(a##0) x(a##1) x(a##2) x(a##3) x(a##4) x(a##5) x(a##6) x(a##7) x(a##8) x(a##9)
#define x100(x, a) \
x10(x, a##0) x10(x, a##1) x10(x, a##2) x10(x, a##3) x10(x, a##4) \
x10(x, a##5) x10(x, a##6) x10(x, a##7) x10(x, a##8) x10(x, a##9)
#define x1000(x, a) \
x100(x, a##0) x100(x, a##1) x100(x, a##2) x100(x, a##3) x100(x, a##4) \
x100(x, a##5) x100(x, a##6) x100(x, a##7) x100(x, a##8) x100(x, a##9)
#define x10000(x, a) \
x1000(x, a##0) x1000(x, a##1) x1000(x, a##2) x1000(x, a##3) x1000(x, a##4) \
x1000(x, a##5) x1000(x, a##6) x1000(x, a##7) x1000(x, a##8) x1000(x, a##9)

#define x(a) void a(void);

#pragma GCC target("sse3")
x10000(x, a)

