Bug 84402 - [meta] GCC build system: parallelism bottleneck
Summary: [meta] GCC build system: parallelism bottleneck
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: bootstrap
Version: unknown
Importance: P3 normal
Target Milestone: 10.0
Assignee: Not yet assigned to anyone
URL:
Keywords: build, meta-bug
Depends on: 78288 87832
Blocks:
Reported: 2018-02-15 09:45 UTC by Martin Liška
Modified: 2019-11-07 13:58 UTC (History)
CC List: 7 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2018-09-07 00:00:00


Attachments
make all-host -j8 on 8 core Haswell machine (16.73 KB, image/svg+xml)
2018-02-15 09:46 UTC, Martin Liška
make all-host -j128 on 128 core EPYC machine (14.37 KB, image/svg+xml)
2018-02-15 09:47 UTC, Martin Liška
make (for configure --disable-bootstrap) -j128 on 128 core EPYC machine (19.67 KB, image/svg+xml)
2018-02-15 09:47 UTC, Martin Liška
wall time report: make (for configure --disable-bootstrap) on Haswell machine (system compiler -O2 -g) (8.14 KB, text/plain)
2018-02-15 09:48 UTC, Martin Liška
wall time report: bootstrap stage1 on Haswell machine (5.13 KB, text/plain)
2018-02-15 09:49 UTC, Martin Liška
wall time report: bootstrap stage2 on Haswell machine (7.49 KB, text/plain)
2018-02-15 09:49 UTC, Martin Liška
wall time report: bootstrap stage3 on Haswell machine (9.24 KB, text/plain)
2018-02-15 09:49 UTC, Martin Liška
Parallel build of make all-host on 8 core Haswell machine (19.34 KB, image/svg+xml)
2018-02-15 13:17 UTC, Martin Liška
Parallel build of make all-host on 8 core Haswell machine (19.37 KB, image/svg+xml)
2018-02-15 18:27 UTC, Martin Liška
Parallel build of make all-host on 8 core Haswell machine (23.04 KB, image/svg+xml)
2018-02-16 09:13 UTC, Martin Liška
Parallel build of make all-host on 128 core EPYC machine (20.19 KB, image/svg+xml)
2018-02-16 09:14 UTC, Martin Liška
-ftime-report for most time consuming files on Haswell machine (13.72 KB, text/plain)
2018-02-21 08:45 UTC, Martin Liška
-ftime-report for most time consuming files on Haswell machine (17.30 KB, text/plain)
2018-02-21 14:00 UTC, Martin Liška
Parallel build of make all-host on 128 core EPYC machine (log file) (11.50 KB, text/plain)
2018-02-23 09:01 UTC, Martin Liška
make -j 64 all-gcc, with --disable-bootstrap, on 64-cores. Blue means dependency to gimple-match. (91.20 KB, application/pdf)
2019-02-07 14:02 UTC, Giuliano Belinassi

Description Martin Liška 2018-02-15 09:45:05 UTC
As discussed yesterday on IRC, the current GCC build has various issues that keep it from being fully parallelizable on machines with a high number of CPUs.

I hacked make so that it records a timestamp when a target is triggered and when it finishes:
https://github.com/marxin/make/tree/timestamp

Then I built GCC with -j1 and used the following parser to generate reports:
https://github.com/marxin/script-misc/blob/master/parse-make-log.py

I prepared various reports that I'm going to add as attachments.
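
For readers who want to reproduce this kind of report, here is a minimal sketch of the idea, not the actual parse-make-log.py. The log line format ("<timestamp> start|end <target>") and the function name wall_time_report are assumptions made for illustration; the patched make may print something different. The threshold parameter drops targets that finish faster than the given number of seconds.

```python
import re

# Hypothetical log format (an assumption, not the real patched-make output):
#   "<seconds> start <target>"  and  "<seconds> end <target>"
LINE_RE = re.compile(r"^(?P<ts>\d+(?:\.\d+)?) (?P<event>start|end) (?P<target>\S+)$")


def wall_time_report(lines, threshold=0.5):
    """Return (target, seconds) pairs at or above threshold, slowest first."""
    started = {}    # target -> timestamp of its "start" line
    durations = {}  # target -> elapsed wall time in seconds
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # ordinary make/compiler output, not a timestamp record
        ts = float(m.group("ts"))
        target = m.group("target")
        if m.group("event") == "start":
            started[target] = ts
        elif target in started:
            durations[target] = ts - started.pop(target)
    report = [(t, d) for t, d in durations.items() if d >= threshold]
    report.sort(key=lambda item: item[1], reverse=True)
    return report


# Made-up sample of a sequential (-j1) build log.
sample_log = [
    "0.0 start libcpp/directives.o",
    "0.25 end libcpp/directives.o",
    "0.25 start gimple-match.o",
    "g++ -c gimple-match.c",
    "95.25 end gimple-match.o",
    "95.25 start insn-emit.o",
    "120.25 end insn-emit.o",
]

for target, seconds in wall_time_report(sample_log):
    print(f"{seconds:8.2f}s  {target}")
```

Because the build is sequential, every duration is pure wall time for that one target, which is what makes the slowest entries stand out.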
Comment 1 Martin Liška 2018-02-15 09:46:20 UTC
Created attachment 43420 [details]
make all-host -j8 on 8 core Haswell machine
Comment 2 Martin Liška 2018-02-15 09:47:05 UTC
Created attachment 43421 [details]
make all-host -j128 on 128 core EPYC machine
Comment 3 Martin Liška 2018-02-15 09:47:36 UTC
Created attachment 43422 [details]
make (for configure --disable-bootstrap) -j128 on 128 core EPYC machine
Comment 4 Martin Liška 2018-02-15 09:48:56 UTC
Created attachment 43423 [details]
wall time report: make (for configure --disable-bootstrap) on Haswell machine (system compiler -O2 -g)
Comment 5 Martin Liška 2018-02-15 09:49:19 UTC
Created attachment 43424 [details]
wall time report: bootstrap stage1 on Haswell machine
Comment 6 Martin Liška 2018-02-15 09:49:30 UTC
Created attachment 43425 [details]
wall time report: bootstrap stage2 on Haswell machine
Comment 7 Martin Liška 2018-02-15 09:49:49 UTC
Created attachment 43426 [details]
wall time report: bootstrap stage3 on Haswell machine
Comment 8 Martin Liška 2018-02-15 10:08:54 UTC
I forgot to note that the minimum time threshold for the wall time reports is 0.5s; targets that finish faster are not listed.
Comment 9 Martin Liška 2018-02-15 13:17:26 UTC
Created attachment 43428 [details]
Parallel build of make all-host on 8 core Haswell machine
Comment 10 Martin Liška 2018-02-15 18:27:52 UTC
Created attachment 43432 [details]
Parallel build of make all-host on 8 core Haswell machine
Comment 11 Martin Liška 2018-02-15 18:40:40 UTC
(In reply to Martin Liška from comment #10)
> Created attachment 43432 [details]
> Parallel build of make all-host on 8 core Haswell machine

This was generated with a slightly modified make (one that can run fully in parallel):
https://github.com/marxin/make/tree/timestamp-v2

The output is then parsed and a 'stacked' graph is generated:
https://github.com/marxin/script-misc/blob/master/parse-make-log-parallel.py
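
One way to build such a 'stacked' view is to give every target a row ("lane") so that targets running at the same time never share a row. The sketch below is an assumption about how this could be done, not a description of parse-make-log-parallel.py; the function name assign_lanes and the sample numbers are made up.

```python
def assign_lanes(jobs):
    """Place each (name, start, end) job in the lowest lane that is free.

    Returns a dict mapping job name to lane index. Jobs that overlap in
    time always end up in different lanes.
    """
    lane_free_at = []  # lane_free_at[i] = end time of the last job in lane i
    lanes = {}
    for name, start, end in sorted(jobs, key=lambda j: (j[1], j[2])):
        for i, free_at in enumerate(lane_free_at):
            if free_at <= start:
                lane_free_at[i] = end
                lanes[name] = i
                break
        else:
            # Every existing lane is still busy, so open a new one.
            lane_free_at.append(end)
            lanes[name] = len(lane_free_at) - 1
    return lanes


# Made-up intervals (seconds since the start of the build).
sample_jobs = [("a.o", 0, 10), ("b.o", 0, 4), ("c.o", 4, 6), ("d.o", 5, 12)]
lanes = assign_lanes(sample_jobs)
for name, lane in sorted(lanes.items(), key=lambda item: item[1]):
    print(lane, name)
```

The number of lanes in use at a given moment is the effective parallelism at that moment, so long stretches with only one or two occupied lanes point at the bottleneck targets.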
Comment 12 Martin Liška 2018-02-16 09:13:42 UTC
Created attachment 43439 [details]
Parallel build of make all-host on 8 core Haswell machine
Comment 13 Martin Liška 2018-02-16 09:14:24 UTC
Created attachment 43440 [details]
Parallel build of make all-host on 128 core EPYC machine
Comment 14 Martin Liška 2018-02-21 08:45:49 UTC
Created attachment 43478 [details]
-ftime-report for most time consuming files on Haswell machine
Comment 15 Segher Boessenkool 2018-02-21 11:58:10 UTC
This is a -O0 build?  That's what that time report shows afaics.
Comment 16 Martin Liška 2018-02-21 14:00:56 UTC
Created attachment 43482 [details]
-ftime-report for most time consuming files on Haswell machine

Properly generated with -O2, which was missing in the previous version.
Comment 17 Tom Tromey 2018-02-22 14:46:13 UTC
The results in comment #13 seem to be missing some compilations --
I would have expected to see more files from libcpp in there.
As it is I only see directives.o and line-map.o.
Comment 18 Martin Liška 2018-02-23 09:01:40 UTC
Created attachment 43492 [details]
Parallel build of make all-host on 128 core EPYC machine (log file)
Comment 19 Martin Liška 2018-02-23 09:02:29 UTC
(In reply to Tom Tromey from comment #17)
> The results in comment #13 seem to be missing some compilations --
> I would have expected to see more files from libcpp in there.
> As it is I only see directives.o and line-map.o.

There was a minimum threshold of 0.5s; please take a look at the log file in:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402#c18
Comment 20 Martin Liška 2018-04-04 12:48:29 UTC
For the libsanitizer/*/*_interceptors files I made a quick patch:
https://github.com/marxin/gcc/commit/5ce658230db567474997fa411f23ac78366487ce
which basically splits asan_interceptors.cc and sanitizer_common_interceptors.inc and moves the implementation of the string functions to a separate compilation unit.
This shrinks the compile time of asan_interceptors.cc from 38s to 34s when it is built with a stage1 compiler that has checking enabled.

I believe splitting the interceptors into a couple of logical sub-files will make the build much faster. I can imagine splitting them into components like string, stdio, time, process, thread, math, ...
Here is the list of interceptors grepped from sanitizer_common_interceptors.inc:

INTERCEPTOR(SIZE_T, strlen, const char *s) {
INTERCEPTOR(SIZE_T, strnlen, const char *s, SIZE_T maxlen) {
INTERCEPTOR(char*, strndup, const char *s, uptr size) {
INTERCEPTOR(char*, __strndup, const char *s, uptr size) {
INTERCEPTOR(char*, textdomain, const char *domainname) {
INTERCEPTOR(int, strcmp, const char *s1, const char *s2) {
INTERCEPTOR(int, strncmp, const char *s1, const char *s2, uptr size) {
INTERCEPTOR(int, strcasecmp, const char *s1, const char *s2) {
INTERCEPTOR(int, strncasecmp, const char *s1, const char *s2, SIZE_T size) {
INTERCEPTOR(char*, strstr, const char *s1, const char *s2) {
INTERCEPTOR(char*, strcasestr, const char *s1, const char *s2) {
INTERCEPTOR(char*, strtok, char *str, const char *delimiters) {
INTERCEPTOR(void*, memmem, const void *s1, SIZE_T len1, const void *s2,
INTERCEPTOR(char*, strchr, const char *s, int c) {
INTERCEPTOR(char*, strchrnul, const char *s, int c) {
INTERCEPTOR(char*, strrchr, const char *s, int c) {
INTERCEPTOR(SIZE_T, strspn, const char *s1, const char *s2) {
INTERCEPTOR(SIZE_T, strcspn, const char *s1, const char *s2) {
INTERCEPTOR(char *, strpbrk, const char *s1, const char *s2) {
INTERCEPTOR(void *, memset, void *dst, int v, uptr size) {
INTERCEPTOR(void *, memmove, void *dst, const void *src, uptr size) {
INTERCEPTOR(void *, memcpy, void *dst, const void *src, uptr size) {
INTERCEPTOR(int, memcmp, const void *a1, const void *a2, uptr size) {
INTERCEPTOR(void*, memchr, const void *s, int c, SIZE_T n) {
INTERCEPTOR(void*, memrchr, const void *s, int c, SIZE_T n) {
INTERCEPTOR(double, frexp, double x, int *exp) {
INTERCEPTOR(float, frexpf, float x, int *exp) {
INTERCEPTOR(long double, frexpl, long double x, int *exp) {
INTERCEPTOR(SSIZE_T, read, int fd, void *ptr, SIZE_T count) {
INTERCEPTOR(SIZE_T, fread, void *ptr, SIZE_T size, SIZE_T nmemb, void *file) {
INTERCEPTOR(SSIZE_T, pread, int fd, void *ptr, SIZE_T count, OFF_T offset) {
INTERCEPTOR(SSIZE_T, pread64, int fd, void *ptr, SIZE_T count, OFF64_T offset) {
INTERCEPTOR_WITH_SUFFIX(SSIZE_T, readv, int fd, __sanitizer_iovec *iov,
INTERCEPTOR(SSIZE_T, preadv, int fd, __sanitizer_iovec *iov, int iovcnt,
INTERCEPTOR(SSIZE_T, preadv64, int fd, __sanitizer_iovec *iov, int iovcnt,
INTERCEPTOR(SSIZE_T, write, int fd, void *ptr, SIZE_T count) {
INTERCEPTOR(SIZE_T, fwrite, const void *p, uptr size, uptr nmemb, void *file) {
INTERCEPTOR(SSIZE_T, pwrite, int fd, void *ptr, SIZE_T count, OFF_T offset) {
INTERCEPTOR(SSIZE_T, pwrite64, int fd, void *ptr, OFF64_T count,
INTERCEPTOR_WITH_SUFFIX(SSIZE_T, writev, int fd, __sanitizer_iovec *iov,
INTERCEPTOR(SSIZE_T, pwritev, int fd, __sanitizer_iovec *iov, int iovcnt,
INTERCEPTOR(SSIZE_T, pwritev64, int fd, __sanitizer_iovec *iov, int iovcnt,
INTERCEPTOR(int, prctl, int option, unsigned long arg2,
INTERCEPTOR(unsigned long, time, unsigned long *t) {
INTERCEPTOR(__sanitizer_tm *, localtime, unsigned long *timep) {
INTERCEPTOR(__sanitizer_tm *, localtime_r, unsigned long *timep, void *result) {
INTERCEPTOR(__sanitizer_tm *, gmtime, unsigned long *timep) {
INTERCEPTOR(__sanitizer_tm *, gmtime_r, unsigned long *timep, void *result) {
INTERCEPTOR(char *, ctime, unsigned long *timep) {
INTERCEPTOR(char *, ctime_r, unsigned long *timep, char *result) {
INTERCEPTOR(char *, asctime, __sanitizer_tm *tm) {
INTERCEPTOR(char *, asctime_r, __sanitizer_tm *tm, char *result) {
INTERCEPTOR(long, mktime, __sanitizer_tm *tm) {
INTERCEPTOR(char *, strptime, char *s, char *format, __sanitizer_tm *tm) {
INTERCEPTOR(int, vscanf, const char *format, va_list ap)
INTERCEPTOR(int, vsscanf, const char *str, const char *format, va_list ap)
INTERCEPTOR(int, vfscanf, void *stream, const char *format, va_list ap)
INTERCEPTOR(int, __isoc99_vscanf, const char *format, va_list ap)
INTERCEPTOR(int, __isoc99_vsscanf, const char *str, const char *format,
INTERCEPTOR(int, __isoc99_vfscanf, void *stream, const char *format, va_list ap)
INTERCEPTOR(int, scanf, const char *format, ...)
INTERCEPTOR(int, fscanf, void *stream, const char *format, ...)
INTERCEPTOR(int, sscanf, const char *str, const char *format, ...)
INTERCEPTOR(int, __isoc99_scanf, const char *format, ...)
INTERCEPTOR(int, __isoc99_fscanf, void *stream, const char *format, ...)
INTERCEPTOR(int, __isoc99_sscanf, const char *str, const char *format, ...)
INTERCEPTOR(int, vprintf, const char *format, va_list ap)
INTERCEPTOR(int, vfprintf, __sanitizer_FILE *stream, const char *format,
INTERCEPTOR(int, vsnprintf, char *str, SIZE_T size, const char *format,
INTERCEPTOR(int, vsnprintf_l, char *str, SIZE_T size, void *loc,
INTERCEPTOR(int, snprintf_l, char *str, SIZE_T size, void *loc,
INTERCEPTOR(int, vsprintf, char *str, const char *format, va_list ap)
INTERCEPTOR(int, vasprintf, char **strp, const char *format, va_list ap)
INTERCEPTOR(int, __isoc99_vprintf, const char *format, va_list ap)
INTERCEPTOR(int, __isoc99_vfprintf, __sanitizer_FILE *stream,
INTERCEPTOR(int, __isoc99_vsnprintf, char *str, SIZE_T size, const char *format,
INTERCEPTOR(int, __isoc99_vsprintf, char *str, const char *format,
INTERCEPTOR(int, printf, const char *format, ...)
INTERCEPTOR(int, fprintf, __sanitizer_FILE *stream, const char *format, ...)
INTERCEPTOR(int, sprintf, char *str, const char *format, ...) // NOLINT
INTERCEPTOR(int, snprintf, char *str, SIZE_T size, const char *format, ...)
INTERCEPTOR(int, asprintf, char **strp, const char *format, ...)
INTERCEPTOR(int, __isoc99_printf, const char *format, ...)
INTERCEPTOR(int, __isoc99_fprintf, __sanitizer_FILE *stream, const char *format,
INTERCEPTOR(int, __isoc99_sprintf, char *str, const char *format, ...)
INTERCEPTOR(int, __isoc99_snprintf, char *str, SIZE_T size,
INTERCEPTOR(int, ioctl, int d, unsigned long request, ...) {
INTERCEPTOR(__sanitizer_passwd *, getpwnam, const char *name) {
INTERCEPTOR(__sanitizer_passwd *, getpwuid, u32 uid) {
INTERCEPTOR(__sanitizer_group *, getgrnam, const char *name) {
INTERCEPTOR(__sanitizer_group *, getgrgid, u32 gid) {
INTERCEPTOR(int, getpwnam_r, const char *name, __sanitizer_passwd *pwd,
INTERCEPTOR(int, getpwuid_r, u32 uid, __sanitizer_passwd *pwd, char *buf,
INTERCEPTOR(int, getgrnam_r, const char *name, __sanitizer_group *grp,
INTERCEPTOR(int, getgrgid_r, u32 gid, __sanitizer_group *grp, char *buf,
INTERCEPTOR(__sanitizer_passwd *, getpwent, int dummy) {
INTERCEPTOR(__sanitizer_group *, getgrent, int dummy) {
INTERCEPTOR(__sanitizer_passwd *, fgetpwent, void *fp) {
INTERCEPTOR(__sanitizer_group *, fgetgrent, void *fp) {
INTERCEPTOR(int, getpwent_r, __sanitizer_passwd *pwbuf, char *buf,
INTERCEPTOR(int, fgetpwent_r, void *fp, __sanitizer_passwd *pwbuf, char *buf,
INTERCEPTOR(int, getgrent_r, __sanitizer_group *pwbuf, char *buf, SIZE_T buflen,
INTERCEPTOR(int, fgetgrent_r, void *fp, __sanitizer_group *pwbuf, char *buf,
INTERCEPTOR(void, setpwent, int dummy) {
INTERCEPTOR(void, endpwent, int dummy) {
INTERCEPTOR(void, setgrent, int dummy) {
INTERCEPTOR(void, endgrent, int dummy) {
INTERCEPTOR(int, clock_getres, u32 clk_id, void *tp) {
INTERCEPTOR(int, clock_gettime, u32 clk_id, void *tp) {
INTERCEPTOR(int, clock_settime, u32 clk_id, const void *tp) {
INTERCEPTOR(int, getitimer, int which, void *curr_value) {
INTERCEPTOR(int, setitimer, int which, const void *new_value, void *old_value) {
INTERCEPTOR(int, glob, const char *pattern, int flags,
INTERCEPTOR(int, glob64, const char *pattern, int flags,
INTERCEPTOR_WITH_SUFFIX(int, wait, int *status) {
INTERCEPTOR_WITH_SUFFIX(int, waitid, int idtype, long long id, void *infop,
INTERCEPTOR_WITH_SUFFIX(int, waitid, int idtype, int id, void *infop,
INTERCEPTOR_WITH_SUFFIX(int, waitpid, int pid, int *status, int options) {
INTERCEPTOR(int, wait3, int *status, int options, void *rusage) {
INTERCEPTOR(int, __wait4, int pid, int *status, int options, void *rusage) {
INTERCEPTOR(int, wait4, int pid, int *status, int options, void *rusage) {
INTERCEPTOR(char *, inet_ntop, int af, const void *src, char *dst, u32 size) {
INTERCEPTOR(int, inet_pton, int af, const char *src, void *dst) {
INTERCEPTOR(int, inet_aton, const char *cp, void *dst) {
INTERCEPTOR(int, pthread_getschedparam, uptr thread, int *policy, int *param) {
INTERCEPTOR(int, getaddrinfo, char *node, char *service,
INTERCEPTOR(int, getnameinfo, void *sockaddr, unsigned salen, char *host,
INTERCEPTOR(int, getsockname, int sock_fd, void *addr, int *addrlen) {
INTERCEPTOR(struct __sanitizer_hostent *, gethostbyname, char *name) {
INTERCEPTOR(struct __sanitizer_hostent *, gethostbyaddr, void *addr, int len,
INTERCEPTOR(struct __sanitizer_hostent *, gethostent, int fake) {
INTERCEPTOR(struct __sanitizer_hostent *, gethostbyname2, char *name, int af) {
INTERCEPTOR(int, gethostbyname_r, char *name, struct __sanitizer_hostent *ret,
INTERCEPTOR(int, gethostent_r, struct __sanitizer_hostent *ret, char *buf,
INTERCEPTOR(int, gethostbyaddr_r, void *addr, int len, int type,
INTERCEPTOR(int, gethostbyname2_r, char *name, int af,
INTERCEPTOR(int, getsockopt, int sockfd, int level, int optname, void *optval,
INTERCEPTOR(int, accept, int fd, void *addr, unsigned *addrlen) {
INTERCEPTOR(int, accept4, int fd, void *addr, unsigned *addrlen, int f) {
INTERCEPTOR(double, modf, double x, double *iptr) {
INTERCEPTOR(float, modff, float x, float *iptr) {
INTERCEPTOR(long double, modfl, long double x, long double *iptr) {
INTERCEPTOR(SSIZE_T, recvmsg, int fd, struct __sanitizer_msghdr *msg,
INTERCEPTOR(SSIZE_T, sendmsg, int fd, struct __sanitizer_msghdr *msg,
INTERCEPTOR(int, getpeername, int sockfd, void *addr, unsigned *addrlen) {
INTERCEPTOR(int, sysinfo, void *info) {
INTERCEPTOR(__sanitizer_dirent *, opendir, const char *path) {
INTERCEPTOR(__sanitizer_dirent *, readdir, void *dirp) {
INTERCEPTOR(int, readdir_r, void *dirp, __sanitizer_dirent *entry,
INTERCEPTOR(__sanitizer_dirent64 *, readdir64, void *dirp) {
INTERCEPTOR(int, readdir64_r, void *dirp, __sanitizer_dirent64 *entry,
INTERCEPTOR(uptr, ptrace, int request, int pid, void *addr, void *data) {
INTERCEPTOR(char *, setlocale, int category, char *locale) {
INTERCEPTOR(char *, getcwd, char *buf, SIZE_T size) {
INTERCEPTOR(char *, get_current_dir_name, int fake) {
INTERCEPTOR(INTMAX_T, strtoimax, const char *nptr, char **endptr, int base) {
INTERCEPTOR(INTMAX_T, strtoumax, const char *nptr, char **endptr, int base) {
INTERCEPTOR(SIZE_T, mbstowcs, wchar_t *dest, const char *src, SIZE_T len) {
INTERCEPTOR(SIZE_T, mbsrtowcs, wchar_t *dest, const char **src, SIZE_T len,
INTERCEPTOR(SIZE_T, mbsnrtowcs, wchar_t *dest, const char **src, SIZE_T nms,
INTERCEPTOR(SIZE_T, wcstombs, char *dest, const wchar_t *src, SIZE_T len) {
INTERCEPTOR(SIZE_T, wcsrtombs, char *dest, const wchar_t **src, SIZE_T len,
INTERCEPTOR(SIZE_T, wcsnrtombs, char *dest, const wchar_t **src, SIZE_T nms,
INTERCEPTOR(SIZE_T, wcrtomb, char *dest, wchar_t src, void *ps) {
INTERCEPTOR(int, tcgetattr, int fd, void *termios_p) {
INTERCEPTOR(char *, realpath, const char *path, char *resolved_path) {
INTERCEPTOR(char *, canonicalize_file_name, const char *path) {
INTERCEPTOR(SIZE_T, confstr, int name, char *buf, SIZE_T len) {
INTERCEPTOR(int, sched_getaffinity, int pid, SIZE_T cpusetsize, void *mask) {
INTERCEPTOR(int, sched_getparam, int pid, void *param) {
INTERCEPTOR(char *, strerror, int errnum) {
INTERCEPTOR(int, strerror_r, int errnum, char *buf, SIZE_T buflen) {
INTERCEPTOR(char *, strerror_r, int errnum, char *buf, SIZE_T buflen) {
INTERCEPTOR(int, __xpg_strerror_r, int errnum, char *buf, SIZE_T buflen) {
INTERCEPTOR(int, scandir, char *dirp, __sanitizer_dirent ***namelist,
INTERCEPTOR(int, scandir64, char *dirp, __sanitizer_dirent64 ***namelist,
INTERCEPTOR(int, getgroups, int size, u32 *lst) {
INTERCEPTOR(int, poll, __sanitizer_pollfd *fds, __sanitizer_nfds_t nfds,
INTERCEPTOR(int, ppoll, __sanitizer_pollfd *fds, __sanitizer_nfds_t nfds,
INTERCEPTOR(int, wordexp, char *s, __sanitizer_wordexp_t *p, int flags) {
INTERCEPTOR(int, sigwait, __sanitizer_sigset_t *set, int *sig) {
INTERCEPTOR(int, sigwaitinfo, __sanitizer_sigset_t *set, void *info) {
INTERCEPTOR(int, sigtimedwait, __sanitizer_sigset_t *set, void *info,
INTERCEPTOR(int, sigemptyset, __sanitizer_sigset_t *set) {
INTERCEPTOR(int, sigfillset, __sanitizer_sigset_t *set) {
INTERCEPTOR(int, sigpending, __sanitizer_sigset_t *set) {
INTERCEPTOR(int, sigprocmask, int how, __sanitizer_sigset_t *set,
INTERCEPTOR(int, backtrace, void **buffer, int size) {
INTERCEPTOR(char **, backtrace_symbols, void **buffer, int size) {
INTERCEPTOR(void, _exit, int status) {
INTERCEPTOR(int, pthread_mutex_lock, void *m) {
INTERCEPTOR(int, pthread_mutex_unlock, void *m) {
INTERCEPTOR(__sanitizer_mntent *, getmntent, void *fp) {
INTERCEPTOR(__sanitizer_mntent *, getmntent_r, void *fp,
INTERCEPTOR(int, statfs, char *path, void *buf) {
INTERCEPTOR(int, fstatfs, int fd, void *buf) {
INTERCEPTOR(int, statfs64, char *path, void *buf) {
INTERCEPTOR(int, fstatfs64, int fd, void *buf) {
INTERCEPTOR(int, statvfs, char *path, void *buf) {
INTERCEPTOR(int, fstatvfs, int fd, void *buf) {
INTERCEPTOR(int, statvfs64, char *path, void *buf) {
INTERCEPTOR(int, fstatvfs64, int fd, void *buf) {
INTERCEPTOR(int, initgroups, char *user, u32 group) {
INTERCEPTOR(char *, ether_ntoa, __sanitizer_ether_addr *addr) {
INTERCEPTOR(__sanitizer_ether_addr *, ether_aton, char *buf) {
INTERCEPTOR(int, ether_ntohost, char *hostname, __sanitizer_ether_addr *addr) {
INTERCEPTOR(int, ether_hostton, char *hostname, __sanitizer_ether_addr *addr) {
INTERCEPTOR(int, ether_line, char *line, __sanitizer_ether_addr *addr,
INTERCEPTOR(char *, ether_ntoa_r, __sanitizer_ether_addr *addr, char *buf) {
INTERCEPTOR(__sanitizer_ether_addr *, ether_aton_r, char *buf,
INTERCEPTOR(int, shmctl, int shmid, int cmd, void *buf) {
INTERCEPTOR(int, random_r, void *buf, u32 *result) {
INTERCEPTOR_PTHREAD_ATTR_GET(detachstate, sizeof(int))
INTERCEPTOR_PTHREAD_ATTR_GET(guardsize, sizeof(SIZE_T))
INTERCEPTOR_PTHREAD_ATTR_GET(schedparam, struct_sched_param_sz)
INTERCEPTOR_PTHREAD_ATTR_GET(schedpolicy, sizeof(int))
INTERCEPTOR_PTHREAD_ATTR_GET(scope, sizeof(int))
INTERCEPTOR_PTHREAD_ATTR_GET(stacksize, sizeof(SIZE_T))
INTERCEPTOR(int, pthread_attr_getstack, void *attr, void **addr, SIZE_T *size) {
INTERCEPTOR_PTHREAD_ATTR_GET(inheritsched, sizeof(int))
INTERCEPTOR(int, pthread_attr_getaffinity_np, void *attr, SIZE_T cpusetsize,
INTERCEPTOR_PTHREAD_MUTEXATTR_GET(pshared, sizeof(int))
INTERCEPTOR_PTHREAD_MUTEXATTR_GET(type, sizeof(int))
INTERCEPTOR_PTHREAD_MUTEXATTR_GET(protocol, sizeof(int))
INTERCEPTOR_PTHREAD_MUTEXATTR_GET(prioceiling, sizeof(int))
INTERCEPTOR_PTHREAD_MUTEXATTR_GET(robust, sizeof(int))
INTERCEPTOR_PTHREAD_MUTEXATTR_GET(robust_np, sizeof(int))
INTERCEPTOR_PTHREAD_RWLOCKATTR_GET(pshared, sizeof(int))
INTERCEPTOR_PTHREAD_RWLOCKATTR_GET(kind_np, sizeof(int))
INTERCEPTOR_PTHREAD_CONDATTR_GET(pshared, sizeof(int))
INTERCEPTOR_PTHREAD_CONDATTR_GET(clock, sizeof(int))
INTERCEPTOR_PTHREAD_BARRIERATTR_GET(pshared, sizeof(int)) // !mac !android
INTERCEPTOR(char *, tmpnam, char *s) {
INTERCEPTOR(char *, tmpnam_r, char *s) {
INTERCEPTOR(int, ttyname_r, int fd, char *name, SIZE_T namesize) {
INTERCEPTOR(char *, tempnam, char *dir, char *pfx) {
INTERCEPTOR(int, pthread_setname_np, uptr thread, const char *name) {
INTERCEPTOR(void, sincos, double x, double *sin, double *cos) {
INTERCEPTOR(void, sincosf, float x, float *sin, float *cos) {
INTERCEPTOR(void, sincosl, long double x, long double *sin, long double *cos) {
INTERCEPTOR(double, remquo, double x, double y, int *quo) {
INTERCEPTOR(float, remquof, float x, float y, int *quo) {
INTERCEPTOR(long double, remquol, long double x, long double y, int *quo) {
INTERCEPTOR(double, lgamma, double x) {
INTERCEPTOR(float, lgammaf, float x) {
INTERCEPTOR(long double, lgammal, long double x) {
INTERCEPTOR(double, lgamma_r, double x, int *signp) {
INTERCEPTOR(float, lgammaf_r, float x, int *signp) {
INTERCEPTOR(long double, lgammal_r, long double x, int *signp) {
INTERCEPTOR(int, drand48_r, void *buffer, double *result) {
INTERCEPTOR(int, lrand48_r, void *buffer, long *result) {
INTERCEPTOR(int, rand_r, unsigned *seedp) {
INTERCEPTOR(SSIZE_T, getline, char **lineptr, SIZE_T *n, void *stream) {
INTERCEPTOR(SSIZE_T, __getdelim, char **lineptr, SIZE_T *n, int delim,
INTERCEPTOR(SSIZE_T, getdelim, char **lineptr, SIZE_T *n, int delim,
INTERCEPTOR(SIZE_T, iconv, void *cd, char **inbuf, SIZE_T *inbytesleft,
INTERCEPTOR(__sanitizer_clock_t, times, void *tms) {
INTERCEPTOR(void *, __tls_get_addr, void *arg) {
INTERCEPTOR(uptr, __tls_get_addr_internal, void *arg) {
INTERCEPTOR(SSIZE_T, listxattr, const char *path, char *list, SIZE_T size) {
INTERCEPTOR(SSIZE_T, llistxattr, const char *path, char *list, SIZE_T size) {
INTERCEPTOR(SSIZE_T, flistxattr, int fd, char *list, SIZE_T size) {
INTERCEPTOR(SSIZE_T, getxattr, const char *path, const char *name, char *value,
INTERCEPTOR(SSIZE_T, lgetxattr, const char *path, const char *name, char *value,
INTERCEPTOR(SSIZE_T, fgetxattr, int fd, const char *name, char *value,
INTERCEPTOR(int, getresuid, void *ruid, void *euid, void *suid) {
INTERCEPTOR(int, getresgid, void *rgid, void *egid, void *sgid) {
INTERCEPTOR(int, getifaddrs, __sanitizer_ifaddrs **ifap) {
INTERCEPTOR(char *, if_indextoname, unsigned int ifindex, char* ifname) {
INTERCEPTOR(unsigned int, if_nametoindex, const char* ifname) {
INTERCEPTOR(int, capget, void *hdrp, void *datap) {
INTERCEPTOR(int, capset, void *hdrp, const void *datap) {
INTERCEPTOR(void *, __aeabi_memmove, void *to, const void *from, uptr size) {
INTERCEPTOR(void *, __aeabi_memmove4, void *to, const void *from, uptr size) {
INTERCEPTOR(void *, __aeabi_memmove8, void *to, const void *from, uptr size) {
INTERCEPTOR(void *, __aeabi_memcpy, void *to, const void *from, uptr size) {
INTERCEPTOR(void *, __aeabi_memcpy4, void *to, const void *from, uptr size) {
INTERCEPTOR(void *, __aeabi_memcpy8, void *to, const void *from, uptr size) {
INTERCEPTOR(void *, __aeabi_memset, void *block, uptr size, int c) {
INTERCEPTOR(void *, __aeabi_memset4, void *block, uptr size, int c) {
INTERCEPTOR(void *, __aeabi_memset8, void *block, uptr size, int c) {
INTERCEPTOR(void *, __aeabi_memclr, void *block, uptr size) {
INTERCEPTOR(void *, __aeabi_memclr4, void *block, uptr size) {
INTERCEPTOR(void *, __aeabi_memclr8, void *block, uptr size) {
INTERCEPTOR(void *, __bzero, void *block, uptr size) {
INTERCEPTOR(int, ftime, __sanitizer_timeb *tp) {
INTERCEPTOR(void, xdrmem_create, __sanitizer_XDR *xdrs, uptr addr,
INTERCEPTOR(void, xdrstdio_create, __sanitizer_XDR *xdrs, void *file, int op) {
INTERCEPTOR(int, xdr_bytes, __sanitizer_XDR *xdrs, char **p, unsigned *sizep,
INTERCEPTOR(int, xdr_string, __sanitizer_XDR *xdrs, char **p,
INTERCEPTOR(void *, tsearch, void *key, void **rootp,
INTERCEPTOR(int, __uflow, __sanitizer_FILE *fp) {
INTERCEPTOR(int, __underflow, __sanitizer_FILE *fp) {
INTERCEPTOR(int, __overflow, __sanitizer_FILE *fp, int ch) {
INTERCEPTOR(int, __wuflow, __sanitizer_FILE *fp) {
INTERCEPTOR(int, __wunderflow, __sanitizer_FILE *fp) {
INTERCEPTOR(int, __woverflow, __sanitizer_FILE *fp, int ch) {
INTERCEPTOR(__sanitizer_FILE *, fopen, const char *path, const char *mode) {
INTERCEPTOR(__sanitizer_FILE *, fdopen, int fd, const char *mode) {
INTERCEPTOR(__sanitizer_FILE *, freopen, const char *path, const char *mode,
INTERCEPTOR(__sanitizer_FILE *, fopen64, const char *path, const char *mode) {
INTERCEPTOR(__sanitizer_FILE *, freopen64, const char *path, const char *mode,
INTERCEPTOR(__sanitizer_FILE *, open_memstream, char **ptr, SIZE_T *sizeloc) {
INTERCEPTOR(__sanitizer_FILE *, open_wmemstream, wchar_t **ptr,
INTERCEPTOR(__sanitizer_FILE *, fmemopen, void *buf, SIZE_T size,
INTERCEPTOR(int, _obstack_begin_1, __sanitizer_obstack *obstack, int sz,
INTERCEPTOR(int, _obstack_begin, __sanitizer_obstack *obstack, int sz,
INTERCEPTOR(void, _obstack_newchunk, __sanitizer_obstack *obstack, int length) {
INTERCEPTOR(int, fflush, __sanitizer_FILE *fp) {
INTERCEPTOR(int, fclose, __sanitizer_FILE *fp) {
INTERCEPTOR(void*, dlopen, const char *filename, int flag) {
INTERCEPTOR(int, dlclose, void *handle) {
INTERCEPTOR(char *, getpass, const char *prompt) {
INTERCEPTOR(int, timerfd_settime, int fd, int flags, void *new_value,
INTERCEPTOR(int, timerfd_gettime, int fd, void *curr_value) {
INTERCEPTOR(int, mlock, const void *addr, uptr len) {
INTERCEPTOR(int, munlock, const void *addr, uptr len) {
INTERCEPTOR(int, mlockall, int flags) {
INTERCEPTOR(int, munlockall, void) {
INTERCEPTOR(__sanitizer_FILE *, fopencookie, void *cookie, const char *mode,
INTERCEPTOR(int, sem_init, __sanitizer_sem_t *s, int pshared, unsigned value) {
INTERCEPTOR(int, sem_destroy, __sanitizer_sem_t *s) {
INTERCEPTOR(int, sem_wait, __sanitizer_sem_t *s) {
INTERCEPTOR(int, sem_trywait, __sanitizer_sem_t *s) {
INTERCEPTOR(int, sem_timedwait, __sanitizer_sem_t *s, void *abstime) {
INTERCEPTOR(int, sem_post, __sanitizer_sem_t *s) {
INTERCEPTOR(int, sem_getvalue, __sanitizer_sem_t *s, int *sval) {
INTERCEPTOR(int, pthread_setcancelstate, int state, int *oldstate) {
INTERCEPTOR(int, pthread_setcanceltype, int type, int *oldtype) {
INTERCEPTOR(int, mincore, void *addr, uptr length, unsigned char *vec) {
INTERCEPTOR(SSIZE_T, process_vm_readv, int pid, __sanitizer_iovec *local_iov,
INTERCEPTOR(SSIZE_T, process_vm_writev, int pid, __sanitizer_iovec *local_iov,
INTERCEPTOR(char *, ctermid, char *s) {
INTERCEPTOR(char *, ctermid_r, char *s) {
INTERCEPTOR(SSIZE_T, recv, int fd, void *buf, SIZE_T len, int flags) {
INTERCEPTOR(SSIZE_T, recvfrom, int fd, void *buf, SIZE_T len, int flags,
INTERCEPTOR(SSIZE_T, send, int fd, void *buf, SIZE_T len, int flags) {
INTERCEPTOR(SSIZE_T, sendto, int fd, void *buf, SIZE_T len, int flags,
INTERCEPTOR(int, eventfd_read, int fd, u64 *value) {
INTERCEPTOR(int, eventfd_write, int fd, u64 value) {
INTERCEPTOR(int, stat, const char *path, void *buf) {
INTERCEPTOR(int, __xstat, int version, const char *path, void *buf) {
INTERCEPTOR(int, __xstat64, int version, const char *path, void *buf) {
INTERCEPTOR(int, __lxstat, int version, const char *path, void *buf) {
INTERCEPTOR(int, __lxstat64, int version, const char *path, void *buf) {
INTERCEPTOR(void *, getutent, int dummy) {
INTERCEPTOR(void *, getutid, void *ut) {
INTERCEPTOR(void *, getutline, void *ut) {
INTERCEPTOR(void *, getutxent, int dummy) {
INTERCEPTOR(void *, getutxid, void *ut) {
INTERCEPTOR(void *, getutxline, void *ut) {
INTERCEPTOR(int, getloadavg, double *loadavg, int nelem) {
INTERCEPTOR(int, mcheck, void (*abortfunc)(int mstatus)) {
INTERCEPTOR(int, mcheck_pedantic, void (*abortfunc)(int mstatus)) {
INTERCEPTOR(int, mprobe, void *ptr) {
INTERCEPTOR(SIZE_T, wcslen, const wchar_t *s) {
INTERCEPTOR(SIZE_T, wcsnlen, const wchar_t *s, SIZE_T n) {
INTERCEPTOR(wchar_t *, wcscat, wchar_t *dst, const wchar_t *src) {
INTERCEPTOR(wchar_t *, wcsncat, wchar_t *dst, const wchar_t *src, SIZE_T n) {
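
To get a rough feel for how such a split might look, the grepped lines can be bucketed by interceptor name. The following is only a sketch: the component names come from the suggestion above, but the matching rules (RULES) and the function group_interceptors are my own illustrative assumptions, and a real split would need a maintainer's judgment. Lines using the INTERCEPTOR_PTHREAD_*_GET helper macros are skipped here.

```python
import re
from collections import defaultdict

# Matches INTERCEPTOR(ret_type, name, ...) and INTERCEPTOR_WITH_SUFFIX(...).
NAME_RE = re.compile(r"^INTERCEPTOR(?:_WITH_SUFFIX)?\(\s*[^,]+,\s*(\w+)")

# Illustrative rules only; the first matching rule wins.
RULES = [
    ("stdio", re.compile(r"printf|scanf|^f(open|close|flush|read|write)")),
    ("time", re.compile(r"^(time|localtime|gmtime|ctime|asctime|mktime|strptime|clock_)")),
    ("string", re.compile(r"^(str|mem|wcs)")),
    ("thread", re.compile(r"^(pthread_|sem_)")),
    ("math", re.compile(r"^(frexp|modf|sincos|remquo|lgamma)")),
]


def group_interceptors(lines):
    """Return {component: [interceptor names]} for the grepped lines."""
    groups = defaultdict(list)
    for line in lines:
        m = NAME_RE.match(line)
        if m is None:
            continue  # not a plain INTERCEPTOR(...) line
        name = m.group(1)
        for component, pattern in RULES:
            if pattern.search(name):
                groups[component].append(name)
                break
        else:
            groups["other"].append(name)
    return dict(groups)


sample_lines = [
    "INTERCEPTOR(SIZE_T, strlen, const char *s) {",
    "INTERCEPTOR(char *, strptime, char *s, char *format, __sanitizer_tm *tm) {",
    "INTERCEPTOR(int, vsnprintf, char *str, SIZE_T size, const char *format,",
    "INTERCEPTOR(long double, frexpl, long double x, int *exp) {",
    "INTERCEPTOR_WITH_SUFFIX(int, wait, int *status) {",
    "INTERCEPTOR(int, sem_timedwait, __sanitizer_sem_t *s, void *abstime) {",
    "INTERCEPTOR_PTHREAD_ATTR_GET(detachstate, sizeof(int))",
]

for component, names in sorted(group_interceptors(sample_lines).items()):
    print(f"{component}: {', '.join(names)}")
```

Counting the names per bucket on the full list would show whether the proposed components are balanced or whether one of them still dominates.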
Comment 21 rguenther@suse.de 2018-04-04 13:12:37 UTC
On Wed, 4 Apr 2018, marxin at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402
> 
> --- Comment #20 from Martin Liška <marxin at gcc dot gnu.org> ---
> For the libsanitizer/*/*_interceptors I make a quick patch:
> https://github.com/marxin/gcc/commit/5ce658230db567474997fa411f23ac78366487ce
> which basically splits asan_interceptors.cc and
> sanitizer_common_interceptors.inc and moves implementation of string functions
> to a separate compile unit.
> This shrinks time from 38->34s for asan_interceptors.cc being built with
> enabled checking stage1 compiler.
> 
> I believe splitting the interceptors to couple of logical sub-files will make
> it very fast. List of interceptors grepped from
> sanitizer_common_interceptors.inc:
> I can imagine splitting that to components like string, stdio, time, process,
> thread, math,..

The question is of course _why_ it is this slow.  It's not that this
is 10000s of functions or very large ones...
Comment 22 Martin Liška 2018-04-04 13:17:40 UTC
(In reply to rguenther@suse.de from comment #21)
> On Wed, 4 Apr 2018, marxin at gcc dot gnu.org wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402
> > 
> > --- Comment #20 from Martin Liška <marxin at gcc dot gnu.org> ---
> > For the libsanitizer/*/*_interceptors I make a quick patch:
> > https://github.com/marxin/gcc/commit/5ce658230db567474997fa411f23ac78366487ce
> > which basically splits asan_interceptors.cc and
> > sanitizer_common_interceptors.inc and moves implementation of string functions
> > to a separate compile unit.
> > This shrinks time from 38->34s for asan_interceptors.cc being built with
> > enabled checking stage1 compiler.
> > 
> > I believe splitting the interceptors to couple of logical sub-files will make
> > it very fast. List of interceptors grepped from
> > sanitizer_common_interceptors.inc:
> > I can imagine splitting that to components like string, stdio, time, process,
> > thread, math,..
> 
> The question is of course _why_ it is this slow.  It's not that this
> is 10000s of functions or very large ones...

It's analyzed here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78288
Comment 23 Martin Liška 2018-04-26 07:51:25 UTC
I can easily split insn-emit.c. Once we know which way the split should be done, I can prepare a patch for that.
Comment 24 Eric Gallager 2018-09-07 03:23:59 UTC
(In reply to Martin Liška from comment #23)
> I can easily split insn-emit.c. Once we know which way the split should be
> done, I can prepare a patch for that.

Confirmed, please do this!
Comment 25 Martin Liška 2018-09-10 11:49:11 UTC
Let me assign it.
Comment 26 Giuliano Belinassi 2019-02-07 14:02:16 UTC
Created attachment 45630 [details]
make -j 64 all-gcc, with --disable-bootstrap, on 64 cores. Blue marks a dependency of gimple-match.

Since gimple-match.c takes so long to compile, I was wondering if it might be possible to reorder the compilation so we can push its compilation early in the dependency graph.

I did the following steps: 
 1) 'configure --disable-bootstrap'
 2) 'make -j 64 all-gcc'
 3) 'make clean'. 
 4) 'make gimple-match.o' using a wrapper[1] that I created to log all files required by gimple-match. I then plotted the attached graph. Here, blue marks a dependency of gimple-match, and the longest bar is 'gimple-match.c' itself.

I used a 64-core AMD Opteron 6376 machine for this.

Any ideas?

[1] https://github.com/giulianobelinassi/gcc-timer-analysis
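
To make the reordering question concrete, here is a minimal sketch of greedy list scheduling. It is a toy model, not the GCC Makefile: it ignores dependencies entirely, and the job durations are made-up numbers in which the 100 s job stands in for gimple-match.c.

```python
import heapq


def makespan(durations, workers):
    """Greedy list scheduling: each job, taken in the given order,
    is handed to the worker that becomes free first."""
    free_at = [0.0] * workers
    heapq.heapify(free_at)
    for d in durations:
        start = heapq.heappop(free_at)      # earliest available worker
        heapq.heappush(free_at, start + d)  # busy until start + d
    return max(free_at)


# Hypothetical compile times in seconds (assumed, not measured).
small_jobs = [10.0] * 12
long_job = 100.0  # stands in for gimple-match.c

late = makespan(small_jobs + [long_job], workers=4)
early = makespan([long_job] + small_jobs, workers=4)
print(f"long job scheduled last:  {late:.0f} s")
print(f"long job scheduled first: {early:.0f} s")
```

In this model, starting the long job first shortens the build, but the total can never drop below the duration of the longest single job.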
Comment 27 Martin Liška 2019-02-07 14:28:39 UTC
> Since gimple-match.c takes so long to compile, I was wondering if it might
> be possible to reorder the compilation so we can push its compilation early
> in the dependency graph.

No, the proper fix would be to split the generated files and compile them in parallel, and likewise for all the generated insn-*.c files.

Anyway, I like the graph you made :)
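
As a rough illustration of what splitting a generated file could look like, the sketch below distributes functions over a fixed number of output files with a largest-first greedy balance. Everything in it is assumed for the example: the function names, the cost estimates, and the insn-emit-N.c file names are hypothetical, and a real generator would also have to emit shared declarations.

```python
def split_into_shards(costs, n_shards):
    """Assign each function (name -> estimated compile cost) to the
    currently lightest shard, visiting the most expensive ones first."""
    shards = [[] for _ in range(n_shards)]
    loads = [0] * n_shards
    for name, cost in sorted(costs.items(), key=lambda kv: (-kv[1], kv[0])):
        i = loads.index(min(loads))  # lightest shard so far
        shards[i].append(name)
        loads[i] += cost
    return shards, loads


# Hypothetical generated functions with made-up cost estimates.
functions = {"gen_a": 50, "gen_b": 30, "gen_c": 30,
             "gen_d": 20, "gen_e": 10, "gen_f": 10}

shards, loads = split_into_shards(functions, 3)
for i, (names, load) in enumerate(zip(shards, loads), start=1):
    # The output file name is illustrative only.
    print(f"insn-emit-{i}.c: {names} (estimated cost {load})")
```

Each resulting file could then be listed as a separate object in the Makefile so that make can build them concurrently.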
Comment 28 Segher Boessenkool 2019-02-07 15:10:05 UTC
But which version of GCC is this graph for, and with what exact configuration?
Comment 29 Giuliano Belinassi 2019-02-07 16:17:18 UTC
> No, the proper fix would be to split the generated files and compile them in parallel, and likewise for all the generated insn-*.c files.

Indeed. However, I am working on parallelizing the compilation itself with threads. That may also lead to a solution, though it may not be the best one for this scenario.

> Anyway, I like the graph you made :)

Thank you.

> But which version of GCC is this graph for, and with what exact configuration?

* This is the gcc that I used to build: *

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/8/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 8.2.0-14' --with-bugurl=file:///usr/share/doc/gcc-8/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-8 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 8.2.0 (Debian 8.2.0-14) 

* The gcc that I built: *

Using built-in specs.
COLLECT_GCC=./xgcc
Target: x86_64-pc-linux-gnu
Configured with: /home/giulianob/gcc_svn/trunk//configure --disable-checking --disable-bootstrap
Thread model: posix
gcc version 9.0.1 20190205 (experimental) (GCC)
Comment 30 Martin Liška 2019-05-07 12:09:48 UTC
A possible solution could be to use '-flinker-output=nolto-rel -r' for the huge files.
Comment 31 Eric Gallager 2019-11-07 05:38:10 UTC
I think this came up at Cauldron, but I forget what exactly people said about it...
Comment 32 Giuliano Belinassi 2019-11-07 13:58:28 UTC
(In reply to Eric Gallager from comment #31)
> I think this came up at Cauldron, but I forget what exactly people said
> about it...

Actually, this PR predates Cauldron 2019. One way to fix this issue is to make the match.pd parser output several smaller gimple-match.c files and add them to the Makefile, then repeat this procedure for the other big files.

Another solution is to parallelize GCC's internals and have GCC communicate with Make somehow, so that when a CPU is idle it can be put to work compiling in parallel.
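
One existing channel that such communication might build on is the GNU make jobserver, which -flto=jobserver already relies on. That is my assumption, not something decided in this PR. The sketch below only shows how a child process could discover the jobserver from MAKEFLAGS, based on my reading of the GNU make manual: a file-descriptor pair in the older forms, or a named FIFO with the "fifo:" prefix in newer releases. It does not acquire or release any job tokens.

```python
import os
import re


def parse_jobserver(makeflags):
    """Return ('fifo', path), ('pipe', (read_fd, write_fd)),
    or None when MAKEFLAGS advertises no jobserver."""
    m = re.search(r"--jobserver-(?:auth|fds)=(\S+)", makeflags)
    if m is None:
        return None
    value = m.group(1)
    if value.startswith("fifo:"):
        return ("fifo", value[len("fifo:"):])
    read_fd, write_fd = value.split(",")
    return ("pipe", (int(read_fd), int(write_fd)))


# Outside of a parallel make, MAKEFLAGS is usually unset and this prints None.
print(parse_jobserver(os.environ.get("MAKEFLAGS", "")))
```

A compiler that found a jobserver this way could ask it for extra job slots before starting additional threads, so the overall -j limit would still be respected.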