Created attachment 31026 [details]
Benchmark files

Putting #pragma once in a header file, with or without additional classic include guards, makes GCC very slow. I've created a small benchmark that I'm attaching, with the following files:

* common.h: a common file to be included 10,000 times, protected with guards or the pragma. In the tarball there are three versions to be tested.
* header0000.h to header9999.h: header files that include common.h
* all.h: header file that includes all header????.h files.
* main.c: main program, includes common.h and all.h

Building that program with

  gcc -o main main.c

takes 1.5 seconds with #pragma once, while merely using include guards takes only 70 milliseconds.
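For readers without the tarball, the file layout described above could be regenerated with something like the following. This is only a sketch under assumptions: the function name generate_headers is hypothetical, it is not the script used to build the attachment, and it writes only the header????.h files and all.h (common.h and main.c still have to be supplied by hand in whichever guard/pragma variant is being measured).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes header0000.h .. header<count-1>.h and all.h
   into the current directory.  Each generated header includes common.h,
   and all.h includes every generated header.
   Returns 0 on success, -1 on an I/O failure.  */
int generate_headers(int count)
{
    assert(count > 0 && count <= 10000);

    FILE *all = fopen("all.h", "w");
    if (!all)
        return -1;

    for (int i = 0; i < count; i++) {
        char name[32];
        snprintf(name, sizeof name, "header%04d.h", i);

        FILE *h = fopen(name, "w");
        if (!h) {
            fclose(all);
            return -1;
        }
        fputs("#include \"common.h\"\n", h);
        fclose(h);

        /* Record this header in the umbrella file.  */
        fprintf(all, "#include \"%s\"\n", name);
    }

    return fclose(all) == 0 ? 0 : -1;
}
```

Calling generate_headers(10000) from a small main() in an empty directory should give the same shape as the benchmark described in this comment.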
Created attachment 33017 [details]
Log of running the attached tests

I have attached a log of running the test on GCC 4.8.2. I can rerun it on a newer version if needed; 4.8.2 was just what I had at the time.
The GCC preprocessor is very optimized for include guards. Perhaps the same optimizations are not applied for "#pragma once". Someone will need to investigate where the two code paths differ and what could be the reason for the slow-down. Unfortunately, there are very few developers working on the preprocessor and very few users of "#pragma once", so this will likely have very low priority unless one of those users takes the lead to fix it.
I think the difference is the code in libcpp/files.c:should_stack_file that starts:

  /* Now we've read the file's contents, we can stack it if there
     are no once-only files.  */
  if (!pfile->seen_once_only)
    return true;

Any use of "#pragma once" sets this flag. This then leads to a loop over all headers, so n^2 behavior.

I think the rationale for this code is that the #pragma must prevent a second inclusion, even if done by a different file name; whereas #ifdef exclusion doesn't suffer from this issue.

One possible fix might be to use a hash table rather than a linked list for finding potential duplicates.
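To make the hash-table suggestion concrete, here is a minimal sketch of the idea. It is not libcpp code: the names once_table, once_entry, once_table_add and once_table_maybe_seen are hypothetical, and it assumes that candidate duplicates can be pre-filtered by file size and modification time before any more expensive comparison is done. The point is only that finding candidates becomes a bucket lookup instead of a walk over every header seen so far.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record for a file already marked once-only.  */
struct once_entry {
    uint64_t size;            /* file size in bytes */
    int64_t mtime;            /* modification time */
    struct once_entry *next;  /* next entry in the same bucket */
};

#define ONCE_BUCKETS 1024u    /* must be a power of two */

struct once_table {
    struct once_entry *buckets[ONCE_BUCKETS];
};

/* Mix size and mtime into a bucket index.  */
static size_t once_hash(uint64_t size, int64_t mtime)
{
    uint64_t h = (size * 0x9E3779B97F4A7C15ull) ^ (uint64_t)mtime;
    h ^= h >> 32;
    return (size_t)(h & (ONCE_BUCKETS - 1));
}

/* Returns true if some recorded once-only file has the same size and
   mtime.  A hit is only a candidate; a caller would still have to
   confirm that the two files really are the same.  */
bool once_table_maybe_seen(const struct once_table *t,
                           uint64_t size, int64_t mtime)
{
    assert(t != NULL);
    const struct once_entry *e = t->buckets[once_hash(size, mtime)];
    for (; e != NULL; e = e->next)
        if (e->size == size && e->mtime == mtime)
            return true;
    return false;
}

/* Records a once-only file.  Returns false if allocation fails.  */
bool once_table_add(struct once_table *t, uint64_t size, int64_t mtime)
{
    assert(t != NULL);
    struct once_entry *e = malloc(sizeof *e);
    if (e == NULL)
        return false;
    size_t b = once_hash(size, mtime);
    e->size = size;
    e->mtime = mtime;
    e->next = t->buckets[b];
    t->buckets[b] = e;
    return true;
}
```

With a zero-initialized struct once_table, each new header costs one expected-constant-time lookup, so the total work over n headers is roughly linear rather than quadratic, as long as few files share the same size and mtime.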
(In reply to Tom Tromey from comment #3)
> I think the rationale for this code is that the #pragma must
> prevent a second inclusion, even if done by a different file
> name; whereas #ifdef exclusion doesn't suffer from this issue.

Sorry, I don't understand this. The #ifdef include-guards do prevent a second inclusion, even if done by a different file name, no?
(In reply to Manuel López-Ibáñez from comment #4)
> Sorry, I don't understand this. The #ifdef include-guards do prevent a
> second inclusion, even if done by a different file name, no?

No, including the file via a different filename (e.g. via a symlink) will re-open the file and preprocess it again. The include guards mean you don't get re-definition of the file contents, but the file still gets included twice. You can confirm that with strace.

With #pragma once the file is not even re-opened, so the compiler needs to do extra checking.
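As an illustration of why two different names can refer to one file, the following POSIX sketch compares device and inode numbers from stat(). The function name same_file is hypothetical, and this is just one way a tool could detect the aliasing; it is not a claim about how GCC decides that two paths are the same header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>

/* Returns true when both paths resolve to the same underlying file,
   e.g. a header and a symlink pointing at it.  Returns false if either
   path cannot be examined.  */
bool same_file(const char *a, const char *b)
{
    struct stat sa, sb;

    assert(a != NULL && b != NULL);
    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return false;

    /* stat() follows symlinks, so matching device and inode numbers
       mean the two names reach the same file.  */
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

A string comparison of the two paths would miss this case, which is why once-only handling needs some check beyond the file name.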
The issue still exists in gcc7. Some numbers from the 3*10000 file inclusion benchmark here:
https://tinodidriksen.com/2011/08/cpp-include-speed/

$ /usr/bin/time g++ -c guards-only/main.cpp
0:00.28
$ /usr/bin/time g++ -c pragma-only/main.cpp
0:01.46

Compared to clang, gcc's guard implementation is pretty fast:

$ /usr/bin/time clang++ -c guards-only/main.cpp
0:00.92
$ /usr/bin/time clang++ -c pragma-only/main.cpp
0:00.87
(In reply to Tom Tromey from comment #3)
> One possible fix might be to use a hash table rather than a
> linked list for finding potential duplicates.

I'm adding the "easyhack" keyword, because Tom's suggestion above seems sensible and probably isn't too hard to do.
(See the thread at https://inbox.sourceware.org/gcc-patches/6027e3bb-99f9-573b-ff5e-ea1a48882df0@acm.org/.)