This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Logging data structure field accesses


To figure out which fields of struct tree_decl were used by constant declarations, I instrumented the compiler to log every access to fields of the data structure; I then summarized the data in a table listing the number of accesses for each field for each kind of declaration. Folks here at Apple found the results interesting, and suggested sharing 'em.

To recap, I added this instrumentation because I wanted to find which fields were truly used by each kind of declaration -- not just what the header files say are used. The number of accesses to each field also gives us some hints about how the fields are used, and may point to performance bottlenecks:

* If the number of accesses is small relative to the number of objects of that kind, then we know the field is only used intermittently, and might be worth hiding in a hash table to cut the size of declaration data structures.

* If the number of accesses is equal to the number of objects of that kind, then we can guess that the values may only be initialized or are only touched once, and thus might not actually be needed.

* If the number of accesses is a large multiple of the number of objects of that kind, it might indicate that we're inappropriately traversing every object multiple times.
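The three rules of thumb above can be sketched as a small classifier. This is only an illustration, not code from the instrumented compiler: the names classify_field and usage_name and the two cutoffs RARE_DIVISOR and HEAVY_MULTIPLE are assumptions made up for the example, and sensible cutoffs would have to come from looking at real data.

```c
#include <assert.h>

/* Hypothetical categories mirroring the three cases listed above. */
enum field_usage {
  USAGE_RARE,      /* few accesses per object: hash-table candidate */
  USAGE_ONCE,      /* one access per object: maybe only initialized */
  USAGE_HEAVY,     /* many accesses per object: repeated traversal? */
  USAGE_ORDINARY   /* none of the above */
};

/* Assumed cutoffs, chosen only for illustration. */
#define RARE_DIVISOR   10UL   /* "small": under 1/10 access per object */
#define HEAVY_MULTIPLE 20UL   /* "large multiple": 20 or more per object */

/* Classify one field of one kind of declaration, given the number of
   logged accesses and the number of objects of that kind.  Overflow of
   the multiplications is ignored in this sketch.  */
enum field_usage
classify_field (unsigned long accesses, unsigned long objects)
{
  assert (objects > 0);
  if (accesses == objects)
    return USAGE_ONCE;
  if (accesses * RARE_DIVISOR < objects)
    return USAGE_RARE;
  if (accesses >= objects * HEAVY_MULTIPLE)
    return USAGE_HEAVY;
  return USAGE_ORDINARY;
}

/* Human-readable label for a category, e.g. for a summary table. */
const char *
usage_name (enum field_usage u)
{
  switch (u)
    {
    case USAGE_RARE:  return "rare";
    case USAGE_ONCE:  return "once";
    case USAGE_HEAVY: return "heavy";
    default:          return "ordinary";
    }
}
```

For example, a field with 5 accesses across 1000 declarations would come out as "rare" under these cutoffs, while 50000 accesses across the same 1000 declarations would come out as "heavy".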


Adding the logging was pretty quick -- only an hour or two of typing and debugging -- on a 3.4 compiler. I hooked into the DECL_CHECK routine, as it's called for every field access, adding a second parameter naming the field being accessed in the code immediately after the check:


#define DECL_BUILT_IN_CLASS(NODE) \
(FUNCTION_DECL_CHECK (DECL_CHECK(NODE,"built_in_class"))->decl.built_in_class)


DECL_CHECK passed the field name (or NULL) into TREE_CLASS_CHECK; I added additional code to TREE_CLASS_CHECK so if the field name wasn't NULL, the routine would print the tree code (kind of object), tree code class (data structure used), field name, source file, and source line to standard error. I then used grep and awk to grab all the accesses for a given kind of declaration, and sort and count accesses for each field name.
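For readers without the GCC sources at hand, here is a toy sketch of the same idea outside the compiler. Everything in it is hypothetical: struct toy_decl, toy_decl_check, toy_log_count, and the TOY_* macros are stand-ins invented for the example, not the actual tree.h macros or the actual patch. It only shows the shape of the technique: an accessor macro passes the field name to a check routine, which logs a record to standard error when the name is non-NULL.

```c
#include <assert.h>
#include <stdio.h>

/* Toy stand-in for a declaration node; the real struct tree_decl is
   far larger.  All names here are hypothetical.  */
struct toy_decl {
  int code;                  /* kind of object */
  char code_class;           /* data structure used */
  unsigned uid;
  unsigned built_in_class;
};

/* Number of access records written so far.  */
static unsigned long toy_log_count;

/* Check NODE and, when FIELD is non-NULL, print one access record
   (code, class, field name, source file, source line) to stderr.  */
static struct toy_decl *
toy_decl_check (struct toy_decl *node, const char *field,
                const char *file, int line)
{
  assert (node != NULL);
  if (field != NULL)
    {
      fprintf (stderr, "%d %c %s %s %d\n",
               node->code, node->code_class, field, file, line);
      toy_log_count++;
    }
  return node;
}

/* The second parameter names the field accessed right after the check. */
#define TOY_DECL_CHECK(NODE, FIELD) \
  toy_decl_check ((NODE), (FIELD), __FILE__, __LINE__)

#define TOY_DECL_UID(NODE) \
  (TOY_DECL_CHECK (NODE, "uid")->uid)
#define TOY_DECL_BUILT_IN_CLASS(NODE) \
  (TOY_DECL_CHECK (NODE, "built_in_class")->built_in_class)
```

Because the accessor macros expand to a member access on the checked pointer, they still work on either side of an assignment, so existing code that reads or writes a field through the macro gets logged without further changes.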

One problem I didn't bother to address is the flood of logging records produced when the instrumented compiler is itself used to build portions of the compiler at the end of the build; for now, I just make sure to send the output to /dev/null when building the instrumented compiler. I tried adding similar instrumentation to a 3.3 compiler, but there the node checks before the field accesses were often done with FUNCTION_DECL_CHECK and other macros created by gencheck; these go through TREE_CHECK rather than TREE_CLASS_CHECK, which made the changes a bit more involved.


Here are two examples of the data produced. Each shows the declaration fields accessed when compiling a program that only includes the <Carbon/Carbon.h> header file (which brings in about 100K lines of header files). The first web page shows the results when compiled with gcc, the second with g++. (I'd done these measurements to understand why the C compiler took only one second to compile the headers, but the C++ compiler took two seconds.) The count of accesses to the uid field indicates the number of declarations of each kind, as the field is only accessed during creation of the declaration node.


http://home.earthlink.net/~bowdidge/c-carbon.html
http://home.earthlink.net/~bowdidge/cpp-carbon.html


Robert


