This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: Reorder some tree codes


Giovanni Bajo wrote:
> Nathan Sidwell <nathan@codesourcery.com> wrote:
>
>> This patch reorders the _TYPE codes and a few others to allow
>> range comparison to determine certain kinds of types.  This showed
>> a 0.25% speed improvement on darwin, and a 2% text size reduction in
>> cc1plus.
>
> I do not like this because we lose clarity in the definition of the macros, and thus we make debugging much harder. For instance:

I'm of two minds.


I'm amazed that this change made a measurable difference, especially since it did not decrease the number of memory locations that were accessed. (It may decrease the number of accesses to memory, but one would think that the subsequent accesses would hit in cache.) I suspect that the reason it made a big difference is that we avoided a lot of branches in the compiler, which would explain the 2% text size reduction. So, we probably get better branch prediction, better icache use, and fewer stalls.
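
To make the branch argument concrete, here is a minimal sketch of the two formulations side by side. The enumeration and function names (demo_code, demo_integral_by_equality, demo_integral_by_range) are hypothetical stand-ins, not the real definitions from tree.def or tree.h, and the unsigned subtract-and-compare is an assumption about how a range test is typically written, not a quote from the patch.

```c
#include <assert.h>

/* Hypothetical slice of a tree-code-like enumeration.  The three
   integral codes are deliberately adjacent, which is what the
   reordering in the patch is meant to guarantee.  */
enum demo_code {
  DEMO_VOID_TYPE,
  DEMO_ENUMERAL_TYPE,   /* first integral code */
  DEMO_BOOLEAN_TYPE,
  DEMO_INTEGER_TYPE,    /* last integral code */
  DEMO_REAL_TYPE,
  DEMO_NUM_CODES
};

/* Old style: one equality test per code, so up to three compares
   (and possibly three branches) at every use.  */
static int
demo_integral_by_equality (enum demo_code c)
{
  return (c == DEMO_ENUMERAL_TYPE
          || c == DEMO_BOOLEAN_TYPE
          || c == DEMO_INTEGER_TYPE);
}

/* New style: one unsigned subtract and one compare.  Codes below the
   lower bound wrap around to a huge unsigned value and fail the test.  */
static int
demo_integral_by_range (enum demo_code c)
{
  return ((unsigned) c - (unsigned) DEMO_ENUMERAL_TYPE
          <= (unsigned) DEMO_INTEGER_TYPE - (unsigned) DEMO_ENUMERAL_TYPE);
}
```

The two functions agree only as long as the integral codes stay contiguous in the enumeration, which is exactly the maintainability concern discussed next.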

I'm not really concerned that this makes the macros harder to understand, as I'm not convinced that's true. On the other hand, I am concerned that people might move/insert tree codes and thereby break the macros. Thus, I'm concerned about maintainability, even if I'm not concerned about readability. I'd certainly suggest that the comments in *.def be expanded upon to say not just that order is important, but that there are macros which check a range of tree codes.

One could add an assertion in the start-up code that partially verifies that the new formulation of the macros matches the old formulation. For example:

  #define INTEGRAL_TYPE_CODE_P(CODE) \
     IN_RANGE ((CODE), ENUMERAL_TYPE, INTEGER_TYPE)

  #define INTEGRAL_TYPE_P(TYPE) \
     INTEGRAL_TYPE_CODE_P (TREE_CODE ((TYPE)))

and then:

  gcc_assert (INTEGRAL_TYPE_CODE_P (ENUMERAL_TYPE));
  gcc_assert (INTEGRAL_TYPE_CODE_P (BOOLEAN_TYPE));
  ...

One could even check the negative cases (e.g., that REAL_TYPE was not an INTEGRAL_TYPE_CODE_P) by looping over all other codes. That solves the maintainability problem, at the cost that updating these macros means also updating the assertions. I think that's not so bad.
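
The looping check described above could be sketched roughly as follows. Everything prefixed TOY_ or toy_ is hypothetical: the enumeration is a toy stand-in for the real tree codes, TOY_IN_RANGE is an assumed definition of a range test, and plain assert stands in for gcc_assert.

```c
#include <assert.h>

/* Hypothetical stand-in for the tree code enumeration.  */
enum toy_code {
  TOY_VOID_TYPE,
  TOY_ENUMERAL_TYPE,
  TOY_BOOLEAN_TYPE,
  TOY_INTEGER_TYPE,
  TOY_REAL_TYPE,
  TOY_NUM_CODES
};

/* Assumed range test: true when LO <= V <= HI, using one unsigned
   subtract and one compare.  */
#define TOY_IN_RANGE(V, LO, HI) \
  ((unsigned) (V) - (unsigned) (LO) <= (unsigned) (HI) - (unsigned) (LO))

/* New formulation: a range check over adjacent codes.  */
#define TOY_INTEGRAL_TYPE_CODE_P(CODE) \
  TOY_IN_RANGE ((CODE), TOY_ENUMERAL_TYPE, TOY_INTEGER_TYPE)

/* Old formulation: an explicit list of the codes that qualify.  */
static int
toy_integral_by_list (int code)
{
  return (code == TOY_ENUMERAL_TYPE
          || code == TOY_BOOLEAN_TYPE
          || code == TOY_INTEGER_TYPE);
}

/* Return 1 if the range macro and the explicit list agree for every
   code, covering both the positive and the negative cases; return 0
   at the first disagreement.  */
static int
toy_verify_integral_range (void)
{
  int code;

  for (code = 0; code < TOY_NUM_CODES; code++)
    if (!TOY_INTEGRAL_TYPE_CODE_P (code) != !toy_integral_by_list (code))
      return 0;
  return 1;
}
```

Start-up code would then call something like assert (toy_verify_integral_range ()); once. If someone later inserts a non-integral code between TOY_ENUMERAL_TYPE and TOY_INTEGER_TYPE, the range macro accepts it, the list does not, and the check fails immediately.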

This is clearly the kind of change that (if it's an otherwise good change) is acceptable in Stage 3: small, but with a non-trivial impact on compile-time performance.

I'm sympathetic to the idea that it would be cooler to improve the middle end so that it did this optimization itself, but I don't know how hard that would be. Does anyone have an idea about that? (The documentation issue remains in this case, in that someone might reorder the codes, and then we have a performance degradation, if it's no longer possible to collapse the checks into a range check.)

--
Mark Mitchell
CodeSourcery, LLC
mark@codesourcery.com
(916) 791-8304

