GCC Improvement Projects

This page lists projects related to the re-organization of the code base in accordance with GCC's Architectural Goals. Everyone with wiki access is welcome to add new projects to this page.

Please observe the following conventions:

  • The projects listed here are exclusively geared towards improving GCC's code base.

  • Add new entries into one of the major categories. Feel free to define a new major category if the project does not fit any of the existing ones. If all else fails, use the Miscellaneous category, but please try not to abuse it.

  • Each entry in this page is a link to a separate page dealing with that specific project.
  • Some projects are self-contained enough that they can be described here, but if you find yourself writing more than a few paragraphs or lists, please move the project to a separate page and link it from here.


Transition to C++

New template-based API for vectors

Unification of debugging dumps

Simplify GIMPLE generation

Alternatives to GC

Make GCC more modular

Front Ends

Make "convert" a langhook

  • convert is a legacy magic-name langhook not present in the langhooks structure. Uses in the language-independent compiler should be changed to fold_convert unless they really need language-specific semantics. Once no longer called there, the prototype should move from tree.h to the front ends. It would also be appropriate to eliminate cases of multiple front ends defining the same function and the various cases where a langhook is used but the default is for a particular name (rather than a particular default implementation) to be used for that langhook.
  • Probably much of convert.c should move into c-family code (it does checks for invalid conversions and gives errors for them, which is clearly something that belongs in front ends). Non-C-family front ends defining and using their own "convert" functions may well not need semantics from them that fold_convert lacks. Residual generic conversion logic should give ICEs not errors if asked to do a conversion it doesn't know how to do.

Move FE optimizations to middle-end

  • Various front-end optimizations should move to middle-end code; in general, optimization in the front ends should be kept to a minimum where the same can reasonably be accomplished in language-independent code. Some specific comments on shorten_compare are at http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01308.html.

  • For issues around "extended types that behave much like integer and floating-point types", especially for C++, see PR 43622, and the references therein.

(C++FE) Make access-specifier an enumeration

  • Currently, special tree nodes denote private/public/protected [as well as TREE_PUBLIC etc.]. It would be good to replace them with an enumeration. A side effect of such a change would be reduced memory use. (Suggested by Nathan Sidwell)
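
As a rough illustration of the enumeration idea, the sketch below stores an access specifier in a two-bit field instead of a pointer to a shared tree node. The names `access_kind`, `member_info`, `set_access` and `get_access` are hypothetical and are not taken from the C++ front end.

```cpp
#include <cstdint>

// Hypothetical enumeration; the front end currently uses special tree nodes.
enum class access_kind : std::uint8_t {
  ak_none,
  ak_public,
  ak_protected,
  ak_private
};

// Hypothetical record: the access specifier occupies two bits
// rather than a pointer-sized slot.
struct member_info {
  unsigned access : 2;       // holds an access_kind value
  unsigned other_flags : 6;  // room left for unrelated flags
};

inline void set_access(member_info &m, access_kind k) {
  m.access = static_cast<unsigned>(k);
}

inline access_kind get_access(const member_info &m) {
  return static_cast<access_kind>(m.access);
}
```

Four enumerators fit in two bits, so the field is much smaller than a pointer on either a 32-bit or a 64-bit host.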

(C++FE) Compact tree structures

  • The C++ FE's additional tree structures and extensions lay out neatly on a 32-bit host, but have extraneous alignment padding on a 64-bit host. It would be good to make the layout 64-bit friendly. Making the access-specifier an enum would be a good first step. (Suggested by Nathan Sidwell)
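
The padding problem can be sketched with two hypothetical structures that hold the same fields in a different order. Neither is a real front-end type. The quoted sizes assume a typical host where `unsigned` is 4 bytes and pointers are 4 bytes (32-bit) or 8 bytes (64-bit).

```cpp
#include <cstddef>

// Hypothetical layout that alternates pointers and 32-bit fields.
// With 4-byte pointers nothing is wasted; with 8-byte pointers each
// `unsigned` is followed by 4 bytes of alignment padding.
struct interleaved_node {
  void *type;
  unsigned code;
  void *chain;
  unsigned flags;
};

// Same fields with the pointers grouped first, so the two 32-bit
// fields share a single 8-byte slot on a 64-bit host.
struct grouped_node {
  void *type;
  void *chain;
  unsigned code;
  unsigned flags;
};

// Grouping never makes the structure larger.
static_assert(sizeof(grouped_node) <= sizeof(interleaved_node),
              "grouping pointer-sized fields should not add padding");
```

On a typical LP64 host the interleaved layout takes 32 bytes and the grouped one 24, while on a typical 32-bit host both take 16.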

Middle End

Gimple Front End

Middle End Array Expressions

Make C undefined overflow semantics explicit in the IL


  • TREE_LIST should die. TREE_LIST is the part of static typing of trees most accessible to incremental conversion, although identifiers may also be one of the earlier steps.
  • More generally, TREE_CHAIN should die. Containers should be used instead.
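
The difference between chaining and containers can be sketched as follows. The `node` type is a hypothetical stand-in for a chained tree node, and `std::vector` is used only to keep the example self-contained; it is not a claim about which container GCC would use.

```cpp
#include <vector>

// Hypothetical stand-in for a node linked through an embedded chain pointer.
struct node {
  int value;
  node *chain;  // plays the role of TREE_CHAIN
};

// Chain-style traversal: every element must carry the link itself,
// and an element can sit on only one chain at a time.
inline int sum_chain(const node *head) {
  int total = 0;
  for (const node *n = head; n != nullptr; n = n->chain)
    total += n->value;
  return total;
}

// Container-style traversal: the sequence is owned by the container,
// so elements need no embedded link.
inline int sum_vector(const std::vector<int> &values) {
  int total = 0;
  for (int v : values)
    total += v;
  return total;
}
```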

Compress DECL flags

  • tree-core defines a number of bit flags (DECL_IS_MALLOC, DECL_IS_OPERATOR_NEW, DECL_CONSTRUCTOR, DECL_STATIC_CONSTRUCTOR, etc.) that are mutually exclusive. It would be better to use some kind of enumeration rather than individual flags. (We've run out of bits.) (Suggested by Richard Biener & Nathan Sidwell)
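
A minimal sketch of the idea, assuming the flags really are mutually exclusive as stated above: one small enumerated field replaces several one-bit flags. The names `fn_kind`, `decl_info` and the accessor functions are hypothetical and do not appear in tree-core.

```cpp
// Hypothetical enumeration covering the mutually exclusive cases.
enum class fn_kind : unsigned {
  none,
  malloc_like,
  operator_new,
  constructor,
  static_constructor
};

// Hypothetical declaration record: a 3-bit field encodes up to eight
// kinds, where individual flags would need one bit per kind.
struct decl_info {
  unsigned kind : 3;
};

inline void set_kind(decl_info &d, fn_kind k) {
  d.kind = static_cast<unsigned>(k);
}

// Accessors keep call sites looking like the old flag tests.
inline bool decl_is_malloc(const decl_info &d) {
  return static_cast<fn_kind>(d.kind) == fn_kind::malloc_like;
}

inline bool decl_is_constructor(const decl_info &d) {
  return static_cast<fn_kind>(d.kind) == fn_kind::constructor;
}
```

Keeping predicate-style accessors means existing call sites would not need to know whether the storage is a flag or an enumerator.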

Tuplify gimple operands: types and decls

Replace ad-hoc flexible arrays with VEC()

Stop abusing GCC_VERSION

  • Scattering GCC_VERSION conditionals across the source tree (a few places also have __GNUC__ conditionals) is bad style. It would be better to define inline functions, or macros describing whether a language feature is supported, in one place (with appropriate conditionals in their definitions) and use them everywhere. Suitable places for these definitions include system.h and hwint.h.
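
One way to picture the suggestion: the version test appears once, in the definition of a feature macro, and callers use an inline wrapper with no conditionals of their own. The names `EXAMPLE_HAVE_BUILTIN_POPCOUNT` and `example_popcount` are hypothetical. The version threshold assumes `__builtin_popcount` is available from GCC 3.4 onwards.

```cpp
// Single place where the compiler version is inspected.
#if defined(__GNUC__) && (__GNUC__ * 1000 + __GNUC_MINOR__ >= 3004)
#define EXAMPLE_HAVE_BUILTIN_POPCOUNT 1
#else
#define EXAMPLE_HAVE_BUILTIN_POPCOUNT 0
#endif

// Callers use this wrapper and never test the compiler version themselves.
inline int example_popcount(unsigned x) {
#if EXAMPLE_HAVE_BUILTIN_POPCOUNT
  return __builtin_popcount(x);
#else
  // Portable fallback: clear the lowest set bit until none remain.
  int count = 0;
  while (x != 0) {
    x &= x - 1;
    ++count;
  }
  return count;
#endif
}
```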

Make initial GIMPLE independent of any -f, -m and -O options

Back End

general backend cleanup

Gimple Back End

OpenMP Support

Integer overflow and saturation

  • See PR 48580 for discussion of possible C-source-level interfaces. See a paper of Bik, Girkar, Grey and Tian <http://saluc.engr.uconn.edu/refs/compiler/bik02idioms.pdf> regarding how to detect saturating operations. See <http://gcc.gnu.org/ml/gcc-patches/2011-05/msg00846.html> regarding lowering of fixed-point operations to generic types.

  • -ftrapv (broken), -fwrapv and -fstrict-overflow relate only to source code, GENERIC and GIMPLE semantics, and do not affect RTL which is always modulo. In future such options should also stop affecting GENERIC and GIMPLE semantics (all semantics should go in the IR, not in global option state); see the no-undefined-overflow branch. PR 30484 discusses the question of semantics for division and modulo operations for INT_MIN and -1.
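
As a source-level analogy only (it does not show GCC's IR), the sketch below names the overflow behaviour in the operation itself rather than relying on a global option. The function names `wrap_add` and `sat_add` are hypothetical.

```cpp
#include <cstdint>
#include <limits>

// Modulo (wrapping) addition, computed in unsigned arithmetic so that
// no signed overflow occurs. The final conversion back to a signed
// type is modulo 2^32 in C++20 and implementation-defined before that
// (GCC documents it as modulo).
inline std::int32_t wrap_add(std::int32_t a, std::int32_t b) {
  return static_cast<std::int32_t>(static_cast<std::uint32_t>(a) +
                                   static_cast<std::uint32_t>(b));
}

// Saturating addition: compute in a wider type, then clamp to the
// range of the narrower one.
inline std::int32_t sat_add(std::int32_t a, std::int32_t b) {
  const std::int64_t wide = static_cast<std::int64_t>(a) + b;
  const std::int64_t hi = std::numeric_limits<std::int32_t>::max();
  const std::int64_t lo = std::numeric_limits<std::int32_t>::min();
  if (wide > hi)
    return static_cast<std::int32_t>(hi);
  if (wide < lo)
    return static_cast<std::int32_t>(lo);
  return static_cast<std::int32_t>(wide);
}
```

Because each function states its own semantics, the result of a call does not depend on which overflow-related options the translation unit was compiled with.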

Profiling options

Build System

Top Level Libgcc Migration

Automatic Makefile dependency generation

Toplevel configuration and build system

  • libiberty should not be installed unless specifically requested by configure options.
  • The config-ml.in special handling of particular targets and configure options for them seems ill-conceived, since the right way to configure multilibs is for the relevant configure options to affect the MULTILIB_* settings used when building GCC, not for config-ml.in to have ad hoc code looking at configure options. (Some of this support is deprecated in GCC 4.6; the rest should be reimplemented inside the gcc/ directory.)
  • Toplevel handles unsupported_languages in a suboptimal way. What it should mean is that the languages don't get enabled by default (or by "all" in --enable-languages) but can still be enabled by specifying them manually in --enable-languages - whereas at present it forces the language to be disabled even if the user enables it explicitly.
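
The intended handling of unsupported_languages in the last item can be sketched as a small function. The real logic lives in the toplevel configure scripts, not in C++. The function `resolve_languages` and its parameters are hypothetical and only model the described behaviour.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical model of --enable-languages resolution:
//  - "all" expands to every known language except the unsupported ones;
//  - a language named explicitly is enabled even if it is unsupported.
inline std::set<std::string>
resolve_languages(const std::vector<std::string> &requested,
                  const std::set<std::string> &known,
                  const std::set<std::string> &unsupported) {
  std::set<std::string> enabled;
  for (const std::string &name : requested) {
    if (name == "all") {
      for (const std::string &lang : known)
        if (unsupported.count(lang) == 0)
          enabled.insert(lang);
    } else if (known.count(name) != 0) {
      enabled.insert(name);  // explicit request overrides "unsupported"
    }
  }
  return enabled;
}
```

With known languages {ada, c, c++} and ada marked unsupported, requesting "all" enables only c and c++, while requesting "all" together with "ada" enables all three.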

Macros describing where code in GCC is built

  • There are far too many defines used to condition target code (USED_FOR_TARGET, IN_LIBGCC2, IN_TARGET_LIBS, IN_RTS) in one place or another, plus IN_GCC which "distinguishes between code compiled into GCC itself and other programs built during a bootstrap" according to the makefile comment (but which does nothing of the sort - it's used for other programs such as gcov, and for generator programs, and for target code), plus GENERATOR_FILE which actually has a meaningful use.
  • This set of defines should be cut down - it should be possible to have just USED_FOR_TARGET and GENERATOR_FILE, plus IN_CONFIGURE_TEST or similar to deal with the IN_GCC conditional in system.h.
  • Note that IN_GCC is used in ansidecl.h. Everything in ansidecl.h dealing with compatibility with pre-ISO C should be considered obsolete and removed after removing all uses in the GCC and src repositories; that will allow removing the IN_GCC conditionals. Despite the comments on ansidecl.h claiming to be from the GNU C Library, the glibc copy was removed in 1997 so it's purely a libiberty header now.


Run vectorizer tests multiple times

  • We should work out how to get the various vectorizer testsuites to run multiple times, once with each vector ISA variant that's available on the target architecture (so you'd test SSE; 128-bit AVX; 256-bit AVX; and maybe other variants, each tested with execution testing if there's hardware support and compile testing otherwise), just as the torture testsuites run each test multiple times with different options. This would certainly complicate all the effective-target tests for vectorization support, since the results may depend on the options as well as the target.

Canonicalize test case names

  • The set of testcase names - the things after "PASS: " or "FAIL: " or other statuses - should not depend on the results of the tests.

Implement a unit test framework

Development Tools

Scripts for testing compile time and memory consumption

Patch Tracking


Internal documentation

Compile Time

Speedup areas

Proper GCC Memory Management


Beginner Projects

Finalize Partial Transitions

Bugzilla Stats

None: ImprovementProjects (last edited 2017-11-22 14:18:33 by NathanSidwell)