Summary
- Branch svn://gcc.gnu.org/svn/gcc/branches/lto
Status
What's Currently Broken
- Nothing really works yet. However, we are close to getting C working, at least a single file at a time.
- C++ and Fortran will require more progress on langhook removal as well as additional coverage of the structures being serialized.
How to play
Checkout the lto branch.
Configure with --enable-languages=lto --disable-bootstrap
- To compile test.c into an object file with a serialized function, and then link that object file:
./xgcc -B. -flto -c test.c
./xgcc -B. -O2 -flto test.o -o out
Note that currently, the -flto flag always enables -O2. This will be fixed later.
Issues left to address
If you are interested in working on any of these issues, please add your name to the item you are interested in and send mail to the list.
Fix types_compatible_p langhook hack.
- This langhook allows the front ends to communicate language-specific aliasing information to the middle and back ends. For some languages this is quite useful, for others it is less so. Since LTO is not supposed to know about the language it is compiling, this kind of callback is problematic.
The issue here is that we don't merge struct/union/enumeration types in LTO mode. So LTO abuses types_compatible_p to do a very weak form of structural equality testing. We need to come up with some mechanism where this kind of information is encoded directly into the gimple representation rather than relying on some out-of-line call to provide the information.
- Properly merge record/union/enumeration types when combining files in LTO
If you have struct foo {int bar;} defined/used in two different files and those files are fed into LTO, LTO will have distinct types for the instances in each file. This is Not Good: the types_compatible_p hack is necessary to prevent ICEs, and having distinct types like this may lead to code quality issues.
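As an illustration of what "structural equality testing" means for the struct foo example, here is a minimal sketch in plain C. It does not use GCC's tree representation: the toy_type descriptors, the toy_types_structurally_equal function, and the per-file objects are all hypothetical names invented for this sketch, and it assumes the descriptors are acyclic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified type descriptors; NOT GCC's tree nodes.  */
enum toy_kind { TOY_INT, TOY_RECORD };

struct toy_type;

struct toy_field
{
  const char *name;
  const struct toy_type *type;
};

struct toy_type
{
  enum toy_kind kind;
  const char *tag;                /* Record tag; NULL for TOY_INT.  */
  size_t nfields;                 /* 0 for TOY_INT.  */
  const struct toy_field *fields; /* NULL for TOY_INT.  */
};

/* Each "file" gets its own copy of int and of struct foo {int bar;},
   mimicking the distinct per-file types described above.  */
static const struct toy_type int_in_file1 = { TOY_INT, NULL, 0, NULL };
static const struct toy_field foo_fields1[] = { { "bar", &int_in_file1 } };
static const struct toy_type foo_in_file1
  = { TOY_RECORD, "foo", 1, foo_fields1 };

static const struct toy_type int_in_file2 = { TOY_INT, NULL, 0, NULL };
static const struct toy_field foo_fields2[] = { { "bar", &int_in_file2 } };
static const struct toy_type foo_in_file2
  = { TOY_RECORD, "foo", 1, foo_fields2 };

/* Return true if A and B have the same shape: same kind, same tag, and
   pairwise equal field names and field types.  Assumes the descriptors
   contain no cycles; a real implementation would need to handle them.  */
static bool
toy_types_structurally_equal (const struct toy_type *a,
                              const struct toy_type *b)
{
  assert (a != NULL && b != NULL);
  if (a == b)
    return true;
  if (a->kind != b->kind)
    return false;
  if (a->kind == TOY_INT)
    return true;
  if (strcmp (a->tag, b->tag) != 0 || a->nfields != b->nfields)
    return false;
  for (size_t i = 0; i < a->nfields; i++)
    {
      if (strcmp (a->fields[i].name, b->fields[i].name) != 0)
        return false;
      if (!toy_types_structurally_equal (a->fields[i].type,
                                         b->fields[i].type))
        return false;
    }
  return true;
}
```

Here foo_in_file1 and foo_in_file2 are different objects, so a pointer comparison treats them as different types, while the structural walk treats them as the same. Properly merging the types would make the pointer comparison sufficient.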
Fix use of LANG_HOOKS_NAME in lto-lang.c and/or rs6000.c
rs6000.c uses LANG_HOOKS_NAME to determine what to write into its (rs6000-specific) unwind tables on some platforms. Other backends may use LANG_HOOKS_NAME in a similar fashion. There needs to be a plan in place to handle cases like this in a sane manner.
- Make nested functions work properly.
We do the obvious thing of properly serializing the DECL_CONTEXT for functions. However, the function bodies for the nested functions wind up in sections like .gnu_lto.foo.1729 whereas the LTO reader looks for .gnu_lto.foo and fails. It may be enough to make the DWARF output DECL_ASSEMBLER_NAME for such functions and use that to read in the function data.
Improve debug information for -flto programs
- If you compile a program with -flto, the debugging information is awful.
We're not sure if the quality of the debug information is affected with -flto; some investigation is needed here.
-flto eschews use of .debug_str
-flto turns off .debug_str, instead encoding strings directly into the .debug_info section. This is suboptimal and will impair usability of LTO with large programs.
LTO sticks its fingers all over dwarf2out.c
In some ways, this is unavoidable. But maybe we could refactor dwarf2out.c a bit to make this less ugly. Or redo LTO to not require DWARF. Serializing the global statics and the external variables using the existing gimple code is not more than a few days' work, because we already serialize the local statics. However, serializing the types means adding a lot more cases to the {output|input}_expr_operand to handle all of the hair associated with types.
- Attributes on types and functions are not serialized.
- Fix unused-but-externally visible variable problem
If you have int foo in a file, but you never use it, LTO doesn't do the right thing with it, leading to problems down the road.
- Test on other platforms.
- The current system was developed on x86-64 and generally works. There has been some testing on ppc and x86-32 and nothing on any other platform.
- Address problems with varargs struct/field detection.
tree-stdarg.c doesn't work right with LTO because LTO comes up with some new trees rather than using "standard" GCC ones when dealing with varargs. A patch exists, but needs testing. Contact Nathan Froyd.
- Review and commit Paolo Bonzini's patch for eliminating LANG_HOOK_REDUCE_BIT_FIELD_OPERATIONS
Figure out whether UNSIGNEDP to lto_find_integral_type should be passed to the langhook function.
Find out whether DW_TAG_subrange_type can deal with unsigned types instead of signed.
- Fix debug information and/or reading it in for flexible arrays in structs
- We currently do not have a mechanism for the types that are encoded in DWARF to access the information that is output in gimple.
- Fix warnings from an LTO compilation
- There are some codegen changes relating to alignment of branch targets and prologue/epilogue code selection on x86-64 at least. It would be good to figure out what information we need to save/restore/recreate to make those differences go away.
- Get rid of non file-at-a-time mode.
- Static initializers are not gimplified early enough. This causes an occasional problem when the pre-gimplified form contains language-specific tree codes. (There are assertion checks for this in the gimple writer.)
- There are still places where the front ends may generate some rtl directly. LTO does not serialize rtl. In particular, the C++ FE still has several instances of this.
- Aliasing information needs to be propagated all the way through LTO.
- LTO needs to read/write/merge alias information. We currently say that everything aliases everything else, which is suboptimal.
- LTO only knows as much as the DWARF tells it.
- LTO does the wrong thing in cases like:
struct foo { unsigned int a : 4; unsigned int : 4; unsigned int b : 4; };
which is a GNU extension, because the DWARF does not record the presence of the invisible bitfield, although it does correctly record the offset of b. Either the LTO reader needs to become smarter about such cases or the DWARF needs to be enhanced with information about the missing bitfield.
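To see why the invisible bitfield matters for layout, the following sketch compares struct foo with a hypothetical struct foo_no_gap that omits the unnamed field. The names foo_no_gap, foo_raw_bits_with_b_set and foo_no_gap_raw_bits_with_b_set are made up for this sketch. Bit-field layout is implementation-defined, so the exact raw values are not guaranteed; on x86-64 with GCC one would expect b to land four bits higher in struct foo than in foo_no_gap.

```c
#include <assert.h>
#include <string.h>

/* The struct from the text: an unnamed 4-bit field sits between a and b.  */
struct foo { unsigned int a : 4; unsigned int : 4; unsigned int b : 4; };

/* Hypothetical variant a reader might reconstruct if it only knew about
   the named fields and ignored the recorded offset of b.  */
struct foo_no_gap { unsigned int a : 4; unsigned int b : 4; };

/* Return the raw storage of a struct foo whose only set field is b.  */
static unsigned int
foo_raw_bits_with_b_set (void)
{
  struct foo f;
  unsigned int raw = 0;

  /* The copy below assumes the struct fits in one unsigned int.  */
  assert (sizeof f <= sizeof raw);
  memset (&f, 0, sizeof f);
  f.b = 0xF;
  memcpy (&raw, &f, sizeof f);
  return raw;
}

/* Same as above, for the variant without the unnamed field.  */
static unsigned int
foo_no_gap_raw_bits_with_b_set (void)
{
  struct foo_no_gap f;
  unsigned int raw = 0;

  assert (sizeof f <= sizeof raw);
  memset (&f, 0, sizeof f);
  f.b = 0xF;
  memcpy (&raw, &f, sizeof f);
  return raw;
}
```

If the two functions return different values, then code compiled against a layout without the gap would read and write b at the wrong bit position.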
Tasks
Summary of CALL_EXPR changes
- Arguments are now stored directly as operands, rather than in a list.
Do not use build3 as a constructor for CALL_EXPRs. Instead, use one of these primitives:
tree build_call_list (enum tree_code code, tree return_type, tree fn, tree arglist)
tree build_call_nary (enum tree_code code, tree return_type, tree fn, int nargs, ...)
tree build_call_valist (enum tree_code code, tree return_type, tree fn, int nargs, va_list args)
tree build_call_array (enum tree_code code, tree return_type, tree fn, int nargs, tree* args)
build_function_call_expr is still around, but it's better to use the new variadic version, build_call_expr, wherever possible.
Question: In which order should the arguments be passed to the new build_call_* functions: in reverse, like build_function_call_expr, or in the correct order?
Answer: All the functions, including build_function_call_expr, take the arguments in left-to-right order. There are places in the Java front end that construct CALL_EXPRs with backwards argument lists and later reverse them; those haven't been changed.
- New accessor macros:
CALL_EXPR_FN (exp)
CALL_EXPR_STATIC_CHAIN (exp)
CALL_EXPR_ARG0 (exp) /* etc, up to CALL_EXPR_ARG2 */
CALL_EXPR_ARG (exp, i)
call_expr_nargs (exp) returns the number of arguments.
- Iterator:
call_expr_arg_iterator iter;
tree arg;
FOR_EACH_CALL_EXPR_ARG (arg, iter, exp)
/* ARG is bound to successive arguments of EXP */
...;
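The idea of storing arguments directly as operands, building the node with a variadic constructor, and iterating in left-to-right order can be sketched outside of GCC as follows. This is a toy model only: toy_tree, struct toy_call, toy_build_call_nary, TOY_CALL_ARG and TOY_FOR_EACH_CALL_ARG are hypothetical stand-ins invented here, and they are not the GCC types, functions, or macros listed above.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>

/* Toy stand-in for a tree node: just a name.  NOT GCC's tree type.  */
typedef const char *toy_tree;

/* Toy call node: the arguments live inline in the node itself,
   rather than in a separately allocated linked list.  */
struct toy_call
{
  toy_tree fn;
  int nargs;
  toy_tree args[];   /* Flexible array member holding NARGS operands.  */
};

/* Build a call to FN with NARGS arguments, given in left-to-right order.
   Returns NULL if allocation fails; the caller frees the result.  */
static struct toy_call *
toy_build_call_nary (toy_tree fn, int nargs, ...)
{
  struct toy_call *call;
  va_list ap;

  assert (nargs >= 0);
  call = malloc (sizeof *call + (size_t) nargs * sizeof (toy_tree));
  if (call == NULL)
    return NULL;
  call->fn = fn;
  call->nargs = nargs;
  va_start (ap, nargs);
  for (int i = 0; i < nargs; i++)
    call->args[i] = va_arg (ap, toy_tree);
  va_end (ap);
  return call;
}

/* Indexed accessor for the toy node.  */
#define TOY_CALL_ARG(call, i) ((call)->args[(i)])

/* Bind ARG to successive arguments of CALL, using I as the index.  */
#define TOY_FOR_EACH_CALL_ARG(arg, i, call)                          \
  for ((i) = 0;                                                      \
       (i) < (call)->nargs && ((arg) = (call)->args[(i)], 1);        \
       (i)++)
```

In this model a call such as toy_build_call_nary ("f", 3, "x", "y", "z") needs one allocation for the whole node, and indexed access to any argument is constant-time, whereas a list-based representation allocates one cell per argument and has to walk the list.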
To Do List for Additional Memory Use Reductions
* The C and C++ Parsers should be changed to store arguments in a stack-allocated array instead of building a temporary TREE_LIST as calls are parsed. Make the corresponding change to both the C and C++ versions of build_function_call, and fix all the other callers. The C++ overloading resolution functions that now have an arglist parameter also need to be modified to take an array. (Conceptually simple change, potentially large memory savings since it affects all calls seen by the parser.)
* Fix format argument checking in c-format.c to work directly on the argument array that is now passed in, instead of converting it to a list. (Minor savings, only affects calls to format functions.)
* Get rid of the complexity slot from struct tree_exp. (Potentially large savings since it affects all expression nodes.)
* Share instances of TREE_VEC containing parameter lists. Currently, it's possible that gcc has two instances of TREE_VEC representing the same parameter list. See build_function_type and build_function_type_list. We could start out by avoiding duplicate instances of TREE_VEC identical to void_vec_node.
* Look around in the parsers for other language constructs where temporary TREE_LISTs are being constructed and then discarded. There seems to be a lot of list-building going on while parsing ObjC and C++ constructs, in particular.
* The Java parser needs a lot of work as it is still constructing TREE_LISTs for CALL_EXPRs in many places. Java people are reportedly already doing some unrelated major restructuring of this code so check with them first.