Adding Incremental Linking to Gold
Goal
The goal of incremental linking is to decrease link time by modifying an existing executable or shared library rather than creating one from scratch. This is most effective when only a small number of input files have changed since the output file was first linked.
Incremental linking is only designed to work when each link has the same view of the file system, and file system timestamps are consistent. This will be true on a single machine, and should normally be true when using various different machines with the same NFS mounts. Moving the incrementally linked output file to a machine with a different filesystem view, and then running another incremental link there, will generate correct output but will typically do a full link rather than an incremental one.
Interface
The incremental linking interface is simple. When the option --incremental is passed to the linker, it checks whether the output file already exists. If it does, and if it contains incremental linking information (described below), then the linker will check the timestamp of the output file and of all input files. The linker will then only link in those input files which are newer than the executable. Whether the output file existed beforehand or not, the linker will insert incremental linking information so that future incremental links will succeed.
Thus during development it will be reasonable to always use the --incremental option. This should behave well in all cases, and should significantly reduce link times when only a few input files have changed.
For a release it will normally be desirable to do a full link without the --incremental option. You will not want to release a binary which contains the incremental linking information, which can be sizeable. Also, an incremental link produces output which is inherently non-repeatable. When it is desirable to get exactly the same output for the same inputs, it is necessary to not use --incremental.
The goal is explicitly that whether or not you use --incremental on the link line will not change the way the program behaves at runtime, except for very minor performance differences.
That said, the --incremental option is incompatible with some options: -r, --strip, --emit-relocs, --gc-sections, --icf, --relax, possibly others.
For additional flexibility, the options --incremental-changed, --incremental-unchanged, and --incremental-unknown may be used. Every input file which appears after --incremental-unchanged will be assumed to have not changed since the last incremental link. After --incremental-changed it will be assumed to have changed. After --incremental-unknown (the default) the linker will check the timestamp. This may be used by build systems which have extra knowledge about which files have changed.
Internal implementation
We will implement gold to work efficiently in the normal case, and to fallback to a normal complete link in less common cases. We will fall back to a normal link in these cases:
The command line changes (except for uses of --incremental-changed, --incremental-unchanged, and --incremental-unknown).
- A linker script changes.
- A global symbol which was previously defined in one input file is now defined in a different input file.
Note in particular that adding an input file to the link, or removing one, will cause a fall back.
These restrictions could be sometimes avoided with some additional work. However, they are not the most important case, and the algorithm is considerably simplified if we do not have to consider changes to symbol overriding.
The linker will parse the command line as usual and build a list of input files. It will look for the output file, and extract the list of input files from the last link. If any linker script is specified with the --script option, it will be treated as the first input file.
gold will walk through the input files as usual in the first pass. For each input file, gold will check the timestamp (or rely on the --incremental-unchanged or --incremental-changed option).
If input file is new or output file is dropped:
- Recompute everything.
If timestamps identical:
- If object or shared library:
- Read file's global symbols from output file.
- If archive:
- All members will be unchanged; skip them all.
- If linker script:
- Read linker script from input file.
If timestamps differ:
- If object or shared library:
- If file changed from object to shared library or vice-versa, recompute everything.
- Read symbol table from input file.
- Read file's global symbols from output file.
- For any newly defined symbols:
- If the symbol was previously defined by a later input file, recompute everything.
- For any symbols which used to be defined by this file but no longer are:
- If the symbol was ever referenced, recompute everything.
- COMDAT symbols will require special handling.
- It's OK to stick with the same object file for the definition of a COMDAT symbol as before, even if a newer object file defines it.
- Similarly, it's OK to drop the definition of a COMDAT symbol, as long as some other object file defines it.
- If archive:
- Treat each member as a changed file, or check the timestamps stored in the archive file.
- If linker script:
- Recompute everything.
After this process, we have a list of input files which are changed, and we have a list of global symbols which changed (because they are defined by an input file which changed).
The next gold pass is to scan relocations. gold must scan the relocations of any input object files which changed.
Next is layout. Unchanged input files keep the same layout. For changed files we walk through the input sections. For each section we are keeping, we use the same layout if there is room available. Otherwise we must allocate new space in an appropriate segment.
In the final pass gold will process the relocations for every file which changed, and for every file which refers to a symbol which changed. For speed of handling changed symbols, for each global symbol, we record a list of relocations which must be recalculated.
Handling debugging information which changes size may require modifications to gdb to support non-contiguous debugging information. (Or maybe we will just always put the debugging information at the end of the file).
To implement this algorithm, we need the following information:
- A list of all input files with timestamps.
- For each input object file or shared library:
- A list of global symbols.
- For each input object file:
- A list of where input sections are in the output file, and how large they are.
- For each global symbol:
- A list of relocations to apply if the value changes.
- For each GOT entry:
- The type of GOT entry.
- The symbol it refers to (if global), or the input file containing the symbol it refers to (if local).
- For each PLT entry:
- The symbol it refers to.
- For each segment:
- Available space.
To find free space in each segment, the incremental linker will look for gaps between input sections within segments. The incremental linker will normally allocate some extra space for each input section when possible, in order to easily accomodate small increases in section size. When the incremental linker first creates an output file it will also allocate some amount of space in such a section at the end of each segment. The incremental linker will create new segments as needed to get more space; it may be necessary to disable this feature on certain operating systems, in which case it will fallback to a full relink.
In general the incremental linker should always compute essentially the same .dynamic, .dynsym, and .dynstr sections.
Data storage
.gnu_incremental_inputs
The section .gnu_incremental_inputs will list the input files used for the link. It will have section type SHT_GNU_INCREMENTAL_INPUTS. The SHF_ALLOC flag will be clear. The sh_link field will refer to a section .gnu_incremental_strtab which will have type SHT_STRTAB and will be a normal string table, again with SHF_ALLOC clear.
Header:
- 4 byte version number
- 4 byte input file count
- In the order in which the files are included in the link.
If --script is used, the first input file is the script.
4 byte offset of command line options in .gnu_incremental_strtab
- 4 byte padding
For each input file:
4 byte offset of file name in .gnu_incremental_strtab section
- This is the file name which the linker opens.
- 4 byte offset of data in this section
- offset from start of section to data for this file.
- 8 byte modification time seconds
The value of the st_mtime field of a stat of the file (zero extended to eight bytes if necessary).
- 4 byte modification time nanoseconds
- If available.
- 2 byte type of file
- object, archive library member, shared library, archive, or linker script.
- 2 byte padding
The data found at the offset in this section:
For a linker script:
- Nothing additional. Offset may be zero.
For an object file (or archive member):
- 4 byte count of input sections
- 4 byte count of global symbols
- For each input section:
4 byte offset to section name in .gnu_incremental_strtab section
- 4 byte output section index
- If input section is omitted, this is zero.
- 4 or 8 byte offset of input section in output section
- 4 or 8 byte input section size
- For each global symbol:
- 4 byte index into output global symbol table
- 4 byte offset to next linked list entry
- Linked list starts from global symbol list.
- Offset is to entry in input file list.
- 4 byte count of relocations for this file
4 byte offset to relocations in .gnu_incremental_relocs section
When reading the incremental information we will create a list of files for each symbol. The start of each list is pointed by a hash table indexed by the symbol name. When reading the information for an object file or a shared library the name of each symbol can be found in the global symbol table. When reading the information about the unused members of an archive the name is found in .gnu_incremental_strtab.
For an archive:
- 4 byte count of members
- 4 byte count of unused symbols
- For each member:
- 4 byte offset of member in input file list
- For each symbol:
4 byte offset to symbol name in .gnu_incremental_strtab section
For a shared library:
- 4 byte count of global symbols
- For each symbol:
- 4 byte index into output global symbol table
.gnu_incremental_symtab
The section .gnu_incremental_symtab will list the global symbols. It will have section type SHT_GNU_INCREMENTAL_SYMTAB. The SHF_ALLOC flag flag will be clear. The sh_link field will point to the .gnu_incremental_inputs section. The contents will exactly parallel the global symbols in the output file symbol table.
For each global symbol:
4 byte offset in .gnu_incremental_inputs section
- This is the start of a linked list of object files.
- Each object file which mentions the symbol will be on the list.
.gnu_incremental_relocs
The section .gnu_incremental_relocs will list relocations. It will have section type SHT_GNU_INCREMENTAL_RELOCS. The SHF_ALLOC flag will be clear. The sh_link field will point to the .gnu_incremental_inputs section. The contents will be largely processor dependent. Each object which refers to a global symbol will have a relocation count and a relocation offset into this section, as described above. Typical section contents will depend on whether this is a 32-bit or a 64-bit target.
For each relocation entry:
- 4 byte relocation type
- 4 byte output section index
- 4 or 8 byte offset in output section
- 4 or 8 byte addend
The typical relocation computation for a REL relocation will be to first write the addend field into the output section contents, replacing the current value, and then run the usual relocation routine. The typical relocation computation for a RELA relocation will be unchanged. However, specific processors may adopt different implementations where appropriate. The machine independent part of the linker only needs to know the count and size of the relocation information.
.gnu_incremental_got_plt
The section .gnu_incremental_got_plt will list the GOT and PLT entries. It will have section type SHT_GNU_INCREMENTAL_GOT_PLT. The SHF_ALLOC flag will be clear. The sh_link field will point to the .gnu_incremental_inputs section.
- 4 byte count of GOT entries
- 4 byte count of PLT entries
- For each GOT entry:
- 1 byte GOT entry type (target-dependent).
- The high-order bit (0x80) is set for a local symbol.
- The value 0xff indicates a reserved entry (e.g., the second of a pair).
- 1 byte GOT entry type (target-dependent).
- Padding to a 4-byte boundary
- For each GOT entry:
- 4 byte value:
offset to an input file entry in .gnu_incremental_inputs section (for a local symbol)
- index into output global symbol table (for a global symbol)
- 0 for a reserved entry
- 4 byte value:
- For each PLT entry:
- 4 byte index into output global symbol table
Interaction with concurrent linking
In concurrent linking the linker does not assume that input files are available until notified by an external process. An incremental concurrent link will only affect the first pass, in which the linker determines what has changed. The linker will not complete the first pass until all input files are available.
Link Glossary
see LinkGlossary