Section Anchor Optimisations
- This optimisation will allow GCC to access more than one object from the same symbolic address. For example, suppose a section contains two variables x and y, and that x and y are close together. The optimisation will create a common anchor point -- let's call it A -- and allow both x and y to be accessed from A. The optimisation will of course be subject to the usual binding rules and will need to be aware of special cases like mergeable constants.
At the moment, GCC makes no assumptions about the relative positions of static variables and constants, writing them out in a more-or-less arbitrary order. A major part of the project will therefore be to assign specific positions to objects and to write them out appropriately.
The new infrastructure will also allow GCC to reorder objects within a section. Such a reordering might try to reduce the number of anchors or improve cache locality. However, this aspect of the project will be open-ended and isn't going to be part of the initial patch.
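As a rough illustration of the idea (this is not GCC code), the sketch below models a section as a byte array and reads x and y through a single base address A. Every name in it (demo_section_bytes, demo_load, demo_sum_x_y and so on) is hypothetical, and the layout assumes that x and y are adjacent ints.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a data section that holds two ints, x and y.  */
static unsigned char demo_section_bytes[2 * sizeof (int)];

/* Assumed layout: x at the start of the section, y directly after it.  */
#define DEMO_X_OFFSET 0
#define DEMO_Y_OFFSET ((long) sizeof (int))

/* Store VALUE at displacement DISP from ANCHOR.  */
static void
demo_store (unsigned char *anchor, long disp, int value)
{
  assert (disp >= 0);
  memcpy (anchor + disp, &value, sizeof value);
}

/* Load an int from displacement DISP from ANCHOR.  */
static int
demo_load (const unsigned char *anchor, long disp)
{
  int value;

  assert (disp >= 0);
  memcpy (&value, anchor + disp, sizeof value);
  return value;
}

/* Read both variables through one anchor address.  Only the anchor
   needs a symbolic address; x and y are reached by constant offsets.  */
static int
demo_sum_x_y (void)
{
  unsigned char *anchor = demo_section_bytes;   /* the anchor point A */

  return demo_load (anchor, DEMO_X_OFFSET) + demo_load (anchor, DEMO_Y_OFFSET);
}
```

In position-independent code the anchor would typically cost one GOT entry and one GOT load, where separate symbolic accesses to x and y would cost two of each.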
Personnel
Richard Sandiford and others from CodeSourcery
Delivery Date
- This optimisation should be ready by the end of stage 1.
Benefits
- The benefits of the optimisation should be twofold:
- It will reduce the number of GOT entries and GOT accesses.
- In some cases, it will help to reduce code size.
Dependencies
- None.
Modifications Required
- An initial outline of the design is given below.
Data Structures
- Sections will be represented by a new structure:
struct section {
  /* The name of the section. */
  const char *name;
  /* The set of SECTION_* flags that apply to this section. */
  unsigned int flags;
  /* The minimum alignment of the first byte in the section,
     measured in bits. */
  HOST_WIDE_INT alignment;
  /* The objects contained in this section, sorted by offset from the
     start of the section. */
  VEC(section_object) *objects;
  /* The anchors associated with this section, sorted by offset from
     the start of the section. */
  VEC(section_object) *anchors;
};
typedef struct section section;

Section objects will be an extension of a SYMBOL_REF rtx:
struct section_object {
  /* The section that contains this object. */
  section *section;
  /* The offset of this object from the start of the section. */
  HOST_WIDE_INT offset;
  /* The object's SYMBOL_REF. */
  struct rtx_def symbol;
};

An object will only have an associated section_object if we need to know the object's position within its section. This is never true for unoptimised or -fno-unit-at-a-time compilation, so gcc will never use section_object structures in those modes. See also the notes on -fno-unit-at-a-time below.
- There will be a new SYMBOL_REF flag called =SYMBOL_FLAG_SECTION= to mark symbols that are contained in a section_object structure.
- Each anchor will be represented as a section_object.
- A new SYMBOL_REF_SECTION macro will return a pointer to the symbol's section. It will return NULL for symbols without the new SYMBOL_FLAG_SECTION flag.
- A new SYMBOL_REF_OFFSET macro will return the offset of the symbol from the start of its section. It will return -1 for symbols without the new SYMBOL_FLAG_SECTION flag.
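The sketch below shows one way the two accessors could behave, using simplified stand-in types rather than GCC's real rtx structures. The demo_ names and the flag value are hypothetical. It assumes that, because the symbol is embedded in its section_object, the enclosing object can be recovered from a symbol pointer with offsetof.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bit standing in for SYMBOL_FLAG_SECTION.  */
#define DEMO_SYMBOL_FLAG_SECTION 0x1u

/* Simplified stand-ins for a SYMBOL_REF rtx and a section.  */
struct demo_symbol
{
  unsigned int flags;
  const char *name;
};

struct demo_section
{
  const char *name;
};

/* Stand-in for section_object; the symbol is kept as the last field,
   mirroring the layout in the proposal.  */
struct demo_section_object
{
  struct demo_section *section;
  long offset;
  struct demo_symbol symbol;
};

/* Return the object that embeds SYM, or NULL if SYM is not flagged as
   living inside a section object.  */
static struct demo_section_object *
demo_symbol_object (struct demo_symbol *sym)
{
  assert (sym != NULL);
  if ((sym->flags & DEMO_SYMBOL_FLAG_SECTION) == 0)
    return NULL;
  return (struct demo_section_object *)
    ((char *) sym - offsetof (struct demo_section_object, symbol));
}

/* Analogue of SYMBOL_REF_SECTION: NULL for unflagged symbols.  */
static struct demo_section *
demo_symbol_ref_section (struct demo_symbol *sym)
{
  struct demo_section_object *obj = demo_symbol_object (sym);

  return obj != NULL ? obj->section : NULL;
}

/* Analogue of SYMBOL_REF_OFFSET: -1 for unflagged symbols.  */
static long
demo_symbol_ref_offset (struct demo_symbol *sym)
{
  struct demo_section_object *obj = demo_symbol_object (sym);

  return obj != NULL ? obj->offset : -1;
}
```

The flag check matters because applying the offsetof adjustment to a symbol that is not embedded in a section object would yield a pointer to unrelated memory.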
Altered Target Hooks
- =section *select_section (tree exp, int reloc, unsigned HOST_WIDE_INT alignment)=
This hook will replace the existing asm.select_section hook and will take the same arguments. It will return the section that should be used for exp (unlike the current hook, which emits assembly code instead). If exp is a decl and =DECL_SECTION_NAME (exp)= is non-null, the hook should return a section with that name and with whatever flags are appropriate.
- =section *select_rtx_section (enum machine_mode mode, rtx x, unsigned HOST_WIDE_INT align)=
This hook will replace asm.select_rtx_section in the same way.
Removed Target Hooks
The section_type_flags hook will be removed, as the information it currently provides will now be available from the sections returned by select_section. Doing this satisfies the following target.h fixme:
/* ??? Should be merged with SELECT_SECTION and UNIQUE_SECTION. */
New Variables
- =section *text_section=
- =section *unlikely_text_section=
- =section *data_section=
- =section *readonly_data_section=
- =section *bss_section=
- These variables will replace the functions of the same name and will provide section objects for commonly-used sections. In keeping with existing gcc practice, these sections will be distinct from normal named sections and their names will therefore be null.
Note that we need to keep these sections separate from named sections in order to honor target-specific oddities. For example, some targets have no named sections at all, and others (like SH) sometimes put their code into sections other than .text. Such variations are currently handled by having a separate target macro for each well-known section.
After making this change, implementations of the =select_section= and select_rtx_section hooks will be able to return pointers like text_section in cases where they would previously call functions of the same name.
- =section *in_section=
This variable will replace the static varasm.c variable of the same name and will be made globally visible. It will point to the current section, or null if the current section isn't known.
Removed Functions
- =int in_text_section (void)=
- =int in_unlikely_text_section (void)=
- =int in_data_section (void)=
- These functions will no longer be needed. Code can simply compare in_section against the new section * variables.
New Target Hooks
- =HOST_WIDE_INT min_anchor_offset=
- =HOST_WIDE_INT max_anchor_offset=
- The smallest and largest byte offsets that can be applied to an anchor symbol. The defaults will be 0, which will disable the anchor optimisations altogether.
- =bool use_anchor (tree decl)=
This hook will return true if decl can be accessed using anchors. The default implementation will apply basic target-independent rules and will be enough for most backends. However, some targets may require tighter checks, such as to support target-specific attributes.
New Functions
- =section *get_section (const char *name, unsigned int flags, HOST_WIDE_INT alignment, tree decl)=
- Return a section with the given name and flags. The section will be aligned to at least alignment bits. Null names are allowed and will cause the function to create a new, unnamed section. If the name is nonnull, and no existing section has that name, the function will create and return a new section. When there is an existing section with the same name, the function will return it after performing the following checks:
  - If decl is nonnull and has a section attribute, the function will check that flags matches the existing section's flags. It will report an error against decl if not.
  - If decl is null or does not have a section attribute, the function will abort when flags is different from the existing flags.
  - The function will increase the section's alignment to alignment if the current alignment is smaller than that.
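The look-up rules above can be sketched with a simple linked list of sections. This is only an illustration under stated assumptions: the demo_ names are hypothetical, the decl argument is left out, and a flags mismatch is reported by returning NULL rather than by issuing an error or aborting.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, cut-down version of the proposed section structure.  */
struct demo_section
{
  const char *name;            /* NULL for an unnamed section */
  unsigned int flags;
  long alignment;              /* in bits */
  struct demo_section *next;
};

/* All sections created so far.  */
static struct demo_section *demo_sections;

/* Return the section called NAME, creating it if necessary.  A null
   NAME always creates a fresh unnamed section.  Return NULL if an
   existing section has different flags or if allocation fails.
   NAME is stored by pointer, so it must outlive the section.  */
static struct demo_section *
demo_get_section (const char *name, unsigned int flags, long alignment)
{
  struct demo_section *s;

  assert (alignment >= 0);
  if (name != NULL)
    for (s = demo_sections; s != NULL; s = s->next)
      if (s->name != NULL && strcmp (s->name, name) == 0)
        {
          if (s->flags != flags)
            return NULL;
          /* Only ever raise the alignment of an existing section.  */
          if (s->alignment < alignment)
            s->alignment = alignment;
          return s;
        }

  s = malloc (sizeof *s);
  if (s == NULL)
    return NULL;
  s->name = name;
  s->flags = flags;
  s->alignment = alignment;
  s->next = demo_sections;
  demo_sections = s;
  return s;
}
```

A linear list keeps the sketch short. An implementation with many named sections would more likely use a hash table keyed on the name.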
- =void switch_to_section (section *sect)=
Emit code to switch to section sect if the current section is different. Record the new section in in_section.
sect can be one of the special sections listed above. In this case, the function will use the assembly directive normally associated with that section. For example, if sect is text_section, the function will use the =TEXT_SECTION_ASM_OP= target macro to switch sections.
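A minimal sketch of this behaviour, using hypothetical demo_ names and a cut-down section structure, might look as follows. It assumes that special sections carry their own directive string (the role played by macros such as TEXT_SECTION_ASM_OP) and that named sections are selected with a GNU-assembler-style .section directive.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, cut-down version of the proposed section structure.  */
struct demo_section
{
  const char *name;     /* NULL for special sections such as text */
  const char *asm_op;   /* directive for a special section, else NULL */
};

/* Stand-in for in_section: the current section, or NULL if unknown.  */
static const struct demo_section *demo_in_section;

/* Switch to SECT, writing a directive to OUT only when the section
   actually changes.  Return 1 if a directive was written, else 0.  */
static int
demo_switch_to_section (FILE *out, const struct demo_section *sect)
{
  assert (sect != NULL);
  if (demo_in_section == sect)
    return 0;
  if (sect->name == NULL)
    fprintf (out, "%s\n", sect->asm_op);
  else
    fprintf (out, "\t.section\t%s\n", sect->name);
  demo_in_section = sect;
  return 1;
}
```

Comparing section pointers is enough here because each section is represented by exactly one structure.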
- =rtx section_anchor (section *sect, HOST_WIDE_INT offset)=
Return an anchor that can be used to reach byte offset from the start of section sect. Since the displacement from the anchor to the target must lie between min_anchor_offset and max_anchor_offset, the offset of the anchor will be no smaller than offset - max_anchor_offset and no larger than offset - min_anchor_offset. If no existing anchors are suitable, the function will create a new one and add it to the section's anchors field.
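The search for a suitable anchor can be sketched as below. An anchor at section offset a reaches a target offset exactly when min_anchor_offset <= offset - a <= max_anchor_offset. The demo_ names and the fixed-size array are hypothetical, and the position chosen for a newly created anchor (the target sits at the lowest reachable displacement) is an assumption of this sketch, not something the design prescribes.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_ANCHORS 16

/* Hypothetical stand-in for the anchor list of one section.  */
struct demo_anchor_set
{
  long min_anchor_offset;           /* smallest usable displacement */
  long max_anchor_offset;           /* largest usable displacement */
  long anchors[DEMO_MAX_ANCHORS];   /* anchor offsets, ascending */
  size_t count;
};

/* Return nonzero if an anchor at section offset ANCHOR can reach
   section offset OFFSET.  */
static int
demo_anchor_reaches (const struct demo_anchor_set *set, long anchor,
                     long offset)
{
  long disp = offset - anchor;

  return disp >= set->min_anchor_offset && disp <= set->max_anchor_offset;
}

/* Return the section offset of an anchor that reaches OFFSET, creating
   a new anchor if no existing one is suitable.  */
static long
demo_section_anchor (struct demo_anchor_set *set, long offset)
{
  long anchor;
  size_t i;

  for (i = 0; i < set->count; i++)
    if (demo_anchor_reaches (set, set->anchors[i], offset))
      return set->anchors[i];

  /* Assumed placement policy: put OFFSET at the bottom of the new
     anchor's reachable window.  */
  anchor = offset - set->min_anchor_offset;
  assert (set->count < DEMO_MAX_ANCHORS);

  /* Insert the new anchor while keeping the array sorted.  */
  i = set->count++;
  while (i > 0 && set->anchors[i - 1] > anchor)
    {
      set->anchors[i] = set->anchors[i - 1];
      i--;
    }
  set->anchors[i] = anchor;
  return anchor;
}
```

With min_anchor_offset and max_anchor_offset both 0, every distinct offset gets its own anchor, which matches the intent that the default values disable any sharing.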
- =section *decl_section_object (tree decl)=
Do nothing if decl already has a section_object associated with it. Otherwise use the select_section target hook to select a section and place the decl at the end of that section. Update the section's objects field accordingly.
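Placing a decl at the end of a section amounts to rounding the current section size up to the decl's alignment and then growing the section. The sketch below shows that arithmetic with hypothetical demo_ names. It measures alignment in bytes for simplicity, whereas the section structure in this proposal measures alignment in bits.

```c
#include <assert.h>

/* Hypothetical stand-in for the layout state of one section.  */
struct demo_layout
{
  long size;   /* bytes already assigned in the section */
};

/* Place an object of SIZE bytes with ALIGN-byte alignment at the end
   of LAYOUT and return the offset assigned to it.  */
static long
demo_place_at_end (struct demo_layout *layout, long size, long align)
{
  long offset;

  assert (size >= 0 && align > 0);
  /* Round the current end of the section up to a multiple of ALIGN.  */
  offset = (layout->size + align - 1) / align * align;
  layout->size = offset + size;
  return offset;
}
```

The gap between the old section size and the returned offset is the padding that has to be emitted before the object when the section is written out.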
- =rtx anchored_decl_rtl (tree decl)=
If decl can be safely accessed using an anchor, return an rtx for a MEM that uses such an anchor; otherwise behave like DECL_RTL.
If this function has not seen decl before, it will use the new use_anchor hook to see whether anchored accesses are allowed. If they are, it will use =decl_section_object= to get the decl's section object and use =section_anchor= to get an appropriate anchor for it. When using anchors, the function will force the anchor's address into a register and return a MEM whose address has the form plus_constant (anchor_reg, offset).
- =void cgraph_optimize_section_layout (void)=
- This function acts as an umbrella for section layout optimisations. It is called before any function or variable has been written. For example, if this function wants to rearrange anchored variables, it would first call decl_section_object for every queued variable for which the new use_anchor hook returns true. It could then rearrange the section contents as it sees fit.
Use of this function will be optional. It will still be possible to apply anchor optimisations when the function isn't called.
Other Changes
- cgraph_optimize will call cgraph_optimize_section_layout before generating code for functions or variables. The call will be under the control of a command-line flag.
- The tree->rtl expansion code ('expand_expr' and its subroutines) will use anchored_decl_rtl instead of DECL_RTL. Back-ends will therefore see anchored addresses where appropriate.
- force_const_mem will use select_rtx_section to assign a section to the constant. Where possible, it will return an anchored address, in much the same way as the =anchored_decl_rtl= function described above.
- Functions like assemble_variable and =output_constant_def= will not output any code for trees that have a =section_object= associated with them. Note that this change has no effect when gcc is not creating any section_object structures at all. As mentioned above, this includes non-optimised and -fno-unit-at-a-time compilation.
- At the end of compilation, gcc will process each section that has at least one section_object associated with it. It will first emit asm code to select that section and will then output the following items for each object within the section:
  - A directive for the padding between the object and the previous one.
  - A label for the object's SYMBOL_REF.
  - The contents of the object. In the case of constant and variable objects, the contents can be found using SYMBOL_REF_DECL. In the case of pooled rtx constants, it can be found using the constant pool structures.
It will also output a definition of each anchor symbol.
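The per-section output loop described in the last item can be sketched as follows. This is a simplified model, not the proposed implementation: the demo_ names are hypothetical, the output uses GNU-assembler-style .section and .zero directives, every object's contents are written as zero bytes in place of real initialisers, and anchor definitions are left out.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for one object that has been given a position.  */
struct demo_object
{
  const char *label;   /* assembler name of the object's symbol */
  long offset;         /* byte offset from the start of the section */
  long size;           /* size in bytes */
};

/* Append formatted text to BUF at *USED, updating *USED.  Return 0 on
   success and -1 if the text did not fit in LEN bytes.  */
static int
demo_appendf (char *buf, size_t len, size_t *used, const char *fmt, ...)
{
  va_list ap;
  int written;

  assert (*used < len);
  va_start (ap, fmt);
  written = vsnprintf (buf + *used, len - *used, fmt, ap);
  va_end (ap);
  if (written < 0 || (size_t) written >= len - *used)
    return -1;
  *used += (size_t) written;
  return 0;
}

/* Write SECTION_NAME and its COUNT objects (sorted by offset) to BUF.
   Return the number of characters written, or -1 if BUF is too small.  */
static int
demo_output_section (char *buf, size_t len, const char *section_name,
                     const struct demo_object *objs, size_t count)
{
  size_t used = 0;
  long pos = 0;
  size_t i;

  if (len == 0)
    return -1;
  if (demo_appendf (buf, len, &used, "\t.section\t%s\n", section_name) != 0)
    return -1;
  for (i = 0; i < count; i++)
    {
      long pad = objs[i].offset - pos;

      /* Padding between this object and the previous one.  */
      if (pad > 0
          && demo_appendf (buf, len, &used, "\t.zero\t%ld\n", pad) != 0)
        return -1;
      /* The object's label, then its (placeholder) contents.  */
      if (demo_appendf (buf, len, &used, "%s:\n\t.zero\t%ld\n",
                        objs[i].label, objs[i].size) != 0)
        return -1;
      pos = objs[i].offset + objs[i].size;
    }
  return (int) used;
}
```

For two 4-byte objects x at offset 0 and y at offset 8, the sketch writes the section directive, x and its contents, 4 bytes of padding, and then y and its contents.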
Implementation Notes
- Anchor optimisations will only work in unit-at-a-time mode.
- In non-unit-at-a-time mode, a decl might be redeclared after it has been used, and with information that would cause it to be put in a separate section. For example:
int x;
int f (void) { return x; }
int x __attribute__ ((section (".foo")));
- Non-unit-at-a-time compilation could not use anchor optimisations for the access to x in f.
- Layout optimisations in cgraph_optimize will be incompatible with cgraph_varpool_remove_unreferenced_decls, which tries to remove variables after functions have been written.
- Adding anchor optimisations to force_const_mem will be incompatible with gcc's attempts to avoid unused rtx constants. However, this restriction shouldn't matter much in practice, since most unused constants should now be removed by the tree-level optimisers.
- named_section already creates a structure per section so that it can record the associated flags. The new section structure would subsume this functionality, so having one instance per section should not significantly increase memory consumption.
- The approach described above means that we only create section_object structures for objects that we want to access using anchors. There would be no per-object memory overhead for other objects.
- VECs seemed like a good choice for the section =objects= field because:
  - Objects will initially be added to the end of a section and VECs provide amortised O(1) back insertion.
  - A binary search would provide O(log n) look-up by offset.
  - VECs have very low memory overhead when compared to competing structures like splay trees.
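The O(log n) look-up mentioned above could be a plain binary search over the offset-sorted vector. The sketch below uses a C array in place of a VEC, and the demo_ names are hypothetical. It assumes the objects do not overlap.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a positioned object.  */
struct demo_object
{
  long offset;   /* byte offset from the start of the section */
  long size;     /* size in bytes */
};

/* OBJS holds COUNT non-overlapping objects sorted by offset.  Return
   the index of the object that covers byte POS, or -1 if POS falls in
   padding or outside every object.  */
static long
demo_find_object (const struct demo_object *objs, size_t count, long pos)
{
  size_t lo = 0;
  size_t hi = count;

  assert (objs != NULL || count == 0);
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;

      if (pos < objs[mid].offset)
        hi = mid;
      else if (pos >= objs[mid].offset + objs[mid].size)
        lo = mid + 1;
      else
        return (long) mid;
    }
  return -1;
}
```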
The symbol field of the section_object structure must come
last because struct rtx_def ends with a variable-length array.
- One (rejected) alternative was to keep the existing =text_section= functions and use more verbose names like text_section_info for the new section * variables. text_section would then be a short-hand for switch_to_section (text_section_info). However, it seemed confusing to have two section-switching interfaces, and giving the variables obvious names like text_section should improve code clarity.
- In the long run, it might be better to remove target hooks like
TEXT_SECTION_ASM_OP and treat all sections alike. However, such a change would affect some fairly obscure and difficult-to-test targets and would be something of a diversion from the main focus of this project.
- It might make sense to replace DECL_SECTION_NAME (a tree STRING_CST) with a pointer to a section object. The user-selected section name would then become a parameter to the select_section hook. However, there is no immediate need to do this, and it would not affect the optimisations being considered here.
Order Of Implementation
- The first step would be to introduce the section type and convert gcc's interfaces to use it. This work would include the changes to the select_section and select_rtx_section hooks and everything that depends on them. There would be no need to add the =objects= and anchors fields at this stage.
- Once gcc has been converted to use an ADT for sections, it will be possible to add the anchor optimisations themselves. This step would include the aforementioned changes to =force_const_mem= and expand_expr and all the work that depends on them. gcc would then be able to use anchor optimisations but would not try to reorder data to reduce the number of anchors.
- The next step would be to implement cgraph_optimize_section_layout