See also the thread started with: http://gcc.gnu.org/ml/gcc/2006-05/msg00304.html shorten_branches and, where applicable branch splitting (like in the sh_reorg), should be combined with machine-dependent constant pool placement and small-scale hot/cold partitioning. constant pool placement should take account of branch probabilities. I'll use neg_pool_range / pool_range in the following to describe the ranges for constants as they are used in the arm machine description. Place taken up by constants falls into two categories: - per-constant space, which is easily calculable from each individual constant. That space is known to be needed at most once per constant reference (it might be less if constants are shared) - per-pool space. Some constants need alignments, but the worst cost in any one pool from this requirement should be the maximum alignment minus the minimum size alignment. If a pool can't be placed after an existing barrier, a new one has to be artifically created, i.e. a jump around the pool needs to be insterted. If pools are placed such as to have the minimum number of pools, they must be spaced by at least the sum of the minumum neg_pool_range and the minimum pool_range of constants inside. If other considerations are used for constant pool placement, e.g. cost lengthened branches and their execution frequency, more, smaller, pools can result. The worst case would be to have one pool for every constant reference, but no more than one per basic block plus number of ranges crossed. Some simple rules for minimum pool size can also help to put a more sensible upper bound for pool overhead. I.e. you could say to put additional constants within range into a pool till the overhead is at most, say, 25%, or no more constants are in range. While some improvements to the estimates can be made by observing when the same constant loaded several times in short sucession, I wouldn't expect this to be a significant factor, since cse would reduce the amount of such loads. Thus, for size estimates, we can pretend constants can't be commoned. After doing one insn address calculating pass with optimistic assumptions, i.e. assume all branches are short and no constant pools need to be placed, for each constant, we can specify a maximum range of instructions within which it has to be placed, starting at the address of the reference minus neg_pool_range, and extending up to the address of the reference plus pool_range. Thus, to calculate an upper bound of the size of constant pools within an offset range (e.g. branch site to branch destination), we take the size of all constants that might have been placed up to the larger address, subtract the size of all the constants that are known to have been placed up to the lower address, and add the pool overhead. Assuming we have adopted the 25% rule above, there is no need to keep track of details of alignments and individual constant sizes; all we need is a tally of the constant sizes. We calculate the difference of the tallies, add a quarter (truncated), and calculate a preliminary range; then we take this range, divide it by the pool spacing PS (1) (rounded up), and multiply it by the maximum pool overhead (alignment plus jump); this product is then added to the preliminary range to get the actual upper bound for the range. This calculation can be integrated into the align_fuzz comuptation. (1) PS is calculated here as minimum neg_pool_range plus minimum pool_range minus maximum pool overhead. In order to integrate constants used for branches, each insn has to keep three addresses: the overly optimistic one calculated at the start, which is used to get bounds for the range in which a constant might be placed, the one from the last relaxation pass, which is used for branch offset calculations, and the one that is comupted at some time during the current relaxation pass. The latter two can be kept in arrays that are swapped around to avoid unnecessary copying. Or there might be an array of structs describing each insn, and the struct contains an array with the addresses, and which to use as current and previous changes with each pass. A branch is considered needing a constant according to the offset calculation using the addresses from the previous relaxation pass (that is no constant if the offset can be encoded into the instruction). For the previous relaxation pass (if any), there are two tally entries for each insn: the total constant size known to have been placed so, and the total constant size that might have been placed so far. For the current relaxation pass, there are corresponding entries, but they give the differential (first derivative) of these values; they area zero-initialized at the start of the pass, and at the end consecutevely summed up to give the total tallies which are referenced in the next pass as the two tallies described before. When an insn is encountered that references a constant (that might also be a branch which its offset, AFAICS these are the only ones that are not pass-invariant), the size of the constant is added to the two differential tallies; for the one for values that might have been placed, it is added at the (constant placement) address smaller then the current one by neg_pool_offset, and for the one for constants that are known to be placed, it is added at the address larger than the current one by pool_offset. last-used-pointers to the various (neg_)pool_offsets can be kept to ensure O(n) time complexity (with n being the number of instructions) for each pass. A further compilation time optimization is possible by keeping the differential values from the previous pass, and adjusting them only when a branch changes offset requirements; thus, there is no need to scan every insn in every pass.
Looks related to PR 9702 and maybe PR 9703