Serious code size regression from 3.0.2 to now

Joern Rennecke
Fri Jul 19 09:15:00 GMT 2002

Joern Rennecke wrote:
> @@ -877,6 +880,12 @@ align_fuzz (start, end, known_align_log,
>        fuzz += (-align_addr ^ growth) & (new_align - known_align);
>        known_align = new_align;
>      }
> +  if (max_skip)
> +    {
> +      if (start_align > new_align)
> +       start_align = new_align;
> +      fuzz += start_align - (1 << length_unit_log);
> +    }
>    return fuzz;
>  }

Hmm, its actually more complicated than that.  The above will give us a
valid upper bound for the length, but that is not independent of the
start address.  Hence we have no true relaxation process, and the
branch shortining might not converge.
A lot of the arithmetic that you can do with hard alignments just
doesn't apply to max_skip alignments.  In particular, there is no
upper bound on the number of alignment instructions that are relevant
for a given branch.
So we have a choice of

- use simple data structures and a medium-complexity code algorithm that
  is O(n^2) in time per iteration, giving O(n^3) worst time overall.
- Use a balanced tree of lookup tables for all alignments, each table having
  as many entries as the size of the largest alignment is in unit lengths,
  giving O(n*lg(n)) time per iteration.
- Transform the alignments into either a (possibly smaller) alignment or
  a constant filler length for branch shortening purposes, and use the
  existing algorithm on it.
- Use the non-stable upper bound calculation, and add extra logic so that
  instruction sizes (other than alignments) monotonically increase (when
  starting with minimum size and doing as many iterations as it takes to
  get good optimization) or decrease (when starting with the maximum size).

The first and second solutions can give us optimal solutions, at different,
but substantial costs in code / data and time complexity.

The third and fourth solution give sub-optimal solutions with simpler code
in O(n) time per iteration.  The third can be off by a factor of two, while
the fourth can be off by a the twice the cost of the largest alignment
The choice is further complicated by the fact that we'll want to change
the max_skips depending on the progress of the branch shortening.  That
makes the third solution above rather impractical.
The second solution seems to be the best if all we care about is branch
The fourth seems to be the most pragmatic in getting the most benefit for
reasonable effort.
2430 Aztec West / Almondsbury / BRISTOL / BS32 4AQ
T:+44 1454 462330

More information about the Gcc-bugs mailing list