# Serious code size regression from 3.0.2 to now

Joern Rennecke joern.rennecke@superh.com
Fri Jul 19 09:15:00 GMT 2002

```Joern Rennecke wrote:
> @@ -877,6 +880,12 @@ align_fuzz (start, end, known_align_log,
>        fuzz += (-align_addr ^ growth) & (new_align - known_align);
>        known_align = new_align;
>      }
> +  if (max_skip)
> +    {
> +      if (start_align > new_align)
> +       start_align = new_align;
> +      fuzz += start_align - (1 << length_unit_log);
> +    }
>    return fuzz;
>  }
>
>

Hmm, its actually more complicated than that.  The above will give us a
valid upper bound for the length, but that is not independent of the
start address.  Hence we have no true relaxation process, and the
branch shortining might not converge.
A lot of the arithmetic that you can do with hard alignments just
doesn't apply to max_skip alignments.  In particular, there is no
upper bound on the number of alignment instructions that are relevant
for a given branch.
So we have a choice of

- use simple data structures and a medium-complexity code algorithm that
is O(n^2) in time per iteration, giving O(n^3) worst time overall.
- Use a balanced tree of lookup tables for all alignments, each table having
as many entries as the size of the largest alignment is in unit lengths,
giving O(n*lg(n)) time per iteration.
- Transform the alignments into either a (possibly smaller) alignment or
a constant filler length for branch shortening purposes, and use the
existing algorithm on it.
- Use the non-stable upper bound calculation, and add extra logic so that
instruction sizes (other than alignments) monotonically increase (when
starting with minimum size and doing as many iterations as it takes to
get good optimization) or decrease (when starting with the maximum size).

The first and second solutions can give us optimal solutions, at different,
but substantial costs in code / data and time complexity.

The third and fourth solution give sub-optimal solutions with simpler code
in O(n) time per iteration.  The third can be off by a factor of two, while
the fourth can be off by a the twice the cost of the largest alignment
involved.
The choice is further complicated by the fact that we'll want to change
the max_skips depending on the progress of the branch shortening.  That
makes the third solution above rather impractical.
The second solution seems to be the best if all we care about is branch
shortening.
The fourth seems to be the most pragmatic in getting the most benefit for
reasonable effort.
--
--------------------------
SuperH
2430 Aztec West / Almondsbury / BRISTOL / BS32 4AQ
T:+44 1454 462330

```