[PR rtl-optimization/39077] [4.3/4.4/4.5/4.6 Regression] GCSE-optimization causes enormous binary size increase (~20 times !)

Jeff Law law@redhat.com
Thu Jan 13 05:42:00 GMT 2011


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

So the problem here isn't abnormal edges at all (except that they create
a large, complex CFG).  Attempts to fix this problem by filtering out
expressions on abnormal edges/blocks is ultimately doomed to failure.

What we're running into here is expressions which are evaluated on a
small number of paths, but which have to be inserted on a huge number of
paths to make them fully redundant.

I grabbed an expression at random (Index #999 to make it easy to find).
 It's computed in a whopping 3 blocks, two of which are partially
redundant.  The expression is transparent virtually everywhere (it's
just an address computation).   The expression needs to be inserted on >
100 edges to make 2 of the 3 original evaluations fully redundant.

I saw that scenario play out over and over and over.  Filtering out
based on abnormal edges isn't sufficient as many of the insertions are
bubbling up through the CFG via normal edges.

The test we need to do is build a ratio of the number of locations where
an expression evaluation must be inserted to make a number of original
locations fully redundant (and thus removable).

This is relatively easy to do after the call to pre_edge_lcm.  That
routine gives us a vector of sbitmaps which tell us which expressions
need to be inserted on a given edge and the routine gives us a vector of
sbitmaps which tell us which blocks have evaluations of the expression
which can then be deleted.

So, for each expression, we compute how many insertions are required to
delete some number of original evaluations.  If that ratio gets big, we
remove the insertion & deletion points and go on our merry way.

Right now I've set the limit so that if we need more than 20 insertions
per deletion then the PRE of that particular expression is cancelled.  I
haven't spent any time trying to tune that parameter.  A ratio of 20 is
enough to eliminate the huge bloat from PRE (2000% down to 6%).  An
interested party could certainly tune this further.

It'd be nice if building the counts of insertions/deletions wasn't so
cumbersome, but it is what it is.  pre_edge_lcm can't do it any better
because it computes the appropriate sbitmaps in parallel.

Also note that one could tune this to make PRE -Os safe as well by
making this parameter suitably small for -Os.  Extra credit for suitably
interested parties.

Also note there was a minor buglet where pre_edge_lcm didn't properly
initialize its resulting sbitmaps.  The uninitialized entries didn't
cause problems until we wanted to iterate over the sbitmap using
EXECUTE_IF_SET_IN_SBITMAP.  That bug is fixed by this patch.


Bootstrapped and regression tested on x86_64-unknown-linux-gnu. OK for
mainline?


Jeff

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJNLnsZAAoJEBRtltQi2kC7gc8H/RCzeGK5tPHBQ04VzZQHwOnA
Nz3IFKEqiL+brk8mLjLpTAUE+yLDkBtUno8sxvwm3uD42vaPvNQs+Q2RtI9Gl0V5
5YDYxewK8nYkAu0Zwuk4n41Ca0iyBrQ7CV/rD35WmvzJs/TAyECp+ApeL+xpHj8h
dIrZo/66ZhUpchNlosQGun/zNLNvicqJwnZinBRKHY1booXwJhT2rXPW0O9hSv6i
394oWOtXhijx7DQSjgxEK0vaLJ8aQK63piOWAtVsApQJ684g+VJgmWZprGrNGVNP
lTdflucv4y7j0eWl9ZdMY51FnbpTn0FYr0EU9iNtp4+hx2dBpYZvb8lYw5AcnFQ=
=WNaI
-----END PGP SIGNATURE-----
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: patch
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20110113/86e74447/attachment.ksh>


More information about the Gcc-patches mailing list