This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: Fixes to estimate_num_insns from pretty-ipa branch


Hi,
haydn also finished testing today.  The patch is pretty much SPEC neutral.
There are overall code size savings for -Os and for C++ benchmarks.
There are, however, also two important performance regressions:
tramp3d http://gcc.opensuse.org/c++bench-haydn/tramp3d/
and some of the botan benchmarks
http://gcc.opensuse.org/c++bench-haydn/botan/

The problem is that the inliner now accounts for loads and stores, and on
tramp3d a lot of loads and stores are optimized away only after inlining.
On pretty-ipa this problem is solved within the inliner heuristics by a
predicate function deciding whether a given statement is probably going to
be optimized away.  This predicate returns true for all reads/writes of
objects pointed to by pointers passed to the function (such as "this"
objects) and all reads/writes of non-gimple-temporary function parameters.
This "guesses" that after inlining they will be SRAed away.

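A minimal sketch of such a predicate, written against a simplified,
hypothetical statement model rather than GCC's real GIMPLE data structures.
All type and function names below are made up for illustration; only the
rule itself (references through pointer parameters and to
non-gimple-temporary parameters are guessed to be SRAed away) comes from
the description above.  Treating a statement as removable only when every
memory reference in it qualifies is an additional assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of the base of a memory reference.
   These are NOT GCC's internal types.  */
enum ref_base_kind
{
  BASE_POINTER_PARAM,   /* *p where p is a pointer parameter, e.g. "this" */
  BASE_AGGREGATE_PARAM, /* a parameter that is not a gimple temporary */
  BASE_LOCAL,           /* a local variable of the function */
  BASE_GLOBAL           /* global or otherwise unknown memory */
};

struct mem_ref
{
  enum ref_base_kind base;
};

struct stmt
{
  const struct mem_ref *load;   /* NULL if the statement reads no memory */
  const struct mem_ref *store;  /* NULL if the statement writes no memory */
  int cost;                     /* size estimate of the statement */
};

/* Guess whether REF will be scalarized (SRAed) away after inlining.  */
static bool
ref_likely_sraed_p (const struct mem_ref *ref)
{
  return ref->base == BASE_POINTER_PARAM
         || ref->base == BASE_AGGREGATE_PARAM;
}

/* Return true if S touches memory and every reference in it is guessed
   to disappear after inlining.  */
static bool
likely_eliminated_by_inlining_p (const struct stmt *s)
{
  assert (s != NULL);
  if (s->load == NULL && s->store == NULL)
    return false;
  if (s->load != NULL && !ref_likely_sraed_p (s->load))
    return false;
  if (s->store != NULL && !ref_likely_sraed_p (s->store))
    return false;
  return true;
}

/* Sum the cost of the statements that are expected to survive inlining.  */
static int
estimate_size_after_inlining (const struct stmt *stmts, size_t n)
{
  int size = 0;
  for (size_t i = 0; i < n; i++)
    if (!likely_eliminated_by_inlining_p (&stmts[i]))
      size += stmts[i].cost;
  return size;
}
```

With this model, a load through "this" contributes nothing to the size
estimate, while a store to a global keeps its full cost.
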
Pretty-ipa has more than this in the inliner heuristics.  In particular, it
completely kills the inline metrics in estimate_num_insns and computes
"benefits" based on how much the function's execution time is supposed to
improve by inlining compared to how much the program will grow.

I don't want to track all of this in a single mega patch, so I would suggest:

1) Merge the code size changes and mark the two tests as xfail.
There don't seem to be any performance regressions related to these
problems; we really just change complete unrolling decisions in a way
the testsuite does not expect.
2) I will look into the unroller heuristics.  As I understand it now, the
off-by-one error basically comes from the fact that if the exit condition
is at the top of the loop, the last iteration will be optimized away.  If
it is at the bottom of the loop, it will remain.  So I need to account for
this, as well as write a predicate similar to the one in the inliner to
decide that all IV arithmetic and constant array reads will go away after
full inlining.
3) I will continue merging the pretty-ipa changes that are supposed to
reduce the number of reads/writes seen in early optimizations: inliner
substitution, and we can also reorganize the optimization queue and make
use of alias analysis.
4) Then I will re-tune the inliner heuristics for the new early
optimization queue.

I will also do a bit of proof-of-concept testing on pretty-ipa to see if
the inliner heuristics can be tuned to produce both smaller and faster code
on average now.  They used to be tuned this way, but improvements in
early optimizations drove them to inline more and more, so we now produce
bigger binaries in the C++ testers on pretty-ipa; that should be easy to
reduce again.

Honza

