This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug tree-optimization/38401] TreeSSA-PRE load after store missed optimization
- From: "amylaar at gcc dot gnu dot org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 2 Feb 2009 20:02:38 -0000
- Subject: [Bug tree-optimization/38401] TreeSSA-PRE load after store missed optimization
- References: <bug-38401-16914@http.gcc.gnu.org/bugzilla/>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Comment #24 from amylaar at gcc dot gnu dot org 2009-02-02 20:02 -------
(In reply to comment #22)
> If you post a patch to add the option to enable/disable partial-PRE I will
> happily review and approve it for 4.4.
I experimented using Steven Bosscher's patch as a starting point and
augmenting the test in do_regular_insertion with a speed-based heuristic
to throttle the calls to insert_into_preds_of_block. That was worse than
turning off partial-PRE altogether. Then I also added the heuristic in
do_partial_insertion, which worked better. Then I tried removing the speed
heuristic from do_regular_insertion, and that change had only very tiny,
although overall beneficial, effects.
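For readers less familiar with what insertion into predecessors does, here
is a minimal hand-written C sketch of a partial redundancy being removed.
It is not the test case from this bug and not actual GCC output; the
function names before_pre and after_pre are made up for illustration.

```c
/* Before: a + b is computed on the 'cond' path and again after the join,
   so the second computation is redundant along only one incoming path. */
int before_pre(int a, int b, int cond)
{
    int x = 0;
    if (cond)
        x = a + b;
    return x + (a + b);
}

/* After (transformed by hand): the expression is inserted into the
   predecessor that lacked it, so it is available on every path into
   the join and the computation after the join can reuse the value. */
int after_pre(int a, int b, int cond)
{
    int x = 0;
    int t;
    if (cond) {
        t = a + b;
        x = t;
    } else {
        t = a + b;  /* inserted copy */
    }
    return x + t;
}
```

In this sketch the 'cond' path goes from two additions of a and b to one,
while the other path still performs one. Each such insertion also adds an
instruction to the program text, which is presumably why throttling
insertions with a heuristic can pay off.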
To get meaningful results we had to modify the linking a bit to reduce
instruction cache effects: the most frequently needed libgcc functions were
pulled out early and placed next to the core benchmark objects.
applying heuristic only to partial-partial vs. not applying it at all is...
automotive: 6.55389% faster
consumer: 0.00048% worse
networking: 0.03793% faster
office: 0.07269% worse
telecom: 0.00000% faster
applying heuristic only to partial-partial vs. applying it in general is...
automotive: 0.00674% faster
consumer: 0.00076% worse
networking: 0.01746% faster
office: 0.00440% worse
telecom: 0.00002% worse
Unfortunately, there is still no word from the FSF on what they did with our
Copyright Assignment.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38401