[Bug rtl-optimization/64164] [4.9/5 Regression] one more stack slot used due to one less inlining level
rguenth at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Wed Dec 3 10:03:00 GMT 2014
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64164
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
   Last reconfirmed|2014-12-03 00:00:00         |
          Component|middle-end                  |rtl-optimization
   Target Milestone|---                         |4.9.3
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
The difference is whether extra user-named variables remain at the end,
which in turn changes the SSA coalescing decisions:
stm_load (volatile stm_word_t * addr)
{
- stm_word_t l;
- stm_word_t value;
stm_word_t version;
stm_word_t l;
struct r_entry_t * r;
- stm_word_t now;
...
+ size_t _32;
+ size_t _33;
+ size_t _34;
...
Conflict graph:
+1: 3
+3: 1
After sorting:
Sorted Coalesce list:
+(16610) _30 <-> _33
(651) _10 <-> _30
...
-Coalesce list: (10)_10 & (30)_30 [map: 1, 2] : Success -> 1
+Coalesce list: (30)_30 & (33)_33 [map: 2, 3] : Success -> 2
+Coalesce list: (10)_10 & (30)_30 [map: 1, 2] : Fail due to conflict
So it turns out the different coalescing ends up generating worse code.
It would be interesting to see why we decide that coalescing _30 and _33
is so much more beneficial than coalescing _10 and _30.
Ah, it simply uses EDGE_FREQUENCY... and for some reason we predicted
that _33 & 1 != 0 is taken only 10% of the time.
So ... the theory is that the version is faster on the important path?
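
To illustrate why the ordering matters, here is a minimal sketch of greedy,
cost-ordered coalescing over a conflict graph. It is not GCC's actual
implementation: the names (coalesce_pair, try_coalesce, coalesce_all, and so
on) are hypothetical, and the only inputs taken from the dump above are the
two costs (16610 and 651), the partition map (_10 -> 1, _30 -> 2, _33 -> 3)
and the single conflict between partitions 1 and 3. The assumption is that
candidates are tried from highest cost to lowest and that a merged partition
inherits the conflicts of both halves.

```c
#include <assert.h>
#include <stdlib.h>

#define NPART 4   /* partitions 1..3 are used; index 0 is unused */

/* One coalesce candidate: two partitions plus the cost saved by merging. */
struct coalesce_pair {
  int cost;
  int p1, p2;
  int success;    /* filled in by coalesce_all */
};

static int conflict[NPART][NPART];  /* symmetric conflict matrix */
static int leader[NPART];           /* leader[p] == p for a partition root */

static void reset_partitions(void)
{
  for (int i = 0; i < NPART; i++) {
    leader[i] = i;
    for (int j = 0; j < NPART; j++)
      conflict[i][j] = 0;
  }
}

static void add_conflict(int a, int b)
{
  conflict[a][b] = conflict[b][a] = 1;
}

static int find_leader(int p)
{
  assert(p > 0 && p < NPART);
  while (leader[p] != p)
    p = leader[p];
  return p;
}

/* qsort comparator: highest cost first. */
static int by_cost_desc(const void *x, const void *y)
{
  const struct coalesce_pair *a = x, *b = y;
  return (b->cost > a->cost) - (b->cost < a->cost);
}

/* Merge p2's partition into p1's unless the two conflict.
   Returns 1 on success, 0 on "fail due to conflict". */
static int try_coalesce(int p1, int p2)
{
  int a = find_leader(p1), b = find_leader(p2);
  if (a == b)
    return 1;
  if (conflict[a][b])
    return 0;
  /* The merged partition inherits every conflict of the absorbed one. */
  for (int i = 1; i < NPART; i++)
    if (conflict[b][i])
      add_conflict(a, i);
  leader[b] = a;
  return 1;
}

/* Sort candidates by cost and try them greedily in that order. */
static void coalesce_all(struct coalesce_pair *list, size_t n)
{
  qsort(list, n, sizeof *list, by_cost_desc);
  for (size_t i = 0; i < n; i++)
    list[i].success = try_coalesce(list[i].p1, list[i].p2);
}
```

Fed the two candidates {651, 1, 2} and {16610, 2, 3} with a conflict between
partitions 1 and 3, the sketch merges 2 and 3 first; partition 2 then
inherits the conflict with 1, so the cheaper 1 <-> 2 candidate fails, which
matches the "+" lines of the dump. With only the {651, 1, 2} candidate and no
conflict, that merge succeeds, as in the "-" line.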