This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug ipa/65701] r221530 makes 187.facerec drop with -Ofast -flto
- From: "hubicka at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Thu, 09 Apr 2015 04:08:14 +0000
- Subject: [Bug ipa/65701] r221530 makes 187.facerec drop with -Ofast -flto
- Auto-submitted: auto-generated
- References: <bug-65701-4 at http dot gcc dot gnu dot org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65701
--- Comment #5 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
The profile difference is:
52.31% facerec facerec [.] MAIN__.lto_priv.3
16.68% facerec facerec [.] topcostfct.3487.lto_priv.4
8.28% facerec facerec [.] __gaborroutines_MOD_gabortrafo
7.91% facerec facerec [.] cfftb_
7.20% facerec libgfortran.so.3 [.] _gfortrani_cshift0_r4
2.76% facerec facerec [.] __fft2d_MOD_fft2db
1.54% facerec facerec [.] __graphroutines_MOD_graphsimfct.constprop.0
0.53% facerec libc-2.13.so [.] __memcpy_ssse3
(mainline) versus
59.16% facerec facerec [.] MAIN__.lto_priv.3
10.95% facerec facerec [.] __gaborroutines_MOD_gabortrafo
10.51% facerec facerec [.] cfftb1_
9.33% facerec libgfortran.so.3 [.] _gfortrani_cshift0_r4
3.64% facerec facerec [.] __fft2d_MOD_fft2db
2.07% facerec facerec [.] __graphroutines_MOD_graphsimfct.constprop.0
0.67% facerec libc-2.13.so [.] __memcpy_ssse3
0.57% facerec libgfortran.so.3 [.] _gfortrani_read_radix
0.43% facerec libgcc_s.so.1 [.] __udivti3
0.36% facerec libgfortran.so.3 [.] formatted_transfer
(patch reverted). I wonder if we don't want to inline udivti... I suppose the
problem is that we no longer inline topcostfct, which we do not inline
because...
not inlinable: localmove.constprop/304 -> topcostfct/208, --param large-function-growth limit reached
while the patched tree succeeds:
Inlining topcostfct size 1393.
Called once from localmove.constprop 740 insns.
Accounting size:1132.00, time:12187.80 on predicate:(true)
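As a rough illustration of the kind of check behind the large-function-growth
message above, here is a simplified Python model based on the documented
meaning of --param large-function-insns and --param large-function-growth.
It is not GCC's code: the function name, the sizes, and the default values
(2700 and 100) are assumptions made for the sketch, and the real inliner
takes more into account when computing the limit.

```python
def inlining_allowed(caller_size, size_after_inlining,
                     large_function_insns=2700,   # assumed default
                     large_function_growth=100):  # assumed default, in percent
    """Simplified model of the large-function growth limit (not GCC's code)."""
    # A function that stays at or below large-function-insns after
    # inlining is not constrained by the growth limit at all.
    if size_after_inlining <= large_function_insns:
        return True
    # Above that threshold, growth is capped at large-function-growth
    # percent of the original size (100 means at most 2x the original).
    limit = caller_size + caller_size * large_function_growth // 100
    return size_after_inlining <= limit


# Hypothetical sizes, chosen only to exercise both branches.
print(inlining_allowed(1400, 2900))                             # False
print(inlining_allowed(1400, 2900, large_function_insns=4000))  # True
```

In this model, raising large-function-insns moves the point at which the
growth cap starts to apply, which is why a refused inline can become
acceptable without touching large-function-growth itself.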
Bumping the large-function-insns limit up to 4000 gets the function inlined,
but curiously enough causes further degradation. The profile is now:
66.35% facerec facerec [.] MAIN__.lto_priv.3
8.93% facerec facerec [.] __gaborroutines_MOD_gabortrafo
8.72% facerec facerec [.] cfftb_
7.77% facerec libgfortran.so.3 [.] _gfortrani_cshift0_r4
2.96% facerec facerec [.] __fft2d_MOD_fft2db
1.68% facerec facerec [.] __graphroutines_MOD_graphsimfct.constprop.0
0.55% facerec libc-2.13.so [.] __memcpy_ssse3
0.47% facerec libgfortran.so.3 [.] _gfortrani_read_radix
0.34% facerec libgcc_s.so.1 [.] __udivti3
0.30% facerec libgfortran.so.3 [.] formatted_transfer
0.22% facerec libgfortran.so.3 [.] next_format0
0.22% facerec facerec [.] cfftf_
0.20% facerec libgfortran.so.3 [.] _gfortrani_read_block_form
So basically identical, except that mainline inlines cfftb1_ and the patched
tree inlines cfftb_, which is a wrapper. Perhaps the wrapper heuristics could
be generalized for this.
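
The symbol-level comparison of the first two profiles can be reproduced
mechanically. The following is a minimal Python sketch, assuming perf-report
style lines of the form "percent command object [.] symbol"; the helper names
parse_profile and only_in are made up for the example, and the input is
limited to the top entries of the two listings quoted above.

```python
MAINLINE = """\
52.31% facerec facerec [.] MAIN__.lto_priv.3
16.68% facerec facerec [.] topcostfct.3487.lto_priv.4
8.28% facerec facerec [.] __gaborroutines_MOD_gabortrafo
7.91% facerec facerec [.] cfftb_
7.20% facerec libgfortran.so.3 [.] _gfortrani_cshift0_r4
"""

REVERTED = """\
59.16% facerec facerec [.] MAIN__.lto_priv.3
10.95% facerec facerec [.] __gaborroutines_MOD_gabortrafo
10.51% facerec facerec [.] cfftb1_
9.33% facerec libgfortran.so.3 [.] _gfortrani_cshift0_r4
"""


def parse_profile(text):
    """Map each symbol to its percentage; lines in another shape are skipped."""
    profile = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[0].endswith("%") and fields[3] == "[.]":
            profile[fields[4]] = float(fields[0].rstrip("%"))
    return profile


def only_in(a, b):
    """Symbols that appear in profile a but not in profile b, sorted by name."""
    return sorted(set(a) - set(b))


mainline = parse_profile(MAINLINE)
reverted = parse_profile(REVERTED)

print(only_in(mainline, reverted))  # ['cfftb_', 'topcostfct.3487.lto_priv.4']
print(only_in(reverted, mainline))  # ['cfftb1_']
```

A symbol that shows up in only one profile is one that still exists as an
out-of-line function in that build, so the output points at the same
topcostfct and cfftb_/cfftb1_ differences discussed above.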