Drop frequencies from basic blocks
Markus Trippelsdorf
markus@trippelsdorf.de
Sun Nov 5 09:53:00 GMT 2017
On 2017.11.03 at 16:48 +0100, Jan Hubicka wrote:
> this is updated patch which I have committed after profiledbootstrapping x86-64
Unfortunately, compiling tramp3d-v4.cpp is 6-7% slower after this patch.
This happens with an LTO/PGO bootstrapped gcc using --enable-checking=release.
On X86_64:
Before:
Performance counter stats for 'g++ -w -Ofast tramp3d-v4.cpp' (4 runs):
      25040.360183  task-clock (msec)        #    1.000 CPUs utilized            ( +-  0.25% )
               650  context-switches         #    0.026 K/sec                    ( +- 76.87% )
                 2  cpu-migrations           #    0.000 K/sec                    ( +- 28.87% )
           268,141  page-faults              #    0.011 M/sec                    ( +-  0.01% )
    80,210,085,167  cycles                   #    3.203 GHz                      ( +-  0.26% )  (66.67%)
    21,061,765,388  stalled-cycles-frontend  #   26.26% frontend cycles idle     ( +-  0.37% )  (66.67%)
    24,699,976,439  stalled-cycles-backend   #   30.79% backend cycles idle      ( +-  0.57% )  (66.68%)
    69,167,169,243  instructions             #     0.86 insn per cycle
                                             #     0.36 stalled cycles per insn  ( +-  0.05% )  (66.68%)
    15,230,229,662  branches                 #  608.227 M/sec                    ( +-  0.06% )  (66.68%)
       986,612,296  branch-misses            #    6.48% of all branches          ( +-  0.07% )  (66.68%)

      25.046439011 seconds time elapsed                                          ( +-  0.25% )
After:
Performance counter stats for 'g++ -w -Ofast tramp3d-v4.cpp' (4 runs):
      26710.577065  task-clock (msec)        #    1.000 CPUs utilized            ( +-  0.27% )
               199  context-switches         #    0.007 K/sec                    ( +- 21.12% )
                 2  cpu-migrations           #    0.000 K/sec                    ( +- 14.29% )
           267,676  page-faults              #    0.010 M/sec                    ( +-  0.01% )
    85,561,962,974  cycles                   #    3.203 GHz                      ( +-  0.26% )  (66.66%)
    19,581,827,643  stalled-cycles-frontend  #   22.89% frontend cycles idle     ( +-  0.30% )  (66.66%)
    26,056,535,726  stalled-cycles-backend   #   30.45% backend cycles idle      ( +-  0.65% )  (66.68%)
    77,222,167,966  instructions             #     0.90 insn per cycle
                                             #     0.34 stalled cycles per insn  ( +-  0.04% )  (66.68%)
    17,471,652,187  branches                 #  654.110 M/sec                    ( +-  0.05% )  (66.69%)
     1,082,141,013  branch-misses            #    6.19% of all branches          ( +-  0.04% )  (66.69%)

      26.713823720 seconds time elapsed                                          ( +-  0.27% )
==================================================================================================================
On PPC64le:
Before:
Performance counter stats for 'g++ -w -Ofast tramp3d-v4.cpp' (4 runs):
      24281.894597  task-clock (msec)        #    0.989 CPUs utilized            ( +-  1.85% )
               166  context-switches         #    0.007 K/sec                    ( +-  2.46% )
                 5  cpu-migrations           #    0.000 K/sec                    ( +- 18.03% )
            52,908  page-faults              #    0.002 M/sec                    ( +- 11.61% )
    84,939,354,171  cycles                   #    3.498 GHz                      ( +-  1.82% )  (66.71%)
     4,680,693,343  stalled-cycles-frontend  #    5.51% frontend cycles idle     ( +-  8.75% )  (49.98%)
    46,697,372,688  stalled-cycles-backend   #   54.98% backend cycles idle      ( +-  2.06% )  (50.05%)
    94,990,460,746  instructions             #     1.12 insn per cycle
                                             #     0.49 stalled cycles per insn  ( +-  0.10% )  (66.72%)
    19,562,344,992  branches                 #  805.635 M/sec                    ( +-  0.07% )  (50.06%)
       807,701,262  branch-misses            #    4.13% of all branches          ( +-  0.45% )  (50.05%)

      24.550558669 seconds time elapsed                                          ( +-  1.83% )
After:
Performance counter stats for 'g++ -w -Ofast tramp3d-v4.cpp' (4 runs):
      26383.472582  task-clock (msec)        #    0.995 CPUs utilized            ( +-  1.83% )
               202  context-switches         #    0.008 K/sec                    ( +-  1.68% )
                 5  cpu-migrations           #    0.000 K/sec                    ( +- 14.29% )
            53,114  page-faults              #    0.002 M/sec                    ( +- 17.86% )
    92,099,443,793  cycles                   #    3.491 GHz                      ( +-  0.96% )  (66.68%)
     3,706,147,243  stalled-cycles-frontend  #    4.02% frontend cycles idle     ( +-  8.31% )  (50.00%)
    51,376,299,749  stalled-cycles-backend   #   55.78% backend cycles idle      ( +-  0.83% )  (50.05%)
   105,872,124,981  instructions             #     1.15 insn per cycle
                                             #     0.49 stalled cycles per insn  ( +-  0.05% )  (66.74%)
    22,348,839,937  branches                 #  847.077 M/sec                    ( +-  0.16% )  (50.04%)
       847,288,219  branch-misses            #    3.79% of all branches          ( +-  0.06% )  (50.02%)

      26.511790685 seconds time elapsed                                          ( +-  1.84% )
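
For anyone who wants to cross-check the percentages, the relative changes can be recomputed from the mean values in the perf output above. A minimal Python sketch follows; the STATS layout and the pct_change and report helpers are names made up for this illustration, and only four of the counters are copied over.

```python
# Mean values copied from the perf stat output above, as (before, after).
# The dictionary layout and helper names are illustrative assumptions.
STATS = {
    "x86_64": {
        "seconds elapsed": (25.046439011, 26.713823720),
        "cycles": (80_210_085_167, 85_561_962_974),
        "instructions": (69_167_169_243, 77_222_167_966),
        "branches": (15_230_229_662, 17_471_652_187),
    },
    "ppc64le": {
        "seconds elapsed": (24.550558669, 26.511790685),
        "cycles": (84_939_354_171, 92_099_443_793),
        "instructions": (94_990_460_746, 105_872_124_981),
        "branches": (19_562_344_992, 22_348_839_937),
    },
}


def pct_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


def report(stats=STATS):
    """Return one formatted line per (architecture, counter) pair."""
    lines = []
    for arch, counters in stats.items():
        for name, (before, after) in counters.items():
            change = pct_change(before, after)
            lines.append(f"{arch:8s} {name:16s} {change:+6.2f}%")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

Run as a script, it prints one line per counter. On these figures the elapsed time grows by about 6.7% on x86_64 and about 8% on PPC64le (where the run-to-run spread is close to 2%), while the instruction count rises by more than 11% on both.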
--
Markus