This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [4.0 and mainline] Fix multiplication by constant expansion
>
> I am going to try a 3-address architecture (PPC) and P4 with decomposed leas, but things are somewhat counterintuitive here.
OK, on P4 with the shift cost set to 1 I get 4% better results than with
a shift cost of 4.
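
To illustrate why the shift cost matters so much here, below is a minimal
sketch of a cost comparison between a shift/add sequence and a hardware
multiply. It is a simplified model and not GCC's actual algorithm: the
struct op_costs type, the helper names and all cost values used with them
are hypothetical, and only the naive binary decomposition is counted.

```c
/* Hypothetical per-operation costs; not taken from GCC's cost tables. */
struct op_costs {
    int shift;  /* cost of one shift by a constant */
    int add;    /* cost of one add */
    int mult;   /* cost of one hardware multiply */
};

/* Cost of the naive binary method for x * c: one shift for every set
   bit above bit 0, and one add to join each pair of partial terms. */
static int shift_add_cost(unsigned c, const struct op_costs *k)
{
    int terms = 0, shifts = 0;

    for (unsigned bit = 0; bit < 32; bit++) {
        if (c & (1u << bit)) {
            terms++;
            if (bit != 0)
                shifts++;
        }
    }
    if (terms == 0)
        return 0;   /* x * 0 needs no shift or add in this model */
    return shifts * k->shift + (terms - 1) * k->add;
}

/* Returns 1 when the shift/add sequence is strictly cheaper than a multiply. */
static int prefer_shift_add(unsigned c, const struct op_costs *k)
{
    return shift_add_cost(c, k) < k->mult;
}

/* x * 10 written as the sequence the model counts: (x << 3) + (x << 1). */
static unsigned mul10(unsigned x)
{
    return (x << 3) + (x << 1);
}
```

With a hypothetical multiply cost of 10 and add cost of 1, multiplying by
255 costs 14 in this model when a shift costs 1 and 35 when a shift costs
4, so the chosen shift cost alone can flip the decision for many constants.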
Computing the cost of lea from the decomposed cost - 1 brings a
regression:
0: -33.99 -51.87 -16.96 -2.39 0.79 32.14 0.06 -0.01 -3.47 4.86 -1.00 -0.33 0.84 1.11 -0.39 4.49
16: 0.32 0.15 -0.02 0.06 0.01 -4.85 -2.22 -0.01 0.00 0.01 0.07 -1.39 -0.69 -0.09 0.38 1.40
32: -3.49 0.08 0.14 -0.07 -0.05 -0.95 -0.01 1.02 -0.45 -0.06 0.45 -1.35 -0.85 0.06 -1.55 -3.35
48: 30.83 15.29 0.14 1.49 0.67 0.40 0.39 -2.83 -0.76 -0.63 -1.21 0.28 -0.54 0.80 1.49 -5.69
64: -49.67 1.55 1.52 -0.03 -0.08 -5.09 1.60 -1.07 -1.11 1.21 -2.98 13.95 -0.96 -0.41 0.65 0.03
80: 0.68 0.72 -0.05 -0.35 0.10 -0.15 0.84 -0.08 -0.28 4.99 0.51 -0.13 0.88 0.03 10.53 0.07
96: -0.02 0.28 0.54 0.13 -10.68 0.02 -0.42 -0.29 10.04 -42.40 -9.86 1.62 1.06 -0.53 -0.37 0.77
112: 0.18 1.39 -1.44 1.45 -0.04 -0.60 -6.15 1.29 -0.06 14.09 -0.43 0.60 0.12 70.00 -0.04 1.67
128: -49.03 -37.23 1.32 -1.06 0.53 -1.31 1.47 -1.00 3.81 2.18 -1.87 -0.22 -1.14 0.47 2.17 1.88
144: 1.44 -1.36 0.56 -0.91 -0.01 -6.02 0.10 1.86 0.24 0.07 0.72 -1.83 0.79 -0.20 -2.06 -24.98
160: -0.73 -24.99 -1.32 -1.95 0.06 -1.56 -0.36 2.36 -0.74 -0.75 -0.82 6.00 -3.67 0.22 -0.74 1.60
176: 2.81 -0.05 1.96 10.89 0.63 -0.72 0.37 -2.23 -0.13 6.71 3.46 0.43 -12.53 0.56 -1.76 0.23
192: -0.11 -1.04 0.13 -0.11 -0.27 -1.12 -0.08 0.79 4.88 -1.13 1.04 11.34 1.46 9.99 0.77 1.81
208: -2.04 0.03 19.49 -11.27 -0.63 -30.79 0.55 -0.03 -0.20 -42.56 -3.43 -0.11 17.94 2.61 0.46 0.03
224: 0.16 0.14 0.25 1.96 1.71 -0.81 -1.68 -0.24 -0.36 1.02 -1.15 1.19 0.53 0.17 0.82 -0.80
240: 1.22 -0.90 0.86 -6.30 -1.15 -1.02 0.26 1.23 -0.39 0.99 54.98 1.16 0.77 -0.77 2.82 -36.54
avg:-1.095159
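
For reference, the lea costing tried above can be sketched as follows. An
x86 lea of the form base + index*scale (scale 1, 2, 4 or 8) computes the
same value as a shift followed by an add. Reading "decomposed cost - 1" as
shift cost + add cost - 1 is an assumption about the experiment, and the
function names are hypothetical, not GCC internals.

```c
/* Value computed by an x86 lea of the form base + index*scale.
   The hardware only allows scale = 1, 2, 4 or 8. */
static unsigned lea(unsigned base, unsigned index, unsigned scale)
{
    return base + index * scale;
}

/* The same value computed as the decomposed pair: one shift, one add. */
static unsigned lea_decomposed(unsigned base, unsigned index,
                               unsigned log2_scale)
{
    return base + (index << log2_scale);
}

/* Assumed reading of the costing rule: price lea as its decomposed
   shift + add sequence, minus 1. */
static int lea_cost(int shift_cost, int add_cost)
{
    return shift_cost + add_cost - 1;
}

/* x * 45 as two leas: t = x + x*8 (= 9x), then t + t*4 (= 45x). */
static unsigned mul45(unsigned x)
{
    unsigned t = lea(x, x, 8);
    return lea(t, t, 4);
}
```

Under this reading, a shift cost of 4 with an add cost of 1 prices each lea
at 4, while a shift cost of 1 prices it at 1, so the lea cost moves in
lockstep with the shift cost.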
My latency hack:
0: -1.60 -51.84 -1.28 -2.03 -49.12 23.26 1.08 0.04 -1.62 9.26 -1.07 1.29 0.85 0.83 -28.07 2.84
16: -48.97 -5.47 -0.02 -0.06 0.00 0.01 0.83 -1.83 22.49 -0.06 1.06 -0.73 2.48 -0.65 -13.23 -1.19
32: 92.89 -1.61 -1.12 37.64 -0.14 -0.15 -0.02 1.08 0.11 -1.33 0.51 5.95 0.26 -0.84 1.10 -0.91
48: 34.65 12.72 39.69 -0.70 0.89 1.08 0.33 2.54 11.68 -0.17 0.90 20.76 34.60 -0.17 -14.03 -7.24
64: 0.65 -10.06 0.16 -1.46 -9.26 18.16 42.03 -1.76 -1.15 1.31 -1.85 14.05 -0.03 -0.72 1.08 -1.22
80: 1.80 0.19 0.34 -3.39 0.82 -0.02 -15.56 0.80 0.10 5.10 1.36 0.11 0.16 0.06 24.56 -0.82
96: -0.07 -2.74 0.18 -3.13 26.06 -0.77 -0.30 16.18 0.81 -6.34 0.86 -26.65 0.10 -3.62 -31.88 -4.47
112: -14.16 -3.43 -0.61 -0.21 0.76 0.10 0.15 22.06 32.27 14.11 -0.94 20.01 32.31 39.70 -15.37 2.97
128: -49.07 -0.01 -0.69 -0.94 -0.83 0.57 0.78 -1.86 3.78 2.52 -2.42 -2.42 34.06 33.12 1.61 3.19
144: 3.33 -1.24 1.11 -1.65 -0.39 -1.19 -0.63 17.90 0.72 0.03 -17.99 -0.02 14.16 0.69 0.71 0.04
160: 0.50 0.00 -0.65 0.88 -0.75 0.34 -15.14 2.36 -2.51 -0.05 -0.02 4.05 0.69 -0.06 -0.06 53.05
176: 0.84 -7.23 2.90 1.66 -7.20 -0.85 0.45 122.93 0.00 6.83 5.48 0.25 -0.32 1.47 13.39 1.59
192: -1.13 1.29 -2.80 -0.46 16.38 -0.90 0.02 19.46 30.05 -1.57 2.04 0.99 1.37 10.90 -0.81 -0.90
208: -0.63 1.11 -1.05 -5.64 0.73 -0.11 -20.19 -27.65 0.80 -10.69 5.52 0.02 18.04 2.00 -8.12 -4.75
224: -15.89 -4.09 -7.44 -11.51 1.70 -1.64 -25.97 0.24 1.64 -1.84 -21.04 0.64 21.21 -0.35 17.35 14.40
240: -2.28 14.29 -0.70 -2.14 0.08 -0.37 5.10 22.21 -0.36 -0.05 56.16 22.34 34.48 0.02 -15.47 -1.72
avg:1.631178
Combining latency hack and lea decomposition:
0: -11.11 -51.77 -25.40 -2.76 1.94 26.50 1.17 -1.23 0.11 10.97 -1.91 1.28 -0.27 2.10 -28.16 4.28
16: -43.20 0.17 0.07 0.01 -2.40 -1.28 0.89 0.17 0.08 0.07 1.09 -1.12 3.41 0.46 -14.25 1.31
32: 93.27 -3.80 1.51 36.93 1.08 -1.21 -0.19 -0.16 1.24 -2.87 -9.96 1.64 1.07 -1.15 -0.21 0.31
48: 34.76 14.31 39.59 1.24 -8.64 -8.73 0.31 -3.54 13.96 -0.37 0.42 22.03 31.67 0.84 -15.52 2.93
64: -49.64 -13.87 0.09 0.97 -25.57 17.13 43.25 -0.13 -1.06 1.38 -1.16 0.68 -0.05 -2.56 0.64 -0.55
80: 0.73 -1.66 -10.51 -6.74 0.79 -0.32 -0.23 0.93 0.01 5.95 1.01 -0.08 0.67 0.06 22.84 -1.29
96: -0.98 0.32 0.44 -0.24 26.04 0.11 -0.72 15.73 0.83 -42.73 -13.81 0.01 1.00 -0.76 2.56 -6.10
112: -14.61 -3.39 -0.81 -5.99 0.77 0.12 5.02 21.70 32.38 12.03 -0.53 22.87 32.56 71.79 -17.24 3.14
128: 1.77 -1.47 1.26 -0.98 0.52 -0.40 0.38 -0.98 2.73 2.15 -4.77 -7.48 32.79 24.64 2.52 0.83
144: 3.27 0.08 0.11 -0.88 0.73 -0.53 -0.29 16.03 0.70 -0.27 -0.41 -0.65 14.44 -0.10 1.07 -1.28
160: 1.53 -1.30 0.35 -0.03 -0.02 0.34 -0.59 -0.92 -1.30 -1.60 0.05 7.76 -5.74 -0.45 0.65 53.53
176: 0.00 -0.02 2.52 16.88 0.83 -0.06 0.41 123.52 -13.74 7.65 5.81 0.15 0.81 0.61 15.60 -0.94
192: -0.79 -11.68 0.42 0.00 16.50 -0.87 0.01 17.46 31.16 -0.85 0.73 15.15 1.41 -2.63 -9.24 1.80
208: 0.02 1.05 -40.35 -11.52 0.76 -30.78 0.56 -0.97 -0.38 -42.41 -2.78 0.11 20.90 3.58 -8.11 -5.11
224: -14.38 -5.42 -7.79 -21.75 -5.79 -0.59 -25.35 -0.44 0.04 -1.27 -21.09 0.77 22.41 -0.68 18.18 1.65
240: 2.17 1.13 -0.85 0.13 0.23 -1.05 10.55 22.84 33.66 -0.02 53.66 21.57 32.42 1.02 -13.99 0.00
avg:1.032998
Setting shift cost to 2:
0: -2.92 -34.39 47.90 0.92 1.77 30.93 17.04 -0.02 -49.99 7.44 -1.23 1.25 17.74 0.70 0.03 2.92
16: -48.97 -1.27 -0.05 1.35 -5.20 -49.71 -24.12 1.34 57.34 32.41 2.60 31.52 0.75 -3.38 -54.74 -0.31
32: 85.75 -1.38 22.98 27.19 -1.04 -1.41 1.08 -0.31 1.25 -1.41 -48.69 -20.61 17.21 -29.81 -1.14 -1.00
48: 59.10 14.23 44.14 11.42 -2.28 1.23 42.96 -3.14 -0.80 -0.38 1.22 0.02 0.30 -0.31 1.48 1.35
64: -27.30 -44.37 19.76 8.71 -0.11 15.89 2.26 -1.42 -0.04 -0.08 -0.06 12.90 0.54 -50.26 -9.68 -1.27
80: 1.59 -1.09 0.97 -3.28 -42.68 -0.11 0.61 -0.05 0.01 6.74 -32.45 69.34 3.15 -29.57 5.08 0.04
96: 17.53 -0.83 3.57 10.81 22.62 0.93 34.89 -0.39 0.94 -42.46 -9.93 -0.88 26.55 -0.72 -0.14 0.12
112: 0.07 1.98 -1.06 0.46 0.70 -0.80 -4.71 1.17 -0.17 10.78 2.51 0.65 0.02 69.41 -36.72 2.92
128: -17.46 -13.57 22.69 6.88 0.52 -1.47 1.42 -0.99 3.82 0.14 -2.26 0.08 -1.47 -32.94 1.67 3.28
144: 2.20 0.05 -0.23 -3.27 -0.88 0.31 -5.34 -0.04 1.55 0.03 -1.95 -11.66 -2.05 0.70 -10.21 0.04
160: -0.67 0.04 -1.77 -2.61 -0.53 -1.58 -0.25 -26.37 -42.99 -24.39 -1.37 -8.81 1.82 -50.49 1.33 -23.45
176: 14.65 -0.53 2.97 16.67 -32.82 0.62 0.34 123.94 -13.19 6.79 -29.00 29.61 1.17 -28.78 0.56 0.34
192: 18.03 1.30 1.03 10.92 -3.76 -6.21 34.02 2.20 29.97 -0.01 3.43 17.50 36.28 -1.73 -11.37 0.91
208: 0.86 0.08 19.60 7.02 1.54 -30.76 2.70 0.05 24.62 -42.44 -3.39 0.33 17.55 0.38 -1.99 -0.08
224: -1.39 0.17 0.96 1.46 0.15 -1.23 0.45 -0.18 -7.67 0.11 -0.02 0.19 1.06 0.13 0.41 -1.03
240: 2.15 -1.09 1.66 50.64 0.02 -0.18 -0.81 0.01 0.08 -0.10 54.39 0.29 -0.53 0.97 1.24 0.04
avg:-1.589189
I've verified that the 1.6% speedup is outside the noise (the worst
combination out of 5 runs still shows a 1.2% speedup).
Honza