This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [4.0 and mainline] Fix multiplication by constant expansion


> 
> I am going to try a 3-address architecture (PPC) and P4 with decomposed leas, but things are somewhat counterintuitive here.

OK, on P4 with the shift cost set to 1 I get 4% better results than with a
cost of 4.
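
As a rough illustration of why the shift cost matters here, the following C sketch expands a multiplication by 10 into shifts and an add, and compares a toy cost of that sequence against a multiply instruction. The struct, the function names and all cost values are hypothetical and chosen only for illustration; they are not taken from GCC's cost tables or from the patch under discussion.

```c
#include <stdint.h>

/* x * 10 as two shifts and one add: (x << 3) + (x << 1).
   Unsigned arithmetic, so the result matches x * 10 modulo 2^32. */
static uint32_t mul10_shift_add(uint32_t x)
{
    return (x << 3) + (x << 1);
}

/* Hypothetical per-instruction costs (illustrative, not GCC's tables). */
struct toy_costs {
    int shift;  /* cost of one shift by a constant */
    int add;    /* cost of one add */
    int mult;   /* cost of one multiply instruction */
};

/* Cost of the shift/add sequence above: two shifts plus one add. */
static int cost_shift_add(const struct toy_costs *c)
{
    return 2 * c->shift + c->add;
}

/* Returns 1 when the expanded sequence is cheaper than the multiply. */
static int prefer_expansion(const struct toy_costs *c)
{
    return cost_shift_add(c) < c->mult;
}
```

With a hypothetical multiply cost of 8 and add cost of 1, a shift cost of 4 makes the expansion cost 9, so the multiply is kept, while a shift cost of 1 makes it cost 3, so the expansion is chosen. Which choice is faster on real hardware is exactly what the measurements in this message are probing.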

Computing the cost of lea based on the decomposed cost - 1 brings a
regression:

  0:  -33.99  -51.87  -16.96   -2.39    0.79   32.14    0.06   -0.01   -3.47    4.86   -1.00   -0.33    0.84    1.11   -0.39    4.49 
 16:    0.32    0.15   -0.02    0.06    0.01   -4.85   -2.22   -0.01    0.00    0.01    0.07   -1.39   -0.69   -0.09    0.38    1.40 
 32:   -3.49    0.08    0.14   -0.07   -0.05   -0.95   -0.01    1.02   -0.45   -0.06    0.45   -1.35   -0.85    0.06   -1.55   -3.35 
 48:   30.83   15.29    0.14    1.49    0.67    0.40    0.39   -2.83   -0.76   -0.63   -1.21    0.28   -0.54    0.80    1.49   -5.69 
 64:  -49.67    1.55    1.52   -0.03   -0.08   -5.09    1.60   -1.07   -1.11    1.21   -2.98   13.95   -0.96   -0.41    0.65    0.03 
 80:    0.68    0.72   -0.05   -0.35    0.10   -0.15    0.84   -0.08   -0.28    4.99    0.51   -0.13    0.88    0.03   10.53    0.07 
 96:   -0.02    0.28    0.54    0.13  -10.68    0.02   -0.42   -0.29   10.04  -42.40   -9.86    1.62    1.06   -0.53   -0.37    0.77 
112:    0.18    1.39   -1.44    1.45   -0.04   -0.60   -6.15    1.29   -0.06   14.09   -0.43    0.60    0.12   70.00   -0.04    1.67 
128:  -49.03  -37.23    1.32   -1.06    0.53   -1.31    1.47   -1.00    3.81    2.18   -1.87   -0.22   -1.14    0.47    2.17    1.88 
144:    1.44   -1.36    0.56   -0.91   -0.01   -6.02    0.10    1.86    0.24    0.07    0.72   -1.83    0.79   -0.20   -2.06  -24.98 
160:   -0.73  -24.99   -1.32   -1.95    0.06   -1.56   -0.36    2.36   -0.74   -0.75   -0.82    6.00   -3.67    0.22   -0.74    1.60 
176:    2.81   -0.05    1.96   10.89    0.63   -0.72    0.37   -2.23   -0.13    6.71    3.46    0.43  -12.53    0.56   -1.76    0.23 
192:   -0.11   -1.04    0.13   -0.11   -0.27   -1.12   -0.08    0.79    4.88   -1.13    1.04   11.34    1.46    9.99    0.77    1.81 
208:   -2.04    0.03   19.49  -11.27   -0.63  -30.79    0.55   -0.03   -0.20  -42.56   -3.43   -0.11   17.94    2.61    0.46    0.03 
224:    0.16    0.14    0.25    1.96    1.71   -0.81   -1.68   -0.24   -0.36    1.02   -1.15    1.19    0.53    0.17    0.82   -0.80 
240:    1.22   -0.90    0.86   -6.30   -1.15   -1.02    0.26    1.23   -0.39    0.99   54.98    1.16    0.77   -0.77    2.82  -36.54 
avg:-1.095159
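
The lea costing tried for the table above can be sketched as follows, assuming "decomposed cost - 1" means the summed cost of the shift and add that a lea replaces, minus one. The names, the clamp to a minimum of 1 and the cost values are assumptions made for this sketch, not the code in the patch.

```c
/* Hypothetical per-instruction costs (illustrative only). */
struct lea_toy_costs {
    int shift;  /* cost of one shift by a constant */
    int add;    /* cost of one add */
};

/* Cost of doing by hand what "lea base + index * scale" computes:
   a shift when scale is 2, 4 or 8, and an add when a base is present. */
static int decomposed_cost(const struct lea_toy_costs *c, int has_base, int scale)
{
    int cost = 0;
    if (scale > 1)
        cost += c->shift;
    if (has_base)
        cost += c->add;
    return cost;
}

/* lea cost derived from the decomposed sequence, minus one.
   The clamp to at least 1 is an assumption of this sketch. */
static int lea_cost(const struct lea_toy_costs *c, int has_base, int scale)
{
    int cost = decomposed_cost(c, has_base, scale) - 1;
    return cost < 1 ? 1 : cost;
}
```

Under this reading, a lea computing x + x*4 costs 4 when shift is 4 and add is 1, and 1 when both shift and add cost 1, so the lea cost tracks whatever the shift cost is set to.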

My latency hack:

  0:   -1.60  -51.84   -1.28   -2.03  -49.12   23.26    1.08    0.04   -1.62    9.26   -1.07    1.29    0.85    0.83  -28.07    2.84 
 16:  -48.97   -5.47   -0.02   -0.06    0.00    0.01    0.83   -1.83   22.49   -0.06    1.06   -0.73    2.48   -0.65  -13.23   -1.19 
 32:   92.89   -1.61   -1.12   37.64   -0.14   -0.15   -0.02    1.08    0.11   -1.33    0.51    5.95    0.26   -0.84    1.10   -0.91 
 48:   34.65   12.72   39.69   -0.70    0.89    1.08    0.33    2.54   11.68   -0.17    0.90   20.76   34.60   -0.17  -14.03   -7.24 
 64:    0.65  -10.06    0.16   -1.46   -9.26   18.16   42.03   -1.76   -1.15    1.31   -1.85   14.05   -0.03   -0.72    1.08   -1.22 
 80:    1.80    0.19    0.34   -3.39    0.82   -0.02  -15.56    0.80    0.10    5.10    1.36    0.11    0.16    0.06   24.56   -0.82 
 96:   -0.07   -2.74    0.18   -3.13   26.06   -0.77   -0.30   16.18    0.81   -6.34    0.86  -26.65    0.10   -3.62  -31.88   -4.47 
112:  -14.16   -3.43   -0.61   -0.21    0.76    0.10    0.15   22.06   32.27   14.11   -0.94   20.01   32.31   39.70  -15.37    2.97 
128:  -49.07   -0.01   -0.69   -0.94   -0.83    0.57    0.78   -1.86    3.78    2.52   -2.42   -2.42   34.06   33.12    1.61    3.19 
144:    3.33   -1.24    1.11   -1.65   -0.39   -1.19   -0.63   17.90    0.72    0.03  -17.99   -0.02   14.16    0.69    0.71    0.04 
160:    0.50    0.00   -0.65    0.88   -0.75    0.34  -15.14    2.36   -2.51   -0.05   -0.02    4.05    0.69   -0.06   -0.06   53.05 
176:    0.84   -7.23    2.90    1.66   -7.20   -0.85    0.45  122.93    0.00    6.83    5.48    0.25   -0.32    1.47   13.39    1.59 
192:   -1.13    1.29   -2.80   -0.46   16.38   -0.90    0.02   19.46   30.05   -1.57    2.04    0.99    1.37   10.90   -0.81   -0.90 
208:   -0.63    1.11   -1.05   -5.64    0.73   -0.11  -20.19  -27.65    0.80  -10.69    5.52    0.02   18.04    2.00   -8.12   -4.75 
224:  -15.89   -4.09   -7.44  -11.51    1.70   -1.64  -25.97    0.24    1.64   -1.84  -21.04    0.64   21.21   -0.35   17.35   14.40 
240:   -2.28   14.29   -0.70   -2.14    0.08   -0.37    5.10   22.21   -0.36   -0.05   56.16   22.34   34.48    0.02  -15.47   -1.72 
avg:1.631178

Combining latency hack and lea decomposition:

  0:  -11.11  -51.77  -25.40   -2.76    1.94   26.50    1.17   -1.23    0.11   10.97   -1.91    1.28   -0.27    2.10  -28.16    4.28 
 16:  -43.20    0.17    0.07    0.01   -2.40   -1.28    0.89    0.17    0.08    0.07    1.09   -1.12    3.41    0.46  -14.25    1.31 
 32:   93.27   -3.80    1.51   36.93    1.08   -1.21   -0.19   -0.16    1.24   -2.87   -9.96    1.64    1.07   -1.15   -0.21    0.31 
 48:   34.76   14.31   39.59    1.24   -8.64   -8.73    0.31   -3.54   13.96   -0.37    0.42   22.03   31.67    0.84  -15.52    2.93 
 64:  -49.64  -13.87    0.09    0.97  -25.57   17.13   43.25   -0.13   -1.06    1.38   -1.16    0.68   -0.05   -2.56    0.64   -0.55 
 80:    0.73   -1.66  -10.51   -6.74    0.79   -0.32   -0.23    0.93    0.01    5.95    1.01   -0.08    0.67    0.06   22.84   -1.29 
 96:   -0.98    0.32    0.44   -0.24   26.04    0.11   -0.72   15.73    0.83  -42.73  -13.81    0.01    1.00   -0.76    2.56   -6.10 
112:  -14.61   -3.39   -0.81   -5.99    0.77    0.12    5.02   21.70   32.38   12.03   -0.53   22.87   32.56   71.79  -17.24    3.14 
128:    1.77   -1.47    1.26   -0.98    0.52   -0.40    0.38   -0.98    2.73    2.15   -4.77   -7.48   32.79   24.64    2.52    0.83 
144:    3.27    0.08    0.11   -0.88    0.73   -0.53   -0.29   16.03    0.70   -0.27   -0.41   -0.65   14.44   -0.10    1.07   -1.28 
160:    1.53   -1.30    0.35   -0.03   -0.02    0.34   -0.59   -0.92   -1.30   -1.60    0.05    7.76   -5.74   -0.45    0.65   53.53 
176:    0.00   -0.02    2.52   16.88    0.83   -0.06    0.41  123.52  -13.74    7.65    5.81    0.15    0.81    0.61   15.60   -0.94 
192:   -0.79  -11.68    0.42    0.00   16.50   -0.87    0.01   17.46   31.16   -0.85    0.73   15.15    1.41   -2.63   -9.24    1.80 
208:    0.02    1.05  -40.35  -11.52    0.76  -30.78    0.56   -0.97   -0.38  -42.41   -2.78    0.11   20.90    3.58   -8.11   -5.11 
224:  -14.38   -5.42   -7.79  -21.75   -5.79   -0.59  -25.35   -0.44    0.04   -1.27  -21.09    0.77   22.41   -0.68   18.18    1.65 
240:    2.17    1.13   -0.85    0.13    0.23   -1.05   10.55   22.84   33.66   -0.02   53.66   21.57   32.42    1.02  -13.99    0.00 
avg:1.032998


Setting shift cost to 2:

  0:   -2.92  -34.39   47.90    0.92    1.77   30.93   17.04   -0.02  -49.99    7.44   -1.23    1.25   17.74    0.70    0.03    2.92 
 16:  -48.97   -1.27   -0.05    1.35   -5.20  -49.71  -24.12    1.34   57.34   32.41    2.60   31.52    0.75   -3.38  -54.74   -0.31 
 32:   85.75   -1.38   22.98   27.19   -1.04   -1.41    1.08   -0.31    1.25   -1.41  -48.69  -20.61   17.21  -29.81   -1.14   -1.00 
 48:   59.10   14.23   44.14   11.42   -2.28    1.23   42.96   -3.14   -0.80   -0.38    1.22    0.02    0.30   -0.31    1.48    1.35 
 64:  -27.30  -44.37   19.76    8.71   -0.11   15.89    2.26   -1.42   -0.04   -0.08   -0.06   12.90    0.54  -50.26   -9.68   -1.27 
 80:    1.59   -1.09    0.97   -3.28  -42.68   -0.11    0.61   -0.05    0.01    6.74  -32.45   69.34    3.15  -29.57    5.08    0.04 
 96:   17.53   -0.83    3.57   10.81   22.62    0.93   34.89   -0.39    0.94  -42.46   -9.93   -0.88   26.55   -0.72   -0.14    0.12 
112:    0.07    1.98   -1.06    0.46    0.70   -0.80   -4.71    1.17   -0.17   10.78    2.51    0.65    0.02   69.41  -36.72    2.92 
128:  -17.46  -13.57   22.69    6.88    0.52   -1.47    1.42   -0.99    3.82    0.14   -2.26    0.08   -1.47  -32.94    1.67    3.28 
144:    2.20    0.05   -0.23   -3.27   -0.88    0.31   -5.34   -0.04    1.55    0.03   -1.95  -11.66   -2.05    0.70  -10.21    0.04 
160:   -0.67    0.04   -1.77   -2.61   -0.53   -1.58   -0.25  -26.37  -42.99  -24.39   -1.37   -8.81    1.82  -50.49    1.33  -23.45 
176:   14.65   -0.53    2.97   16.67  -32.82    0.62    0.34  123.94  -13.19    6.79  -29.00   29.61    1.17  -28.78    0.56    0.34 
192:   18.03    1.30    1.03   10.92   -3.76   -6.21   34.02    2.20   29.97   -0.01    3.43   17.50   36.28   -1.73  -11.37    0.91 
208:    0.86    0.08   19.60    7.02    1.54  -30.76    2.70    0.05   24.62  -42.44   -3.39    0.33   17.55    0.38   -1.99   -0.08 
224:   -1.39    0.17    0.96    1.46    0.15   -1.23    0.45   -0.18   -7.67    0.11   -0.02    0.19    1.06    0.13    0.41   -1.03 
240:    2.15   -1.09    1.66   50.64    0.02   -0.18   -0.81    0.01    0.08   -0.10   54.39    0.29   -0.53    0.97    1.24    0.04 
avg:-1.589189

I've verified that the 1.6% speedup is above the noise (the worst combination
out of 5 runs is still a 1.2% speedup).
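
One way to read "worst combination" is to pair every baseline run with every patched run and take the smallest speedup seen. The C sketch below does that; the function names and this interpretation are assumptions, and the message does not say how the check was actually done.

```c
#include <stddef.h>

/* Speedup of `patched` over `base` in percent; positive means faster.
   Both arguments are run times in the same unit. */
static double speedup_pct(double base, double patched)
{
    return (base / patched - 1.0) * 100.0;
}

/* Smallest speedup over all (baseline run, patched run) pairings.
   n_base and n_patched must both be at least 1. */
static double worst_speedup_pct(const double *base, size_t n_base,
                                const double *patched, size_t n_patched)
{
    double worst = speedup_pct(base[0], patched[0]);
    for (size_t i = 0; i < n_base; i++) {
        for (size_t j = 0; j < n_patched; j++) {
            double s = speedup_pct(base[i], patched[j]);
            if (s < worst)
                worst = s;
        }
    }
    return worst;
}
```

If the worst pairing still shows a positive speedup, the improvement is unlikely to be run-to-run noise alone.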

Honza
> 
> Honza

