This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: PING fwprop and pr/19653


Bernd Schmidt wrote:
> Paolo Bonzini wrote:
>> fwprop merge, stage 2 project:
>> http://gcc.gnu.org/ml/gcc-patches/2006-02/msg01420.html
>> http://gcc.gnu.org/ml/gcc-patches/2006-02/msg01421.html
>
> I don't yet see a completely convincing argument that this really is a full replacement of cse's path following. I played with it a little and got a few cases of missed optimizations, but I have no clear data to indicate whether that's a problem.

If you can give them to me, I can play with them.

I must say I *don't* have a convincing argument. There are clues, however, that most things that CSE path following does have already been moved to other passes.

1) In 4.2, GCSE was modified to assign value numbers to REG_EQUAL notes. This, plus the change from 1 to 2 GCSE passes, allowed GCSE to eliminate redundant address computations. The last time David Edelsohn SPEC-tested these changes on PowerPC, the only serious regression was in bzip2 and it's fixed by this change (see http://gcc.gnu.org/ml/gcc/2005-09/msg00821.html for a description of the problem).

2) Steven Bosscher ran Nullstone, and the only problems he found were caused by the elimination of the fold_rtx code, *not* by the elimination of path following. They were fixed at the tree level (PR23911).

3) Other problems revealed by SPEC testing were caused by optimizations that CSE should not have done, for example by non-canonical RTL produced by expand (http://gcc.gnu.org/ml/gcc-patches/2005-09/msg00863.html). These have been fixed as well; the patches are already on mainline.
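
To make point 1 more concrete, here is a toy sketch of value numbering that takes an equality note into account. It is not GCC's implementation: the tuple-based instruction format, the `number_values` function, and the register and operator names are all made up for illustration, and every destination is assumed to be set exactly once.

```python
def number_values(insns):
    """Map each destination to a value number.

    Each insn is (dest, expr, note). expr and note are tuples such as
    ("plus", "base", "8"); note is None when the insn carries no
    equality note. Assumes every dest is set exactly once.
    """
    table = {}    # canonical expression -> value number
    numbers = {}  # destination register -> value number
    for dest, expr, note in insns:
        # Prefer the note: it states what the result equals, whatever
        # form the insn itself uses to compute it.
        key = note if note is not None else expr
        # Replace operands by their value numbers so that registers
        # holding the same value produce the same key.
        key = (key[0],) + tuple(numbers.get(arg, arg) for arg in key[1:])
        numbers[dest] = table.setdefault(key, len(table))
    return numbers


# Hypothetical insns: r2 is computed differently from r1, but its note
# says that it equals base + 8.
insns = [
    ("r1", ("plus", "base", "8"), None),
    ("r2", ("minus", "end", "24"), ("plus", "base", "8")),
    ("r3", ("mult", "r1", "4"), None),
    ("r4", ("mult", "r2", "4"), None),
]
numbers = number_values(insns)
print(numbers)
```

In this sketch r1 and r2 receive the same number only because of the note, and that in turn lets r4 be recognized as a recomputation of r3. Dropping the notes gives all four registers distinct numbers.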

> Benchmark results of baseline, (baseline + fwprop), and (baseline + fwprop - expensive-parts-of-cse), across as many targets as possible, would be welcome.

SPEC results are neutral on i686-pc-linux-gnu, but I cannot find them anymore. Here are David's old results for PowerPC, before the change to fix bzip2.

SPECint, -O2:
Benchmark      Ref (s) Run (s)   Ratio Ref (s) Run (s)   Ratio
164.gzip          1400     242     578    1400     242     579
175.vpr           1400     199     703    1400     197     712
176.gcc           1100      --       X    1100      --       X
181.mcf           1800     183     986    1800     182     990
186.crafty        1000     143     700    1000     144     697
197.parser        1800     326     552    1800     326     552
252.eon           1300      --       X    1300      --       X
253.perlbmk       1800     349     516    1800     347     518
254.gap           1100     178     617    1100     177     622
255.vortex        1900   0,022       X    1900   0,022       X
256.bzip2         1500     197     760    1500     205     732
300.twolf         3000     297    1011    3000     289    1038
Geometric mean                     695                     696

SPECfp, -O2:
Benchmark      Ref (s) Run (s)   Ratio Ref (s) Run (s)   Ratio
168.wupwise       1600     232     691    1600     233     686
171.swim          3100     395     785    3100     385     806
172.mgrid         1800     439     410    1800     438     411
173.applu         2100     372     565    2100     374     562
177.mesa          1400     198     707    1400     198     708
178.galgel        2900      --       X    2900      --       X
179.art           2600     210    1235    2600     211    1230
183.equake        1300     160     810    1300     158     823
187.facerec       1900     247     770    1900     248     766
188.ammp          2200     422     521    2200     425     518
189.lucas         2000     222     901    2000     222     899
191.fma3d         2100     372     564    2100     369     570
200.sixtrack      1100     261     422    1100     260     424
301.apsi          2600     388     670    2600     387     671
Geometric mean                     667                     668


SPECint, -O3:
Benchmark      Ref (s) Run (s)   Ratio Ref (s) Run (s)   Ratio
164.gzip          1400     248     565    1400     245     571
175.vpr           1400     198     706    1400     194     721
176.gcc           1100      --       X    1100      --       X
181.mcf           1800     174    1033    1800     174    1037
186.crafty        1000     140     712    1000     141     709
197.parser        1800     296     608    1800     296     608
252.eon           1300      --       X    1300      --       X
253.perlbmk       1800     345     522    1800     346     520
254.gap           1100     179     613    1100     181     607
255.vortex        1900   0,022       X    1900   0,022       X
256.bzip2         1500     196     767    1500     197     760
300.twolf         3000     285    1052    3000     296    1014
Geometric mean                     710                     708

SPECfp, -O3:
Benchmark      Ref (s) Run (s)   Ratio Ref (s) Run (s)   Ratio
168.wupwise       1600     223     717    1600     225     712
171.swim          3100     239    1296    3100     249    1244
172.mgrid         1800     321     561    1800     324     556
173.applu         2100     312     674    2100     312     673
177.mesa          1400     192     730    1400     193     727
178.galgel        2900      --       X    2900      --       X
179.art           2600     188    1383    2600     189    1376
183.equake        1300     142     913    1300     142     915
187.facerec       1900     182    1045    1900     183    1038
188.ammp          2200     403     546    2200     400     550
189.lucas         2000     219     913    2000     221     904
191.fma3d         2100     356     590    2100     351     598
200.sixtrack      1100     232     474    1100     229     481
301.apsi          2600     325     801    2600     331     785
Geometric mean                     777                     773
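
For readers unfamiliar with the layout of the SPEC tables above: each benchmark row gives a reference time, a run time, and a ratio, twice over, and the unlabeled last row of each table is the geometric mean of the valid ratios. The sketch below recomputes the SPECint -O2 summary from the ratios as printed. The function names are my own, and the ratio formula (100 * reference time / run time) is only approximate here because the printed run times are rounded.

```python
import math


def spec_ratio(ref_seconds, run_seconds):
    """Ratio of reference time to run time, scaled by 100."""
    return 100.0 * ref_seconds / run_seconds


def geometric_mean(values):
    """Geometric mean, computed through logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Ratios from the SPECint -O2 table; rows marked X (176.gcc, 252.eon,
# 255.vortex) are invalid runs and are left out.
first_ratios = [578, 703, 986, 700, 552, 516, 617, 760, 1011]
second_ratios = [579, 712, 990, 697, 552, 518, 622, 732, 1038]

first_mean = geometric_mean(first_ratios)
second_mean = geometric_mean(second_ratios)

print(round(spec_ratio(1400, 242), 1))       # 164.gzip, listed as 578 and 579
print(round(first_mean), round(second_mean))  # 695 696, as in the table
```

A geometric mean keeps one benchmark with a large ratio (300.twolf here) from dominating the summary, which is why a 4% swing in 256.bzip2 still shows up as only a one-point difference overall.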


If you can run EEMBC on blackfin, that'd be great.

> You'll get bonus points if you can eliminate combine.c too :)

Only if you first eliminate reload.c :-)

Paolo

