This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


[patch] Enable -fweb and -frename-registers when unrolling


Hello,

enabling -fweb and -frename-registers causes givs to be split (the unroller
only splits bivs, which is usually sufficient on architectures with rich
addressing modes, where cse then propagates the bivs into addresses).
However, since hardly anyone knows about these flags, I keep getting
bug reports of the form "gcc does not perform iv splitting" (see PR 20376,
for example).
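
As a rough source-level picture of what splitting an induction variable
means (the real transformation happens on RTL, and the function names below
are made up for this sketch), compare a loop unrolled four times where one
pointer is bumped between the copies with the same loop where every address
is derived from the pointer's value at the start of the iteration:

```c
#include <assert.h>
#include <stddef.h>

/* Unrolled by 4, induction variable not split: each load has to wait
   for the previous increment of p, so the four copies form one serial
   dependency chain.  */
long sum_chained (const int *a, size_t n)
{
  assert (n % 4 == 0);   /* Sketch only: no remainder loop.  */
  const int *p = a;
  long s = 0;
  for (size_t i = 0; i < n; i += 4)
    {
      s += *p; p++;
      s += *p; p++;
      s += *p; p++;
      s += *p; p++;
    }
  return s;
}

/* Same loop with the induction variable split: every address is an
   offset from the value p had at the start of the iteration, and p is
   updated once, so the address computations are independent.  */
long sum_split (const int *a, size_t n)
{
  assert (n % 4 == 0);   /* Sketch only: no remainder loop.  */
  const int *p = a;
  long s = 0;
  for (size_t i = 0; i < n; i += 4)
    {
      s += p[0];
      s += p[1];
      s += p[2];
      s += p[3];
      p += 4;
    }
  return s;
}
```

Both functions compute the same sum; the second form simply leaves the
scheduler more freedom, which is the effect the flags are meant to provide
for givs.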

I hope these flags can one day be enabled by default, but this does not
seem feasible at the moment because of their compile-time cost.
However, they are most useful when loops are unrolled, and with
-funroll-loops compile time is much less of a concern, so perhaps they
could be enabled at least then?

Some benchmark results:  

On i686, there is a clear improvement on specint; specfp is neutral
overall (but with some significant fluctuations).
Base is -O2 -funroll-loops, peak is -O2 -funroll-loops -fweb -frename-registers.

                     Base                          Peak
   Benchmark         Ref(s) Run(s)   Ratio         Ref(s) Run(s)   Ratio
   164.gzip          1400   197       712    *     1400   197       711    *
   175.vpr           1400   397       353    *     1400   394       355    *
   176.gcc                                   X                             X
   181.mcf           1800   728       247    *     1800   736       245    *
   186.crafty        1000   124       806    *     1000   119       842    *
   197.parser        1800   396       454    *     1800   394       456    *
   252.eon           1300   143       911    *     1300   142       916    *
   253.perlbmk       1800   217       830    *     1800   206       872    *
   254.gap           1100   169       652    *     1100   169       649    *
   255.vortex        1900   236       807    *     1900   231       823    *
   256.bzip2         1500   328       457    *     1500   329       455    *
   300.twolf         3000   785       382    *     3000   781       384    *
   Est. SPECint_base2000              556    
   Est. SPECint2000                                                 562    

   168.wupwise       1600   278       575    *     1600   270       592    *
   171.swim          3100   652       476    *     3100   661       469    *
   172.mgrid         1800   499       361    *     1800   496       363    *
   173.applu         2100   501       419    *     2100   500       420    *
   177.mesa          1400   212       662    *     1400   199       703    *
   178.galgel                                X                             X
   179.art           2600  1418       183    *     2600  1417       183    *
   183.equake        1300   283       459    *     1300   286       455    *
   187.facerec       1900   467       407    *     1900   460       413    *
   188.ammp          2200   606       363    *     2200   611       360    *
   189.lucas         2000   425       470    *     2000   425       471    *
   191.fma3d         2100   397       529    *     2100   389       540    *
   200.sixtrack      1100   227       486    *     1100   238       462    *
   301.apsi          2600   660       394    *     2600   668       389    *
   Est. SPECfp_base2000               428    
   Est. SPECfp2000                                                  430    
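
The i686 tables list only the ratios, not the relative change. A minimal
sketch for computing it follows; the helper names `pct_change` and `report`
are made up for illustration and are not part of any SPEC tooling.

```c
#include <stdio.h>

/* Relative change of the peak ratio over the base ratio, in percent.  */
double pct_change (double base, double peak)
{
  return (peak - base) / base * 100.0;
}

/* Print one row; the caller passes the name and the two ratios as
   they appear in the tables above.  */
void report (const char *name, double base, double peak)
{
  printf ("%-14s %+.2f%%\n", name, pct_change (base, peak));
}
```

For example, `report ("186.crafty", 806, 842)` gives a change of about
+4.47%. Applied to the estimated totals, this is roughly +1.1% for SPECint
(556 to 562) and +0.5% for SPECfp (428 to 430).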

On ia64, the results are generally positive (each Diff is relative to the
-O3 -funroll-loops base):

	-O3 -funroll-loops	-fweb		-frename-regs	both	
			Base	Patch	Diff	Patch	Diff	Patch	Diff
   164.gzip          	662	651	-1.66%	704	6.34%	703	6.19%
   175.vpr           	835	829	-0.72%	852	2.04%	853	2.16%
   176.gcc           	X	X	X	X	X	X	X
   181.mcf           	700	701	0.14%	695	-0.71%	700	0.00%
   186.crafty        	859	854	-0.58%	890	3.61%	878	2.21%
   197.parser        	X	X	X	X	X	X	X
   252.eon           	787	785	-0.25%	810	2.92%	814	3.43%
   253.perlbmk       	807	809	0.25%	813	0.74%	812	0.62%
   254.gap           	536	541	0.93%	544	1.49%	543	1.31%
   255.vortex        	886	895	1.02%	901	1.69%	908	2.48%
   256.bzip2         	682	695	1.91%	717	5.13%	707	3.67%
   300.twolf         	992	992	0.00%	989	-0.30%	984	-0.81%
   SPECint2000  	764	765	0.13%	782	2.36%	780	2.09%

	-O3 -funroll-loops	-fweb		-frename-regs	both	
			Base	Patch	Diff	Patch	Diff	Patch	Diff
   168.wupwise       	472	484	2.54%	481	1.91%	496	5.08%
   171.swim          	712	735	3.23%	739	3.79%	737	3.51%
   172.mgrid         	376	377	0.27%	380	1.06%	379	0.80%
   173.applu         	507	510	0.59%	516	1.78%	514	1.38%
   177.mesa          	782	775	-0.90%	827	5.75%	825	5.50%
   178.galgel        	X	X	X	X	X	X	X
   179.art           	1947	2030	4.26%	2052	5.39%	2054	5.50%
   183.equake        	498	497	-0.20%	490	-1.61%	490	-1.61%
   187.facerec       	565	566	0.18%	567	0.35%	567	0.35%
   188.ammp          	677	677	0.00%	701	3.55%	701	3.55%
   189.lucas         	863	864	0.12%	869	0.70%	868	0.58%
   191.fma3d         	296	296	0.00%	288	-2.70%	288	-2.70%
   200.sixtrack      	320	333	4.06%	345	7.81%	342	6.88%
   301.apsi          	537	558	3.91%	593	10.43%	603	12.29%
   SPECfp2000   	579	586	1.21%	595	2.76%	597	3.11%

Bootstrapped & regtested on i686 and ia64 (with -funroll-loops enabled).

Zdenek

	PR rtl-optimization/20376
	* toplev.c (process_options): Enable -fweb and -frename-registers when
	unrolling.

Index: toplev.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/toplev.c,v
retrieving revision 1.960
diff -c -3 -p -r1.960 toplev.c
*** toplev.c	4 Jun 2005 14:02:35 -0000	1.960
--- toplev.c	24 Jun 2005 17:02:44 -0000
*************** process_options (void)
*** 1507,1515 ****
    if (flag_unroll_all_loops)
      flag_unroll_loops = 1;
  
!   /* The loop unrolling code assumes that cse will be run after loop.  */
    if (flag_unroll_loops || flag_peel_loops)
!     flag_rerun_cse_after_loop = 1;
  
    /* If explicitly asked to run new loop optimizer, switch off the old
       one.  */
--- 1507,1521 ----
    if (flag_unroll_all_loops)
      flag_unroll_loops = 1;
  
!   /* The loop unrolling code assumes that cse will be run after loop.
!      Also enable -fweb and -frename-registers that help scheduling
!      the unrolled loop.  */
    if (flag_unroll_loops || flag_peel_loops)
!     {
!       flag_rerun_cse_after_loop = 1;
!       flag_web = 1;
!       flag_rename_registers = 1;
!     }
  
    /* If explicitly asked to run new loop optimizer, switch off the old
       one.  */
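
The implication chain in the hunk above can be modelled in isolation. This
is only a sketch: the struct and the function name are hypothetical, and in
GCC the flags are separate global variables rather than struct fields.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the global flags the hunk touches.  */
struct unroll_flags
{
  bool unroll_all_loops, unroll_loops, peel_loops;
  bool rerun_cse_after_loop, web, rename_registers;
};

/* Mirrors the patched logic: -funroll-all-loops implies
   -funroll-loops, and unrolling or peeling implies the cse rerun,
   -fweb and -frename-registers.  */
void imply_unroll_flags (struct unroll_flags *f)
{
  if (f->unroll_all_loops)
    f->unroll_loops = true;

  if (f->unroll_loops || f->peel_loops)
    {
      f->rerun_cse_after_loop = true;
      f->web = true;
      f->rename_registers = true;
    }
}
```

Note that the assignments are unconditional, so in this model (as in the
patch as posted) the two flags are turned on whenever unrolling or peeling
is requested, regardless of their previous value.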

