This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: [PATCH] Enable -funroll-loops at -O3


Andrew Pinski wrote:
> This seems like something which we should do as it is a huge boost for most
> code now.

I ran SPEC95 and SPEC2000 on x86 with and without -funroll-loops. Here are the changes in the mean scores (base is -O2 -ffast-math, peak is -O3 -ffast-math):


SPECint2000 base: -0.66%
SPECint2000 peak: +0.77%

SPECfp2000 base: +2.34%
SPECfp2000 peak: +2.03%

SPECint95 base: +10.59%
SPECint95 peak: +8.59%

SPECfp95 base: +1.22%
SPECfp95 peak: +0.77%

Full comparisons attached.

252.eon does get a good jolt with -funroll-loops (+16% at -O2, +26% at -O3), while 253.perlbmk suffers quite a bit (-12% at -O2, -13% at -O3).

The only codegen problem I found is that we miscompile SPEC95's 129.compress with -funroll-loops.

Overall, it seems to be a wash, though. I guess I wouldn't mind if we enable this at -O3. After all, we make weaker speedup guarantees at -O3.


Diego.
Comparison between 20050111/spec-20050111.stats and 20050113/spec-20050113.stats (base peak)

Compiler used in 20050111/spec-20050111.stats (Before)

Compiler:   gcc version 4.0.0 20050111 (experimental)
Base flags: -O2 -ffast-math
Peak flags: -O3 -ffast-math
Processor:  Intel(R) Pentium(R) 4 CPU 2.26GHz (2259.264 MHz)
Memory:     1034472 kB
Cache:      512 KB

Compiler used in 20050113/spec-20050113.stats (After)

Compiler:   gcc version 4.0.0 20050113 (experimental)
Base flags: -O2 -ffast-math -funroll-loops
Peak flags: -O3 -ffast-math -funroll-loops
Processor:  Intel(R) Pentium(R) 4 CPU 2.26GHz (2259.264 MHz)
Memory:     1034472 kB
Cache:      512 KB


SPECint results for base

    Benchmark	Before	 After	% diff
     164.gzip	623.48	620.19	-  0.53%
      175.vpr	420.96	421.35	+  0.09%
      176.gcc	  0.00	  0.00	INF
      181.mcf	426.68	426.13	-  0.13%
   186.crafty	648.53	644.25	-  0.66%
   197.parser	548.63	550.07	+  0.26%
      252.eon	583.36	675.38	+ 15.77%
  253.perlbmk	806.60	711.21	- 11.83%
      254.gap	744.92	756.79	+  1.59%
   255.vortex	832.08	821.80	-  1.24%
    256.bzip2	521.85	513.57	-  1.59%
    300.twolf	529.74	493.22	-  6.89%
         mean	593.40	589.47	-  0.66%
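
The "% diff" column in these tables is consistent with the relative change (After - Before) / Before, and a score of 0.00 appears to mark a benchmark that did not produce a valid result. Both readings are inferred from the numbers, not stated in the report. A minimal Python sketch that recomputes the column for a few rows copied from the table above; the function name `pct_diff` is made up for illustration:

```python
def pct_diff(before, after):
    """Relative change in percent, or None when there is no valid baseline."""
    if before == 0.0:
        # No "Before" score to compare against (the report prints INF here).
        return None
    return (after - before) / before * 100.0


# Rows copied from the SPECint base table: (benchmark, before, after)
rows = [
    ("164.gzip", 623.48, 620.19),
    ("176.gcc", 0.00, 0.00),
    ("252.eon", 583.36, 675.38),
    ("253.perlbmk", 806.60, 711.21),
]

for name, before, after in rows:
    d = pct_diff(before, after)
    shown = "n/a" if d is None else f"{d:+7.2f}%"
    print(f"{name:>12}  {shown}")
```

Returning `None` instead of a number for the 0.00 rows keeps failed runs from being mistaken for a measured change.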


SPECfp results for base

    Benchmark	Before	 After	% diff
  168.wupwise	574.44	692.98	+ 20.64%
     171.swim	497.17	493.13	-  0.81%
    172.mgrid	393.88	397.87	+  1.01%
    173.applu	558.89	566.16	+  1.30%
     177.mesa	467.30	466.14	-  0.25%
   178.galgel	481.65	484.66	+  0.63%
      179.art	204.65	204.64	-  0.00%
   183.equake	828.71	833.69	+  0.60%
  187.facerec	341.59	347.43	+  1.71%
     188.ammp	348.20	352.78	+  1.32%
    189.lucas	485.00	489.90	+  1.01%
    191.fma3d	463.99	430.09	-  7.31%
 200.sixtrack	310.93	351.14	+ 12.93%
     301.apsi	439.93	451.36	+  2.60%
         mean	435.35	445.54	+  2.34%


SPECint results for peak

    Benchmark	Before	 After	% diff
     164.gzip	633.08	628.64	-  0.70%
      175.vpr	430.42	428.28	-  0.50%
      176.gcc	  0.00	  0.00	INF
      181.mcf	435.71	434.27	-  0.33%
   186.crafty	639.42	654.68	+  2.39%
   197.parser	588.11	591.84	+  0.64%
      252.eon	535.33	673.52	+ 25.82%
  253.perlbmk	733.53	638.33	- 12.98%
      254.gap	737.85	768.31	+  4.13%
   255.vortex	842.93	831.93	-  1.30%
    256.bzip2	520.65	517.81	-  0.54%
    300.twolf	535.68	513.21	-  4.19%
         mean	590.45	594.99	+  0.77%


SPECfp results for peak

    Benchmark	Before	 After	% diff
  168.wupwise	610.20	692.62	+ 13.51%
     171.swim	500.30	498.04	-  0.45%
    172.mgrid	394.65	397.94	+  0.83%
    173.applu	557.09	578.54	+  3.85%
     177.mesa	479.95	476.19	-  0.78%
   178.galgel	  0.00	  0.00	INF
      179.art	203.40	205.26	+  0.91%
   183.equake	855.95	866.86	+  1.27%
  187.facerec	340.56	348.39	+  2.30%
     188.ammp	350.24	356.67	+  1.84%
    189.lucas	484.41	490.29	+  1.21%
    191.fma3d	463.72	431.90	-  6.86%
 200.sixtrack	311.74	339.95	+  9.05%
     301.apsi	443.17	447.93	+  1.07%
         mean	436.30	445.16	+  2.03%

Comparison between 20050112/spec-20050112.stats and 20050113/spec-20050113.stats (base peak)

Compiler used in 20050112/spec-20050112.stats (Before)

Compiler:   gcc version 4.0.0 20050112 (experimental)
Base flags: -O2 -ffast-math
Peak flags: -O3 -ffast-math
Processor:  Pentium III (Coppermine) (996.778 MHz)
Memory:     254760 kB
Cache:      256 KB

Compiler used in 20050113/spec-20050113.stats (After)

Compiler:   gcc version 4.0.0 20050113 (experimental)
Base flags: -O2 -ffast-math -funroll-loops
Peak flags: -O3 -ffast-math -funroll-loops
Processor:  Pentium III (Coppermine) (996.778 MHz)
Memory:     254760 kB
Cache:      256 KB


SPECint results for base

    Benchmark	Before	 After	% diff
       099.go	 43.07	 40.85	-  5.17%
  124.m88ksim	 30.27	 34.90	+ 15.32%
      126.gcc	 33.91	 33.43	-  1.44%
 129.compress	 20.14	  0.00	-100.00%
       130.li	 35.14	 36.85	+  4.89%
    132.ijpeg	  0.00	  0.00	INF
     134.perl	 40.88	 40.62	-  0.65%
   147.vortex	 32.72	 32.61	-  0.33%
         mean	 32.91	 36.40	+ 10.59%
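
The "mean" rows match a geometric mean taken over the nonzero scores only (an inference from the figures, not something the report states). Since 129.compress counts toward the Before mean (20.14) but drops out of the After mean (0.00), the +10.59% compares two different benchmark sets. A sketch in Python, with hypothetical helper names, that reproduces the reported means and then recomputes them over only the benchmarks that are valid in both runs:

```python
import math


def geomean(values):
    """Geometric mean of the positive entries; zero scores are skipped."""
    valid = [v for v in values if v > 0.0]
    return math.exp(sum(math.log(v) for v in valid) / len(valid))


# SPECint95 base scores copied from the table above: name -> (before, after)
scores = {
    "099.go": (43.07, 40.85),
    "124.m88ksim": (30.27, 34.90),
    "126.gcc": (33.91, 33.43),
    "129.compress": (20.14, 0.00),
    "130.li": (35.14, 36.85),
    "132.ijpeg": (0.00, 0.00),
    "134.perl": (40.88, 40.62),
    "147.vortex": (32.72, 32.61),
}

# Means as the report seems to compute them: each column on its own.
before_all = geomean([b for b, _ in scores.values()])
after_all = geomean([a for _, a in scores.values()])

# Like-for-like: keep only benchmarks with a valid score in both runs.
common = [(b, a) for b, a in scores.values() if b > 0.0 and a > 0.0]
before_common = geomean([b for b, _ in common])
after_common = geomean([a for _, a in common])
change_common = (after_common - before_common) / before_common * 100.0

print(f"as reported:   {before_all:.2f} -> {after_all:.2f}")
print(f"like-for-like: {before_common:.2f} -> {after_common:.2f} ({change_common:+.2f}%)")
```

On these numbers the like-for-like change comes out at roughly +2% rather than +10.59%, so most of the apparent SPECint95 gain is due to the failed 129.compress run leaving the mean.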


SPECfp results for base

    Benchmark	Before	 After	% diff
  101.tomcatv	 28.87	 29.36	+  1.67%
     102.swim	 42.77	 42.70	-  0.17%
   103.su2cor	  9.87	  9.94	+  0.76%
  104.hydro2d	  9.69	  9.94	+  2.55%
    107.mgrid	 13.12	 13.26	+  1.09%
    110.applu	  0.00	  0.00	INF
   125.turb3d	 25.37	 26.07	+  2.73%
     141.apsi	  0.00	  0.00	INF
    145.fpppp	 54.63	 54.59	-  0.07%
    146.wave5	  0.00	  0.00	INF
         mean	 21.53	 21.80	+  1.22%


SPECint results for peak

    Benchmark	Before	 After	% diff
       099.go	 44.06	 39.57	- 10.21%
  124.m88ksim	 32.67	 38.31	+ 17.24%
      126.gcc	 34.57	 34.13	-  1.25%
 129.compress	 22.20	  0.00	-100.00%
       130.li	 36.98	 38.05	+  2.89%
    132.ijpeg	  0.00	  0.00	INF
     134.perl	 43.78	 43.09	-  1.58%
   147.vortex	 32.78	 32.81	+  0.11%
         mean	 34.54	 37.51	+  8.59%


SPECfp results for peak

    Benchmark	Before	 After	% diff
  101.tomcatv	 29.61	 30.03	+  1.42%
     102.swim	 43.23	 42.89	-  0.79%
   103.su2cor	  9.89	  9.98	+  0.91%
  104.hydro2d	  9.70	  9.94	+  2.47%
    107.mgrid	 13.25	 13.32	+  0.55%
    110.applu	  0.00	  0.00	INF
   125.turb3d	 26.11	 26.50	+  1.51%
     141.apsi	  0.00	  0.00	INF
    145.fpppp	 55.21	 54.85	-  0.66%
    146.wave5	  0.00	  0.00	INF
         mean	 21.81	 21.98	+  0.77%

