loop unrolling in the first loop pass

Jan Hubicka jh@suse.cz
Sat May 6 14:54:00 GMT 2000


Hi,
This patch changes toplev.c to run the loop unroller in the first loop pass.
It started as a quick hack for my prefetching work, but then I found that it
brings speedups overall.  It seems to make sense to strength reduce the
unrolled loop, since unrolling may expose more strength reduction
opportunities, and the pass also works as a nice cleanup after unrolling.
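
To illustrate why strength reduction after unrolling can pay off, here is a
minimal hand-written sketch.  It is not output from the compiler; the function
names and the unroll factor of 4 are assumptions chosen for illustration.  The
first function forms the index with a multiply on every iteration.  The second
shows roughly what unrolling followed by strength reduction could produce: the
multiplies become additions on a running index, and the offsets inside the
unrolled body are loop invariant.

```c
#include <stddef.h>

/* Original loop: one multiply per iteration to form the index.  */
long sum_rolled(const long *a, size_t n, size_t stride)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i * stride];
    return sum;
}

/* Hypothetical result of unrolling by 4 and then strength reducing:
   the per-iteration multiply is replaced by a running index, and the
   offsets used in the unrolled body are computed once outside the loop.  */
long sum_unrolled_reduced(const long *a, size_t n, size_t stride)
{
    long sum = 0;
    size_t idx = 0;                 /* running index, replaces i * stride */
    const size_t s2 = 2 * stride;   /* loop-invariant offsets */
    const size_t s3 = 3 * stride;
    const size_t s4 = 4 * stride;
    size_t i = 0;

    /* Unrolled body: four elements per trip.  */
    for (; i + 4 <= n; i += 4) {
        sum += a[idx];
        sum += a[idx + stride];
        sum += a[idx + s2];
        sum += a[idx + s3];
        idx += s4;
    }

    /* Remainder loop for the last n % 4 elements.  */
    for (; i < n; i++) {
        sum += a[idx];
        idx += stride;
    }
    return sum;
}
```

Both functions return the same sum; only the address arithmetic differs.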

This is temporary anyway, since I plan to convert the loop invariant/induction
variable code into a library and then run the passes independently (it probably
makes sense to run the prefetching pass after sched1, when the expected timings
of blocks are known).

Here is a comparison for -funroll-all-loops:

Quicksort (tests/qsort.c)
108%    118%    108%    124% 
108%    118%    108%    124% 
Bresenham line drawing algorithm (tests/bresenham.c)
 97%    100%    102%    100% 
103%    100%    102%     99% 
Bell Labs Benchmark B7: (integer statistics) (nongpl/bell-labs/b7.c)
---%     93%     98%     94% 
---%     93%     98%     95% 
Unsorted tests
Bell Labs benchmark B2 (tree creation) (nongpl/bell-labs/b2.c):101% 
Bell Labs benchmark B3 (qsort for strings) (nongpl/bell-labs/b3.c):108% 
Bell Labs benchmark B4 (tty driver fragt) (nongpl/bell-labs/b4.c):101% 
Dhrystone (nongpl/dhry/dhry_1.c nongpl/dhry/dhry_2.c):      104% 
Palette approximation (tests/pal.c):                        103% 
XaoS internal loop (tests/xaos.c):                          105% 
Bzip2 block sorting loop (tests/bzip2.c):                    95% 

END

On the byte benchmark I can measure a speedup in the fp emulation pass and no
notable changes in the other passes.

Bootstrapped on i386.
Honza

Sun Apr 30 19:13:28 CEST 2000  Jan Hubicka  <jh@suse.cz>
	* toplev.c (rest_of_compilation):  Do loop unrolling in the first loop
	optimizer pass.

*** toplev.c.old1	Sun Apr 30 19:13:05 2000
--- toplev.c	Sun Apr 30 19:01:06 2000
*************** rest_of_compilation (decl)
*** 2998,3004 ****
  	{
  	  /* We only want to perform unrolling once.  */
  	       
! 	  loop_optimize (insns, rtl_dump_file, 0);
  
  	  /* The first call to loop_optimize makes some instructions
  	     trivially dead.  We delete those instructions now in the
--- 2998,3004 ----
  	{
  	  /* We only want to perform unrolling once.  */
  	       
! 	  loop_optimize (insns, rtl_dump_file, (flag_unroll_loops ? LOOP_UNROLL : 0));
  
  	  /* The first call to loop_optimize makes some instructions
  	     trivially dead.  We delete those instructions now in the
*************** rest_of_compilation (decl)
*** 3010,3016 ****
  		  analysis code depends on this information.  */
  	  reg_scan (insns, max_reg_num (), 1);
  	}
!       loop_optimize (insns, rtl_dump_file, (flag_unroll_loops ? LOOP_UNROLL : 0) | LOOP_BCT);
  
        close_dump_file (DFI_loop, print_rtl, insns);
        timevar_pop (TV_LOOP);
--- 3010,3017 ----
  		  analysis code depends on this information.  */
  	  reg_scan (insns, max_reg_num (), 1);
  	}
!       loop_optimize (insns, rtl_dump_file,
! 		     (flag_unroll_loops && !flag_rerun_loop_opt ? LOOP_UNROLL : 0) | LOOP_BCT);
  
        close_dump_file (DFI_loop, print_rtl, insns);
        timevar_pop (TV_LOOP);
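
The flag selection in the two hunks above can be modelled in isolation.  This
is only a sketch: the bit values given to LOOP_UNROLL and LOOP_BCT are
placeholders, and the helper functions are hypothetical names that do not exist
in toplev.c.  It shows that unrolling is requested in exactly one of the two
loop_optimize calls whenever flag_unroll_loops is set, which matches the
"perform unrolling once" comment in the patched code.

```c
/* Placeholder bit values; the real definitions live in GCC's headers.  */
enum { LOOP_UNROLL = 1, LOOP_BCT = 2 };

/* Flags for the first loop_optimize call.  In the patched code this call
   is only reached when flag_rerun_loop_opt is set.  (Hypothetical helper.)  */
int first_pass_flags(int flag_unroll_loops)
{
    return flag_unroll_loops ? LOOP_UNROLL : 0;
}

/* Flags for the final loop_optimize call, which always runs.
   (Hypothetical helper.)  */
int final_pass_flags(int flag_unroll_loops, int flag_rerun_loop_opt)
{
    return (flag_unroll_loops && !flag_rerun_loop_opt ? LOOP_UNROLL : 0)
           | LOOP_BCT;
}

/* Number of loop_optimize calls that request unrolling for a given
   combination of the two flags.  (Hypothetical helper.)  */
int unroll_requests(int flag_unroll_loops, int flag_rerun_loop_opt)
{
    int count = 0;
    if (flag_rerun_loop_opt
        && (first_pass_flags(flag_unroll_loops) & LOOP_UNROLL))
        count++;
    if (final_pass_flags(flag_unroll_loops, flag_rerun_loop_opt) & LOOP_UNROLL)
        count++;
    return count;
}
```

With flag_rerun_loop_opt set, the unroller runs in the first call and the final
call only gets LOOP_BCT; without it, there is a single call and it carries both
flags.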

