This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Controlling code alignment on DEC Alpha


By default, GCC aligns Alpha functions to a 32-byte boundary and, when
optimising, aligns loops and branch targets to a 16-byte boundary.
This is intended to improve performance by matching typical cache line
boundaries and prefetch widths, but it has the side effect of
increasing code size.  The FreeBSD project is currently running into
space problems on the Alpha installation floppies, and it would be
useful to be able to remove this over-alignment in order to shrink the
code.  I found that the FreeBSD kernel text segment shrank by 10% when
compiled with the minimum (4-byte) alignment instead of the defaults.

On the i386 and SPARC targets, GCC allows the alignment to be
controlled via a set of "-malign-XXX=NNN" options.  The following
patches implement -malign-functions=N, -malign-jumps=N and
-malign-loops=N for the Alpha in the same way as for the i386 (the
code is basically a cut-and-paste of the existing i386 code).  Note
that the i386 and SPARC implement the same set of options, but specify
the alignment differently: in bytes on the SPARC, and as log2(bytes)
on the i386 and in the patches below.
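
To make the log2 convention concrete, here is a small sketch (not part
of the patch) that converts an option value N into bytes, and into bits
using the same expression as the patched FUNCTION_BOUNDARY macro.  The
helper names align_bytes and align_bits are hypothetical.

```c
/* Illustration only: these helpers are not part of the patch.  */

/* Convert a log2 alignment, as given to -malign-functions=N,
   into a byte count.  */
static unsigned int
align_bytes (unsigned int log2_align)
{
  return 1u << log2_align;
}

/* Same expression as the patched FUNCTION_BOUNDARY macro:
   the alignment in bits is 2^(N + 3).  */
static unsigned int
align_bits (unsigned int log2_align)
{
  return 1u << (log2_align + 3);
}
```

For example, align_bytes (5) gives 32, the default function alignment,
and align_bits (5) gives 256, which matches the previously hard-coded
FUNCTION_BOUNDARY.  The minimum accepted value, 2, corresponds to the
4-byte instruction size.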

This code has been tested in FreeBSD 5-CURRENT from early February,
using GCC 2.95.3 Release Candidate #1 as imported into FreeBSD.  The
following testing has been performed:
- Compile the FreeBSD kernel with both the stock GCC and my patched
  GCC, using the same command lines.  The results were byte-for-byte
  identical apart from the build time and kernel build number.
- Compile and run a FreeBSD kernel with the options
  "-malign-jumps=2 -malign-loops=2 -malign-functions=2".
- I am currently running a FreeBSD userland and kernel re-compiled
  with "-malign-jumps=3 -malign-loops=3 -malign-functions=4".

The patches below are relative to GCC 2.95.3 Release Candidate #3
as imported into FreeBSD (the RC1 diffs applied cleanly to RC3).

Index: invoke.texi
===================================================================
RCS file: /home/CVSROOT/src/contrib/gcc.295/invoke.texi,v
retrieving revision 1.8
diff -c -p -r1.8 invoke.texi
*** invoke.texi	2001/02/17 09:04:50	1.8
--- invoke.texi	2001/03/25 21:46:25
*************** in the following sections.
*** 380,385 ****
--- 380,387 ----
  -mcpu=@var{cpu type}
  -mbwx -mno-bwx -mcix -mno-cix -mmax -mno-max
  -mmemory-latency=@var{time}
+ -malign-jumps=@var{num}  -malign-loops=@var{num}
+ -malign-functions=@var{num}
  
  @emph{Clipper Options}
  -mc300  -mc400
*************** The compiler contains estimates of the n
*** 5703,5708 ****
--- 5705,5726 ----
  Note that L3 is only valid for EV5.
  
  @end table
+ 
+ @item -malign-loops=@var{num}
+ Align loops to a 2 raised to a @var{num} byte boundary.  If
+ @samp{-malign-loops} is not specified, the default is 4 (16 bytes)
+ if optimising unless ECOFF symbols are selected, otherwise no
+ additional alignment is used.
+ 
+ @item -malign-jumps=@var{num}
+ Align instructions that are only jumped to to a 2 raised to a @var{num}
+ byte boundary.  If @samp{-malign-jumps} is not specified, the default is
+ 4 (16 bytes) if optimising unless ECOFF symbols are selected, otherwise no
+ additional alignment is used.
+ 
+ @item -malign-functions=@var{num}
+ Align the start of functions to a 2 raised to @var{num} byte boundary.
+ If @samp{-malign-functions} is not specified, the default is 5 (32 bytes).
  @end table
  
  @node Clipper Options
Index: config/alpha/alpha.c
===================================================================
RCS file: /home/CVSROOT/src/contrib/gcc.295/config/alpha/alpha.c,v
retrieving revision 1.2
diff -c -p -r1.2 alpha.c
*** config/alpha/alpha.c	2000/08/09 08:36:27	1.2
--- config/alpha/alpha.c	2001/03/25 21:46:24
*************** const char *alpha_fprm_string;	/* -mfp-r
*** 75,80 ****
--- 75,94 ----
  const char *alpha_fptm_string;	/* -mfp-trap-mode=[n|u|su|sui] */
  const char *alpha_mlat_string;	/* -mmemory-latency= */
  
+ /* Alignment to use for functions, loops and jumps:  */
+ 
+ /* Power of two alignment for functions. */
+ int alpha_align_funcs;
+ const char *alpha_align_funcs_string;
+ 
+ /* Power of two alignment for loops. */
+ int alpha_align_loops;
+ const char *alpha_align_loops_string;
+ 
+ /* Power of two alignment for non-loop jumps. */
+ int alpha_align_jumps;
+ const char *alpha_align_jumps_string;
+ 
  /* Save information from a "cmpxx" operation until the branch or scc is
     emitted.  */
  
*************** override_options ()
*** 320,325 ****
--- 334,373 ----
  
    /* Acquire a unique set number for our register saves and restores.  */
    alpha_sr_alias_set = new_alias_set ();
+ 
+   /* Validate -malign-loops= value, or provide default.
+ 
+    ??? The default is no alignment if we don't optimize and also if we
+    are writing ECOFF symbols to work around a bug in DEC's assembler,
+    otherwise we use octaword alignment.
+  */
+ 
+   alpha_align_loops = (optimize > 0 && write_symbols != SDB_DEBUG ? 4 : 0);
+   if (alpha_align_loops_string)
+     {
+       alpha_align_loops = atoi (alpha_align_loops_string);
+       if (alpha_align_loops < 2 || alpha_align_loops > 6)
+ 	fatal ("-malign-loops=%d is not between 2 and 6", alpha_align_loops);
+     }
+ 
+   /* Validate -malign-jumps= value, or provide default.  */
+   alpha_align_jumps = (optimize > 0 && write_symbols != SDB_DEBUG ? 4 : 0);
+   if (alpha_align_jumps_string)
+     {
+       alpha_align_jumps = atoi (alpha_align_jumps_string);
+       if (alpha_align_jumps < 2 || alpha_align_jumps > 6)
+ 	fatal ("-malign-jumps=%d is not between 2 and 6", alpha_align_jumps);
+     }
+ 
+   /* Validate -malign-functions= value, or provide default. */
+   alpha_align_funcs = 5;		/* default is 32-byte boundary */
+   if (alpha_align_funcs_string)
+     {
+       alpha_align_funcs = atoi (alpha_align_funcs_string);
+       if (alpha_align_funcs < 2 || alpha_align_funcs > 6)
+ 	fatal ("-malign-functions=%d is not between 2 and 6",
+ 		alpha_align_funcs);
+     }
  }
  
  /* Returns 1 if VALUE is a mask that contains full bytes of zero or ones.  */
Index: config/alpha/alpha.h
===================================================================
RCS file: /home/CVSROOT/src/contrib/gcc.295/config/alpha/alpha.h,v
retrieving revision 1.1.1.3
diff -c -p -r1.1.1.3 alpha.h
*** config/alpha/alpha.h	1999/10/16 06:07:49	1.1.1.3
--- config/alpha/alpha.h	2001/03/25 21:46:24
*************** extern const char *alpha_fprm_string;	/*
*** 243,248 ****
--- 243,251 ----
  extern const char *alpha_fptm_string;	/* For -mfp-trap-mode=[n|u|su|sui]  */
  extern const char *alpha_tp_string;	/* For -mtrap-precision=[p|f|i] */
  extern const char *alpha_mlat_string;	/* For -mmemory-latency= */
+ extern const char *alpha_align_loops_string;	/* For -malign-loops= */
+ extern const char *alpha_align_jumps_string;	/* For -malign-jumps= */
+ extern const char *alpha_align_funcs_string;	/* For -malign-functions= */
  
  #define TARGET_OPTIONS					\
  {							\
*************** extern const char *alpha_mlat_string;	/*
*** 256,261 ****
--- 259,270 ----
     "Control the precision given to fp exceptions"},	\
    {"memory-latency=",	&alpha_mlat_string,		\
     "Tune expected memory latency"},			\
+   { "align-loops=",	&alpha_align_loops_string, 	\
+     "Loop code aligned to this power of 2" },		\
+   { "align-jumps=",	&alpha_align_jumps_string,	\
+     "Jump targets are aligned to this power of 2" },	\
+   { "align-functions=",	&alpha_align_funcs_string,	\
+     "Function starts are aligned to this power of 2" },	\
  }
  
  /* Attempt to describe CPU characteristics to the preprocessor.  */
*************** extern void override_options ();
*** 475,481 ****
  #define STACK_BOUNDARY 64
  
  /* Allocation boundary (in *bits*) for the code of a function.  */
! #define FUNCTION_BOUNDARY 256
  
  /* Alignment of field after `int : 0' in a structure.  */
  #define EMPTY_FIELD_BOUNDARY 64
--- 484,491 ----
  #define STACK_BOUNDARY 64
  
  /* Allocation boundary (in *bits*) for the code of a function.  */
! extern int alpha_align_funcs;	/* power of two alignment for functions */
! #define FUNCTION_BOUNDARY (1 << (alpha_align_funcs + 3))
  
  /* Alignment of field after `int : 0' in a structure.  */
  #define EMPTY_FIELD_BOUNDARY 64
*************** extern void override_options ();
*** 486,506 ****
  /* A bitfield declared as `int' forces `int' alignment for the struct.  */
  #define PCC_BITFIELD_TYPE_MATTERS 1
  
! /* Align loop starts for optimal branching.  
! 
!    ??? Kludge this and the next macro for the moment by not doing anything if
!    we don't optimize and also if we are writing ECOFF symbols to work around
!    a bug in DEC's assembler. */
! 
! #define LOOP_ALIGN(LABEL) \
!   (optimize > 0 && write_symbols != SDB_DEBUG ? 4 : 0)
! 
! /* This is how to align an instruction for optimal branching.  On
!    Alpha we'll get better performance by aligning on an octaword
!    boundary.  */
! 
! #define LABEL_ALIGN_AFTER_BARRIER(FILE)	\
!   (optimize > 0 && write_symbols != SDB_DEBUG ? 4 : 0)
  
  /* No data type wants to be aligned rounder than this.  */
  #define BIGGEST_ALIGNMENT 64
--- 496,510 ----
  /* A bitfield declared as `int' forces `int' alignment for the struct.  */
  #define PCC_BITFIELD_TYPE_MATTERS 1
  
! /* Align loop starts for optimal branching.  */
! extern int alpha_align_loops;		/* power of two alignment for loops */
! #define LOOP_ALIGN(LABEL) (alpha_align_loops)
! 
! /* This is how to align an instruction for optimal branching.  On
!    Alpha we'll get better performance by aligning on an octaword
!    boundary.  */
! extern int alpha_align_jumps;		/* power of two alignment for jumps */
! #define LABEL_ALIGN_AFTER_BARRIER(LABEL) (alpha_align_jumps)
  
  /* No data type wants to be aligned rounder than this.  */
  #define BIGGEST_ALIGNMENT 64
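
For reference, the three validation blocks in override_options () all
follow the same pattern: take a default, and if an option string was
given, convert it and reject anything outside 2..6.  A minimal
standalone sketch of that pattern follows.  It is not part of the
patch: the helper name parse_align_option is hypothetical, and it uses
strtol instead of atoi so that trailing garbage is rejected (atoi stops
at the first non-digit, so a value such as "3x" would be read as 3).

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper, not part of the patch.  Stores the log2
   alignment in *RESULT and returns 0 on success, or returns -1 if
   ARG is not a whole number between 2 and 6.  A null ARG means the
   option was not given, so DEFAULT_LOG2 is used unchecked, as in
   the patch.  */
static int
parse_align_option (const char *arg, int default_log2, int *result)
{
  char *end;
  long value;

  if (arg == NULL)
    {
      *result = default_log2;
      return 0;
    }

  errno = 0;
  value = strtol (arg, &end, 10);

  /* Reject empty strings, trailing characters and overflow.  */
  if (errno != 0 || end == arg || *end != '\0')
    return -1;

  /* Same range as the patch: 4-byte to 64-byte boundaries.  */
  if (value < 2 || value > 6)
    return -1;

  *result = (int) value;
  return 0;
}
```

A caller would report the error itself, for example with fatal () as
the patch does, when the helper returns -1.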

