This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: PATCH: Remove SLOW_BYTE_ACCESS from get_best_mode


On Wed, Sep 13, 2006 at 11:26:36PM -0700, H. J. Lu wrote:
> On Wed, Sep 13, 2006 at 06:03:28PM -0700, H. J. Lu wrote:
> > On Thu, Sep 14, 2006 at 12:20:31AM +0100, Paul Brook wrote:
> > > > Here is the updated patch. I ran a micro benchmark with code difference
> > > > similar to twolf:
> > > >
> > > > -       movl    (%rdi), %eax
> > > > -       xorb    %ah, %ah
> > > > -       subl    $1, %eax

This sequence causes a partial register stall, which carries a 60% penalty
on Core 2. I updated my patch so that widening is skipped only for -Os.

> > > > +       movq    (%rdi), %rax
> > > > +       andl    $4294902015, %eax
> > > > +       subq    $1, %rax
> > > >         jne     .L8
> > > >
> > > > There is no speed difference on Nocona. I checked my SPEC results.
> > > > twolf isn't very stable on my Nocona.
> > > 
> > > The former is significantly smaller though (7 bytes vs. 12 bytes).
> > > 
> > 
> 
> In general, widen mode of access to a full word is good. But x86-64 is
> a special case due to bigger code size of DImode. This updated patch
> won't widen SImode.
> 
> BTW, I didn't update SLOW_BYTE_ACCESS document since I am not clear
> what it is supposed to do.
> 
> 

Here is an updated patch.
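
As a side note on the two sequences quoted above: the constant 4294902015
is 0xFFFF00FF, so the andl clears the same bits 8..15 that xorb %ah, %ah
clears. The sketch below is only a C model of the two sequences, not code
from twolf or from the patch. The function names are made up for
illustration, and it assumes the upper 32 bits of the loaded quadword do
not matter, since andl zero-extends into %rax.

```c
#include <stdint.h>

/* Hypothetical model of: movl (%rdi),%eax; xorb %ah,%ah; subl $1,%eax.
   Returns nonzero when the result is zero, i.e. when jne is not taken.  */
static int narrow_is_one(uint64_t mem)
{
    uint32_t eax = (uint32_t)mem;     /* movl: low 32 bits on little-endian */
    eax &= ~UINT32_C(0xFF00);         /* xorb %ah,%ah clears bits 8..15 */
    eax -= 1;                         /* subl $1,%eax */
    return eax == 0;
}

/* Hypothetical model of: movq (%rdi),%rax; andl $4294902015,%eax;
   subq $1,%rax.  Writing %eax zero-extends into %rax.  */
static int wide_is_one(uint64_t mem)
{
    uint64_t rax = mem;                             /* movq */
    rax = (uint32_t)rax & UINT32_C(4294902015);     /* andl, zero-extended */
    rax -= 1;                                       /* subq $1,%rax */
    return rax == 0;
}
```

Both helpers should agree for any input, which is why the difference
between the sequences is one of code size and stall behaviour rather than
of result.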

H.J.
----
2006-09-13  H.J. Lu  <hongjiu.lu@intel.com>

	* config/i386/i386.h (SLOW_BYTE_ACCESS): Update comment.
	(WIDEN_MODE_ACCESS_BITFIELD): New.
	(SLOW_SHORT_ACCESS): Removed.

	* doc/tm.texi (SLOW_BYTE_ACCESS): Updated.
	(WIDEN_MODE_ACCESS_BITFIELD): Document.

	* defaults.h (WIDEN_MODE_ACCESS_BITFIELD): New. Default to
	SLOW_BYTE_ACCESS.

	* stor-layout.c (SLOW_BYTE_ACCESS): Renamed to ...
	(WIDEN_MODE_ACCESS_BITFIELD): This.

--- gcc/config/i386/i386.h.slow	2006-09-13 10:56:45.000000000 -0700
+++ gcc/config/i386/i386.h	2006-09-14 06:37:58.000000000 -0700
@@ -1834,7 +1834,7 @@ do {							\
    require more than one instruction or if there is no difference in
    cost between byte and (aligned) word loads.
 
-   When this macro is not defined, the compiler will access a field by
+   When this macro is zero, the compiler will access a field by
    finding the smallest containing object; when it is defined, a
    fullword load will be used if alignment permits.  Unless bytes
    accesses are faster than word accesses, using word accesses is
@@ -1844,8 +1844,11 @@ do {							\
 
 #define SLOW_BYTE_ACCESS 0
 
-/* Nonzero if access to memory by shorts is slow and undesirable.  */
-#define SLOW_SHORT_ACCESS 0
+/* Define this macro as a C expression which is nonzero if mode of
+   accessing a bitfield memory should be widened.  We don't widen
+   SImode if we optimize for size since it has smaller code size.  */
+#define WIDEN_MODE_ACCESS_BITFIELD(MODE) \
+  (!optimize_size || (MODE) != SImode)
 
 /* Define this macro to be the value 1 if unaligned accesses have a
    cost many times greater than aligned accesses, for example if they
--- gcc/defaults.h.slow	2006-03-22 07:17:20.000000000 -0800
+++ gcc/defaults.h	2006-09-13 23:10:35.000000000 -0700
@@ -895,4 +895,10 @@ Software Foundation, 51 Franklin Street,
 #define INCOMING_FRAME_SP_OFFSET 0
 #endif
 
+/* Determines whether we should widen mode of access for bitfield.  Default
+   to SLOW_BYTE_ACCESS if not specified.  */
+#ifndef WIDEN_MODE_ACCESS_BITFIELD
+#define WIDEN_MODE_ACCESS_BITFIELD(MODE) SLOW_BYTE_ACCESS
+#endif
+
 #endif  /* ! GCC_DEFAULTS_H */
--- gcc/doc/tm.texi.slow	2006-08-29 08:38:33.000000000 -0700
+++ gcc/doc/tm.texi	2006-09-13 23:16:21.000000000 -0700
@@ -5595,14 +5595,22 @@ faster than accessing a word of memory, 
 require more than one instruction or if there is no difference in cost
 between byte and (aligned) word loads.
 
-When this macro is not defined, the compiler will access a field by
-finding the smallest containing object; when it is defined, a fullword
+When this macro is zero, the compiler will access a field by
+finding the smallest containing object; when it is nonzero, a fullword
 load will be used if alignment permits.  Unless bytes accesses are
 faster than word accesses, using word accesses is preferable since it
 may eliminate subsequent memory access if subsequent accesses occur to
 other fields in the same word of the structure, but to different bytes.
 @end defmac
 
+@defmac WIDEN_MODE_ACCESS_BITFIELD (@var{mode})
+Define this macro as a C expression which is nonzero if @var{mode} of
+accessing a bitfield memory should be widened.
+
+When this macro is not defined, it will be defined as
+@code{SLOW_BYTE_ACCESS}.
+@end defmac
+
 @defmac SLOW_UNALIGNED_ACCESS (@var{mode}, @var{alignment})
 Define this macro to be the value 1 if memory accesses described by the
 @var{mode} and @var{alignment} parameters have a cost many times greater
--- gcc/stor-layout.c.slow	2006-09-07 11:13:09.000000000 -0700
+++ gcc/stor-layout.c	2006-09-13 23:11:56.000000000 -0700
@@ -2113,11 +2113,11 @@ fixup_unsigned_type (tree type)
 
    If no mode meets all these conditions, we return VOIDmode.
 
-   If VOLATILEP is false and SLOW_BYTE_ACCESS is false, we return the
-   smallest mode meeting these conditions.
+   If VOLATILEP is false and WIDEN_MODE_ACCESS_BITFIELD is false, we return
+   the smallest mode meeting these conditions.
 
-   If VOLATILEP is false and SLOW_BYTE_ACCESS is true, we return the
-   largest mode (but a mode no wider than UNITS_PER_WORD) that meets
+   If VOLATILEP is false and WIDEN_MODE_ACCESS_BITFIELD is true, we return
+   the largest mode (but a mode no wider than UNITS_PER_WORD) that meets
    all the conditions.
 
    If VOLATILEP is true the narrow_volatile_bitfields target hook is used to
@@ -2151,7 +2151,7 @@ get_best_mode (int bitsize, int bitpos, 
       || (largest_mode != VOIDmode && unit > GET_MODE_BITSIZE (largest_mode)))
     return VOIDmode;
 
-  if ((SLOW_BYTE_ACCESS && ! volatilep)
+  if ((WIDEN_MODE_ACCESS_BITFIELD (mode) && ! volatilep)
       || (volatilep && !targetm.narrow_volatile_bitfield()))
     {
       enum machine_mode wide_mode = VOIDmode, tmode;
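
To illustrate how the i386 definition above is expected to behave, here is
a small standalone sketch. The enum, the optimize_size variable and the
should_widen helper are stand-ins invented for this example and are not
the real GCC definitions. Only the macro body is copied from the i386.h
hunk. The sketch models just the non-volatile half of the condition in
get_best_mode and ignores the narrow_volatile_bitfield hook.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GCC internals, for illustration only.  */
enum machine_mode { QImode, HImode, SImode, DImode };
static int optimize_size;   /* nonzero models -Os */

/* Same expression as the i386.h hunk in the patch.  */
#define WIDEN_MODE_ACCESS_BITFIELD(MODE) \
  (!optimize_size || (MODE) != SImode)

/* Models only the "widen && !volatilep" half of the test in
   get_best_mode; the volatile half is left out of this sketch.  */
static bool should_widen(enum machine_mode mode, bool volatilep)
{
    return WIDEN_MODE_ACCESS_BITFIELD(mode) && !volatilep;
}
```

With optimize_size set to zero every mode is widened. With it nonzero,
only SImode is left alone, which matches the intent of keeping the shorter
SImode encoding under -Os.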

