This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: PATCH: Remove SLOW_BYTE_ACCESS from get_best_mode
On Wed, Sep 13, 2006 at 06:03:28PM -0700, H. J. Lu wrote:
> On Thu, Sep 14, 2006 at 12:20:31AM +0100, Paul Brook wrote:
> > > Here is the updated patch. I ran a micro benchmark with code difference
> > > similar to twolf:
> > >
> > > - movl (%rdi), %eax
> > > - xorb %ah, %ah
> > > - subl $1, %eax
> > > + movq (%rdi), %rax
> > > + andl $4294902015, %eax
> > > + subq $1, %rax
> > > jne .L8
> > >
> > > There is no speed difference on Nocona. I checked my SPEC results.
> > > twolf isn't very stable on my Nocona.
> >
> > The former is significantly smaller though (7 bytes vs. 12 bytes).
> >
>
> The code is
>
> #define ARRAY_LENGTH 16
> typedef struct {
>   char swYorN;
>   short int key;
>   short int nkey;
>   long bot;
> } program;
>
> int
> foo (program* prog)
> {
>   int i;
>   for (i = 0 ; i < ARRAY_LENGTH; i++)
>     {
>       if (prog[i].swYorN == 1 && prog[i].key == 0)
>         break;
>     }
>
>   return i;
> }
>
> get_best_mode is called with
>
> Breakpoint 1, get_best_mode (bitsize=32, bitpos=0, align=64,
> largest_mode=DImode, volatilep=0)
> at /net/gnu-13/export/gnu/src/gcc/gcc/gcc/stor-layout.c:2129
>
> Since it is aligned at 64, DImode is used. I think SImode should be
> used here.
>
>
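For readers following the assembly quoted above, here is a rough C sketch of what the fused test appears to do: load the first 32 bits of the record, mask off the padding byte between swYorN and key, and compare once. The helper names (field_image, test_fields, test_fused) are made up for illustration, and the sketch assumes both fields fit in the first 4 bytes of the struct, as they do under the usual x86-64 layout. It is not what GCC emits literally, only a model of it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
  char swYorN;
  short int key;
  short int nkey;
  long bot;
} program;

/* Assumption: both tested fields live in the first 4 bytes.  */
_Static_assert (offsetof (program, key) + sizeof (short) <= 4,
                "swYorN and key must share the first 32 bits");

/* Hypothetical helper: 32-bit image with swYorN and key set to the
   given values and every other byte zero.  */
static uint32_t
field_image (char yorn, short key)
{
  unsigned char buf[4] = { 0 };
  uint32_t word;

  memcpy (buf + offsetof (program, swYorN), &yorn, sizeof yorn);
  memcpy (buf + offsetof (program, key), &key, sizeof key);
  memcpy (&word, buf, sizeof word);
  return word;
}

/* Field-by-field test, as written in foo.  */
static int
test_fields (const program *p)
{
  return p->swYorN == 1 && p->key == 0;
}

/* Fused test: one 32-bit load, mask off the padding, one compare.  */
static int
test_fused (const program *p)
{
  uint32_t word;

  memcpy (&word, p, sizeof word);
  return (word & field_image (-1, -1)) == field_image (1, 0);
}
```

On little-endian x86-64, field_image (-1, -1) should come out as 0xFFFF00FF, which is the 4294902015 in the andl above. The SImode and DImode variants differ only in the width of the load and the arithmetic, not in the mask.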
In general, widening the access mode to a full word is good. But x86-64
is a special case because DImode instructions have a bigger code size.
This updated patch won't widen SImode.
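
To make the effect concrete, here is a simplified, standalone model of the mode selection with a per-mode widening predicate. It is only a sketch, not the code in stor-layout.c: the toy_* names and the two predicate functions are invented for illustration, volatile handling is left out, and the word size is assumed to be 64 bits.

```c
#include <assert.h>

/* Toy integer modes named after GCC's; the values are widths in bits.  */
enum toy_mode { TOY_VOID = 0, TOY_QI = 8, TOY_HI = 16, TOY_SI = 32, TOY_DI = 64 };

static const enum toy_mode toy_modes[] = { TOY_QI, TOY_HI, TOY_SI, TOY_DI };
enum { TOY_NMODES = 4, TOY_BITS_PER_WORD = 64 };

/* Hypothetical stand-ins for the per-mode widening macro.  */
static int widen_always (enum toy_mode m) { (void) m; return 1; }
static int widen_unless_si (enum toy_mode m) { return m != TOY_SI; }

static enum toy_mode
toy_best_mode (int bitsize, int bitpos, int align, enum toy_mode largest,
               int (*widen) (enum toy_mode))
{
  enum toy_mode mode = TOY_VOID, wide = TOY_VOID;
  int i;

  assert (bitsize > 0 && bitpos >= 0);

  /* Narrowest mode whose unit contains the whole field.  */
  for (i = 0; i < TOY_NMODES; i++)
    if (bitpos % toy_modes[i] + bitsize <= toy_modes[i])
      {
        mode = toy_modes[i];
        break;
      }

  if (mode == TOY_VOID
      || (int) mode > align
      || (largest != TOY_VOID && mode > largest))
    return TOY_VOID;

  /* The predicate sees the narrow mode and decides whether to widen.  */
  if (!widen (mode))
    return mode;

  /* Widest mode that still contains the field and respects the limits.  */
  for (i = 0; i < TOY_NMODES; i++)
    {
      int unit = toy_modes[i];

      if (bitpos / unit == (bitpos + bitsize - 1) / unit
          && unit <= TOY_BITS_PER_WORD
          && unit <= align
          && (largest == TOY_VOID || unit <= (int) largest))
        wide = toy_modes[i];
    }

  return wide != TOY_VOID ? wide : mode;
}
```

With the arguments from the breakpoint above (bitsize=32, bitpos=0, align=64, largest mode DImode), this model returns the DI mode under widen_always and the SI mode under widen_unless_si, which is the change the patch is after. Narrower fields, such as a single byte, are still widened.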

BTW, I didn't update the SLOW_BYTE_ACCESS documentation since I am not
clear what it is supposed to do.
H.J.
----
2006-09-13 H.J. Lu <hongjiu.lu@intel.com>
* config/i386/i386.h (SLOW_BYTE_ACCESS): Update comment.
(WIDEN_MODE_ACCESS_BITFIELD): New.
(SLOW_SHORT_ACCESS): Removed.
* doc/tm.texi (SLOW_BYTE_ACCESS): Updated.
(WIDEN_MODE_ACCESS_BITFIELD): Document.
* defaults.h (WIDEN_MODE_ACCESS_BITFIELD): New. Default to
SLOW_BYTE_ACCESS.
* stor-layout.c (SLOW_BYTE_ACCESS): Renamed to ...
(WIDEN_MODE_ACCESS_BITFIELD): This.
--- gcc/config/i386/i386.h.slow 2006-09-13 10:56:45.000000000 -0700
+++ gcc/config/i386/i386.h 2006-09-13 23:17:31.000000000 -0700
@@ -1834,7 +1834,7 @@ do { \
require more than one instruction or if there is no difference in
cost between byte and (aligned) word loads.
- When this macro is not defined, the compiler will access a field by
+ When this macro is zero, the compiler will access a field by
finding the smallest containing object; when it is defined, a
fullword load will be used if alignment permits. Unless bytes
accesses are faster than word accesses, using word accesses is
@@ -1844,8 +1844,10 @@ do { \
#define SLOW_BYTE_ACCESS 0
-/* Nonzero if access to memory by shorts is slow and undesirable. */
-#define SLOW_SHORT_ACCESS 0
+/* Define this macro as a C expression which is nonzero if mode of
+ accessing a bitfield memory should be widened. We don't widen
+ SImode since it has smaller code size. */
+#define WIDEN_MODE_ACCESS_BITFIELD(MODE) ((MODE) != SImode)
/* Define this macro to be the value 1 if unaligned accesses have a
cost many times greater than aligned accesses, for example if they
--- gcc/defaults.h.slow 2006-03-22 07:17:20.000000000 -0800
+++ gcc/defaults.h 2006-09-13 23:10:35.000000000 -0700
@@ -895,4 +895,10 @@ Software Foundation, 51 Franklin Street,
#define INCOMING_FRAME_SP_OFFSET 0
#endif
+/* Determines whether we should widen mode of access for bitfield. Default
+ to SLOW_BYTE_ACCESS if not specified. */
+#ifndef WIDEN_MODE_ACCESS_BITFIELD
+#define WIDEN_MODE_ACCESS_BITFIELD(MODE) SLOW_BYTE_ACCESS
+#endif
+
#endif /* ! GCC_DEFAULTS_H */
--- gcc/doc/tm.texi.slow 2006-08-29 08:38:33.000000000 -0700
+++ gcc/doc/tm.texi 2006-09-13 23:16:21.000000000 -0700
@@ -5595,14 +5595,22 @@ faster than accessing a word of memory,
require more than one instruction or if there is no difference in cost
between byte and (aligned) word loads.
-When this macro is not defined, the compiler will access a field by
-finding the smallest containing object; when it is defined, a fullword
+When this macro is zero, the compiler will access a field by
+finding the smallest containing object; when it is nonzero, a fullword
load will be used if alignment permits. Unless bytes accesses are
faster than word accesses, using word accesses is preferable since it
may eliminate subsequent memory access if subsequent accesses occur to
other fields in the same word of the structure, but to different bytes.
@end defmac
+@defmac WIDEN_MODE_ACCESS_BITFIELD (@var{mode})
+Define this macro as a C expression which is nonzero if @var{mode} of
+accessing a bitfield memory should be widened.
+
+When this macro is not defined, it will be defined as
+@code{SLOW_BYTE_ACCESS}.
+@end defmac
+
@defmac SLOW_UNALIGNED_ACCESS (@var{mode}, @var{alignment})
Define this macro to be the value 1 if memory accesses described by the
@var{mode} and @var{alignment} parameters have a cost many times greater
--- gcc/stor-layout.c.slow 2006-09-07 11:13:09.000000000 -0700
+++ gcc/stor-layout.c 2006-09-13 23:11:56.000000000 -0700
@@ -2113,11 +2113,11 @@ fixup_unsigned_type (tree type)
If no mode meets all these conditions, we return VOIDmode.
- If VOLATILEP is false and SLOW_BYTE_ACCESS is false, we return the
- smallest mode meeting these conditions.
+ If VOLATILEP is false and WIDEN_MODE_ACCESS_BITFIELD is false, we return
+ the smallest mode meeting these conditions.
- If VOLATILEP is false and SLOW_BYTE_ACCESS is true, we return the
- largest mode (but a mode no wider than UNITS_PER_WORD) that meets
+ If VOLATILEP is false and WIDEN_MODE_ACCESS_BITFIELD is true, we return
+ the largest mode (but a mode no wider than UNITS_PER_WORD) that meets
all the conditions.
If VOLATILEP is true the narrow_volatile_bitfields target hook is used to
@@ -2151,7 +2151,7 @@ get_best_mode (int bitsize, int bitpos,
|| (largest_mode != VOIDmode && unit > GET_MODE_BITSIZE (largest_mode)))
return VOIDmode;
- if ((SLOW_BYTE_ACCESS && ! volatilep)
+ if ((WIDEN_MODE_ACCESS_BITFIELD (mode) && ! volatilep)
|| (volatilep && !targetm.narrow_volatile_bitfield()))
{
enum machine_mode wide_mode = VOIDmode, tmode;